this post was submitted on 28 Feb 2024
41 points (100.0% liked)
LocalLLaMA
2841 readers
Welcome to LocalLLaMA! Here we discuss running and developing machine learning models at home. Let's explore cutting-edge open source neural network technology together.
Get support from the community! Ask questions, share prompts, discuss benchmarks, get hyped at the latest and greatest model releases! Enjoy talking about our awesome hobby.
As ambassadors of the self-hosting machine learning community, we strive to support each other and share our enthusiasm in a positive constructive way.
founded 2 years ago
I think we're already getting there. Lots of newer phones include AI accelerators, and every company advertises AI features. I don't think those chips were made to run LLMs, but llama.cpp already runs on phones, and the limiting factor seems to be RAM. I've tried Microsoft's "phi-2", quantized and on slow hardware, and it's surprisingly capable for such a small model. Something like a ternary model would significantly cut down on the amount of RAM being used, which would allow loading larger models while also making inference faster, everywhere. So I'd say yes. It would also let me load a more intelligent model on my PC.
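To put rough numbers on the RAM savings: a ternary weight takes one of three values, so its information content is log2(3) ≈ 1.58 bits, versus 16 bits for fp16. A quick back-of-the-envelope sketch (the 7B parameter count is just an illustrative assumption, and real runtimes add overhead for activations, the KV cache, and packing, so treat these as lower bounds):

```python
import math

def weight_gib(n_params: float, bits_per_weight: float) -> float:
    """Memory needed just for the model weights, in GiB."""
    return n_params * bits_per_weight / 8 / 2**30

n = 7e9  # 7B parameters, a common local-model size (assumption)

for label, bits in [("fp16", 16), ("int4", 4), ("ternary", math.log2(3))]:
    print(f"{label:>8}: {weight_gib(n, bits):5.2f} GiB")
```

At fp16 the weights alone need about 13 GiB, while a ternary packing would need under 1.5 GiB, which is the difference between not fitting on a phone at all and fitting comfortably.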
I think doing away with matrix multiplications is also a big deal, but it has little consequence as of today: you'd first need to redesign the chips to take advantage of it, and local inference is typically limited by memory bandwidth, not multiplication speed. At least as far as I understand.
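The reason ternary weights remove multiplications: with every weight restricted to {-1, 0, +1}, a dot product reduces to selectively adding and subtracting activations. A toy sketch of the idea (this is just an illustration, not the actual BitNet kernel):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.integers(-1, 2, size=(4, 8))   # ternary weight matrix, values in {-1, 0, 1}
x = rng.standard_normal(8)             # input activations

# Standard matrix-vector product for reference
ref = W @ x

# Multiplication-free version: add activations where w=+1,
# subtract where w=-1, skip where w=0
out = np.array([x[row == 1].sum() - x[row == -1].sum() for row in W])

assert np.allclose(ref, out)
```

On current GPUs this buys nothing, since they have dedicated multiply-accumulate units anyway; the win would come from purpose-built hardware where adders are much cheaper than multipliers.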
I'd say if this holds up, it allows for a big jump in parameter count for all kinds of use cases. But I've also come to the conclusion that there might be a caveat: maybe the training is prohibitively expensive. I don't really know; at this point there's too much speculation going on, and I'm not really an expert.
Yeah, I knew AI chips were becoming more common, but this is a really good write-up, thanks!