this post was submitted on 08 Jun 2023
LocalLLaMA
I could help with moderation, but I have a question: how do I set up LLaMA on my Mac? Any tips?
Hi, sure, thank you so much for helping out! As for LLaMA, I would point you at llama.cpp (https://github.com/ggerganov/llama.cpp), which is the absolute bleeding edge but also has pretty useful instructions on the page (https://github.com/ggerganov/llama.cpp#usage). You could also use Kobold.cpp, but I don't have any experience with it, so I can't help you if you have issues.
Adding to this: text-generation-webui (https://github.com/oobabooga/text-generation-webui) works with the latest bleeding-edge llama.cpp via llama-cpp-python, and it has a nice graphical front end. You do have to manually tell pip to install llama-cpp-python with the right compiler flags to get GPU acceleration working, but the llama-cpp-python GitHub and the ooba GitHub both explain how to do this.
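For anyone wondering what "telling pip the right compiler flags" looks like, here is a rough sketch in Python. It only builds and prints the command, it does not run it. The helper name `build_install_command` is made up for this example. Passing the flag through the `CMAKE_ARGS` environment variable (plus `FORCE_CMAKE=1`) is how I remember the llama-cpp-python README describing it, and `-DLLAMA_METAL=on` is the Metal flag as I remember it from around this time. Flag names change between releases, so treat both as assumptions and check the README for the version you install.

```python
import os
import shlex
import sys


def build_install_command(cmake_flag):
    """Return (env, argv) for rebuilding llama-cpp-python with one CMake flag.

    Hypothetical helper for illustration. cmake_flag is passed through
    unchanged; which flag enables which GPU backend depends on the
    llama-cpp-python version being installed.
    """
    env = dict(os.environ)
    # Assumption: the package build reads CMAKE_ARGS when pip compiles it.
    env["CMAKE_ARGS"] = cmake_flag
    # Assumption: FORCE_CMAKE=1 forces a CMake build, per the README at the time.
    env["FORCE_CMAKE"] = "1"
    argv = [
        sys.executable, "-m", "pip", "install",
        "--upgrade", "--force-reinstall", "--no-cache-dir",
        "llama-cpp-python",
    ]
    return env, argv


# Assumption: Metal flag name as of mid-2023; newer releases may rename it.
env, argv = build_install_command("-DLLAMA_METAL=on")

# Print the equivalent shell line instead of running it, so nothing gets
# installed by accident.
print(
    "CMAKE_ARGS=" + shlex.quote(env["CMAKE_ARGS"]),
    "FORCE_CMAKE=" + env["FORCE_CMAKE"],
    " ".join(shlex.quote(part) for part in argv),
)
```

If you do want to run it from Python, `subprocess.run(argv, env=env, check=True)` would do it. Run it inside the same Python environment that text-generation-webui uses, otherwise the rebuilt package lands somewhere the web UI never looks.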
You can even set up GPU acceleration through Metal on M1 Macs. I've seen some fucking INSANE performance numbers online for the higher-RAM MacBook Pros (20+ tokens/sec, I think with a 33b model, but it might have been 13b; either way, impressive).
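If you want to check tokens/sec on your own machine rather than trust numbers from the internet, a minimal sketch like this would work. `tokens_per_second` and `fake_generate` are names invented for this example, and the fake generator just sleeps so the snippet runs without a model. The commented-out part shows how I would expect to wire in llama-cpp-python (the `Llama` class, `n_gpu_layers`, and the `usage` field in the result). The model path there is a placeholder, not a real file.

```python
import time


def tokens_per_second(generate, prompt, max_tokens):
    """Time one call to generate(prompt, max_tokens) and return tokens/sec.

    generate must return the number of tokens it actually produced.
    The timing includes prompt processing, so it slightly understates
    pure generation speed.
    """
    start = time.perf_counter()
    n_tokens = generate(prompt, max_tokens)
    elapsed = time.perf_counter() - start
    if elapsed <= 0:
        raise ValueError("elapsed time was not positive; nothing to measure")
    return n_tokens / elapsed


def fake_generate(prompt, max_tokens):
    """Stand-in for a real model: pretends each token takes 5 ms."""
    time.sleep(0.005 * max_tokens)
    return max_tokens


# With llama-cpp-python installed, the real thing might look like this
# (assumption: placeholder model path, 1 layer offloaded to the GPU):
#
#   from llama_cpp import Llama
#   llm = Llama(model_path="path/to/model.bin", n_gpu_layers=1)
#
#   def llama_generate(prompt, max_tokens):
#       out = llm(prompt, max_tokens=max_tokens)
#       return out["usage"]["completion_tokens"]

rate = tokens_per_second(fake_generate, "Hello", 20)
print(f"{rate:.1f} tokens/sec")
```

Run it a few times with the same prompt and `max_tokens` before comparing numbers, since the first call usually includes one-time load and warm-up cost.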