this post was submitted on 14 Aug 2023

LocalLLaMA


Still pretty new to local LLMs, and there's been a lot of development since I dipped my toe in. Suffice it to say, I'm fairly swamped and looking for guidance on the right model for my use case.

I want to feed the model sourcebooks so I can ask it game mechanic questions and have it respond with reasonable accuracy (including page references). I tried this with privateGPT a month or two back, and it kinda worked, but it was slow and wonky. It seems like things are a bit cleaner now.
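For anyone wondering what "feed it sourcebooks and get page references back" looks like mechanically, here is a minimal sketch of the retrieval half. It is not how privateGPT is implemented: real tools use embedding models and a vector store, while this stand-in scores pages by plain word overlap so it runs with only the standard library. The function names (`build_index`, `retrieve`, `build_prompt`) and the sample rule text are made up for illustration and do not come from any real sourcebook.

```python
import re
from collections import Counter
from math import sqrt


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_index(pages):
    """pages maps page number -> page text; returns (page, word counts) pairs."""
    return [(page, Counter(tokenize(text))) for page, text in pages.items()]


def retrieve(index, question, k=2):
    """Return the k page numbers whose text best matches the question."""
    q = Counter(tokenize(question))
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [page for page, _ in ranked[:k]]


def build_prompt(pages, hits, question):
    """Assemble a prompt that carries page numbers so the model can cite them."""
    context = "\n".join(f"[p. {p}] {pages[p]}" for p in hits)
    return (
        "Answer using only the excerpts below and cite page numbers.\n"
        f"{context}\n\nQuestion: {question}"
    )


# Invented placeholder text, one entry per page (hypothetical rulebook).
SAMPLE_PAGES = {
    12: "Grappling: make a Strength check opposed by the target's Strength or Dexterity.",
    47: "Falling damage is 1d6 per 10 feet fallen, to a maximum of 20d6.",
    88: "A long rest restores all hit points and takes at least 8 hours.",
}

if __name__ == "__main__":
    index = build_index(SAMPLE_PAGES)
    question = "How much falling damage do I take?"
    hits = retrieve(index, question, k=1)
    print(build_prompt(SAMPLE_PAGES, hits, question))
```

The point of the design is that the page number travels with each excerpt into the prompt, so the model only has to repeat a citation it was handed rather than remember one.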

top 3 comments
[–] [email protected] 3 points 2 years ago (1 children)

A lot of the speed depends on hardware. In my experience so far, the most accurate models are generally the largest you can run in 4-bit. I can barely run a Llama2 70B GGML in 4-bit with 16 layers on a 3080Ti and everything else in 64GB of DDR5. A solid block of 200-300 words takes about 3 minutes to complete, but the quality of the results is well worth it. I also use a WizardLM 30B in 4-bit GGML. It takes around 2 minutes for an equivalent output.

Anything in the 7-20B range is like asking the average teenager for technical help. It is possible to have a functional smalltalk conversation with one, but don't hand them your investment portfolio, ask them to toss a new clutch in the car, or have them secure a corporate server rack, even if they claim expertise. Maybe better results are possible with some embeddings and a bunch of tuning. I have only tried two 13Bs, a dozen 7Bs, half a dozen ~30Bs, and two 70Bs.
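To put rough numbers on the 16-layer split mentioned above, here is a back-of-the-envelope sketch. The assumptions are mine, not measurements: about 4.5 bits per weight for a 4-bit quant once block scales are counted, 80 transformer layers for Llama 2 70B, and weights spread evenly across layers (embeddings, KV cache, and scratch buffers are ignored). The helper names are hypothetical.

```python
def model_size_gb(n_params, bits_per_weight):
    """Approximate size of the quantized weights in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


def split_layers(total_gb, n_layers, gpu_layers):
    """Split the weights between GPU and CPU, assuming equal-sized layers."""
    per_layer = total_gb / n_layers
    gpu_gb = per_layer * gpu_layers
    return gpu_gb, total_gb - gpu_gb


N_PARAMS = 70e9        # Llama 2 70B
BITS_PER_WEIGHT = 4.5  # assumption: 4-bit weights plus per-block scale overhead
N_LAYERS = 80          # transformer layers in Llama 2 70B
GPU_LAYERS = 16        # the split described in the comment above

total_gb = model_size_gb(N_PARAMS, BITS_PER_WEIGHT)
gpu_gb, cpu_gb = split_layers(total_gb, N_LAYERS, GPU_LAYERS)

if __name__ == "__main__":
    print(f"total weights: {total_gb:.1f} GB")
    print(f"on GPU ({GPU_LAYERS} layers): {gpu_gb:.1f} GB")
    print(f"in system RAM: {cpu_gb:.1f} GB")
```

Under those assumptions the weights come to roughly 39 GB, with about 8 GB on the card and the rest in system RAM, which is consistent with "barely" fitting on a 12GB 3080Ti plus 64GB of DDR5 once context and the OS take their share.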

[–] [email protected] 1 points 2 years ago

I'm still on a 3060Ti, but speed isn't my biggest concern. I'm primarily focused on reasonably accurate "understanding" of the source material. I got pretty good results with GPT-4, but I feel like focusing my training data could help avoid irrelevant responses.

[–] [email protected] 1 points 2 years ago* (last edited 2 years ago)

Can you give us a link to the software you use? Or is the TTRPG assistant something you're developing?

Edit: Sorry, I didn't get it. You wrote that you're feeding rulebooks into privateGPT. I somehow thought you had something roleplay-specific.