this post was submitted on 31 Jan 2025
17 points (100.0% liked)
LocalLLaMA
2747 readers
Welcome to LocalLLaMA! This is a community to discuss local large language models such as Llama, DeepSeek, Mistral, and Qwen.
Get support from the community! Ask questions, share prompts, discuss benchmarks, get hyped at the latest and greatest model releases! Enjoy talking about our awesome hobby.
As ambassadors of the self-hosting machine learning community, we strive to support each other and share our enthusiasm in a positive, constructive way.
founded 2 years ago
you are viewing a single comment's thread
At a certain point, layers that don't fit in VRAM get pushed to system RAM and run on the CPU, leading to incredibly slow inference. You don't want to wait hours for the model to generate a single response.
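For reference, here's a minimal sketch of how that trade-off is usually controlled with llama-cpp-python (an assumption on my part; the comment doesn't name a specific runtime). The `n_gpu_layers` parameter decides how many transformer layers are offloaded to VRAM; anything beyond that runs from system RAM on the CPU, which is where the slowdown comes from. The model path and prompt below are just placeholders:

```python
# Minimal sketch using llama-cpp-python (assumed runtime; model path is hypothetical).
from llama_cpp import Llama

llm = Llama(
    model_path="./model.gguf",  # local GGUF file (placeholder)
    n_gpu_layers=32,            # layers offloaded to VRAM; -1 offloads all of them
    n_ctx=4096,                 # context window size
)

# Layers not covered by n_gpu_layers stay in system RAM and run on the CPU.
out = llm("Explain GPU layer offloading in one sentence.", max_tokens=64)
print(out["choices"][0]["text"])
```

Rule of thumb: if you can't set `n_gpu_layers` high enough to keep all (or nearly all) layers in VRAM, expect token generation to crawl, since every forward pass is bottlenecked by the CPU-side layers.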