LocalLLaMA

Welcome to LocalLLaMA! This is a community to discuss local large language models such as Llama, DeepSeek, Mistral, and Qwen.

Get support from the community! Ask questions, share prompts, discuss benchmarks, get hyped at the latest and greatest model releases! Enjoy talking about our awesome hobby.

As ambassadors of the self-hosting machine learning community, we strive to support each other and share our enthusiasm in a positive, constructive way.

founded 2 years ago

Loaded benchmark for 1-3-4-7b models? (lemm.ee)

submitted 6 days ago by [email protected] to c/[email protected]

I don't care much about mathematical tasks. Code intelligence is a minor preference, but what I care about most is overall comprehension and intelligence (for RAG and large-context handling). Either way, I am looking for any benchmark that covers a wide variety of models and is kept up to date.

[–] [email protected] 3 points 6 days ago (1 children)

I use Page Assist with Ollama.

[–] [email protected] 2 points 5 days ago* (last edited 5 days ago) (1 children)

Cool, Page Assist looks neat, I'll have to check it out sometime. My LLM engine is kobold.cpp, and I just use Open WebUI in my browser to connect.

Sorry, I don't really have good suggestions for you beyond trying some of the more popular 1-4B models at a very high quant, if not full f8, and seeing which works best for your use case.

Llama 4b, mistral 4b, phi-3-mini, tinyllm 1.5b, qwen 2-1.5b, etc. I assume you want a model with a large context size and good comprehension skills to summarize YouTube transcripts and web articles? At least I think that's what the add-on you mentioned said its purpose was.

So look for models with those things over ones that try to specialize in a little bit of domain knowledge.
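
If it helps, "try a few and see which works best" can be turned into a tiny comparison harness. This is only a minimal sketch under my own assumptions, not something anyone in this thread actually ran: `score_model`, `rank_models`, and the keyword-match scoring are hypothetical, and each `generate` callable stands in for whatever call your backend needs (Ollama, kobold.cpp, and so on).

```python
from typing import Callable, Dict, List, Tuple

# One test case: (context document, question, keywords a good answer should contain).
Case = Tuple[str, str, List[str]]
# A model is represented as any function that maps a prompt to generated text.
Generate = Callable[[str], str]


def score_model(generate: Generate, cases: List[Case]) -> float:
    """Return the fraction of cases whose answer contains every expected keyword."""
    if not cases:
        return 0.0
    hits = 0
    for document, question, keywords in cases:
        # RAG-style prompt: the retrieved text goes in front of the question.
        prompt = f"Context:\n{document}\n\nQuestion: {question}\nAnswer:"
        answer = generate(prompt).lower()
        if all(keyword.lower() in answer for keyword in keywords):
            hits += 1
    return hits / len(cases)


def rank_models(models: Dict[str, Generate], cases: List[Case]) -> List[Tuple[str, float]]:
    """Score every model on the same cases and sort best-first."""
    scores = [(name, score_model(fn, cases)) for name, fn in models.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Stub "models" so the sketch runs offline; swap in real backend calls.
    demo_cases = [("The sky is blue.", "What color is the sky?", ["blue"])]
    demo_models = {
        "stub-a": lambda prompt: "The sky is blue.",
        "stub-b": lambda prompt: "I am not sure.",
    }
    for name, score in rank_models(demo_models, demo_cases):
        print(f"{name}: {score:.2f}")
```

Keyword matching is a crude proxy for comprehension, so treat the scores as a first filter and still read the answers yourself on documents from your own use case.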

[–] [email protected] 2 points 4 days ago

I checked out most of them from the list, but 1B models are generally unusable for RAG.