this post was submitted on 28 Apr 2025
17 points (100.0% liked)

LocalLLaMA

Qwen3 was apparently posted early, then quickly pulled from HuggingFace and ModelScope. The large ones are MoEs, per screenshots from Reddit:

[screenshots]

Including a 235B model with 22B active parameters, and a 30B with 3B active.
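
For a rough sense of what those sizes mean in practice: the whole expert pool has to fit in memory, but per-token compute scales with the active parameters. A back-of-the-envelope sketch (the quantization and FLOPs-per-token figures are my assumptions, not from the leak):

```python
# Back-of-the-envelope memory vs. compute for the rumored MoE sizes.
# Assumptions (mine, not from the leak): ~4-bit quantized weights at
# roughly 0.5 bytes/param, and ~2 FLOPs per active parameter per token.
models = {
    "Qwen3-235B-A22B (rumored)": (235e9, 22e9),
    "Qwen3-30B-A3B (rumored)": (30e9, 3e9),
}

BYTES_PER_PARAM_Q4 = 0.5  # ~4-bit quant

for name, (total, active) in models.items():
    weight_gb = total * BYTES_PER_PARAM_Q4 / 1e9
    gflops_per_token = 2 * active / 1e9
    print(f"{name}: ~{weight_gb:.0f} GB of weights at Q4, "
          f"~{gflops_per_token:.0f} GFLOPs/token")
```

So the 30B/3B one would run like a ~3B model at inference time while needing only ~15 GB of Q4 weights, which is very CPU/consumer-GPU friendly.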

Context appears to be 'only' 32K, unfortunately: https://huggingface.co/qingy2024/Qwen3-0.6B/blob/main/config_4b.json
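
If you want to check that yourself while the mirror is up, here's a minimal sketch. It assumes the repo and file from the link above are still live; `max_position_embeddings` and `rope_scaling` are the standard HF config keys:

```python
# Sketch: fetch the leaked config and read the context window.
import json
from huggingface_hub import hf_hub_download

path = hf_hub_download(repo_id="qingy2024/Qwen3-0.6B", filename="config_4b.json")
with open(path) as f:
    config = json.load(f)

print(config.get("max_position_embeddings"))  # reportedly 32768
print(config.get("rope_scaling"))  # None unless long-context scaling (e.g. YaRN) is set
```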

But it's possible they're still training them out to 256K:

[screenshot from Reddit]

Take it all with a grain of salt; configs could change with the official release, but it appears the launch is happening today.

1 comment
[email protected] 1 point 6 days ago

It seems there are both dense and sparse models in this launch, like the Qwen1.5 release. This "leak" (for instance) references what appears to be a real Qwen3 32B:

https://huggingface.co/second-state/Qwen3-32B-GGUF
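
If that repo survives, something like this should pull and run a quant via llama-cpp-python. A sketch only: the `*Q4_K_M.gguf` filename pattern is a guess at the usual quant naming, not confirmed from the repo's file list:

```python
# Sketch: try the referenced GGUF with llama-cpp-python.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="second-state/Qwen3-32B-GGUF",
    filename="*Q4_K_M.gguf",  # hypothetical quant name; check the repo's files
    n_ctx=32768,              # the leaked config's reported context window
)

out = llm("Hello, who are you?", max_tokens=64)
print(out["choices"][0]["text"])
```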