Seems there's not a lot of talk about relatively unknown finetunes these days, so I'll start posting more!
OpenBuddy's been on my radar, but this one is very interesting: QwQ 32B, post-trained on OpenBuddy's dataset, apparently with QAT applied (though that part is kinda unclear) and context-extended. Observations:
- Quantized with exllamav2, it seems to show lower distortion than normal QwQ. It works conspicuously well at 4.0bpw and 3.5bpw (see the conversion sketch after this list).
- Seems good at long context. Haven't tested 200K, but it's quite excellent in the 64K range (see the loading sketch after this list).
- Works fine in English.
- The chat template is funky. It seems to mix up the <think> and <|think|> tags in particular (why don't they just use ChatML?), and needs some wrangling with your own template (see the prompt-builder sketch after this list).
- Seems smart. I can't say yet whether it's better or worse than QwQ, other than that it doesn't seem to "suffer" below 3.75bpw the way QwQ does.
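For anyone who wants to reproduce the quants: here's a minimal sketch of an EXL2 conversion at a target bitrate. The model paths are hypothetical placeholders, and the flags follow exllamav2's documented convert.py usage as I understand it.

```python
import subprocess

# Minimal EXL2 conversion sketch: quantize a full-precision HF checkpoint to a
# target bits-per-weight. Paths are hypothetical placeholders; the flags are
# exllamav2's documented convert.py options (-i source, -o working dir,
# -cf final output dir, -b target bpw).
def convert_to_exl2(src: str, out: str, work: str, bpw: float) -> None:
    subprocess.run(
        [
            "python", "convert.py",
            "-i", src,       # unquantized model directory
            "-o", work,      # scratch directory for measurement/quant passes
            "-cf", out,      # where the finished quant lands
            "-b", str(bpw),  # e.g. 4.0 or 3.5
        ],
        cwd="/path/to/exllamav2",  # run from a cloned exllamav2 repo
        check=True,
    )

convert_to_exl2(
    src="/models/openbuddy-qwq-32b",             # hypothetical source path
    out="/models/openbuddy-qwq-32b-exl2-4.0bpw",
    work="/tmp/exl2-work",
    bpw=4.0,
)
```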
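And a sketch of loading the result with an extended context window, based on exllamav2's example scripts (the path is again a placeholder; 64K matches the range I mentioned above):

```python
from exllamav2 import ExLlamaV2, ExLlamaV2Cache, ExLlamaV2Config, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2DynamicGenerator

# Load a 4.0bpw EXL2 quant with an extended context window (64K here).
config = ExLlamaV2Config("/models/openbuddy-qwq-32b-exl2-4.0bpw")  # placeholder path
config.max_seq_len = 65536  # override the default context length

model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, max_seq_len=65536, lazy=True)
model.load_autosplit(cache)  # split layers across available GPUs
tokenizer = ExLlamaV2Tokenizer(config)

generator = ExLlamaV2DynamicGenerator(model=model, cache=cache, tokenizer=tokenizer)
print(generator.generate(prompt="<long document goes here>", max_new_tokens=512))
```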
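On the template point, rather than fight the bundled template I'd just build the prompt by hand in plain ChatML. The tag substitution below is illustrative of the kind of mismatch, not a statement of what the shipped template actually contains:

```python
# Hand-rolled ChatML prompt builder that normalizes stray thinking tags.
# The <|think|> / <think> substitution is illustrative; check which tags the
# model actually emits before relying on this.
def build_chatml_prompt(messages: list[dict[str, str]]) -> str:
    parts = []
    for msg in messages:
        content = (
            msg["content"]
            .replace("<|think|>", "<think>")
            .replace("<|/think|>", "</think>")
        )
        parts.append(f"<|im_start|>{msg['role']}\n{content}<|im_end|>")
    parts.append("<|im_start|>assistant\n")  # leave the assistant turn open
    return "\n".join(parts)

print(build_chatml_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize the EXL2 quant format in two sentences."},
]))
```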
Also, I reposted this from /r/LocalLLaMA, as I feel the community generally should do going forward. Given its spirit, it seems like we should be on Lemmy instead?
See, you say this, but (with all due respect, as I get the perspective) you might vehemently disagree with me asserting that "boycotting AI is playing right into the tech bros' hands."
So yes, I agree with the irony, but I also feel like nuance and sub-arguments (from my perspective) get drowned out. Not every single Democrat politician is an oligarch, the US has done some good abroad, and using ML as a FOSS tool you own and host is not necessarily bad.