Throwaway4669332255

joined 2 years ago
[–] [email protected] 9 points 2 months ago (4 children)

But it'll trickle down? Right? RIGHT??

[–] [email protected] 15 points 6 months ago (1 children)

Idk man I've yet to know anyone who died from drinking magma.

[–] [email protected] 6 points 9 months ago (14 children)

How does the Nemo 12B compare to the Llama 3.1 8B?

[–] [email protected] 2 points 1 year ago

Apparently I am an idiot and read the wrong paper. The previous paper mentioned that it was "comparable with the 8-bit models".

https://huggingface.co/papers/2310.11453

[–] [email protected] 1 points 1 year ago (2 children)

They said theirs is "comparable with the 8-bit models". It's all tradeoffs, and it isn't clear to me where you should allocate your compute/memory budget. I've noticed that full 16-bit 7B models often produce better results for me than some much larger quantized models. It will be interesting to find the sweet spot.
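
Rough, hypothetical numbers just to illustrate that budget tradeoff (not from the thread): weight memory is roughly parameter count × bits per weight, so a full 16-bit 7B and a 4-bit ~30B land in the same ballpark.

```python
# Back-of-envelope only: weight memory ~= params * bits_per_weight / 8 bytes.
# Ignores KV cache, activations, and quantization overhead.

def weight_gb(params_billions: float, bits_per_weight: float) -> float:
    # 1e9 params * bits/8 bytes ~= that many GB
    return params_billions * bits_per_weight / 8

print(weight_gb(7, 16))   # full 16-bit 7B  -> ~14 GB
print(weight_gb(30, 4))   # 4-bit ~30B      -> ~15 GB
print(weight_gb(12, 8))   # 8-bit 12B       -> ~12 GB
```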

[–] [email protected] 2 points 1 year ago (4 children)

So are more bits less important than more parameters? Would a higher parameter count or a higher bit count matter more if the models ended up the same size?

[–] [email protected] 1 points 1 year ago

I'm so glad I work for a medium-small company. We moved to a smaller office and are only required to go in twice a month.

[–] [email protected] 1 points 1 year ago

Thank you! I had no idea this existed.

[–] [email protected] 21 points 1 year ago

I am surprised reddit hasn't removed this post yet.

I got an account banned for saying "lemmy dot world" when someone asked "Are there even any good alternatives?"

[–] [email protected] 7 points 1 year ago

He's opinionated but a pretty good science communicator.
