In addition to pushing something before it’s ready and where it’s not welcome, Apple’s own stinginess completely screwed them over.
What do LLMs need to be smart? RAM, both for their weights and holding real data to reference. What has apple relentlessly price gouged and skimped on for years? Yeah, I’ll give you one guess…
But its too slow for the weights. What generative models fundamentally do is run a full pass through the multi-gigabyte weights for every 'word' or diffusion step, so even 128-bit DDR5 like you find on desktop CPUs is too slow.
In addition to pushing something before it’s ready and where it’s not welcome, Apple’s own stinginess completely screwed them over.
What do LLMs need to be smart? RAM, both for their weights and holding real data to reference. What has apple relentlessly price gouged and skimped on for years? Yeah, I’ll give you one guess…
SSDs?
For RAG data? It works.
But its too slow for the weights. What generative models fundamentally do is run a full pass through the multi-gigabyte weights for every 'word' or diffusion step, so even 128-bit DDR5 like you find on desktop CPUs is too slow.