Just an FYI: Llama is float16 under the hood, I believe. Stable Diffusion is float32. Basically no machine learning model I've ever heard of is float64-based; the only people using float64 on GPUs are physicists and applied-math people on DOE supercomputers.
(Weirdly enough, commodity hardware is now at the opposite end of the problem: it used to struggle with float64 when people were trying to run physics models on it, and now it struggles with float16/float8 support as people try to run language models on it.)
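
Not part of the original point, but here's a minimal PyTorch sketch (using a stand-in linear layer, not an actual Llama checkpoint) of what that dtype difference looks like in practice:

```python
import torch

# Stand-in module, not a real model; just illustrates parameter dtypes.
model = torch.nn.Linear(4096, 4096)
print(next(model.parameters()).dtype)  # torch.float32 (PyTorch's default)

model = model.half()                   # cast weights to float16
print(next(model.parameters()).dtype)  # torch.float16

# float64 also exists (model.double()), but on consumer GPUs it runs at a
# small fraction of float32 throughput, which is why ML almost never uses it.
```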
This isn't an argument against the standard way of doing things; it's an argument for following the XDG Base Directory standard and using the XDG environment variables, rather than creating a new, unconfigurable directory in $HOME.
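
A minimal sketch of what "follow the XDG standard" means in practice (Python, with a hypothetical app name):

```python
import os
from pathlib import Path

def config_dir(app_name: str) -> Path:
    """Resolve the app's config directory per the XDG Base Directory spec:
    honor $XDG_CONFIG_HOME when set, otherwise fall back to ~/.config."""
    base = os.environ.get("XDG_CONFIG_HOME") or os.path.expanduser("~/.config")
    return Path(base) / app_name

print(config_dir("myapp"))  # e.g. /home/user/.config/myapp, or wherever the user pointed XDG_CONFIG_HOME
```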