this post was submitted on 18 Feb 2025
LocalLLaMA
Welcome to LocalLLaMA! Here we discuss running and developing machine learning models at home. Let's explore cutting-edge open-source neural network technology together.
Get support from the community! Ask questions, share prompts, discuss benchmarks, get hyped at the latest and greatest model releases! Enjoy talking about our awesome hobby.
As ambassadors of the self-hosting machine learning community, we strive to support each other and share our enthusiasm in a positive, constructive way.
founded 2 years ago
I've tried enabling Vulkan on my Intel laptop without a dedicated GPU, but that just makes everything slower.
Did you try running it on the CPU only (BLAS)? Or just on the faster, more modern GPUs, to get some sort of baseline to compare the numbers against? Or on the old GPU only, without the more modern ones in the mix? I mean, I don't really see the point here. If you attach a graphics card with only 1 GB of VRAM and the model needs about 8 GB, your computer must be splitting everything up and doing most of the compute somewhere else. I'm not sure whether the added complexity just makes it slower or whether the old card actually contributes something. I'm also not sure if I'm missing something, or if the output simply doesn't show how the model gets split up and what gets executed on which GPU.
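To put rough numbers on the splitting question, here is a back-of-the-envelope sketch in Python. It assumes the model is divided across cards in proportion to their VRAM, which is only an assumption for illustration and not necessarily what the backend in this thread does. The function name `split_by_vram`, the device names, and the 12 GB figure for the second card are all hypothetical; only the 1 GB card and the roughly 8 GB model come from the comment above.

```python
def split_by_vram(model_gb, vram_gb):
    """Split a model across devices in proportion to each device's VRAM.

    Returns (placed, spill): placed maps device name -> GB assigned to it,
    and spill is the number of GB that fit on no GPU and would have to be
    handled elsewhere (for example on the CPU).
    This is an illustrative assumption, not the behaviour of any real backend.
    """
    total = sum(vram_gb.values())
    placed = {}
    for name, capacity in vram_gb.items():
        # Proportional share, capped at what the card can actually hold.
        share = model_gb * capacity / total if total else 0.0
        placed[name] = min(share, capacity)
    spill = max(model_gb - sum(placed.values()), 0.0)
    return placed, spill


# Hypothetical mixed setup: the 1 GB card next to an assumed 12 GB card.
mixed, mixed_spill = split_by_vram(8.0, {"old_gpu": 1.0, "new_gpu": 12.0})

# The 1 GB card on its own with the same 8 GB model.
alone, alone_spill = split_by_vram(8.0, {"old_gpu": 1.0})

if __name__ == "__main__":
    print(mixed, mixed_spill)
    print(alone, alone_spill)
```

Under these assumptions the small card would hold well under a tenth of the model in the mixed setup, and on its own it would leave about 7 GB to be computed somewhere else, which is why a per-configuration baseline would be useful.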
There isn't one. I guess I should have made that clearer. Sorry. 🫤
Nope, just a guy with too much time on his hands. I mean, I hope someone out there found it a little informative. There are a lot of people thinking "If Ollama doesn't work then I'm out of luck." I'm just trying to let people know there are other options.
Yes, the Nvidia cards get 30+ t/s together or individually, but the point of this was to see if AMD and Nvidia could work together. Now that this works, I might actually buy an AMD GPU.