Article: https://proton.me/blog/deepseek
Calls it “Deepsneak”, failing to make it clear that the reason people love Deepseek is that you can download and it run it securely on any of your own private devices or servers - unlike most of the competing SOTA AIs.
I can’t speak for Proton, but the last couple weeks are showing some very clear biases coming out.
im not an expert at criticism, but I think its fair from their part.
I mean, can you remind me what are the hardware requirements to run deepseek locally?
oh, you need a high-end graphics card with at least 8 GB VRAM for that*? for the highly distilled variants! for more complete ones you need multiple such graphics card interconnected! how do you even do that with more than 2 cards on a consumer motherboard??
how many do you think have access to such a system, I mean even 1 high-end gpu with just 8 GB VRAM, considering that more and more people only have a smartphone nowadays, but also that these are very expensive even for gamers?
and as you will read in the 2nd referenced article below, memory size is not the only factor: the distill requiring only 1 GB VRAM still requires a high-end gpu for the model to be usable.
https://www.tomshardware.com/tech-industry/artificial-intelligence/amd-released-instructions-for-running-deepseek-on-ryzen-ai-cpus-and-radeon-gpus
https://bizon-tech.com/blog/how-to-run-deepseek-r1-locally-a-free-alternative-to-openais-o1-model-hardware-requirements#a6
https://codingmall.com/knowledge-base/25-global/240733-what-are-the-system-requirements-for-running-deepseek-models-locally
so my point is that when talking about deepseek, you can’t ignore how they operate their online service, as most people will only be able to try that.
I understand that recently it’s very trendy, and cool to shit on Proton, but they have a very strong point here.
Here is how you can run the 671B model without using graphic cards for about $6.000. Here is the post on X.
The 1.5B version that can be run basically on anything. My friend runs it in his shitty laptop with 512MB iGPU and 8GB of RAM (inference takes 30 seconds)
You don’t even need a GPU with good VRAM, as you can offload it to RAM (slower inference, though)
I’ve run the 14B version on my AMD 6700XT GPU and it only takes ~9GB of VRAM (inference over 1k tokens takes 20 seconds). The 8B version takes around 5-6GB of VRAM (inference over 1k tokens takes 5 seconds)
The numbers in your second link are waaaaaay off.
Just because the average consumer doesn’t have the hardware to use it in a private manner does not mean it’s not achievable. The article straight up pretends self hosting doesn’t exist.