Introduction
Back when DeepSeek went viral, I was curious to see if it could be run natively on a supercomputer, specifically Setonix. The DeepSeek team provides several deployment methods. But, as you might expect, installing software on a supercomputer is difficult — you’re typically limited to using the pre-installed modules. Fortunately, most supercomputers support containerised workflows.
After many rounds of testing (side note: SGLang’s Docker image was missing dependencies — not exactly plug-and-play), I eventually decided to use the AMD-packaged vLLM Docker image, since the GPU in Setonix is the MI250X.
This post mainly serves as a command log so I can refer back to it later. In the end, I wasn’t successful in running the full version of DeepSeek, because it ships with 8-bit (FP8) quantised weights, which aren’t supported on the MI250X; that requires an MI300-series GPU. It’s theoretically possible to convert the model to 16-bit and run it across four nodes, but that’ll have to wait for another time.
Step-by-step Instructions
Load the Singularity module:
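The exact module name and version are site-specific, so the version below is only a placeholder; check what the system provides first:

```bash
# See which Singularity versions the site provides
module avail singularity

# Load one of the listed versions (placeholder version shown)
module load singularity/4.1.0
```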
Pull the Docker image:
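As a sketch, assuming the AMD-packaged vLLM image is the `rocm/vllm` repository on Docker Hub (treat the tag as a placeholder and pick one matching your ROCm and vLLM versions):

```bash
# Pull the ROCm vLLM image from Docker Hub and convert it to a Singularity image
singularity pull docker://rocm/vllm:latest
```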
After the pull completes, a .sif file will be created in the current directory.
Run the container:
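A minimal sketch of the run command, assuming the image was pulled as vllm_latest.sif and the model weights live in a bind-mounted directory; the paths, model name, and parallelism setting are illustrative, not a record of the exact invocation:

```bash
# Run the container with ROCm GPU access (--rocm) and bind-mount the model directory.
# Paths, model name, and tensor-parallel size are illustrative; a Setonix GPU node
# exposes 8 MI250X GCDs, hence the value of 8 here.
singularity exec --rocm \
    --bind /scratch/models:/models \
    vllm_latest.sif \
    vllm serve /models/DeepSeek-R1 --tensor-parallel-size 8
```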