# Running ollama on Pelle

## Summary
- Ollama needs to be installed in the project folder.
- The location of the ollama models needs to be redirected to a folder within the project: `export OLLAMA_MODELS=/proj/naiss-XXXXX/....`
- `ollama serve` must run on an allocated GPU or CPU node.
- Interaction with your `ollama serve` can be done:
    - on the allocated node itself
    - on the login node
    - on your own computer
- Running non-interactive tasks
## User installation of ollama
- Log in to Pelle and select a folder under your project allocation.
- Download ollama.
- Make a folder where you will unpack it.

Your `ollama` binary is now in `/proj/naiss-XXXXX/userid/nobackup/bin/ollama`.
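The steps above might look like the following; the download URL is the standard ollama Linux tarball, and the project and user IDs in the paths are placeholders to replace with your own:

```shell
# Go to your project folder (adjust naiss-XXXXX and userid to your own)
cd /proj/naiss-XXXXX/userid/nobackup

# Download the Linux build of ollama
curl -L -o ollama-linux-amd64.tgz \
    https://ollama.com/download/ollama-linux-amd64.tgz

# Unpack; this creates bin/ollama (and lib/ollama) in the current folder
tar -xzf ollama-linux-amd64.tgz

# Optionally put the binary on your PATH for this session
export PATH=/proj/naiss-XXXXX/userid/nobackup/bin:$PATH
ollama --version
```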
## Running ollama on a compute node
- Start an interactive node with GPU, see https://docs.uppmax.uu.se/cluster_guides/slurm_on_pelle/

    ```shell
    interactive -A naiss-XXXXX -p gpu -t 1:00:00 --gpus=l40s:1
    # or on the old haswell nodes
    interactive -A naiss-XXXXX -p haswell -c 16 -t 1:00:00 --gpus=t4:1
    ```

    Take note of the JOBID (`4068884`) and the node address (`p2033`):

    ```
    # example output from running the last command
    salloc: Pending job allocation 4068884
    salloc: job 4068884 queued and waiting for resources
    salloc: job 4068884 has been allocated resources
    salloc: Granted job allocation 4068884
    salloc: Waiting for resource configuration
    salloc: Nodes p2033 are ready for job
    ```
- Start `ollama serve` in the interactive session. Check at the end of the log that ollama discovered the GPU:

    ```
    ...
    time=2026-03-20T07:58:02.030+01:00 level=INFO source=types.go:42 msg="inference compute" id=GPU-92ada8a8-9cda-effe-830e-0bdfcd3e0a96 filter_id="" library=CUDA compute=7.5 name=CUDA0 description="Tesla T4" libdirs=ollama,cuda_v13 driver=13.1 pci_id=0000:81:00.0 type=discrete total="15.0 GiB" available="14.6 GiB"
    time=2026-03-20T07:58:02.030+01:00 level=INFO source=routes.go:1832 msg="vram-based default context" total_vram="15.0 GiB" default_num_ctx=4096
    ```

- Find the JOBID of the reservation if you lost track of it.
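These two steps could be sketched as follows; the model path is an example for your project, and `squeue --me` is standard Slurm for listing your own jobs:

```shell
# On the allocated node: point ollama at the project model folder,
# then start the service (paths are examples -- adjust to your project)
export OLLAMA_MODELS=/proj/naiss-XXXXX/userid/nobackup/ollama_models
/proj/naiss-XXXXX/userid/nobackup/bin/ollama serve

# On a login node, if you lost track of the JOBID and node:
squeue --me
```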
- Get an additional shell in the same allocation. This will start a bash shell in the same job reservation.
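One way to do this with standard Slurm, using the example JOBID `4068884` from the `salloc` output above:

```shell
# Attach a second shell to the existing job reservation;
# --overlap lets this step share resources with the running session
srun --jobid=4068884 --overlap --pty bash
```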
- Test that you can communicate with the ollama service:

    ```
    /proj/naiss-XXXXXX/userid/nobackup/ollama/bin/ollama list
    NAME                     ID              SIZE      MODIFIED
    qwen2.5-coder:14b        9ec8897f747e    9.0 GB    23 hours ago
    gemma3:latest            a2af6cc3eb7f    3.3 GB    36 hours ago
    mistral:latest           6577803aa9a0    4.4 GB    37 hours ago
    llama3.2-vision:latest   6f2f9757ae97    7.8 GB    37 hours ago
    llama4:latest            bf31604e25c2    67 GB     39 hours ago
    llama3.2:latest          a80c4f17acd5    2.0 GB    39 hours ago
    llama4:scout             bf31604e25c2    67 GB     40 hours ago
    ```
## Expose ollama serve to the Pelle login node or your personal computer

Warning: this approach exposes the ollama instance to everyone on Pelle. Use with caution.
- Start an interactive node with GPU as before, see https://docs.uppmax.uu.se/cluster_guides/slurm_on_pelle/
- On the node, start `ollama serve` with the additional setting `export OLLAMA_HOST=0.0.0.0`.
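A minimal sketch of that second step; the paths are examples for your project:

```shell
# Listen on all network interfaces of the compute node instead of only
# localhost, so the login node (and SSH tunnels) can reach the service
export OLLAMA_HOST=0.0.0.0
export OLLAMA_MODELS=/proj/naiss-XXXXX/userid/nobackup/ollama_models
/proj/naiss-XXXXX/userid/nobackup/bin/ollama serve
```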
- On the login node, you can check on which node you got the allocation.
- Run the ollama client on the Pelle login node.
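These two steps could look like this; `p2033` and the binary path are examples, and 11434 is ollama's default port:

```shell
# On the login node: find which node runs your job ...
squeue --me
# ... then point the ollama client at that node's default port
export OLLAMA_HOST=p2033:11434   # replace p2033 with your node
/proj/naiss-XXXXX/userid/nobackup/bin/ollama list
```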
- Run on your personal computer, with `ollama serve` running on a Pelle GPU node.

    To expose `ollama serve` to your personal computer, run an SSH tunnel in a shell. Note that in the example we select port 21434 on the local computer, to avoid potential conflicts with local ollama installations running on your computer. Make sure you change `p2034` to the node with your `ollama serve`. On your computer, make sure you connect to `localhost:21434` when running ollama.

    ```
    NAME                     ID              SIZE      MODIFIED
    qwen2.5-coder:14b        9ec8897f747e    9.0 GB    23 hours ago
    gemma3:latest            a2af6cc3eb7f    3.3 GB    36 hours ago
    mistral:latest           6577803aa9a0    4.4 GB    37 hours ago
    llama3.2-vision:latest   6f2f9757ae97    7.8 GB    37 hours ago
    llama4:latest            bf31604e25c2    67 GB     39 hours ago
    llama3.2:latest          a80c4f17acd5    2.0 GB    39 hours ago
    llama4:scout             bf31604e25c2    67 GB     40 hours ago
    ```
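The tunnel described above might look like the following sketch; the login-node hostname `pelle.uppmax.uu.se` and the username are assumptions to adapt to your account:

```shell
# Forward local port 21434 to port 11434 (ollama's default) on the
# compute node p2034, jumping through the Pelle login node
ssh -L 21434:p2034:11434 username@pelle.uppmax.uu.se

# Then, in another shell on your own computer:
export OLLAMA_HOST=localhost:21434
ollama list
```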
## Running non-interactive jobs

Example `sbatch submit.sh` template:
```shell
#!/bin/bash -l
#SBATCH -A naiss-XXXX
#SBATCH -p gpu --gpus=l40s:1
#SBATCH -t 10:00:00

# Get the first free port within a range
OPORT=$(comm -23 <(seq 21434 21444 | sort) <(ss -tan | awk '{print $4}' | cut -d':' -f2 | sort -u) | head -n 1)
export OLLAMA_HOST=localhost:$OPORT

# Start the service
export OLLAMA_MODELS=/proj/naiss-XXXXXX/user/nobackup/ollama_models
ollama serve &

# Wait a bit for the service to be ready
sleep 5

# Run!
ollama list

# ... or your python program
source venv/bin/activate
python myprogram.py
```
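From `myprogram.py`, the service started above can be reached over ollama's HTTP `/api/generate` endpoint. Below is a minimal stdlib-only sketch; the model name `llama3.2` matches one of the models listed earlier, and `OLLAMA_HOST` falls back to ollama's usual `localhost:11434` if unset:

```python
import json
import os


def generate_request(model: str, prompt: str) -> tuple:
    """Build the URL and JSON body for ollama's /api/generate endpoint.

    stream=False asks the server for a single JSON reply instead of
    a stream of chunks.
    """
    host = os.environ.get("OLLAMA_HOST", "localhost:11434")
    url = f"http://{host}/api/generate"
    body = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return url, body.encode()


if __name__ == "__main__":
    # Send the request with urllib (no third-party packages needed)
    from urllib.request import Request, urlopen

    url, body = generate_request("llama3.2", "Say hello in one word.")
    req = Request(url, data=body,
                  headers={"Content-Type": "application/json"})
    with urlopen(req) as resp:
        print(json.loads(resp.read())["response"])
```

Within the sbatch script this runs against the `ollama serve` started a few lines earlier, because both see the same `OLLAMA_HOST`.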