GPU Jobs

Exercises

  1. Request 1 GPU and verify the allocation

Submit a job to the GPU partition requesting 1 GPU. Inside the job, run nvidia-smi to confirm a GPU was allocated, and print $CUDA_VISIBLE_DEVICES to see which GPU device was assigned.

Hint / Solution
sbatch --partition=gpu --gres=gpu:1 --cpus-per-task=4 --mem=16G \
    --time=00:10:00 --output=gpu_test_%j.out --job-name=gpu_test \
    --wrap='echo "CUDA_VISIBLE_DEVICES=$CUDA_VISIBLE_DEVICES"; nvidia-smi'
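The same request can also be written as a batch script instead of a --wrap one-liner; a minimal sketch using the same resource values as above (adjust the partition name and limits to match your cluster):

```shell
#!/bin/bash
#SBATCH --partition=gpu
#SBATCH --gres=gpu:1
#SBATCH --cpus-per-task=4
#SBATCH --mem=16G
#SBATCH --time=00:10:00
#SBATCH --output=gpu_test_%j.out
#SBATCH --job-name=gpu_test

# Slurm sets CUDA_VISIBLE_DEVICES to the GPU indices allocated to this job
echo "CUDA_VISIBLE_DEVICES=$CUDA_VISIBLE_DEVICES"
nvidia-smi
```

Save it as, e.g., gpu_test.sh and submit with sbatch gpu_test.sh. This form only runs meaningfully on a GPU node.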
  2. Request a specific GPU type

First, check what GPU types are available on your cluster using sinfo. Then submit a job requesting a specific GPU type (e.g., A100 or V100).

Hint / Solution
# Check available GPU types
sinfo -p gpu -o "%N %G"

# Request a specific type (adjust the type to match your cluster)
sbatch --partition=gpu --gres=gpu:a100:1 --cpus-per-task=4 --mem=16G \
    --time=00:10:00 --output=gpu_type_%j.out --job-name=gpu_type \
    --wrap='nvidia-smi --query-gpu=name --format=csv,noheader'
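The GRES strings that sinfo prints with %G have the form gpu:&lt;type&gt;:&lt;count&gt;; a small sketch of pulling the type and per-node count out of one such string (the value below is illustrative, not from a live cluster):

```shell
# Example GRES string in the form "sinfo -o %G" prints (illustrative value)
gres="gpu:a100:4"

# Field 2 (colon-separated) is the GPU type, field 3 the count per node
gpu_type=$(echo "$gres" | cut -d: -f2)
gpu_count=$(echo "$gres" | cut -d: -f3)
echo "type=$gpu_type count=$gpu_count"
```

The type field ("a100" here) is exactly the string you pass in --gres=gpu:a100:1.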
  3. Check CUDA_VISIBLE_DEVICES with multiple GPUs

Submit a job requesting 2 GPUs. Inside the job, print $CUDA_VISIBLE_DEVICES and use nvidia-smi to list the allocated GPU details (name, memory, index).

Hint / Solution
sbatch --partition=gpu --gres=gpu:2 --cpus-per-task=8 --mem=32G \
    --time=00:10:00 --output=multi_gpu_%j.out --job-name=multi_gpu \
    --wrap='echo "CUDA_VISIBLE_DEVICES=$CUDA_VISIBLE_DEVICES"; nvidia-smi --query-gpu=index,name,memory.total --format=csv'
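Inside a multi-GPU job, CUDA_VISIBLE_DEVICES is a comma-separated list of device indices, so you can count your allocated GPUs from it; a sketch using an example value Slurm might set for a 2-GPU job (illustrative, not from a live job):

```shell
# Example value as Slurm might set it for a 2-GPU job (illustrative)
CUDA_VISIBLE_DEVICES="0,1"

# Count the comma-separated device indices
ngpu=$(echo "$CUDA_VISIBLE_DEVICES" | tr ',' '\n' | wc -l)
echo "visible GPUs: $ngpu"
```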
  4. Use sacct to see GPU allocation details

After a GPU job completes, use sacct with the AllocTRES format field to see exactly what trackable resources (including GPUs) were allocated. Compare this with the CPU and memory allocation.

Hint / Solution
# After your GPU job completes:
sacct -j <jobid> --format=JobID,JobName,AllocTRES%60,Elapsed,State

# The AllocTRES column will show something like:
# billing=12,cpu=4,gres/gpu=1,mem=16G,node=1
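The AllocTRES value is itself a comma-separated key=value list, so specific resources can be pulled out in the shell; a sketch parsing the example string shown above (illustrative, not from a live cluster):

```shell
# Example AllocTRES string as shown in the sacct output above (illustrative)
alloc="billing=12,cpu=4,gres/gpu=1,mem=16G,node=1"

# Split on commas, then pick out the GPU and CPU counts by key
gpus=$(echo "$alloc" | tr ',' '\n' | awk -F= '/^gres\/gpu=/ {print $2}')
cpus=$(echo "$alloc" | tr ',' '\n' | awk -F= '/^cpu=/ {print $2}')
echo "gpu=$gpus cpu=$cpus"
```

This makes the comparison the exercise asks for explicit: 1 GPU against 4 CPUs and 16G of memory for the example job.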
