Configuration
Exercises
- Find a specific configuration parameter with scontrol
Use scontrol show config to determine the current value of SchedulerParameters. Identify which backfill settings are active on the running cluster.
Hint / Solution
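One possible approach, sketched under a few assumptions: the helper name backfill_settings is hypothetical, and it relies on scontrol show config printing one "Key = value" pair per line. It keeps only the comma-separated SchedulerParameters entries that start with bf_, the prefix Slurm uses for backfill options. Those options only matter when SchedulerType is sched/backfill, so the sketch prints that line as well.

```shell
# Hypothetical helper: reads `scontrol show config` output on stdin and
# prints each backfill (bf_*) entry of SchedulerParameters on its own line.
backfill_settings() {
    grep '^SchedulerParameters' \
        | cut -d= -f2- \
        | tr ',' '\n' \
        | sed 's/^[[:space:]]*//' \
        | grep '^bf_'
}

# Query the live controller only where scontrol is installed.
if command -v scontrol >/dev/null 2>&1; then
    scontrol show config | grep -E '^(SchedulerType|SchedulerParameters)'
    scontrol show config | backfill_settings || echo "no bf_* options set"
fi
```

Any backfill option that does not appear in the list is running at its built-in default.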
- Change the slurmctld debug level temporarily
Increase the controller debug level to debug2 for troubleshooting, verify it took effect, then set it back to info. Do this without editing slurm.conf or restarting any daemon.
Hint / Solution
# Increase debug level (takes effect immediately)
scontrol setdebug debug2
# Verify the change
scontrol show config | grep SlurmctldDebug
# Should show: SlurmctldDebug = debug2
# Watch logs briefly to confirm verbose output
# (the log path is set by SlurmctldLogFile and may differ on your cluster)
tail -5 /var/log/slurm/slurmctld.log
# Reset to normal
scontrol setdebug info
scontrol show config | grep SlurmctldDebug
- Add a new node definition
Add a new high-memory node bigmem5 with 128 CPUs and 2 TB of RAM to slurm.conf, place it in the highmem partition, and apply the change. Verify the node appears in sinfo.
Hint / Solution
# Edit slurm.conf -- add to the node definitions section (RealMemory is in megabytes):
# NodeName=bigmem5 CPUs=128 RealMemory=2048000
# Update the highmem partition line:
# PartitionName=highmem Nodes=bigmem[1-4],bigmem5 ...
# Make sure every node sees the updated slurm.conf, then apply the change.
# Slurm releases before 23.02 need a restart to add a node; newer releases also accept scontrol reconfigure.
systemctl restart slurmctld
# On the new node, start slurmd
ssh bigmem5 systemctl start slurmd
# Verify
sinfo -n bigmem5
scontrol show node bigmem5
- Reconfigure the cluster without a restart
Change DefaultTime on the batch partition from 01:00:00 to 02:00:00 in slurm.conf, then apply the change using scontrol reconfigure. Verify the new default is active.
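A minimal sketch of one way to approach this exercise. The helper name set_default_time and the /etc/slurm/slurm.conf path are assumptions, not part of the exercise text, and the helper assumes the batch partition is defined on a single line that already sets DefaultTime. It keeps a .bak copy of the file before editing.

```shell
# Hypothetical helper: rewrite DefaultTime on one partition line of a slurm.conf.
#   $1 = partition name, $2 = new DefaultTime, $3 = path to slurm.conf
set_default_time() {
    sed -i.bak -E \
        "/^PartitionName=$1([[:space:]]|\$)/ s/DefaultTime=[^[:space:]]+/DefaultTime=$2/" \
        "$3"
}

# Typical sequence on the controller (the config path is an assumption):
#   set_default_time batch 02:00:00 /etc/slurm/slurm.conf
#   scontrol reconfigure
#   scontrol show partition batch | grep -o 'DefaultTime=[^ ]*'
```

If the last command reports DefaultTime=02:00:00, the new default is active. It applies to jobs submitted without an explicit time limit after the reconfigure.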