# Slurm Overview

## What is Slurm?

**Slurm** (Simple Linux Utility for Resource Management) is an open-source workload manager for Linux clusters. It is the dominant job scheduler in HPC, running on many of the world's largest supercomputers and thousands of research clusters.

Slurm provides three key functions:

1. **Resource allocation** -- grants users exclusive or shared access to compute nodes for a duration of time
2. **Job execution** -- provides a framework for starting, running, and monitoring work on allocated nodes
3. **Queue management** -- arbitrates contention for resources by managing a queue of pending jobs

---

## Key Concepts

### Nodes

The physical or virtual servers in a cluster. Slurm categorizes them by role:

- **Head/Login node** -- where users log in, write scripts, and submit jobs (does not run compute jobs)
- **Compute nodes** -- where jobs actually execute
- **Management node** -- runs the Slurm controller daemon

### Partitions

Partitions are **groups of compute nodes** organized for different purposes. They replace the "queue" concept from other schedulers (SGE, PBS). Examples:

| Partition | Purpose | Typical Limits |
|-----------|---------|----------------|
| `batch` | General-purpose compute | 7-day max walltime |
| `gpu` | GPU-equipped nodes | GPU jobs only |
| `highmem` | Large-memory nodes | 1 TB+ RAM |
| `interactive` | Short interactive sessions | 4-hour max, 1 node |
| `debug` | Quick testing | 30-min max, 2 nodes |

Your cluster administrators define which partitions are available.

### Jobs

A **job** is a resource allocation request plus the work to be done. Jobs can be:

- **Batch jobs** -- scripts submitted with `sbatch` that run unattended
- **Interactive jobs** -- real-time sessions obtained with `salloc` or `srun`
- **Job arrays** -- many instances of the same job with different parameters

### Job Steps

Within a job allocation, you can run one or more **steps** using `srun`.
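The job and job-step concepts can be sketched as a short batch script. This is a hypothetical example, not a template for any specific cluster: the partition name `batch` and the task counts are assumptions that vary by site.

```shell
#!/bin/bash
#SBATCH --job-name=demo        # job name shown in squeue
#SBATCH --partition=batch      # assumed partition name; check `sinfo` on your cluster
#SBATCH --nodes=1
#SBATCH --ntasks=4             # tasks available to job steps
#SBATCH --time=00:10:00        # walltime limit (HH:MM:SS)
#SBATCH --output=%x_%j.out     # %x = job name, %j = job ID

# Each `srun` invocation below launches a job step inside the allocation.
srun --ntasks=4 hostname                  # step 0: uses all 4 tasks
srun --ntasks=1 echo "post-processing"    # step 1: uses a single task
```

Submitted with `sbatch demo.sh`, the script runs unattended once the scheduler grants the allocation.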
Each step can use a subset of the job's allocated resources.

### Accounts

Accounts group users for **resource tracking and fairshare**. Your administrator assigns you to one or more accounts, which map to labs, departments, or projects.

---

## Architecture at a Glance

Slurm runs as a set of daemons (background services):

| Daemon | Runs On | Purpose |
|--------|---------|---------|
| `slurmctld` | Management node | Central controller -- manages jobs, nodes, partitions |
| `slurmd` | Each compute node | Node daemon -- launches and monitors jobs |
| `slurmdbd` | Database server | Accounting daemon -- stores job history in MySQL/MariaDB |
| `slurmrestd` | API server (optional) | REST API for programmatic access |
| `sackd` | Login nodes (optional) | Authentication on configless login nodes |

```
Users
  |
  v
[Login Node] --submit--> [slurmctld] --dispatch--> [slurmd on compute nodes]
                              |
                              v
                         [slurmdbd] --> [MySQL/MariaDB]
```

As a user, you only interact with the login node. The rest is handled transparently.

---

## Slurm vs. Other Schedulers

| Feature | Slurm | SGE/UGE | PBS/Torque | LSF |
|---------|-------|---------|------------|-----|
| License | Open source (GPL) | Mixed | Open/Commercial | Commercial |
| Scalability | Millions of cores | Thousands | Thousands | Millions |
| Market position | Dominant in HPC | Legacy/declining | Common in academia | Enterprise/finance |
| Cloud support | ParallelCluster, PCS, GCP | Limited | Limited | IBM Cloud |
| GPU scheduling | Native GRES | Add-on | Basic | Native |

Slurm has become the standard for new HPC deployments, particularly in the life sciences, because of its open-source license, scalability, and native GPU and cloud support.

---

## Where Slurm Runs

### On-Premise Clusters

Traditional bare-metal servers in a data center, connected by a high-speed network with shared storage. This is the classic HPC deployment.

### AWS ParallelCluster

A self-managed, open-source tool that deploys Slurm on AWS.
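For orientation, a trimmed ParallelCluster (v3-style) configuration might look like the sketch below. The region, instance types, queue name, and subnet IDs are placeholders, not recommendations:

```yaml
Region: us-east-1
Image:
  Os: alinux2
HeadNode:
  InstanceType: t3.medium          # login/head node
  Networking:
    SubnetId: subnet-xxxxxxxx      # placeholder subnet
Scheduling:
  Scheduler: slurm
  SlurmQueues:
    - Name: batch                  # appears to users as a Slurm partition
      ComputeResources:
        - Name: c5-compute
          InstanceType: c5.xlarge
          MinCount: 0              # scale to zero when idle
          MaxCount: 10
      Networking:
        SubnetIds:
          - subnet-xxxxxxxx
```

Each entry under `SlurmQueues` behaves like an ordinary Slurm partition from the user's point of view.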
Key feature: **dynamic scaling** -- compute nodes launch on demand when jobs are submitted and terminate when idle, so you only pay for what you use.

### AWS PCS (Parallel Computing Service)

A fully managed AWS service that provides Slurm without the operational overhead of running the controller infrastructure yourself.

> **ParallelCluster Note:** ParallelCluster supports **scale-to-zero** -- compute nodes are launched on demand and terminated when idle. This means lower cost but a short startup delay when new nodes must spin up.

> **PCS Note:** PCS abstracts away the Slurm control plane entirely. AWS manages the controller infrastructure, so you focus on submitting jobs rather than maintaining daemons.

> **The core Slurm commands and concepts are the same across all three environments.** Deployment-specific differences are called out throughout this training.

---

## Essential Commands Preview

You'll learn these commands in detail in upcoming modules:

| Command | Purpose | SGE Equivalent |
|---------|---------|----------------|
| `sbatch` | Submit a batch job | `qsub` |
| `squeue` | View job queue | `qstat` |
| `scancel` | Cancel a job | `qdel` |
| `sinfo` | View cluster/partition status | `qhost` |
| `sacct` | View completed job info | `qacct` |
| `srun` | Run interactive commands/steps | `qrsh` |
| `salloc` | Request interactive allocation | `qlogin` |
| `scontrol` | View/modify job and system details | `qstat -j` / `qalter` |

---

## References

- SchedMD: Slurm Overview
- SchedMD: Quick Start User Guide
- SchedMD: Rosetta Stone of Workload Managers
- AWS ParallelCluster Slurm Overview