# SGE to Slurm Migration Guide

This guide is for users and administrators transitioning from Sun Grid Engine (SGE), Son of Grid Engine, Open Grid Scheduler, or Univa Grid Engine to Slurm.

## Command Mapping

| Action | SGE | Slurm | Notes |
|--------|-----|-------|-------|
| Submit batch job | `qsub script.sh` | `sbatch script.sh` | |
| Submit binary directly | `qsub -b y /path/to/binary` | `sbatch --wrap="/path/to/binary"` | Slurm prefers scripts |
| Delete job | `qdel job_id` | `scancel job_id` | |
| Delete all my jobs | `qdel -u $USER` | `scancel --me` | |
| Job status (active) | `qstat` | `squeue` | |
| Job status (my jobs) | `qstat -u $USER` | `squeue --me` | |
| Detailed job info | `qstat -j job_id` | `scontrol show job job_id` | |
| Completed job info | `qacct -j job_id` | `sacct -j job_id` | |
| Queue/partition info | `qstat -f` | `sinfo` | |
| Node/host info | `qhost` | `sinfo -N -l` | |
| Interactive shell | `qlogin` | `srun --pty bash` | |
| Run command interactively | `qrsh command` | `srun command` | |
| Interactive with X11 | `qsh` | `srun --pty --x11 bash` | |
| Modify pending job | `qalter job_id` | `scontrol update JobId=job_id` | |
| Hold job | `qhold job_id` | `scontrol hold job_id` | |
| Release held job | `qrls job_id` | `scontrol release job_id` | |
| View queue config | `qconf -sq queue_name` | `scontrol show partition partition_name` | |
| List all queues | `qconf -sql` | `sinfo` or `scontrol show partition` | |
| Admin config tool | `qconf` | `sacctmgr` / `scontrol` | |

## Job Script Directive Mapping

| SGE Directive | Slurm Directive | Description |
|---------------|-----------------|-------------|
| `#$ -N name` | `#SBATCH --job-name=name` | Job name |
| `#$ -o path` | `#SBATCH --output=path` | Stdout file |
| `#$ -e path` | `#SBATCH --error=path` | Stderr file |
| `#$ -j y` | `#SBATCH --output=file` | Merge stdout/stderr (Slurm's default when no `--error` is given) |
| `#$ -q queue` | `#SBATCH --partition=partition` | Target queue/partition |
| `#$ -l h_rt=HH:MM:SS` | `#SBATCH --time=HH:MM:SS` | Walltime limit |
| `#$ -l mem_free=4G` | `#SBATCH --mem=4G` | Memory request (per slot in SGE, per node in Slurm) |
| `#$ -l h_vmem=4G` | `#SBATCH --mem=4G` | Memory limit (`--mem-per-cpu` is often the closer match, since SGE limits are per slot) |
| `#$ -pe smp N` | `#SBATCH --cpus-per-task=N` | Threads (shared memory) |
| `#$ -pe openmpi N` | `#SBATCH --ntasks=N` | MPI processes |
| `#$ -l exclusive=true` | `#SBATCH --exclusive` | Exclusive node |
| `#$ -l gpu=1` | `#SBATCH --gres=gpu:1` | GPU request |
| `#$ -t 1-100` | `#SBATCH --array=1-100` | Job array |
| `#$ -tc 50` | `#SBATCH --array=1-100%50` | Array throttle |
| `#$ -P project` | `#SBATCH --account=account` | Project/account |
| `#$ -M email` | `#SBATCH --mail-user=email` | Email address |
| `#$ -m bea` | `#SBATCH --mail-type=BEGIN,END,FAIL` | Email notifications |
| `#$ -m n` | `#SBATCH --mail-type=NONE` | No email |
| `#$ -cwd` | *(default behavior)* | Use current working directory |
| `#$ -V` | `#SBATCH --export=ALL` | Export all environment (Slurm default) |
| `#$ -v VAR=val` | `#SBATCH --export=VAR=val` | Export specific variables |
| `#$ -S /bin/bash` | `#!/bin/bash` | Shell specification |
| `#$ -hold_jid job_id` | `#SBATCH --dependency=afterok:job_id` | Job dependency (`afterany` matches SGE's release-on-any-exit behavior) |
| `#$ -hard -l resource=val` | `#SBATCH --constraint=feature` | Hard resource request |
| `#$ -soft -l resource=val` | `#SBATCH --prefer=feature` | Soft resource request |

## Environment Variable Mapping

| SGE Variable | Slurm Variable | Description |
|--------------|----------------|-------------|
| `$JOB_ID` | `$SLURM_JOB_ID` | Job ID |
| `$JOB_NAME` | `$SLURM_JOB_NAME` | Job name |
| `$SGE_TASK_ID` | `$SLURM_ARRAY_TASK_ID` | Array task index |
| `$SGE_TASK_FIRST` | `$SLURM_ARRAY_TASK_MIN` | First array index |
| `$SGE_TASK_LAST` | `$SLURM_ARRAY_TASK_MAX` | Last array index |
| `$SGE_TASK_STEPSIZE` | `$SLURM_ARRAY_TASK_STEP` | Array step size |
| `$HOSTNAME` | `$SLURMD_NODENAME` | Execution hostname |
| `$NHOSTS` | `$SLURM_JOB_NUM_NODES` | Number of hosts |
| `$NSLOTS` | `$SLURM_NTASKS` | Number of slots/tasks |
| `$PE_HOSTFILE` | `$SLURM_JOB_NODELIST` | Host list (format differs) |
| `$TMPDIR` | `$TMPDIR` or `/tmp` | Temp directory (site-specific) |
| `$SGE_O_WORKDIR` | `$SLURM_SUBMIT_DIR` | Submission directory |

## Job Script Translation: Side by Side

### SGE Version

```bash
#!/bin/bash
#$ -N blast_analysis
#$ -o logs/blast_$JOB_ID.out
#$ -e logs/blast_$JOB_ID.err
#$ -q all.q
#$ -l h_rt=04:00:00
#$ -l mem_free=8G
#$ -pe smp 16
#$ -cwd
#$ -V

source /common/sge/default/common/settings.sh
module load blast/2.15

echo "Running on host: $HOSTNAME"
echo "Job ID: $JOB_ID"

blastn -query input.fasta \
       -db /data/nt \
       -out results.txt \
       -num_threads $NSLOTS \
       -evalue 1e-5
```

### Slurm Version

```bash
#!/bin/bash
#SBATCH --job-name=blast_analysis
#SBATCH --output=logs/blast_%j.out
#SBATCH --error=logs/blast_%j.err
#SBATCH --partition=batch
#SBATCH --time=04:00:00
#SBATCH --mem=8G
#SBATCH --cpus-per-task=16

module purge
module load blast/2.15

echo "Running on host: $(hostname)"
echo "Job ID: $SLURM_JOB_ID"

blastn -query input.fasta \
       -db /data/nt \
       -out results.txt \
       -num_threads $SLURM_CPUS_PER_TASK \
       -evalue 1e-5
```

### Key Changes

1. `#$` becomes `#SBATCH`
2. `$JOB_ID` becomes `$SLURM_JOB_ID` in the script and `%j` in output filenames
3. `$NSLOTS` becomes `$SLURM_CPUS_PER_TASK` (or `$SLURM_NTASKS` for MPI)
4. `-pe smp N` becomes `--cpus-per-task=N`
5. `-cwd` is default behavior in Slurm (no flag needed)
6. No need to source Slurm settings (the environment is set up automatically)
7. `--time` is generally required in Slurm (no equivalent of SGE's unlimited default)

### Array Job Translation

**SGE:**

```bash
#!/bin/bash
#$ -N array_job
#$ -t 1-100
#$ -tc 50
#$ -o logs/task_$TASK_ID.out

INPUT=samples/sample_${SGE_TASK_ID}.fasta
./analyze.sh $INPUT > results/result_${SGE_TASK_ID}.txt
```

**Slurm:**

```bash
#!/bin/bash
#SBATCH --job-name=array_job
#SBATCH --array=1-100%50
#SBATCH --output=logs/task_%a.out

INPUT=samples/sample_${SLURM_ARRAY_TASK_ID}.fasta
./analyze.sh $INPUT > results/result_${SLURM_ARRAY_TASK_ID}.txt
```

## Concept Mapping

| SGE Concept | Slurm Equivalent | Key Differences |
|-------------|------------------|-----------------|
| Queue | Partition | Nodes can be in multiple partitions |
| Parallel Environment (PE) | `--ntasks` / `--cpus-per-task` | No separate PE object to configure |
| Complex/Resource | GRES / Features / Constraints | Simpler to define custom resources |
| Host group (`@allhosts`) | Node features + constraints | More flexible node selection |
| Cell | Cluster (via slurmdbd) | Multi-cluster via federation |
| Share tree (tickets) | Fair Tree fairshare | Different algorithm, similar intent |
| Functional policy (tickets) | Multifactor priority | Nine factors instead of ticket pools |
| Override tickets | QOS priority | More granular control |
| qmaster | slurmctld | Similar role |
| execd | slurmd | Similar role |
| Shadow master | Backup slurmctld | Built-in HA |
| qconf | sacctmgr + scontrol | Two tools replace one |
| ARCo | sacct + sreport | Built-in accounting queries |
| `$SGE_ROOT/default/common/settings.sh` | *(automatic)* | No need to source Slurm environment |
| `.sge_request` (default request file) | *(no equivalent)* | Use `#SBATCH` in scripts instead |
| `.sge_qstat` (default qstat args) | `SQUEUE_FORMAT` env var | Similar concept |

## Key Behavioral Differences

### 1. Time Limits Are Expected

SGE jobs could run indefinitely by default.
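On a Slurm cluster you can check what each partition actually enforces before porting a script; a quick sketch (the partition name `batch` is an assumption here):

```shell
# List each partition and its maximum walltime (TIMELIMIT column)
sinfo -o "%P %l"

# Inspect one partition's default and maximum limits
scontrol show partition batch | grep -oE '(DefaultTime|MaxTime)=[^ ]+'
```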
**Slurm expects a `--time` limit.** If you don't specify one, you get the partition's `DefaultTime`, which may be short (1 hour on some clusters). Always set `--time`.

### 2. No Soft vs. Hard Resource Distinction

SGE had soft requests (best-effort) and hard requests (mandatory). Slurm resources are mandatory by default. Use `--prefer` for soft feature requests (Slurm 23.02+).

### 3. Working Directory

SGE required `-cwd` to run in the submission directory. Slurm does this by default (`$SLURM_SUBMIT_DIR` is set and the job starts there).

### 4. Environment Propagation

SGE required `-V` to pass your environment. Slurm exports the submission environment by default (`--export=ALL`).

### 5. Output File Patterns

SGE used `$JOB_ID` in filenames. Slurm uses `%j` (job ID), `%x` (job name), `%A` (array job ID), and `%a` (array task ID) in `--output`/`--error` patterns. These are expanded by Slurm itself, not by the shell.

### 6. Parallel Environments Don't Exist

SGE's PE concept (with `$pe_slots`, `$fill_up`, `$round_robin` allocation rules) has no direct equivalent. In Slurm:

- **Threads on one node:** `--ntasks=1 --cpus-per-task=N`
- **MPI across nodes:** `--ntasks=N` or `--nodes=M --ntasks-per-node=K`
- **Fill-up behavior:** default (block distribution)
- **Round-robin:** `srun --distribution=cyclic`

### 7. Admin Configuration Model

SGE used `qconf` with interactive editor sessions and template files.
Slurm splits administration between:

- `slurm.conf` (file-based, static configuration)
- `scontrol` (runtime changes)
- `sacctmgr` (accounting database: accounts, users, QOS, fairshare)

## For Administrators: SGE Config Translation

| SGE Admin Task | SGE Command | Slurm Equivalent |
|----------------|-------------|------------------|
| Add execution host | `qconf -ae` | Add node to `slurm.conf`, then `scontrol reconfigure` |
| Add host group | `qconf -ahgrp @name` | Node features in `slurm.conf` |
| Create queue | `qconf -aq` | Partition in `slurm.conf` |
| Modify queue | `qconf -mq` | Edit `slurm.conf` + `scontrol reconfigure` |
| Add PE | `qconf -ap` | N/A (use `--ntasks`, `--cpus-per-task`) |
| Add complex resource | `qconf -mc` | GRES in `slurm.conf` + `gres.conf`, or features |
| Add project | `qconf -aprj` | `sacctmgr add account` |
| Add user | `qconf -auser` | `sacctmgr add user` |
| Modify share tree | `qconf -mstree` | `sacctmgr modify account/user set fairshare=N` |
| Set functional tickets | `qconf -mu` | `sacctmgr modify qos set priority=N` |
| Disable queue | `qmod -d queue_name` | `scontrol update PartitionName=X State=DOWN` |
| Clear error state | `qmod -c queue@host` | `scontrol update NodeName=X State=RESUME` |
| Show scheduler config | `qconf -ssconf` | `scontrol show config` |
| Trigger scheduler trace | `qconf -tsm` | `scontrol setdebug debug` (temporarily) |

## References

- [SchedMD: Rosetta Stone of Workload Managers](https://slurm.schedmd.com/rosetta.html)
- [SchedMD: Quick Start User Guide](https://slurm.schedmd.com/quickstart.html)
- [SchedMD: sbatch man page](https://slurm.schedmd.com/sbatch.html)
- BioTeam SGE Training Materials (original source)