# LSF to Slurm Migration Guide

This guide is for users and administrators transitioning from IBM Spectrum LSF (formerly Platform LSF) to Slurm.

## Command Mapping

| Action | LSF | Slurm | Notes |
|--------|-----|-------|-------|
| Submit batch job | `bsub < script.sh` | `sbatch script.sh` | LSF reads the script from stdin by default |
| Submit with script arg | `bsub script.sh` | `sbatch script.sh` | LSF ignores `#BSUB` directives unless the script comes in on stdin |
| Submit inline | `bsub -o out.log "command"` | `sbatch --wrap="command"` | |
| Delete job | `bkill job_id` | `scancel job_id` | |
| Kill all my jobs | `bkill 0` | `scancel --me` | |
| Job status | `bjobs` | `squeue` | |
| My jobs | `bjobs -u $USER` | `squeue --me` | |
| All jobs | `bjobs -u all` | `squeue` | |
| Detailed job info | `bjobs -l job_id` | `scontrol show job job_id` | |
| Pending reason | `bjobs -p job_id` | `squeue -j job_id -o "%R"` | |
| Job history | `bhist job_id` | `sacct -j job_id` | |
| Queue info | `bqueues` | `sinfo` | |
| Detailed queue info | `bqueues -l queue` | `scontrol show partition` | |
| Host status | `bhosts` | `sinfo -N -l` | |
| Host details | `lshosts` | `scontrol show node` | |
| Load info | `lsload` | `sinfo -N -o "%N %O %e %m"` | |
| Interactive job | `bsub -Is bash` | `srun --pty bash` | |
| Interactive with X11 | `bsub -Is -XF command` | `srun --pty --x11 command` | |
| Modify pending job | `bmod job_id` | `scontrol update JobId=job_id` | |
| Suspend job | `bstop job_id` | `scontrol suspend job_id` | |
| Resume job | `bresume job_id` | `scontrol resume job_id` | |
| Move to queue | `bswitch queue job_id` | `scontrol update JobId=job_id Partition=part` | |
| Cluster admin | `badmin` / `lsadmin` | `scontrol` / `sacctmgr` | |
| Cluster config | `lsf.conf` / `lsb.queues` | `slurm.conf` | |
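Beyond command names, the state codes in status output differ as well. A sketch of the correspondence as a shell helper (the function is this guide's own illustration, not part of either scheduler):

```shell
# Hypothetical helper: translate an LSF job state (the bjobs STAT
# column) into the closest Slurm state code (squeue ST / sacct State).
lsf_state_to_slurm() {
  case "$1" in
    PEND)              echo "PD (PENDING)" ;;
    RUN)               echo "R (RUNNING)" ;;
    PSUSP|USUSP|SSUSP) echo "S (SUSPENDED)" ;;
    DONE)              echo "CD (COMPLETED)" ;;
    EXIT)              echo "F (FAILED)" ;;
    *)                 echo "no direct equivalent" ;;
  esac
}

lsf_state_to_slurm PEND   # -> PD (PENDING)
```

Note that LSF distinguishes three suspend states (by user, by system, by owner) where Slurm reports a single `SUSPENDED`.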
## Job Script Directive Mapping

| LSF Directive | Slurm Directive | Description |
|--------------|-----------------|-------------|
| `#BSUB -J name` | `#SBATCH --job-name=name` | Job name |
| `#BSUB -o path` | `#SBATCH --output=path` | Stdout file |
| `#BSUB -e path` | `#SBATCH --error=path` | Stderr file |
| `#BSUB -oo path` | `#SBATCH --output=path --open-mode=truncate` | Overwrite stdout |
| `#BSUB -q queue` | `#SBATCH --partition=partition` | Queue/partition |
| `#BSUB -W HH:MM` | `#SBATCH --time=HH:MM:00` | Walltime (LSF uses HH:MM) |
| `#BSUB -We HH:MM` | `#SBATCH --time-min=HH:MM:00` | Estimated runtime |
| `#BSUB -n N` | `#SBATCH --ntasks=N` | Number of tasks/slots |
| `#BSUB -R "span[hosts=1]"` | `#SBATCH --nodes=1` | Force single node |
| `#BSUB -R "span[ptile=P]"` | `#SBATCH --ntasks-per-node=P` | Tasks per node |
| `#BSUB -R "rusage[mem=M]"` | `#SBATCH --mem-per-cpu=M` | Memory per slot (MB by default in LSF) |
| `#BSUB -M mem_limit` | `#SBATCH --mem=mem_limit` | Memory limit (per process in LSF, per node in Slurm) |
| `#BSUB -R "select[ngpus>0]"` | `#SBATCH --gres=gpu:1` | GPU request |
| `#BSUB -gpu "num=N"` | `#SBATCH --gres=gpu:N` | GPU count |
| `#BSUB -x` | `#SBATCH --exclusive` | Exclusive host access |
| `#BSUB -J "name[1-100]"` | `#SBATCH --array=1-100` | Job array |
| `#BSUB -J "name[1-100]%50"` | `#SBATCH --array=1-100%50` | Array with throttle |
| `#BSUB -P project` | `#SBATCH --account=project` | Project/account |
| `#BSUB -u email` | `#SBATCH --mail-user=email` | Email address |
| `#BSUB -B` | `#SBATCH --mail-type=BEGIN` | Email on start |
| `#BSUB -N` | `#SBATCH --mail-type=END` | Email on end |
| `#BSUB -w "done(id)"` | `#SBATCH --dependency=afterok:id` | Dependency (success) |
| `#BSUB -w "ended(id)"` | `#SBATCH --dependency=afterany:id` | Dependency (any exit) |
| `#BSUB -w "exit(id)"` | `#SBATCH --dependency=afternotok:id` | Dependency (failure) |
| `#BSUB -R "select[feature]"` | `#SBATCH --constraint=feature` | Node feature |
| `#BSUB -app profile` | *(no direct equivalent)* | Application profile |
| `#BSUB -cwd /path` | `#SBATCH --chdir=/path` | Working directory |
| `#BSUB -env "all"` | `#SBATCH --export=ALL` | Export environment (Slurm default) |
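For scripts that use only the simple one-to-one directives above, a first pass can be automated with `sed`. A sketch (the `translate` function is this guide's illustration; `-R` strings, `-W` times, and array specs are deliberately passed through unchanged for manual review):

```shell
# Rough first-pass "#BSUB" -> "#SBATCH" translator for the easy cases.
# Unrecognized lines pass through untouched for manual conversion.
translate() {
  sed -E \
    -e 's/^#BSUB -J ([^ []+)$/#SBATCH --job-name=\1/' \
    -e 's/^#BSUB -o (.+)$/#SBATCH --output=\1/' \
    -e 's/^#BSUB -e (.+)$/#SBATCH --error=\1/' \
    -e 's/^#BSUB -q (.+)$/#SBATCH --partition=\1/' \
    -e 's/^#BSUB -n (.+)$/#SBATCH --ntasks=\1/' \
    -e 's/^#BSUB -P (.+)$/#SBATCH --account=\1/' \
    -e 's/%J/%j/g'
}

printf '#BSUB -q gpu\n#BSUB -o logs/md_%%J.out\n' | translate
# -> #SBATCH --partition=gpu
# -> #SBATCH --output=logs/md_%j.out
```

The `-J` rule intentionally refuses names containing `[`, so job arrays fall through untranslated rather than being converted incorrectly.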
## Environment Variable Mapping

| LSF Variable | Slurm Variable | Description |
|-------------|----------------|-------------|
| `$LSB_JOBID` | `$SLURM_JOB_ID` | Job ID |
| `$LSB_JOBNAME` | `$SLURM_JOB_NAME` | Job name |
| `$LSB_JOBINDEX` | `$SLURM_ARRAY_TASK_ID` | Array index |
| `$LSB_MAX_NUM_PROCESSORS` | `$SLURM_NTASKS` | Total slots/tasks |
| `$LSB_HOSTS` | `$SLURM_JOB_NODELIST` | Allocated hosts |
| `$LSB_DJOB_NUMPROC` | `$SLURM_NTASKS` | Process count |
| `$LSB_MCPU_HOSTS` | *(parse from `$SLURM_JOB_NODELIST`)* | Hosts and counts |
| `$LSB_QUEUE` | `$SLURM_JOB_PARTITION` | Queue/partition |
| `$LSB_SUBCWD` | `$SLURM_SUBMIT_DIR` | Submission directory |
| `$LS_SUBCWD` | `$SLURM_SUBMIT_DIR` | Submission directory |

## Job Script Translation: Side by Side

### LSF Version

```bash
#!/bin/bash
#BSUB -J md_simulation
#BSUB -o logs/md_%J.out
#BSUB -e logs/md_%J.err
#BSUB -q gpu
#BSUB -W 48:00
#BSUB -n 4
#BSUB -R "span[hosts=1]"
#BSUB -R "rusage[mem=16000]"
#BSUB -gpu "num=2:mode=exclusive_process"

module load gromacs/2024
cd $LS_SUBCWD

gmx mdrun -s production.tpr \
    -ntomp $LSB_MAX_NUM_PROCESSORS \
    -gpu_id 0,1 \
    -o traj.trr
```

### Slurm Version

```bash
#!/bin/bash
#SBATCH --job-name=md_simulation
#SBATCH --output=logs/md_%j.out
#SBATCH --error=logs/md_%j.err
#SBATCH --partition=gpu
#SBATCH --time=48:00:00
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --mem-per-cpu=16G
#SBATCH --gres=gpu:2

module purge
module load gromacs/2024

gmx mdrun -s production.tpr \
    -ntomp $SLURM_CPUS_PER_TASK \
    -o traj.trr
```

### Key Changes

1. `#BSUB` becomes `#SBATCH`.
2. `%J` (LSF job ID in filenames) becomes `%j`.
3. LSF's `-n 4` is ambiguous (it could mean tasks or threads). Slurm makes you choose: `--ntasks=4` (MPI) or `--cpus-per-task=4` (threads).
4. LSF's `-R "rusage[mem=16000]"` (memory in MB per slot) becomes `--mem-per-cpu=16G`.
5. LSF's `-W HH:MM` becomes `--time=HH:MM:SS` (Slurm includes seconds).
6. GPU: LSF's `-gpu "num=2"` becomes `--gres=gpu:2`.
7. `cd $LS_SUBCWD` is not needed (Slurm starts in the submit directory by default).
8. Slurm sets `CUDA_VISIBLE_DEVICES` automatically with GRES, so `-gpu_id` is unnecessary.
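Key change 4 deserves extra care because it interacts with the slot count: the per-slot `rusage[mem=]` value carries over to `--mem-per-cpu` directly, while Slurm's per-node `--mem` needs the total across slots. A sketch of the arithmetic (the function name is this guide's own):

```shell
# Convert an LSF per-slot memory request (MB) plus a slot count into
# equivalent Slurm options. --mem-per-cpu mirrors rusage[mem=] per
# slot; --mem would instead need the total across all slots on a node.
lsf_mem_to_slurm() {
  mem_mb=$1
  slots=$2
  echo "--mem-per-cpu=${mem_mb}M (or --mem=$((mem_mb * slots))M for all ${slots} slots on one node)"
}

lsf_mem_to_slurm 16000 4
# -> --mem-per-cpu=16000M (or --mem=64000M for all 4 slots on one node)
```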
### Array Job Translation

**LSF:**

```bash
#!/bin/bash
#BSUB -J "analysis[1-100]%50"
#BSUB -o logs/task_%I.out
#BSUB -W 2:00

INPUT=data/sample_${LSB_JOBINDEX}.csv
./process.sh $INPUT > results/output_${LSB_JOBINDEX}.txt
```

**Slurm:**

```bash
#!/bin/bash
#SBATCH --job-name=analysis
#SBATCH --array=1-100%50
#SBATCH --output=logs/task_%a.out
#SBATCH --time=02:00:00

INPUT=data/sample_${SLURM_ARRAY_TASK_ID}.csv
./process.sh $INPUT > results/output_${SLURM_ARRAY_TASK_ID}.txt
```

## Concept Mapping

| LSF Concept | Slurm Equivalent | Key Differences |
|-------------|------------------|-----------------|
| Queue | Partition | Similar concept |
| `span[hosts=1]` / `span[ptile=N]` | `--nodes=1` / `--ntasks-per-node=N` | |
| `rusage[mem=N]` | `--mem-per-cpu=N` | Per-slot vs per-CPU |
| `select[feature]` | `--constraint=feature` | |
| Resource requirements (`-R`) | Individual `--` flags | Slurm doesn't combine them in one string |
| Application profile (`-app`) | QOS or partition defaults | Similar goal, different mechanism |
| Fairshare | Fair Tree fairshare | Different algorithm |
| SLA (service level agreement) | QOS priority + limits | |
| `mbatchd` | `slurmctld` | Central controller |
| `sbatchd` | `slurmd` | Node daemon |
| `lim` | *(built into `slurmd`)* | Load monitoring |
| `lsf.conf` | `slurm.conf` | Main config |
| `lsb.queues` | Partition definitions in `slurm.conf` | |
| `lsb.users` | `sacctmgr` user/account | |
| ELIM (external load info) | Node features / GRES | Dynamic features available |
| License Scheduler | `--licenses` / GRES | Built-in license tracking |
| Advance reservation | `scontrol create reservation` | Similar concept |
| Preemption (queue-based) | QOS or partition preemption | |
| Job groups | Slurm accounts | For resource tracking |
| Synchronous submission (`bsub -K`) | `sbatch --wait` | Both block until the job finishes |
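The array indices map cleanly, but a more robust idiom than encoding the index into filenames is indexing into a manifest file, one input path per line. A sketch using standard tools (the manifest layout and helper name are this guide's assumptions):

```shell
# Select this array task's input: line N of a manifest file, where N
# is the 1-based array task ID, matching --array=1-100.
get_task_input() {
  sed -n "${2}p" "$1"   # $1 = manifest path, $2 = task ID
}

# In a Slurm job script:
#   INPUT=$(get_task_input manifest.txt "$SLURM_ARRAY_TASK_ID")
# Under LSF the same helper works with $LSB_JOBINDEX.
```

This keeps the input list in one editable file instead of relying on a filename convention.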
## Key Behavioral Differences

### 1. Submission Syntax

LSF reads the job script from **stdin** by default (`bsub < script.sh`). Slurm takes the script as a **file argument** (`sbatch script.sh`). This is a common source of early confusion.

### 2. Resource Specification Style

LSF combines resource specs in a single `-R` string:

```
-R "span[hosts=1] rusage[mem=4000] select[gpus>0]"
```

Slurm uses separate flags:

```
--nodes=1 --mem-per-cpu=4G --gres=gpu:1
```

### 3. Memory Units

LSF's memory units are inconsistent: the `-M` limit historically defaults to **KB** (site-configurable via `LSF_UNIT_FOR_LIMITS`), while `rusage[mem=]` defaults to **MB**. Slurm defaults to **MB** and accepts `K`, `M`, `G`, `T` suffixes. This is a critical migration trap, so spell the units out explicitly:

```bash
# LSF: 16000 MB per slot (rusage is in MB by default)
#BSUB -R "rusage[mem=16000]"

# Slurm: 16G explicitly
#SBATCH --mem-per-cpu=16G
```

### 4. Slot vs. Task vs. CPU

LSF's "slot" concept maps differently depending on context:

- For **MPI:** LSF slot ≈ Slurm task (`--ntasks`)
- For **threads:** LSF slot ≈ Slurm CPU (`--cpus-per-task`)
- LSF's `-n` is overloaded; Slurm forces you to be explicit

### 5. Working Directory

LSF runs the job in the submission directory when it is accessible on the execution host, falling back to `$HOME` otherwise, which is why many LSF scripts defensively `cd $LS_SUBCWD`. Slurm always starts in the submission directory (or wherever `--chdir` points).

### 6. GPU Handling

LSF has a dedicated `-gpu` flag with mode options. Slurm uses GRES:

- LSF: `-gpu "num=2:mode=exclusive_process"`
- Slurm: `--gres=gpu:2` (cgroup enforcement handles exclusivity)
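During a phased migration the same script often has to run under both schedulers. One approach is a small shim at the top of the script that detects the environment and normalizes it (the `JOB_ID`/`NCPUS`/`WORKDIR` names are this guide's own convention):

```shell
# Detect which scheduler launched this script and normalize a few
# commonly used values. Falls back to sane defaults when run by hand.
if [ -n "${SLURM_JOB_ID:-}" ]; then
  JOB_ID=$SLURM_JOB_ID
  NCPUS=${SLURM_CPUS_PER_TASK:-${SLURM_NTASKS:-1}}
  WORKDIR=${SLURM_SUBMIT_DIR:-$PWD}
elif [ -n "${LSB_JOBID:-}" ]; then
  JOB_ID=$LSB_JOBID
  NCPUS=${LSB_MAX_NUM_PROCESSORS:-1}
  WORKDIR=${LS_SUBCWD:-$PWD}
else
  JOB_ID=$$        # interactive/manual run
  NCPUS=1
  WORKDIR=$PWD
fi
```

The rest of the script then refers only to the normalized names, so the scheduler-specific variables are confined to this one block.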
### 7. Dependency Syntax

LSF uses expression-based dependencies; Slurm uses typed dependencies:

```bash
# LSF
#BSUB -w "done(12345) && done(12346)"
#BSUB -w "done(12345) || done(12346)"

# Slurm
#SBATCH --dependency=afterok:12345,afterok:12346   # AND
#SBATCH --dependency=afterok:12345?afterok:12346   # OR
```

## For Administrators: LSF Config Translation

| LSF Admin Task | LSF Method | Slurm Equivalent |
|---------------|------------|------------------|
| Define queue | `lsb.queues` | Partition in `slurm.conf` |
| Set queue limits | `lsb.queues` (RUNLIMIT, MEMLIMIT) | Partition limits + QOS |
| Define hosts | `lsf.cluster` | Node definitions in `slurm.conf` |
| Host groups | `lsb.hosts` | Node features |
| User groups | `lsb.users` | `sacctmgr` accounts |
| Fairshare config | `lsb.queues` (FAIRSHARE) | `sacctmgr` fairshare + `PriorityWeight*` |
| Preemption | Queue priority in `lsb.queues` | QOS preemption or partition PriorityTier |
| License management | LSF License Scheduler | `Licenses=` in `slurm.conf` or GRES |
| Application profiles | `lsb.applications` | QOS + partition defaults |
| Resource limits | `lsb.resources` | `sacctmgr` limits (GrpTRES, MaxTRES, etc.) |
| External LIM | ELIM scripts | Node features, load sensors (community) |
| Cluster admin | `badmin hclose/hopen` | `scontrol update NodeName=X State=DRAIN/RESUME` |

## References

- [SchedMD: Rosetta Stone of Workload Managers](https://slurm.schedmd.com/rosetta.html)
- [SchedMD: Quick Start User Guide](https://slurm.schedmd.com/quickstart.html)
- [SchedMD: sbatch man page](https://slurm.schedmd.com/sbatch.html)