
Slurm user guide

SLURM is the queue manager used on the NNCR HPC cluster. You must use SLURM to submit jobs to the cluster.

All the commands presented in this guide must be run from the host core.cluster.france-bioinformatique.fr.

SLURM partitions and nodes

The NNCR HPC cluster is organized into several SLURM partitions. Each partition gathers a set of compute nodes that have similar usage.

The default partition used by SLURM (fast) contains a huge number of nodes and is suitable for most jobs.

To view all partitions available on the cluster, run:

sinfo

Please note that you may not have the right to use all partitions.

To view all available nodes, run:

sinfo -Nl
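
You can also restrict the output to a single partition, for example the default fast partition:

sinfo -p fast -Nl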

Submitting a job to the cluster

There are two commands to submit a job to the cluster:

  • srun to run jobs interactively
  • sbatch to submit a batch job

Submit a job using srun

To learn more about the srun command, see the official documentation

Usage

The job will start immediately after you execute the srun command. The outputs are returned to the terminal. You have to wait until the job has terminated before starting a new job. This works with ANY command.

Example:

srun hostname

Example when interaction is needed (here, an interactive R session):

module load r
srun --mem 20GB --pty R
  • --pty: allocates a pseudo-terminal so that the session remains interactive
  • --mem 20GB: allocates 20GB of memory to your job instead of the 2GB default
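
You can also open a plain interactive shell on a compute node in the same way (a minimal sketch; adjust the resources to your needs):

srun --cpus-per-task=2 --mem 8GB --pty bash

Type exit to end the session and release the allocation.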

Submit a job using sbatch

To learn more about the sbatch command, see the official documentation

Usage

The job starts when resources are available. The command only returns the job id. The outputs are sent to file(s). This works ONLY with shell scripts. The batch script may be given to sbatch through a file name on the command line, or if no file name is specified, sbatch will read in a script from standard input.
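
For example, a short script can be passed on standard input through a here-document (a minimal sketch):

sbatch --mem 1GB << 'EOF'
#!/bin/bash
hostname
EOF

sbatch prints the job ID, and the standard output goes by default to a file named slurm-<jobid>.out in the submission directory.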

Batch scripts rules

The script can contain srun commands. Each srun is a job step. The script must start with a shebang (#!) followed by the path of the interpreter:

#!/bin/bash
#!/usr/bin/env python

The execution parameters can be set:

At runtime, on the sbatch command line:

sbatch --mem=40GB bowtie2.sbatch

Or within the script bowtie2.sbatch itself:

#!/bin/bash
#
#SBATCH --mem 40GB
srun bowtie2 -x hg19 -1 sample_R1.fq.gz -2 sample_R2.fq.gz -S sample_hg19.sam

Then submit the script:

sbatch bowtie2.sbatch

The scripts can contain SLURM options, written as #SBATCH directives, placed just after the shebang but before the script commands.

Note that the #SBATCH syntax is important and does not contain any ! (unlike the shebang).

Advice: we recommend setting as many parameters as you can in the script itself, to keep track of your execution parameters for future submissions.

Execution parameters

These parameters are common to the commands srun and sbatch.

Parameters for log

#!/bin/bash
#
#SBATCH -o slurm.%N.%j.out  # STDOUT file with the Node name and the Job ID
#SBATCH -e slurm.%N.%j.err  # STDERR file with the Node name and the Job ID
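
Other filename patterns exist; for instance %x expands to the job name. A minimal sketch, assuming a logs/ directory already exists in the submission directory:

#SBATCH -J bowtie2_hg19             # job name, reused by %x below
#SBATCH -o logs/%x.%j.out           # STDOUT file named after the job name and the Job ID
#SBATCH -e logs/%x.%j.err           # STDERR file named after the job name and the Job ID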

Parameters to control the job

--partition=<partition_names>, -p

Request a specific partition for the resource allocation. Each partition (the equivalent of a queue in SGE) has its own limits: time, memory, nodes, etc.

Run sinfo to see which partitions are available.

--mem=<size[units]>

Specify the real memory required per node. The default unit is MB, and the default allocation is 2GB.

The job is killed if it exceeds this limit.

Note that you can use the variable $SLURM_MEM_PER_NODE in the command line to keep the software settings in sync with the resources allocated.
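
As an illustration only (the java call below is not part of this guide and assumes Java is available on the cluster), the allocation can be forwarded to a tool's own memory option; $SLURM_MEM_PER_NODE is expressed in MB:

#!/bin/bash
#
#SBATCH --mem 8GB

# Forward the whole node allocation (value in MB) to the JVM heap size
srun java -Xmx${SLURM_MEM_PER_NODE}M -version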

--time=<time>, -t

Set a limit on the total run time of the job allocation.

Acceptable time formats include "minutes", "minutes:seconds", "hours:minutes:seconds", "days-hours", "days-hours:minutes" and "days-hours:minutes:seconds".
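
For example, the two directives below are equivalent ways of requesting a 36-hour limit (use one or the other):

#SBATCH --time=1-12        # days-hours: 1 day and 12 hours
#SBATCH --time=36:00:00    # hours:minutes:seconds: the same 36-hour limit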

Parameters for multithreading

--cpus-per-task=<ncpus>, -c

Request a number of CPUs per task (default: 1).

Note that you can use the variable $SLURM_CPUS_PER_TASK in the command line to keep the number of threads used by the software consistent with the resources allocated.

#!/bin/bash
#
#SBATCH --cpus-per-task=8

srun bowtie2 --threads $SLURM_CPUS_PER_TASK -x hg19 -1 sample_R1.fq.gz -2 sample_R2.fq.gz -S sample_hg19.sam

Full example of the sbatch command

Random

  1. Open a script file with any text editor (but not Word)

For beginners, we suggest using nano, which has limited functionality but is quite intuitive.

nano slurm_random.sh
  2. Copy/paste the following script, which writes 10,000 random numbers to a file and then sorts them:
#!/bin/bash
#
#SBATCH -p fast                      # partition
#SBATCH -N 1                         # number of nodes
#SBATCH -n 1                         # number of cores
#SBATCH --mem 100                    # memory for all cores (in MB)
#SBATCH -t 0-2:00                    # maximum run time (D-HH:MM)
#SBATCH -o slurm.%N.%j.out           # STDOUT
#SBATCH -e slurm.%N.%j.err           # STDERR

for i in {1..10000}; do
  echo $RANDOM >> SomeRandomNumbers.txt
done

sort -n SomeRandomNumbers.txt > SomeRandomNumbers_sorted.txt

Press Ctrl-x to exit nano, then "Y" when nano asks you whether the modified buffer should be saved, then press the "Enter" key to confirm the file name.

  3. Check the content of the script
cat slurm_random.sh
  4. Submit the job
sbatch slurm_random.sh
  5. Check the result

Since this script runs a very basic task, the results should be available promptly.

Check the output files with ls and head.

Note: these commands can be run on the login node since they consume very few computing resources.

# List the result files
ls -l SomeRandomNumbers*.txt

# Print the first 20 lines of the original random numbers
head -n 20 SomeRandomNumbers.txt

# Print the first 20 lines of the sorted random numbers
head -n 20 SomeRandomNumbers_sorted.txt

# Print the last 20 lines of the sorted random numbers
tail -n 20 SomeRandomNumbers_sorted.txt

Salmon

  1. Open a script file with any text editor (but not Word)
nano slurm_salmon.sh
  2. Set the SLURM parameters, the software environment, and the command itself:
#!/bin/bash
#
#SBATCH -o slurm.%N.%j.out
#SBATCH -e slurm.%N.%j.err
#SBATCH --mail-type END
#SBATCH --mail-user foo.bar@france-bioinformatique.fr
#
#SBATCH --partition fast
#SBATCH --cpus-per-task 6
#SBATCH --mem 5GB

module load salmon

salmon quant --threads $SLURM_CPUS_PER_TASK -i transcripts_index -l A -1 reads1.fq -2 reads2.fq -o transcripts_quant
  3. Submit the job
sbatch slurm_salmon.sh

Job information

List a user's current jobs:

squeue -u <username>

List a user's running jobs:

squeue -u <username> -t RUNNING

List a user's pending jobs:

squeue -u <username> -t PENDING

View accounting information for all of a user's jobs for the current day:

sacct --format=JobID,JobName,User,Submit,ReqCPUS,ReqMem,Start,NodeList,State,CPUTime,MaxVMSize%15 -u <username>

View accounting information for all of a user's jobs for the last 2 days (worth defining an alias, as shown below):

sacct -a -S $(date --date='2 days ago' +%Y-%m-%dT%H:%M) --format=JobID,JobName,User%15,Partition,ReqCPUS,ReqMem,State,Start,End,CPUTime,MaxVMSize -u <username>
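
For instance, such an alias could be defined in your ~/.bashrc (the name sacct2d is only a suggestion):

# In ~/.bashrc: accounting for your jobs over the last 2 days
alias sacct2d='sacct -a -S $(date --date="2 days ago" +%Y-%m-%dT%H:%M) --format=JobID,JobName,User%15,Partition,ReqCPUS,ReqMem,State,Start,End,CPUTime,MaxVMSize -u $USER'

After reloading your shell (source ~/.bashrc), simply run sacct2d.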

List detailed job information:

scontrol show -dd jobid=<jobid>

Manage jobs

To cancel/stop a job:

scancel <jobid>

To cancel all jobs for a user:

scancel -u <username>

To cancel all pending jobs for a user:

scancel -t PENDING -u <username>