SLURM: Running interactive jobs
Submit an interactive job
Interactive sessions allow you to connect to a compute node and work on that node directly.
To launch an interactive job on the HPC cluster, run the command below. Please note that it will launch the job with the following defaults:
- 1 CPU core
- 2 GB RAM
- a maximum run time of 2 days
sinteractive
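When the job starts, your shell moves from the login node to the allocated compute node, and the prompt changes accordingly. The hostnames below are illustrative (they match the example scenario later on this page):
[john@slurm-client ~]$ sinteractive
[john@cpu-node-08 ~]$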
If you want to change the default memory to 10 GB:
sinteractive --mem=10GB
If you need more than one core:
sinteractive --cpus=2
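The options can be combined. For example, to request 4 cores and 16 GB of memory (the values here are only illustrative):
sinteractive --cpus=4 --mem=16GB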
Interactive jobs will remain active until you run "exit" or the job is canceled. The mechanism behind an interactive job is:
- The user runs the sinteractive command
- sinteractive schedules a Slurm batch job to start a screen session on a compute node
- Slurm grants the user ssh access to the node
- sinteractive connects the user to the node and attaches to the screen session
- The job is completed whenever the screen session ends or the job is canceled
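Because the session is backed by an ordinary Slurm batch job, you can inspect it with the usual Slurm tools while it runs, for example:
squeue -j <job_id>
scontrol show job <job_id>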
Therefore, an interactive job will not terminate automatically unless you quit the session manually. To quit it,
Option 1: Run
exit
on the compute node. Once you are back on the login node, the sinteractive session has been terminated.
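For example, using the hostnames from the scenario below (your prompts will differ):
[john@cpu-node-08 ~]$ exit
[john@slurm-client ~]$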
Option 2: Cancel the job directly (from compute nodes or login nodes)
scancel <job_id>
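If you do not remember the job ID, you can list your jobs first. For example, with the user and job from the scenario below:
[john@slurm-client ~]$ squeue -u john
[john@slurm-client ~]$ scancel 10554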
Reconnect to / Disconnect from an Active Interactive Job
Since an interactive job is a screen session, you can disconnect from it and reconnect to it at any time. Here is a real-world scenario.
I am in the lab and have an interactive job running (job ID 10554). I plan to go home, but I want to leave this job running so I can reconnect to it when I am home. The steps are:
1- Detach from the screen session of the existing interactive job:
[john@cpu-node-08 ~]$ screen -d
[15495.slurm10554 detached.]
2- Now run squeue to see if the job is still running:
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
10554 igbmc _interact john R 56:11 1 cpu-node-08
3- Once I am home, ssh to the node:
[john@slurm-client ~]$ ssh cpu-node-08
[john@cpu-node-08 ~]$ screen -ls
There is a screen on:
15495.slurm10554 (Detached)
1 Socket in /var/run/screen/S-john
4- Reconnect to the screen session:
[john@cpu-node-08 ~]$ screen -r 15495.slurm10554
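If this is the only screen session on the node, reattaching without naming the session also works:
[john@cpu-node-08 ~]$ screen -r
And if the session still shows as Attached (for example, because the connection from the lab was never detached), screen -d -r detaches it there and reattaches it here:
[john@cpu-node-08 ~]$ screen -d -r 15495.slurm10554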