
Quick start guide

Account

Sign up (request an account): IFB Core Cluster - Account Request.
The account will be created within a few days.

More information about the available services can be found on the IFB Core Cluster website.

Log in

Access: SSH
Server: core.cluster.france-bioinformatique.fr

On Linux/macOS, simply use an SSH client such as OpenSSH:

ssh <username>@core.cluster.france-bioinformatique.fr
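
Optionally, you can add an entry to your OpenSSH configuration (~/.ssh/config) so that a short alias is enough; the alias name "ifb-core" below is just an example:

# ~/.ssh/config
Host ifb-core
    HostName core.cluster.france-bioinformatique.fr
    User <username>

# then simply:
ssh ifb-core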

On Windows, you can use a client such as PuTTY.

Please see the Logging in page for further details.

Data

Storage

Several volumes of data storage are available on the NNCR cluster.

Path                Usage                                       Quota (default)   Backup policy
/shared/home        Home directory (personal data)              100GB             NA
/shared/projects    Scientific and project data (common data)   250GB             NA
  • The /shared/bank directory contains common banks (UniProt, RefSeq, ...)
  • If you need or plan to use more storage space, please contact us.
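
To get an idea of how much space you are currently using against these quotas, you can run a standard command such as du (the project name below is a placeholder, and the command can take a while on large directories):

# disk usage of your home directory and of one project directory
du -sh ~ /shared/projects/<my_project>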

Please see the Storage page for further details.

Transfer

SSH (or SFTP, the SSH File Transfer Protocol) is the only protocol available to reach the cluster.
However, you can use many clients to transfer your data to and from the cluster (scp, rsync, sftp, etc.).
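
For example, from your workstation you can copy files with scp or synchronize a directory with rsync over SSH (the project name and file paths below are illustrative):

# copy a single result file from the cluster to the current local directory
scp <username>@core.cluster.france-bioinformatique.fr:/shared/projects/<my_project>/sample_hg19.sam .

# synchronize a whole project directory (only transfers what has changed)
rsync -avP <username>@core.cluster.france-bioinformatique.fr:/shared/projects/<my_project>/ ./my_project/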

You can also use graphical clients such as FileZilla.

Or simply use your file manager with SSHFS.
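For instance, on Linux you can mount a remote directory locally with sshfs and unmount it with fusermount (mount point and project path are illustrative):

# mount the remote project directory on a local mount point
mkdir -p ~/ifb_mount
sshfs <username>@core.cluster.france-bioinformatique.fr:/shared/projects/<my_project> ~/ifb_mount

# unmount when done
fusermount -u ~/ifb_mount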

Please see the Transfer page for further details.

Software

To use software such as blast, python, gcc, etc., you have to "load" it using module commands (Environment Modules):

  • List: module avail
  • Use blast: module load blast
  • Use a specific version: module load blast/2.2.25
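
A typical session could look like this (module names and versions depend on what is actually installed, so check with module avail first):

module avail            # list all available modules
module load blast       # load the default version of a tool
module list             # show currently loaded modules
module unload blast     # unload it when no longer needed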

You can also use singularity or conda directly.

Please see the Conda page for further details.
Please see the Singularity page for further details.

Submit a job

Computing work is done by submitting "jobs" to the Slurm workload manager.
You must use Slurm to execute your jobs.

1. Write a bash script

This script must contain the commands to execute. Many editors are available (see the Editors page).
Here, inside myscript.sh, we launch a bowtie2 command and just print some truth.

#!/bin/bash

bowtie2 -x hg19 -1 sample_R1.fq.gz -2 sample_R2.fq.gz -S sample_hg19.sam

echo "Enjoy slurm ! It's highly addictive."

2. Add options and requirements

You can specify several options for your jobs (name, number of CPUs, amount of memory, time limit, etc.). All these parameters go at the beginning of the script as #SBATCH directives (just after the shebang #!/bin/bash).
Here we specify the job name and the amount of memory required.
Advice: we recommend setting as many parameters as you can in the script, in order to keep track of your execution parameters for future submissions.

#!/bin/bash

#SBATCH --job-name=bowtie
#SBATCH --mem=40GB

bowtie2 -x hg19 -1 sample_R1.fq.gz -2 sample_R2.fq.gz -S sample_hg19.sam

echo "Enjoy slurm ! It's highly addictive."

3. Launch your job with sbatch

sbatch myscript.sh

The command returns a job id that identifies your job.
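For example (the job id shown here is illustrative):

sbatch myscript.sh
Submitted batch job 123456
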
See more useful information below (Slurm commands).

4. Follow your job

The status goes successively from PENDING (PD) to RUNNING (R) and finally COMPLETED (CD), at which point the job disappears from the queue. So if your job is no longer displayed, it has finished (either with success or with an error).

squeue
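
The output typically looks like the following (job id, node name and values are illustrative); squeue -u $USER restricts the list to your own jobs:

squeue -u $USER
  JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
 123456      fast   bowtie <username>  R      10:42      1 cpu-node-12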

5. See output

The output of the script (standard output and standard error) is written live.
The default output file is slurm-[jobid].out in your working directory.
And of course, any result files you produce, such as sample_hg19.sam, will also be available.
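
To watch the output while the job is running, you can simply follow this file (replace the job id with yours):

tail -f slurm-123456.out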

Notes

  • All nodes have access to the data (/shared/home, /shared/projects or /shared/bank).

  • All software is available on the nodes, but it has to be loaded inside the script with the command module add [module].

  • All jobs are contained and cannot use more resources than requested (CPU, memory).

  • Jobs that exceed their limits (memory or time, whether default or explicitly set) are killed.

  • It is possible to connect to a compute node while one of your jobs is running on it (ssh cpu-node-XX).


Slurm commands

If you are used to PBS/Torque/SGE/LSF/LoadLeveler, refer to the Rosetta Stone of Workload Managers.

  • Submit a job: sbatch myscript.sh
  • Information on jobs: squeue
  • Information on my jobs: squeue -u $USER
  • Information on running job: scontrol show job <jobid>
  • Delete a job: scancel <jobid>
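
Once a job has left the queue, squeue no longer shows it; if job accounting is enabled on the cluster (an assumption here), you can still query its record with sacct:

# state, elapsed time and peak memory of a finished job (job id is illustrative)
sacct -j 123456 --format=JobID,JobName,State,Elapsed,MaxRSS
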
Options frequently used:

  --job-name=demojob         Job name
  --time=01:00:00            Limit run time "hours:minutes:seconds" (default = max partition time)
  --partition=long           Select partition (default = fast); partitions: fast (jobs <= 12 hours) or long (jobs > 12 hours)
  --nodes=N                  Request N compute nodes for this job (default = 1)
  --cpus-per-task=N          Number of cores/tasks requested (default = 1 per node)
  --mem=2GB                  Amount of real memory per node (default = 2GB per CPU)
  --mem-per-cpu=2GB          Amount of real memory per allocated CPU
  --exclusive                Reserve the whole node for your job
  --output=slurm-%j.out      Specify the output file (standard output and error, default = slurm-[jobid].out)
  --workdir=/path/           Working directory (default = submission directory)
  --mail-user=email@address  Email address for job notifications
  --mail-type=ALL            Send mail on job events (NONE, BEGIN, END, FAIL, ALL)
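
These options can also be passed directly on the sbatch command line, in which case they override the corresponding #SBATCH directives in the script, for example:

sbatch --job-name=demojob --time=01:00:00 --mem=8GB myscript.sh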

Please see the SLURM user guide page for further details.

Don't hesitate to also have a look at the official sbatch documentation.

Script template

Just an example. Customize it to meet your needs.

#!/bin/bash

################################ Slurm options #################################

### Job name
#SBATCH --job-name=demo_job

### Limit run time "days-hours:minutes:seconds"
#SBATCH --time=01:00:00

### Requirements
#SBATCH --partition=fast
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --mem-per-cpu=8GB

### Email
#SBATCH --mail-user=email@address
#SBATCH --mail-type=ALL

### Output
#SBATCH --output=/shared/home/<user>/demojob-%j.out

################################################################################

echo '########################################'
echo 'Date:' $(date --iso-8601=seconds)
echo 'User:' $USER
echo 'Host:' $HOSTNAME
echo 'Job Name:' $SLURM_JOB_NAME
echo 'Job Id:' $SLURM_JOB_ID
echo 'Directory:' $(pwd)
echo '########################################'

# modules loading
module add ...


# What you actually want to launch
echo 'Waooouhh. Awesome.'


echo '########################################'
echo 'Job finished' $(date --iso-8601=seconds)

Cluster

At your disposal:

  • 68 standard compute nodes (28 cores, 256 GB RAM)
  • 1 "fat memory" compute node (64 cores, 3 TB RAM)

Please see the Cluster description page for further details.

View results

  • Browse data on the server: use your favorite viewer/editor/tool (less, vim, emacs, nano, ...).
    Pros: simple, quick, easy. Cons: not always suitable, sufficient or possible.

  • Get the data back on your workstation (see Transfer).
    Pros: you have your own local copy and the flexibility to use your own tools/workstation. Cons: uses space and takes time.

  • Use SSHFS on your workstation: browse and view your data directly from your local file manager; integrated in many distributions (see Transfer).
    Pros: easy to browse data. Cons: can be slow; data are transferred (so it can take space and time).

  • Export display: remote display for graphical user interfaces (xemacs, geany, gedit, etc.); see Export display.
    Pros: no data transferred (only the display). Cons: can be slow.

  • R lovers: use RStudio :)
    Pros: integrated. Cons: specific.

Go further