Conda / Singularity / Module

To provide a software environment, we rely on two main technologies, Conda and Singularity, depending on the specifics of each software, its license, and so on.

To offer a unified user interface, we layer Environment Modules on top.

Environment Modules

The Environment Modules package is a tool that simplifies shell initialization and lets users easily modify their environment during the session with modulefiles. Each modulefile contains the information needed to configure the shell for an application. Modules can be loaded and unloaded dynamically and atomically, in a clean fashion. Modules are useful for managing different versions of applications. Modules can also be bundled into metamodules that load an entire suite of different applications.

In our case, within the IFB [Core] Cluster[s], the Conda environment or the Singularity image is loaded through Environment Modules: one Conda environment, one modulefile.

But why use module load fastqc/0.11.7 instead of conda activate fastqc-0.11.7?

  • Module provides useful autocompletion to help you find a tool and a version: module load snp<TAB><TAB>, module load fastqc<TAB><TAB>.
  • Module can load either Conda environments or Singularity wrappers. That way, there is one loader for different underlying technologies.

How to?

List the available software

module avail
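On a cluster with many modules, the full listing can be long. The following is a minimal sketch for narrowing it down to one tool, assuming a bash-like shell and that `module avail` accepts a name and the terse `-t` flag, as Environment Modules does. The helper name `list_versions` is hypothetical and not part of the cluster setup.

```shell
# Hypothetical helper: list the available versions of one tool, one per line.
# `module avail` commonly prints to stderr, so merge it into stdout first,
# then keep only the lines of the form "<tool>/<version>".
list_versions() {
    module -t avail "$1" 2>&1 | grep "^$1/"
}

# Usage (on the cluster):
#   list_versions fastqc
```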

Load a tool

module load fastqc/0.11.7

Unload a tool

module unload fastqc/0.11.7

Software stacking

Conda/Modules can be stacked if you need several tools at once.

module load trinity/2.8.4 fastqc/0.11.7
module load snakemake/5.3.0

In case of incompatibilities, for example two tools that require Python 2 and Python 3 respectively, it is recommended to load each tool just before using it and to unload it afterwards.

module load trinity/2.8.4
Trinity --seqType fq --max_memory 50G --left reads_1.fq.gz --right reads_2.fq.gz --CPU 6
module unload trinity/2.8.4
module load fastqc/0.11.7
fastqc Trinity.fas
module unload fastqc/0.11.7
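The load, run, unload pattern above can be wrapped in a small shell function so that the unload step is never forgotten. This is only a sketch, assuming a shell that supports `local` (bash does); `with_module` is a hypothetical name, not a command provided by the cluster.

```shell
# Hypothetical helper: load a module, run one command, then unload the module.
# The exit status of the command is preserved so that failures stay visible.
with_module() {
    local name="$1"
    shift
    module load "$name" || return 1
    "$@"
    local status=$?
    module unload "$name"
    return "$status"
}

# Usage (on the cluster):
#   with_module trinity/2.8.4 Trinity --seqType fq --max_memory 50G \
#       --left reads_1.fq.gz --right reads_2.fq.gz --CPU 6
#   with_module fastqc/0.11.7 fastqc Trinity.fas
```

Each tool then only sees its own environment, which avoids the Python 2 versus Python 3 kind of conflict described above.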

Conda

Most tools have dependencies, and some have many: Python or R libraries in specific versions, the very latest compiler, or simply a newer one than the system provides. Often, those dependencies are incompatible with what is already installed on the system, or with each other.

Conda is an open-source package, dependency and environment manager for any language: Python, R, Ruby, Lua, Scala, Java, JavaScript, C/C++, FORTRAN. Miniconda is a small “bootstrap” version that includes only conda, Python, and the packages they depend on. Over 720 scientific packages and their dependencies can be installed individually from the Continuum (now Anaconda) repository with the conda install command.

At IFB, Conda allows us to install tools within dedicated, isolated environments. Note that not all software is available through Conda. Meanwhile, IFB platforms contribute packages to Conda through the Bioconda GitHub repository.

Shared Conda environments

We install software and software environments within Miniconda3.

Why ask for a shared environment?

There are different use cases:

  • I don't know how to use Conda
  • I'm preparing a training session and I want all the attendees to have the same software environment
  • Conda packages can be heavy in terms of disk usage

To request a tool or a Conda environment, there are two options:

  • Propose one via our dedicated git repository cluster/tools
  • Request a tool on our community forum: https://community.france-bioinformatique.fr/

To check whether a package, and a specific version of it, is available in the bioconda, conda-forge and defaults channels:

conda search -c conda-forge -c bioconda my_tool
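To test for one exact version in a script, the same search can be given a `name=version` specification. A minimal sketch, assuming that `conda search` exits with a non-zero status when nothing matches; `conda_has` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if <package>=<version> is found in the channels.
# Assumption: `conda search` returns a non-zero exit status when there is no match.
conda_has() {
    conda search -c conda-forge -c bioconda "$1=$2" >/dev/null 2>&1
}

# Usage:
#   conda_has fastqc 0.11.7 && echo "fastqc 0.11.7 is available"
```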

"Private" Conda environments

We don't recommend installing tools on your own if the required tool is available as a Conda package in the Bioconda or conda-forge channels: in that case, request a shared environment instead.

Your home directory (~) isn't designed to store a lot of files. If you really want to install Conda packages yourself, please install them in your project directory.

To do that, edit the configuration file ~/.condarc:

~/.condarc

envs_dirs:
  - /shared/projects/<project_name>/conda/env
pkgs_dirs:
  - /shared/projects/<project_name>/conda/pkgs
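As an alternative to editing the file by hand, the same settings can be written with `conda config --add`, which updates ~/.condarc. The sketch below also creates the two directories first. `setup_project_conda` and the `PROJECT_ROOT` variable are hypothetical names introduced for illustration; the default root follows the /shared/projects layout shown above.

```shell
# Hypothetical helper: point Conda at a project directory instead of $HOME.
# PROJECT_ROOT is an assumption; it defaults to the layout used in ~/.condarc above.
setup_project_conda() {
    local base="${PROJECT_ROOT:-/shared/projects}/$1/conda"
    mkdir -p "$base/env" "$base/pkgs" || return 1
    # `conda config --add` puts the value at the front of the list in ~/.condarc
    conda config --add envs_dirs "$base/env"
    conda config --add pkgs_dirs "$base/pkgs"
}

# Usage, with <project_name> replaced by your own project:
#   setup_project_conda <project_name>
#   conda create -n my_env -c conda-forge -c bioconda my_tool
```

Environments created afterwards with `conda create -n` then land in the project directory rather than in your home directory.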

Cons of Conda

Because those environments isolate the software, other Python, R or Perl libraries installed on the system or in other Conda environments are not available within a given Conda environment. If this is an issue for you, let us know.

Singularity

Singularity is a free, cross-platform and open-source computer program that performs operating-system-level virtualization also known as containerization.

One of the main uses of Singularity is to bring containers and reproducibility to scientific computing and the high-performance computing (HPC) world.
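To give an idea of what a Singularity wrapper loaded through a module might hide, here is a minimal sketch built on `singularity exec`, which runs a command inside an image. The image path, the `FASTQC_IMAGE` variable and the function name are placeholders for illustration, not the actual wrappers or paths used on the cluster.

```shell
# Hypothetical wrapper: run a tool from inside a Singularity image.
# The image path below is a placeholder, not a real cluster path.
FASTQC_IMAGE="${FASTQC_IMAGE:-/path/to/fastqc-0.11.7.sif}"

fastqc_in_container() {
    # Forward all arguments unchanged to the tool inside the container
    singularity exec "$FASTQC_IMAGE" fastqc "$@"
}

# Usage:
#   fastqc_in_container --version
```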

For more information, please visit this page: Singularity advanced guide