Conda / Singularity / Module
To provide software environments, we rely on two main technologies: Conda and Singularity. The choice between them depends on the specifics of each tool, its license, and so on.
In order to offer a unified user interface, we implement Environment Modules on top.
The Environment Modules package is a tool that simplifies shell initialization and lets users easily modify their environment during the session with modulefiles. Each modulefile contains the information needed to configure the shell for an application. Modules can be loaded and unloaded dynamically and atomically, in a clean fashion. Modules are useful for managing different versions of applications. Modules can also be bundled into metamodules that load an entire suite of different applications.
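To make the idea concrete, the sketch below mimics in plain shell what loading and unloading a module amounts to in the simplest case: prepending a tool's bin directory to PATH, then removing it again. It is only an illustration, not how Environment Modules is implemented; the function names sketch_load and sketch_unload and the directory /opt/tools/fastqc-0.11.7/bin are hypothetical.

```shell
# Conceptual sketch only: real modulefiles can set many variables, not just PATH.
# sketch_load and sketch_unload are hypothetical helper names.

sketch_load() {
    # Prepend the tool's bin directory so its executables are found first.
    PATH="$1:$PATH"
}

sketch_unload() {
    # Rebuild PATH without the given directory, keeping the other entries in order.
    local dir="$1" entry new_path=""
    local IFS=':'
    for entry in $PATH; do
        if [ "$entry" != "$dir" ]; then
            new_path="${new_path:+$new_path:}$entry"
        fi
    done
    PATH="$new_path"
}

# Example:
# sketch_load /opt/tools/fastqc-0.11.7/bin
# sketch_unload /opt/tools/fastqc-0.11.7/bin
```

In practice the module command does this bookkeeping for you, which is why unloading leaves the rest of the environment untouched.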
In our case, within the IFB Core Cluster and more generally within the IFB NNCR Cluster, each Conda environment or Singularity image is loaded through a module: one Conda environment, one modulefile.
But why use
module load fastqc/0.11.7 instead of
conda activate fastqc-0.11.7?
- Module provides autocompletion to help you find a tool and its available versions:
module load snp<TAB><TAB>,
module load fastqc<TAB><TAB>.
- Module can load either Conda environments or Singularity wrappers. That way, you have one loader for different underlying technologies.
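Judging from the single example above, the module name and the Conda environment name differ only in the separator (fastqc/0.11.7 versus fastqc-0.11.7). Assuming that convention holds for other tools, which is not guaranteed, a hypothetical helper such as module_to_env shows the correspondence:

```shell
# Hypothetical helper: derive the Conda environment name from a module name,
# assuming the "tool/version" -> "tool-version" convention seen with fastqc.
# It expects an argument that contains a "/".
module_to_env() {
    # Text before the first "/" is the tool, text after it is the version.
    printf '%s-%s\n' "${1%%/*}" "${1#*/}"
}

module_to_env fastqc/0.11.7   # prints fastqc-0.11.7
```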
List the available software
module avail
Load a tool
module load fastqc/0.11.7
Unload a tool
module unload fastqc/0.11.7
Conda/Modules can be stacked if you need several tools at once.
module load trinity/2.8.4 fastqc/0.11.7
module load snakemake/5.3.0
In case of incompatibilities, for example two tools that require Python 2 and Python 3 respectively, we recommend loading each tool just before using it and unloading it afterwards.
$ module load trinity/2.8.4
$ Trinity --seqType fq --max_memory 50G --left reads_1.fq.gz --right reads_2.fq.gz --CPU 6
$ module unload trinity/2.8.4
$ module load fastqc/0.11.7
$ module unload fastqc/0.11.7
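The load, run, unload pattern above can be wrapped in a small shell function so that the module is always unloaded, even when the command fails. This is a sketch, not an IFB-provided tool: with_module is a hypothetical name, and it assumes the module command is available in the current shell.

```shell
# Hypothetical helper: run one command with a module loaded, then unload it.
# Usage: with_module <module> <command> [args...]
with_module() {
    local mod="$1" status=0
    shift
    module load "$mod" || return 1
    # Run the command and remember its exit code without aborting on failure.
    "$@" || status=$?
    module unload "$mod"
    return "$status"
}

# Example (needs the module command, so it is left commented out here):
# with_module fastqc/0.11.7 fastqc --version
```

The function returns the exit code of the wrapped command, so it can be used in scripts that check for errors.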
Conda
Most tools have requirements, and some have many: Python or R libraries in specific versions, the very latest compiler or simply a newer one, and so on. Often, those dependencies are incompatible with what is already installed on the system, or with each other. Conda addresses this by installing a tool together with its dependencies through a single conda install command.
At IFB, Conda lets us install tools within dedicated, isolated environments. Note that not all software is available through Conda. Meanwhile, IFB platforms contribute new packages to Conda through the Bioconda GitHub repository.
Shared Conda environments
We install software and software environments with Miniconda3.
Why ask for a shared environment
There are different use cases:
- I don't know how to use Conda
- I'm preparing a training session and I want all attendees to have the same software environment
- Conda packages can be heavy in terms of disk usage
There are two ways to request a tool or a Conda environment:
- Propose one via our dedicated git repository cluster/tools
- Request a tool on our IFB community forum: https://community.france-bioinformatique.fr/
To check whether a package, and a specific version of it, is available in the channels:
conda search -c conda-forge -c bioconda my_tool
"Private" Conda environments
We don't recommend installing tools on your own if the required tool is available as a Conda package in the Bioconda or conda-forge channels.
Your home directory (~) isn't designed to store lots of files. If you really want to install Conda packages yourself, please install them in your project directory.
To do that, you need to edit a configuration file
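Which file to edit and what to put in it depends on your setup. As a minimal sketch, assuming the file is your user-level Conda configuration (~/.condarc) and that the goal is to keep environments and the package cache under your project directory, the envs_dirs and pkgs_dirs settings can be used. The function name print_condarc, the conda/envs and conda/pkgs subdirectories, and the path /path/to/project are placeholders, not values from this documentation.

```shell
# Sketch: print a .condarc fragment that points Conda at a project directory.
# print_condarc is a hypothetical helper; the subdirectory names are assumptions.
print_condarc() {
    local project_dir="$1"
    cat <<EOF
envs_dirs:
  - ${project_dir}/conda/envs
pkgs_dirs:
  - ${project_dir}/conda/pkgs
EOF
}

# Review the output first, then append it to your configuration file, e.g.:
# print_condarc /path/to/project >> ~/.condarc
print_condarc /path/to/project
```

Printing the fragment instead of writing the file directly avoids overwriting settings you may already have.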
Then, use the following commands to create your private Conda environment, to activate it and to install the Conda packages:
module load conda
conda create -n MYENV
source activate MYENV
conda install PKGNAME1 [PKGNAME2...] # install packages
Cons of Conda
Conda environments isolate the software: Python, R or Perl libraries installed on the system or in other Conda environments are not available inside a given Conda environment. If this is an issue for you, let us know.
Singularity
Singularity is a free, cross-platform and open-source computer program that performs operating-system-level virtualization, also known as containerization.
One of the main uses of Singularity is to bring containers and reproducibility to scientific computing and the high-performance computing (HPC) world.
For more information, please visit this page: Singularity advanced guide