3.1. Conda tutorial/exercises
3.1.1. Installation
First we will install Miniconda, a minimal installer for conda. Miniconda sets up a base conda environment together with the conda command line interface (CLI).
Go to the download page https://docs.conda.io/en/latest/miniconda.html
Download the relevant installer file (see the note below)
Execute the downloaded installer (a shell script)
Linux users should see a section listing the Linux installers on that page.

Note
Which Python version and which 32/64-bit flavour?
Installers are available for different architectures (e.g., 32-bit and 64-bit). On a modern machine, the 64-bit version is usually the one to download.
Installers are also available for several versions of Python (e.g., 2.7, 3.7, etc.). This version is independent of any Python already installed on your computer: it is the version conda uses internally in the first environment (the base). You can create environments with other Python versions later on, so you may simply choose the latest one (e.g., Python 3.9).
As an example, in a Linux terminal you can download this version:
cd Downloads
wget https://repo.anaconda.com/miniconda/Miniconda3-py39_4.9.2-Linux-x86_64.sh
and then execute the installer:
sh Miniconda3-py39_4.9.2-Linux-x86_64.sh
Follow the instructions. Answer yes to "Do you wish the installer to initialize Miniconda3 by running conda init?"
The installation will create a miniconda3 folder, by default in the home directory (~/miniconda3).
Exit the shell and start a new one for the changes to take effect.
Update conda:
conda update conda
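After opening a new shell, it can be worth confirming that the conda command is actually reachable before going further. The following is a minimal sketch of such a check; check_conda is a hypothetical helper name chosen for this example, not a conda command.

```shell
# check_conda is a hypothetical helper (not part of conda).
# It reports whether conda is reachable from the current shell.
check_conda() {
    if command -v conda >/dev/null 2>&1; then
        conda --version      # version of the installed conda
        conda info --base    # location of the base environment
    else
        echo "conda not found: open a new shell or check that conda init was run" >&2
        return 1
    fi
}
```

Calling check_conda should print the conda version and the base folder; a non-zero exit status suggests that the shell has not picked up the changes made by the installer.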
3.1.2. Installing a package
We will first install a standard Python package called pandas:
conda install pandas
With a good internet connection, it should be installed in a few seconds, because conda downloads prebuilt binary packages. Building it from source would have taken longer and required compilation.
Now what about a bioinformatics tool such as bwa?
conda install bwa
You should get an error: the package is not available from the currently configured channels.
Let us see how to configure conda so as to be able to install more specific tools.
3.1.3. Configuring channels
One strength of the Conda community is that dedicated channels have been created for different scientific communities, for instance Bioconda (https://bioconda.github.io/) for bioinformatics.
To benefit from those channels, we need to globally configure the sources of the packages we want to install. We will add new sources to the defaults channel already configured in the original installation.
First let's check the currently configured channels with conda config:
conda config --show channels
This information is actually stored in the file ~/.condarc.
Now, we add more channels:
conda config --add channels bioconda
conda config --add channels conda-forge
You can now verify that your channels are properly configured with:
conda config --show channels
Warning
Channels are listed in order of priority, which depends on the order in which you ran the conda config --add channels commands: each --add puts the new channel at the top of the list, so the channel added last has the highest priority. By default, to identify the package that will be installed, conda sorts packages from highest to lowest channel priority, then by version number, then by build number.
This means that if you try to install pandas, for example, conda will look first in conda-forge for the most recent version and build. If pandas is not present in conda-forge, conda will look for it in bioconda and then in defaults.
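To see how the order of the --add commands shapes the priority list without touching your real ~/.condarc, the same commands can be replayed against a scratch configuration file. This is only a sketch: show_channel_order is a hypothetical helper name, and it assumes that conda config accepts the --file option to target another configuration file.

```shell
# show_channel_order is a hypothetical helper (not part of conda).
# It replays the two --add commands against a scratch configuration
# file, so the real ~/.condarc is left untouched, then prints the file.
show_channel_order() (
    scratch_dir=$(mktemp -d)
    scratch_rc="$scratch_dir/.condarc"
    conda config --file "$scratch_rc" --add channels bioconda
    conda config --file "$scratch_rc" --add channels conda-forge
    cat "$scratch_rc"        # channels are written in priority order
    rm -rf "$scratch_dir"    # clean up the scratch configuration
)
```

Since --add places a channel at the top of the list, conda-forge, which is added last, is expected to appear before bioconda in the printed file.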
3.1.4. Updating a package
You can now update pandas; conda will take it from the highest-priority channel that provides it:
conda update pandas
3.1.5. Creating an environment
So far we have used the main base environment, but one can create as many environments as desired. A good practice is to keep the base clean and functional.
So, let us create a new environment, which will encapsulate the different software tools we will need for our analysis.
This is done with conda create:
conda create --name repro_variant python=3.7
# Equivalent to
conda create -n repro_variant python=3.7
Here we create a new environment called repro_variant and install Python version 3.7 in it.
We can then list the environments that have been created so far:
conda env list
The currently active environment is marked with a *. The absolute location of each environment folder is shown as well.
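The listing can also be used in scripts, for example to avoid creating an environment that already exists. The sketch below assumes that the environment name is the first column of the conda env list output; env_exists is a hypothetical helper name.

```shell
# env_exists is a hypothetical helper (not part of conda). It succeeds
# when an environment with the given name appears in the first column
# of the `conda env list` output.
env_exists() {
    conda env list | awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

# Example: create the environment only if it is not there yet.
# env_exists repro_variant || conda create --name repro_variant python=3.7
```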
3.1.6. Activate/Deactivate an environment
Depending on the configuration, the base environment is active when you start a new shell. You can deactivate it completely using:
conda deactivate
Then, you can activate another environment:
conda activate repro_variant
Activate the base and/or repro_variant environments and check that the Python version is indeed 3.9 in the first case and 3.7 in the second case.
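One possible way to carry out this check without switching environments by hand is conda run, which executes a command inside a named environment. This is a sketch that assumes a conda release providing conda run; env_python_version is a hypothetical helper name.

```shell
# env_python_version is a hypothetical helper (not part of conda).
# It prints the Python version of a named environment without
# activating that environment in the current shell.
env_python_version() {
    conda run -n "$1" python --version
}

# Example:
# env_python_version base
# env_python_version repro_variant
```

Running python --version after each conda activate gives the same information and is the more direct route when working interactively.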
3.1.7. Installing packages in the environment
First we will activate the environment:
conda activate repro_variant
You can see that the name of your current environment is displayed in the bash prompt, i.e. (repro_variant).
Then install the packages needed for our analysis:
conda install fastp bwa samtools ivar pangolin nextclade_js
You can check the packages installed, their version, build and source in the current environment using:
conda list
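Beyond listing the packages, a quick sanity check is to confirm that the tools can be found on the PATH of the active environment. The sketch below uses only standard shell features; report_missing_tools is a hypothetical helper name, and the example assumes that each executable is named after its package, which may not hold for every package.

```shell
# report_missing_tools is a hypothetical helper. It prints every
# command that cannot be found on the PATH and returns a non-zero
# status if at least one of them is missing.
report_missing_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "not found on PATH: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Example (executable names assumed to match the package names):
# report_missing_tools fastp bwa samtools ivar pangolin
```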
Note
Environment creation and package installations can be done in a single command, for example:
conda create --name myenv fastp bwa samtools ivar pangolin nextclade_js
3.1.8. Searching packages in channels
You can search for a particular package in the channels you defined like so:
conda search deseq
The results will list all the packages matching the *deseq* pattern, as well as all the versions available in the different channels.
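A search can also be restricted to a single channel, which is handy for finding out where a package comes from. This sketch relies on the --channel and --override-channels options of conda search; search_in_channel is a hypothetical helper name.

```shell
# search_in_channel is a hypothetical helper (not part of conda).
# It searches one channel only, ignoring the channels configured
# in ~/.condarc.
search_in_channel() {
    conda search --override-channels --channel "$1" "$2"
}

# Example: look for bwa in the bioconda channel only.
# search_in_channel bioconda bwa
```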
3.1.9. Exporting/Reproducing environments
One way to reach a certain degree of reproducibility is to share your environment with the community. You can do so by exporting the content of your environment using:
conda env export > repro_variant.yml
Your channel setup and packages, with versions and builds, will then be stored in the YAML file repro_variant.yml.
You could now delete this environment and reproduce it from the YAML file:
# You will need to deactivate the environment before removing it
conda deactivate
conda env remove -n repro_variant
# Then recreate it from the YAML file
conda env create --file repro_variant.yml
# If you want to use an environment name other than the one specified in the YAML file
conda env create -n repro_variant_copy --file repro_variant.yml
Note
In our experience, including the builds in the environment export can cause problems when reproducing the environment. You can use this command as a workaround:
conda env export --no-builds > repro_variant_no_builds.yml
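The exported file also records a prefix: line holding the absolute path of the environment on your machine. A possible variant of the export, sketched below, drops that line so that the shared file contains no local path; export_portable is a hypothetical helper name, and whether removing the line is needed for your use case is an assumption to check.

```shell
# export_portable is a hypothetical helper (not part of conda).
# It exports an environment without build strings and drops the
# machine-specific "prefix:" line from the result.
export_portable() {
    conda env export --name "$1" --no-builds | grep -v '^prefix:' > "$2"
}

# Example:
# export_portable repro_variant repro_variant_no_builds.yml
```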