Conda

Conda is an open-source package and environment management system that finds and installs packages. It was created for Python programs, but it can package and distribute software for any language. Conda also makes it possible to keep completely separate, mutually conflicting environments side by side and to load or unload them as necessary.

Distributions

The conda package manager is distributed with both the Miniconda and Anaconda distributions. Miniconda contains the bare minimum of packages needed for the conda package manager to work, while Anaconda adds many commonly used packages and a graphical user interface (see the figure below).

[Figure: CONDA]

Miniconda Installation

A complete guide to Miniconda installation can be found in the official documentation. To install Miniconda into ~/miniconda3, download the latest Miniconda3 installer and run it:

login01:~ $ mkdir -p ~/miniconda3
login01:~ $ wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda3/miniconda.sh
login01:~ $ bash ~/miniconda3/miniconda.sh -b -u -p ~/miniconda3

After installation initialize conda:

login01:~ $ ~/miniconda3/bin/conda init bash
For changes to take effect, close and re-open your current shell.

Conda channels

Conda channels are the locations where conda packages are stored. By default, packages are downloaded and updated from the defaults channel, but other channels (e.g., conda-forge) can be specified using the --channel flag:

login01:~ $ conda install <package> --channel conda-forge

Managing environments

Conda allows you to create isolated environments for different projects or tasks. To create a new environment named myenv, run:

login01:~ $ conda create --name myenv

To activate or deactivate the environment, use:

login01:~ $ conda activate myenv
login01:~ $ conda deactivate

Alternatively, a conda environment can be created from an environment definition file. Typically, the environment name is stated in the first line of the file. The file is conventionally called environment.yml, but any file name works as long as you pass it to the command with the -f flag. For example, the following command creates the conda environment defined in a file called my-env.yml:

login01:~ $ conda env create -f my-env.yml

Creating an environment from a YAML file is useful when sharing conda environments with others, as it makes it easier to collaborate on projects or reproduce specific software setups.
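
As an illustration, a minimal definition file could look like the sketch below, written here through a shell heredoc. The environment name, the channel, and the listed packages and versions are placeholder assumptions, not a recommended configuration.

```shell
# Write a minimal environment definition file (all values are illustrative)
cat > my-env.yml << 'EOF'
name: myenv
channels:
  - conda-forge
dependencies:
  - python=3.11
  - numpy
  - matplotlib
EOF

# Inspect the file before creating an environment from it
cat my-env.yml
```

The resulting file can then be passed to conda env create with the -f flag as shown above.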

To see a list of all of your environments, in your terminal window, run:

login01:~ $ conda info --envs

To remove an environment, in your terminal window, run:

login01:~ $ conda remove --name myenv --all

To verify that the environment was removed, in your terminal window, run:

login01:~ $ conda info --envs

Sharing a conda environment

With conda, you can export the environment to a YAML file that contains a list of all the packages and their versions. Open the terminal or command prompt, activate the environment you want to share, and use the following command:

login01:~ $ conda env export > my-env.yml

Once you have the my-env.yml file, you can share it with others. The my-env.yml file contains information about the dependencies required for the environment, including packages, versions, and channels. To create an identical environment, the other person can use the following command in their terminal or command prompt:

login01:~ $ conda env create -f my-env.yml

If your environment relies on packages from specific channels other than the defaults, then you might need to add those channels using:

login01:~ $ conda config --add channels <channel-name>
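
A full export also records build strings, which are often specific to one platform. The sketch below wraps the export in a small helper that omits them by using the --no-builds option; the helper name export_env_portable is hypothetical and only serves the example.

```shell
# Hypothetical helper: export environment $1 to file $2 without build strings,
# which tends to make the file easier to reuse on a different platform.
export_env_portable() {
    conda env export --name "$1" --no-builds > "$2"
}

# Usage sketch (assumes an environment called myenv exists):
# export_env_portable myenv my-env.yml
#
# Alternatively, list only the packages that were explicitly requested:
# conda env export --name myenv --from-history > my-env.yml
```

Which variant fits best depends on how strictly the environment has to be reproduced: the full export pins everything, while the reduced forms leave more freedom to the solver on the receiving side.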

Installing packages

You can install packages into a previously created environment. To do this, you can either activate the environment you want to modify or specify the environment name on the command line:

# via environment activation
login01:~ $ conda activate myenv
login01:~ $ conda install matplotlib

# via command line option
login01:~ $ conda install --name myenv matplotlib

To see a list of all packages installed in a specific environment:

# If the environment is not active
login01:~ $ conda list -n myenv

# If the environment is active
login01:~ $ conda list

# To see if a specific package is installed in an environment
login01:~ $ conda list -n myenv <package>
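
Inside a script, the same check can be turned into a yes/no test. The following is a minimal sketch, assuming that conda list prints one package per line with the package name in the first column; the helper name env_has_package is hypothetical.

```shell
# Hypothetical helper: succeeds if package $2 is listed in environment $1.
# "conda list --full-name" restricts the match to the exact package name.
env_has_package() {
    conda list --name "$1" --full-name "$2" 2>/dev/null \
        | awk -v pkg="$2" '$1 == pkg { found = 1 } END { exit !found }'
}

# Usage sketch (assumes an environment called myenv exists):
# if env_has_package myenv matplotlib; then
#     echo "matplotlib is installed in myenv"
# else
#     echo "matplotlib is missing from myenv"
# fi
```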

Base environment

The Python packaging system is prone to develop incompatibilities over time; the more packages you install into one conda environment, the more complex the dependency graph gets, which makes the default base environment prone to problems and breakage each time another package is installed.

For this reason, it is highly recommended to utilize separate conda environments for each project/purpose in order to mitigate the dependency management issues of the Python packaging system and to keep project dependencies as separate and simple as possible.

Using conda in submission scripts

Since all computationally heavy operations must be performed on compute nodes, conda environments can also be used in jobs submitted through the SLURM scheduler:

Conda submission script

#!/bin/bash
#SBATCH -J "sample_job"     # name of job in SLURM
#SBATCH --account=<project> # project number
#SBATCH --partition=short   # select partition short, medium, long, ngpu
#SBATCH --nodes=        # number of nodes
#SBATCH --ntasks=       # number of mpi ranks, needs to be tested for the best performance
#SBATCH --cpus-per-task=    # number of cpus per mpi rank, needs to be tested for the best performance
#SBATCH --time=hh:mm:ss     # time limit for a job
#SBATCH -o stdout.%J.out    # standard output
#SBATCH -e stderr.%J.out    # error output

echo "Launched at $(date)"
echo "Job ID: ${SLURM_JOBID}"
echo "Node list: ${SLURM_NODELIST}"
echo "Submit dir.: ${SLURM_SUBMIT_DIR}"
echo "Numb. of cores: ${SLURM_CPUS_PER_TASK}"

conda activate myenv

# Continue with your code ...

# Example
export SRUN_CPUS_PER_TASK="${SLURM_CPUS_PER_TASK}"
export OMP_NUM_THREADS=1

srun ...
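
Batch jobs usually run in a non-interactive shell, so the setup that conda init writes to ~/.bashrc may not be loaded, and conda activate can then fail inside the job. Whether this happens depends on how the cluster and your shell are configured. The sketch below is one possible workaround, assuming the ~/miniconda3 installation prefix used earlier; the helper name activate_conda_env and the CONDA_ROOT variable are hypothetical.

```shell
# Hypothetical helper for batch scripts: load conda's shell functions, then activate.
# CONDA_ROOT is an assumed variable name; it defaults to the ~/miniconda3 prefix.
activate_conda_env() {
    conda_root="${CONDA_ROOT:-$HOME/miniconda3}"
    if [ ! -f "$conda_root/etc/profile.d/conda.sh" ]; then
        echo "conda.sh not found under $conda_root" >&2
        return 1
    fi
    # Define the conda shell function in the current (non-interactive) shell
    . "$conda_root/etc/profile.d/conda.sh"
    conda activate "$1"
}

# In the submission script, the bare activation could be replaced with:
# activate_conda_env myenv || exit 1
```

Stopping the job when activation fails avoids running the workload against the wrong Python installation.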

Cleaning up package data

Conda downloads and stores a sizable amount of data in order to provide packages to the various environments. This data consumes space and counts towards your /home quota. Since conda installations are self-managed, you need to clean up unused data yourself:

conda clean --all

This opens an interactive dialogue with details about the operations to be performed. You can accept the default options unless you have manually edited any files in your package data directory.
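
To judge how much space the cleanup frees, the size of the package cache can be checked before and after. The sketch below assumes the cache lives in ~/miniconda3/pkgs, which matches the installation prefix used earlier but may differ on your system; the helper name pkgs_cache_size is hypothetical.

```shell
# Hypothetical helper: print the disk usage of a conda package cache directory.
# The default path assumes a Miniconda installation in ~/miniconda3.
pkgs_cache_size() {
    cache_dir="${1:-$HOME/miniconda3/pkgs}"
    if [ -d "$cache_dir" ]; then
        # du prints "<size><TAB><path>"; keep only the size column
        du -sh "$cache_dir" | cut -f1
    else
        echo "no package cache at $cache_dir" >&2
        return 1
    fi
}

# Usage sketch:
# pkgs_cache_size                  # size before cleaning
# conda clean --all --dry-run      # show what would be removed, delete nothing
# conda clean --all                # perform the cleanup
# pkgs_cache_size                  # size after cleaning
```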

Created by: marek.steklac