Containers

Singularity 3.x Usage on HPC Clusters

Singularity is available on the HPC clusters at Iowa State. It allows users to create their own environments and to develop and customize their workflows without the need for admin intervention.

As always, email hpc-help@iastate.edu if you have questions regarding the use of Singularity.

Using Docker and Singularity images from existing container libraries

List of useful container libraries

1. Docker Based Container Libraries

Docker Hub: https://hub.docker.com/

Nvidia GPU-Accelerated Containers (NGC): https://ngc.nvidia.com/

Quay (Bioinformatics): https://quay.io/ or https://biocontainers.pro/#/registry

2. Singularity Container Library

Singularity Library: https://cloud.sylabs.io/library

Using Singularity

Visit any one of the above listed libraries and search for the container image you need.
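
If you are not already on a compute node, an interactive session can be requested through Slurm first; the line below is only a sketch (the node, core, and time values are assumptions, adjust them for your cluster and queue):

srun --pty -N 1 -n 4 -t 1:00:00 bash    # request an interactive shell on a compute node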

After gaining interactive access to one of the compute nodes and loading the Singularity module, first pull (download) the pre-built image:

module load singularity          # loads the latest Singularity module (v3.1.1 as of 09-23-2019)
singularity pull docker://gcc    # pulls an image from Docker Hub

This pulls the latest GCC container (GCC 9.1.0 as of 08/19/2019) and saves the image in your current working directory.

If you prefer to pull a specific GCC version, look at the available tags for that container and append the tag to the end of the container name. For example, to pull GCC 5.3.0:

singularity pull docker://gcc:5.3.0

Note: You can pull images to a directory of your choosing (assuming you have write permission) by setting the variables SINGULARITY_CACHEDIR and SINGULARITY_TMPDIR. For instance:

export SINGULARITY_CACHEDIR=$TMPDIR
export SINGULARITY_TMPDIR=$TMPDIR

Note on home directory and Singularity
While pulling containers, pay attention to your home directory usage, as the cached image blobs are saved under ${HOME}/.singularity by default.
Since the home directory has a limited amount of space, it can fill up quickly. You can change where files are cached by setting the SINGULARITY_CACHEDIR and SINGULARITY_TMPDIR environment variables.
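
If the cache under ${HOME}/.singularity has already grown large, recent Singularity 3.x releases also provide a cache subcommand to inspect and empty it; a quick sketch:

singularity cache list     # show what is currently cached
singularity cache clean    # remove cached blobs to free up home directory space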

To use the executables within the container:

$ singularity exec gcc.img gcc --version
gcc (GCC) 9.1.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

You can use this to compile your code or run your scripts using the container’s executables.

$ singularity exec gcc.img gcc hello_world.c -o hello_world
$ singularity exec gcc.img ./hello_world
Hello, World!

/home/${USER}, /work, /ptmp and ${TMPDIR} are accessible from within the container.

$ pwd
/home/ynanyam

$ ls
catkin_ws       cuda       env_before     luarocks        pycuda_test.py

$ singularity exec gcc.img pwd
/home/ynanyam

$ singularity exec gcc.img ls
catkin_ws       cuda       env_before     luarocks        pycuda_test.py
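
Directories outside these default locations are not visible inside the container unless you bind mount them yourself with the --bind (or -B) option; a quick sketch using a hypothetical host path:

$ singularity exec --bind /myscratch/dataset:/data gcc.img ls /data    # /myscratch/dataset is a hypothetical host path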

Interactive access

To gain interactive access to a shell inside the container:

$ singularity shell gcc.img
Singularity gcc.img:~> gcc --version
gcc (GCC) 9.1.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

To access GPUs within a container, use the "--nv" option:

singularity exec --nv <options>
singularity shell --nv <options>
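
A quick way to confirm that GPUs are visible inside a container is to run nvidia-smi through it (the image name below is only a placeholder):

singularity exec --nv mycontainer.sif nvidia-smi    # mycontainer.sif is a placeholder image name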

Slurm batch jobs with containers

Here is an example Slurm batch script that downloads a container and uses it to run a program:

#!/bin/bash
#SBATCH -N1
#SBATCH -n20
#SBATCH -t120

unset XDG_RUNTIME_DIR

cd $TMPDIR

wget ftp://ftp.ncbi.nlm.nih.gov/genomes/Bactrocera_oleae/protein/protein.fa.gz
gunzip protein.fa.gz

SINGULARITY_TMPDIR=$TMPDIR SINGULARITY_CACHEDIR=$TMPDIR singularity build clustalo.sif docker://quay.io/biocontainers/clustalo:1.2.4--1

singularity exec clustalo.sif clustalo -i protein.fa -o result.fasta --threads=${SLURM_NPROCS} -v
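
Assuming the script above is saved as clustalo_job.sh (a hypothetical file name), it is submitted in the usual way:

sbatch clustalo_job.sh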

Singularity and MPI

Reference: https://sylabs.io/guides/3.3/user-guide/mpi.html

Singularity supports running MPI applications from within a container. In order to use Singularity containers with MPI, an MPI implementation must be available both on the host and within the container.

This requires the host and container MPI versions to be compatible. One way to make sure the versions match is to use the MPI executables and libraries available on the host and bind them into the container.

Below is a simple MPI batch script that uses OpenMPI:

#!/bin/bash
#SBATCH -N1
#SBATCH -n4
#SBATCH -t20
#SBATCH -e slurm-%j.err
#SBATCH -o slurm-%j.out

unset XDG_RUNTIME_DIR

cd $TMPDIR

SINGULARITY_TMPDIR=$TMPDIR SINGULARITY_CACHEDIR=$TMPDIR singularity build centos-openmpi.sif docker://centos

export SINGULARITYENV_PREPEND_PATH=/opt/rit/spack-app/linux-rhel7-x86_64/gcc-4.8.5/openmpi-3.1.3-dfskvlwuv5kwrmgx47ct57aorxdpak2c/bin
export SINGULARITYENV_LD_LIBRARY_PATH=/opt/rit/spack-app/linux-rhel7-x86_64/gcc-4.8.5/openmpi-3.1.3-dfskvlwuv5kwrmgx47ct57aorxdpak2c/lib

wget https://raw.githubusercontent.com/wesleykendall/mpitutorial/gh-pages/tutorials/mpi-hello-world/code/mpi_hello_world.c

module load openmpi

mpicc mpi_hello_world.c -o hello_world

mpirun -np ${SLURM_NPROCS} singularity exec --bind /opt/rit/spack-app/linux-rhel7-x86_64/gcc-4.8.5/openmpi-3.1.3-dfskvlwuv5kwrmgx47ct57aorxdpak2c centos-openmpi.sif ./hello_world

In the above example script, OpenMPI is made available to the container using

export SINGULARITYENV_PREPEND_PATH=/opt/rit/spack-app/linux-rhel7-x86_64/gcc-4.8.5/openmpi-3.1.3-dfskvlwuv5kwrmgx47ct57aorxdpak2c/bin
export SINGULARITYENV_LD_LIBRARY_PATH=/opt/rit/spack-app/linux-rhel7-x86_64/gcc-4.8.5/openmpi-3.1.3-dfskvlwuv5kwrmgx47ct57aorxdpak2c/lib

and the OpenMPI installation directory is then bind mounted into the container. (By default, we do not bind mount the software directories.)

In most cases, the containers we use do not include the InfiniBand drivers, so OpenMPI fails over and falls back to the Ethernet interface.

If you are building your own containers that make use of MPI, be sure to include the OFED InfiniBand driver stack.
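
For containers built from a definition file on a Debian/Ubuntu base image, a rough sketch of adding the userspace InfiniBand libraries in the %post section is shown below (the package names are assumptions and vary by distribution and OFED version):

%post
    # userspace InfiniBand/verbs libraries (package names are assumptions, adjust for your base image)
    apt-get update && apt-get install -y libibverbs1 librdmacm1 ibverbs-providers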

Singularity and Nvidia Container Library (Optional)

Nvidia provides a container library that works with both Docker and Singularity.

UPDATE (05/11/2020):

The NGC container library is now public. No account or API key is required to access the library.

Users must have an NGC account and create an API key (save your key in a secure location!). Instructions to get started are here: https://docs.nvidia.com/ngc/ngc-user-guide/singularity.html#singularity

Once you have created an account, export the required environment variables and download containers from the library: https://ngc.nvidia.com/catalog/containers

NOTE:
Do not set the environment variables SINGULARITY_DOCKER_USERNAME and SINGULARITY_DOCKER_PASSWORD in your .bashrc file, as doing so prevents pulling containers from libraries other than NGC: https://groups.google.com/a/lbl.gov/forum/#!topic/singularity/9q1aTycZ6CA

Below is an example script that downloads the NAMD container and then runs a benchmark on a GPU node: https://ngc.nvidia.com/catalog/containers/hpc:namd

#!/bin/bash
#SBATCH -N1
#SBATCH -n36
#SBATCH -t10
#SBATCH -pgpu

unset XDG_RUNTIME_DIR

cd $TMPDIR

export SINGULARITY_DOCKER_USERNAME='$oauthtoken'
export SINGULARITY_DOCKER_PASSWORD=<API Key>

wget http://www.ks.uiuc.edu/Research/namd/utilities/stmv.tar.gz
tar xf stmv.tar.gz

curl -s https://www.ks.uiuc.edu/Research/namd/2.13/benchmarks/stmv_nve_cuda.namd > stmv/stmv_nve_cuda.namd  # constant energy
curl -s https://www.ks.uiuc.edu/Research/namd/2.13/benchmarks/stmv_npt_cuda.namd > stmv/stmv_npt_cuda.namd  # constant pressure

nvidia-cuda-mps-server  # loads nvidia modules to work with singularity GPU containers

SINGULARITY_TMPDIR=$TMPDIR SINGULARITY_CACHEDIR=$TMPDIR singularity pull docker://nvcr.io/hpc/namd:2.13-singlenode

singularity exec --nv --bind $PWD:/host_pwd namd_2.13-singlenode.sif namd2 +ppn ${SLURM_NPROCS} +setcpuaffinity +idlepoll +devices 0,1 /host_pwd/stmv/stmv_nve_cuda.namd  # stmv constant energy benchmark
singularity exec --nv --bind $PWD:/host_pwd namd_2.13-singlenode.sif namd2 +ppn ${SLURM_NPROCS} +setcpuaffinity +idlepoll +devices 0,1 /host_pwd/stmv/stmv_npt_cuda.namd  # stmv constant pressure benchmark

Building Containers

Since Docker images are compatible with Singularity, users can write recipes for their containers either as a Dockerfile or as a Singularity definition file.

Be aware that regular users do not have the privileges required to build container images from recipes on the cluster. They either need privileged access to a machine with Docker or Singularity installed, or they can push their recipes to Docker Hub or the Singularity (Sylabs) library and have the images built there; we recommend the latter.

Docker

Below are links to get started on building Docker containers:

Dockerfile reference - https://docs.docker.com/engine/reference/builder/

Upload your Dockerfiles to Docker Hub - https://docs.docker.com/docker-hub/
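
Once an image built from your Dockerfile has been pushed to Docker Hub, it can be pulled on the cluster like any other Docker image; the account, image, and tag names below are placeholders:

singularity pull docker://<dockerhub-username>/<image-name>:<tag>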

Singularity

Create an account at Sylabs to build a container using a definition file: https://cloud.sylabs.io/builder

Once logged in, you are presented with a text box where you can start writing the definition.

Detailed documentation for Singularity definition files is available in the Sylabs user guide, but below is a simple example to get you started.

Bootstrap: docker
From: continuumio/miniconda3

%labels
    maintainer "Name" <email address>

%post
    apt-get update && apt-get install -y git
    # Conda install stringtie
    conda install -c bioconda stringtie

The header includes the Bootstrap agent and the base image the container is built from.

Most definition files use docker as their bootstrap agent, since the Docker library is more extensive and well maintained.

This example uses an existing Docker image for Miniconda (Python 3), available from https://hub.docker.com/r/continuumio/miniconda3.

The %post section contains any modifications or additions the user makes to the original container; in this case, we are adding the stringtie package to the container.
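
As an alternative to the web text box, the same definition can be built through the Sylabs remote builder from the command line; a sketch assuming the definition above is saved as stringtie.def (a hypothetical file name) and you have generated a Sylabs access token:

singularity remote login                                  # paste your Sylabs access token when prompted
singularity build --remote stringtie.sif stringtie.def    # build on the Sylabs cloud builder, no root required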

Things to keep in mind

  • By default, /work, /home, ${TMPDIR} and /ptmp are bind mounted into the container

  • User outside the container = User inside the container (This implies that the permissions within the container are the same as on the bare-metal compute node)

  • The entire networking stack is available from within the container; if the container has the InfiniBand stack installed, it will make use of that network

  • Having

      unset XDG_RUNTIME_DIR  

    in your Slurm script is useful when you run a Jupyter notebook from the container. It also removes some annoying warnings from your logs

  • In the examples above, everything was done within ${TMPDIR}, which is deleted at the end of the job. Make sure you copy the output to your project directory to retain your work

  • Make sure you issue

      nvidia-cuda-mps-server 

    when using GPU nodes, as this loads all the required modules to make them work with Singularity

Workshop Materials

The following materials from the workshop on October 14, 2021 are available for download: