Using NVIDIA enroot to run containers
enroot is currently supported on AiMOS (DCS) and AiMOSx (NPL) to run containers. enroot is similar to Docker and Singularity but is compatible with our GPFS file system.
Before using enroot
By default, enroot will attempt to install containers under $HOME. In most cases, it will be necessary to place these files somewhere else to avoid hitting quota limits.
For example, they can be moved to the barn:
mkdir -p $HOME/.local/share
mkdir $HOME/barn/enroot
ln -s $(readlink $HOME/barn)/enroot $HOME/.local/share/enroot
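The commands above fail if they are run a second time, because the directory and the link already exist. A minimal sketch of a rerunnable version follows. The function name `setup_enroot_dir` and its optional home-directory argument are hypothetical, added only for illustration, and the sketch assumes `barn` is a symbolic link, as the `readlink` call above implies.

```shell
#!/bin/bash
# Sketch: point enroot's data directory at the barn, safely rerunnable.
# setup_enroot_dir is a hypothetical helper name; the optional first argument
# is the home directory to operate on (defaults to $HOME).
setup_enroot_dir() {
    local home="${1:-$HOME}"
    local link="$home/.local/share/enroot"
    local barn

    # Resolve the real barn path, as the commands above do.
    barn="$(readlink "$home/barn")" || {
        echo "no barn symlink found in $home" >&2
        return 1
    }

    mkdir -p "$home/.local/share" "$barn/enroot"

    if [ -L "$link" ]; then
        echo "already linked: $link -> $(readlink "$link")"
    elif [ -e "$link" ]; then
        # An existing real directory is left alone for the user to move by hand.
        echo "$link exists and is not a symlink; move its contents first" >&2
        return 1
    else
        ln -s "$barn/enroot" "$link"
        echo "linked: $link -> $barn/enroot"
    fi
}
```

Calling `setup_enroot_dir` with no arguments operates on the current user's $HOME.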
Importing container images
enroot can import a container image and package it into a single squashfs file, which is then used to create enroot containers.
CCI does not grant users the admin/root privileges necessary to import container images on CCI systems. Users wishing to import container images must do so on another system where they have root privileges. Instructions for installing enroot and importing container images are available on the enroot GitHub page.
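As a rough illustration of the import step on such an external system, the sketch below wraps `enroot import`. The helper name `import_image`, the image URI, and the output file name are placeholders rather than CCI recommendations. The optional architecture argument is included on the assumption that the target nodes may differ from the machine doing the import; check the enroot documentation for the values your version accepts.

```shell
#!/bin/bash
# Sketch: import an image to a squashfs file on a machine where you have root.
# import_image is a hypothetical helper name used only for illustration.
import_image() {
    local uri="$1"        # e.g. docker://ubuntu
    local out="$2"        # e.g. ubuntu.sqsh
    local arch="${3:-}"   # optional: architecture of the image to fetch

    if [ -n "$arch" ]; then
        enroot import --arch "$arch" --output "$out" "$uri"
    else
        enroot import --output "$out" "$uri"
    fi
}

# Example (not run here):
#   import_image docker://ubuntu ubuntu.sqsh
# Then copy the resulting .sqsh file to CCI with your usual transfer tool.
```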
Creating containers
Creating a container means unpacking a squashfs container image into a named container instance. Once a container image has been imported, a container can be created:
enroot create --name mycontainer previously-imported-image.sqsh
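In job scripts it can be convenient to create the container only when it does not already exist. The sketch below checks the output of `enroot list` first. The helper name `ensure_container` is hypothetical, and the sketch assumes `enroot list` prints one container name per line.

```shell
#!/bin/bash
# Sketch: create a container only if no container with that name exists yet.
# ensure_container is a hypothetical helper name used only for illustration.
ensure_container() {
    local name="$1"    # container name, e.g. mycontainer
    local image="$2"   # path to a previously imported .sqsh image

    # -F: fixed string, -x: match the whole line, -q: quiet
    if enroot list | grep -Fxq "$name"; then
        echo "container $name already exists"
    else
        enroot create --name "$name" "$image"
    fi
}

# Example (not run here):
#   ensure_container mycontainer previously-imported-image.sqsh
```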
Running
Once a container has been created, it can be run with enroot using sbatch or srun like any other application. The normal Slurm requirements for each cluster still apply, such as specifying time limits or GPU requirements.
Although not strictly necessary, you may also wish to pass environment variables to your containers. This can be done with the -e or --env flag, for example:
enroot start -e http_proxy=http://proxy:8888 -e https_proxy=http://proxy:8888 entrypoint
Running with srun:
srun enroot start entrypoint
Example batch script to run with sbatch:
#!/bin/bash
srun enroot start entrypoint
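A slightly fuller batch script, with the time limit and GPU request given as #SBATCH directives, might look like the sketch below. The file name `enroot-job.sh`, the job name, the five-minute limit, and the single-GPU request are placeholder values, not site recommendations. The sketch assumes a container named `mycontainer` has already been created from one of the CUDA sample images, which provide the deviceQuery binary.

```shell
#!/bin/bash
# Sketch: write a hypothetical batch script, enroot-job.sh, to the current directory.
cat > enroot-job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=enroot-test
#SBATCH --time=00:05:00
#SBATCH --gres=gpu:1

# Assumes a container named "mycontainer" was already created with enroot create.
srun enroot start mycontainer /usr/local/cuda/samples/1_Utilities/deviceQuery/deviceQuery
EOF

# Submit it from a cluster login node (not run here):
#   sbatch enroot-job.sh
```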
Example containers
Some example containers are provided that can be created and used to sanity check an environment. They are based on the NVIDIA CUDA runtime containers and have the corresponding CUDA samples installed. The deviceQuery sample is already built, so the container can be created and run to test the environment, for example:
enroot create --name mycontainer /gpfs/u/software/container-examples/cuda-11.1-runtime-samples.sqsh
srun --gres=gpu:2 -t 5:00 enroot start mycontainer /usr/local/cuda/samples/1_Utilities/deviceQuery/deviceQuery
Examples provided in /gpfs/u/software/container-examples/:
- cuda-ppc64le-10.2-runtime-samples.sqsh
- cuda-11.1-runtime-samples.sqsh