Singularity Containers on Olympus GPU Nodes

Who can use the GPU nodes?

GPU nodes are available to faculty and students for approved instructional and research use.  If you need GPU access, please have your professor contact the Linux support team.

A directory for your GPU work will be created on:

/mnt/shared-scratch

What is Singularity?

Singularity is a container platform similar to Docker.  Singularity uses containers as the base of your virtual instance.  You can choose a pre-built Singularity container that has the versions of CUDA and cuDNN you need for your project, or you can build your own Singularity container and use it on Olympus.

NOTE: You can copy .sif files to any Linux system running Singularity, including HPRC and workstations.
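
For example, a .sif image can be copied to another system with scp (the file name, user, host, and destination path below are placeholders):

scp container.sif netid@remote-host:/path/to/destination/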

Where are the Singularity containers?

The pre-built Singularity container images are located in:

/mnt/shared-scratch/containers

The name of each container is descriptive.

For example:

cuda_10.2-devel-ubuntu18.04.sif
ubuntu-22.04-cuda_11.7.sif
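
To see the full list of available images, run:

ls /mnt/shared-scratch/containers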

How can I download a container that meets my requirements?

Docker images can be pulled from Docker Hub or other container registries.

https://hub.docker.com/r/nvidia/cuda/tags?page=1 lists the official NVIDIA CUDA images.

  1. Select a docker image that closely matches your requirements.

  2. Change to your directory located on the /mnt/shared-scratch drive.
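
For example, if your directory is named after your username (the name below is a placeholder):

cd /mnt/shared-scratch/your_username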

This is an example command to pull a Docker image with CUDA 11.7.1 and cuDNN 8 on Ubuntu 20.04:

singularity pull docker://nvidia/cuda:11.7.1-cudnn8-runtime-ubuntu20.04

This command creates a file named cuda_11.7.1-cudnn8-runtime-ubuntu20.04.sif.

NOTE: You can rename this file, but keep the .sif extension.
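
For example, to give the image a shorter name (the new name is just an illustration):

mv cuda_11.7.1-cudnn8-runtime-ubuntu20.04.sif my-project.sif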

How do I use Singularity with a container?

Once you have selected the image with the CUDA/cuDNN versions you need, start the container instance on a GPU node by using the command:

load-singularity /path/to/container.sif

For example, to use the image from the pull example above (the path below is a placeholder for your own directory):

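load-singularity /mnt/shared-scratch/your_directory/cuda_11.7.1-cudnn8-runtime-ubuntu20.04.sif
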
The prompt will change to Singularity> 

When finished, use the exit command to exit the container.

How do I add packages to the container (Python 3.x, Anaconda, etc.)?

To add software to the container, the .sif file needs to be converted into a sandbox (editable) image.  The following command will convert the example .sif image into a sandbox image:

NOTE: Please be sure you are in the directory where the .sif image is before running the command.

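A typical command for this step (standard Singularity syntax, using the image name from the pull example above) is:

singularity build --sandbox cuda_11.7.1-cudnn8-runtime-ubuntu20.04 cuda_11.7.1-cudnn8-runtime-ubuntu20.04.sif
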
This will create a directory named cuda_11.7.1-cudnn8-runtime-ubuntu20.04.

The following command will provide root access to the sandboxed container:

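On systems where fakeroot is enabled, a typical command is (if this fails, check with the Linux support team for the method used on Olympus):

singularity shell --writable --fakeroot cuda_11.7.1-cudnn8-runtime-ubuntu20.04

Inside the writable shell you can install packages as usual for an Ubuntu-based image, for example:

apt-get update && apt-get install -y python3 python3-pip
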
Once you have installed the packages, convert the sandboxed image back to a .sif file using the following command:

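A typical command for this step (standard Singularity syntax; --force allows overwriting the existing file) is:

singularity build --force cuda_11.7.1-cudnn8-runtime-ubuntu20.04.sif cuda_11.7.1-cudnn8-runtime-ubuntu20.04
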
This will overwrite the existing .sif file.  To keep the older .sif file, change the output filename in the command above.

What directories can I use for my files?

Three data storage options are available on the GPU nodes:

/mnt/research – This contains your professor's research group drive and is on Engineering's data storage.  It is accessed via the campus network and has the slowest access time of the three options.  If your jobs are read/write intensive, they will run slowly from this directory.

/mnt/shared-scratch – This contains your working directory.  Total space is 22TB; the default per-user quota is 500GB but can be increased.  It is a network datastore on the same network as the GPU nodes, which provides much faster access times.  Read/write intensive jobs will run much faster from this directory.

/mnt/scratch – A 1.5TB NVMe drive in each GPU node.  This drive is local to the node (not shared between nodes) and provides the fastest read/write access, but data has to be copied to the specific node you are going to run on, as shown in the example below.
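
For example, to stage a dataset onto the node's local drive before a run (the source path is a placeholder for your own directory):

cp -r /mnt/shared-scratch/your_directory/dataset /mnt/scratch/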

Where can I get more information on Singularity?

See https://tamuengr.atlassian.net/wiki/spaces/helpdesk/pages/1983512577 for more information.