NPL Cluster
Users may connect to nplfen01 to build applications and submit jobs via Slurm.
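For example, a login typically looks like the sketch below; the fully qualified hostname and the authentication method are site-specific and not given on this page, and username is a placeholder.

```bash
ssh username@nplfen01
```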
System information
The system contains 40 nodes, each with:
- 2x 20-core 2.5 GHz Intel Xeon Gold 6248 CPUs
- 8x NVIDIA Tesla V100 GPUs, each with 32 GiB of HBM
- 768 GiB of RAM
- Dual 100 Gb EDR InfiniBand
Resource allocation
The primary schedulable resource on the NPL cluster is the GPU. When submitting a job, you typically only need to specify how many GPUs you need; each GPU requested also allocates 10 CPU cores and 64 GB of system RAM. If you need additional memory, you can request it with the --mem Slurm option.
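As a rough sketch, a batch script for a two-GPU job might look like the following. The --gres=gpu:N syntax is standard Slurm but is an assumption here rather than taken from this page, and my_gpu_app is a hypothetical application name; check the cluster's Slurm examples for the exact resource names in use.

```bash
#!/bin/bash
#SBATCH --job-name=gpu-job
#SBATCH --gres=gpu:2        # 2 GPUs -> 20 CPU cores and 128 GB RAM by default
#SBATCH --time=01:00:00
#SBATCH --mem=256G          # optional: only needed if the per-GPU default is not enough

# Hypothetical application; replace with your own build.
srun ./my_gpu_app
```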
Using NVMe storage
To use the NVMe storage in a node, request it along with the job specification: --gres=nvme. (This can be combined with other requests, such as GPUs.) When the first job step starts, the system will initialize the storage and create the path /mnt/nvme/uid_${SLURM_JOB_UID}/job_${SLURM_JOB_ID}.
The storage is not persistent between allocations; however, it may be used and shared by multiple job steps within an allocation (see Slurm job arrays).
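A minimal sketch of a job that combines a GPU with the node-local NVMe scratch is shown below. The gpu:1 portion of the --gres request is the same assumption as above, and the dataset copy and train command are hypothetical placeholders.

```bash
#!/bin/bash
#SBATCH --job-name=nvme-scratch
#SBATCH --gres=gpu:1,nvme   # one GPU plus node-local NVMe scratch
#SBATCH --time=02:00:00

# Scratch path created by the system when the first job step starts.
SCRATCH=/mnt/nvme/uid_${SLURM_JOB_UID}/job_${SLURM_JOB_ID}

# Job steps within the same allocation share the scratch directory.
srun cp -r "${HOME}/dataset" "${SCRATCH}/"
srun ./train --data "${SCRATCH}/dataset"

# The scratch space is not persistent: copy anything you need back
# to permanent storage before the allocation ends.
cp -r "${SCRATCH}/output" "${HOME}/job_${SLURM_JOB_ID}_output"
```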