NPL Cluster
Users may connect to nplfen01 to build applications and submit jobs via Slurm.
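For example, a login typically looks like the sketch below; the fully qualified hostname and the authentication method are site-specific and not given on this page, and username is a placeholder.

```bash
ssh username@nplfen01
```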
System information
The system contains 40 nodes, each with:
- 2x 20-core 2.5 GHz Intel Xeon Gold 6248 CPUs
- 8x NVIDIA Tesla V100 GPUs, each with 32 GiB of HBM
- 768 GiB of RAM
- Dual 100 Gb EDR InfiniBand
Resource allocation
The primary schedulable resource on the NPL cluster is the GPU. When submitting a job, you typically only need to specify how many GPUs you need; each GPU requested also allocates 10 CPU cores and 64 GB of system RAM. If you need additional memory, you can request it with the --mem Slurm option.
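As a rough sketch, a batch script for a two-GPU job might look like the following. The --gres=gpu:N syntax is standard Slurm but is an assumption here rather than taken from this page, and my_gpu_app is a hypothetical application name; check the cluster's Slurm examples for the exact resource names in use.

```bash
#!/bin/bash
#SBATCH --job-name=gpu-job
#SBATCH --gres=gpu:2        # 2 GPUs -> 20 CPU cores and 128 GB RAM by default
#SBATCH --time=01:00:00
#SBATCH --mem=256G          # optional: only needed if the per-GPU default is not enough

# Hypothetical application; replace with your own build.
srun ./my_gpu_app
```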
Using NVMe storage
To use the NVMe storage in a node, request it along with the job specification: --gres=nvme. (This can be combined with other requests, such as GPUs.) When the first job step starts, the system will initialize the storage and create the path /mnt/nvme/uid_${SLURM_JOB_UID}/job_${SLURM_JOB_ID}.
The storage is not persistent between allocations; however, it may be used and shared by multiple job steps within an allocation (see Slurm job arrays).
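A minimal sketch of a job that combines a GPU with the node-local NVMe scratch is shown below. The gpu:1 portion of the --gres request is the same assumption as above, and the dataset copy and train command are hypothetical placeholders.

```bash
#!/bin/bash
#SBATCH --job-name=nvme-scratch
#SBATCH --gres=gpu:1,nvme   # one GPU plus node-local NVMe scratch
#SBATCH --time=02:00:00

# Scratch path created by the system when the first job step starts.
SCRATCH=/mnt/nvme/uid_${SLURM_JOB_UID}/job_${SLURM_JOB_ID}

# Job steps within the same allocation share the scratch directory.
srun cp -r "${HOME}/dataset" "${SCRATCH}/"
srun ./train --data "${SCRATCH}/dataset"

# The scratch space is not persistent: copy anything you need back
# to permanent storage before the allocation ends.
cp -r "${SCRATCH}/output" "${HOME}/job_${SLURM_JOB_ID}_output"
```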