PyTorch is a machine learning library based on the Torch library. It is used for applications such as computer vision and natural language processing. It was originally developed by Meta AI and is now part of the Linux Foundation umbrella.

PyTorch provides two high-level features:

Tensor computing (like NumPy) with acceleration via GPUs
Deep neural networks built on a tape-based automatic differentiation system
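
As a brief illustration of these two features, the sketch below builds a small tensor and lets the automatic differentiation system compute a gradient. The function name `gradient_of_sum_of_squares` is made up for this example, and the import guard is only there so the snippet can be run before PyTorch has been installed.

```python
try:
    import torch
except ImportError:  # PyTorch not installed yet; see Installation
    torch = None


def gradient_of_sum_of_squares(values):
    """Return d/dx of sum(x**2) at `values`, or None if PyTorch is missing."""
    if torch is None:
        return None
    # Tensor computing: a 1-D float tensor, similar to a NumPy array.
    x = torch.tensor(values, dtype=torch.float32, requires_grad=True)
    # Operations on x are recorded so they can be differentiated later.
    y = (x ** 2).sum()
    # Walk the recorded operations backwards to fill in x.grad.
    y.backward()
    return x.grad.tolist()


if __name__ == "__main__":
    print(gradient_of_sum_of_squares([1.0, 2.0, 3.0]))
```

With PyTorch installed, this prints `[2.0, 4.0, 6.0]`, which is the gradient 2x evaluated at each input value.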

Installation¶

The following directions assume a working Conda install.
Installation requires proxy access to allow external downloads.

DCS (AiMOS) Cluster¶

Set up Conda environment:¶

conda create -n "my_pytorch_environment" python=3.7.13

conda activate my_pytorch_environment

conda config --add channels https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda/#/

Install PyTorch in environment:¶

module load gcc
module load spectrum-mpi
module load cuda/11.2
conda install pytorch=1.3.1

NPL (AiMOSx) Cluster¶

Set up environment:¶

conda create -n "my_pytorch_environment" python=3.10.13

conda activate my_pytorch_environment

Install PyTorch:¶

module load gcc
module load cuda/12.1
conda install pytorch=2.4.1

NGH Cluster¶

COMING SOON!

Troubleshooting¶

It is no longer necessary to specify a CUDA version by installing "pytorch::pytorch-cuda".
If PyTorch cannot use CUDA, computations run on the CPU instead of the GPU and performance will suffer.

Confirm CUDA is enabled:¶

python

import torch
torch.cuda.is_available()
True
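
If the check returns False, a slightly more detailed report can help narrow down the cause. The following is a minimal sketch rather than an official diagnostic; the function name `cuda_report` and the dictionary keys are chosen for this example, and it uses only standard `torch` attributes.

```python
def cuda_report():
    """Return a small dict describing the PyTorch/CUDA setup."""
    try:
        import torch
    except ImportError:
        # PyTorch is not importable in the active environment.
        return {"torch_installed": False, "cuda_available": False}

    report = {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # CUDA version PyTorch was built against; None for CPU-only builds.
        "built_with_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
        "device_count": torch.cuda.device_count(),
    }
    if report["cuda_available"]:
        # Name of the first visible GPU.
        report["device_name"] = torch.cuda.get_device_name(0)
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

A `built_with_cuda` value of None suggests a CPU-only build was installed, while a CUDA version together with `cuda_available: False` more likely points to the environment, for example running on a node with no visible GPU.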