File System
CCI uses a single unified GPFS file system across all clusters and nodes, including the landing pad nodes.
NOTE: CCI does not perform any project data backups. Users are responsible for their own data management and backups.
Layout
The shared file system is divided into three main areas: home, scratch, and barn.
Each area of the file system has its own purpose and trade-offs.
NOTE: A user may only access data within their project and may not access data within other projects.
Each user has the following directory structure:
/gpfs
  /u
    /home
      /PROJ
        /USER
    /scratch
      /PROJ
        /shared
        /USER
    /barn
      /PROJ
        /shared
        /USER
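The layout above can also be written as a small helper that builds the expected paths for one project and user. This is a minimal sketch: the function name `cci_paths` and the dictionary keys are hypothetical, and the paths simply follow the tree shown above.

```python
from pathlib import PurePosixPath
from typing import Dict

GPFS_ROOT = PurePosixPath("/gpfs/u")  # root of the layout shown above


def cci_paths(project: str, user: str) -> Dict[str, PurePosixPath]:
    """Return the expected directories for one user in one project."""
    return {
        "home": GPFS_ROOT / "home" / project / user,
        "scratch": GPFS_ROOT / "scratch" / project / user,
        "scratch-shared": GPFS_ROOT / "scratch" / project / "shared",
        "barn": GPFS_ROOT / "barn" / project / user,
        "barn-shared": GPFS_ROOT / "barn" / project / "shared",
    }
```

For example, `cci_paths("PROJ", "USER")["barn-shared"]` gives `/gpfs/u/barn/PROJ/shared`, where `PROJ` and `USER` stand for your own project and user names.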
Home
Home directories are intended to store files that are used by or during interactive sessions: "dot files", configuration files, scripts, and applications needed to customize the working environment.
Application files, data sets and other working data should be stored in the barn directory.
The home directory has a 10 GB quota.
NOTE: The home directory quota is per project. It applies to the sum of the home directory usage of all users in the project.
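Because the quota applies to the whole project, it can help to total the usage under the project's home directory. The following is a minimal sketch, not a CCI tool: `tree_size` and `usage_fraction` are hypothetical helpers, the quota is passed in by the caller, and the apparent file sizes summed here may differ from what the file system's own quota accounting reports.

```python
import os
import stat


def tree_size(root):
    """Sum the apparent sizes of all regular files below root (symlinks are not followed)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue  # file vanished or is not readable
            if stat.S_ISREG(st.st_mode):
                total += st.st_size
    return total


def usage_fraction(root, quota_bytes):
    """Return usage under root as a fraction of quota_bytes."""
    return tree_size(root) / quota_bytes
```

Calling `usage_fraction("/gpfs/u/home/PROJ", quota_bytes)` with your project name and quota gives a rough idea of how close the project is to its limit. Directories you cannot read are silently skipped, so the result is a lower bound.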
Scratch
This space is meant as a temporary staging area for performing computation.
Performance in this directory will be better than in the home or barn areas.
Each home directory contains a personal scratch directory and a scratch-shared directory, the project's shared scratch space.
NOTE: The scratch and scratch-shared directories do not have a quota; however, they may periodically be purged of files that have not been accessed in more than 56 days. If this is not sufficient to maintain enough working space, they may be purged of all files (with advance warning).
NOTE: Because scratch space is not replicated, it is vulnerable to data loss and/or corruption. We may remove files before their normal expiration date if data corruption is detected.
If longer-term storage of data is necessary, it should be stored in the barn area.
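To see which scratch files are at risk under the 56-day rule described above, you can list files by last access time. This is a minimal sketch under the assumption that the purge is driven by each file's access time (`st_atime`); `stale_files` is a hypothetical helper, and the actual purge process may use different criteria.

```python
import os
import stat
import time

PURGE_AGE_DAYS = 56  # access-age threshold stated above
SECONDS_PER_DAY = 86400


def stale_files(root, max_age_days=PURGE_AGE_DAYS, now=None):
    """Yield (path, age_in_days) for regular files not accessed within max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue  # file vanished or is not readable
            if stat.S_ISREG(st.st_mode) and st.st_atime < cutoff:
                yield path, (now - st.st_atime) / SECONDS_PER_DAY
```

Iterating over `stale_files("/gpfs/u/scratch/PROJ/USER")` with your own project and user names lists candidates to copy to barn before they are removed.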
Barn
The barn directory is meant to allow for storage of working applications and working data. It is not meant for long-term retention of results, but is rather the space to store the working data of the project. It is not intended to be the area where your computations run; they will perform better out of scratch.
As with scratch, each user has a personal barn directory and a barn-shared directory, which is shared among all users of the project.
A common use case is to keep your executables and data sets in barn, stage the data into scratch for processing, and then copy the results back to barn until they can be retrieved to local, long-term storage.
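The barn-to-scratch-to-barn workflow just described could be sketched as follows. This is an illustration under stated assumptions, not a CCI-provided tool: `run_staged`, the `input` and `output` subdirectory names, and the command are all hypothetical, and the command stands in for whatever computation you run.

```python
import shutil
import subprocess
from pathlib import Path
from typing import Sequence


def run_staged(barn_input: Path, scratch_work: Path,
               barn_results: Path, command: Sequence[str]) -> None:
    """Stage inputs from barn into scratch, run command there, copy outputs back to barn."""
    staged = scratch_work / "input"    # hypothetical layout inside the scratch work area
    output = scratch_work / "output"
    # 1. Stage the data set from barn into scratch.
    shutil.copytree(barn_input, staged, dirs_exist_ok=True)
    output.mkdir(parents=True, exist_ok=True)
    # 2. Run the computation with scratch as the working directory.
    subprocess.run(list(command), cwd=scratch_work, check=True)
    # 3. Copy the results back to barn for retrieval to local storage.
    shutil.copytree(output, barn_results, dirs_exist_ok=True)
```

The command is expected to read from `input/` and write to `output/` relative to its working directory. Because `check=True` raises on a non-zero exit status, results are only copied back to barn when the command succeeds.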
Barn directories start with a 25 GiB quota. Like home, barn is never automatically purged of old files. Project users must manage their own space usage in barn.
Barn directories can be expanded, within reason, upon written request from the project's sponsor. Send requests to cci-support[at]rpi.edu.
Performance
The CCI file system is built using a block size of 8 MB. Performance testing indicated this was optimal.
Applications using large-record I/O will benefit most from the large block size. Performance testing shows that applications with small-record I/O perform nearly as well, or better, with the large block size as they do on a file system with a much smaller block size.
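As an illustration of large-record I/O, the sketch below reads a file in requests sized to match the file system block. It assumes the 8 MB block size means 8 MiB (8 × 1024 × 1024 bytes); `sha256_large_reads` is a hypothetical helper, and no performance figure is implied beyond what is stated above.

```python
import hashlib

BLOCK_SIZE = 8 * 1024 * 1024  # assumption: the 8 MB block size is read as 8 MiB


def sha256_large_reads(path, record_size=BLOCK_SIZE):
    """Hash a file by reading it in large records of up to record_size bytes."""
    digest = hashlib.sha256()
    # buffering=0 disables Python's own buffer, so each read() asks the
    # operating system for up to record_size bytes in one request.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(record_size)
            if not chunk:
                break  # end of file
            digest.update(chunk)
    return digest.hexdigest()
```

The same loop works with a smaller `record_size`, which makes it easy to compare large-record and small-record reads for your own application.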
Hosts are connected to the file system hardware using 100 Gb/s EDR InfiniBand. See the CCI networks documentation for more information.