Improving IO performance on /scratch and /projects
Peregrine uses a parallel file system called "Lustre" for the /scratch and /projects directories. Lustre presents data from multiple storage servers (object storage servers, or OSSs) and their disks (object storage targets, or OSTs) as a single 'disk'. By default, the Lustre file systems on Peregrine place each file on a single OST. For larger files, it can be advantageous to have Lustre stripe the file across multiple OSTs. The "lfs" command can be used at the file or directory level to tell Lustre to stripe across more than one OST. For example:
lfs setstripe --count 10 <directory>
This tells Lustre to stripe any new file created in <directory> across 10 OSTs. Files that already exist in the directory keep their current layout.
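To confirm that the setting took effect, "lfs getstripe" reports the layout of a directory or file. The following is a minimal sketch rather than a site-provided tool: the function name set_and_show_stripe and the DRY_RUN variable are made up for illustration. With DRY_RUN=1 the commands are only printed, so the sketch can be tried away from a Lustre file system.

```shell
# Hypothetical helper (not part of Lustre or Peregrine):
# set a directory's default stripe count, then display the layout.
# Usage: set_and_show_stripe <count> <directory>
set_and_show_stripe() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Dry run: print the commands instead of running them.
        echo "lfs setstripe --count $1 $2"
        echo "lfs getstripe $2"
        return 0
    fi
    # Only show the layout if setting it succeeded.
    lfs setstripe --count "$1" "$2" && lfs getstripe "$2"
}
```

Calling set_and_show_stripe 10 <directory> on Peregrine would run both commands in order; the stripe count shown by "lfs getstripe" should then match the value that was set.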
Files smaller than 1 gigabyte are probably best left with a stripe count of one. Files larger than 1 gigabyte could benefit from striping. As a starting point, try setting the stripe count to the anticipated file size in gigabytes; the example above suits a 10 gigabyte file. The stripe count cannot exceed the number of OSTs in the file system: /projects has 54 OSTs and /scratch has 108.
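The rule of thumb above can be written as a small shell function. This is only a sketch: the name suggest_stripe_count is hypothetical, the size is assumed to be given in whole gigabytes, and the OST counts passed in are the ones quoted above.

```shell
# Hypothetical helper: suggest a stripe count from an expected file size.
# Usage: suggest_stripe_count <size_in_whole_GB> <number_of_OSTs>
suggest_stripe_count() {
    size_gb=$1
    max_osts=$2
    # Starting point: one stripe per gigabyte.
    count=$size_gb
    # Sizes under 1 GB still get a stripe count of one.
    if [ "$count" -lt 1 ]; then count=1; fi
    # Never ask for more OSTs than the file system has.
    if [ "$count" -gt "$max_osts" ]; then count=$max_osts; fi
    echo "$count"
}

suggest_stripe_count 10 54     # 10 GB file on /projects, prints 10
suggest_stripe_count 200 108   # 200 GB file on /scratch, prints 108
```

The result could then be passed to the command shown earlier, for example lfs setstripe --count "$(suggest_stripe_count 10 54)" <directory>. Treat the number as a first guess and adjust it based on measured performance.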
A page at NERSC explains this in more detail: https://www.nersc.gov/users/storage-and-file-systems/optimizing-io-performance-for-lustre/
A particular item to note from the NERSC page is:
File striping will primarily improve performance for codes doing serial IO from a single node or parallel IO from multiple nodes writing to a single shared file as with MPI-IO, parallel HDF5 or parallel NetCDF.