Working with data
This page explains the disk and file systems available on Hummel-2 and their usage, as well as general topics related to data handling:
- Hummel-2 storage concept
- Data security and safety
- I/O (input/output) performance basics
- Which file system to use when?
- Temporary/scratch files
- Disk quotas
- File transfer
- Sharing files
Hummel-2 storage concept
On Linux PCs, the /home directory is usually the only
place where users keep their data. On an HPC cluster this would be
uneconomical or even technically infeasible because of the total size of
the data. One technical limitation is the amount of data that can be
backed up on a daily basis. Suitably sophisticated disk systems exist,
but they are expensive because they combine all desirable
properties
- data safety
- high performance
- large capacity
in a single system. Hummel-2 is equipped with disk systems that each have basically only one of these three properties.
Data security and safety
Data security
On Hummel-2 there are no security measures beyond standard Unix-like/POSIX file access permissions.
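Because file permissions are the only protection, it can be worth checking them for directories that hold sensitive data. The following is a minimal sketch using the standard chmod and GNU stat commands; the function names make_private and show_mode are illustrative and not part of any Hummel-2 tooling.

```shell
# Remove all group and other permissions from a directory tree,
# leaving the owner's permissions unchanged.
make_private() {
    chmod -R go-rwx -- "$1"
}

# Print the octal permission mode and the name of a file or directory
# (uses the GNU stat option -c).
show_mode() {
    stat -c '%a %n' -- "$1"
}

# Example (hypothetical path):
#   make_private "$HOME/project"
#   show_mode "$HOME/project"
```

For a directory that previously had mode 755, show_mode should report mode 700 afterwards. Setting `umask 077` in a shell has a similar effect on newly created files.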
Data safety
Data can get lost as a consequence of hardware failures or human mistakes. Data backup tries to protect against both. However, data can still get lost in the time between data creation and the next backup run. RAID technology is typically employed to protect against disk failures. However, if too many disks in a RAID group fail at the same time, data can get lost, too.
It is important to know the data safety properties of the available file systems:
- Backups are only made of files in /home. The backup frequency is nightly.
- /home and /usw have redundant disks at the RAID-1 level.
- /beegfs has redundant disks in a declustered RAID (dRAID).
I/O (input/output) performance basics
Software characteristics
- Sequential access (accessing data in a contiguous manner, also called streaming I/O) is characterized by the measured bandwidth, i.e. the amount of data read or written per second. For achieving high bandwidth it is advantageous to read/write data in large chunks (e.g. 1 MB). 
- Random access is characterized by I/O operations per second (IOPS) for small data chunks (e.g. 4096 bytes). 
- File operations are characterized by the number of operations per second, e.g. the number of file creations or deletions per second.
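The effect of the chunk size on sequential writes can be explored with dd. The sketch below is an illustration only, not a benchmark procedure of the HPC team: the function name write_chunks is hypothetical, and the target must be a location you are allowed to write to. It varies only the chunk size of a sequential write and does not measure random-access IOPS.

```shell
# write_chunks FILE BLOCK_SIZE COUNT
# Write COUNT chunks of BLOCK_SIZE bytes of zeros to FILE and print
# the summary line of dd (bytes written, elapsed time, rate).
write_chunks() {
    # conv=fsync makes dd flush the data to the storage device before
    # it reports, so the rate is not just the speed of the page cache.
    dd if=/dev/zero of="$1" bs="$2" count="$3" conv=fsync 2>&1 | tail -n 1
}

# Example (hypothetical directory $WORKDIR); both calls write 64 MiB:
#   write_chunks "$WORKDIR/dd.large" 1M 64       # 64 chunks of 1 MiB
#   write_chunks "$WORKDIR/dd.small" 4k 16384    # 16384 chunks of 4 KiB
#   rm -f "$WORKDIR/dd.large" "$WORKDIR/dd.small"
```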
Hardware characteristics
- Spinning hard disk drives (HDDs) are considerably cheaper than SSDs. Systems of HDDs are powerful enough for streaming I/O, which is the I/O pattern adopted in classic computer simulation software. However, their IOPS are fairly limited and they are not well suited for accessing very many small files.
- Individual solid-state drives (SSD/NVMe) are much more powerful than individual HDDs. They are ideal for processing data in small chunks and for working with small files. When attached via NVMe, very high IOPS can be obtained.
I/O does not scale
Scalability is an important property of a system, such as an HPC cluster, that is built from many components. Scalability means that N times more hardware delivers roughly N times more performance. While this is the case for some components, in particular for compute nodes, the situation is not so simple for I/O hardware. First, the performance of an individual disk does not scale with its capacity: typically a large disk has approximately the same performance as a smaller one. Second, only the bandwidth scales with the number of disks employed, while IOPS do not scale well: for example, two disks working together in a single file system will approximately double the bandwidth, but IOPS will remain roughly constant (in our experience).
Which file system to use when?
/home ($HOME) and /usw ($USW)
/home and /usw are actually two partitions
on the same RAID-1 pair of SSDs. They are the smallest and slowest file
systems of Hummel-2. Data should not be stored here but rather on one of
the other file systems.
- The /home file system is backed up every night. /home should be used for files like shell scripts or program code that you write yourself. The idea is to provide a reasonable level of safety for the results of your personal working time (in contrast to time the computer is working for you).
- /usw (“User-installed SoftWare”) is not backed up. It is a separate file system in order to avoid backing up large software packages that can easily be re-installed. Examples of such packages are all variants of conda. Please install software packages in /usw rather than in /home.
Note: /home and /usw are mounted read-only on the compute nodes,
i.e. they are not writable in batch jobs.
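A job script can check early on that its output directory is writable instead of failing after a long computation. The following is a minimal sketch; the function name require_writable is hypothetical, and the example path is an assumption about how a job might organize its output.

```shell
# require_writable DIR
# Succeed only if DIR exists and the current user may write to it.
# The -w test also fails for directories on read-only mounts.
require_writable() {
    if [ -d "$1" ] && [ -w "$1" ]; then
        return 0
    fi
    echo "error: $1 is not a writable directory" >&2
    return 1
}

# Example for the top of a batch job script (hypothetical path):
#   require_writable "$BEEGFS/myjob/output" || exit 1
```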
/beegfs ($BEEGFS)
/beegfs is a classic parallel file system that is based
on spinning disks. On the predecessor system a BeeGFS file system was
the only large file system which was used for all kinds of data
processing. On Hummel-2 /beegfs is the largest file system.
It is well-suited for the classic computer simulation workload. However,
it should not be used for storing and processing many small files. Small
files should be packed (e.g. with tar)
or kept on $SSD (see below). Also, data to be processed
with mostly random access should be put on $SSD
(see below). /beegfs can also be used for backing up files
stored on $SSD. Every user can use /beegfs.
The /beegfs directory name is kept in the environment
variable $BEEGFS.
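Packing a directory of small files into a single archive could look like the sketch below. It uses only standard tar options; the function name pack_dir and the example paths are illustrative assumptions.

```shell
# pack_dir SRC_DIR ARCHIVE
# Store the directory SRC_DIR in the tar file ARCHIVE.
pack_dir() {
    # -C changes into the parent directory first, so the archive
    # contains relative paths such as "results/..." rather than
    # absolute ones.
    tar -cf "$2" -C "$(dirname -- "$1")" "$(basename -- "$1")"
}

# Example (hypothetical paths):
#   pack_dir "$SSD/results" "$BEEGFS/results.tar"
#   tar -tf "$BEEGFS/results.tar"                 # list the contents
#   tar -xf "$BEEGFS/results.tar" -C "$SSD"       # unpack again
```

The archive counts as a single large file on /beegfs, whatever the number of files it contains.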
Note: Disk quotas enforce an average file size of 1 MB.
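To see how a directory tree compares with this figure, one can count its files and compute their average size. The sketch below assumes GNU find (for -printf) and awk; the function name avg_file_size is illustrative.

```shell
# avg_file_size DIR
# Print the number of regular files below DIR and their average size.
avg_file_size() {
    # find prints one size in bytes per regular file; awk sums them up.
    find "$1" -type f -printf '%s\n' |
        awk '{ total += $1; n++ }
             END {
                 if (n) printf "%d files, %.0f bytes average\n", n, total / n
                 else   print "0 files"
             }'
}

# Example (hypothetical path):
#   avg_file_size "$BEEGFS/myproject"
```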
/nfs/ssdX.Y ($SSD) and /nvmeof/ssdX.Y
ssdX.Y stands for an SSD from the SSD/NVMe pool. SSDs from the pool
can be exported either as an NFS file system or as block devices via
NVMe-oF. The export mechanism is reflected in the first part of the
file system name: /nfs or /nvmeof.
These file systems are well-suited for data-intensive computing tasks and for keeping many small files. Because there is no protection against disk failures, care should be taken to avoid data loss: it is assumed that
- input data is always a copy of data that is stored elsewhere,
- output data is copied elsewhere in a timely manner.
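This copy-in, copy-out pattern can be sketched with cp as shown below. The function name stage and the directory layout are assumptions for illustration; the sketch relies on the GNU cp option -a.

```shell
# stage SRC_DIR DST_DIR
# Copy the contents of SRC_DIR into DST_DIR, creating DST_DIR if needed.
stage() {
    mkdir -p -- "$2" || return 1
    # -a preserves permissions and time stamps; "SRC_DIR/." copies the
    # contents of the directory, including hidden files.
    cp -a -- "$1/." "$2/"
}

# Example (hypothetical paths):
#   stage "$BEEGFS/myjob/input" "$SSD/myjob/input"     # before the run
#   ... run the computation on "$SSD/myjob" ...
#   stage "$SSD/myjob/output" "$BEEGFS/myjob/output"   # right after it
```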
The differences between /nfs and /nvmeof are:
- /nfs/ssdX.Y will be available on all nodes.
- /nvmeof/ssdX.Y can only be available on one node. It will only be provided for very high performance demands.
Initially, every user has a 100 GB quota in $SSD.
This space can be used for working with small files or as scratch space.
Please contact the HPC team if you need more SSD space.
Note: There is no protection against disk failures, i.e. if a disk fails all data stored on it is lost.
Temporary/scratch files
See also: Temporary directories ($TMPDIR).
In our understanding, the term scratch implies automatic deletion. On Hummel-2, temporary directories for batch jobs are deleted at the end of the job.
/tmp and /dev/shm
On each Unix-like system the /tmp directory must exist
for storing temporary files. Under Linux many applications use the RAM disk
/dev/shm.
In batch jobs running on Hummel-2 /tmp and
/dev/shm are virtual file systems that are both kept in
memory. Each batch job has its private /tmp and
/dev/shm which are automatically removed at the end of the
job.
Note: Usage of /tmp or
/dev/shm counts as memory usage of the batch job, i.e. the
job can run out of memory if too much data is written there.
For login sessions on the login and front-end nodes the
procedure is similar: /tmp and /dev/shm are
also kept in memory, but the same space is shared by all
sessions of a user on the same node. The size of the temporary
space is 50 MB. The command
$ df /tmp
displays the amount of space that is available to the user (not all users). The temporary space is deleted when the node is rebooted.
Note: On the login and front-end nodes space for
/tmp or /dev/shm is small. There is no
automatic clean-up.
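Since there is no automatic clean-up on these nodes, scripts run there may want to remove their own temporary files. The following is a minimal sketch using mktemp; the helper name with_tmpdir is hypothetical, and it falls back to /tmp when $TMPDIR is not set.

```shell
# with_tmpdir COMMAND [ARGS...]
# Create a private temporary directory, run COMMAND with the directory
# name appended as last argument, then delete the directory again.
with_tmpdir() {
    work=$(mktemp -d "${TMPDIR:-/tmp}/work.XXXXXX") || return 1
    "$@" "$work"
    status=$?
    rm -rf -- "$work"
    # Pass on the exit status of COMMAND.
    return "$status"
}

# Example: list_tmp is a stand-in for a real processing step.
#   list_tmp() { ls -la "$1"; }
#   with_tmpdir list_tmp
```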
Disk quotas
Disk quotas are a means of limiting file system usage:
- block quota limits disk space
- inode quota limits the number of files
On Hummel-2 disk quotas are enabled on all disk systems. Users can
check their disk usage with the RRZ tool rrz-quota.
File transfer
On your local computer files can be copied to and from Hummel-2 for example with:
The server that needs to be specified in these commands is one of the login gateway nodes.
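As an illustration, transfers with the standard OpenSSH scp and rsync tools could look like the sketch below. The user name and host name are placeholders (assumptions), not real Hummel-2 values; substitute your own account and one of the login gateway nodes. The function names are hypothetical.

```shell
# Placeholders (assumptions): set these to your account name and to
# the name of a login gateway node.
HUMMEL_USER="${HUMMEL_USER:-username}"
HUMMEL_HOST="${HUMMEL_HOST:-login-gateway.example.org}"

# upload LOCAL_PATH REMOTE_PATH
# Copy a local file or directory (recursively) to the cluster.
upload() {
    scp -r "$1" "${HUMMEL_USER}@${HUMMEL_HOST}:$2"
}

# sync_down REMOTE_DIR LOCAL_DIR
# Mirror the contents of a remote directory into a local directory.
# --partial keeps partially transferred files so that an interrupted
# transfer can be resumed by running the command again.
sync_down() {
    rsync -av --partial "${HUMMEL_USER}@${HUMMEL_HOST}:$1/" "$2/"
}

# Example, run on your local computer (hypothetical paths):
#   upload input.tar /path/on/hummel/input.tar
#   sync_down /path/on/hummel/results ./results
```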
On Hummel-2 one can use the UHHDisk, see:
If you need to transfer data to or from another HPC cluster, please contact the HPC team.
Sharing files
Files can be shared by setting appropriate access permissions. This is explained on the page Sharing files.