The solaris subcluster in Maxwell¶
The solaris subcluster consists of a collection of rather old hardware. The nodes are, however, well suited to less compute-intensive tasks requiring only a few cores, or a GPU with comparatively low GPU memory such as the NVIDIA P100.
To make better use of the resources, we added a couple of services:
- JupyterHub: a regular JupyterHub like max-jhub, except that you have to specify which resources to use.
- REST API: see the documentation for more details.
- Portal: a number of services to view the solaris subcluster and job utilization.
Running non-demanding batch jobs¶
A separate slurm instance has been created to support single-core or few-core jobs. The slurm commands are almost identical to those described for standard full-node jobs, except that you need to specify the slurm instance:
max-wgse002:~$ sinfo -M solaris # or sinfo --cluster=solaris
CLUSTER: solaris
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
solcpu* up 7-00:00:00 12 idle max-wn[020-031]
solarm up 7-00:00:00 2 idle max-arm[002-003]
solgpu up 7-00:00:00 15 idle max-cmsg[001-008,010],max-wng[004-009]
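The same flag is needed when running or submitting jobs. As a minimal sketch, this runs a single-core command on the (default) solcpu partition:
max-wgse002:~$ srun -M solaris -n 1 hostname   # one core on the default solcpu partition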
The slurm instance, named solaris, contains three partitions of older nodes:
max-wgse002:~$ sinfo --cluster=solaris -o '%n %R %f'
CLUSTER: solaris
HOSTNAMES PARTITION AVAIL_FEATURES
# CPU nodes
max-wn020 solcpu INTEL,V4,E5-2640,256G
max-wn021 solcpu INTEL,V4,E5-2640,256G
max-wn022 solcpu INTEL,V4,E5-2640,256G
max-wn023 solcpu INTEL,V4,E5-2640,256G
max-wn024 solcpu INTEL,V4,E5-2640,256G
max-wn025 solcpu INTEL,V4,E5-2640,256G
max-wn026 solcpu INTEL,V4,E5-2640,256G
max-wn027 solcpu INTEL,V4,E5-2640,256G
max-wn028 solcpu INTEL,V4,E5-2640,256G
max-wn029 solcpu INTEL,V4,E5-2640,256G
max-wn030 solcpu INTEL,V4,E5-2640,512G
max-wn031 solcpu INTEL,V4,E5-2640,512G
# ARM nodes
max-arm002 solarm ARM,ARMv8,Ampere,Altra,256G
max-arm003 solarm ARM,ARMv8,Ampere,Altra,256G
# GPU nodes
max-cmsg001 solgpu INTEL,V4,E5-2640,256G,GPU,GPUx1,P100
max-cmsg002 solgpu INTEL,V4,E5-2640,256G,GPU,GPUx1,P100
max-cmsg003 solgpu INTEL,V4,E5-2640,256G,GPU,GPUx1,P100
max-cmsg004 solgpu INTEL,V4,E5-2640,256G,GPU,GPUx1,P100
max-cmsg005 solgpu INTEL,V4,E5-2640,256G,GPU,GPUx1,P100
max-cmsg006 solgpu INTEL,V4,E5-2640,256G,GPU,GPUx1,P100
max-cmsg007 solgpu INTEL,V4,E5-2640,256G,GPU,GPUx1,P100
max-cmsg008 solgpu INTEL,V4,E5-2640,256G,GPU,GPUx1,P100
max-cmsg010 solgpu INTEL,V4,E5-2640,256G,GPU,GPUx1,P100
max-wng004 solgpu INTEL,V4,E5-2640,256G,GPU,GPUx1,P100
max-wng005 solgpu INTEL,V4,E5-2640,256G,GPU,GPUx1,P100
max-wng006 solgpu INTEL,V4,E5-2640,256G,GPU,GPUx1,P100
max-wng007 solgpu INTEL,V4,E5-2640,256G,GPU,GPUx1,P100
max-wng008 solgpu INTEL,V4,E5-2640,512G,GPU,GPUx2,P100
max-wng009 solgpu INTEL,V4,E5-2640,512G,GPU,GPUx2,P100
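The features listed above (AVAIL_FEATURES) should work as ordinary slurm constraints, so they can be used to pin a job to a particular node type. A short sketch, where job.sh is a placeholder for your batch script:
# request one of the 512G CPU nodes
sbatch -M solaris -p solcpu --constraint=512G job.sh
# request one of the dual-P100 GPU nodes
sbatch -M solaris -p solgpu --constraint=GPUx2 --gres=gpu:1 job.sh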
Job configuration¶
The solaris instance supports allocating a specific number of cores and a specific amount of memory. This means you have to set sensible limits: otherwise the node will either be poorly utilized, or your jobs will be terminated once they exceed the limits.
The default memory allocated to a job is 4GB.
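If your job needs more, request memory explicitly in the batch script; a brief sketch (the values are just placeholders):
#SBATCH --mem=16G            # total memory for the job, or
#SBATCH --mem-per-cpu=4G     # memory per allocated core (use one of the two)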
Example 1¶
Allocate 4 cores:
#!/bin/bash
#SBATCH --cluster=solaris
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --time=0-00:10:00
unset LD_PRELOAD
np=$(nproc)
echo "Cores available: $np"
srun -n $np hostname
# Output:
Cores available: 4
max-wn008.desy.de
max-wn008.desy.de
max-wn008.desy.de
max-wn008.desy.de
Example 2¶
Allocate 4 cores and try to use 6 cores:
#!/bin/bash
#SBATCH --cluster=solaris
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --mem=4G
#SBATCH --time=0-00:10:00
unset LD_PRELOAD
np=$(nproc)
echo "Cores available: $np"
srun -n 6 hostname
# Output:
Cores available: 4
srun: error: Unable to create step for job 51: More processors requested than permitted
Example 3¶
Allocate 4GB of memory and try to use 5GB:
#!/bin/bash
#SBATCH --cluster=solaris
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem=4G
#SBATCH --time=0-00:10:00
unset LD_PRELOAD
np=$(nproc)
echo "Cores available: $np"
# try to allocate 5G of memory:
timeout 10 cat /dev/zero | head -c 5G | tail
# Output:
/var/spool/slurmd/job00050/slurm_script: line 17: 24886 Broken pipe timeout 10 cat /dev/zero
24887 | head -c 5G
24888 Killed | tail
slurmstepd: error: Detected 1 oom-kill event(s) in StepId=50.batch. Some of your processes may have been killed by the cgroup out-of-memory handler.
# Note: the job state will in this case be OUT_OF_MEMORY
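To confirm what happened, the accounting records can be queried with standard sacct fields; a sketch using the job id from the output above:
sacct -M solaris -j 50 -o JobID,JobName,State,ReqMem,MaxRSS,Elapsed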
Job information¶
The squeue, sinfo, sacct, ... commands all work as usual; you just need to add --cluster=solaris. So to see your jobs:
# squeue
squeue -u $USER -M solaris # or
squeue --user=$USER --cluster=solaris
# sacct
sacct -M solaris # or
sacct --cluster=solaris # or
sacct -L # for both slurm instances (maxwell,solaris)
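The same applies to other slurm commands, for instance cancelling a job (<jobid> is a placeholder):
# scancel
scancel -M solaris <jobid>         # or
scancel --cluster=solaris <jobid>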
Running graphical applications¶
The display nodes in Maxwell impose some limitations on cores and memory, simply because resources are shared among many users. For some applications, fiji/ImageJ for example, these limitations cause a lot of problems. The GPU nodes (P100 only!) in the solaris subcluster support hardware-accelerated graphical applications with VirtualGL.
1. Allocate a GPU node in solaris
salloc -p solgpu -M solaris --gres gpu:1 --ntasks 20 --mem 200G # it allocates a full node to avoid interference with other applications/users
2. Connect to the GPU node
On max-display, or from your desktop, use vglconnect to log in to the allocated node:
vglconnect max-wng004 # or whatever node you get
3. Run your application
module load maxwell virtualgl
module load fiji # for example
vglrun fiji
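When you are done, leave the vglconnect session and release the allocation again; a sketch (<jobid> is a placeholder):
exit                         # leave the vglconnect session on the GPU node
exit                         # leave the salloc shell; this releases the node
# or cancel the allocation explicitly:
scancel -M solaris <jobid>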