gromacs¶
Summary
Source: http://www.gromacs.org/
License: LGPL
Path: /usr/bin (or, for the MPI builds, under the MPI-specific path), /software/gromacs
Documentation: http://www.gromacs.org/Documentation
Citations: http://www.gromacs.org/Gromacs_papers
GROMACS is a versatile package to perform molecular dynamics, i.e. simulate the Newtonian equations of motion for systems with hundreds to millions of particles. It is primarily designed for biochemical molecules like proteins, lipids and nucleic acids that have a lot of complicated bonded interactions, but since GROMACS is extremely fast at calculating the nonbonded interactions (that usually dominate simulations) many groups are also using it for research on non-biological systems, e.g. polymers.
Using gromacs¶
gromacs is installed in different variants. For the custom installations on Maxwell, see below.
The generally available version of gromacs is installed in the system path and doesn't require any special setup:
# get some information about the setup
[max]$ gmx mdrun --version
GROMACS: gmx mdrun, VERSION 2018.8
[...]
gromacs comes with support for the openmpi and mpich MPI implementations. To use these, initialize the environment with the module command:
[max]% module avail
[max]% module load mpi/openmpi-x86_64 # correspondingly for mpich.
[max]% which mdrun_openmpi
/usr/lib64/openmpi/bin/mdrun_openmpi
See below for an example using openmpi.
Running gromacs batch-jobs¶
A single-node job without MPI¶
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --partition=maxcpu
#SBATCH --time=0-01:00:00
unset LD_PRELOAD
# Sample: /beegfs/desy/group/it/Benchmarks/gromacs/sample-gromacs-single-node-20k.sh
base=/beegfs/desy/user/$USER/GROMACS
project=20k
rm -rf $base/$project
mkdir -p $base/$project
pushd $base/$project
# fetch the benchmark sample:
cp /beegfs/desy/group/it/Benchmarks/gromacs/HECBioSim/${project}-atoms/benchmark.tpr .
# gromacs doesn't benefit from hyperthreaded cores, so just use the physical cores:
nt=$(( $(nproc) / 2 ))
INPUT=benchmark.tpr
OUTPUT=benchmark.log
STARTTIME=$(date +%s)
gmx mdrun -nt $nt -s ${INPUT} -g ${OUTPUT}
ENDTIME=$(date +%s)
ELAPSED=$(($ENDTIME - $STARTTIME))
x=$(grep Performance $OUTPUT)
c=$(grep "model name" /proc/cpuinfo | head -1 | cut -d: -f2)
cat <<EOF
Number of threads: $nt
Time elapsed: $ELAPSED seconds
NodeList: $SLURM_JOB_NODELIST
Processor: $c
(ns/day) (hour/ns)
$x
EOF
popd
exit
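The script is submitted like any other Slurm batch job. A minimal sketch, assuming it was saved as gromacs-single-node.sh (hypothetical filename; the paths match the benchmark script above):
[max]$ sbatch gromacs-single-node.sh
[max]$ squeue -u $USER                  # wait for the job to finish
[max]$ cat slurm-<jobid>.out            # summary printed by the script
[max]$ grep Performance /beegfs/desy/user/$USER/GROMACS/20k/benchmark.log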
A multi-node job with MPI¶
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --partition=maxcpu
#SBATCH --time=0-01:00:00
# sample: /beegfs/desy/group/it/Benchmarks/gromacs/sample-gromacs-mpi-20k.sh
source /etc/profile.d/modules.sh
module load mpi/openmpi-x86_64
base=/beegfs/desy/user/$USER/GROMACS
project=20k
rm -rf $base/$project
mkdir -p $base/$project
pushd $base/$project
# fetch the benchmark sample:
cp /beegfs/desy/group/it/Benchmarks/gromacs/HECBioSim/${project}-atoms/benchmark.tpr .
# gromacs doesn't benefit from hyperthreaded cores, so just use the physical cores.
# gromacs recommends between 1 and 6 OpenMP threads per MPI rank; as an example we use 2 threads per rank,
# which means mpi-ranks = "total number of logical cores" / (2*2): one factor of 2 drops the hyperthreads,
# the other accounts for the 2 threads per rank (see the worked example after the script).
total_cores=$(scontrol show job $SLURM_JOB_ID | grep NumCPUs | cut -d= -f3 | awk '{print $1}')
np=$(( $total_cores / 4 ))
nt=2
INPUT=benchmark.tpr
OUTPUT=benchmark.log
STARTTIME=$(date +%s)
# the actual gromacs run:
# distribute the $np MPI ranks round-robin across the nodes, with $nt OpenMP threads per rank:
mpirun --map-by node -np $np `which mdrun_openmpi` -ntomp $nt -s ${INPUT} -g ${OUTPUT}
ENDTIME=$(date +%s)
ELAPSED=$(($ENDTIME - $STARTTIME))
x=$(grep Performance $OUTPUT)
cat <<EOF
Time elapsed: $ELAPSED seconds
NodeList: $SLURM_JOB_NODELIST
Number of nodes: $SLURM_JOB_NUM_NODES
Number of mpi ranks: $np
Number of threads/rank: $nt
(ns/day) (hour/ns)
$x
EOF
popd
exit
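To make the rank arithmetic concrete, here is a worked example with purely illustrative numbers (the actual counts depend on the node type the job lands on): assuming two nodes with 40 physical cores each and hyperthreading enabled, Slurm reports 160 logical CPUs in total:
# illustrative numbers only - the real values come from scontrol at run time
total_cores=160            # 2 nodes x 80 logical CPUs (40 physical cores + hyperthreading)
np=$(( total_cores / 4 ))  # 160 / 2 (hyperthreading) / 2 (threads per rank) = 40 MPI ranks
nt=2                       # 40 ranks x 2 threads = 80 physical cores in use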
Note on openmpi configuration¶
You might need to set a couple of OpenMPI flags for jobs to run properly (due to the age and heterogeneity of the hardware):
# MCA parameters
export OMPI_MCA_btl=^openib,uct,ofi
export OMPI_MCA_mtl=^ofi
export OMPI_MCA_pml=ucx
export OMPI_MCA_opal_warn_on_missing_libcuda=0
The gromacs modules on Maxwell set these parameters. This is not the case for the gromacs version initialized with
module load mpi/openmpi-x86_64
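If you use the plain openmpi module rather than one of the gromacs modules, a minimal sketch is to export the same parameters yourself in the job script, right after loading the module:
source /etc/profile.d/modules.sh
module load mpi/openmpi-x86_64
# same MCA parameters as listed above
export OMPI_MCA_btl=^openib,uct,ofi
export OMPI_MCA_mtl=^ofi
export OMPI_MCA_pml=ucx
export OMPI_MCA_opal_warn_on_missing_libcuda=0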
Custom gromacs on Maxwell¶
In addition to the standard gromacs installation, there are customized versions available on Maxwell. To check which versions are available:
module load maxwell
module avail gromacs
The setup includes modules for vmd.
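To use one of the listed versions, load the maxwell module first and then the gromacs module of your choice; module list shows what ended up in the environment. A short sketch, assuming gromacs/2018 is among the available versions (as in the job script below):
[max]$ module load maxwell
[max]$ module avail gromacs
[max]$ module load gromacs/2018   # pick one of the listed versions
[max]$ module list                # verify that maxwell and gromacs are loaded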
Running gromacs¶
See /software/gromacs/00BUILD* for the setup used to compile gromacs.
A job script should look like this:
#!/bin/bash
#SBATCH --partition=maxcpu
#SBATCH --time=0-16:00:00
#SBATCH --job-name=gromacs
unset LD_PRELOAD
source /etc/profile.d/modules.sh
# the required setup. Note: the modules have to be loaded in exactly this order!
module load maxwell gromacs/2018
# ------------------------------------------------------------------------------------
# everything else is identical to the standard gromacs setup:
base=/beegfs/desy/user/$USER/GROMACS
project=20k
rm -rf $base/$project
mkdir -p $base/$project
pushd $base/$project
# fetch the benchmark sample:
cp /beegfs/desy/group/it/Benchmarks/gromacs/HECBioSim/${project}-atoms/benchmark.tpr .
# gromacs doesn't benefit from hyperthreaded cores, so just use the physical cores.
# gromacs recommends between 1 and 6 OpenMP threads per MPI rank; as an example we use 2 threads per rank,
# which means mpi-ranks = "total number of logical cores" / (2*2): one factor of 2 drops the hyperthreads, the other accounts for the 2 threads per rank.
total_cores=$(scontrol show job $SLURM_JOB_ID | grep NumCPUs | cut -d= -f3 | awk '{print $1}')
np=$(( $total_cores / 4 ))
nt=2
INPUT=benchmark.tpr
OUTPUT=benchmark.log
STARTTIME=$(date +%s)
# the actual gromacs run:
# distribute the $np MPI ranks round-robin across the nodes, with $nt OpenMP threads per rank:
mpirun --map-by node -np $np `which mdrun_openmpi` -ntomp $nt -s ${INPUT} -g ${OUTPUT}
ENDTIME=$(date +%s)
ELAPSED=$(($ENDTIME - $STARTTIME))
x=$(grep Performance $OUTPUT)
cat <<EOF
Time elapsed: $ELAPSED seconds
NodeList: $SLURM_JOB_NODELIST
Number of nodes: $SLURM_JOB_NUM_NODES
Number of mpi ranks: $np
Number of threads/rank: $nt
(ns/day) (hour/ns)
$x
EOF
popd
exit
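Note that, unlike the MPI example above, this script does not request a node count itself, so specify it at submission time. A minimal sketch, assuming the script was saved as gromacs-custom.sh (hypothetical filename):
[max]$ sbatch --nodes=2 gromacs-custom.sh
[max]$ squeue -u $USER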