SLURM Reservations¶
SLURM does not have a role model that would allow delegating certain tasks to individual users; you are either a SLURM admin or you are not. Creating and managing reservations is one of the tasks that would greatly benefit from delegation, as it would, for example, allow a compute node to be reserved for a beamtime. We have therefore implemented a web-service which does exactly that (but please note that it's work in progress!).
The web-service is accessible - from within the DESY network - at https://max-slurm-rest.desy.de/reservation/. It requires DESY credentials to log in, but you won't be able to do anything unless you have been added to the list of authorized accounts. If you would like to use the web-service, please get in touch with maxwell.service@desy.de.
NOTE: the reservation endpoint had to be moved from max-portal.desy.de to max-slurm-rest.desy.de. max-portal.desy.de will redirect reservation requests to max-slurm-rest, so using max-portal should still work (if your client accepts the redirect), but it's highly recommended to update the URL accordingly.
The reservation tool has a number of nice features:
- it allows authorization to be set per partition
- it allows the consumable resources to be limited per partition; it's possible, for example, to impose a limit so that a partition can never have more than N nodes reserved at a time
- it supports constraints; it guides you through the set of available constraints and makes it impossible to create an invalid combination
- it nicely handles groups and users
- it comes with a REST API
The REST API has been used to create a couple of Python scriptlets, which allow most of the web-service's tasks to be performed directly from the command line.
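The REST routes themselves are beyond the scope of this page, but purely as an illustration, a direct call might look roughly like the following sketch; the base URL is the one given above, whereas the authorization header, parameters and JSON response layout are assumptions, not documented behaviour:
import requests

# Minimal sketch of a direct call to the REST API. The token header name
# and the response format are assumptions; only the URL is documented.
TOKEN_PATH = "/home/username/.maxwell/portal.token"  # written by portal_token

with open(TOKEN_PATH) as f:
    token = f.read().strip()

resp = requests.get(
    "https://max-slurm-rest.desy.de/reservation/",   # endpoint from above
    headers={"Authorization": f"Bearer {token}"},    # assumed auth scheme
    timeout=30,
)
resp.raise_for_status()
print(resp.json())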
SLURM RESERVATION CLI¶
The Python modules to handle SLURM reservations can be found on Maxwell under /software/tools/lib/python3/slurmres. The modules are not bound to Maxwell and should work on any machine (e.g. they would allow creating a reservation from a beamline PC).
As with the web-service, none of the modules will work without account authorization. Assuming that you are authorized to manage reservations for the partition allcpu, the CLI works as follows.
TOKEN¶
To work conveniently with the CLI you'll need a token (a portal token):
# create a portal token
@max-wgse001:~$ portal_token # /software/tools/bin/portal_token
Username: your username
Password:
Portal token generated at /home/username/.maxwell/portal.token
List reservations¶
@max-wgse001:~$ python3 /software/tools/lib/python3/slurmres/slurmreslist.py
{'accounts': [],
'burst_buffer': [],
'core_cnt': 20,
'end_time': '2025-12-19T10:24:40',
'features': '',
'flags': 'IGNORE_JOBS',
'licenses': {},
'name': '11021575',
'node_cnt': 1,
'node_list': 'max-p3a031',
'partition': 'ponline',
'start_time': '2024-12-19T10:24:40',
'tres_str': ['cpu=40'],
'users': ['bttest01']}
[...]
3 reservations found
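Since the scripts print each reservation as a plain Python dict, their output can be post-processed from another script. A minimal sketch, assuming the output format shown above (one pretty-printed dict literal per reservation, followed by a summary line):
import ast
import subprocess

# Run slurmreslist.py and filter the reservations by partition. The parsing
# relies on the output format shown above and may need adjusting.
out = subprocess.run(
    ["python3", "/software/tools/lib/python3/slurmres/slurmreslist.py"],
    capture_output=True, text=True, check=True,
).stdout

reservations, buf = [], []
for line in out.splitlines():
    if line.startswith("{"):                   # a new reservation dict begins
        buf = [line]
    elif buf:
        buf.append(line)
    if buf and line.rstrip().endswith("}"):    # the dict literal is complete
        reservations.append(ast.literal_eval("\n".join(buf)))
        buf = []

for res in reservations:
    if res["partition"] == "allcpu":
        print(res["name"], res["node_list"], res["start_time"], "->", res["end_time"])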
Create reservations¶
@max-wgse001:~$ python3 /software/tools/lib/python3/slurmres/slurmresnew.py -h
usage: slurmresnew.py -n NAME -p PARTITION -c COUNT -u user [user ...] -s
START -e END [-h] [-f feature [feature ...] | -N node
[node ...]] [-i | -P | -k] [-t TOKEN_PATH]
Create a reservation in a partition.
required arguments:
-n NAME, --name NAME the name of the reservation
-p PARTITION, --partition PARTITION
name of partition the reservation is to be created in
-c COUNT, --count COUNT
the amount of nodes
-u user [user ...], --users user [user ...]
a list of users
-s START, --start START
the start date of the reservation [Y-M-DTH:M]
-e END, --end END the end date of the reservation [Y-M-DTH:M]
optional arguments:
-h, --help show this help message and exit
-f feature [feature ...], --features feature [feature ...]
optional features
-N node [node ...], --nodes node [node ...]
optional specified nodes
-i, --ignore_jobs ignore currently running jobs
-P, --preempt_jobs kill currently running jobs if preemptable
-k, --kill_jobs kill currently running jobs
-t TOKEN_PATH, --token TOKEN_PATH
local path to token file
@max-wgse001:~$ python3 /software/tools/lib/python3/slurmres/slurmresnew.py -p allcpu -n res_test_002 -c 1 -s 2021-07-19T12:00 -e 2021-07-19T14:00 -u user1,user2
{'accounts': [],
'burst_buffer': [],
'core_cnt': 20,
'end_time': '2021-07-19T14:00:00',
'features': [],
'flags': '',
'licenses': {},
'name': 'res_test_001',
'node_cnt': 1,
'node_list': 'max-cfel023',
'partition': 'allcpu',
'start_time': '2021-07-19T12:00:00',
'tres_str': ['cpu=40'],
'users': ['user1', 'user2']}
{'accounts': [],
'burst_buffer': [],
'core_cnt': 20,
'end_time': '2021-07-19T14:00:00',
'features': [],
'flags': '',
'licenses': {},
'name': 'res_test_002',
'node_cnt': 1,
'node_list': 'max-cfel024',
'partition': 'allcpu',
'start_time': '2021-07-19T12:00:00',
'tres_str': ['cpu=40'],
'users': ['user1', 'user2']}
Reservation successfully created
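Since these are plain command-line tools, creating a reservation is easy to script, e.g. ahead of a beamtime. A minimal sketch using only the documented flags (reservation name, partition, users and shift times are made-up placeholders):
import subprocess
from datetime import datetime, timedelta

# Reserve one allcpu node for tomorrow, 08:00 to 20:00; adjust the
# placeholder name, users and times to your beamtime.
day = datetime.now() + timedelta(days=1)
start = day.replace(hour=8, minute=0).strftime("%Y-%m-%dT%H:%M")
end = day.replace(hour=20, minute=0).strftime("%Y-%m-%dT%H:%M")

subprocess.run(
    ["python3", "/software/tools/lib/python3/slurmres/slurmresnew.py",
     "-p", "allcpu", "-n", "beamtime_res", "-c", "1",
     "-s", start, "-e", end, "-u", "user1,user2"],
    check=True,
)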
Edit reservations¶
@max-wgse001:~$ python3 /software/tools/lib/python3/slurmres/slurmresedit.py -h
usage: slurmresedit.py -n NAME -p PARTITION [-h] [-c COUNT]
[-u user [user ...]] [-s [START]] [-e [END]]
[-N node [node ...]] [-i | -P | -k] [-t TOKEN_PATH]
Edit a reservation in a partition.
required arguments:
-n NAME, --name NAME the name of the reservation
-p PARTITION, --partition PARTITION
name of the reservations partition
optional arguments:
-h, --help show this help message and exit
-c COUNT, --count COUNT
the amount of nodes
-u user [user ...], --users user [user ...]
a list of users
-s [START], --start [START]
the start date of the reservation [Y-M-DTH:M]
-e [END], --end [END]
the end date of the reservation [Y-M-DTH:M]
-N node [node ...], --nodes node [node ...]
optional specified nodes
-i, --ignore_jobs ignore currently running jobs
-P, --preempt_jobs kill currently running jobs if preemptable
-k, --kill_jobs kill currently running jobs
-t TOKEN_PATH, --token TOKEN_PATH
local path to token file
# change node count and list of users:
@max-wgse001:~$ python3 /software/tools/lib/python3/slurmres/slurmresedit.py -n res_test_002 -c 2 -u user1,user2,user3 -p allcpu
[...]
{'accounts': [],
'burst_buffer': [],
'core_cnt': 40,
'end_time': '2021-07-19T14:00:00',
'features': [],
'flags': '',
'licenses': {},
'name': 'res_test_002',
'node_cnt': 2,
'node_list': 'max-cfel[024-025]',
'partition': 'allcpu',
'start_time': '2021-07-19T12:00:00',
'tres_str': ['cpu=80'],
'users': ['user1', 'user2', 'user3']}
Reservation successfully edited
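Edits can be scripted the same way; a minimal sketch extending the reservation from the example above by four hours:
import subprocess
from datetime import datetime, timedelta

# Push the end of res_test_002 (see the example above) back by 4 hours.
new_end = (datetime(2021, 7, 19, 14, 0) + timedelta(hours=4)).strftime("%Y-%m-%dT%H:%M")

subprocess.run(
    ["python3", "/software/tools/lib/python3/slurmres/slurmresedit.py",
     "-n", "res_test_002", "-p", "allcpu", "-e", new_end],
    check=True,
)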
Delete reservations¶
@max-wgse001:~$ python3 /software/tools/lib/python3/slurmres/slurmresdelete.py -h
usage: slurmresdelete.py [-h] -n NAME -p PARTITION [-t TOKEN]
Delete a reservation in a partition.
optional arguments:
-h, --help show this help message and exit
-n NAME, --name NAME name of reservation
-p PARTITION, --partition PARTITION
name of partition the reservation is in
-t TOKEN, --token TOKEN
the path to the token
@max-wgse001:~$ python3 /software/tools/lib/python3/slurmres/slurmresdelete.py -n res_test_002 -p allcpu
Reservation successfully deleted
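Deletion, too, can be part of a cleanup script; a minimal sketch, assuming the script signals failure via a non-zero exit code:
import subprocess

# Delete the test reservation and surface any failure explicitly.
result = subprocess.run(
    ["python3", "/software/tools/lib/python3/slurmres/slurmresdelete.py",
     "-n", "res_test_002", "-p", "allcpu"],
    capture_output=True, text=True,
)
if result.returncode != 0:
    raise RuntimeError(f"deletion failed: {result.stdout}{result.stderr}")
print(result.stdout.strip())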
Convenience wrapper¶
Most people will presumably make use of the Python code. For convenience there is also a wrapper which invokes the Python modules; the syntax is identical:
@max-wgse001:~$ /software/tools/sbin/slurmreservation
usage: slurmreservation token|list|create|edit|delete
@max-wgse001:~$ /software/tools/sbin/slurmreservation create -n res_test_004 -c 1 -u user1,user2 -p allcpu -s 2021-07-19T14:50 -e 2021-07-19T16:00
{'accounts': [],
'burst_buffer': [],
'core_cnt': 48,
'end_time': '2021-07-19T16:00:00',
'features': [],
'flags': '',
'licenses': {},
'name': 'res_test_004',
'node_cnt': 1,
'node_list': 'max-wn096',
'partition': 'allcpu',
'start_time': '2021-07-19T14:50:00',
'tres_str': ['cpu=96'],
'users': ['user1', 'user2']}
Reservation successfully created
@max-wgse001:~$ /software/tools/sbin/slurmreservation list
{'accounts': [],
'burst_buffer': [],
'core_cnt': 48,
'end_time': '2021-07-19T16:00:00',
'features': [],
'flags': '',
'licenses': {},
'name': 'res_test_004',
'node_cnt': 1,
'node_list': 'max-wn096',
'partition': 'allcpu',
'start_time': '2021-07-19T14:50:00',
'tres_str': ['cpu=96'],
'users': ['user1', 'user2']}
1 reservation found
@max-wgse001:~$ /software/tools/sbin/slurmreservation delete -n res_test_004 -p allcpu
Reservation successfully deleted