SLURM Rest API

Current versions of SLURM provide a REST API daemon that allows jobs to be submitted and managed through REST calls, for example via curl. For interactive users the REST API offers hardly any benefit; the SLURM commands such as sbatch and squeue are much handier. It does, however, make it possible to launch and manage batch jobs from a (web) service and, under certain circumstances, to handle batch jobs on behalf of other users.

BE AWARE: whoever knows your token has access to all your files on maxwell! We therefore disabled the ability to generate unlimited tokens using scontrol. Only the mechanism described below will work.

NEW:
The SLURM REST daemon for the maxwell cluster is running on https://max-slurm-rest.desy.de/sapi
The SLURM REST daemon for the solaris sub-cluster is running on https://max-sol-portal.desy.de/sapi

Note: the REST API had to be moved from max-portal.desy.de to max-slurm-rest.desy.de. max-portal.desy.de redirects API requests to max-slurm-rest, so using max-portal should still work (if your client follows the redirect), but updating the URL is highly recommended.

Documentation

General information about SLURM's REST API
SLURM REST API reference [https://slurm.schedmd.com/rest_api.html](https://slurm.schedmd.com/rest_api.html)
Information about JSON web tokens
slides for the SLUG 2020 talk [https://slurm.schedmd.com/SLUG20/REST_API.pdf](https://slurm.schedmd.com/SLUG20/REST_API.pdf)
slides for the SLUG 2019 talk [https://slurm.schedmd.com/SLUG19/REST_API.pdf](https://slurm.schedmd.com/SLUG19/REST_API.pdf)

JSON web token (JWT)

slurmrestd is configured to accept only JWTs for authentication. To talk to slurmrestd you first need to generate such a token and store it in the environment variable SLURM_TOKEN:

# in order to generate a slurm JWT, first generate a maxwell portal token:
portal_token
> Username:
> Password:
> Portal token generated at /home/username/.maxwell/portal.token
# the portal token never expires - if desired, use -r flag to revoke or create new portal token to overwrite old one (-h flag for help)

# generate a slurm JWT with a default lifespan of 1800 seconds:
slurm_token
> SLURM_TOKEN=long.token

# generate a token with a lifespan of 1 day (max lifespan):
slurm_token -l $((3600*24))

# generate a token specifying a username
slurm_token -l $((3600*24)) -u $USER  
# only a privileged account can specify a username - drop us a mail at maxwell.service@desy.de if you need to create JWTs for other accounts

# generate a token and export it as $SLURM_TOKEN so it can be used in curl:
export $(slurm_token -l $((3600*24)))
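
Since a token is only valid for a limited time, a service may want to check how long it still has before making a request. The following is a minimal sketch, assuming the token is a standard compact JWT that carries an "exp" claim (as the tokens produced by the generator script further down do). The helper names jwt_claims and seconds_left are hypothetical, and the signature is not verified here.

```python
import base64
import json
import time
from typing import Optional


def jwt_claims(token: str) -> dict:
    """Return the payload claims of a compact JWT WITHOUT verifying its signature."""
    token = token.strip()
    # Accept the raw "SLURM_TOKEN=<jwt>" line printed by slurm_token as well.
    if token.startswith("SLURM_TOKEN="):
        token = token[len("SLURM_TOKEN="):]
    payload_b64 = token.split(".")[1]
    # base64url segments in a JWT carry no padding; restore it before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def seconds_left(token: str, now: Optional[float] = None) -> int:
    """Seconds until the "exp" claim is reached (negative if already expired)."""
    now = time.time() if now is None else now
    return int(jwt_claims(token)["exp"] - now)
```

For example, seconds_left(os.environ["SLURM_TOKEN"]) (after import os) could be checked before each request, and a new token generated once the value gets small.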

Job submission

To submit a job, the job script has to be embedded into a JSON string. A very simple example of a job payload:

job.json:
{
  "job":{
     "partition": "maxcpu",
     "name":"testapi",
     "time_limit": {"set": true, "number": 1000},
     "current_working_directory":"/home/schluenz",
     "environment":["PATH=/bin:/usr/bin/:/usr/local/bin/","LD_LIBRARY_PATH=/lib/:/lib64/:/usr/local/lib"]
   },
  "script":"#!/bin/bash -l\nsrun hostname; sleep 300"
}

curl -L -H "Content-Type: application/json" -H X-SLURM-USER-NAME:$(whoami) -H X-SLURM-USER-TOKEN:$SLURM_TOKEN \
     -X POST https://max-slurm-rest.desy.de/sapi/slurm/v0.0.40/job/submit -d@job.json

Submission to the solaris sub-cluster works exactly the same way, just replace max-slurm-rest by max-sol-portal, and set the partition to solcpu.
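
From a service, the same request can be assembled in Python instead of curl. The sketch below uses only the standard library and mirrors the endpoint, headers and payload fields of the curl example above; the function name build_submit_request and its parameters are hypothetical. It only builds the request object. Sending it is left to the caller, so nothing is submitted by accident.

```python
import getpass
import json
import os
import urllib.request

BASE_URL = "https://max-slurm-rest.desy.de/sapi"


def build_submit_request(script, partition, name, cwd,
                         user=None, token=None, api_version="v0.0.40"):
    """Build (but do not send) a POST request for the job/submit endpoint."""
    user = user if user is not None else getpass.getuser()
    token = token if token is not None else os.environ.get("SLURM_TOKEN", "")
    payload = {
        "job": {
            "partition": partition,
            "name": name,
            # json.dumps writes Python's True as the JSON literal true
            "time_limit": {"set": True, "number": 1000},
            "current_working_directory": cwd,
            "environment": [
                "PATH=/bin:/usr/bin/:/usr/local/bin/",
                "LD_LIBRARY_PATH=/lib/:/lib64/:/usr/local/lib",
            ],
        },
        "script": script,
    }
    return urllib.request.Request(
        f"{BASE_URL}/slurm/{api_version}/job/submit",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-SLURM-USER-NAME": user,
            "X-SLURM-USER-TOKEN": token,
        },
        method="POST",
    )
```

A caller would then pass the result to urllib.request.urlopen(...) and parse the JSON response. For the solaris sub-cluster, BASE_URL and the partition would have to be changed as described above.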

For complex batch scripts, embedding the script by hand quickly becomes impractical. A simple job script can be converted into a JSON-escaped string, for example:

# sample job script
cat job.script
> #!/bin/bash
> echo "hello script"
> srun hostname
> echo $SLURM_JOB_ID
> sleep 100

# convert into string
scr=$(cat job.script | sed 's|"|\\"|g' | sed ':a;N;$!ba;s|\n|\\n|g' )
echo $scr
> #!/bin/bash\necho \"hello script\"\nsrun hostname\necho $SLURM_JOB_ID\nsleep 100

# embed into payload
payload=$(cat <<eof
{"job":{"partition": "short","tasks":1,"name":"test","nodes":1,"current_working_directory":"/home/$USER","environment":{"PATH":"/bin:/usr/bin/:/usr/local/bin/","LD_LIBRARY_PATH":"/lib/:/lib64/:/usr/local/lib"}},"script":"$scr"}
eof
)
echo $payload
> {"job":{"partition": "short","tasks":1,"name":"test","nodes":1,"current_working_directory":"/home/user","environment":{"PATH":"/bin:/usr/bin/:/usr/local/bin/","LD_LIBRARY_PATH":"/lib/:/lib64/:/usr/local/lib"}},"script":"#!/bin/bash\necho \"hello script\"\nsrun hostname\necho $SLURM_JOB_ID\nsleep 100"}

# submit job
curl -L -H "Content-Type: application/json" -H X-SLURM-USER-NAME:$(whoami) -H X-SLURM-USER-TOKEN:$SLURM_TOKEN -X POST 'https://max-slurm-rest.desy.de/sapi/slurm/v0.0.38/job/submit' -d "$payload"
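
Where Python is available, the escaping done by the sed pipeline above can be left to a JSON library, which also handles backslashes, tabs and other characters that sed does not cover. This is a minimal sketch using the sample job script and the payload fields from the example above; build_payload is a hypothetical helper name and /home/user a placeholder directory.

```python
import json

# The sample job script from above, as it would be read from job.script.
JOB_SCRIPT = """#!/bin/bash
echo "hello script"
srun hostname
echo $SLURM_JOB_ID
sleep 100
"""


def build_payload(script_text: str, job: dict) -> str:
    """Serialize job description and script; json.dumps escapes quotes and newlines."""
    return json.dumps({"job": job, "script": script_text})


payload = build_payload(JOB_SCRIPT, {
    "partition": "short",
    "tasks": 1,
    "name": "test",
    "nodes": 1,
    "current_working_directory": "/home/user",  # placeholder
    "environment": {
        "PATH": "/bin:/usr/bin/:/usr/local/bin/",
        "LD_LIBRARY_PATH": "/lib/:/lib64/:/usr/local/lib",
    },
})
print(payload)
```

The resulting string can be written to a file and passed to curl with -d@file, or sent directly from Python.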

Be aware: curl will NOT transfer your current environment to the batch job. Everything the job needs has to be defined either in the environment field of the payload or inside the batch script itself. Batch jobs also won't read ~/.bashrc unless the script uses a login shell ('#!/bin/bash -l').
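
One way to deal with this is to forward a small, explicit set of variables from the submitting process. The sketch below is only an illustration: the helper names and the choice of variables are hypothetical. It produces both shapes used in the examples on this page, the dictionary form (v0.0.38 example) and the list of NAME=value strings (v0.0.40 example).

```python
import os
from typing import Dict, Iterable, List, Mapping, Optional


def forwarded_environment(names: Iterable[str],
                          environ: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Pick only the named variables; variables that are unset are skipped."""
    environ = os.environ if environ is None else environ
    return {name: environ[name] for name in names if name in environ}


def as_env_list(env: Mapping[str, str]) -> List[str]:
    """Convert the dictionary form into a list of NAME=value strings."""
    return [f"{name}={value}" for name, value in env.items()]
```

For example, forwarded_environment(["PATH", "LD_LIBRARY_PATH"]) could be placed in the "environment" field of the job description. Forwarding the complete environment is usually not a good idea, since it may contain secrets such as SLURM_TOKEN.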

Job information

# all jobs
curl -L -s -H "Content-Type: application/json" -H X-SLURM-USER-NAME:$(whoami) -H X-SLURM-USER-TOKEN:$SLURM_TOKEN -X GET https://max-slurm-rest.desy.de/sapi/slurm/v0.0.38/jobs
# to extract information about individual jobs, use a JSON parser such as jq:
curl -L -s -H "Content-Type: application/json" -H X-SLURM-USER-NAME:$(whoami) -H X-SLURM-USER-TOKEN:$SLURM_TOKEN -X GET https://max-slurm-rest.desy.de/sapi/slurm/v0.0.38/jobs > jobs.json
cat jobs.json | jq '.jobs[]| select(.job_id == 7752003)'
cat jobs.json | jq '.jobs[]| select(.user_name == "username")'  # replace username by a real username

# specific running or pending job
curl -L -s -H "Content-Type: application/json" -H X-SLURM-USER-NAME:$(whoami) -H X-SLURM-USER-TOKEN:$SLURM_TOKEN -X GET https://max-slurm-rest.desy.de/sapi/slurm/v0.0.38/job/7750418
# a job already removed from the queue will report an error
>       "error": "_handle_job_get: unknown job 7751560",

# retrieve information about finished jobs:
curl -L -H "Content-Type: application/json" -H X-SLURM-USER-NAME:$(whoami) -H X-SLURM-USER-TOKEN:$SLURM_TOKEN -X GET https://max-slurm-rest.desy.de/sapi/slurmdb/v0.0.38/job/7740309
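
If jq is not available, or the result is processed inside a service anyway, the same selection can be done in Python. The sketch below assumes the response layout implied by the jq filters above: a top-level "jobs" array whose entries have "job_id" and "user_name" fields. select_jobs is a hypothetical helper name.

```python
import json


def select_jobs(jobs_doc: dict, job_id=None, user_name=None) -> list:
    """Return the entries of jobs_doc["jobs"] matching all given criteria."""
    selected = []
    for job in jobs_doc.get("jobs", []):
        if job_id is not None and job.get("job_id") != job_id:
            continue
        if user_name is not None and job.get("user_name") != user_name:
            continue
        selected.append(job)
    return selected


def load_jobs(path: str) -> dict:
    """Read a response previously saved to a file, e.g. jobs.json."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

For example, select_jobs(load_jobs("jobs.json"), user_name="username") corresponds to the second jq call above.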

Generating JWT for service providers

As a privileged user it's possible to create tokens for arbitrary users on maxwell. For services it might be more convenient to generate tokens on the service host, without necessarily requiring privileges. SchedMD provides a simple Python script to generate tokens (shown here with a very minor modification):

#!/usr/bin/env python3
import sys
import os
import pprint
import json
import time
from datetime import datetime, timedelta, timezone

from jwt import JWT
from jwt.jwa import HS256
from jwt.jwk import jwk_from_dict
from jwt.utils import b64decode,b64encode

if len(sys.argv) != 3:
    sys.exit("generate_jwt.py [user name] [expiration time (seconds)]")

jwt_key = os.environ.get('JWT_KEY', '/etc/slurm/jwt_hs256.key')

with open(jwt_key, "rb") as f:
    priv_key = f.read()

signing_key = jwk_from_dict({
    'kty': 'oct',
    'k': b64encode(priv_key)
})

message = {
    "exp": int(time.time() + int(sys.argv[2])),
    "iat": int(time.time()),
    "sun": sys.argv[1]
}

a = JWT()
compact_jws = a.encode(message, signing_key, alg='HS256')
print("SLURM_TOKEN={}".format(compact_jws))

Running python3 generate_jwt.py user 3600 generates a token, provided you have a copy of the "secret key". Using Keycloak tokens to submit jobs on behalf of users is a much better choice!
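
To illustrate what the script above does, here is a standard-library-only sketch of HS256 signing and verification as defined for JWTs. It uses the same claims ("exp", "iat", "sun") as the script; the function names are hypothetical, and whether slurmrestd accepts tokens produced this way has not been tested here, so treat it as an explanation rather than a replacement for the script.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    """base64url without padding, as used for JWT segments."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def slurm_claims(user: str, lifespan: int, now=None) -> dict:
    """Claims as in the script above: expiry, issue time and user name."""
    now = int(time.time()) if now is None else int(now)
    return {"exp": now + lifespan, "iat": now, "sun": user}


def sign_hs256(claims: dict, key: bytes) -> str:
    """Build header.payload.signature with an HMAC-SHA256 signature."""
    header = {"alg": "HS256", "typ": "JWT"}
    segments = [
        _b64url(json.dumps(header, separators=(",", ":")).encode("utf-8")),
        _b64url(json.dumps(claims, separators=(",", ":")).encode("utf-8")),
    ]
    signing_input = ".".join(segments).encode("ascii")
    signature = hmac.new(key, signing_input, hashlib.sha256).digest()
    return ".".join(segments + [_b64url(signature)])


def verify_hs256(token: str, key: bytes) -> bool:
    """Recompute the signature over header.payload and compare in constant time."""
    signing_input, _, signature = token.rpartition(".")
    expected = hmac.new(key, signing_input.encode("utf-8"), hashlib.sha256).digest()
    return hmac.compare_digest(_b64url(expected).encode("ascii"),
                               signature.encode("utf-8"))
```

The sketch also shows why the key file must be protected: anyone holding the key can sign a token for any user name.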

Using Keycloak tokens - JWKS

tbd