SLURM is a job scheduler that manages resources on a computer cluster.
It accepts job scripts from users and places them in a queue (which
SLURM calls a "partition"); it determines an appropriate time when a
user job can be run with the necessary memory, processors, and other
requested resources; and it returns the output of a completed job to
the user.
The user defines a job request by writing a SLURM script. This script
begins with a sequence of lines starting with the prefix #SBATCH, which
specify the requested resources and other information that configures the job.
The script then lists a sequence of commands to be executed, such as
compiling a program, running an executable program, moving files, and
other commands that, theoretically, could have been issued by the
user in an interactive session.
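As a rough illustration of this two-part layout, the sketch below prints the #SBATCH lines found in the header of a job script, stopping at the first executable command. The function name list_sbatch_directives and the sample script are hypothetical, made up for this example. It relies on the documented sbatch behavior that directives appearing after the first command are not processed.

```shell
#!/bin/bash
# Hypothetical helper: print the #SBATCH directives in the header of a
# job script. Scanning stops at the first executable line, since sbatch
# does not process directives that appear after that point.
list_sbatch_directives() {
  awk '
    /^#SBATCH/ { print; next }   # a directive: report it
    /^#/       { next }          # any other comment line: skip it
    NF == 0    { next }          # blank line: skip it
               { exit }          # first command: the header is over
  ' "$1"
}

# Demonstration on a throwaway script (contents are illustrative only).
demo=$(mktemp)
cat > "$demo" <<'EOF'
#!/bin/bash
#SBATCH -N 1
#SBATCH -t 00:05:00
cd $SLURM_SUBMIT_DIR
#SBATCH --mem=100M
EOF
list_sbatch_directives "$demo"
rm -f "$demo"
```

In the demonstration, the final --mem line is not reported, because it follows the cd command and would not take effect.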
The sbatch command is used to submit a job to SLURM. Assuming the job
script is called "myjob.sh", the user, working from a login node of the
computer cluster, would issue the command:
sbatch myjob.sh
The submitted job is accepted by SLURM for execution at some later time.
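When sbatch accepts a job, it prints a confirmation line of the form "Submitted batch job JOBID". The following sketch shows one way to capture that job ID for later use. The function name job_id_from_sbatch is hypothetical, and a sample confirmation line stands in for real sbatch output so that the example can be tried away from the cluster.

```shell
#!/bin/bash
# Hypothetical helper: extract the numeric job ID from the confirmation
# line printed by sbatch, e.g. "Submitted batch job 12345".
job_id_from_sbatch() {
  id="${1##* }"                      # keep only the last word of the line
  case "$id" in
    ''|*[!0-9]*) return 1 ;;         # not a number: report failure
    *) printf '%s\n' "$id" ;;
  esac
}

# On the cluster, one would write something like:
#   JOBID=$(job_id_from_sbatch "$(sbatch myjob.sh)")
# Here a sample line (an assumption) replaces the real sbatch output.
JOBID=$(job_id_from_sbatch "Submitted batch job 12345")
echo "Job ID: $JOBID"
```

The saved value can then be supplied wherever a command expects JOBID.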
A concerned user can check on the status of all of their jobs:
squeue -u USERNAME
or a particular job:
sacct -j JOBID
or cancel a job:
scancel JOBID
or monitor the status of all available queues:
sinfo
SLURM defines a number of environment variables to simplify work;
the most commonly used one is $SLURM_SUBMIT_DIR, which identifies
the directory from which the job script was submitted. This allows
a user to indicate that the batch job should move to this directory
at execution time, typically because this is the place where input
files and other data may be conveniently found.
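A minimal sketch of this usage follows. It assumes that, when the script is run outside of SLURM (for instance, while testing on a login node), falling back to the current directory is acceptable; the variable name WORKDIR is chosen for this example only. SLURM_JOB_ID is another variable that SLURM sets for a running job.

```shell
#!/bin/bash
# Use the submission directory when running under SLURM; otherwise
# (an assumption for testing) fall back to the current directory.
WORKDIR="${SLURM_SUBMIT_DIR:-$PWD}"

# Move there, and stop if the directory is not reachable.
cd "$WORKDIR" || exit 1

echo "Working in: $(pwd)"
echo "Job ID: ${SLURM_JOB_ID:-not running under SLURM}"
```

Quoting the variable protects against directory names that contain spaces.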
The SLURM home page:
The SLURM manual at Lawrence Livermore National Laboratory:
Currently, SLURM is available only on the ARC Huckleberry cluster.
The other systems use PBS.
The following batch file illustrates a simple set of SLURM header lines
appropriate to run a job:
#! /bin/bash
#
#SBATCH -J slurm_huckleberry
#SBATCH -p normal_q
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -t 00:05:00
#SBATCH --mem=100M
#
cd $SLURM_SUBMIT_DIR
#
#  module load commands, such as "module load gcc", go here.
#
echo "Commands to be executed by your SLURM job go here."
#
echo ""
echo "SLURM_HUCKLEBERRY: Normal end of execution."
exit 0
A complete set of files to carry out a similar process is available in