Use apt to install the necessary packages:
sudo apt install -y slurm-wlm slurm-wlm-doc
Load file:///usr/share/doc/slurm-wlm/html/configurator.html in a browser (or file://wsl%24/Ubuntu/usr/share/doc/slurm-wlm/html/configurator.html on WSL2), and:
- Set your machine's hostname in `SlurmctldHost` and `NodeName`.
- Set `CPUs` as appropriate, and optionally `Sockets`, `CoresPerSocket`, and `ThreadsPerCore`. Use the command `lscpu` to find out what you have.
- Set `RealMemory` to the number of megabytes you want to allocate to Slurm jobs.
- Set `StateSaveLocation` to `/var/spool/slurm-llnl`.
- Set `ProctrackType` to `linuxproc`, because processes are less likely to escape Slurm control on a single-machine config.
- Make sure `SelectType` is set to `Cons_res`, and set `SelectTypeParameters` to `CR_Core_Memory`.
- Set `JobAcctGatherType` to `Linux` to gather resource use per job, and set `AccountingStorageType` to `FileTxt`.
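For orientation, the relevant lines of the generated config should come out roughly like the following (the hostname and hardware numbers here are examples for a hypothetical machine; substitute your own `lscpu` values):

```ini
ClusterName=localcluster
SlurmctldHost=mybox
ProctrackType=proctrack/linuxproc
SelectType=select/cons_res
SelectTypeParameters=CR_Core_Memory
JobAcctGatherType=jobacct_gather/linux
AccountingStorageType=accounting_storage/filetxt
StateSaveLocation=/var/spool/slurm-llnl
NodeName=mybox CPUs=8 Sockets=1 CoresPerSocket=4 ThreadsPerCore=2 RealMemory=16000 State=UNKNOWN
PartitionName=debug Nodes=mybox Default=YES MaxTime=INFINITE State=UP
```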
Hit `Submit`, and save the resulting text into `/etc/slurm-llnl/slurm.conf`, i.e. the configuration file referred to in `/lib/systemd/system/slurmctld.service` and `/lib/systemd/system/slurmd.service`.
Load `/etc/slurm-llnl/slurm.conf` in a text editor, uncomment `DefMemPerCPU`, and set it to `8192`, or whatever number of megabytes each job should be allocated by default when memory is not requested explicitly with `--mem` at submission time. Read the docs and edit other defaults as you see fit.
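If you prefer to make that edit from the shell, a sed one-liner does it. The snippet below demonstrates on a temporary copy; on a real system, point it at `/etc/slurm-llnl/slurm.conf` with sudo:

```shell
# Demonstration on a throwaway copy; use /etc/slurm-llnl/slurm.conf (with sudo) for real.
conf=$(mktemp)
printf '#DefMemPerCPU=0\nSchedulerType=sched/backfill\n' > "$conf"
# Uncomment DefMemPerCPU (if commented) and set it to 8192 MB.
sed -i 's/^#\{0,1\}DefMemPerCPU=.*/DefMemPerCPU=8192/' "$conf"
grep DefMemPerCPU "$conf"   # -> DefMemPerCPU=8192
rm -f "$conf"
```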
Create `/var/spool/slurm-llnl` and `/var/log/slurm_jobacct.log`, then set ownership appropriately:
sudo mkdir -p /var/spool/slurm-llnl
sudo touch /var/log/slurm_jobacct.log
sudo chown slurm:slurm /var/spool/slurm-llnl /var/log/slurm_jobacct.log
Install `mailutils` so that Slurm won't complain about `/bin/mail` being missing:
sudo apt install -y mailutils
Make sure munge is installed and running, and that a `munge.key` was created with read permission for the munge user only, owned by `munge:munge`:
sudo service munge start
sudo ls -l /etc/munge/munge.key
Start the `slurmd` and `slurmctld` services:
sudo service slurmd start
sudo service slurmctld start
All,
I had similar issues. I found the following to be helpful: https://blog.llandsmeer.com/tech/2020/03/02/slurm-single-instance.html
I can confirm that this works on Ubuntu 20, with Slurm 19.05.5-1 (which installs through apt).
I have copied the steps below:
Set up munge
$ sudo apt install munge
Test if it works:
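The test command didn't survive the copy-paste; the usual check is to encode a credential and decode it locally (the original blog post's exact invocation may differ):

```shell
# Encode a credential with munge and decode it with unmunge;
# on a working setup the output includes "STATUS: Success (0)".
munge -n | unmunge
```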
Set up MariaDB
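The MariaDB commands also didn't survive the copy-paste; at minimum this step installs the server (accounting through slurmdbd additionally needs a database and user, which the Slurm accounting docs cover):

```shell
# Install the MariaDB server for slurmdbd-based accounting (optional for a basic setup).
sudo apt install -y mariadb-server
```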
Set up SLURM
$ sudo apt install slurmd slurm-client slurmctld
Use configurator.html to create the SLURM config file. There is one online, but it is only useful for the latest version.
Find out which version you have (`dpkg -l | grep slurm`; mine was 17.11.2). Go to https://www.schedmd.com/archives.php and download the package corresponding to your version (I ended up with a small version mismatch, but it worked out anyway).
Unpack and enter the directory, then build and run the Configuration Tool.
(mkasemer: you may need to edit more here depending on your machine, but these basics worked for me)
Fill in all NodeName/Hostname fields with your own hostname.
For testing, fill in root for SlurmUser.
Make sure that the slurmd and slurmctld PID file path are the same as listed in the systemd file (e.g., /lib/systemd/system/slurmd.service).
You might want to look at the Number of CPUs setting (mkasemer: I edited sockets, cores per socket, and threads per core. I left number of CPUs blank, and let Slurm figure it out based on these values)
Copy-paste to /etc/slurm-llnl/slurm.conf.
Next, create a file /etc/slurm-llnl/cgroup.conf:
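The file contents didn't survive the copy-paste; a minimal cgroup.conf in the spirit of the original post looks like this (whether to constrain each resource is your call):

```ini
CgroupAutomount=yes
ConstrainCores=yes
ConstrainRAMSpace=yes
ConstrainSwapSpace=yes
```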
Restart daemons
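The commands didn't survive the copy-paste; on Ubuntu this amounts to (via systemd, or equivalently `sudo service ... restart`):

```shell
sudo systemctl restart slurmctld
sudo systemctl restart slurmd
```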
Running sinfo should show no errors:
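On a healthy single-node setup, `sinfo` reports the node as idle. Illustrative output (partition and node names will differ with your config):

```shell
sinfo
# Illustrative output on a healthy single-node setup:
# PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
# debug*       up   infinite      1   idle localhost
```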
Test an actual job
Run sleep 1 on 8 processors:
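The command didn't survive the copy-paste; launching sleep across 8 tasks looks like:

```shell
# Launch 8 parallel tasks, each running "sleep 1".
srun -n 8 sleep 1
```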
Some useful debugging commands:
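The list didn't survive the copy-paste; these are the commands I would reach for (a sketch, not exhaustive; log paths depend on SlurmdLogFile/SlurmctldLogFile in your slurm.conf):

```shell
# Run the daemons in the foreground with verbose logging:
sudo slurmd -D -vvv
sudo slurmctld -D -vvv
# Inspect node and job state:
scontrol show node
scontrol show job
```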
Note from mkasemer
When attempting to use this with OpenMPI installed via apt, it had issues (see here for a complete description of the problem). Specifically, with a submission script that launches the job via `srun` (as is often suggested), `srun` would not work properly: there were MPI initialization errors all over, despite MPI being installed with Slurm support. The fix that works is to launch with `mpirun` instead: you have to specify both the number of slots for Slurm to reserve (using `#SBATCH -n`) and the number of slots for mpirun to use (using `mpirun -np`). So, a little annoying, but it works properly.
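A sketch of the submission-script shape that worked, assuming this fix (the program name and slot counts are placeholders):

```shell
#!/bin/bash
#SBATCH -n 8          # slots for Slurm to reserve
#SBATCH --mem=8G      # memory for the job
# Launch with mpirun rather than srun; -np should match (or not exceed) -n above.
mpirun -np 8 ./my_mpi_program
```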