A mini cluster of 1 head node and 2 compute nodes.
- Make sure the clocks, users, and groups (UIDs and GIDs) are synchronized across the cluster.
There must be a uniform user and group name space (including UIDs and GIDs) across the cluster. It is not necessary to permit user logins to the control hosts (SlurmctldHost), but the users and groups must be resolvable on those hosts.
- Each node in the cluster must be able to resolve the other nodes by their host names.
On Ubuntu 24, the host name can be set by
sudo hostnamectl set-hostname your-host-name
On Azure, the vnet's DNS picks up the host name of each VM (a VM reboot may be required) and makes it resolvable across the vnet, so you don't need to sync /etc/hosts on each node in the same vnet.
- Install munge on each node
- Install slurmctld on each head node
- Install slurmd on each compute node
- Configure slurm on each node
- Optionally install slurmdbd for accounting on the head node (or a dedicated node)
- Verification
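The two prerequisites above (resolvable host names and uniform UIDs/GIDs) can be spot-checked from any node before the installation starts. The following is a minimal sketch, not a complete audit: the function names are made up for illustration, and compare_ids assumes key-based ssh access to the other nodes.

```shell
#!/usr/bin/env bash
# Sketch: check two prerequisites from this node.
# Node and user names passed to these functions are placeholders.

# Report whether each given host name resolves on this node.
check_hosts() {
  local host addr
  for host in "$@"; do
    # getent consults the same sources (files, DNS) as other programs on the node.
    if addr=$(getent hosts "$host"); then
      echo "$host: resolves to ${addr%% *}"
    else
      echo "$host: UNRESOLVED"
    fi
  done
}

# Print "uid:gid" for a user as resolved on the local host.
local_ids() {
  printf '%s:%s\n' "$(id -u "$1")" "$(id -g "$1")"
}

# Compare the local UID:GID of user $1 with what each node (remaining args) reports.
compare_ids() {
  local user=$1 expected actual node
  shift
  expected=$(local_ids "$user")
  for node in "$@"; do
    # BatchMode makes ssh fail instead of prompting for a password.
    actual=$(ssh -o BatchMode=yes "$node" "id -u $user; id -g $user" 2>/dev/null | paste -sd: -)
    if [ "$actual" = "$expected" ]; then
      echo "$node: ok ($actual)"
    else
      echo "$node: MISMATCH (local $expected, remote ${actual:-unresolved})"
    fi
  done
}
```

For example, check_hosts headnode computenode-01 computenode-02 followed by compare_ids robert computenode-01 computenode-02 would cover a cluster with those (assumed) node names and a user named robert.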
NOTE
Before any apt install, run apt update.
Install munge on each node by
sudo apt install munge
The /etc/munge/munge.key must be the same on each node. So generate a key file (with mungekey) or take the key file from one node, and sync it to all nodes in the cluster.
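One way to sync the key is to stream it over ssh from the node that holds the chosen key. The sketch below is only an illustration under several assumptions: the node names are placeholders, sudo works without a password on both ends, and the munge service and user are named munge. By default it only prints the commands; set DRY_RUN=0 to run them.

```shell
#!/usr/bin/env bash
# Sketch: copy this node's MUNGE key to other nodes and restart munge there.
# With DRY_RUN=1 (the default here) the commands are only printed.
KEY=/etc/munge/munge.key

sync_munge_key() {
  local node cmd
  for node in "$@"; do
    # Stream the key over ssh so it is never written to a temporary file,
    # then set an owner and mode that keep it private to the munge user.
    cmd="sudo cat $KEY | ssh $node 'sudo tee $KEY >/dev/null && sudo chown munge:munge $KEY && sudo chmod 400 $KEY && sudo systemctl restart munge'"
    if [ "${DRY_RUN:-1}" = 1 ]; then
      echo "$cmd"
    else
      eval "$cmd"
    fi
  done
}

sync_munge_key computenode-01 computenode-02
```

Printing the commands first makes it easy to review exactly what would run as root on each node.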
Verify the installation by
munge -n -t 10 | ssh somehost unmunge
and
ssh somehost munge -n -t 10 | unmunge
Install slurmctld on each head node by
sudo apt install slurmctld
Install slurmd on each compute node by
sudo apt install slurmd
A system user slurm will be created by either installation. This is the user for SlurmUser in Slurm's configuration.
The Slurm configuration file is /etc/slurm/slurm.conf. The file is created manually and must be the same on each node.
The slurmctld package provides HTML form tools to help generate the file:
- /usr/share/doc/slurmctld/slurm-wlm-configurator.easy.html
- /usr/share/doc/slurmctld/slurm-wlm-configurator.html
Just download one of them to your local computer and open it in a web browser.
Example configuration files are provided under /usr/share/doc/slurmctld/examples.
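For orientation, a hand-written file for a cluster of this size could look roughly like the one generated below. This is a sketch under assumptions, not the configuration used here: the cluster name, host names, CPU count, and state directories are placeholders, and the write_slurm_conf helper exists only for this example.

```shell
#!/usr/bin/env bash
# Sketch: write a minimal slurm.conf for 1 head node and 2 compute nodes.
# Cluster name, host names, CPU count, and directories are assumptions.
write_slurm_conf() {
  cat > "$1" <<'EOF'
ClusterName=minicluster
SlurmctldHost=headnode
SlurmUser=slurm
AuthType=auth/munge
StateSaveLocation=/var/lib/slurm/slurmctld
SlurmdSpoolDir=/var/lib/slurm/slurmd
SelectType=select/cons_tres
SlurmctldLogFile=/var/log/slurm/slurmctld.log
SlurmdLogFile=/var/log/slurm/slurmd.log
# Run "slurmd -C" on a compute node to print its real hardware values.
NodeName=computenode-[01-02] CPUs=2 State=UNKNOWN
PartitionName=dev Nodes=ALL Default=YES MaxTime=INFINITE State=UP
EOF
}
```

Calling write_slurm_conf ./slurm.conf.sample produces a file that can be reviewed, adjusted, and then copied to /etc/slurm/slurm.conf on every node.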
By default, job completion is not recorded. It can be recorded in a plain text file with the following settings in slurm.conf:
JobAcctGatherType=jobacct_gather/cgroup
JobCompType=jobcomp/filetxt
JobCompLoc=/var/log/slurm/job_completions
Then job completion records are written in /var/log/slurm/job_completions, one line per job, like
JobId=3 UserId=robert(1000) GroupId=robert(1000) Name=test JobState=COMPLETED Partition=dev TimeLimit=525600 StartTime=2025-08-21T08:49:46 EndTime=2025-08-21T08:49:56 NodeList=computenode-01 NodeCnt=1 ProcCnt=1 WorkDir=/home/robert ReservationName= Tres=cpu=1,mem=1M,node=1,billing=1 Account= QOS= WcKey= Cluster=unknown SubmitTime=2025-08-21T08:49:46 EligibleTime=2025-08-21T08:49:46 DerivedExitCode=0:0 ExitCode=0:0
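Because each record is a line of Key=Value pairs, the file is easy to summarize with standard tools. The function below is a small sketch (jobcomp_summary is a name chosen for this example) that assumes no field value contains a space, as in the record shown above.

```shell
#!/usr/bin/env bash
# Sketch: print a short summary (id, user, state, start, end) per job record.
# Assumes no field value contains a space.
jobcomp_summary() {
  awk '{
    split("", kv)                      # reset the map for each record
    for (i = 1; i <= NF; i++) {
      eq = index($i, "=")              # split on the first "=" only
      if (eq > 0) kv[substr($i, 1, eq - 1)] = substr($i, eq + 1)
    }
    print kv["JobId"], kv["UserId"], kv["JobState"], kv["StartTime"], kv["EndTime"]
  }' "$@"
}
```

It reads the files given as arguments, or standard input if there are none, for example jobcomp_summary /var/log/slurm/job_completions.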
Accounting commands such as sacct and sreport depend on slurmdbd, which in turn depends on MySQL or MariaDB. We're going to install slurmdbd and MariaDB on the head node.
On the head node, install MariaDB and secure it.
sudo apt install mariadb-server mariadb-client
sudo mariadb-secure-installation
NOTE
Set a password for the database root user.
Then configure the database.
create user 'slurm'@'localhost' identified by 'YourPassword';
grant all on *.* TO 'slurm'@'localhost';
On the head node, install slurmdbd.
sudo apt install slurmdbd
Then configure slurmdbd in /etc/slurm/slurmdbd.conf. The file is created manually and should be accessible only by slurmdbd, since it contains the database credentials.
There are example configuration files under /usr/share/doc/slurmdbd/examples.
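As a starting point, a minimal slurmdbd.conf matching the database user created earlier might look like the file written below. Treat it as a sketch: the host name headnode, the database name slurm_acct_db, the log path, and the write_slurmdbd_conf helper are assumptions for this example, and YourPassword is the same placeholder used in the SQL statements.

```shell
#!/usr/bin/env bash
# Sketch: write a minimal slurmdbd.conf with owner-only permissions.
# The password, host names, and database name are placeholders.
write_slurmdbd_conf() {
  # Create the file under a restrictive umask so the password is never
  # readable by other users, even briefly.
  ( umask 077
    cat > "$1" <<'EOF'
AuthType=auth/munge
DbdHost=headnode
SlurmUser=slurm
StorageType=accounting_storage/mysql
StorageHost=localhost
StorageUser=slurm
StoragePass=YourPassword
StorageLoc=slurm_acct_db
LogFile=/var/log/slurm/slurmdbd.log
EOF
  )
  # Also tighten the mode in case the file already existed.
  chmod 600 "$1"
}
```

After writing the file to /etc/slurm/slurmdbd.conf as root, change its owner to the slurm user (for example with sudo chown slurm:slurm) so that slurmdbd can read it.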
Slurm accounting requires configuring slurmctld as well as slurmdbd. A minimal accounting configuration for slurmctld (in slurm.conf) is:
AccountingStorageType=accounting_storage/slurmdbd
AccountingStorageHost=headnode
There are more accounting settings, with names of the form AccountingStorage*.
Restart slurmctld and slurmd when slurm.conf changes.
Verify the Slurm installation on the head node.
Show Slurm nodes and partitions by
sinfo
Run a test job by
sbatch ./test.sh
Show job accounting information by
sacct
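The contents of test.sh are not shown here, so the following is only a guess at a suitable script. The job name test and partition dev mirror the sample job completion record, while the output file name and the sleep duration are assumptions.

```shell
#!/bin/bash
#SBATCH --job-name=test
#SBATCH --partition=dev
#SBATCH --ntasks=1
#SBATCH --output=test-%j.out

# Print the compute node's name, then stay alive briefly so the job is visible in squeue.
uname -n
sleep 5
```

The %j in the output file name is replaced by the job ID, so each run writes its own file in the directory the job was submitted from.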