In your job .sh file:
#!/bin/bash
set -e
python sample_experiment.py
# Optional: remove the .out file to keep the folder clean
#rm slurm-$SLURM_JOB_ID.out
And in your sample_experiment.py file:
import os
from mlflow import log_metric, log_param, log_artifact

if __name__ == "__main__":
    # Log a parameter (key-value pair)
    log_param("param1", 5)

    # Log a metric; metrics can be updated throughout the run
    log_metric("foo", 1)
    log_metric("foo", 2)
    log_metric("foo", 3)

    # Log an artifact (output file)
    log_artifact("slurm-" + os.environ['SLURM_JOB_ID'] + ".out")
Then check mlflow ui: the Slurm output file will be linked as an artifact of the run, and if you uncomment the rm line in the job script, the working folder stays clean.
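One caveat: os.environ['SLURM_JOB_ID'] raises a KeyError when the script runs outside Slurm (for example, during a local test). A slightly more defensive variant could look like the sketch below. The helper name slurm_output_path is made up for illustration, and it assumes the default slurm-<jobid>.out naming in the submission directory; if you pass --output to sbatch, or run array jobs, the file name will differ.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def slurm_output_path(env: Mapping[str, str] = os.environ,
                      directory: str = ".") -> Optional[Path]:
    """Return the default Slurm output file for this job, or None.

    Hypothetical helper: assumes the default 'slurm-<jobid>.out' name.
    """
    job_id = env.get("SLURM_JOB_ID")
    if job_id is None:
        return None  # not running under Slurm
    path = Path(directory) / f"slurm-{job_id}.out"
    # Only return the path if the file actually exists
    return path if path.is_file() else None


if __name__ == "__main__":
    path = slurm_output_path()
    if path is None:
        print("No Slurm output file found; skipping artifact upload.")
    else:
        # Imported here so the helper above works without mlflow installed
        from mlflow import log_artifact
        log_artifact(str(path))
```

This way the same script can be run both under sbatch and on your own machine without changes.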