@Natooz
Created August 4, 2023 17:45
Script that resubmits a SLURM sbatch job until it has completed. This can be useful when the SLURM environment only allows jobs to run for a limited amount of time.
#!/bin/bash
# The job is resubmitted until it has fully completed, that is, until a specific output file exists.
# The --wait option makes sbatch block until the job terminates, so the loop can then check whether to resubmit it.
# Set variables
# Abort with a usage message if no name is given
NAME=${1:?usage: $0 NAME}
JOB_FILE="train_$NAME.sh"
TRAIN_FILE="runs/gen_MMD/TSD_$NAME/train_results.json"
# Resubmit the job until training is done
while [ ! -f "$TRAIN_FILE" ]; do
    sbatch --wait "$JOB_FILE"
done
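
The loop above never stops if the result file is never written, for example when the job file is missing or the job crashes on every run. Below is a minimal sketch of a bounded variant, assuming a cap on the number of submissions is acceptable. The `run_until_done` function and its arguments are hypothetical and not part of the original script.

```shell
#!/bin/bash
# Hypothetical helper (not in the original gist): rerun a command until a
# marker file exists, giving up after a fixed number of attempts.
# Usage: run_until_done MARKER_FILE MAX_ATTEMPTS COMMAND [ARGS...]
run_until_done() {
    local marker=$1
    local max_attempts=$2
    shift 2
    local attempt=0
    while [ ! -f "$marker" ]; do
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "Giving up after $attempt attempts: $marker not found" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        # When the command is "sbatch --wait ...", this call blocks until
        # the submitted job terminates.
        "$@" || echo "Attempt $attempt exited with a non-zero status" >&2
    done
    return 0
}
```

With the variables set as in the script, it could be called as `run_until_done "$TRAIN_FILE" 20 sbatch --wait "$JOB_FILE"`, where 20 is an arbitrary cap chosen for illustration.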