@benman1
Last active September 14, 2018 10:38
Parallel bash loop over files
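#!/usr/bin/env bash
# USAGE:
# ./parallel.sh <command> <input directory> [<threads> [<log file>]]
# <threads> is the number of parallel jobs (default: 10);
# <log file> defaults to files_processed.log.
#
# EXAMPLE:
# ./parallel.sh longrunningcommand.java /myfiles/input/ 10 process.log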
shell_command=$1
input_dir=$2
threads=${3:-10}
output_log=${4:-files_processed.log}
# Start from an empty log; quoting keeps paths with spaces intact.
rm -f -- "${output_log}"
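while IFS= read -r -d '' file
do
# Before starting each new batch of ${threads} jobs, wait for the previous batch.
((i=i%threads)); ((i++==0)) && wait
(
# $? right after "cmd &" is always 0, so test the command inside the background subshell.
# ${shell_command} is left unquoted so that a command with arguments is split into words.
if ${shell_command} "${file}"; then
printf '%s\tOK\n' "${file}" >> "${output_log}"
else
printf '%s\tFAIL\n' "${file}" >> "${output_log}"
fi
) &
done < <(find "${input_dir}" -type f -print0)
# Let the last batch finish before the script exits.
wait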

benman1 commented Sep 13, 2018

shell_command is the command that is run once for each file found under the input directory, with the file path as its last argument.

Every processed file is recorded in the log file, one line per file, as the file path followed by a tab and OK or FAIL.
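
For anyone who wants to try the idea in isolation, here is a minimal sketch of the same batching and logging wrapped in a function. The name run_batched and its argument order are my own assumptions for illustration, not part of the gist. Each file's status is tested inside the background subshell, because the exit status seen right after starting a job with & only reports that the job was launched.

```shell
#!/usr/bin/env bash
# Sketch only: run_batched is a hypothetical helper, not part of parallel.sh.
# Usage: run_batched <log file> <jobs> <input directory> <command> [args...]
run_batched() {
  local log=$1 jobs=$2 dir=$3
  shift 3                      # the remaining "$@" is the command to run
  local i=0 file
  : > "${log}"                 # start with an empty log
  while IFS= read -r -d '' file; do
    # Before starting each new batch, wait for the previous batch to finish.
    ((i = i % jobs)); ((i++ == 0)) && wait
    (
      # Test the command itself; $? right after "cmd &" would always be 0.
      if "$@" "${file}"; then
        printf '%s\tOK\n' "${file}" >> "${log}"
      else
        printf '%s\tFAIL\n' "${file}" >> "${log}"
      fi
    ) &
  done < <(find "${dir}" -type f -print0)
  wait                         # collect the final batch
}

# Example: grep -q succeeds only for files that contain "ok".
demo=$(mktemp -d)
echo ok  > "${demo}/a.txt"
echo bad > "${demo}/b.txt"
run_batched "${demo}.log" 2 "${demo}" grep -q ok
sort "${demo}.log"             # a.txt is logged as OK, b.txt as FAIL
```

Note that the bare wait also waits for any other background jobs of the calling shell, and that the log is written next to the demo directory rather than inside it so that find does not pick it up as an input file.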
