Last active
October 5, 2016 08:40
#!/bin/bash
# make variables available in function started by
# gnu parallel
export FINALRES="result"
export WPSIZE=5
export JOBLOG="joblog"
function getfiles() {
    find ./* -name "*.txt"
}
function worker() {
    [ ${#@} -eq 0 ] && exit 0
    for fil in "${@}"; do
        echo "process $fil"
    done
    exit 0
}
# make function worker known to gnu parallel
export -f worker
if [ ! -e "${JOBLOG}" ]; then
    getfiles | parallel --joblog "${JOBLOG}" -n"${WPSIZE}" -j +0 worker >> "${FINALRES}"
else
    getfiles | parallel --resume --joblog "${JOBLOG}" -n"${WPSIZE}" -j +0 worker >> "${FINALRES}"
fi
echo "all jobs are finished"
exit 0
Hello Ole,
thanks for your feedback. I am using the WPSIZE variable to control the number of parameters that are passed to each worker. My intention was to prevent too much locking on the output file (which I assume GNU Parallel does internally) when the workers finish quickly. In my case that happened quite often, so I extended the lifetime of every worker a bit. The touch $FINALRES part is indeed redundant.
Thanks again and kind regards
Julian
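A rough sketch of the batching idea, for readers who want to see what "number of parameters per worker" means in practice. It uses xargs -n purely as a widely available stand-in for the -n option of GNU Parallel; it runs sequentially, writes no job log, and the file names and the batch size of 3 are made up for illustration.

```shell
#!/bin/bash
# Hypothetical batch size, playing the role of WPSIZE in the script above.
batch_size=3

# Seven made-up file names, one per line, as getfiles would emit them.
files=$(printf '%s\n' a.txt b.txt c.txt d.txt e.txt f.txt g.txt)

# xargs -n groups the input into chunks of at most $batch_size arguments
# and starts one command per chunk, so 7 files become 3 invocations.
batches=$(printf '%s\n' "$files" | xargs -n "$batch_size" echo process)

printf '%s\n' "$batches"
# process a.txt b.txt c.txt
# process d.txt e.txt f.txt
# process g.txt
```

Fewer, longer-lived invocations mean fewer separate writes to the shared output, which is the effect the batch size is meant to achieve.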
Do you need WPSIZE? By removing it and passing a single file to each job, you get the added benefit that GNU Parallel logs each job's exit status, so you can see which files failed.
function worker() {
    echo "process $1"
}
export -f worker
getfiles | parallel --joblog "${JOBLOG}" -j +0 worker >> "${FINALRES}"
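The logging benefit can be illustrated without GNU Parallel at all. The job log holds one row per job with that job's exit value, so a job that handles several files can only report one combined status. The sketch below calls the functions directly; process_one, worker_batch, and the rule that names containing "bad" fail are all hypothetical.

```shell
#!/bin/bash
# Hypothetical per-file step: fails for names containing "bad".
process_one() {
    case "$1" in
        *bad*) return 1 ;;
        *)     echo "process $1" ;;
    esac
}

# Batched worker: several files share one exit status.
worker_batch() {
    local fil status=0
    for fil in "$@"; do
        process_one "$fil" || status=1
    done
    return "$status"
}

worker_batch a.txt bad.txt c.txt > /dev/null
batch_status=$?
echo "batch exit status: $batch_status"   # one value for three files

# One file per call: each file gets its own exit status.
per_file=""
for fil in a.txt bad.txt c.txt; do
    process_one "$fil" > /dev/null
    rc=$?
    echo "$fil exit status: $rc"
    per_file="$per_file$rc"
done
```

With the batched worker the only information is that something in the group failed; with one file per call the failing file is identifiable from its own status.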
Also there is no need to touch $FINALRES: The >> will create the file if it is not there.
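A quick way to check this behaviour, as a minimal sketch: the temporary directory and the path stored in finalres are made up for the demonstration and are removed again at the end.

```shell
#!/bin/bash
# Work in a throwaway directory so nothing in the current directory is touched.
workdir=$(mktemp -d)
finalres="$workdir/result"   # hypothetical stand-in for $FINALRES

# The file does not exist yet and is never touched beforehand.
if [ -e "$finalres" ]; then existed_before=yes; else existed_before=no; fi

# The first >> creates the file, the second appends to it.
echo "process a.txt" >> "$finalres"
echo "process b.txt" >> "$finalres"

line_count=$(grep -c '' "$finalres")
echo "existed before: $existed_before, lines after: $line_count"

rm -rf "$workdir"
```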