@florin-chelaru
Last active November 20, 2015 13:38
#!/bin/bash
# input is a file containing a list of paths (generated with extract_all_paths.sh)
input=$1
# get the id of the task, generated by GridEngine
taskId=$SGE_TASK_ID
# the number of files to be processed in one run
batchSize=$2
# total number of tasks for this job
tasks=$3
# a file where we will dump all the stderr and stdout output
out=$4
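for i in $(seq 0 $(( batchSize - 1 ))); do
    # tasks are interleaved: task t handles lines t, t + tasks, t + 2*tasks, ...
    j=$(( i * tasks + taskId ))
    # extract the j-th line (a file path) from the input
    targetFile=$(sed "${j}q;d" "$input")
    # skip empty entries (e.g. when j is past the end of the input list)
    [ -z "$targetFile" ] && continue
    # log to the output file ($4), not to $3, which holds the task count
    echo "[$JOB_NAME] [${JOB_ID}-${taskId}:$i:$j] $targetFile" &>> "$out"
    /usr/bin/xz -zvf "$targetFile" &>> "$out"
done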
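
The loop computes `j = i * tasks + taskId`, so each array task walks the input list with a stride of `tasks`, starting at its own task id. The sketch below only illustrates that mapping and is not part of the original script: `task_lines` is a hypothetical helper, and it assumes task ids start at 1, as is typical for a GridEngine array job.

```shell
#!/bin/bash
# Hypothetical helper (not in the original script): print the input line
# numbers that one array task visits, using the same formula as the loop.
task_lines() {
    taskId=$1; batchSize=$2; tasks=$3
    i=0
    lines=""
    while [ "$i" -lt "$batchSize" ]; do
        lines="$lines $(( i * tasks + taskId ))"
        i=$(( i + 1 ))
    done
    # drop the leading space before printing
    echo "${lines# }"
}

# Example: 3 tasks with 2 files per task cover input lines 1 through 6.
for t in 1 2 3; do
    echo "task $t -> lines $(task_lines "$t" 2 3)"
done
# prints:
# task 1 -> lines 1 4
# task 2 -> lines 2 5
# task 3 -> lines 3 6
```

For every path to be processed, `batchSize * tasks` should be at least the number of lines in the input file, and the third argument should match the number of tasks the array job is submitted with.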