Created
March 25, 2012 03:14
argin.sh : a universal xargs wrapper that properly handles ANY standard input, with parallel execution enabled by default (no more hassles with the null character :-). argin.sh converts each line from stdin into an argument, then builds and executes one command per line.
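To make the description concrete, here is a minimal sketch of the same filter-and-dispatch pipeline written as a shell function. The name `argin_sketch` is hypothetical and not part of the gist. It assumes sed, tr, and an xargs that supports -r, -0, and -I (as GNU xargs does), and it uses -P1 instead of the script's parallel default so that the output order is predictable.

```shell
# Hypothetical helper for illustration; not part of argin.sh.
# Unlike argin.sh, this sketch does not append a missing {} placeholder.
argin_sketch () {
    # Drop blank lines and lines starting with "# ", then turn each
    # remaining newline into a NUL so that xargs -0 takes every line
    # literally (spaces, quotes, and backslashes included).
    sed -e '/^[[:blank:]]*$/d' -e '/^# /d' \
        | tr '\n' '\0' \
        | xargs -r -0 -I '{}' -P1 "$@"
}

# One command is run per surviving input line.
printf '%s\n' 'a file.txt' '' '# a comment' "it's here" \
    | argin_sketch echo 'got: {}'
```

The blank line and the comment line are filtered out before xargs sees them, and the two remaining lines each reach echo as a single argument despite the space and the apostrophe.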
#!/usr/bin/env bash
#              bash 4.1.5(1)     Linux Ubuntu 10.04        Date : 2012-03-22
#
# _______________|  argin : xargs properly handled from ANY standard input.
#                           xargs converts stdin into arguments in order
#                           to build and execute a command individually.
#
#           Usage:  argin [options] command [{} as placeholder if necessary]
#
#        Examples:  % argin -t echo {} < foo.txt 2> argin-done.txt
#                   #     -t prints command to STDERR before issuing it.
#                   #     -p prompt about whether to run each command line.
#                   #     -P1 for serial, not faster parallel, execution.
#                   #
#                   % cat foo.txt | argin echo {}
#                   #     same thing without a record of executed commands.
#                   #     {} placeholder is available for incoming arguments;
#                   % cat foo.txt | argin echo
#                   #     absent {} will be implicitly placed at the end.
#
#    Dependencies:  sed, tr, and xargs.
#                   [GNU parallel is not needed here -- it's overkill ;-]
#
#  ___ATTN___  xargs works best with null character as separator.
#  argin is the prefilter which solves this problem from ANY standard input.
#  Blank and comment lines are ignored by argin.


#  CHANGE LOG  LATEST version available:  https://bitbucket.org/rsvp/gists/src
#
#  2012-03-22  Minor fixes.
#  2012-03-21  Joshua Levy suggested using parallel option -P0
#                 for dramatic speedup in execution.
#  2012-03-20  Most helpful reference: http://en.wikipedia.org/wiki/Xargs


#           _____ PREAMBLE_v2: settings, variables, and error handling.
#
LC_ALL=POSIX
#      locale means "ASCII, US English, no special rules,
#      output per ISO and RFC standards."
#      Esp. use ASCII encoding for glob and sorting characters.
shopt -s   extglob
#     ^set extended glob for pattern matching.
set -e
#   ^errors checked: immediate exit if a command has non-zero status.
set -u
#   ^unassigned variables shall be errors.
#    Example of default VARIABLE ASSIGNMENT:  arg1=${1:-'foo'}

cmdarg="$@"


program=${0##*/}   #  similar to using basename
# memf=$( mktemp /dev/shm/88_${program}_tmp.XXXXXXXXXX )


cleanup () {
     #  Delete temporary files, then optionally exit given status.
     local status=${1:-'0'}
     # rm -f $memf
     [ "$status" = '-1' ] ||  exit "$status"      #  thus -1 prevents exit.
} #--------------------------------------------------------------------
warn () {
     #  Message with basename to stderr.          Usage: warn "message"
     echo -e "\n !!  ${program}: $1 "  >&2
} #--------------------------------------------------------------------
die () {
     #  Exit with status of most recent command or custom status, after
     #  cleanup and warn.      Usage: command || die "message" [status]
     local status=${2:-"$?"}
     cleanup -1  &&   warn "$1"  &&  exit "$status"
} #--------------------------------------------------------------------
trap "die 'SIG disruption, but cleanup finished.' 114" 1 2 3 15
#    Cleanup after INTERRUPT: 1=SIGHUP, 2=SIGINT, 3=SIGQUIT, 15=SIGTERM
#
# _______________     ::  BEGIN  Script ::::::::::::::::::::::::::::::::::::::::


testarg=${cmdarg//\{\}/}
#                ^globally remove {} from cmdarg


#  #  We could make the use of placeholder {} mandatory...
#  [ "$testarg" = "$cmdarg" ] && die "must use {} to replace argument." 115

#  ... OR, we can add missing {} at the end for the user's convenience:
if [ "$testarg" = "$cmdarg" ] ; then
     cmdarg="$@ {}"
fi


#  __________ MAIN one-liner
#
#  Delete all blank lines since as empty arguments they could be dangerous.
#  Also delete comment lines beginning with '# ' (hash followed by a space),
#  so input such as a '#hashtag' is still allowed.
#  sed is using standard input '-' here.
#  Then, translate any newline to NULL CHARACTER.

sed -e '/^[[:blank:]]*$/d' -e '/^# /d' - \
     | tr \\n \\0 \
     | xargs -r -0 -I '{}' -n1 -P0 $cmdarg  ||  die "given bad command." 113

#  -P0  Parallel execution for speedup...
#       (don't expect serial arg ordering
#        unless -P1 is used at commandline;
#        later option will overrule).
#
#  '{}' is the replacement string.
#
#  -r   means don't execute command if no input was received.
#
#  -0   Input items are terminated by a null character instead of by
#       whitespace, and the quotes and backslash are not special (every
#       character is taken literally). Disables the end of file string,
#       which is treated like any other argument. Useful when input items
#       might contain white space, quote marks, backslashes, or commas.
#
#  -I replace-str    [where we use {} as the replace-str]
#       Replace occurrences of replace-str in the initial-arguments with
#       names read from standard input. Also, unquoted blanks do not
#       terminate input items; instead the separator is the newline
#       character. Implies -x and -L 1.
#  ___ATTN___  Thus this uses at most ONE NONBLANK INPUT LINE PER COMMAND LINE.
#
#       GENERALLY, blank lines from standard input are ignored.
#
#       Without replace-str, the argument simply gets tacked on the end.
#       So for example, without {} this move would not be possible:
#            % cat filenames.txt | argin mv {} /tmp
#
#       The use of '{}' as placeholder is merely conventional,
#       it could have been '@@ARGUMENT' or any other string.


#     _____ Other options, esp. PARALLEL execution.
#           Speedup is over 2x even on single core processor.
#           Execution will not be in serial order of arguments.
#
#        --max-procs=max-procs
#        -P max-procs
#               Run up to max-procs processes at a time; the default is 1. If
#               max-procs is 0, xargs will run as many processes as possible
#               at a time. Use the -n option with -P; otherwise chances are
#               that only one exec will be done.
#
#        --max-args=max-args
#        -n max-args
#               Use at most max-args arguments per command line. Fewer than
#               max-args arguments will be used if the size (see the -s
#               option) is exceeded, unless the -x option is given, in which
#               case xargs will exit.
#                    [ Set to 1 without any loss of generality
#                      since -L1 is implied by using placeholder with -I. ]


cleanup
# _______________ EOS ::  END of Script ::::::::::::::::::::::::::::::::::::::::


#  _____ DECLARING FREEDOM from remembering these NULL options!  :-)
#
#       perl   (requires -0 and \0 instead of \n)
#       locate (requires using -0)
#       find   (requires using -print0)
#       grep   (requires -z or -Z)
#       sort   (requires using -z)


#  vim: set fileencoding=utf-8 ff=unix tw=78 ai syn=sh :
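
The script's notes on -0 explain that null-terminated items are taken literally. As a rough illustration of why that matters, the sketch below contrasts default xargs parsing with the newline-to-NUL translation that argin performs. The function names `split_default` and `split_nul` are hypothetical and exist only for this example; it assumes an xargs that supports -0.

```shell
# Hypothetical helpers for illustration; neither is part of argin.sh.

# Default xargs parsing: blanks separate items, so a line containing a
# space is handed to echo as two separate arguments.
split_default () {
    xargs -n1 echo
}

# NUL-delimited parsing: newlines become NULs and xargs -0 takes each
# line literally, so the line stays a single argument.
split_nul () {
    tr '\n' '\0' | xargs -0 -n1 echo
}

printf '%s\n' 'two words' | split_default
printf '%s\n' 'two words' | split_nul
```

With default parsing the single input line is echoed as two lines, one word each; with NUL-delimited parsing it is echoed once, intact. This is the behavior argin relies on to pass filenames with spaces safely.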