
@rsvp
Created March 25, 2012 03:14
argin.sh : universal xargs wrapper that properly handles ANY standard input, with parallel execution enabled by default. (No more hassles with the null character :-) argin.sh converts each line from stdin into an argument, then builds and executes the command once per line.
#!/usr/bin/env bash
# bash 4.1.5(1) Linux Ubuntu 10.04 Date : 2012-03-22
#
# _______________| argin : xargs properly handled from ANY standard input.
# xargs converts stdin into arguments in order
# to build and execute a command individually.
#
# Usage: argin [options] command [{} as placeholder if necessary]
#
# Examples: % argin -t echo {} < foo.txt 2> argin-done.txt
# # -t prints command to STDERR before issuing it.
# # -p prompt about whether to run each command line.
# # -P1 for serial, not faster parallel, execution.
# #
# % cat foo.txt | argin echo {}
# # same thing without a record of executed commands.
# # {} placeholder is available for incoming arguments;
# % cat foo.txt | argin echo
# # absent {} will be implicitly placed at the end.
#
# Dependencies: xargs and tr.
# [GNU parallel is not needed here -- it's overkill ;-]
#
# ___ATTN___ xargs works best with null character as separator.
# argin is the prefilter which solves this problem from ANY standard input.
# Blank and comment lines are ignored by argin.
# CHANGE LOG LATEST version available: https://bitbucket.org/rsvp/gists/src
#
# 2012-03-22 Minor fixes.
# 2012-03-21 Joshua Levy suggested using parallel option -P0
# for dramatic speedup in execution.
# 2012-03-20 Most helpful reference: http://en.wikipedia.org/wiki/Xargs
# _____ PREAMBLE_v2: settings, variables, and error handling.
#
LC_ALL=POSIX
# locale means "ASCII, US English, no special rules,
# output per ISO and RFC standards."
# Esp. use ASCII encoding for glob and sorting characters.
shopt -s extglob
# ^set extended glob for pattern matching.
set -e
# ^errors checked: immediate exit if a command has non-zero status.
set -u
# ^unassigned variables shall be errors.
# Example of default VARIABLE ASSIGNMENT: arg1=${1:-'foo'}
cmdarg="$*"
# ^all arguments joined into ONE string, separated by spaces.
program=${0##*/} # similar to using basename
# memf=$( mktemp /dev/shm/88_${program}_tmp.XXXXXXXXXX )
cleanup () {
     #  Delete temporary files, then optionally exit given status.
     local status=${1:-'0'}
#    rm -f $memf
     [ "$status" = '-1' ] ||  exit "$status"      #  thus -1 prevents exit.
} #--------------------------------------------------------------------
warn () {
     #  Message with basename to stderr.          Usage: warn "message"
     echo -e "\n !!  ${program}: $1 "  >&2
} #--------------------------------------------------------------------
die () {
     #  Exit with status of most recent command or custom status, after
     #  cleanup and warn.      Usage: command || die "message" [status]
     local status=${2:-"$?"}
     cleanup -1  &&   warn "$1"  &&  exit "$status"
} #--------------------------------------------------------------------
trap "die 'SIG disruption, but cleanup finished.' 114" 1 2 3 15
# Cleanup after INTERRUPT: 1=SIGHUP, 2=SIGINT, 3=SIGQUIT, 15=SIGTERM
#
# _______________ :: BEGIN Script ::::::::::::::::::::::::::::::::::::::::
testarg=${cmdarg//\{\}/}
# ^globally remove {} from cmdarg
# # We could make the use of placeholder {} mandatory...
# [ "$testarg" = "$cmdarg" ] && die "must use {} to replace argument." 115
# ... OR, we can add missing {} at the end for the user's convenience:
if [ "$testarg" = "$cmdarg" ] ; then
     cmdarg="$* {}"
fi
# __________ MAIN one-liner
#
# Delete all blank lines since as empty arguments they could be dangerous.
# Also delete comment lines beginning with '# ' (hash, then a space),
# so a line such as '#hashtag' is still passed through as an argument.
# sed is using standard input '-' here.
# Then, translate any newline to NULL CHARACTER.
sed -e '/^[[:blank:]]*$/d' -e '/^# /d' - \
| tr \\n \\0 \
| xargs -r -0 -I '{}' -n1 -P0 $cmdarg || die "given bad command." 113
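# -P0 Parallel execution for speedup...
# (don't expect serial arg ordering
# unless -P1 is used at commandline;
# later option will overrule).
# -I '{}' sets the replacement string.
# $cmdarg is left unquoted on purpose, so that it splits into
# the command and its separate arguments.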
# -r means don't execute command if no input was received.
# -0 Input items are terminated by a null character instead of by
# whitespace, and the quotes and backslash are not special (every character
# is taken literally). Disables the end of file string, which is treated
# like any other argument. Useful when input items might contain white space,
# quote marks, backslashes, or commas.
# -I replace-str [where we use {} as the replace-str]
# Replace occurrences of replace-str in the initial-arguments with names read
# from standard input. Also, unquoted blanks do not terminate input items;
# instead the separator is the newline character. Implies -x and -L 1.
# ___ATTN___ Thus this uses at most ONE NONBLANK INPUT LINE PER COMMAND LINE.
#
# GENERALLY, blank lines from standard input are ignored.
#
# Without replace-str, the argument simply gets tacked on the end.
# So for example, without {} this move would not be possible:
# % cat filenames.txt | argin mv {} /tmp
#
# The use of '{}' as placeholder is merely conventional;
# it could have been '@@ARGUMENT' or any other string.
# _____ Other options, esp. PARALLEL execution.
# Speedup can exceed 2x even on a single-core processor.
# Execution will not be in serial order of arguments.
#
# --max-procs=max-procs
# -P max-procs
# Run up to max-procs processes at a time; the default is 1. If
# max-procs is 0, xargs will run as many processes as possible
# at a time. Use the -n option with -P; otherwise chances are
# that only one exec will be done.
#
# --max-args=max-args
# -n max-args
# Use at most max-args arguments per command line. Fewer than
# max-args arguments will be used if the size (see the -s
# option) is exceeded, unless the -x option is given, in which
# case xargs will exit.
# [ Set to 1 without any loss of generality
# since -L1 is implied by using placeholder with -I. ]
cleanup
# _______________ EOS :: END of Script ::::::::::::::::::::::::::::::::::::::::
# _____ DECLARING FREEDOM from remembering these NULL options! :-)
#
# perl (requires -0 and \0 instead of \n)
# locate (requires using -0)
# find (requires using -print0)
# grep (requires -z or -Z)
# sort (requires using -z)
# vim: set fileencoding=utf-8 ff=unix tw=78 ai syn=sh :
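
The MAIN one-liner above can be tried in isolation. What follows is only a minimal sketch, not the script itself: the function name argin_sketch and the sample input are made up for illustration, GNU xargs is assumed (for -r), and -P1 is used instead of the script's -P0 so that the output order is predictable.

```shell
# Sketch of the argin prefilter; argin_sketch is a hypothetical name.
# Assumes GNU xargs (for -r), plus sed and tr.

argin_sketch () {
     #  Drop blank lines and '# ' comment lines, turn each newline into
     #  a NUL, then run the given command once per remaining line.
     #  -P1 (serial) keeps the output in input order for this demo.
     sed -e '/^[[:blank:]]*$/d' -e '/^# /d' \
          | tr '\n' '\0' \
          | xargs -r -0 -I '{}' -P1 "$@"
}

#  Made-up input: quotes, embedded spaces, a blank line, a comment line,
#  and a '#hashtag' line that is NOT treated as a comment.
sample=$'it\'s a "quoted" name.txt\n\n# a comment line\nplain.txt\n#hashtag'

#  Each surviving line arrives as exactly one argument:
printf '%s\n' "$sample" | argin_sketch printf '<%s>\n' '{}'
```

Three bracketed lines should come out: the quoted name with its spaces intact, plain.txt, and #hashtag. Swapping -P1 for -P0 restores the parallel default of argin, at the cost of output order.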