@johnxx
Created August 24, 2018 21:01
Use html-xml-utils like grep with CSS selectors
#!/bin/bash
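# Both tools ship in the html-xml-utils package; check for each one.
for tool in hxselect hxnormalize; do
    if ! command -v "$tool" > /dev/null; then
        echo "You need hxselect and hxnormalize" >&2
        echo "Run this: sudo apt-get install html-xml-utils" >&2
        exit 1
    fi
done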
hxselect="hxselect -s \n -i"
hxnormalize="hxnormalize -l 10000000 -x "
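# Parse options: -c prints only the content of each match; the first
# non-option argument is the CSS selector, and the rest are filenames.
while [ $# -gt 0 ]; do
    case "$1" in
        -c)
            hxselect_args="-c"
            shift
            ;;
        *)
            selector="$1"
            shift
            break
            ;;
    esac
done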
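if [ $# -gt 1 ]; then
    # Several files: prefix every matching line with its filename, like grep.
    for filename in "$@"; do
        results=$($hxnormalize "$filename" | $hxselect $hxselect_args "$selector")
        if [ -n "$results" ]; then
            # Read line by line so each match gets its own prefix.
            while IFS= read -r r; do
                printf '%s: %s\n' "$filename" "$r"
            done <<< "$results"
        fi
    done
elif [ $# -eq 1 ]; then
    # One file: print matches without a prefix.
    filename=$1
    $hxnormalize "$filename" | $hxselect $hxselect_args "$selector"
else
    # No file: read HTML from standard input.
    $hxnormalize | $hxselect $hxselect_args "$selector"
fi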
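
The multi-file branch above is meant to give grep-style `filename: match` output, one prefixed line per match. Here is a minimal sketch of that prefixing step on its own. `prefix_lines` is a hypothetical helper name, not part of the gist, and the `printf` call stands in for hxselect output so the sketch runs without html-xml-utils installed.

```bash
#!/bin/bash
# prefix_lines is a hypothetical helper, not part of the original script.
# It reads lines on stdin and prints each one as "label: line".
prefix_lines() {
    local label=$1 line
    while IFS= read -r line; do
        printf '%s: %s\n' "$label" "$line"
    done
}

# Stand-in for hxselect output: two matches, one per line.
printf '%s\n' '<a href="/x">x</a>' '<a href="/y">y</a>' | prefix_lines index.html
```

Reading with `while IFS= read -r` keeps leading whitespace and backslashes in each match intact, which a `for` loop over a quoted variable would not give you, since it runs only once over the whole string.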