@griloHBG
Last active May 25, 2022 15:42
Convert all PDF files in a directory (and all directories within) to text
# making sure that names with spaces won't be a problem
IFS=$'\n';
# head to remove the last 2 lines (one blank and one summary) and tail to remove the first line (always a dot) of tree output
# TODO: what about *.PDF?
for i in $(tree . -f -P "*.pdf" -i | head -n -2 | tail -n +2);
# echo the relative file path
do echo -n "${i}: ";
# if it is a file (and not a directory), perform the conversion
[ -f "${i}" ] && echo PDF && pdftotext "${i}" "${i}.txt";
done
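
The TODO about `*.PDF` could be handled by walking the tree with `find` instead of parsing `tree` output. The following is only a sketch of that alternative, not the original author's method. The function name `convert_pdfs` and the `PDF_CONVERTER` variable are hypothetical names introduced for illustration, and `-iname` assumes GNU or BSD `find` (it is not part of POSIX).

```shell
#!/usr/bin/env bash
# Sketch: convert every PDF under a root directory to text using find.
# convert_pdfs and PDF_CONVERTER are hypothetical names, not part of the original script.
convert_pdfs() {
    local root="${1:-.}"
    # Converter command; defaults to pdftotext as in the original script.
    local converter="${PDF_CONVERTER:-pdftotext}"
    local pdf
    # -type f skips directories; -iname matches .pdf and .PDF alike.
    # -print0 with read -d '' keeps paths containing spaces or newlines intact.
    while IFS= read -r -d '' pdf; do
        printf '%s\n' "${pdf}"
        # Same output naming as the original: file.pdf -> file.pdf.txt
        "${converter}" "${pdf}" "${pdf}.txt"
    done < <(find "${root}" -type f -iname '*.pdf' -print0)
}
```

Calling `convert_pdfs .` from the top-level directory would cover the same files as the loop above. Because `IFS` is only set for the `read` command, the rest of the shell session keeps its default word splitting.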