@jwhb
Last active April 29, 2019 08:11
Search all PDF files in a directory (including subdirectories) and show every match for the specified search text.
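#!/bin/sh
# Search every PDF under <path> for <search_text> (case-insensitive grep pattern).
if [ "$#" -ne 2 ]; then
    echo "Usage: $0 <path> <search_text>" >&2
    exit 1
fi
# Pass the file name and the pattern as positional parameters so that quotes
# or other shell metacharacters in either cannot break the inner command.
find "$1" -name '*.pdf' -exec sh -c 'pdftotext -q "$1" - | grep --with-filename --label="$1" --color -i -- "$2"' sh {} "$2" \;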
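A note on quoting in the `-exec sh -c` call: a file name or search text that contains a quote character can change the command if it is pasted into the command string. Handing the value to the child shell as a positional parameter keeps it intact. A minimal sketch of that idea; `show_name` is a hypothetical helper for illustration and not part of the script:

```shell
#!/bin/sh
# show_name PATH: pass PATH to a child shell as "$1" instead of pasting it
# into the command string, then print it back unchanged.
# (show_name is a made-up name for this illustration.)
show_name() {
    sh -c 'printf "%s\n" "$1"' sh "$1"
}

# A name with a single quote and double quotes survives as-is.
show_name "it's a \"quoted\" name.pdf"
```

The same pattern works with `find ... -exec sh -c '...' sh {} \;`, where `find` substitutes the path for `{}` as a separate argument.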
jwhb commented Apr 29, 2019

Example

$ ../literature/search_pdf.sh ../literature/ "irrelevant"
../literature/DockerIntroAnalysisPerformance.pdf:present irrelevant overhead for CPU and memory execution.
../literature/DockerIntroAnalysisPerformance.pdf:KVM and docker present irrelevant overhead for CPU and
../literature/InformationSecurity_Stamp.pdf:ignore details that I deem irrelevant to the topic at hand. You can judge
../literature/ISEC-CIA.pdf:discussion irrelevant), e.g., that people may choose to act on other goals (e.g.,
../literature/RelationshipOfDevOpsToAgileLea_978-3-319-49094-6.pdf:duplicates and 55 irrelevant document types, such as glossaries, indexes and
../literature/whatisdevops_978-1-4503-4134-9.pdf:Irrelevant
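The file name in front of each match comes from grep's `--label` option: grep only sees the text on standard input, so without a label it would have no PDF name to report. A small sketch of that mechanism, assuming GNU grep (`--label` and `--with-filename` are GNU extensions); `label_matches` is a hypothetical helper, and `printf` stands in for the output of `pdftotext -q <file> -`:

```shell
#!/bin/sh
# label_matches LABEL PATTERN: read text on stdin and print matching lines
# prefixed with LABEL. Assumes GNU grep.
# (label_matches is a made-up name for this illustration.)
label_matches() {
    grep --with-filename --label="$1" -i -- "$2"
}

# Stand-in for extracted PDF text; only the second line matches.
printf 'first line\nAn Irrelevant detail\n' | label_matches 'doc.pdf' 'irrelevant'
# prints: doc.pdf:An Irrelevant detail
```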
