@kardeiz
Created July 25, 2012 17:41
Generate a simple sitemap for a DigiTool archive
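#!/bin/bash
# Build a plain-text sitemap from the published DigiTool collection pages.
htdocs="/exlibris/dtl/u3_1/dtle/apache/htdocs"
sitemap="$htdocs/sitemap.txt"
for i in {1..50}
do
  # Alternative: fetch the index over HTTP instead of reading it from disk.
  # ea=$(wget -O - --user-agent="botbotbot" "http://specoll.lib.tcu.edu/dtl_publish/$i/index.html" | grep -oh 'href="[0-9]*\.html"' | grep -oE '[0-9]+')
  # Extract the numeric page IDs linked from this collection's index page.
  ea=$(grep -oh 'href="[0-9]*\.html"' "$htdocs/dtl_publish/$i/index.html" | grep -oE '[0-9]+')
  for jj in $ea
  do
    ace="http://specoll.lib.tcu.edu/dtl_publish/$i/$jj.html"
    # Append the URL only if that exact line is not already in the sitemap.
    if ! grep -qxF -- "$ace" "$sitemap"
    then echo "$ace" >> "$sitemap"
    fi
  done
done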
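
The two steps the script relies on, extracting numeric page IDs from an index page and appending a URL only when it is not yet listed, can be tried in isolation. The following is a minimal sketch, not the author's method: the helper names `extract_ids` and `add_url`, the sample HTML, and the `example.org` URL are all hypothetical and exist only for illustration.

```shell
#!/bin/bash
# Hypothetical helper: print the numeric IDs linked as href="<digits>.html" in a file.
extract_ids() {
  grep -oh 'href="[0-9]*\.html"' "$1" | grep -oE '[0-9]+'
}

# Hypothetical helper: append a URL to a sitemap file unless that exact line exists.
# -x matches whole lines, -F treats the URL as a literal string, -q suppresses output.
add_url() {
  local url=$1 sitemap=$2
  grep -qxF -- "$url" "$sitemap" 2>/dev/null || echo "$url" >> "$sitemap"
}

# Demo on throwaway files (sample content is made up).
tmpdir=$(mktemp -d)
printf '<a href="101.html">A</a>\n<a href="202.html">B</a>\n<a href="about.html">C</a>\n' > "$tmpdir/index.html"

for id in $(extract_ids "$tmpdir/index.html"); do
  # Adding the same URL twice should leave only one copy in the sitemap.
  add_url "http://example.org/dtl_publish/1/$id.html" "$tmpdir/sitemap.txt"
  add_url "http://example.org/dtl_publish/1/$id.html" "$tmpdir/sitemap.txt"
done

cat "$tmpdir/sitemap.txt"
```

Matching whole lines literally matters here because a URL contains `.` characters, which a plain `grep` pattern would treat as wildcards, and because one URL can be a substring of another.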