@gabrielsimoes
Last active December 30, 2015 00:27
Download homilies from the Vatican website and save them as HTML
#!/bin/bash
link=$1
# Abort early when no URL was passed as the first argument.
if [[ -z $1 ]]
then
    echo "ERROR: no link" >&2
    exit 1
fi
#htmlheader="<?xml version=\"1.0\" encoding=\"utf-8\"?>
#<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.1//EN\"
# \"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd\">
#
#<html xmlns=\"http://www.w3.org/1999/xhtml\">
#<head>
#<title></title>
#</head>
#<body>
#"
htmlheader="<html>
<head>
<meta http-equiv=\"Content-Type\" content=\"text/html; charset=UTF-8\">
<title></title>
</head>
<body>
"
htmlfooter="</body>
</html>
"
curl "$link" > .tp1
cat .tp1 |
#sed 's/&laquo;&#x2005;/""/g' | sed 's/&#x2005;&raquo;/"/g' |
recode html..utf-8 |
sed 's/–/–/g' |
sed -e '/<!-- \/CONTENUTO DOCUMENTO -->/q' |
tac | sed -e '/<!-- CONTENUTO DOCUMENTO -->/q' |
sed "s/“/\"/g" |
sed "s/”/\"/g" |
sed "s/’/'/g" |
sed "s/…/.../g" > tp2.html
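# tp2.html holds the content in reverse line order, so the header is appended
# reversed too, and the whole file is then flipped back in one pass.
echo "$htmlheader" | tac >> tp2.html
tac tp2.html > page.html
# The footer goes at the end unreversed, into the same output file.
echo "$htmlfooter" >> page.html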
rm tp2.html .tp1
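
The script above isolates the document body by cutting at the end marker, reversing the lines with tac, cutting at the start marker, and reversing again later. As a rough alternative sketch, a single sed address range can print the same block in its original order without any reversal. The function name extract_content is hypothetical and not part of the original script, and the sketch assumes each marker appears exactly once in the page.

```shell
#!/bin/bash
# Hypothetical helper, not part of the original script.
# Reads HTML on stdin and prints every line from the start marker through the
# end marker (both marker lines included), keeping the original line order.
# Assumption: each marker occurs exactly once in the input.
extract_content() {
    sed -n '/<!-- CONTENUTO DOCUMENTO -->/,/<!-- \/CONTENUTO DOCUMENTO -->/p'
}
```

Under those assumptions it could stand in for the two marker-cutting sed calls and the tac steps, for example `recode html..utf-8 < .tp1 | extract_content > body.html`, where body.html is an illustrative file name. With repeated markers the two approaches can differ: the range starts at the first start marker, while the tac-based pipeline keeps only what follows the last start marker before the end marker.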