Created
February 1, 2013 19:52
-
-
Save kcarnold/4693641 to your computer and use it in GitHub Desktop.
Download LaTeX source code from Google Drive, strip it to plain text without comments, and compile. Just set the Google Doc sharing mode to "Anyone with the link", and paste the unique part of that link into the appropriate place in gen.sh
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
import lxml.html | |
import sys | |
doc = lxml.html.fromstring(sys.stdin.read()) | |
for elt in doc.cssselect('a, div, style, title'): | |
elt.getparent().remove(elt) | |
s = u'\n'.join((elt.text_content() for elt in doc.cssselect('p, h1, h2, h3, h4, h5, h6'))) # add other nodes if I forgot any | |
tr = [ | |
(u'\u2018', "`"), | |
(u'\u2019', "'"), | |
(u'\u201c', "``"), | |
(u'\u201d', "''"), | |
(u'\xa0', ' '), # no-break space | |
] | |
for a, b in tr: | |
s = s.replace(a, b) | |
sys.stdout.write(s.encode('latin1')) |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
#!/bin/bash | |
set -e | |
function get() { | |
curl "https://docs.google.com/document/d/$2/export?format=html" | python dehtml.py > "$1".tex | |
} | |
NAME=paper | |
get $NAME THE_LINK_CODE | |
#get intro ANOTHER_CODE | |
#get results ANOTHER_CODE | |
get ending ANOTHER_CODE | |
pdflatex $NAME | |
bibtex $NAME | |
pdflatex $NAME | |
pdflatex $NAME |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
It would be quite easy to do this with a single Python script also.