@josiahcarlson
Created February 26, 2012 01:46
Convert Word-style annotations to Agile Author style annotations for Manning books
'''convert_agile_author_annotations.py
Written February 25, 2012 by Josiah Carlson
Released into the public domain.
Did you write your annotations for Agile Author using the Word style?
<example id="...">
<title>title</title>
<programlisting>
def foo():
    line1 #A
    line2 #B
    line3 #B
#A annotation 1
#B annotation 2
</programlisting></example>
Do you now need to convert them using the Agile Author co, callout, and
calloutlist elements? I feel your pain. This module will pull the code and
annotations from inside the programlisting element, and using the id given
for the example element, produce the following output:
def foo():
    line1 <co id="...-A" linkends="...-Ar"/>
    line2 <co id="...-B" linkends="...-Br"/>
    line3 <co id="...-B2" linkends="...-Br"/>
</programlisting>
<calloutlist>
<callout id="...-Ar" arearefs="...-A"><para>annotation 1</para></callout>
<callout id="...-Br" arearefs="...-B ...-B2"><para>annotation 2</para></callout>
</calloutlist>
</example>
You can take this output and use it to replace everything from just after the
opening <programlisting> tag through the closing </example> tag.
Note: this code prints only the first annotated code listing it comes across,
then quits. This is on purpose, so that you can convert one example at a
time.
'''
from collections import defaultdict
import re
import sys
from BeautifulSoup import BeautifulStoneSoup as Soup
def main(files):
    for f in files:
        parsed = Soup(open(f, 'rb').read())
        for tag in parsed.findAll('example'):
            text = tag.find('programlisting').text
            if '#' not in text:
                # no Word-style annotations in this listing
                continue
            # The capturing group keeps the marker letters: even indexes are
            # code or annotation text, odd indexes are the letters themselves.
            pieces = re.split('#([A-Z])', text)
            id = [v for k,v in tag.attrs if k == 'id'][0]
            code_done = False
            counts = defaultdict(int)
            out1 = []
            for i, piece in enumerate(pieces):
                odd = i & 1
                if not code_done and not odd and piece[-1:] == '\n':
                    # a marker at the start of a line begins the annotations
                    code_done = True
                    out1.append(piece.rstrip() + '\n')
                    out1.append('</programlisting>\n')
                    out1.append('<calloutlist>\n')
                    continue
                if code_done:
                    if odd:
                        # one callout per letter, referencing every co it marked
                        out1.append(' <callout id="%s-%sr" arearefs="%s"><para>'%(
                            id, piece, ' '.join(
                                id+'-'+piece+(str(n+1) if n else '') for n in xrange(counts[piece]))))
                    else:
                        out1.append(' '.join(piece.split()) + '</para></callout>\n')
                else:
                    if odd:
                        # repeated letters get a numeric suffix: B, B2, B3, ...
                        counts[piece] += 1
                        out1.append('<co id="%s-%s%s" linkends="%s-%sr"/>'%(
                            id, piece, counts[piece] if counts[piece] > 1 else '', id, piece))
                    else:
                        out1.append(piece)
            out1.append('</calloutlist>\n</example>')
            print "".join(out1)
            # stop after the first converted listing, on purpose
            raise SystemExit

if __name__ == '__main__':
    if len(sys.argv) > 1:
        main(sys.argv[1:])
    else:
        print "usage: python convert_agile_author_annotations.py [agile author xml files]"
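
The script above targets Python 2 and BeautifulSoup 3. To try the conversion logic on its own, the sketch below restates the marker-splitting loop as a plain string function that also runs on Python 3. It is only an illustration, not part of the original script: the function name `convert_listing`, the parameter `example_id`, and the sample id `ex` are hypothetical, and the XML parsing step is left out, so the listing text is assumed to be already extracted.

```python
import re
from collections import defaultdict


def convert_listing(example_id, text):
    """Hypothetical helper: turn Word-style '#A' markers into co/callout markup."""
    # re.split with a capturing group keeps the marker letters, so even
    # indexes hold code or annotation text and odd indexes hold the letters.
    pieces = re.split(r'#([A-Z])', text)
    counts = defaultdict(int)  # how many code lines each letter has marked
    out = []
    code_done = False
    for index, piece in enumerate(pieces):
        is_marker = index % 2 == 1
        if not code_done and not is_marker and piece.endswith('\n'):
            # A marker that starts its own line begins the annotation list.
            code_done = True
            out.append(piece.rstrip() + '\n</programlisting>\n<calloutlist>\n')
        elif not code_done and is_marker:
            # Repeated letters get a numeric suffix: B, B2, B3, ...
            counts[piece] += 1
            suffix = str(counts[piece]) if counts[piece] > 1 else ''
            out.append('<co id="%s-%s%s" linkends="%s-%sr"/>'
                       % (example_id, piece, suffix, example_id, piece))
        elif not code_done:
            out.append(piece)
        elif is_marker:
            # One callout per letter, pointing back at every co it marked.
            refs = ' '.join(
                '%s-%s%s' % (example_id, piece, n + 1 if n else '')
                for n in range(counts[piece]))
            out.append('<callout id="%s-%sr" arearefs="%s"><para>'
                       % (example_id, piece, refs))
        else:
            # Annotation text: collapse whitespace and close the callout.
            out.append(' '.join(piece.split()) + '</para></callout>\n')
    out.append('</calloutlist>\n</example>')
    return ''.join(out)


if __name__ == '__main__':
    sample = ("def foo():\n"
              "    line1 #A\n"
              "    line2 #B\n"
              "    line3 #B\n"
              "#A annotation 1\n"
              "#B annotation 2\n")
    print(convert_listing('ex', sample))
```

Returning a string instead of printing and exiting makes the loop easy to check against the example in the docstring; the one-listing-at-a-time behavior of the original script is deliberately not reproduced here.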