@gagarine
Last active December 12, 2015 08:39
Documentation store and annotate

Target

Storage

  • Store documents
  • Turn plain text into a structured version (XML) that captures its structure: title, chapter, paragraph, sentence, line, and any word we want to turn into a link
  • Store structured documents in a NoSQL database (or git?)
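As a rough illustration of the plain-text-to-XML step listed above, here is a minimal sketch using only the Python standard library. The element names (`document`, `paragraph`, `sentence`), the `id` scheme, and the function name `text_to_xml` are assumptions made for this example and do not come from any existing schema. The splitting rules are deliberately naive: a blank line ends a paragraph, and `.`, `!` or `?` followed by whitespace ends a sentence.

```python
import re
import xml.etree.ElementTree as ET


def text_to_xml(title, text):
    """Build a <document> tree of paragraphs and sentences from plain text."""
    doc = ET.Element("document")
    ET.SubElement(doc, "title").text = title
    # Assumption: paragraphs are separated by one or more blank lines.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    for p_index, paragraph in enumerate(paragraphs, start=1):
        p_el = ET.SubElement(doc, "paragraph", id=f"p{p_index}")
        # Collapse line breaks, then split after ., ! or ? followed by whitespace.
        flat = " ".join(paragraph.split())
        sentences = re.split(r"(?<=[.!?])\s+", flat)
        for s_index, sentence in enumerate(sentences, start=1):
            s_el = ET.SubElement(p_el, "sentence", id=f"p{p_index}s{s_index}")
            s_el.text = sentence
    return doc


sample = "First sentence. Second sentence!\n\nAnother paragraph?"
print(ET.tostring(text_to_xml("Example", sample), encoding="unicode"))
```

Giving every element a stable id is one possible way to let annotations and links point at a precise place in the document, whichever store (NoSQL or git) ends up holding the XML.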

Annotations

  • Attach annotations to a document
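One possible way to attach an annotation is to record which element it targets and the character offsets of the quoted text inside that element. The sketch below assumes sentences are already stored under ids such as `p1s1`; the class names `Annotation` and `AnnotatedDocument` and all field names are hypothetical and are not the data model of any particular annotation tool.

```python
from dataclasses import dataclass, field


@dataclass
class Annotation:
    target_id: str  # hypothetical id of the annotated element, e.g. "p1s1"
    start: int      # character offset where the quoted text begins
    end: int        # character offset just past the quoted text
    note: str       # the annotation body


@dataclass
class AnnotatedDocument:
    sentences: dict  # maps element id -> sentence text
    annotations: list = field(default_factory=list)

    def annotate(self, target_id, quote, note):
        """Attach a note to the first occurrence of `quote` in the target."""
        text = self.sentences[target_id]
        start = text.find(quote)
        if start == -1:
            raise ValueError(f"{quote!r} not found in {target_id}")
        annotation = Annotation(target_id, start, start + len(quote), note)
        self.annotations.append(annotation)
        return annotation

    def quoted_text(self, annotation):
        """Return the text span an annotation points at."""
        text = self.sentences[annotation.target_id]
        return text[annotation.start:annotation.end]


doc = AnnotatedDocument({"p1s1": "Documents evolve with time."})
note = doc.annotate("p1s1", "evolve", "Laws are a typical example.")
print(doc.quoted_text(note))
```

Offsets like these break as soon as the underlying text changes, so a real system would probably need to tie each annotation to a specific document version as well.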

Version and history

  • Documents can have different versions, for example by language
  • Documents evolve over time (like laws)

Tools and methods

Annotator

http://okfnlabs.org/annotator/

Structuring document

How do we cut a plain text document into paragraphs, sentences and so on? Some texts also contain inline references (laws, the Bible).
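For the inline references, a regular expression may be enough for a first pass. The sketch below is only an assumption about what such references look like: it recognises Bible-style citations such as `John 3:16` or `1 Corinthians 13:4-7` and law-style citations written as `Art. 12`. Real corpora will need a richer pattern, and the function name `find_references` is made up for this example.

```python
import re

# Assumed reference shapes: "Art. 12", "John 3:16", "1 Corinthians 13:4-7".
REFERENCE = re.compile(
    r"\b(?:Art\.\s?\d+"                            # law article, e.g. "Art. 12"
    r"|(?:[1-3]\s)?[A-Z][a-z]+\s\d+:\d+(?:-\d+)?"  # book chapter:verse(-verse)
    r")\b"
)


def find_references(text):
    """Return (start, end, matched_text) for each inline reference found."""
    return [(m.start(), m.end(), m.group(0)) for m in REFERENCE.finditer(text)]


sample = "See John 3:16 and 1 Corinthians 13:4-7, as well as Art. 12."
for start, end, ref in find_references(sample):
    print(start, end, ref)
```

Note that a reference like `Art. 12` contains a full stop, so a naive sentence splitter would cut it in two. Detecting references before splitting sentences is one way around that.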

Text processing

https://github.com/knipknap/Gelatin

Natural Language processing

https://code.google.com/p/nltk/

Bible specific
