YES
- https://github.com/buriy/python-readability [content, title] works well
PYTHON2 ONLY
- https://github.com/codelucas/newspaper [content, title, authors, etc.] filed an issue
- https://github.com/seomoz/dragnet [content, comments, learning]
- https://github.com/grangier/python-goose [content] PR ready to be merged
MAYBE
- https://github.com/scrapinghub/extruct [framework]
- https://github.com/andreypopp/extracty [authors, title]
NO
- https://github.com/ziyan/spider [experimental] algorithm may be worth reading
- https://github.com/miso-belica/jusText [content] doesn't work
OTHER
PROPRIETARY
- diffbot
- import.io
- embed.ly