@choryuidentify
Created February 8, 2018 01:25
bstidy.py
from ebooklib.plugins.base import BasePlugin
from bs4 import BeautifulSoup


def tidy_cleanup(content):
    # Parse the markup and re-serialize it with one tag per indented line.
    return BeautifulSoup(content, "html.parser", from_encoding="UTF-8").prettify()


class TidyPlugin(BasePlugin):
    NAME = 'Tidy HTML using BeautifulSoup4'

    def html_before_write(self, book, chapter):
        # Skip chapters that have no content.
        if not chapter.content:
            return None
        return tidy_cleanup(chapter.content)

    def html_after_read(self, book, chapter):
        # Skip chapters that have no content.
        if not chapter.content:
            return None
        return tidy_cleanup(chapter.content)
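
To see what `tidy_cleanup` does to a chapter body, it can be run on its own without EbookLib. The following is a minimal sketch: the function is copied from bstidy.py, while the sample markup in `raw` is a made-up placeholder and not taken from any real book. The input is given as bytes so that `from_encoding` is actually used.

```python
from bs4 import BeautifulSoup


def tidy_cleanup(content):
    # Same call as in bstidy.py: parse, then re-serialize with indentation.
    return BeautifulSoup(content, "html.parser", from_encoding="UTF-8").prettify()


# Hypothetical chapter body (an assumption for illustration only).
raw = b"<html><body><h1>Title</h1><p>Hello <b>world</b></p></body></html>"

tidied = tidy_cleanup(raw)

# prettify() returns a str with tags and text split over indented lines.
print(tidied)
```

Note that `prettify()` inserts whitespace around inline elements, which may change how text renders in some readers. The plugin itself would presumably be registered through the `plugins` option, for example `epub.write_epub('out.epub', book, {'plugins': [TidyPlugin()]})`, where `out.epub` and `book` are placeholders.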