Last active Jul 29, 2021
Converts images in a directory to a comic/manga EPUB3 ebook. Can be used to convert extracted CBZ/CBR to EPUB3.

Converts a directory of images into a modern EPUB3 ebook. Extract CBZ/CBR/CBT archives with a separate tool first, then run this program on the extracted images to generate a fixed-layout EPUB. You can optionally set the reading direction to right-to-left (e.g. for manga). For Kobo ereaders, give the output file the extension .kepub.epub to get the modern reader and the correct reading direction.


Install dependencies with pip install imagesize lxml

usage: [-h] [-t TITLE] [-a AUTHOR] [-i STORYID] [-d DIRECTION]
                      [-s SUBJECT] [-l LEVEL] [--pagelist PAGELIST]
                      [--toclist TOCLIST]
                      directory output

positional arguments:
  directory             Path to directory with images
  output                Output EPUB filename

optional arguments:
  -h, --help            show this help message and exit
  -t TITLE, --title TITLE
                        Title of the story
  -a AUTHOR, --author AUTHOR
                        Author of the story
  -i STORYID, --storyid STORYID
                        Story id (default: random)
  -d DIRECTION, --direction DIRECTION
                        Reading direction (ltr or rtl, default: ltr)
  -s SUBJECT, --subject SUBJECT
                        Subject of the story. Can be used multiple times.
  -l LEVEL, --level LEVEL
                        Compression level [0-9] (default: 9)
  --pagelist PAGELIST   Text file with list of images
  --toclist TOCLIST     Text file with table of contents


./ -t "Sailor Moon #1" -a "Naoko Takeuchi" -s "Magical Girl" -s "Manga" -d rtl images/ sailormoon1.epub

Advanced usage

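You can specify a pagelist: a text file that lists image filenames (relative to the image directory), one per line, in the desired page order. Blank lines are ignored.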
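
As a rough illustration of how a pagelist file could be read, here is a minimal sketch. The helper name `read_pagelist` and the filenames are hypothetical and chosen only for this example; the script itself reads the file inline.

```python
import os
import tempfile

def read_pagelist(pagelist_path):
    """Return the stripped, non-blank lines of a pagelist file, in file order."""
    with open(pagelist_path, encoding='utf-8') as pagelist:
        return [line.strip() for line in pagelist if line.strip()]

# Hypothetical example: the filenames below are made up for illustration.
with tempfile.TemporaryDirectory() as tmp:
    example_path = os.path.join(tmp, 'pagelist.txt')
    with open(example_path, 'w', encoding='utf-8') as f:
        f.write('cover.jpg\n\npage-b.png\n  page-a.png  \n')
    pages = read_pagelist(example_path)

print(pages)  # ['cover.jpg', 'page-b.png', 'page-a.png']
```

Because the order comes from the file rather than from sorting, a pagelist is a simple way to override the default alphabetical page order.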





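You can specify a table of contents (EPUB navigation metadata). Non-blank lines are read in pairs: a chapter title, followed by the filename of the image where that chapter starts. Blank lines are ignored: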



Chapter One
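
Judging from the script's `--toclist` handling, non-blank lines appear to be read in pairs: a chapter title, then the filename of the image that opens the chapter. A minimal sketch of that pairing follows; the function name `parse_toclist` and all filenames are hypothetical.

```python
def parse_toclist(lines, image_files):
    """Pair each title line with the filename line that follows it.

    Returns a list of (page_index, title) tuples, where page_index is the
    position of the chapter's first image in image_files.
    """
    toc = []
    title = None
    for line in lines:
        item = line.strip()
        if not item:
            continue  # blank lines are ignored
        if title is None:
            title = item  # first line of a pair: the chapter title
        else:
            # Second line of a pair: the image that opens the chapter.
            # list.index raises ValueError if the image is not a known page.
            toc.append((image_files.index(item), title))
            title = None
    return toc

# Hypothetical page list and toclist contents, for illustration only.
image_files = ['000.jpg', '001.jpg', '002.jpg', '003.jpg']
toclist_lines = ['Chapter One\n', '001.jpg\n', '\n', 'Chapter Two\n', '003.jpg\n']
toc = parse_toclist(toclist_lines, image_files)
print(toc)  # [(1, 'Chapter One'), (3, 'Chapter Two')]
```

The page index is what lets each table-of-contents entry link to the matching `page-NNN.xhtml` file.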
#!/usr/bin/env python3
import sys
from os import listdir, path
from lxml import etree
from html import escape
from uuid import uuid4
import argparse
import datetime
import zipfile
import imagesize
parser = argparse.ArgumentParser()
parser.add_argument('-t', '--title', help='Title of the story', default="Unknown Title")
parser.add_argument('-a', '--author', help='Author of the story', default="Unknown Author")
parser.add_argument('-i', '--storyid', help='Story id (default: random)', default='urn:uuid:' + str(uuid4()))
parser.add_argument('-d', '--direction', help='Reading direction (ltr or rtl, default: ltr)', default='ltr')
parser.add_argument('-s', '--subject', help='Subject of the story. Can be used multiple times.', action='append', default=[])
parser.add_argument('-l', '--level', help='Compression level [0-9] (default: 9)', default=9, type=int)
parser.add_argument('--pagelist', help='Text file with list of images')
parser.add_argument('--toclist', help='Text file with table of contents')
parser.add_argument('directory', help='Path to directory with images')
parser.add_argument('output', help='Output EPUB filename')
args = parser.parse_args()
if args.direction != 'rtl':
    args.direction = 'ltr'
UID_FORMAT = '{:03d}'
# XML namespaces used in the OPF package document
NAMESPACES = {'OPF': 'http://www.idpf.org/2007/opf',
              'DC': 'http://purl.org/dc/elements/1.1/'}
CONTAINER_PATH = 'META-INF/container.xml'
CONTAINER_XML = '''<?xml version='1.0' encoding='utf-8'?>
<container xmlns="urn:oasis:names:tc:opendocument:xmlns:container" version="1.0">
  <rootfiles>
    <rootfile media-type="application/oebps-package+xml" full-path="OEBPS/content.opf"/>
  </rootfiles>
</container>
'''
IBOOKS_DISPLAY_OPTIONS_XML = '''<?xml version="1.0" encoding="UTF-8"?>
<display_options>
  <platform name="*">
    <option name="fixed-layout">true</option>
    <option name="open-to-spread">false</option>
  </platform>
</display_options>
'''
IMAGESTYLE_CSS = '''
@page {
  padding: 0;
  margin: 0;
}
body {
  padding: 0;
  margin: 0;
  height: 100%;
}
#image {
  width: 100%;
  height: 100%;
  display: block;
  margin: 0;
  padding: 0;
}
'''
# Supported image extensions and their media types
IMAGE_TYPES = {
    'jpeg': 'image/jpeg',
    'jpg': 'image/jpeg',
    'png': 'image/png',
    'svg': 'image/svg+xml'
}
def image2xhtml(imgfile, width, height, title, epubtype='bodymatter', lang='en'):
    # Wrap a single image in a fixed-layout XHTML page using an SVG viewport
    content = '''<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops" lang="{lang}">
<head>
  <meta name="viewport" content="width={width}, height={height}"/>
  <title>{title}</title>
  <link rel="stylesheet" type="text/css" href="imagestyle.css"/>
</head>
<body epub:type="{epubtype}">
  <svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" id="image" version="1.1" viewBox="0 0 {width} {height}"><image width="{width}" height="{height}" xlink:href="{filename}"/></svg>
</body>
</html>
'''.format(width=width, height=height,
           filename=escape(imgfile), title=escape(title),
           epubtype=epubtype, lang=lang)
    return content
def create_opf(title, author, bookId, imageFiles):
    package_attributes = {'xmlns': NAMESPACES['OPF'],
                          'unique-identifier': 'bookId',
                          'version': '3.0',
                          'prefix': 'rendition: http://www.idpf.org/vocab/rendition/#',
                          'dir': args.direction}
    nsmap = {'dc': NAMESPACES['DC'], 'opf': NAMESPACES['OPF']}
    root = etree.Element('package', package_attributes)

    # metadata
    metadata = etree.SubElement(root, 'metadata', nsmap=nsmap)
    el = etree.SubElement(metadata, 'meta', {'property': 'dcterms:modified'})
    # The trailing 'Z' denotes UTC, so take the timestamp in UTC
    el.text = datetime.datetime.now(datetime.timezone.utc).strftime('%Y-%m-%dT%H:%M:%SZ')
    el = etree.SubElement(metadata, '{' + NAMESPACES['DC'] + '}identifier', {'id': 'bookId'})
    el.text = bookId
    el = etree.SubElement(metadata, '{' + NAMESPACES['DC'] + '}title')
    el.text = title
    el = etree.SubElement(metadata, '{' + NAMESPACES['DC'] + '}creator', {'id': 'creator'})
    el.text = author
    el = etree.SubElement(metadata, 'meta', {'refines': '#creator', 'property': 'role', 'scheme': 'marc:relators'})
    el.text = 'aut'
    el = etree.SubElement(metadata, '{' + NAMESPACES['DC'] + '}language')
    el.text = 'en'
    for subject in args.subject:
        el = etree.SubElement(metadata, '{' + NAMESPACES['DC'] + '}subject')
        el.text = subject
    etree.SubElement(metadata, 'meta', {'name': 'cover', 'content': 'img-' + UID_FORMAT.format(0)})
    el = etree.SubElement(metadata, 'meta', {'property': 'rendition:layout'})
    el.text = 'pre-paginated'
    el = etree.SubElement(metadata, 'meta', {'property': 'rendition:orientation'})
    el.text = 'portrait'
    el = etree.SubElement(metadata, 'meta', {'property': 'rendition:spread'})
    el.text = 'landscape'
    # The first image (the cover) determines the declared resolution
    width, height = imagesize.get(path.join(args.directory, imageFiles[0]))
    # width, height = (-1, -1)
    etree.SubElement(metadata, 'meta', {'name': 'original-resolution', 'content': str(width) + 'x' + str(height)})

    # manifest
    manifest = etree.SubElement(root, 'manifest')
    etree.SubElement(manifest, 'item', {
        'href': 'imagestyle.css',
        'id': 'imagestyle',
        'media-type': 'text/css'
    })
    for i, img in enumerate(imageFiles):
        uid = UID_FORMAT.format(i)
        ext = path.splitext(img)[1][1:]
        imgattrs = {
            'href': 'images/page-' + uid + '.' + ext,
            'id': 'img-' + uid,
            'media-type': IMAGE_TYPES[ext],
        }
        if i == 0:
            imgattrs['properties'] = 'cover-image'
        etree.SubElement(manifest, 'item', imgattrs)
        etree.SubElement(manifest, 'item', {
            'href': 'page-' + uid + '.xhtml',
            'id': 'page-' + uid,
            'media-type': 'application/xhtml+xml',
            'properties': 'svg'
        })
    etree.SubElement(manifest, 'item', {
        'href': 'toc.ncx',
        'id': 'ncxtoc',
        'media-type': 'application/x-dtbncx+xml',
    })
    etree.SubElement(manifest, 'item', {
        'href': 'toc.xhtml',
        'id': 'toc',
        'media-type': 'application/xhtml+xml',
        'properties': 'nav'
    })

    # spine
    spine = etree.SubElement(root, 'spine', {
        'toc': 'ncxtoc',
        'page-progression-direction': args.direction
    })
    for i, img in enumerate(imageFiles):
        uid = UID_FORMAT.format(i)
        # Alternate spread sides; which side comes first depends on the reading direction
        props = 'page-spread-left'
        if (i % 2 == 0 and args.direction == 'ltr') or (i % 2 != 0 and args.direction == 'rtl'):
            props = 'page-spread-right'
        etree.SubElement(spine, 'itemref', {
            'idref': 'page-' + uid,
            'properties': props
        })
    tree_str = etree.tostring(root, pretty_print=True, encoding='utf-8', xml_declaration=True)
    return tree_str
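def create_ncx(title, author, book_id):
    # Legacy NCX table of contents with a single entry pointing at the first page
    return '''<?xml version="1.0" encoding="utf-8" standalone="no"?>
<ncx:ncx xmlns:ncx="http://www.daisy.org/z3986/2005/ncx/" version="2005-1">
  <ncx:head>
    <ncx:meta name="dtb:uid" content="{book_id}"/>
    <ncx:meta name="dtb:depth" content="1"/>
    <ncx:meta name="dtb:totalPageCount" content="0"/>
    <ncx:meta name="dtb:maxPageNumber" content="0"/>
  </ncx:head>
  <ncx:docTitle><ncx:text>{title}</ncx:text></ncx:docTitle>
  <ncx:docAuthor><ncx:text>{author}</ncx:text></ncx:docAuthor>
  <ncx:navMap>
    <ncx:navPoint id="p1" playOrder="1">
      <ncx:navLabel><ncx:text>{title}</ncx:text></ncx:navLabel>
      <ncx:content src="page-000.xhtml"/>
    </ncx:navPoint>
  </ncx:navMap>
</ncx:ncx>
'''.format(title=escape(title), author=escape(author), book_id=book_id)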
def create_nav(title, page_count):
    pages = [None] * page_count
    for i, page in enumerate(pages):
        uid = UID_FORMAT.format(i)
        pages[i] = '        <li><a href="page-{uid}.xhtml">{page_number}</a></li>'.format(uid=uid, page_number=i)
    # Default: a single entry pointing at the first page
    toc = [(0, title)]
    if args.toclist:
        toc = []
        # Non-blank lines come in pairs: chapter title, then image filename.
        # A separate variable keeps the book title intact for the <title> element.
        chapter_title = ""
        with open(args.toclist) as toclist:
            for item in toclist:
                if item.strip():
                    if not chapter_title:
                        chapter_title = item.strip()
                    else:
                        img = item.strip()
                        toc.append((imageFiles.index(img), chapter_title))
                        chapter_title = ""
    tochtml = []
    for item in toc:
        (i, name) = item
        print(i, name)
        uid = UID_FORMAT.format(i)
        tochtml.append('        <li epub:type="chapter"><a href="page-{uid}.xhtml">{name}</a></li>'.format(uid=uid, name=escape(name)))
    return '''<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops" lang="en">
<head>
  <title>{title}</title>
</head>
<body>
  <section class="frontmatter" epub:type="frontmatter toc">
    <h1>Table of Contents</h1>
    <nav epub:type="toc" id="toc">
      <ol>
{toc}
      </ol>
    </nav>
    <nav epub:type="page-list">
      <ol>
{pages}
      </ol>
    </nav>
  </section>
</body>
</html>'''.format(pages='\n'.join(pages), toc='\n'.join(tochtml), title=escape(title))
if not args.pagelist:
    # No pagelist given: use every file in the directory, sorted by name
    imageFiles = sorted([f for f in listdir(args.directory) if path.isfile(path.join(args.directory, f))])
else:
    imageFiles = []
    with open(args.pagelist) as pagelist:
        for page in pagelist:
            if page.strip():
                imageFiles.append(page.strip())
# Keep only files with a supported image extension
imageFiles = list(filter(lambda img: path.splitext(img)[1][1:] in IMAGE_TYPES, imageFiles))
if len(imageFiles) < 1:
    print('Too few images:', len(imageFiles))
    sys.exit(1)
print('Found ' + str(len(imageFiles)) + ' pages.')
prev_compression = zipfile.zlib.Z_DEFAULT_COMPRESSION
zipfile.zlib.Z_DEFAULT_COMPRESSION = args.level
output = zipfile.ZipFile(args.output, 'w', zipfile.ZIP_DEFLATED)
# The mimetype entry must come first and be stored uncompressed
output.writestr('mimetype', 'application/epub+zip', compress_type=zipfile.ZIP_STORED)
output.writestr(CONTAINER_PATH, CONTAINER_XML)
# Apple Books reads its display options from this fixed path
output.writestr('META-INF/com.apple.ibooks.display-options.xml', IBOOKS_DISPLAY_OPTIONS_XML)
output.writestr('OEBPS/content.opf', create_opf(args.title, args.author, args.storyid, imageFiles))
output.writestr('OEBPS/toc.ncx', create_ncx(args.title, args.author, args.storyid))
output.writestr('OEBPS/toc.xhtml', create_nav(args.title, len(imageFiles)))
output.writestr('OEBPS/imagestyle.css', IMAGESTYLE_CSS)
for i, img in enumerate(imageFiles):
    uid = UID_FORMAT.format(i)
    title = 'Page ' + str(i)
    ext = path.splitext(img)[1][1:]
    epubtype = 'bodymatter'
    if i == 0:
        title = 'Cover'
        epubtype = 'cover'
    if ext == 'svg':
        # imagesize cannot measure SVG files
        width, height = (-1, -1)
    else:
        width, height = imagesize.get(path.join(args.directory, img))
    print(str(round(i/len(imageFiles)*100)) + '%', 'Processing page ' + str(i+1) + ' of ' + str(len(imageFiles)) + ': ' + img, '(' + str(width) + 'x' + str(height) + ')')
    html = image2xhtml('images/page-' + uid + '.' + ext, width, height, title, epubtype, 'en')
    output.writestr('OEBPS/page-{uid}.xhtml'.format(uid=uid), html)
    output.write(path.join(args.directory, img), 'OEBPS/images/page-' + uid + '.' + ext)
output.close()
zipfile.zlib.Z_DEFAULT_COMPRESSION = prev_compression
print('Complete! Saved EPUB as ' + args.output)
imkh commented Jul 29, 2021

Thanks for this script @daniel-j! Especially for manga, it's the only way I've found that results in a nicely formatted EPUB to read in Apple Books (Calibre completely messes up the layout, I'm guessing because its EPUB conversion doesn't support fixed-layout).

It'd be great if the script would also check if there are landscape images and split them in half, otherwise Apple Books only shows half of the image. I created my own script for this to run before yours:

If I have a small request: how difficult would it be to add support for nested chapters in the table of contents file? Something like:

Volume 1

	Chapter 1

	Chapter 2

Volume 2

	Chapter 1

	Chapter 2

I tried to add it myself but my Python skills are just too limited 😬
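
One possible starting point for the nested format above, as a sketch only: it assumes that nesting is expressed with leading tab characters and that each non-blank line is a title. The name `parse_nested_toc` is hypothetical, and how image filenames would be attached to each entry is left open, since the example above does not show it.

```python
def parse_nested_toc(lines):
    """Parse tab-indented titles into nested (title, children) tuples."""
    root = []
    # Stack of (depth, children_list); the sentinel at depth -1 is the top level.
    stack = [(-1, root)]
    for line in lines:
        if not line.strip():
            continue  # blank lines are ignored
        depth = len(line) - len(line.lstrip('\t'))  # number of leading tabs
        title = line.strip()
        # Pop back up to the nearest entry that is shallower than this line.
        while stack[-1][0] >= depth:
            stack.pop()
        children = []
        stack[-1][1].append((title, children))
        stack.append((depth, children))
    return root

nested = parse_nested_toc([
    'Volume 1\n',
    '\tChapter 1\n',
    '\tChapter 2\n',
    '\n',
    'Volume 2\n',
    '\tChapter 1\n',
])
print(nested)
# [('Volume 1', [('Chapter 1', []), ('Chapter 2', [])]), ('Volume 2', [('Chapter 1', [])])]
```

Each level of the result could then be rendered as a nested `<ol>` inside the parent `<li>` of the navigation document.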

