Skip to content

Instantly share code, notes, and snippets.

View AlanReviews's full-sized avatar

Alan Zhou AlanReviews

View GitHub Profile
@etienned
etienned / extractdocx.py
Last active November 21, 2022 13:56
Simple function to extract text from MS XML Word document (.docx) without any dependencies.
try:
from xml.etree.cElementTree import XML
except ImportError:
from xml.etree.ElementTree import XML
import zipfile
"""
Module that extract text from MS XML Word document (.docx).
(Inspired by python-docx <https://github.com/mikemaccana/python-docx>)