rhortal / wxr2txt.py
Last active February 2, 2024 15:59 — forked from ruslanosipov/wxr2txt.py
Script to convert WordPress posts to plain text files. Works with Python 3.x and strips out HTML.
#!/usr/bin/env python3
"""Convert a WXR file into a set of plain text files.

WXR stands for "WordPress eXtended RSS", which is essentially a
regular XML file. This script extracts the entries from the WXR file
into separate plain text files. Output file names: the article name
prefixed by the date for posts, the article name alone for pages.
Usage: wxr2txt.py filename [-o output_dir]