@a-chen
Last active June 19, 2023 21:01
Strips most of the markup from OpenLP song export XML files so that the songs are human-readable. The script is written in Python and uses the `lxml` library for XML parsing.

OpenLP song export XML stripper

Strips most of the markup from OpenLP song export XML files so that the songs are human-readable. The script is written in Python and uses the lxml library for XML parsing.

Example

Input file

H012 This Is My Father's World (Maltbie D. Babcock).xml

<?xml version='1.0' encoding='UTF-8'?>
<song xmlns="http://openlyrics.info/namespace/2009/song" version="0.8" createdIn="OpenLP 2.4.6" modifiedIn="OpenLP 2.4.6" modifiedDate="2020-07-05T16:12:57">
  <properties>
    <titles>
      <title>H012 This Is My Father's World</title>
      <title>H012 這是天父世界</title>
    </titles>
    <verseOrder>v1 o1 v2 o2 v3 o3 v4</verseOrder>
    <authors>
      <author>Maltbie D. Babcock</author>
    </authors>
    <songbooks>
      <songbook name="Hymnary" entry="12"/>
    </songbooks>
  </properties>
  <format>
    <tags application="OpenLP">
      <tag name="su">
        <open>&lt;sup&gt;</open>
        <close>&lt;/sup&gt;</close>
      </tag>
      <tag name="y">
        <open>&lt;span style="-webkit-text-fill-color:yellow"&gt;</open>
        <close>&lt;/span&gt;</close>
      </tag>
    </tags>
  </format>
  <lyrics>
    <verse name="v1">
      <lines><tag name="su">1/3</tag> This is my Father’s world,<br/>And to my listening ears<br/>All nature sings, and round me rings<br/>The music of the spheres.<br/><br/><tag name="y">這 是 天 父 世 界 ,<br/>我 們 側 耳 要 聽 ,<br/>宇 宙 歌 唱 , 四 圍 響 應 ,<br/>星 辰 作 樂 同 聲 .</tag></lines>
    </verse>
    <verse name="o1">
      <lines>This is my Father’s world;<br/>I rest me in the thought<br/>Of rocks and trees, of skies and seas―<br/>His hand the wonders wrought.<br/><br/><tag name="y">這 是 天 父 世 界 ,<br/>我 心 滿 有 安 寧 ;<br/>樹 木 花 草 蒼 天 碧 海<br/>述 說 天 父 全 能</tag></lines>
    </verse>
    <verse name="v2">
      <lines><tag name="su">2/3</tag> This is my Father’s world,<br/>The birds their carols raise,<br/>The morning light, the lily white,<br/>Declare their Maker’s praise.<br/><br/><tag name="y">這 是 天 父 世 界<br/>小 鳥 展 翅 飛 鳴<br/>清 晨 明 亮 好 花 美 麗<br/>證 明 天 理 精 深</tag></lines>
    </verse>
    <verse name="o2">
      <lines>This is my Father’s world:<br/>He shines in all that’s fair;<br/>In the rustling grass I hear Him pass, <br/>He speaks to me everywhere.<br/><br/><tag name="y">這 是 天 父 世 界<br/>祂 愛 普 及 萬 千<br/>風 吹 之 草 將 祂 表 現 <br/>天 父 充 滿 世 間</tag></lines>
    </verse>
    <verse name="v3">
      <lines><tag name="su">3/3</tag> This is my Father’s world,<br/>O let me ne’er forget<br/>That tho’ the wrong seems oft so strong,<br/>God is the Ruler yet.<br/><br/><tag name="y">這 是 天 父 世 界<br/>求 主 叫 我 不 忘<br/>罪 惡 雖 然 好 像 得 勝<br/>天 父 卻 仍 掌 管</tag></lines>
    </verse>
    <verse name="o3">
      <lines>This is my Father’s world:<br/>Why should my heart be sad?<br/>The Lord is King: let the heavens ring!<br/>God reigns: let earth be glad!<br/><br/><tag name="y">這 是 天 父 世 界<br/>我 心 不 必 憂 傷<br/>我 主 作 王  天 地 同 唱<br/>歌 聲 充 滿 萬 方</tag></lines>
    </verse>
    <verse name="v4">
      <lines/>
    </verse>
  </lyrics>
</song>

Output file

H012 This Is My Father's World (Maltbie D. Babcock).txt

Title(s):
H012 This Is My Father's World
H012 這是天父世界
Verse Order: v1 o1 v2 o2 v3 o3 v4
Author(s): Maltbie D. Babcock
Lyrics:
1/3 This is my Father’s world,And to my listening earsAll nature sings, and round me ringsThe music of the spheres.這 是 天 父 世 界 ,我 們 側 耳 要 聽 ,宇 宙 歌 唱 , 四 圍 響 應 ,星 辰 作 樂 同 聲 .

This is my Father’s world;I rest me in the thoughtOf rocks and trees, of skies and seas―His hand the wonders wrought.這 是 天 父 世 界 ,我 心 滿 有 安 寧 ;樹 木 花 草 蒼 天 碧 海述 說 天 父 全 能

2/3 This is my Father’s world,The birds their carols raise,The morning light, the lily white,Declare their Maker’s praise.這 是 天 父 世 界小 鳥 展 翅 飛 鳴清 晨 明 亮 好 花 美 麗證 明 天 理 精 深

This is my Father’s world:He shines in all that’s fair;In the rustling grass I hear Him pass, He speaks to me everywhere.這 是 天 父 世 界祂 愛 普 及 萬 千風 吹 之 草 將 祂 表 現 天 父 充 滿 世 間

3/3 This is my Father’s world,O let me ne’er forgetThat tho’ the wrong seems oft so strong,God is the Ruler yet.這 是 天 父 世 界求 主 叫 我 不 忘罪 惡 雖 然 好 像 得 勝天 父 卻 仍 掌 管

This is my Father’s world:Why should my heart be sad?The Lord is King: let the heavens ring!God reigns: let earth be glad!這 是 天 父 世 界我 心 不 必 憂 傷我 主 作 王 天 地 同 唱歌 聲 充 滿 萬 方
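
As the example shows, the lines within a verse are joined without a separator, because the `<br/>` elements carry no text of their own. If you would rather keep those line breaks, one possible approach is sketched below. It is not part of the script: `flatten` and `lines_to_text` are hypothetical helpers, and the sketch uses the standard library's `xml.etree.ElementTree` instead of lxml so that it runs without extra installs.

```python
import xml.etree.ElementTree as ET

# OpenLyrics namespace, as declared on the <song> element of the export
NS = "http://openlyrics.info/namespace/2009/song"
BR = f"{{{NS}}}br"


def flatten(elem):
    """Return the text inside elem, emitting a newline for every <br/>."""
    parts = ["\n" if elem.tag == BR else (elem.text or "")]
    for child in elem:
        parts.append(flatten(child))
        # Text that follows a child element is stored in that child's tail
        parts.append(child.tail or "")
    return "".join(parts)


def lines_to_text(lines_elem):
    """Flatten a <lines> element and tidy the whitespace on each line."""
    raw = flatten(lines_elem)
    return "\n".join(" ".join(line.split()) for line in raw.split("\n")).strip()


# Shortened sample modelled on verse v1 of the input file above
SAMPLE = (
    "<lines xmlns='http://openlyrics.info/namespace/2009/song'>"
    "<tag name='su'>1/3</tag> This is my Father's world,<br/>"
    "And to my listening ears<br/><br/>"
    "<tag name='y'>這 是 天 父 世 界 ,<br/>我 們 側 耳 要 聽 ,</tag>"
    "</lines>"
)

print(lines_to_text(ET.fromstring(SAMPLE)))
```

With this approach each `<br/>` becomes a line break and a double `<br/><br/>` becomes an empty line, while the formatting tags (`su`, `y`) are still dropped.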

Getting Started

Prerequisites

  • Python 3.x (Tested with Python 3.7, but newer versions should work as well)

  • pip (Python Package Installer)

You can check your Python version with:

python3 --version

Installing pip

If you don't already have pip installed, you can install it using the script provided by Python's package maintainers:

curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py

python3 get-pip.py

You can check your pip version with:

pip --version

Installing Dependencies

This project depends on the lxml library. You can install it using pip:

python3 -m pip install lxml

If you have multiple Python versions installed, replace python3 with the specific interpreter you will use to run the script, e.g., python3.7, so that lxml is installed for that interpreter.
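
To confirm that lxml is available to a particular interpreter, a quick check along these lines can be run with that interpreter. This is only a sketch; `is_installed` is a hypothetical helper name, not something provided by pip or lxml.

```python
import importlib.util
import sys


def is_installed(module_name):
    """Return True if this interpreter can locate the named top-level module."""
    return importlib.util.find_spec(module_name) is not None


# Show which interpreter is running and whether it can see lxml
print("Interpreter:", sys.executable)
print("lxml installed:", is_installed("lxml"))
```

If the second line prints False, the install command above was most likely run with a different interpreter.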

Running the Script

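  1. Create a directory named input in the directory you will run the script from, and put your XML files in it.
  2. Run the script. It reads every .xml file in the input directory and writes the extracted information to a text file of the same name in the output directory. The output directory is created automatically, and any files already in it are deleted before processing.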
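
Because the output directory is emptied on every run, it can be useful to preview which files would be written first. The sketch below only lists the mapping and touches nothing on disk; `planned_outputs` is a hypothetical helper and is not part of the script.

```python
from pathlib import Path


def planned_outputs(input_dir, output_dir):
    """Return (source, destination) pairs: each *.xml maps to a same-named .txt."""
    input_dir, output_dir = Path(input_dir), Path(output_dir)
    return [
        (src, output_dir / (src.stem + ".txt"))
        for src in sorted(input_dir.glob("*.xml"))
    ]


if __name__ == "__main__":
    # Assumes the same "input" and "output" directory names the script uses
    for src, dst in planned_outputs("input", "output"):
        print(f"{src} -> {dst}")
```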

After installing the dependencies, you can run the script with:

python3 strip-openlp-song-export-xml.py

strip-openlp-song-export-xml.py

import os
import glob
from lxml import etree as ET

# Input and output directories, relative to the current working directory
input_directory = os.path.join(os.getcwd(), 'input')
output_directory = os.path.join(os.getcwd(), 'output')

# Create output directory if it doesn't exist
os.makedirs(output_directory, exist_ok=True)

# Remove all existing files in output directory
for filename in os.listdir(output_directory):
    os.remove(os.path.join(output_directory, filename))

# Get all XML files in the input directory
files = glob.glob(os.path.join(input_directory, '*.xml'))

# OpenLyrics namespace used by the OpenLP export
ns = {'ns': 'http://openlyrics.info/namespace/2009/song'}

for file in files:
    # Parse XML file
    parser = ET.XMLParser(remove_blank_text=True)
    tree = ET.parse(file, parser)
    root = tree.getroot()

    # Prepare output, starting with the song titles (one per line)
    output = "Title(s):\n"
    titles = root.findall('.//ns:title', ns)
    for title in titles:
        output += (title.text or '').strip() + "\n"

    # Optional metadata: each field is written only if present and non-empty
    copyright = root.find('.//ns:copyright', ns)
    if copyright is not None and copyright.text:
        output += "Copyright: " + copyright.text.strip() + "\n"

    verseOrder = root.find('.//ns:verseOrder', ns)
    if verseOrder is not None and verseOrder.text:
        output += "Verse Order: " + verseOrder.text.strip() + "\n"

    ccliNo = root.find('.//ns:ccliNo', ns)
    if ccliNo is not None and ccliNo.text:
        output += "CCLI No.: " + ccliNo.text.strip() + "\n"

    # Authors on a single line, separated by commas
    authors = root.findall('.//ns:author', ns)
    if authors:
        output += "Author(s): "
        for i, author in enumerate(authors):
            output += (author.text or '').strip()
            if i < len(authors) - 1:
                output += ", "
            else:
                output += "\n"

    # Lyrics: one line of text per verse, followed by a blank line
    output += "Lyrics:\n"
    verses = root.findall('.//ns:verse', ns)
    for verse in verses:
        lines_in_verse = verse.findall('.//ns:lines', ns)
        for lines in lines_in_verse:
            if lines is not None:
                # method='text' keeps only the text content and drops all tags
                lines_text = ET.tostring(lines, method='text', encoding='utf-8').decode('utf-8')
                # No '<br/>' survives the text serialization above, so this replace
                # has no effect and line breaks within a verse are not preserved
                lines_text = lines_text.replace('<br/>', '\n').strip()
                # Collapse runs of whitespace into single spaces
                lines_text = ' '.join(lines_text.split())
                output += lines_text + "\n"
        output += "\n"

    # Write output to a text file in output directory
    txt_file_name = os.path.join(output_directory, os.path.splitext(os.path.basename(file))[0] + '.txt')
    # Write as UTF-8 explicitly so non-ASCII lyrics do not depend on the platform default
    with open(txt_file_name, 'w', encoding='utf-8') as f:
        f.write(output)

print("All files have been processed and saved!")