Created
May 6, 2019 13:54
Load a CSV file and use the content of each line for a new MediaWiki page. The resulting XML can be imported into MediaWiki, so batch-creating wiki pages from a CSV should work.
import pandas as pd
from xml.sax.saxutils import escape

# Read the CSV; adjust the separator to match your file.
df = pd.read_csv('mydata.csv', sep=';')


def convert_row(row):
    """
    Create a <page> node for one line of the CSV.
    Define variables from the CSV columns and reference them in the
    returned string with curly braces.
    """
    # Escape &, < and > so that cell values cannot break the XML.
    column1 = escape(str(row.column1))
    column2 = escape(str(row.column2))
    return f"""
<page>
<title>{column1}</title>
<revision>
<model>wikitext</model>
<format>text/x-wiki</format>
<text>
This is a page about {column1} and {column2}.
</text>
</revision>
</page>
"""


root = """<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.mediawiki.org/xml/export-0.10/ http://www.mediawiki.org/xml/export-0.10.xsd" version="0.10" xml:lang="en">"""
rootclose = """</mediawiki>"""

# The with block closes the file once the XML has been written.
with open('result.xml', 'w', encoding='utf-8') as f:
    f.write(root + '\n'.join(df.apply(convert_row, axis=1)) + rootclose)
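
Because the script assembles the XML by string formatting, cell values containing `&` or `<` need care. As a rough alternative sketch (not part of the original gist), the same page structure could be built with `xml.etree.ElementTree`, which escapes text automatically. To keep the example dependency-free it reads the CSV with the standard-library `csv` module instead of pandas. The helper name `build_dump`, the sample data, and the column names `column1` and `column2` are assumptions for illustration.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Namespace copied from the root element used in the script above.
NS = "http://www.mediawiki.org/xml/export-0.10/"


def build_dump(rows):
    """Return an import document as a string, with one <page> per row.

    `rows` is an iterable of dicts with the (assumed) keys
    "column1" and "column2".
    """
    # Serialize NS as the default namespace instead of an ns0: prefix.
    ET.register_namespace("", NS)
    root = ET.Element(f"{{{NS}}}mediawiki", {"version": "0.10"})
    for row in rows:
        page = ET.SubElement(root, f"{{{NS}}}page")
        ET.SubElement(page, f"{{{NS}}}title").text = row["column1"]
        revision = ET.SubElement(page, f"{{{NS}}}revision")
        ET.SubElement(revision, f"{{{NS}}}model").text = "wikitext"
        ET.SubElement(revision, f"{{{NS}}}format").text = "text/x-wiki"
        text = ET.SubElement(revision, f"{{{NS}}}text")
        text.text = f"This is a page about {row['column1']} and {row['column2']}."
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample data standing in for mydata.csv.
sample = io.StringIO("column1;column2\nTom & Jerry;cats <and> mice\n")
xml_string = build_dump(csv.DictReader(sample, delimiter=";"))

# Parsing the result back confirms that it is well-formed XML.
parsed = ET.fromstring(xml_string)
titles = [t.text for t in parsed.iter(f"{{{NS}}}title")]
print(titles)
```

Parsing the generated string back, as in the last lines, is a cheap sanity check to run before handing a file to the MediaWiki importer.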