# coding: utf-8
import os
import re
import sys
import xml.etree.ElementTree as ET

# Script name without its extension, used in the version banner.
myname = os.path.splitext(sys.argv[0])[0]
print(myname, 'v0.2')
if len(sys.argv) < 4:
    print('usage: python', sys.argv[0], 'in.dat out.dat region1 [region2 region3 ...]')
    sys.exit(1)
indat = sys.argv[1]
outdat = sys.argv[2]
print('[in] ', indat)
print('[out]', outdat)
# Regions are matched case-insensitively.
search_regions = {region.lower() for region in sys.argv[3:]}
tree = ET.parse(indat)
root = tree.getroot()
# Capture the contents of each parenthesized group in a game name.
pattern = re.compile(r'\((.+?)\)')
# findall() returns a list, so removing children while looping is safe.
for child in root.findall('game'):
    name = child.attrib.get('name', '')
    # Keep the game only if some parenthesized group lists a requested region.
    if not any(
        search_regions.intersection(region.strip().lower() for region in regions.split(','))
        for regions in pattern.findall(name)
    ):
        root.remove(child)
print('count =', len(root.findall('game')))
tree.write(outdat)

fuzz6001 commented Feb 23, 2018

Usage

$ python RegionExtractor.py
RegionExtractor v0.2
usage: python RegionExtractor.py in.dat out.dat region1 [region2 region3 ...]

Europe only

$ python RegionExtractor.py ps2.dat ps2_europe.dat europe
RegionExtractor v0.2
[in]  ps2.dat
[out] ps2_europe.dat
count = 2767

USA & Japan

$ python RegionExtractor.py ps2.dat ps2_usa_japan.dat usa japan
RegionExtractor v0.2
[in]  ps2.dat
[out] ps2_usa_japan.dat
count = 4615
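
The region matching behind these commands can be sketched as a standalone function. This is a minimal illustration rather than the script itself: `matches_region`, `filter_games`, and the `SAMPLE_DAT` contents are hypothetical names and data made up for the example. It assumes, as the script does, that regions appear as comma-separated tags inside parentheses in each game's `name` attribute.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample data for illustration; not taken from a real dat file.
SAMPLE_DAT = """<datafile>
  <game name="Example Game A (Europe)"/>
  <game name="Example Game B (USA, Japan)"/>
  <game name="Example Game C (Japan) (Disc 1)"/>
</datafile>"""

# Captures the contents of each parenthesized group in a name.
PAREN_GROUP = re.compile(r'\((.+?)\)')


def matches_region(name, wanted):
    """Return True if any parenthesized group in name lists a wanted region."""
    for group in PAREN_GROUP.findall(name):
        tags = {tag.strip().lower() for tag in group.split(',')}
        if tags & wanted:
            return True
    return False


def filter_games(xml_text, regions):
    """Return the names of <game> entries that match any requested region."""
    wanted = {region.lower() for region in regions}
    root = ET.fromstring(xml_text)
    return [
        game.get('name', '')
        for game in root.findall('game')
        if matches_region(game.get('name', ''), wanted)
    ]


print(filter_games(SAMPLE_DAT, ['europe']))
# ['Example Game A (Europe)']
print(filter_games(SAMPLE_DAT, ['usa', 'japan']))
# ['Example Game B (USA, Japan)', 'Example Game C (Japan) (Disc 1)']
```

Passing several regions keeps a game if it matches any one of them, which is why the USA & Japan run returns titles tagged with either region.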

History

v0.2

Solved the region extraction issue.
You can extract all titles now!

v0.1

Initial release.
