@meg-codes
Last active May 6, 2019 21:05
Audit cardholder list produced by mep_cardholders.py
# Collect the identifiers present in the data export: the last three
# path segments of each image URL.
with open('/tmp/sylviabeach-card-images.txt') as found_images:
    urls = found_images.read().split('\n')
identifiers = set()
for u in urls:
    identifiers.add('/'.join(u.split('/')[-3:]))

# Report every identifier in the PUDL listing that the export lacks.
with open('/tmp/pudl0123-825298-noboxes-sorted.txt') as image_list:
    with open('/tmp/missing-identifiers.txt', 'w') as missing_list:
        missing = []
        for line in image_list:
            if line.strip() not in identifiers:
                missing.append(line)
        missing_list.write('Found %d identifiers listed by PUDL not in data export\n' % len(missing))
        for line in missing:
            missing_list.write(line)
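
The comparison depends on how an identifier is derived from each URL, so a small sketch of that step may help. The helper name `card_identifier` and the example URL are hypothetical, used only for illustration; the real URLs in the export may look different.

```python
def card_identifier(url):
    """Return the last three '/'-separated segments of a URL, rejoined with '/'."""
    return '/'.join(url.split('/')[-3:])


# Hypothetical URL, not taken from the actual export.
example_url = 'https://example.org/iiif/box1/folder2/00000003.jp2'
print(card_identifier(example_url))

# An empty string (for example, the entry produced by a trailing newline
# when splitting the file on '\n') maps to an empty identifier.
print(repr(card_identifier('')))
```

Because an empty line maps to an empty identifier, a trailing newline in the URL file adds `''` to the set of known identifiers, so blank lines in the PUDL listing are never reported as missing.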