Skip to content

Instantly share code, notes, and snippets.

@shaunagm
Created July 9, 2022 18:54
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save shaunagm/293846720e5dad4731856d4b7bed5ac5 to your computer and use it in GitHub Desktop.
Save shaunagm/293846720e5dad4731856d4b7bed5ac5 to your computer and use it in GitHub Desktop.
Sample code for the re:power parsons training
"""
This file functions as a brief introduction to Parsons. To install Parsons, follow this
guide: https://parsons.pubpub.org/installation. If you're used to installing with pip, the tldr is
"pip install parsons".
"""
# print("Let's do this!")
"""
This is a Parsons Table. It's how we store data using Parsons. You can create a Table from a variety of sources,
like a regular old Python dictionary, a csv of data, a Pandas dataframe, a local database, or
from a third-party platform.
"""
# from parsons import Table
# table = Table.from_csv("testdata.csv") # won't work without a testdata.csv file in this directory
# table = Table.from_postgres("SELECT * FROM people WHERE registration == 'confirmed'") # won't work without postgres connection data
# table = Table([{"a": 1, "b": 2}, {"a": 10, "b": 20}])
# print(table)
"""
To get data from a third-party platform, you'll use connectors.
"""
# from parsons import MobilizeAmerica, GoogleSheets, VAN, Redshift
# mobilize = MobilizeAmerica()
# mobilize.uri = "https://staging-api.mobilize.us/v1/" # only necessary since we're using a "developer sandbox"
# mobilize_people = mobilize.get_people(organization_id=182)
# print(mobilize_people)
"""
A Parsons Table has some convenience methods. You can also access underlying PETL methods with .table
"""
# print(mobilize_people.columns)
# print(mobilize_people.num_rows)
# print(mobilize_people.table.look(limit=10)) # 'look' displays the first x rows
"""
Parsons is most useful for moving data from one place to another.
"""
# google_sheets = GoogleSheets()
# sheet_id = google_sheets.create_spreadsheet('Mobilize Data', editor_email="shaunagm@gmail.com")
# sheet_id = "1s3B7hqIIRRgmAog2XPf706owvzUzL5ZqHEG_B3Ej0QA"
# mobilize_basic_data = mobilize_people.cut(["given_name", "family_name", "id"])
# google_sheets.overwrite_sheet(sheet_id, mobilize_basic_data)
"""
Parsons Tables standardize data from different sources, but you might still want to do some transforms
along the way. You can apply transformations via functions.
For example, this transformation which changes the created date timestamp to be more human readable.
"""
# from datetime import datetime
# def convert_to_legible_date(unix_date):
# return datetime.utcfromtimestamp(int(unix_date)).strftime('%Y-%m-%d %H:%M:%S')
# mobilize_people.convert_column("created_date", convert_to_legible_date)
# print(mobilize_people)
"""
These methods are very helpful when translating data between platforms. For instance, Mobilizes uses list
structure for their contact fields, but Google sheet cells need a single value.
ie: 'email_addresses': [{'primary': True, 'address': 'email@email.com'}]
I used a "cut" helper method to exclude those columns above, but if I try the whole table, I get an error:
"""
# google_sheets.overwrite_sheet(sheet_id, mobilize_people)
"""
But I can write a transformation function that extracts the actual values:
"""
def get_primary_contact_or_first(contact_list):
extracted_contacts = []
for contact in contact_list:
if "number" in contact: selector = "number"
if "postal_code" in contact: selector = "postal_code"
if "address" in contact: selector = "address"
if contact["primary"]:
return contact[selector]
else:
extracted_contacts.append(contact[selector])
if extracted_contacts:
return extracted_contacts[0]
return None
# mobilize_people.convert_column('phone_numbers', get_primary_contact_or_first)
# mobilize_people.convert_column('postal_addresses', get_primary_contact_or_first)
# mobilize_people.convert_column('email_addresses', get_primary_contact_or_first)
"""
Once we've made those changes, we can push all of the data to Google Sheets.
"""
# google_sheets.overwrite_sheet(sheet_id, mobilize_people)
"""
You can also do row selections, and simple renames.
"""
# mobilize_people.rename_column("given_name", "first_name")
# created_this_year = mobilize_people.select_rows("'2022-' in {created_date}")
# print("Created this year: ")
# print(created_this_year)
# has_phone_number = mobilize_people.select_rows(lambda row: row.phone_numbers)
# print("Has phone number: ")
# print(has_phone_number)
"""
Complete Code Example: move all Mobilize users with phone numbers to Google Sheets
"""
# from parsons import MobilizeAmerica, GoogleSheets
# mobilize, google_sheets = MobilizeAmerica(), GoogleSheets()
# mobilize.uri = "https://staging-api.mobilize.us/v1/" # only necessary since we're using a "developer sandbox"
# mobilize_people = mobilize.get_people(organization_id=182)
# mobilize_people.convert_column('phone_numbers', get_primary_contact_or_first)
# mobilize_people.convert_column('postal_addresses', get_primary_contact_or_first)
# mobilize_people.convert_column('email_addresses', get_primary_contact_or_first)
# has_phone_number = mobilize_people.select_rows(lambda row: row.phone_numbers)
# google_sheets.overwrite_sheet("1s3B7hqIIRRgmAog2XPf706owvzUzL5ZqHEG_B3Ej0QA", has_phone_number)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment