Skip to content

Instantly share code, notes, and snippets.

@shaunagm
Created October 2, 2023 19:37
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save shaunagm/9b7185f75ef692b777d3317f09025a16 to your computer and use it in GitHub Desktop.
Save shaunagm/9b7185f75ef692b777d3317f09025a16 to your computer and use it in GitHub Desktop.
Sample code for Oct 2 2023 training
"""
This file functions as a brief introduction to Parsons. To install Parsons, follow this
guide: https://parsons.pubpub.org/installation. If you're used to installing with pip, the tldr is
"pip install parsons".
"""
# print("Let's do this!")
"""
This is a Parsons Table. It's how we store data using Parsons. You can create a Table from a variety of sources,
like a regular old Python dictionary, a csv of data, a Pandas dataframe, a local database, or
from a third-party platform.
See the docs for this here: https://move-coop.github.io/parsons/html/latest/table.html#to-parsons-table
"""
# from parsons import Table
# table = Table.from_csv("testdata.csv") # won't work without a testdata.csv file in this directory
# table = Table.from_postgres("SELECT * FROM people WHERE registration == 'confirmed'") # won't work without postgres connection data
# table = Table([{"a": 1, "b": 2}, {"a": 10, "b": 20}])
# print(table)
"""
To get data from a third-party platform, you'll use connectors. There are nearly 50 connectors, from A to Z - ActBlue to Zoom.
Let's demonstrate getting some data from Action Network.
"""
# from parsons import ActionNetwork
# action_network = ActionNetwork()
"""
You should have just gotten an error. That's because when we work with third party platforms, we usually need to
authenticate ourselves. You can see in the Parsons documentation how to do so - each connector has authentication
instructions in a little blue box at the top of the page.
Action Network: https://move-coop.github.io/parsons/html/stable/action_network.html
"You can generate your key from the API & Sync page, located in the Start Organizing menu, under Details."
We can save authentication info by exporting an environmental variable (recommended) or setting a variable in our script.
"""
# from parsons import ActionNetwork
# api_token=""
# action_network = ActionNetwork(api_token=api_token)
# people = action_network.get_people()
# print(people)
"""
A Parsons Table has some convenience methods.
"""
# print("Columns: ", people.columns)
# print("Rows: ", people.num_rows)
# print("First row: ", people.first)
"""
There are also convenience methods for transforming data. We'll go into them in more detail soon, but for now,
let's just show one: cut, which allows us to select only the columns we care about.
"""
# columns = ['given_name', 'family_name', 'email_addresses', 'phone_numbers', 'postal_addresses', 'created_date']
# people = people.cut(columns) # Gets a subset of columns
"""
Parsons is most useful for moving data from one place to another.
Getting credentials from Google Sheets is very tedious so I have already gotten them and saved them as an
environmental variable:
export GOOGLE_DRIVE_CREDENTIALS="credentials.json"
"""
# from parsons import GoogleSheets
# google_sheets = GoogleSheets()
# sheet_id = google_sheets.create_spreadsheet('Action Network Data', editor_email="shaunagm@gmail.com")
"""
Once you've made your sheet, comment out the 'create_spreadsheet' line of code so you don't create it again.
Save the ID given on the command line to the variable sheet_id.
You can access your sheet in the browser by adding it to this url:
https://docs.google.com/spreadsheets/d/$SHEET_ID
Now you can move your Action Network Data to the sheet!
"""
# sheet_id = ""
# google_sheets.overwrite_sheet(sheet_id, people)
"""
Whoops! What went wrong?
Well, sometimes platforms give and take different structures. For instance, Google Sheets can't handle
nested data that we get from Action Network. We'll need to transform it with Parsons.
Action Network uses list structure for their contact fields, but Google sheet cells need a single value.
ie: 'email_addresses': [{'primary': True, 'address': 'email@email.com'}]
This custom helper method gets the primary value for each field.
"""
# def get_primary_contact_or_first(contact_list):
# extracted_contacts = []
# for contact in contact_list:
# if "number" in contact: selector = "number"
# if "postal_code" in contact: selector = "postal_code"
# if "address" in contact: selector = "address"
# if contact["primary"]:
# return contact[selector]
# else:
# extracted_contacts.append(contact[selector])
# if extracted_contacts:
# return extracted_contacts[0]
# return None
# people.convert_column('phone_numbers', get_primary_contact_or_first)
# people.convert_column('postal_addresses', get_primary_contact_or_first)
# people.convert_column('email_addresses', get_primary_contact_or_first)
"""
Pretty cool, right? But there's other transformations we might want to do.
For example, maybe we want to alphabetize the first letter of people's first and last names. Or maybe
we want to rename the column.
"""
# def capitalize_names(name):
# if name:
# name = name.capitalize()
# return name
# people.convert_column("given_name", capitalize_names)
# people.rename_column("given_name", "first_name")
"""
There's a lot of different transformations we can try out!
https://move-coop.github.io/parsons/html/stable/table.html#parsons-table-transformations
Another one that might be useful here is "deduplicate".
"""
# One key
people.deduplicate("email_addresses")
# Two or more keys
people.deduplicate(["given_name", "family_name"])
"""
You can also select only specific rows
"""
# created_last_year = people.select_rows("'2022-' in {created_date}")
# print("Created last year: ")
# print(created_last_year)
# has_phone_number = people.select_rows(lambda row: row.phone_numbers)
# print("Has phone number: ")
# print(has_phone_number)
"""
You can also access underlying PETL methods with .table
And, for a connector that's based on an underlying package, you can access it with (usually) .client
(This works for Google Sheets but not Action Network)
"""
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment