@chloemar10
Last active February 18, 2018 20:09

CHLOE MARTEN
QUANT HUMANISTS
SPRING 2018
05 02 2018

Assignment 2: Document Your Methodology, link to assignment

Introduction

This past week, I further developed the questions I seek to answer as part of my final self-tracking project. Overall, I aim to interpret what my financial data says about me by exploring the following questions:

  • What do I spend my money on?
    • Categorical Spends
  • Where do I spend money?
    • Geographic Location
    • Places/Vendors
  • How has my spending shifted over time?
    • Increased/Decreased Spending in Certain Categories
  • How have my financial habits changed over time?
    • Ratio of Money Spent
    • Ratio of Money Saved/Invested

Below, I outline my methodology for answering these questions.

Phase 1 - Data Gathering

To answer the above questions, I need to track my financial transactions. I have decided to automate the data gathering since I have limited time each day. To source my financial transaction data I will be using Plaid (https://plaid.com/products/transactions/), from which I can export JSON files. I am currently waiting on Plaid to grant me development access so that I can link my banking and credit card accounts. In the meantime, I used their sandbox data to get a feel for how it works and to set up my workflow. One concern I have about using Plaid is that it may not be able to source all my historical transactions. Once I am able to link my accounts, I need to figure out how far back I can go.

Phase 2 - Data Manipulation

Using the sandbox data Plaid provides, I went through the process of cleaning up the raw JSON data to isolate needed data points. In order to answer my questions, I will need to track the following transaction data:

  • Date
  • Amount/Balance
  • Name
  • Location
  • Category
    {
      "amount": -500,
      "category": [
        "Travel",
        "Airlines and Aviation Services"
      ],
      "date": "2018-01-24",
      "location": {
        "address": null,
        "city": null,
        "lat": null,
        "lon": null,
        "state": null,
        "store_number": null,
        "zip": null
      },
      "name": "United Airlines"
    }
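As a minimal sketch of the cleanup step, the snippet below reduces a record shaped like the sandbox sample above to just the data points listed. The helper name `flatten_transaction` and the output keys (`top_category`, `sub_category`, and so on) are my own assumptions for illustration, not part of Plaid's API.

```python
import json

# Sample record shaped like the sandbox transaction above.
raw = json.loads("""
{
  "amount": -500,
  "category": ["Travel", "Airlines and Aviation Services"],
  "date": "2018-01-24",
  "location": {"address": null, "city": null, "lat": null, "lon": null,
               "state": null, "store_number": null, "zip": null},
  "name": "United Airlines"
}
""")


def flatten_transaction(t):
    """Keep only the fields needed for analysis (hypothetical helper)."""
    location = t.get("location") or {}
    category = t.get("category") or []
    return {
        "date": t["date"],
        "amount": t["amount"],
        "name": t["name"],
        "city": location.get("city"),
        "state": location.get("state"),
        # First entry is the broad category, last entry the most specific one.
        "top_category": category[0] if category else None,
        "sub_category": category[-1] if len(category) > 1 else None,
    }


record = flatten_transaction(raw)
print(record)
```

Flattening the nested `location` and `category` values up front should make later grouping by place or category simpler, though the exact fields worth keeping may change once real account data is linked.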

This week, I worked on importing a JSON file from Plaid and then using Jupyter Notebook (http://jupyter.org/) to clean it up and identify the needed data points. I got as far as pulling out the “amount” and “date” fields, with the intention of plotting them in a line graph.

import datetime
import json

# Load the JSON file exported from Plaid
with open('transactions.json') as f:
    data = json.load(f)

# Pull out the amount and date of each transaction
transactions = data["transactions"]
amounts = [t["amount"] for t in transactions]
mydates = [datetime.datetime.strptime(t["date"], "%Y-%m-%d") for t in transactions]
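One possible next step toward the line graph, sketched here under assumptions: group the amounts by calendar month so there is one point per month in chronological order. The sample transactions and the helper name `monthly_totals` are hypothetical, and the sign convention of the amounts should be checked against the real data before interpreting the totals.

```python
import datetime
from collections import defaultdict

# Hypothetical sample in the same shape as data["transactions"].
transactions = [
    {"amount": 12.5, "date": "2018-01-03"},
    {"amount": -500, "date": "2018-01-24"},
    {"amount": 89.4, "date": "2018-02-10"},
    {"amount": 4.33, "date": "2018-02-11"},
]


def monthly_totals(transactions):
    """Sum amounts per calendar month, returned in chronological order."""
    totals = defaultdict(float)
    for t in transactions:
        day = datetime.datetime.strptime(t["date"], "%Y-%m-%d")
        totals[(day.year, day.month)] += t["amount"]
    return sorted(totals.items())


series = monthly_totals(transactions)
# x values: first day of each month; y values: rounded monthly totals.
months = [datetime.date(year, month, 1) for (year, month), _ in series]
totals = [round(total, 2) for _, total in series]
print(list(zip(months, totals)))
```

The `months` and `totals` lists can then be handed to a plotting library, for example `plt.plot(months, totals)` with matplotlib, once the chart design is settled.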

Phase 3 - Data Visualization

I also plan on using Jupyter to organize and plot my data but did not get to it this week. Before plotting, I need to first design the graphs/charts I intend on using to analyze my financial data.

Phase 4 - Data Analysis

The fourth phase is the actual analysis of my data for reflective insights. I imagine there will be a bit of trial and error in getting the right charts for visualization, which will probably lead me to tweak phases 2 and 3.

@joeyklee

Hi Chloe!

Super. Thanks for your thoughts.

How have my financial habits changed over time?
Ratio of Money Spent
Ratio of Money Saved/Invested

You might also look into your calendar to discover if there are events which trigger spending or not spending. These trends might reveal themselves naturally, but to relate them to things like birthdays or anticipated travel, etc will help contextualize your findings. You might also discover things like, “generally, I tend to do last minute shopping for birthdays, but plan way further ahead for the holidays” etc etc. Just subtle but possibly telling details.

Once I am able to link my accounts, I need to figure out how far back I can go.

If you want to go back further in history, you might consider downloading your financial transactions via your bank. I know at my bank I can get my data back as CSV. There are other services, like Intuit’s Mint, that also provide a breakdown of spending. I'm not sure exactly, but just a thought.
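A rough sketch of what reading such a bank export could look like, using only Python's standard library. The column names (`Date`, `Description`, `Amount`), the date format, and the helper name `load_bank_csv` are assumptions for illustration; real bank exports vary and would need to be adjusted accordingly.

```python
import csv
import datetime
import io

# Hypothetical export; real bank CSVs use different column names and date formats.
sample_csv = io.StringIO(
    "Date,Description,Amount\n"
    "2017-11-02,Grocery Store,-54.20\n"
    "2017-11-05,Paycheck,1500.00\n"
)


def load_bank_csv(handle):
    """Parse CSV rows into dicts with typed date and amount values."""
    rows = []
    for row in csv.DictReader(handle):
        rows.append({
            "date": datetime.datetime.strptime(row["Date"], "%Y-%m-%d").date(),
            "name": row["Description"],
            "amount": float(row["Amount"]),
        })
    return rows


history = load_bank_csv(sample_csv)
print(history[0])
```

With a real file, `open("export.csv", newline="")` would replace the in-memory sample.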

I also plan on using Jupyter to organize and plot my data but did not get to it this week. Before plotting, I need to first design the graphs/charts I intend on using to analyze my financial data.

Plotting in Python hasn’t always been so friendly (or beautiful), but there are lots of libraries now that are making it easier. Interactive plots are also possible in Jupyter notebooks, which is great.

For data munging:
Python Pandas: http://nbviewer.jupyter.org/github/jvns/pandas-cookbook/blob/v0.1/cookbook/Chapter%201%20-%20Reading%20from%20a%20CSV.ipynb
For mapping:
Cartoframes: https://carto.com/blog/inside/CARTOframes-python-interface-CARTO/
Interactive charts: there’s a bunch, but just a suggestion
https://plot.ly/python/

It is also common practice to generate your charts in Python and then export them to Illustrator or Sketch to beautify them.

@auremoser

Very cool, +1 to everything Joey said here. Plaid seems like a cool app. Others, like Mint, also let you import from your bank and various financial services and then export your data. There are some limited visualizations, but it's mostly useful for the data export functionality and for the fact that it aggregates from banks, credit cards, and other online financial services.
