Skip to content

Instantly share code, notes, and snippets.

@KaitlynSisk
Last active August 29, 2015 13:58
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 1 You must be signed in to fork a gist
  • Save KaitlynSisk/10023010 to your computer and use it in GitHub Desktop.
Save KaitlynSisk/10023010 to your computer and use it in GitHub Desktop.
Geo team write up for 4-7-14

Upon getting to work and trying to follow our schedule, we have realized that we planned to have too many things due in a very short period of time. We are in the process of adjusting the schedule and replanning what needs to be done. However, the three of us have been working on a few different tasks during the break.

Clare has been working on a draft of the close reading. Her rough draft includes an introduction, analysis of Arkansas, discussion of the advantages of the digital, and conclusion. The analysis of Arkansas will act as a prototype of what she plans to do with the conclusions she is reaching about Mississippi and Texas, although her data collection requires more time than previously realized. She plans on diversifying advertisement examples, as her current examples are from a few select years. We will discuss suggestions for the progress of the essay with Dr. McDaniel.

Aaron wrote a python script called placetagger.py that tags locations in each advertisement. He ran the cleaned advertisements from the two Texas newspapers and the Mississippi and Arkansas corpi through the script and saved them as JSON files. I then started to try to run the tagged locations through GeoNamesMatch, but I quickly ran into some difficulties. After discussing with Aaron, we decided that the input and output of this particular program was inconvenient for what we are trying to do. Aaron played around with Google's free geocoding API and had some success with it, so we have decided to use that instead. Aaron and I then started cleaning up the pretty printed JSON of the tagged locations, and we realized that even though we don't have to correct spelling or extend state abbreviations, this task is going to take a very long time because of the large number of advertisements we have, especially in Mississippi. Through cleaning the tagged locations, we noticed that the python script has been separating locations that should be together. Some results come out as [Travis County], [Texas] instead of [Travis County Texas], or even as [County] instead of [Travis County]. It is unlikely that we will ever be able to write a script that catches every location with precision, but we would like to be as close as we can get.

####Next Steps

Aaron is planning on rewriting placetagger.py so that it does not split up the county or city name from the state name. Once that is done, he will rerun the advertisement corpi through the new script, and then he and Kaitlyn will begin cleaning the results. We will need to come up with a few parameters or rules for cleaning the results so that there is consistency across the states. We will also need to decide if we should compare the results for each advertisement to the original text. That could be a very time consuming process and does not seem feasible to do for every single advertisement, so we may choose to compare every few advertisements.

Even though we will have to reclean the results, we can still use the current cleaned up results from Texas to start thinking about how we want to visualize our results. Right now, we are planning on looking at Palladio to see if it will fit our needs. We also have been thinking about creating a map that shows how many times a state has been referenced in another state's newspapers. Ideally, we would like to be able to hover over a state with the cursor and it shades that state and other states with intensity determined by number of mentions of places in that state from the origin state's ads, but we are still figuring out how to do that. We can start to see how this would work by using the current Texas data in Google Fusion Tables to create a preliminary visualization. Aaron and Kaitlyn will also give feedback on the close reading essay to Clare as she continues to revise her draft.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment