40% of NYC Taxi Trips are Uniquely Identified by Pickup/Drop Off Census Tracts and Hour
In my recent post analyzing 1.1 billion NYC taxi and Uber trips, I included a section on privacy concerns that showed how the precise latitude/longitude coordinates of taxi pickups and drop offs could be used to reveal personal information about where people live, work, and socialize.
I wrote that if the Taxi & Limousine Commission wanted to avoid disclosing personal information, they would have to remove latitude/longitude from the dataset, perhaps replacing the coordinates with coarser census tract locations. It now appears that even census tracts may be too precise.
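As a rough sketch of how one might measure this, the snippet below counts the share of trips whose combination of pickup tract, drop off tract, and hour appears exactly once. The function name `unique_fraction`, the tuple layout, and the tract IDs are made up for illustration, and truncating the pickup time to the calendar hour is an assumption about what "hour" means. It is not the query used for the analysis.

```python
from collections import Counter
from datetime import datetime


def unique_fraction(trips):
    """Return the share of trips whose (pickup tract, drop off tract, hour)
    key occurs exactly once.

    `trips` is an iterable of (pickup_tract, dropoff_tract, pickup_datetime)
    tuples; this layout is hypothetical, not the TLC schema.
    """
    # Assumption: "hour" means the pickup time truncated to the calendar hour.
    keys = [
        (pickup, dropoff, ts.replace(minute=0, second=0, microsecond=0))
        for pickup, dropoff, ts in trips
    ]
    if not keys:
        return 0.0
    counts = Counter(keys)
    # A trip is uniquely identified when no other trip shares its key.
    unique = sum(1 for key in keys if counts[key] == 1)
    return unique / len(keys)


# Toy data with made-up tract IDs.
trips = [
    ("36061000100", "36061000200", datetime(2015, 1, 1, 8, 5)),
    ("36061000100", "36061000200", datetime(2015, 1, 1, 8, 40)),  # same key as above
    ("36061000100", "36061000300", datetime(2015, 1, 1, 8, 15)),  # unique
    ("36047000500", "36061000200", datetime(2015, 1, 1, 9, 0)),   # unique
]
print(unique_fraction(trips))  # 0.5
```

On a dataset of real size the same grouping would more likely be done in the database, but the logic is the same: group by the three fields and count the groups of size one.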
I hadn't previously investigated how well census tracts uniquely identify pickups and drop offs, but **it turns out that if you