Skip to content

Instantly share code, notes, and snippets.

@thorwebdev
Created December 15, 2014 20:43
Show Gist options
  • Save thorwebdev/424494f824475e30216c to your computer and use it in GitHub Desktop.
Save thorwebdev/424494f824475e30216c to your computer and use it in GitHub Desktop.
These are some interesting queries you can run on this BigQuery table: https://bigquery.cloud.google.com/table/nyctaximap:dataflow.nyc_output_join_fare_distinct . The data has been aggregated using Google Cloud Dataflow. For more background information on how the data has been processed see: https://googlecloudplatform.blogspot.com
SELECT
pickup_polyId,
SUM(fare_amount)/COUNT(*) AS average_fare,
COUNT(*) AS no_of_trips
FROM
[nyctaximap:dataflow.nyc_output_join_fare_distinct]
WHERE
fare_amount!=0
GROUP BY
pickup_polyId
HAVING
no_of_trips > 100000
ORDER BY
average_fare DESC;
SELECT
pickup_polyId,
SUM(tip_amount)/COUNT(*) AS average_tip,
COUNT(*) AS no_of_trips
FROM
[nyctaximap:dataflow.nyc_output_join_fare_distinct]
WHERE
tip_amount!=0
GROUP BY
pickup_polyId
HAVING
no_of_trips > 50000
ORDER BY
average_tip DESC;
SELECT
pickup_polyId,
COUNT(*) AS no_of_trips
FROM
[nyctaximap:dataflow.nyc_output_join_fare_distinct]
GROUP BY
pickup_polyId
ORDER BY
no_of_trips DESC;
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment