Skip to content

Instantly share code, notes, and snippets.

@paulrossman
Created October 20, 2015 00:27
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save paulrossman/687b0e038578832c257e to your computer and use it in GitHub Desktop.
Save paulrossman/687b0e038578832c257e to your computer and use it in GitHub Desktop.
BigQuery sql for GitHub Archive dataset: issue close latency for issues opened after 2015-01-01 @ 00:00:00 UTC and closed before 2015-01-31
/* issue close latency for issues opened after 2015-01-01 @ 00:00:00 UTC and closed before 2015-01-31 */
SELECT
REGEXP_REPLACE(repo.name, 'MyOrg/', '') AS repo,
CEIL((closed_at-created_at)/86400) AS latency,
FROM (
SELECT
type,
repo.name,
JSON_EXTRACT(payload, '$.action') AS event,
TIMESTAMP_TO_SEC(TIMESTAMP(REGEXP_REPLACE(JSON_EXTRACT(payload, '$.issue.created_at'), '\"|T|Z', ' '))) AS created_at,
TIMESTAMP_TO_SEC(TIMESTAMP(REGEXP_REPLACE(JSON_EXTRACT(payload, '$.issue.closed_at'), '\"|T|Z', ' '))) AS closed_at,
FROM (TABLE_DATE_RANGE([githubarchive:day.events_], TIMESTAMP('2015-01-01'), TIMESTAMP('2015-01-31') ))
WHERE
type = 'IssuesEvent'
AND repo.name CONTAINS 'MyOrg/'
AND NOT REGEXP_MATCH(repo.name,'^MyOrg/SomeRepo$') )
WHERE
event CONTAINS 'closed'
AND NOT created_at < 1420070400 /* 1/1/2015 @ 00:00:00 UTC */
ORDER BY
repo.name,
latency
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment