BigQuery legacy SQL for the GitHub Archive dataset: fork count per repo
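/* Fork count per repo for a single organization (BigQuery legacy SQL) */
SELECT
  /* Strip the leading organization prefix so only the repo name remains */
  REGEXP_REPLACE(repo.name, '^MyOrg/', '') AS repo,
  COUNT(*) AS cnt
FROM (
  SELECT
    repo.name
  FROM
    /* Daily event tables from 2015-01-01 through 2015-10-15 */
    TABLE_DATE_RANGE([githubarchive:day.events_], TIMESTAMP('2015-01-01'), TIMESTAMP('2015-10-15'))
  WHERE
    type = 'ForkEvent'
    /* Anchor the match so owners that merely end in 'MyOrg' (e.g. 'NotMyOrg/x') are excluded */
    AND REGEXP_MATCH(repo.name, '^MyOrg/')
    /* Leave one specific repo out of the results */
    AND NOT REGEXP_MATCH(repo.name, '^MyOrg/SomeRepo$')
)
GROUP BY
  repo
ORDER BY
  cnt DESC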