Skip to content

Instantly share code, notes, and snippets.

@yuryu
Last active July 5, 2020 03:35
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 1 You must be signed in to fork a gist
  • Save yuryu/7229236a3df16f1582fddc1399719612 to your computer and use it in GitHub Desktop.
Save yuryu/7229236a3df16f1582fddc1399719612 to your computer and use it in GitHub Desktop.
BQ Joshi Public datasets sample queries
SELECT repo_name, SUM(ARRAY_LENGTH(REGEXP_EXTRACT_ALL(content, r"(?i)bigquery"))) AS count
FROM bigquery-public-data.github_repos.contents
JOIN bigquery-public-data.github_repos.files
ON contents.id = files.id
GROUP BY repo_name
ORDER BY count DESC
LIMIT 100
SELECT email_org, COUNT(email_org) AS ncommits
FROM (
SELECT ARRAY_REVERSE(SPLIT(author.email, "@"))[SAFE_OFFSET(0)] AS email_org
FROM bigquery-public-data.github_repos.commits
) AS t1
GROUP BY email_org
ORDER BY ncommits DESC
SELECT SUM(ARRAY_LENGTH(REGEXP_EXTRACT_ALL(content, r"Google")))
FROM bigquery-public-data.github_repos.contents
https://bit.ly/spacetabs
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment