Skip to content

Instantly share code, notes, and snippets.

Show Gist options
  • Save RanaivosonHerimanitra/3345e25866d1e9af93d8b09ced825744 to your computer and use it in GitHub Desktop.
Save RanaivosonHerimanitra/3345e25866d1e9af93d8b09ced825744 to your computer and use it in GitHub Desktop.
Retrieve events (issues and PR) related to any contents of any repos for a given day (2015-01-01)
SELECT type,payload,repo_name,content,created_at FROM [githubarchive:day.20151001] x JOIN(
SELECT a.id,path,repo_name,content FROM (
SELECT * FROM (
SELECT a.id,path,repo_name,content FROM (
SELECT * FROM [bigquery-public-data:github_repos.files] WHERE path LIKE '%.java' ) a JOIN
(SELECT id,content FROM [bigquery-public-data:github_repos.contents]) b ON a.id=b.id
)
)
) c ON x.repo.name=a.repo_name HAVING type='PullRequestEvent' or type='IssueCommentEvent' LIMIT 5000
@RanaivosonHerimanitra
Copy link
Author

Is It the best way to retrieve that information, knowing that I want it for 2011 to 2016 ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment