Skip to content

Instantly share code, notes, and snippets.

@bostelk
Last active February 6, 2017 16:47
Show Gist options
  • Save bostelk/c9a9a820cdb251e3a25bbdadfd119364 to your computer and use it in GitHub Desktop.
Save bostelk/c9a9a820cdb251e3a25bbdadfd119364 to your computer and use it in GitHub Desktop.
Use BigQuery to find the Unity editor versions in use.
SELECT
line AS version,
COUNT(*) AS count
FROM
FLATTEN( (
SELECT
SPLIT(REGEXP_EXTRACT(content, r'm_EditorVersion: (.+)'), '\n') AS line,
FROM (
SELECT
id,
content
FROM
[bigquery-public-data:github_repos.sample_contents]
WHERE
REGEXP_MATCH(content, r'm_EditorVersion: (.+)')) AS C
JOIN (
SELECT
id
FROM
[bigquery-public-data:github_repos.sample_files]
WHERE
RIGHT(path, 34) = 'ProjectSettings/ProjectVersion.txt'
GROUP BY
id) AS F
ON
C.id = F.id), line)
GROUP BY
version
HAVING
version IS NOT NULL
ORDER BY
count DESC
@fhoffa
Copy link

fhoffa commented Feb 6, 2017

2 notes:

  • [sample_contents] has only a sample of all [contents], so you can test your queries on smaller tables.
  • identical files are de-duplicated, hence you'll need to account for all copies when counting:

Note for example that id='c4684cd58231b8d76c633a2bc45068594ed3961e' has 549 copies:

  SELECT id, COUNT(*) copies
  FROM [bigquery-public-data:github_repos.files]
  WHERE RIGHT(path, 34) = 'ProjectSettings/ProjectVersion.txt'
  GROUP BY id 
  ORDER BY copies DESC

This is one of those 549 files, all identical:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment