Skip to content

Instantly share code, notes, and snippets.

Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save velotiotech/f2e51cb53d8aa4b8349a8064c9e71e53 to your computer and use it in GitHub Desktop.
Save velotiotech/f2e51cb53d8aa4b8349a8064c9e71e53 to your computer and use it in GitHub Desktop.
BigQuery 101: All the Basics You Need to Know - Code for Velotio Blog
WITH question_answers_join AS (
SELECT *
, GREATEST(1, TIMESTAMP_DIFF(answers.first, creation_date, minute)) minutes_2_answer
FROM (
SELECT id, creation_date, title
, (SELECT AS STRUCT MIN(creation_date) first, COUNT(*) c
FROM `bigquery-public-data.stackoverflow.posts_answers`
WHERE a.id=parent_id
) answers
, SPLIT(tags, '|') tags
FROM `bigquery-public-data.stackoverflow.posts_questions` a
WHERE EXTRACT(year FROM creation_date) > 2014
)
)
SELECT COUNT(*) questions, tag
, ROUND(EXP(AVG(LOG(minutes_2_answer))), 2) mean_geo_minutes
, APPROX_QUANTILES(minutes_2_answer, 100)[SAFE_OFFSET(50)] median
FROM question_answers_join, UNNEST(tags) tag
WHERE tag IN ('javascript', 'python', 'rust', 'java', 'scala', 'ruby', 'go', 'react', 'c', 'c++')
AND answers.c > 0
GROUP BY tag
ORDER BY mean_geo_minutes
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment