Skip to content

Instantly share code, notes, and snippets.

Embed
What would you like to do?
BigQuery for most popular countries on Hacker News
SELECT
b.country,
SUM(score) as total_score,
SUM(a.c) as mention_count,
ROUND(SUM(score) / SUM(a.c), 2) as avg_score
FROM
(
SELECT
a.word,
SUM(a.score) AS score,
COUNT(1) AS c
FROM (
SELECT
SPLIT(REGEXP_REPLACE(title, '[^a-zA-Z ]', ''), ' ') AS word,
score
FROM
[fh-bigquery:hackernews.stories]) a
GROUP BY
a.word
) a
JOIN
[countries.country] b
ON
a.word = b.aka
GROUP BY b.country
HAVING mention_count >= 100
ORDER BY avg_score DESC
LIMIT 100
@siener

This comment has been minimized.

Copy link

commented Feb 17, 2016

Your query doesn't pick up any countries with multi-word names (e.g. South Africa, North/South Korea) except for the ones that have single word aliases (e.g. USA, UAE, Saudi).

According to your results South Africa and the Koreas have never been mentioned on HN or Reddit which is obviously not true.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.