@avivl
Created December 25, 2017 16:22
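-- Count bigrams (adjacent word pairs) across all reviews (BigQuery legacy SQL).
SELECT COUNT(*) AS num_bigram, bigram FROM
(
  SELECT
    SPLIT(
      -- Pairs starting at odd-numbered words: "a b c d" -> "a b|c d"
      REGEXP_REPLACE(review, '([^\\s]+\\s[^\\s]*)\\s', '\\1|') +
      '|' +
      -- Pairs starting at even-numbered words: "a b c d" -> "a|b c|d"
      REGEXP_REPLACE(review, '([^\\s]+)\\s([^\\s]+\\s?)', '\\1|\\2'),
      '|'  -- split on the pipe; SPLIT defaults to ',' when no delimiter is given
    ) AS bigram
  FROM [telegrass_reviews.reviews]
) t
-- Keep only two-word fragments; leftover single words contain no space.
WHERE t.bigram CONTAINS ' '
GROUP BY bigram
ORDER BY num_bigram DESC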
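
The two-pass regex trick in the query can be checked locally before running it on the full table. The following is a minimal Python sketch of the same idea, not part of the original gist: the function names `bigrams` and `count_bigrams` and the sample reviews are hypothetical, and it assumes Python's `re` module handles these two patterns the same way BigQuery's regex engine does, and that words are separated by single spaces.

```python
import re
from collections import Counter


def bigrams(review):
    """Return the adjacent word pairs of a review (hypothetical helper)."""
    # Pass 1: pairs starting at words 1, 3, 5, ...  "a b c d e" -> "a b|c d|e"
    odd = re.sub(r'([^\s]+\s[^\s]*)\s', r'\1|', review)
    # Pass 2: pairs starting at words 2, 4, 6, ...  "a b c d e" -> "a|b c|d e"
    even = re.sub(r'([^\s]+)\s([^\s]+\s?)', r'\1|\2', review)
    # Join both passes, split on the pipe, and drop single-word leftovers.
    parts = (odd + '|' + even).split('|')
    return [p for p in parts if ' ' in p]


def count_bigrams(reviews):
    """Count bigrams over many reviews, like the GROUP BY in the query."""
    counts = Counter()
    for review in reviews:
        counts.update(bigrams(review))
    return counts


if __name__ == "__main__":
    sample = ["good fast delivery", "good fast service"]  # made-up data
    for bigram, n in count_bigrams(sample).most_common():
        print(n, bigram)
```

Neither pass alone yields every pair, because a regex replacement never revisits text it has already consumed; running one pass aligned to odd words and one to even words covers both offsets.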