@sAbakumoff
Last active August 12, 2016 10:51
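// BigQuery UDF: emits one row per entry in a package.json "keywords" array.
function select_keywords(r, emit) {
  try {
    // "package" is a reserved word in strict mode, so use "pkg" instead.
    var pkg = JSON.parse(r.content);
    (pkg.keywords || []).forEach(function (keyword) {
      emit({keyword: keyword});
    });
  } catch (ex) {
    // Skip rows whose content is not valid JSON or whose keywords is not an array.
  }
}

bigquery.defineFunction(
  'select_keywords',                        // Name of the function exported to SQL
  ['content'],                              // Names of input columns
  [{'name': 'keyword', 'type': 'string'}],  // Output schema
  select_keywords                           // Reference to JavaScript UDF
);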
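The UDF above can be tried outside BigQuery before it is uploaded. The following is a minimal sketch for Node.js, assuming a stub `emit` callback is enough to stand in for BigQuery's. The `runUdf` helper and the sample rows are hypothetical and not part of the original gist. The function body is repeated so that the snippet runs on its own.

```javascript
// Local copy of the UDF body so this snippet is self-contained.
function select_keywords(r, emit) {
  try {
    var pkg = JSON.parse(r.content);
    (pkg.keywords || []).forEach(function (keyword) {
      emit({keyword: keyword});
    });
  } catch (ex) {
    // Rows that fail to parse (or whose keywords is not an array) are skipped.
  }
}

// Hypothetical helper: collects every record the UDF emits for a list of rows.
function runUdf(rows) {
  var out = [];
  rows.forEach(function (row) {
    select_keywords(row, function (record) {
      out.push(record);
    });
  });
  return out;
}

// Made-up rows standing in for the `content` column of the source table.
var sampleRows = [
  {content: '{"name": "a", "keywords": ["cli", "json"]}'},
  {content: '{"name": "b"}'},        // no keywords field: nothing emitted
  {content: 'not valid json'},       // JSON.parse throws: row skipped
  {content: '{"keywords": "cli"}'}   // string instead of array: row skipped
];

var emitted = runUdf(sampleRows);
console.log(emitted);
```

Only the first sample row emits anything. The other three are dropped silently, which matches the empty `catch` in the UDF.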
SELECT
  keyword,
  COUNT(keyword) AS count
FROM (
  SELECT
    keyword
  FROM (select_keywords(
    SELECT
      content
    FROM
      [githubdataqueries:NpmStat.package_json_content])))
GROUP BY
  keyword
ORDER BY
  count DESC