@andreas-wilm
Last active April 11, 2019 17:37
Summarizing Document Types in Microsoft Academic Graph
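// MAG snapshot to read; used as the blob container name in the wasb:// URI.
DECLARE @dataVersion string = "mag-2019-03-22";
// Azure Storage account that holds the MAG data (placeholder kept from the original).
DECLARE @blobAccount string = "<PLACEHOLDER>";
DECLARE @uriPrefix string = "wasb://" + @dataVersion + "@" + @blobAccount + "/";
DECLARE @typeCounts string = "/Output/TypeCounts.tsv";

// Papers() is assumed to be a table-valued function defined beforehand that loads the Papers table from @uriPrefix.
@Papers = Papers(@uriPrefix);

// Count papers per document type.
@Types =
    SELECT DocType,
           COUNT(1) AS cnt
    FROM @Papers
    GROUP BY DocType;

// Write the counts as tab-separated values without quoting.
OUTPUT @Types
TO @typeCounts
USING Outputters.Tsv(quoting : false);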