@andreas-wilm
Created August 7, 2019 09:47
Microsoft Academic Graph: Patents vs Language
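// Modify the following two values as needed:
// @dataVersion is the blob container holding the MAG release, @blobAccount the storage account.
DECLARE @dataVersion string = "mag-YYYY-MM-DD";
DECLARE @blobAccount string = "XXX";
// wasb URI of the form wasb://<container>@<account>/
DECLARE @uriPrefix string = "wasb://" + @dataVersion + "@" + @blobAccount + "/";
DECLARE @outFile string = "/Output/patent_languages.tsv";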
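// Papers() and PaperLanguages() are assumed to be the table-valued functions
// from the MAG U-SQL samples, already registered in the current database.
@Papers = Papers(@uriPrefix);
@PaperLanguages = PaperLanguages(@uriPrefix);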
// Count patent records per language code.
@plCounts =
    SELECT L.LanguageCode,
           COUNT(1) AS cnt
    FROM @PaperLanguages AS L
         INNER JOIN @Papers AS P
         ON P.PaperId == L.PaperId
    WHERE P.DocType == "Patent"
    GROUP BY L.LanguageCode;
// Write the counts as tab-separated values without quoting.
OUTPUT @plCounts
TO @outFile
USING Outputters.Tsv(quoting : false);