@rjurney
Last active September 1, 2022 23:05
The fields available in the DBLP data
{
"entity_id": "<UUID4>",
"entity_type": "node",
"entity_class": "",
"@key": "conf\/www\/Ericsson07",
"@cdate": "2021-01-01",
"@mdate": "2022-08-31",
"@publtype": NaN,
"address": "",
// Note: in some raw records author is a plain string instead of a dict; ETL must normalize it
// Note: in some raw records it is a list of strings
"authors": [{
"#text": "Morgan Ericsson",
"@orcid": "0000-0003-1173-5187"
}],
"booktitle": "OTM",
"cdrom": NaN,
"chapter": NaN,
"cite": NaN,
"crossref": "conf\/coopis\/2003",
// Note: in some raw records editor is a plain string instead of a dict; ETL must normalize it
// Note: in some raw records it is a list of strings
// Note: strings and dicts can occur in the same list
"editors": [{
"#text": "Morgan Ericsson",
"@orcid": "0000-0003-1173-5187"
}],
"ee": "https:\/\/doi.org\/10.1007\/s11280-007-0032-y",
"isbn": "978-3-8244-2051-3",
"journal": "World Wide Web",
"month": NaN,
"note": NaN,
"number": "3",
"pages": "279-307",
"publisher": "Dt. Univ.-Verlag",
"publnr": NaN,
"school": NaN,
"series_text": "Lecture Notes in Computer Science",
"series_href": "db\/series\/lncs\/index.html",
"title": "The Effects of XML Compression on SOAP Performance.",
"url": "db\/journals\/www\/www10.html#Ericsson07",
"volume": NaN,
"year": "2007",
}
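The notes on authors and editors describe several raw shapes: a single string, a list of strings, and a list mixing strings with dicts. Below is a minimal Python sketch of an ETL step that normalizes all of them into the list-of-dicts form shown in the record. The name `normalize_people` is hypothetical, and accepting a lone dict or a missing value (None or NaN) is an assumption, not something the gist states.

```python
import math


def normalize_people(value):
    """Return a list of {"#text": name, "@orcid": orcid-or-None} dicts.

    Hypothetical helper covering the raw shapes the notes describe:
    a string, a list of strings, or a list mixing strings and dicts.
    """
    # Assumption: a missing field arrives as None or float NaN.
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return []
    # A lone string or dict is treated as a one-element list.
    if isinstance(value, (str, dict)):
        value = [value]
    people = []
    for item in value:
        if isinstance(item, str):
            # Plain strings carry no ORCID, so record it as None.
            people.append({"#text": item, "@orcid": None})
        else:
            people.append({"#text": item.get("#text"), "@orcid": item.get("@orcid")})
    return people


# Mixed list: one plain string and one dict, as the notes say can happen.
raw = ["Morgan Ericsson", {"#text": "Morgan Ericsson", "@orcid": "0000-0003-1173-5187"}]
print(normalize_people(raw))
```

With pandas, a helper like this could presumably be applied per column, for example `df["author"].apply(normalize_people)`, before renaming the column to `authors`.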
In [5]: for key, df in dfs.items():
   ...:     print(key, df.columns)
   ...:
article Index(['@mdate', '@key', '@publtype', 'title', 'author', 'pages', 'year',
'journal', 'number', 'ee', 'url', 'volume', 'crossref', 'note', 'cdrom',
'editor', 'cite', 'booktitle', 'publnr', 'month', '@cdate',
'publisher'],
dtype='object')
book Index(['@mdate', '@key', 'author', 'title', 'year', 'pages', 'publisher',
'isbn', 'ee', 'school', '@publtype', 'series', 'volume', 'note',
'editor', 'booktitle', 'url', 'crossref', 'month', 'cite', 'cdrom'],
dtype='object')
incollection Index(['@mdate', '@key', 'author', 'title', 'pages', 'year', 'booktitle', 'ee',
'crossref', 'url', '@publtype', 'cite', 'publisher', 'number', 'note',
'cdrom', 'chapter'],
dtype='object')
inproceedings Index(['@mdate', '@key', 'author', 'title', 'booktitle', 'year', 'url',
'crossref', 'ee', 'pages', 'cite', 'cdrom', '@publtype', 'note',
'editor', 'number', 'volume', 'month'],
dtype='object')
mastersthesis Index(['@mdate', '@key', 'author', 'title', 'year', 'school', 'ee', 'note'], dtype='object')
phdthesis Index(['@mdate', '@key', 'author', 'title', 'year', 'school', 'publisher',
'number', 'pages', 'isbn', 'ee', 'month', 'series', 'volume', 'note',
'@publtype'],
dtype='object')
proceedings Index(['@mdate', '@key', 'editor', 'title', 'publisher', 'year', 'isbn', 'ee',
'url', 'booktitle', 'series', 'volume', 'note', 'number', 'pages',
'@publtype', 'author', 'school', 'address', 'journal', 'cite'],
dtype='object')
www Index(['@mdate', '@key', 'author', 'title', 'url', 'note', '@publtype',
'crossref', 'cite', 'ee', 'year', 'editor'],
dtype='object')
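The column lists above differ by record type, so a combined record has to leave some fields empty. The sketch below compares two of the listed types to show which fields they share and which each one lacks. The column names are copied from the output above; the variable names are illustrative only.

```python
# Column names copied from the output above for two of the record types.
columns = {
    "mastersthesis": ["@mdate", "@key", "author", "title", "year", "school", "ee", "note"],
    "www": ["@mdate", "@key", "author", "title", "url", "note", "@publtype",
            "crossref", "cite", "ee", "year", "editor"],
}

# Fields present in every listed record type.
common = set.intersection(*(set(cols) for cols in columns.values()))

# Fields present in at least one listed record type.
union = set.union(*(set(cols) for cols in columns.values()))

# For each type, the fields it lacks relative to the union.
missing = {key: sorted(union - set(cols)) for key, cols in columns.items()}

print(sorted(common))
print(missing)
```

With the real frames, the same mapping could be built as `{key: list(df.columns) for key, df in dfs.items()}`. Reindexing each frame to the union, for example with `df.reindex(columns=sorted(union))`, would fill the absent fields with NaN, which may be where the NaN values in the sample record come from.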