Created February 9, 2021 23:51
Add arXiv paper nodes and all edges
def add_papers(rows, batch_size=5000):
    # Adds paper nodes and the (:Author)-[:AUTHORED]->(:Paper) and
    # (:Paper)-[:IN_CATEGORY]->(:Category) relationships to the Neo4j
    # graph as a batch job. Category and Author nodes are looked up with
    # MATCH, so they must already exist in the graph.
    query = '''
            UNWIND $rows AS row
            MERGE (p:Paper {id: row.id}) ON CREATE SET p.title = row.title

            // connect categories
            WITH row, p
            UNWIND row.category_list AS category_name
            MATCH (c:Category {category: category_name})
            MERGE (p)-[:IN_CATEGORY]->(c)

            // connect authors
            WITH DISTINCT row, p // reduce cardinality
            UNWIND row.cleaned_authors_list AS author
            MATCH (a:Author {name: author})
            MERGE (a)-[:AUTHORED]->(p)
            RETURN count(DISTINCT p) AS total
            '''

    return insert_data(query, rows, batch_size)
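
The `insert_data` helper that `add_papers` calls is not shown in this gist. As a rough illustration of what the batching step could look like, here is a minimal sketch. It assumes `rows` is a list of dicts with the keys the query reads (`id`, `title`, `category_list`, `cleaned_authors_list`). The names `insert_in_batches`, `fake_run_query`, and `sample_rows` are hypothetical and not part of the original code, and a stand-in callable replaces the real Neo4j session so the example runs without a database.

```python
from typing import Any, Callable, Dict, List


def insert_in_batches(
    run_query: Callable[[str, Dict[str, Any]], int],
    query: str,
    rows: List[Dict[str, Any]],
    batch_size: int = 5000,
) -> Dict[str, int]:
    """Send `rows` to `run_query` in chunks of at most `batch_size`.

    `run_query` is a hypothetical callable that executes the query with the
    given parameters and returns the `total` value reported for that batch.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be a positive integer")

    total = 0
    batches = 0
    for start in range(0, len(rows), batch_size):
        chunk = rows[start:start + batch_size]
        # The chunk is bound to the $rows parameter that the query unwinds.
        total += run_query(query, {"rows": chunk})
        batches += 1
    return {"total": total, "batches": batches}


def fake_run_query(query: str, parameters: Dict[str, Any]) -> int:
    # Stand-in for a database call: reports one paper per row in the batch.
    return len(parameters["rows"])


# Illustrative rows only; real values would come from the arXiv metadata.
sample_rows = [
    {
        "id": str(i),
        "title": f"Paper {i}",
        "category_list": ["cs.LG"],
        "cleaned_authors_list": ["A. Author"],
    }
    for i in range(12)
]

result = insert_in_batches(
    fake_run_query,
    "UNWIND $rows AS row RETURN count(row) AS total",
    sample_rows,
    batch_size=5,
)
print(result)  # {'total': 12, 'batches': 3}
```

Passing the query runner in as an argument keeps the batching logic independent of any particular driver; with a real database, `run_query` would wrap a session call and return the `total` column from the query result.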