C.J. Sullivan cj2001

This file has been truncated, but you can view the full file.
{"type":"node","id":"0","labels":["Node"],"properties":{"node_labels":["Subject"],"name":"oh bah mə","word_vec":[-0.1991655146019209,-0.05351358562798536,0.9228555063544845,-0.6886134662870478,-0.48122510045452427,-0.40170435813501726,-0.8155066138337055,0.8357354634523664,-0.2347006396854041,0.26097340188742835,-0.2814595678676137,-0.9936133169296157,-0.3826829229304849,-0.9808776175986211,-0.20284189377753714,0.5015809495104013,0.16734307094472212,-0.5867540626284398,-0.04434894351605689,-0.6723292743695024,0.9054839794937932,-0.6772277556586848,0.38326882164488585,-0.7193810327293506,0.007985495945301846,-0.6796721226933016,0.359004027656604,0.3961584634066617,0.9789458680359799,0.3723659161768318,0.6861445958760459,0.39929387851793496,-0.3740958123352931,0.07034310049505343,0.5782312986452309,0.27844665689970527,0.04389886470438653,0.38707344473628447,-0.35045714426886465,-0.9373177969255766,-0.12407924520086833,-0.5629778221312765,-0.9617121312902264,-0.587069845901774,0.7978564493099858,0.92225653009432
@cj2001
cj2001 / docker-compose.yml
Created May 14, 2021 17:58
Networks Neo4j and Jupyter containers
version: '3.7'
services:
  neo4j:
    image: neo4j:4.2.3-enterprise
    container_name: "neo-gds1.5"
    volumes:
      - $HOME/graph_data/my_data:/data
      - $HOME/graph_data/my_data:/var/lib/neo4j/import
    ports:
@cj2001
cj2001 / Dockerfile
Created May 14, 2021 17:54
Main data science Dockerfile
FROM jupyter/datascience-notebook
COPY requirements.txt ./
RUN pip install -U pip
RUN pip install --no-cache-dir -r requirements.txt
ENV JUPYTER_ENABLE_LAB=yes
COPY --chown=${NB_UID}:${NB_GID} . /home/jovyan/work
WORKDIR /home/jovyan/work
@cj2001
cj2001 / requirements.txt
Created May 14, 2021 17:51
Package requirements file for Neo4j + Jupyter docker container
jupyterlab==3.0.7
neo4j==4.2.1
py2neo==2021.0.1
@cj2001
cj2001 / query_to_list_values.py
Last active February 10, 2021 00:14
Query arXiv Neo4j to list
result = conn.query(query_string)
for record in result:
    print(record['c.category'], record['inDegree'])
@cj2001
cj2001 / arxiv_query_db.py
Created February 9, 2021 23:56
Query arXiv data in Neo4j
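import pandas as pd

# Category nodes are created with a `category` property (see the
# add_categories gist and the uniqueness constraint), so return c.category.
query_string = '''
MATCH (c:Category)
RETURN c.category, SIZE(()-[:IN_CATEGORY]->(c)) AS inDegree
ORDER BY inDegree DESC LIMIT 20
'''
top_cat_df = pd.DataFrame([dict(_) for _ in conn.query(query_string)])
top_cat_df.head(20)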
@cj2001
cj2001 / add_category_and_author_nodes.py
Created February 9, 2021 23:53
Add category and author nodes
categories = pd.DataFrame(df[['category_list']])
categories.rename(columns={'category_list': 'category'},
                  inplace=True)
categories = categories.explode('category') \
                       .drop_duplicates(subset=['category'])
authors = pd.DataFrame(df[['cleaned_authors_list']])
authors.rename(columns={'cleaned_authors_list': 'author'},
               inplace=True)
authors = authors.explode('author').drop_duplicates(subset=['author'])
@cj2001
cj2001 / paper_nodes_and_edges.py
Created February 9, 2021 23:51
Add arXiv paper nodes and all edges
def add_papers(rows, batch_size=5000):
    # Adds paper nodes and (:Author)--(:Paper) and
    # (:Paper)--(:Category) relationships to the Neo4j graph as a
    # batch job.
    query = '''
    UNWIND $rows as row
    MERGE (p:Paper {id:row.id}) ON CREATE SET p.title = row.title
    // connect categories
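The `batch_size=5000` argument suggests that rows are sent to the `UNWIND $rows` query in chunks rather than in one request. The preview above is truncated, so what follows is only a minimal sketch of that batching idea, not the author's code. `iter_batches` and `run_in_batches` are hypothetical helper names, and `run_query` is an assumed stand-in for a callable such as `conn.query`.

```python
from typing import Any, Callable, Dict, Iterator, List

Row = Dict[str, Any]


def iter_batches(rows: List[Row], batch_size: int = 5000) -> Iterator[List[Row]]:
    """Yield consecutive slices of `rows`, each at most `batch_size` long."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


def run_in_batches(run_query: Callable[[str, Dict[str, Any]], Any],
                   query: str,
                   rows: List[Row],
                   batch_size: int = 5000) -> int:
    """Call run_query(query, {'rows': batch}) once per batch.

    `run_query` is a hypothetical callable (for example a thin wrapper
    around conn.query); the return value is the number of batches sent.
    """
    batches = 0
    for batch in iter_batches(rows, batch_size):
        run_query(query, {"rows": batch})
        batches += 1
    return batches
```

With a DataFrame, `rows` could be built with `df.to_dict('records')`, the same shape that `add_categories` passes as `$rows`. Keeping the batching separate from the Cypher text means the same helper could serve papers, authors, and categories.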
@cj2001
cj2001 / add_authors_categories.py
Last active July 13, 2021 14:58
Add author and category nodes to the graph
def add_categories(categories):
    # Adds category nodes to the Neo4j graph.
    query = '''
    UNWIND $rows AS row
    MERGE (c:Category {category: row.category})
    RETURN count(*) as total
    '''
    return conn.query(query, parameters={'rows': categories.to_dict('records')})
@cj2001
cj2001 / create_arxiv_constraints.py
Created February 9, 2021 23:44
Create arXiv constraints
conn.query('CREATE CONSTRAINT papers IF NOT EXISTS ON (p:Paper) ASSERT p.id IS UNIQUE')
conn.query('CREATE CONSTRAINT authors IF NOT EXISTS ON (a:Author) ASSERT a.name IS UNIQUE')
conn.query('CREATE CONSTRAINT categories IF NOT EXISTS ON (c:Category) ASSERT c.category IS UNIQUE')
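The snippets above all call `conn.query(...)`, but none of the previews shows how `conn` is created. As a rough sketch only, a wrapper with that call shape might look like the following. The class name `Neo4jConnection`, the `connect` helper, and the optional `db` argument are assumptions for illustration, not the author's definitions. The driver is injected so the class can be exercised without a running database, and the `neo4j` package is imported lazily inside `connect`.

```python
class Neo4jConnection:
    """Hypothetical thin wrapper exposing conn.query(query, parameters=...)."""

    def __init__(self, driver):
        # `driver` is expected to behave like a neo4j.Driver instance.
        self._driver = driver

    def close(self):
        self._driver.close()

    def query(self, query, parameters=None, db=None):
        # Open a session (optionally against a named database), run the
        # query, and materialise the records before the session closes.
        if db is not None:
            session = self._driver.session(database=db)
        else:
            session = self._driver.session()
        try:
            return list(session.run(query, parameters))
        finally:
            session.close()


def connect(uri, user, password):
    # Imported here so the class above can be used without the driver installed.
    from neo4j import GraphDatabase
    return Neo4jConnection(GraphDatabase.driver(uri, auth=(user, password)))
```

Under these assumptions, `conn = connect(uri, user, password)` would return an object that the constraint and query snippets could use unchanged; the URI and credentials depend on the Neo4j container's settings, which the truncated compose file does not show.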