Skip to content

Instantly share code, notes, and snippets.

Embed
What would you like to do?
Example of querying neo4j Panama Papers database from Jupyter IPython notebooks using py2neo
neo4j:
image: ryguyrg/neo4j-panama-papers
ports:
- "7474:7474"
- "1337:1337"
volumes:
- /opt/data
jupyterscipy:
image: jupyter/scipy-notebook
ports:
- "8890:8888"
links:
- neo4j:neo4j
volumes:
- ./notebooks:/home/jovyan/work
rstudio:
image: rocker/rstudio
ports:
- "8787:8787"
links:
- neo4j:neo4j
volumes:
- ./rstudio:/home/rstudio
#jupyterIR:
# image: jupyter/r-notebook
# ports:
# - "8889:8888"
# links:
# - neo4j:neo4j
# volumes:
# - ./notebooks:/home/jovyan/work
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Panama Papers Explorer - py2neo\n",
"\n",
"This notebook provides a quick start for using py2neo with the Panama Papers neo4j docker container [ryguyrg/neo4j-panama-papers](https://hub.docker.com/r/ryguyrg/neo4j-panama-papers/) [[Github](https://github.com/ryguyrg/panama-neo4j)].\n",
"\n",
"The following `docker-compose.yml` file will lauch a Jupyter notebook environment linked to the neo4j container the Panama Papers data."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"````\n",
"#Run with:\n",
"#docker-compose up\n",
"#Then visit IP_ADDRESS:7474\n",
"#The defaul credentials are neo4j/neo4j\n",
"#You will be prompted for a new password - I used: panamapapers\n",
"\n",
"#Get the Panama Papers neo4j container\n",
"neo4j:\n",
" image: ryguyrg/neo4j-panama-papers\n",
" ports:\n",
" - \"7474:7474\"\n",
" - \"1337:1337\"\n",
" volumes:\n",
" - /opt/data\n",
"\n",
"#Download a Jupyter notebook environment and link the neo4j container to it\n",
"jupyterscipy:\n",
" image: jupyter/scipy-notebook\n",
" ports:\n",
" - \"8890:8888\"\n",
" links:\n",
" - neo4j:neo4j\n",
" volumes:\n",
" - ./notebooks:/home/jovyan/work\n",
"\n",
"##If you prefer an RStudio environment:\n",
"#rstudio:\n",
"# image: rocker/rstudio \n",
"# ports:\n",
"# - \"8787:8787\"\n",
"# links:\n",
"# - neo4j:neo4j\n",
"# volumes:\n",
"# - ./rstudio:/home/rstudio\n",
"##Then install: install.packages('RNeo4j', repos=\"http://cran.rstudio.com/\")\n",
"````"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## py2neo\n",
"\n",
"`py2neo` provides a Python wrapper for neo4j."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Requirement already satisfied (use --upgrade to upgrade): py2neo in /opt/conda/lib/python3.5/site-packages\r\n",
"\u001b[33mYou are using pip version 8.1.1, however version 8.1.2 is available.\r\n",
"You should consider upgrading via the 'pip install --upgrade pip' command.\u001b[0m\r\n"
]
}
],
"source": [
"!pip3 install py2neo\n",
"from py2neo import Graph"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Create a connection to the `neo4j` database, using the `neo4j` user account and your new password:"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"graph = Graph(\"http://neo4j:7474/db/data/\",user='neo4j',password='panamapapers')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Start to explore the graph - what node types are there?"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"frozenset({'Address', 'Entity', 'Intermediary', 'Officer'})"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"graph.node_labels"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"For each node type, what attributes/labels are there?"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Intermediary\n",
"\tname\n",
"\tsourceID\n",
"\tinternal_id\n",
"\taddress\n",
"\tvalid_until\n",
"\tcountry_codes\n",
"\tcountries\n",
"\tstatus\n",
"Address\n",
"\taddress\n",
"\ticij_id\n",
"\tvalid_until\n",
"\tcountries\n",
"\tcountry_code\n",
"Officer\n",
"\tname\n",
"\taddress\n",
"\ticij_id\n",
"\tvalid_until\n",
"\tcountries\n",
"\tcountry_code\n",
"Entity\n",
"\tname\n",
"\toriginal_name\n",
"\taddress\n",
"\tincorporation_date\n",
"\tinactivation_date\n",
"\tstruck_off_date\n",
"\tcompany_type\n",
"\tservice_provider\n",
"\tjurisdiction\n",
"\tstatus\n",
"\tsourceID\n"
]
}
],
"source": [
"for nl in graph.node_labels:\n",
" print(nl)\n",
" for i in graph.schema.get_indexes(nl):\n",
" print('\\t{}'.format(i))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"What relationship types are there?"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"frozenset({'ALTERNATE_DIRECTOR_OF',\n",
" 'APPOINTOR_OF',\n",
" 'ASSISTANT_SECRETARY_OF',\n",
" 'AUDITOR_OF',\n",
" 'AUTHORISED_PERSON_SIGNATORY_OF',\n",
" 'AUTHORIZED_SIGNATORY_OF',\n",
" 'AUTH_REPRESENTATIVE_OF',\n",
" 'BANK_SIGNATORY_OF',\n",
" 'BENEFICIAL_OWNER_OF',\n",
" 'BENEFICIARY_OF',\n",
" 'BENEFICIARY_SHAREHOLDER_AND_DIRECTOR_OF',\n",
" 'BOARD_REPRESENTATIVE_OF',\n",
" 'CHAIRMAN_OF',\n",
" 'CONNECTED_OF',\n",
" 'CORRESPONDENT_ADDR_OF',\n",
" 'COTRUSTEE_OF_TRUST_OF',\n",
" 'CUSTODIAN_OF',\n",
" 'DIRECTOR_AND_SHAREHOLDER_OF',\n",
" 'DIRECTOR_BENEFICIAL_OWNER_OF',\n",
" 'DIRECTOR_OF',\n",
" 'DIRECTOR_RAMI_MAKHLOUF_OF',\n",
" 'DIRECTOR_SHAREHOLDER_BENEFICIAL_OWNER_OF',\n",
" 'DIRECTOR_SHAREHOLDER_OF',\n",
" 'FIRST_BENEFICIARY_OF',\n",
" 'GENERAL_ACCOUNTANT_OF',\n",
" 'GRANTEE_OF_A_MORTGAGE_OF',\n",
" 'INTERMEDIARY_OF',\n",
" 'INVESTMENT_ADVISOR_OF',\n",
" 'JOINT_SETTLOR_OF',\n",
" 'LEGAL_ADVISOR_OF',\n",
" 'MEMBER_OF_FOUNDATION_COUNCIL_OF',\n",
" 'MEMBER_SHAREHOLDER_OF',\n",
" 'NOMINATED_PERSON_OF',\n",
" 'NOMINEE_BENEFICIAL_OWNER_OF',\n",
" 'NOMINEE_BENEFICIARY_OF',\n",
" 'NOMINEE_DIRECTOR_OF',\n",
" 'NOMINEE_INVESTMENT_ADVISOR_OF',\n",
" 'NOMINEE_NAME_OF',\n",
" 'NOMINEE_PROTECTOR_OF',\n",
" 'NOMINEE_SECRETARY_OF',\n",
" 'NOMINEE_SHAREHOLDER_OF',\n",
" 'NOMINEE_TRUST_SETTLOR_OF',\n",
" 'OFFICER_OF',\n",
" 'OWNER_DIRECTOR_AND_SHAREHOLDER_OF',\n",
" 'OWNER_OF',\n",
" 'PARTNER_OF',\n",
" 'PERSONAL_DIRECTORSHIP_OF',\n",
" 'POWER_OF_ATTORNEY_OF',\n",
" 'POWER_OF_ATTORNEY_SHAREHOLDER_OF',\n",
" 'PRESIDENT_AND_DIRECTOR_OF',\n",
" 'PRESIDENT_DIRECTOR_OF',\n",
" 'PRESIDENT_OF',\n",
" 'PRINCIPAL_BENEFICIARY_OF',\n",
" 'PROTECTOR_OF',\n",
" 'RECORDS_REGISTERS_OF',\n",
" 'REGISTERED_ADDRESS',\n",
" 'REGISTER_OF_DIRECTOR_OF',\n",
" 'REGISTER_OF_SHAREHOLDER_OF',\n",
" 'RELATED_ENTITY',\n",
" 'RESERVE_DIRECTOR_OF',\n",
" 'RESIDENT_DIRECTOR_OF',\n",
" 'SAFEKEEPING_OF',\n",
" 'SAME_ADDRESS_AS',\n",
" 'SAME_NAME_AND_REGISTRATION_DATE_AS',\n",
" 'SECRETARY_OF',\n",
" 'SHAREHOLDER_OF',\n",
" 'SHAREHOLDER_THROUGH_JULEX_FOUNDATION_OF',\n",
" 'SIGNATORY_OF',\n",
" 'SIMILAR_NAME_AND_ADDRESS_AS',\n",
" 'SOLE_SHAREHOLDER_OF',\n",
" 'SOLE_SIGNATORY_BENEFICIAL_OWNER_OF',\n",
" 'SOLE_SIGNATORY_OF',\n",
" 'STOCKBROKER_OF',\n",
" 'SUCCESSOR_PROTECTOR_OF',\n",
" 'TAX_ADVISOR_OF',\n",
" 'TREASURER_OF',\n",
" 'TRUSTEE_OF_TRUST_OF',\n",
" 'TRUST_SETTLOR_OF',\n",
" 'UNIT_TRUST_REGISTER_OF',\n",
" 'VICE_PRESIDENT_OF'})"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"graph.relationship_types"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Officer Searches\n",
"\n",
"Some simple searches on Officers."
]
},
{
"cell_type": "code",
"execution_count": 62,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"('a.name': 'Derek Smith')\n",
"('a.name': 'Shane Micheal Smith')\n",
"('a.name': 'Howard James Smith')\n",
"('a.name': 'Michael Smith')\n",
"('a.name': 'Clive Chester Ambler-Smith & Jennifer Anne Ambler-Smith')\n",
"('a.name': 'Clive Chester Ambler-Smith+Jennifer Anne')\n",
"('a.name': 'Bruce & Lindsay Irvine-Smith')\n",
"('a.name': 'Clive Chester Amber Smith + Jennifer Anne Amber Smith')\n",
"('a.name': 'David Smith')\n",
"('a.name': 'Charles Jonathan Smith')\n"
]
}
],
"source": [
"#Find officers whose name partially string matches the supplied name\n",
"name='Smith'\n",
"for r in graph.run(\"MATCH (a:Officer) WHERE a.name CONTAINS '{}' RETURN a.name LIMIT 10\".format(name)):\n",
" print(r)"
]
},
{
"cell_type": "code",
"execution_count": 99,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"('a': (c8d59aa:Officer {icij_id:\"FFA4727CAD8680B2F2617CAF2E0E06F6\",name:\"Derek Smith\",node_id:\"12012101\",sourceID:\"Panama Papers\",valid_until:\"The Panama Papers data is current through 2015\"}))\n"
]
}
],
"source": [
"#Return the fill record for officers whose name partially string matches the supplied name\n",
"#Make case insenstive by casting all parts to uppercase\n",
"exact_name='David Smith'\n",
"for r in graph.run(\"MATCH (a:Officer) WHERE UPPER(a.name) ='{}' RETURN a LIMIT 10\".format(name.upper())):\n",
" print(r)"
]
},
{
"cell_type": "code",
"execution_count": 63,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"(f34eb96:Officer {countries:\"Hong Kong\",country_codes:\"HKG\",icij_id:\"4F707CA3D79C9D37AB009C56AB40889E\",name:\"David Smith\",node_id:\"12101653\",sourceID:\"Panama Papers\",valid_until:\"The Panama Papers data is current through 2015\"})"
]
},
"execution_count": 63,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"p=graph.find_one('Officer','name',exact_name)\n",
"p"
]
},
{
"cell_type": "code",
"execution_count": 80,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(f34eb96)-[:REGISTERED_ADDRESS]->(c91019d)\n",
"(f34eb96:Officer {countries:\"Hong Kong\",country_codes:\"HKG\",icij_id:\"4F707CA3D79C9D37AB009C56AB40889E\",name:\"David Smith\",node_id:\"12101653\",sourceID:\"Panama Papers\",valid_until:\"The Panama Papers data is current through 2015\"})\n",
"REGISTERED_ADDRESS\n",
"(c91019d:Address {address:\"19/F; Shiu Fung Hong Building; 239-241 Wing Lok Street; Hong Kong\",countries:\"Hong Kong\",country_codes:\"HKG\",icij_id:\"7EF9C62B86C7A2042760F15FF7D43C75\",node_id:\"14006567\",sourceID:\"Panama Papers\",valid_until:\"The Panama Papers data is current through 2015\"})\n",
"----\n",
"(f34eb96)-[:SHAREHOLDER_OF]->(b0574b0)\n",
"(f34eb96:Officer {countries:\"Hong Kong\",country_codes:\"HKG\",icij_id:\"4F707CA3D79C9D37AB009C56AB40889E\",name:\"David Smith\",node_id:\"12101653\",sourceID:\"Panama Papers\",valid_until:\"The Panama Papers data is current through 2015\"})\n",
"SHAREHOLDER_OF\n",
"(b0574b0:Entity {address:\"CAPITAL CORPORATE SERVICES LIMITED UNIT F; 7 FLOOR; CNT TOWER 338 HENNESSY ROAD WANCHAI; HONG KONG\",countries:\"Hong Kong\",country_codes:\"HKG\",ibcRUC:\"37837\",incorporation_date:\"10-DEC-1990\",internal_id:\"502746\",jurisdiction:\"BVI\",jurisdiction_description:\"British Virgin Islands\",name:\"VINA OSCAR HOTEL LIMITED\",node_id:\"10197522\",original_name:\"VINA OSCAR HOTEL LIMITED\",service_provider:\"Mossack Fonseca\",sourceID:\"Panama Papers\",status:\"Active\",valid_until:\"The Panama Papers data is current through 2015\"})\n",
"----\n"
]
}
],
"source": [
"#Find what things a person is connected to at the node level as a parent\n",
"for c in graph.match(start_node=p):\n",
" print(c)\n",
" print(c.start_node())\n",
" print(c.type())\n",
" print(c.end_node())\n",
" print('----')"
]
},
{
"cell_type": "code",
"execution_count": 43,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"('o': (c8d59aa:Officer {icij_id:\"FFA4727CAD8680B2F2617CAF2E0E06F6\",name:\"Derek Smith\",node_id:\"12012101\",sourceID:\"Panama Papers\",valid_until:\"The Panama Papers data is current through 2015\"}), 'e': (e9ff007:Entity {address:\"INFANTE & PEREZ ALMILLANO CL. 50 Y 74 SAN FRANCISCO; PH 909; PISOS 12 Y 14 APARTADO POSTAL 0830-00142; Z; PANAMA\",countries:\"Panama\",country_codes:\"PAN\",incorporation_date:\"09-JAN-2008\",internal_id:\"6036924\",jurisdiction:\"BVI\",jurisdiction_description:\"British Virgin Islands\",name:\"PORTONES DEL MAR INVESTORS INC.\",node_id:\"10208791\",original_name:\"PORTONES DEL MAR INVESTORS INC.\",service_provider:\"Mossack Fonseca\",sourceID:\"Panama Papers\",status:\"Active\",valid_until:\"The Panama Papers data is current through 2015\"}))\n"
]
}
],
"source": [
"#What companies is the person associated with in more detail for a specified relationship type?\n",
"q='''\n",
"MATCH (o:Officer), (e:Entity)\n",
"WHERE o.name='{}' AND (o)-[:SHAREHOLDER_OF]-(e)\n",
"\n",
"RETURN o,e LIMIT 10\n",
"'''.format(name)\n",
"\n",
"for r in graph.run(q):\n",
" print(r)"
]
},
{
"cell_type": "code",
"execution_count": 46,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"(e9ff007:Entity {address:\"INFANTE & PEREZ ALMILLANO CL. 50 Y 74 SAN FRANCISCO; PH 909; PISOS 12 Y 14 APARTADO POSTAL 0830-00142; Z; PANAMA\",countries:\"Panama\",country_codes:\"PAN\",incorporation_date:\"09-JAN-2008\",internal_id:\"6036924\",jurisdiction:\"BVI\",jurisdiction_description:\"British Virgin Islands\",name:\"PORTONES DEL MAR INVESTORS INC.\",node_id:\"10208791\",original_name:\"PORTONES DEL MAR INVESTORS INC.\",service_provider:\"Mossack Fonseca\",sourceID:\"Panama Papers\",status:\"Active\",valid_until:\"The Panama Papers data is current through 2015\"})"
]
},
"execution_count": 46,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"exact_entity_name='PORTONES DEL MAR INVESTORS INC.'\n",
"e=graph.find_one('Entity','name',exact_entity_name)\n",
"e"
]
},
{
"cell_type": "code",
"execution_count": 54,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(b70503d)-[:INTERMEDIARY_OF]->(e9ff007)\n",
"(e6242b5)-[:SHAREHOLDER_OF]->(e9ff007)\n",
"(ff1d840)-[:SHAREHOLDER_OF]->(e9ff007)\n",
"(cfa8ea7)-[:SHAREHOLDER_OF]->(e9ff007)\n",
"(db2f19c)-[:SHAREHOLDER_OF]->(e9ff007)\n",
"(c59a7b6)-[:SHAREHOLDER_OF]->(e9ff007)\n",
"(a06968a)-[:SHAREHOLDER_OF]->(e9ff007)\n",
"(fb1cdaa)-[:SHAREHOLDER_OF]->(e9ff007)\n",
"(f5d83c5)-[:SHAREHOLDER_OF]->(e9ff007)\n",
"(c99c3f7)-[:SHAREHOLDER_OF]->(e9ff007)\n"
]
}
],
"source": [
"#Find what things a company is connected to at the node level - company is an \"end_node\"\n",
"for c in graph.match(end_node=e,limit=10):\n",
" print(c)"
]
},
{
"cell_type": "code",
"execution_count": 105,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"('a': (e9ff007:Entity {address:\"INFANTE & PEREZ ALMILLANO CL. 50 Y 74 SAN FRANCISCO; PH 909; PISOS 12 Y 14 APARTADO POSTAL 0830-00142; Z; PANAMA\",countries:\"Panama\",country_codes:\"PAN\",incorporation_date:\"09-JAN-2008\",internal_id:\"6036924\",jurisdiction:\"BVI\",jurisdiction_description:\"British Virgin Islands\",name:\"PORTONES DEL MAR INVESTORS INC.\",node_id:\"10208791\",original_name:\"PORTONES DEL MAR INVESTORS INC.\",service_provider:\"Mossack Fonseca\",sourceID:\"Panama Papers\",status:\"Active\",valid_until:\"The Panama Papers data is current through 2015\"}))\n"
]
}
],
"source": [
"iid=e['internal_id']\n",
"for r in graph.run(\"MATCH (a:Entity) WHERE a.internal_id ='{}' RETURN a LIMIT 10\".format(iid)):\n",
" print(r)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.5.1"
},
"widgets": {
"state": {},
"version": "1.1.2"
}
},
"nbformat": 4,
"nbformat_minor": 0
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment