Skip to content

Instantly share code, notes, and snippets.

@jexp
Last active August 7, 2018 22:17
Show Gist options
  • Save jexp/90f590ad1d125284bc35223c86bebd33 to your computer and use it in GitHub Desktop.
Save jexp/90f590ad1d125284bc35223c86bebd33 to your computer and use it in GitHub Desktop.
Neo4j Import of TrumpWorld from Buzzfeed https://www.buzzfeed.com/johntemplon/help-us-map-trumpworld

The TrumpWorld Graph

Importing the Buzzfeed TrumpWorld Dataset into Neo4j

I came across this tweet by Sanchez Castro (@SCHZCAS) which I’m more than happy to support.

Hey @neo4j please do this once again!!! #panamapapers #neo4j
https://t.co/B8pjxNKCyA

— sanchezcastro (@SCHZCAS) January 15, 2017

As part of the Buzzfeed article Help Us Map TrumpWorld, the four investigative journalists, John Templon, Alex Campbell, Anthony Cormier, and Jeremy Singer-Vine asked the public to help them map and analyze the data that they investigated, confirmed and published:

Now we are asking the public to use our data to find connections we may have missed, and to give us context we don’t currently understand. We hope you will help us — and the public — learn more about TrumpWorld and how this unprecedented array of businesses might affect public policy.

sub buzz 31493 1484333437 1

Setup with Local Database

If you want to work with this data locally, please download and install Neo4j 3.1 or use a blank Neo4j Sandbox (of course you can also explore the already imported Trumpworld sandbox).

Looking at the Data

The Buzzfeed article linked int he tweet points to a Google Spreadsheet with collected and verified/researched data of relationships of Trump organizations to other organizations, in total we have 770 organizations (as of today) with 611 relationships. The same data is available as a CSV and GraphML file in their GitHub Repository.

The data is fortunately available as a public google doc, so we can use our old trick of loading the CSV-Download URL with LOAD CSV into Neo4j.

Let’s have a look at the data first, here as a sample:

WITH
'https://docs.google.com/spreadsheets/u/1/d/1Z5Vo5pbvxKJ5XpfALZXvCzW26Cl4we3OaN73K9Ae5Ss/export?format=csv&gid=1996904412' AS url
LOAD CSV WITH HEADERS FROM url AS row
RETURN row.`Entity A`,row.`Entity A Type`, row.`Entity B`,row.`Entity B Type`,row.Connection, row.`Source(s)`
LIMIT 5
╒════════════════════════════════════╤═════════════════════╤═══════════════════════════════╤═════════════════════╤════════════════╤══════════════════════════════════════════════════════════════════════════════════════╕
│"row.`Entity A`"                    │"row.`Entity A Type`"│"row.`Entity B`"               │"row.`Entity B Type`"│"row.Connection"│"row.`Source(s)`"                                                                     │
╞════════════════════════════════════╪═════════════════════╪═══════════════════════════════╪═════════════════════╪════════════════╪══════════════════════════════════════════════════════════════════════════════════════╡
│"4 SHADOW TREE LANE MEMBER CORP."   │"Organization"       │"4 SHADOW TREE LANE LLC"       │"Organization"       │"Ownership"     │"https://www.documentcloud.org/documents/2838696-Trump-2016-Financial-Disclosure.html"│
├────────────────────────────────────┼─────────────────────┼───────────────────────────────┼─────────────────────┼────────────────┼──────────────────────────────────────────────────────────────────────────────────────┤
│"40 WALL DEVELOPMENT ASSOCIATES LLC"│"Organization"       │"40 WALL STREET LLC"           │"Organization"       │"Ownership"     │"https://www.documentcloud.org/documents/2838696-Trump-2016-Financial-Disclosure.html"│
├────────────────────────────────────┼─────────────────────┼───────────────────────────────┼─────────────────────┼────────────────┼──────────────────────────────────────────────────────────────────────────────────────┤
│"40 WALL STREET LLC"                │"Organization"       │"40 WALL STREET COMMERCIAL LLC"│"Organization"       │"Ownership"     │"https://www.documentcloud.org/documents/2838696-Trump-2016-Financial-Disclosure.html"│
├────────────────────────────────────┼─────────────────────┼───────────────────────────────┼─────────────────────┼────────────────┼──────────────────────────────────────────────────────────────────────────────────────┤
│"40 WALL STREET MEMBER CORP."       │"Organization"       │"40 WALL STREET LLC"           │"Organization"       │"Ownership"     │"https://www.documentcloud.org/documents/2838696-Trump-2016-Financial-Disclosure.html"│
├────────────────────────────────────┼─────────────────────┼───────────────────────────────┼─────────────────────┼────────────────┼──────────────────────────────────────────────────────────────────────────────────────┤
│"401 MEZZ VENTURE LLC"              │"Organization"       │"401 NORTH WABASH VENTURE LLC" │"Organization"       │"Ownership"     │"https://www.documentcloud.org/documents/2838696-Trump-2016-Financial-Disclosure.html"│
└────────────────────────────────────┴─────────────────────┴───────────────────────────────┴─────────────────────┴────────────────┴──────────────────────────────────────────────────────────────────────────────────────┘

The most frequently mentioned organizations found by this query:

WITH
'https://docs.google.com/spreadsheets/u/1/d/1Z5Vo5pbvxKJ5XpfALZXvCzW26Cl4we3OaN73K9Ae5Ss/export?format=csv&gid=1996904412' AS url
LOAD CSV WITH HEADERS FROM url AS row
WITH row
WHERE row.`Entity A Type` = 'Organization' AND row.`Entity B Type` = 'Organization'
UNWIND  [row.`Entity A`, row.`Entity B`] AS org
RETURN org, count(*)
ORDER BY count(*) DESC LIMIT 10

We see the usual suspect on top.

╒══════════════════════════════╤══════════╕
│"org"                         │"count(*)"│
╞══════════════════════════════╪══════════╡
│"THRIVE CAPITAL"              │"84"      │
├──────────────────────────────┼──────────┤
│"MERCER FAMILY FOUNDATION"    │"41"      │
├──────────────────────────────┼──────────┤
│"40 WALL STREET LLC"          │"40"      │
├──────────────────────────────┼──────────┤
│"DJT HOLDINGS LLC"            │"35"      │
├──────────────────────────────┼──────────┤
│"KUSHNER COMPANIES"           │"30"      │
├──────────────────────────────┼──────────┤
│"TRUMP HOTELS & CASINO RESORTS│"28"      │
│, INC."                       │          │
├──────────────────────────────┼──────────┤
│"TRUMP TOWER COMMERCIAL LLC"  │"21"      │
├──────────────────────────────┼──────────┤
│"TRUMP ORGANIZATION LLC"      │"18"      │
├──────────────────────────────┼──────────┤
│"THE TRUMP ORGANIZATION, INC."│"13"      │
├──────────────────────────────┼──────────┤
│"TTTT VENTURE LLC"            │"8"       │
└──────────────────────────────┴──────────┘

What kind of relationships exist in the data:

WITH
'https://docs.google.com/spreadsheets/u/1/d/1Z5Vo5pbvxKJ5XpfALZXvCzW26Cl4we3OaN73K9Ae5Ss/export?format=csv&gid=1996904412' AS url
LOAD CSV WITH HEADERS FROM url AS row
RETURN row.Connection AS type, count(*)
ORDER BY count(*) DESC
╒═══════════════════════════════════════════════════════...═╤══════════╕
│"type"                                                 ... │"count(*)"│
╞═══════════════════════════════════════════════════════...═╪══════════╡
│"President (as of 2016 FEC filing)"                    ... │"475"     │
├───────────────────────────────────────────────────────...─┼──────────┤
│"Ownership"                                            ... │"323"     │
├───────────────────────────────────────────────────────...─┼──────────┤
│"Director"                                             ... │"126"     │
├───────────────────────────────────────────────────────...─┼──────────┤
│"Investor"                                             ... │"120"     │
├───────────────────────────────────────────────────────...─┼──────────┤
│"Reported member"                                      ... │"103"     │
├───────────────────────────────────────────────────────...─┼──────────┤
│"Former director"                                      ... │"66"      │
....                                                    ...
│"\"Kushner and Cui have had an extensive ongoing dialog...i│"1"       │
│te House official,\" per Bloomberg"                    ... │          │
├───────────────────────────────────────────────────────...─┼──────────┤
│"Marino served as Barry's law clerk"                   ... │"1"       │
├───────────────────────────────────────────────────────...─┼──────────┤
│"Former director of African American outreach"         ... │"1"       │
├───────────────────────────────────────────────────────...─┼──────────┤
│"Former director of advance"                           ... │"1"       │
├───────────────────────────────────────────────────────...─┼──────────┤
│"Executive vice president and special counsel to Donald... │"1"       │
├───────────────────────────────────────────────────────...─┼──────────┤
│"Ziya \"awarded a series of multimillion-dollar contrac... │"1"       │
├───────────────────────────────────────────────────────...─┼──────────┤
│"Former deputy director member relations"              ... │"1"       │
└───────────────────────────────────────────────────────...─┴──────────┘

While some of them like Ownership, or Investor are straightforward others (Ziya \"awarded a series of multimillion-dollar contracts to Azarpassillo,\" per The New Yorker) are very specific, and probably not a good choice for relationship-type to query on.

So we have two options, one could be to use a generic relationship and put all the Connection information into a property, or alternatively we do some cleanup/unification and have a richer set of relationships.

Simple, Direct Data Import

We look at the simpler variant here, to quickly get results. THe unification approach is demonstrated here, for those of you that want to have a bit more interesting graph model.

@jexp
Copy link
Author

jexp commented Jan 28, 2017

Some more queries to reproduce what @kimalbrecht did in his graph visualization of the TrumpWorld graph.
See: http://trump.kimalbrecht.com/

Connections to an organization sorted by connectedness of other side

MATCH (n:Organization {name:"THE TRUMP ORGANIZATION, INC."})-[r]-(o)
RETURN r.connection, o.name, head(labels(o)) as type, size( (o)--() ) as degree
ORDER BY degree DESC LIMIT 25
╒════════════════════════════════════════════════════════════════════════════════════════════════════╤═════════════════════════════════════════════════╤══════════════╤════════╕
│"r.connection"                                                                                      │"o.name"                                         │"type"        │"degree"│
╞════════════════════════════════════════════════════════════════════════════════════════════════════╪═════════════════════════════════════════════════╪══════════════╪════════╡
│"Chairman and president"                                                                            │"DONALD J. TRUMP"                                │"Person"      │"632"   │
├────────────────────────────────────────────────────────────────────────────────────────────────────┼─────────────────────────────────────────────────┼──────────────┼────────┤
│"Will co-run company after Jan. 20, 2017"                                                           │"DONALD TRUMP JR."                               │"Person"      │"24"    │
├────────────────────────────────────────────────────────────────────────────────────────────────────┼─────────────────────────────────────────────────┼──────────────┼────────┤
│"Will co-run company after Jan. 20, 2017"                                                           │"ERIC TRUMP"                                     │"Person"      │"6"     │
├────────────────────────────────────────────────────────────────────────────────────────────────────┼─────────────────────────────────────────────────┼──────────────┼────────┤
│"Vice president, assistant to the president"                                                        │"RHONA GRAFF RICCIO"                             │"Person"      │"4"     │
├────────────────────────────────────────────────────────────────────────────────────────────────────┼─────────────────────────────────────────────────┼──────────────┼────────┤
│"Bid on the Old Post Office together"                                                               │"COLONY CAPITAL, INC."                           │"Organization"│"4"     │
├────────────────────────────────────────────────────────────────────────────────────────────────────┼─────────────────────────────────────────────────┼──────────────┼────────┤
│"The chamber said it had \"signed formal agreements\" with the Trump Organization to \"jointly servi│"RUSSIAN-AMERICAN CHAMBER OF COMMERCE IN THE USA"│"Organization"│"4"     │
│ce the Russian clients’ commercial, residential and industrial real-estate needs,” in an April 2009 │                                                 │              │        │
│newsletter, according to the Financial Times"                                                       │                                                 │              │        │
├────────────────────────────────────────────────────────────────────────────────────────────────────┼─────────────────────────────────────────────────┼──────────────┼────────┤
│"\"Senior Advisor to Donald Trump\""                                                                │"FELIX SATER"                                    │"Person"      │"3"     │
├────────────────────────────────────────────────────────────────────────────────────────────────────┼─────────────────────────────────────────────────┼──────────────┼────────┤
│"Chief Legal Officer"                                                                               │"JASON D. GREENBLATT"                            │"Person"      │"3"     │
├────────────────────────────────────────────────────────────────────────────────────────────────────┼─────────────────────────────────────────────────┼──────────────┼────────┤
│"CFO"                                                                                               │"ALLEN WEISSELBERG"                              │"Person"      │"3"     │
├────────────────────────────────────────────────────────────────────────────────────────────────────┼─────────────────────────────────────────────────┼──────────────┼────────┤
│"Ridloff \"spent eight months in 'acquisitions and finance',\" per the Financial Times"             │"DANIEL RIDLOFF"                                 │"Person"      │"3"     │
├────────────────────────────────────────────────────────────────────────────────────────────────────┼─────────────────────────────────────────────────┼──────────────┼────────┤
│"Business partner"                                                                                  │"TONY TIAH THEE KIAN"                            │"Person"      │"2"     │
├────────────────────────────────────────────────────────────────────────────────────────────────────┼─────────────────────────────────────────────────┼──────────────┼────────┤
│"Former EVP"                                                                                        │"RUSSELL FLICKER"                                │"Person"      │"2"     │
├────────────────────────────────────────────────────────────────────────────────────────────────────┼─────────────────────────────────────────────────┼──────────────┼────────┤
│"Executive vice president and special counsel to Donald J. Trump"                                   │"MICHAEL D. COHEN"                               │"Person"      │"2"     │
├────────────────────────────────────────────────────────────────────────────────────────────────────┼─────────────────────────────────────────────────┼──────────────┼────────┤
│"COO"                                                                                               │"MATTHEW CALAMARI"                               │"Person"      │"2"     │
├────────────────────────────────────────────────────────────────────────────────────────────────────┼─────────────────────────────────────────────────┼──────────────┼────────┤
│"Director of Surveillance"                                                                          │"MATTHEW CALAMARI, JR."                          │"Person"      │"2"     │
└────────────────────────────────────────────────────────────────────────────────────────────────────┴─────────────────────────────────────────────────┴──────────────┴────────┘

Virtual graph of the above

match (n:Organization {name:"THE TRUMP ORGANIZATION, INC."})-[r]-(o)
WITH n, r.connection as con, o, size( (o)--() ) as deg WHERE deg > 1
call apoc.create.vRelationship(n,con,{degree:deg},o) yield rel
RETURN n,rel,o

@jexp
Copy link
Author

jexp commented Jan 29, 2017

Creating a virtual (projection) graph result

Mark higher connected organizations with a 'More' label

match (n:Organization {name:"THE TRUMP ORGANIZATION, INC."})-[r]-(o)
WITH n, r.connection as con, labels(o) as labels, 
     collect(o.name) as names, sum(size((o)--())) as degrees, count(*) as c
call apoc.create.vNode(labels + case when degrees > c then ['More'] else [] end, 
                       {name:names, count:c ,degrees:degrees}) yield node
call apoc.create.vRelationship(n,con,{count:c},node) yield rel
RETURN n,rel,node

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment