@Orbifold
Last active Sep 21, 2019
The wonderful Stellargraph shows explicitly how node2vec works via a random walk on graphs to generate 'word' sequences one can use for word2vec.
{
"cells": [
{
"cell_type": "markdown",
"source": [
"# Node2Vec embedding\n",
"\n",
"Embedding of nodes happens via word2vec by means of a smart trick: using random walks over the graph to generate 'word' sequences.\n",
"\n",
"Stellargraph has its own direct method to perform the embedding, but the intermediate steps show the process more clearly. So, below we generate the node2vec embedding via an explicit walk and show how well it separates the two communities."
],
"metadata": {}
},
{
"cell_type": "code",
"source": [
"import networkx as nx\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"from sklearn.manifold import TSNE\n",
"%matplotlib inline"
],
"outputs": [],
"execution_count": 35,
"metadata": {
"collapsed": false,
"outputHidden": false,
"inputHidden": false
}
},
{
"cell_type": "markdown",
"source": [
"We'll use the karate club graph to demonstrate the process. Its nodes fall into two sets that are well separated according to the 'club' property."
],
"metadata": {}
},
{
"cell_type": "code",
"source": [
"g_nx = nx.karate_club_graph()\n",
"cols = [\"green\" if g_nx.nodes[n][\"club\"]=='Officer' else \"orange\" for n in g_nx.nodes()]\n",
"nx.draw(g_nx, node_color=cols)"
],
"outputs": [
{
"output_type": "stream",
"name": "stderr",
"text": [
"/Users/swa/conda/lib/python3.7/site-packages/networkx/drawing/nx_pylab.py:611: MatplotlibDeprecationWarning: isinstance(..., numbers.Number)\n",
" if cb.is_numlike(alpha):\n"
]
},
{
"output_type": "display_data",
"data": {
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
],
"image/png": [
"\n"
]
},
"metadata": {}
}
],
"execution_count": 42,
"metadata": {
"collapsed": false,
"outputHidden": false,
"inputHidden": false
}
},
{
"cell_type": "markdown",
"source": [
"From this graph we create a Stellargraph and perform biased random walks on it. Each walk yields a 'sentence' whose 'words' are the string values of the visited node indices."
],
"metadata": {}
},
{
"cell_type": "code",
"source": [
"from stellargraph.data import BiasedRandomWalk\n",
"from stellargraph import StellarGraph\n",
"from gensim.models import Word2Vec\n",
"\n",
"rw = BiasedRandomWalk(StellarGraph(g_nx))\n",
"\n",
"walks = rw.run(\n",
"    nodes=list(g_nx.nodes()),  # root nodes\n",
"    length=100,  # maximum length of a random walk\n",
"    n=10,  # number of random walks per root node\n",
"    p=0.5,  # defines the (unnormalised) probability, 1/p, of returning to the source node\n",
"    q=2.0  # defines the (unnormalised) probability, 1/q, of moving away from the source node\n",
")\n",
"# Word2Vec expects string tokens, so every node id becomes a 'word'.\n",
"walks = [list(map(str, walk)) for walk in walks]\n",
"# Skip-gram (sg=1) model; argument names as in gensim 3.x (gensim 4 renamed size to vector_size and iter to epochs).\n",
"model = Word2Vec(walks, size=128, window=5, min_count=0, sg=1, workers=2, iter=1)"
],
"outputs": [
{
"output_type": "stream",
"name": "stderr",
"text": [
"WARNING: Logging before flag parsing goes to stderr.\n",
"W0920 20:17:43.064015 4718831040 base_any2vec.py:1386] under 10 jobs per worker: consider setting a smaller `batch_words' for smoother alpha decay\n"
]
}
],
"execution_count": 11,
"metadata": {
"collapsed": false,
"outputHidden": false,
"inputHidden": false
}
},
{
"cell_type": "markdown",
"source": [
"For instance, the embedding of node 29 is:"
],
"metadata": {}
},
{
"cell_type": "code",
"source": [
"model.wv['29']"
],
"outputs": [
{
"output_type": "execute_result",
"execution_count": 12,
"data": {
"text/plain": [
"array([ 0.0283457 , 0.06906749, -0.09740856, 0.08761664, 0.0240158 ,\n",
" -0.04252268, 0.05366189, 0.12255755, -0.14192946, -0.12441556,\n",
" 0.14022443, 0.16821992, 0.01899681, 0.02525605, -0.129657 ,\n",
" -0.00075872, -0.10963597, -0.24603637, 0.14481993, 0.04069758,\n",
" -0.07761958, -0.20790713, 0.02872016, 0.02599382, -0.11296149,\n",
" -0.09854981, -0.04557071, 0.24646853, 0.00121522, 0.11991252,\n",
" 0.07983409, -0.08048143, 0.11442558, 0.21689218, -0.04602851,\n",
" -0.03846204, -0.14180224, -0.13806581, -0.2756952 , 0.21888009,\n",
" -0.2403047 , 0.10116432, -0.05692552, 0.29991767, 0.15323073,\n",
" -0.07367025, -0.22721468, 0.04121917, -0.1474215 , -0.12874171,\n",
" 0.11634433, 0.03715171, -0.02997661, 0.1846444 , 0.09240112,\n",
" -0.11611677, -0.05956773, 0.10450908, -0.0748686 , 0.0841886 ,\n",
" 0.00648758, -0.23513122, -0.03379333, -0.03147478, 0.15027955,\n",
" -0.2418554 , 0.03921422, 0.13003238, 0.1925577 , -0.02482169,\n",
" -0.17496283, -0.1193176 , 0.21379891, -0.06565908, -0.12362305,\n",
" -0.12427699, -0.1113233 , 0.0647103 , 0.2713958 , 0.28840208,\n",
" -0.08871546, 0.27347487, -0.15478794, -0.01204949, -0.07479579,\n",
" 0.05519324, -0.07719369, 0.37456694, -0.04905338, 0.10100004,\n",
" -0.26735783, 0.1140677 , 0.04411978, 0.0743799 , -0.20635152,\n",
" -0.1354365 , -0.09291358, 0.02883326, 0.1929327 , -0.03071002,\n",
" 0.20335366, -0.09941982, 0.04967964, 0.12711474, -0.09630557,\n",
" 0.07571112, -0.10136695, -0.13535073, -0.07481653, 0.01686542,\n",
" 0.0458274 , -0.15950125, -0.12564494, 0.09170009, 0.11469213,\n",
" -0.12933742, -0.07965938, 0.1902986 , 0.21146712, 0.08862376,\n",
" -0.08914143, -0.24147275, 0.02615149, 0.06191884, 0.04629293,\n",
" -0.03686432, 0.28888953, 0.06754036], dtype=float32)"
]
},
"metadata": {}
}
],
"execution_count": 12,
"metadata": {
"collapsed": false,
"outputHidden": false,
"inputHidden": false
}
},
{
"cell_type": "markdown",
"source": [
"To visualize the embedding we have to reduce its 128 dimensions to two. This is most easily done via t-SNE."
],
"metadata": {}
},
{
"cell_type": "code",
"source": [
"# Retrieve node embeddings and the corresponding club labels\n",
"node_ids = model.wv.index2word  # list of node IDs, in vocabulary order (gensim 4: index_to_key)\n",
"node_embeddings = model.wv.vectors  # numpy.ndarray of shape (number of nodes, embedding dimensionality)\n",
"node_targets = [g_nx.nodes[int(node_id)]['club'] for node_id in node_ids]"
],
"outputs": [],
"execution_count": 18,
"metadata": {
"collapsed": false,
"outputHidden": false,
"inputHidden": false
}
},
{
"cell_type": "code",
"source": [
"# Apply t-SNE transformation on node embeddings\n",
"tsne = TSNE(n_components=2)\n",
"node_embeddings_2d = tsne.fit_transform(node_embeddings)"
],
"outputs": [],
"execution_count": 21,
"metadata": {
"collapsed": false,
"outputHidden": false,
"inputHidden": false
}
},
{
"cell_type": "code",
"source": [
"alpha=0.9\n",
"label_map = { l: i for i, l in enumerate(np.unique(node_targets))}\n",
"node_colours = [ label_map[target] for target in node_targets]\n",
"\n",
"plt.figure(figsize=(10,8))\n",
"plt.scatter(node_embeddings_2d[:,0], \n",
" node_embeddings_2d[:,1], \n",
" c=node_colours, cmap=\"jet\", alpha=alpha)"
],
"outputs": [
{
"output_type": "execute_result",
"execution_count": 28,
"data": {
"text/plain": [
"<matplotlib.collections.PathCollection at 0x1463cfc50>"
]
},
"metadata": {}
},
{
"output_type": "display_data",
"data": {
"text/plain": [
"<Figure size 720x576 with 1 Axes>"
],
"image/png": [
"iVBORw0KGgoAAAANSUhEUgAAAmIAAAHVCAYAAABScZe2AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDMuMC4yLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvOIA7rQAAIABJREFUeJzt3XmYVOWZ/vH7qaUXWZUdBQFFxRVCBxdIFFdQFOOuiZpool7KODNxRk3yi0zGMdFJzKKJcSWamUHFRCIRNAoYTdwbJSqLioIiIouIQNNdyznv749usVtBULr7qeX7uS6urnrrVHProbrvOu97TlkIQQAAAGh/Ce8AAAAA5YoiBgAA4IQiBgAA4IQiBgAA4IQiBgAA4IQiBgAA4IQiBgAA4IQiBgAA4IQiBgAA4CTlHWBbdO/ePQwYMMA7BgAAwFbNmTNndQihx7ZsWxRFbMCAAaqtrfWOAQAAsFVm9ta2bsvUJAAAgBOKGAAAgBOKGAAAgBOKGAAAgBOKGAAAgBOKGAAAgBOKGAAAgBOKGAAAgBOKGAAAgBOKGAAAgBOKGAAAgBOKGAAAgBOKGAAAgBOKGAAAgBOKGIBttmrBAv35wgt1+0EH6YHzz9fKV17xjgQARS3lHQBAcXh3zhzdPW6c8g0NslRK77/6ql6fPl2nT52qfgcf7B0PAIoSR8QAbJPZ3/++cg0NSlVXK5lOK1VdrSib1cwrrvCOBgBFiyNiKHprlyzRkscfV2WnTtrt6KNV0bGjd6SS9O6cOUpVVrYYS1ZWasU//qEQx7IE7+sA4POiiKGoPXH11Xr2hhskM1kioUQ6rVPvvVf9DjnEO1rJqd5pJ9V/8IGSzQpXiCJVde1KCQOAL4ifnihab//973r217+WJZNKpFKyREL5+nr98ayzFGWz3vFKzogJE2RqLF+SFOJYCkE1F1/sGwwAihhFDEXr5cmTFWUyLY7GJCsqFGUyevvJJx2TlaYREyZo+EUXbbof4lgHfPObOuSyyxxTAUBxY2oSRSvK5bb4WJzPt2OS8mCJhA6/+mqN/Pd/14dLl6rzLruoqksX71gAUNQ4IoaiNeSkk5SsqFAIYdNY3FTOWCPWdio7d1bPffahhAFAK6CIoWjtfswx2mPcOCmOlaurU75pmvK4m25SRYcO3vEAANgqpiZRtCyR0Al33KGlTz6pN2fOVGWnThpyyinquuuu3tGAVlW/Zo1emz5dubo6DTz8cHXbYw/vSABaCUUMRc3M1H/UKPUfNco7CtAm3pw1S1O/8Q3FUaQQRbJkUsMvvFCj//M/ZWbe8QBsJ6YmAaBA5TZu1J/OPVdRPq9EKqVkZaUskdCcW27RUs4MBkoCRQwACtTbf/+7QhQpmU5vGrNEQlE2q3lTpjgmA9BaKGIAUKBCHG/xsbjpwroAihtFDAAK1EdrH5tfMy/EsZIVFdrn1FO9YgFoRRQxAChQFR07atxttymRSCjKZJTbuFEhjrX/N76hXQ891DsegFbAWZMAUMD2HDdOO//jH1r4pz8pu2GDBh15pHoPHeodC0AroYgBQIHr2Lu3app9zieA0sHUJAAAgBOKGAAAgBOKGAAAgBOKGAAAgBOKGAAAgBOKGAAAgJNWKWJmNsnMVprZK83GdjKzR83s9aavOzaNm5ndYGaLzOwlM/tSa2QAAAAoNq11ROxOSWM+MXalpFkhhMGSZjXdl6SxkgY3/blA0m9bKQMAAEBRaZUiFkJ4QtKaTwyPl3RX0+27JJ3YbPz3odEzkrqaWZ/WyAEAAFBM2nKNWK8QwvKm2+9J6tV0e2dJS5tt907TWAtmdoGZ1ZpZ7apVq9owJgAAgI92WawfQgiSwud8zq0hhJoQQk2PHj3aKBkAAICftixiKz6acmz6urJpfJmkfs2226VpDAAAoKy0ZRGbJuncptvnSnqg2fg5TWdPHiTpw2ZTmAAAAGUj1RrfxMzulnSYpO5m9o6kiZKulTT
FzM6X9Jak05o2nyHpWEmLJG2U9K3WyAAAAFBsWqWIhRDO3MJDR2xm2yDpktb4ewEAAIoZV9YHAABwQhEDAABwQhEDAABwQhEDAABwQhEDAABwQhEDAABwQhEDAABwQhEDAABwQhEDAABwQhEDAABwQhEDAABwQhEDAABwQhEDAABwQhEDAABwQhEDAABwQhEDAABwQhEDAABwQhFrJSEE7wgAAKDIUMS2QwhBc26/Xb/ec09d17WrbhsxQm88+qh3LAAAUCQoYtvh+d/8RrO/9z1tXLNGqR120Jo33tD9X/+6ljz+uHc0AABQBChiX1Ccz+upn/5UMlMynZaZKVVZqTif199+/GPveAAAoAhQxL6gzLp1ytbVKZFKtRhPpNN6/9VXnVIBAIBiQhH7gio7d1ZFx46K8/kW43Eup+577eWUCgAAFBOK2BeUSKU06sorpRAU5XIKISifySiRSumr/+//eccDAABFgCK2HYZfeKGOuv56dezZU1Emo+577aVT7rlH/UeN8o4GAACKgBXD9a9qampCbW2tdwwAAICtMrM5IYSabdmWI2IAAABOKGIAAABOKGIAAABOKGIAAABOUlvfBEChyKxfr3eeflqJVEr9R41SsqLCOxIAYDtQxIAiseD++zXjkks23U9WVurku+9Wv4MPdkxV+NYuWaLnb7pJ782dq14HHKAvX3yxdhw40DsWAEji8hVAUVi7ZIluGzFCkjZ9rFaUzSpdXa1LFi5URYcOnvEK1oqXX9b/jRmjXH29ZCaFoHR1tc6aMUO9DzjAOx6AEsXlK4ASM2/KFMX5fIvPNk1WVCjK5fTGI484JitsM6+8UtmNG5WqqlKqslKpqiplN27UzCuu8I4GAJIoYkBRyKxfrxBFnxoPcazs+vUOiYrDO08/rVRVVYuxVFWV3nnmGRXDbACA0kcRA4rAbkcfrVRVVYvyEOJYkjTgsMOcUhW+qi5dPlVgQxSpsnNnmZlTKgD4GEUMrj5YvFjP/PKXeupnP9PKefO84xSs/qNGafBxxylEkXIbNypXX68QxzroX/9VXfr3945XsIZfeKFCHG8qrR/dHn7BBc7JAKARi/XhZu6dd+rRyy9XnM8rxLGSFRUaMWGCDr3qKu9oBSnEsRb95S9aOHWqkpWV2veMM9R/5EjvWAUtyuX08L/8i+ZPmdK4pi6b1ZCTT9bYG29UMp32jgegRH2exfoUMbjYsGKFfrvffpI+Pgvwo6MVZz/yiHoPHeoZDyVmw4oV+uDNN7XjwIHq2Lu3dxwAJe7zFDGuIwYXb86cKUktzgK0REJRJqPXHnyQIoZW1bFXL3Xs1cs7BgB8CmvE4MISicbrOn1CkGTJZPsHAtCu4nxedStXKsrlvKMArihicLHb0UfLzBTn85vG4ihSMp3WXiee6JgMQFubc/vtumHwYN2077761YAB+vu11246oQIoNxQxuNihWzeNu/nmTWUsymalONZXr7pKPYYM8Y4HoI3MmzJFs7/3PeXq6pRIpRTlcnr6+uv1zK9+5R0NcMFifbiqW7VKix56SFEup92OOopLMQAl7taaGq1dsqTFB9bH+bzS1dX658WLG5ctAEWOxfooGh169NAB55zjHQNAO1m3bFmLk3SkxnWhmXXrFGWzn/okBKDU8dYDANBueu2/v6JMpsVYnMup8y67KFlZ6ZQK8EMRAwC0m9E/+pGSFRXK1dcrjiLlGxpkiYRGX3MNHzuFskQRAwC0m10OOkhnTZ+uAYcdpsqOHdW3pkan3Huv9jrhBO9ogAvWiAEA2lXfmhqdMXWqdwygIHBEDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAl
FDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAcbHjvPa1euFBRLucdBYCjlHcAACgnG99/X9POP19Ln3xSlkwqXV2tY37+c+31ta95RwPggCNiANCO7v/61/XWE0/IUilZIqHMhg168KKLtPzFF72jAXDQ5kXMzJaY2ctmNtfMapvGdjKzR83s9aavO7Z1jmKzeuFCzZsyRe8884xCCN5xUMZCCHr1z3/W5OOO0x0jR+pvP/mJGtau9Y5VlNYsWqTlL7ygZGWlzEySlEynFWWzqr35Zud0ADy019Tk6BDC6mb3r5Q0K4RwrZld2XT/inbKUtCiXE7Tzj9fix56SJZMSpJ2HDRIZzzwgDr06OGcDuXob9dco2dvuEFxPi9LJPT0woWaf999+ubjj6uyUyfveEWlbuVKJVKb+bGbSGjd0qXtHwiAO6+pyfGS7mq6fZekE51yFJza3/5Wr8+YsWnaQmZavXChZkyY4B0NZWjj6tV69oYbZImEUlVVSlZUKFlZqXVLl+rl//s/73hFp+e++ypEkeIoajFuZhp4xBFOqQB4ao8iFiQ9YmZzzOyCprFeIYTlTbffk9SrHXIUhRcnTZLMNk1bmJmSFRVaPGuWshs2OKdDuXlv7lwlPnpT0EwcRVo8e7ZTquJV2bmzDrniCikE5RsaFGWzymcy6tirl4add553PAAO2mNqclQIYZmZ9ZT0qJktbP5gCCGY2acWQTWVtgskqX///u0QszDk6+s3lbBNzKQQFGWzPqFQtjr07KkQRVIi0fLfZQjq3K+fX7Aidsh3v6seQ4bo+Ztu0sZVq7T7mDEaMWGCqndkqSxQjtq8iIUQljV9XWlmUyWNkLTCzPqEEJabWR9JKzfzvFsl3SpJNTU1ZbNafc8TTtALd9zRYh1J1NCgHvvso+qddnJMhnLUc7/91G2PPbRy3rxNC8yjXE7Jykp96dvf9o5XtAaPHavBY8d6xwBQANp0atLMOphZp49uSzpa0iuSpkk6t2mzcyU90JY5isnIK65Ql379FOfzym3cqCiXU7pjRx376197RysJcT6vF++4Q7/7yld024EH6qmf/lTZujrvWAXLzHTqH/6gnUeMUMjnFUJQZadOOuGOO9RjyBDveABQ9KwtL41gZoMkTW26m5I0OYRwjZl1kzRFUn9Jb0k6LYSwZkvfp6amJtTW1rZZzkKTravTgj/+Ue8884x2GjxY+511ljr2Yhlda3jg/PP16rRpCnG8aaqt5z776OyZM5VMp53TFbZ177yjzPr16jZ48ObP/AMASJLMbE4IoWabti2Ga1SVWxFD21g1f77uPOwwWTK5qYSFEKQ41rhbb9Ve48c7JwQAlILPU8S4sj7Kxrtz5khSi0XnZqZ8Q4OWPvmkVywAQBmjiKFsdOzd+1OXYZCkRDqtLpwBCABwQBFD2Rhw2GHaoXt35RsaFEJQaLqWU6qqSvucfrp3PABAGaKIoWwk02md9eCD6j10aOMZgFGkLv376/SpU9WhZ0/veACAMsSpTygrXQcM0LmzZ2v9u+8qymbVZdddP30BXQAA2glFDGWpU9++3hEAAGBqEgAAwAtFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwEnKOwAAAGgbuVykmTPf1JIla7Xvvj01cmR/JRLmHQvNUMQAAChBy5ev15gx/6cVKzYol4uUSiW1//49NXXqGerYscI7HpowNQkAQAm69NKHtXTph0okTJWVKSUS0gsvvKf//u8nvaOhGYoYAAAlpqEhr9mzF6uiIrlpzMyUTJruuecVx2T4JKYmAQDtbsmStbrxxudUW/uuhgzprks
vPVB7793DO1bJCCEohCDbzHKwfD5u/0DYIooYAKBdLViwSkcd9T/auDGnRML00ksr9Kc/LdQf/nCaRo3q7x2vJFRXpzVyZD89+eRSVVU1/qoPISiKYp188t7O6dAcU5MAgHY1ceJftWFDVlVVKVVUJFVdnVI2G+nf/u0R72gl5cYbj1X37json49VV5dVCNLuu++kH/zgK97R0AxHxAAA7ervf39blZUtf/1UVib16qvvq74+p+rqtFOy0jJgQFe9+OKFmjbtVS1e3Hj5irFjd1c6ndz6k9FuKGIAgHbVrVu13ntvg5LJjwtBHIdNR8jQejp0qNCZZ+7nHQOfgalJAEC7mjDhQEmmOA6SGktYFAWdd95QJZP8WkJ54V88AKBdfec7X9JFFw1XHDee1RdFQSefPEQTJx7mHQ1odxZC8M6wVTU1NaG2ttY7BgCgFa1d26A33lijfv26qGfPDt5xgFZjZnNCCDXbsi1rxAAALrp2rdLw4X29YwCumJoEAABwQhEDUBbWr8/o7bc/5KriAAoKU5MASlpDQ16XXfYX3XfffJk1ns7/k58codNP39c7GgBwRAxAafuXf3lY9947T4mEKZlMaP36jP7pnx7SE0+85R0NAChiAErX2rUNuv/+BUqlEkokGj/9OJ1OKpeL9ItfPO2cDgAoYgBK2KpVdUokbFMJ+0gqldBbb33olAoAPkYRA1Cy+vfvonQ68akF+lEUNHJkP6dUAPAxihiAklVZmdLEiYdKaly0n8/Hqq/Pq2PHCl122SHO6QCAsyYBlLhvf3u4dt65s37xi2e0bNl6jRrVX1dcMVIDBnT1jgYAFDEApW/s2MEaO3awdwwA+BSmJgEAAJxwRAwocCEELX3qKS28/37JTENOPlm7HHSQzGzrTwYAFDSKmKTMunV6t7ZWFZ06qe/w4bIEBwpROGZ9//t68Y47FGWzkqR//P73qrnwQo2++mrnZACA7VX2RezFSZM06/vfbyxfIWiH7t112h//qG577OEdDdDKV17Ri3fcIUsklN5hB0lSiGPV3nKL9vv619V9r72cEwIAtkdZH/pZ/sILmnnllQpx4zWGgqR1y5bp3pNO2jQGeHpz1izFuVyLo7SWSCiOIi2ePdsxGQCgNZR1EfvHXXcpzuWUSDUeGDQzpaqqtHH1ai177jnndKVr1YIFWvTww1q7ZIl3lIKXrq6WJZOfGk8kEkpVVzskAgC0prIuYhvXrFHYzLglEsqsX9/ueUpdZv163X388brz0EP1wHnn6bYvf1nTvvMdRbmcd7SCtecJJyiRTCpu9v8oyuVkyaT2GDfOMRkAoDWUdRHbY9w4JdNphfBxHYujSCGKtPOXv+yYrDQ9evnlWvrUU5uO8FgqpYVTp+q5G290Tla4OvburXG33aZEKtX47zQEJdNpnTBpkjr06OEdDwCwnax5CSlUNTU1oba2ttW/b5TNavK4cXrvxRcV5/MKkpLptEZffbVqLryw1f++chblcvp5376yZLLFeqcol1OHHj10yfz5jukKX2b9ei35619lZhpw2GGq6NjROxIAYAvMbE4IoWZbti3rsyaTFRU668EHNf+Pf9Sr06apqmtXDfvWt7TziBHe0UpOnMspzueVTLX8J2eJhLIbNjilKh6VnTppz+OP944BAGhlZV3EpMYytt+ZZ2q/M8/0jlLS0jvsoF77768VL73UYpF5nM1qNwoGAKBMlfUaMbSvY375S6U7dFCUzSrf0KAom1V1t2469KqrvKMBAOCi7I+Iof30GTZM337mGb14xx1avXCh+o4YoQPOOUc7dOvmHQ0AABcUMbSrzrvsokMnTvSOAQBAQWBqEgAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDABKyMqVdfr
Od6apV6+fqXfvn+nii6drzZp671gAtiDlHQAA0Dqy2UhHH/0/euutD1VZmVQIQffc84rmzFmuJ588T6kU772BQsOrEgBKxEMPva7lyzeoujqlRMKUTCZUWZnUW2+t1WOPLfaOB2AzKGIAUCJeffV91dfnWoyZmbLZSK+99r5TqvIRQtDy5eu1alWddxQUEaYmAaBEDB68k6qr0y3GQgiqqEhq8OBuTqnKw0svrdAFF/xZb7zxgUIIqqnpq1tuGaddd+3qHQ0FjiNiAFAijj12sHr16qD6+rziOCiOgzKZSP36ddHhhw/0jley1qyp17hxk/Xqq+8rlTKl0wk999wyHXfcZOVykXc8FDiKGACUiMrKlB555GyNH7+n8vlYURR0yil766GHvs5C/TZ0333zVV+fV3V1SmYmM1NVVUqrV2/U7NmszcNnc5uaNLMxkn4lKSnp9hDCtV5ZAKBU9O7dUXfeeaJCCJIa14ihbS1d+qGy2UjpdMuyG0VB77673ikVioXLWyQzS0r6jaSxkvaWdKaZ7e2RBQBK0UdHZtD2RozYWVVVqU3lV2pcm5dISMOG9XFMhmLgdax6hKRFIYQ3QwhZSfdIGu+UBQCAL2zs2N21557dlM3GyuUiZbORcrlYhx8+UEOH9vaOhwLnVcR2lrS02f13msY2MbMLzKzWzGpXrVrVruEAANhW6XRS06efpcsuO1i77NJFgwbtqB/9aLR+//uveUdDMytWbNCyZetaHLksBOYRyMxOkTQmhPDtpvtnSzowhDBhc9vX1NSE2tra9owIAABKwJIla/Xtb0/T3Lnvycw0YEBX3Xbb8W16tNLM5oQQarZlW68jYssk9Wt2f5emMQAAgFaRy0U67rjJeuGF5UqnE0qlTG+8sUYnnHB3wXwGq1cRe17SYDMbaGYVks6QNM0pCwAAKEGzZi3W++9vVFVVy0uL1Nfndd99873jSXK6fEUIIW9mEyT9RY2Xr5gUQpjnkQUAAJSm5cvXK4piJRLJFuPZbF7vvPOhU6qW3K4jFkKYIWmG198PwEcmk9cTT7yl+vq8vvKV/tpxx2rvSABK1LBhfWRmCiFsupxLCEFVVWmNGLHzVp7dPvisSQDt5tln39Hpp/9BmUwkKSiOpeuuO1Lf/OZQ72gAStDQob115JGD9Mgjb2waC0Had9+eGjNmd8dkH+MzLwC0i/r6nE477T6tX5+VmTa9S7388ke1YAGXqAHQNu6660T9138drt1330m77tpFl19+iKZPP0vpdHLrT24HHBED0C5mz16sbDZWZeXHP/xSqYTq6/O6555X9KMfjXZMB6BUpdNJXXRRjS66aJuuJtHuOCIGoF3U1+c3eyHFOA6qq8s5JAIAfxQxAO3iK1/prygKiqJ401jjotmkxo4tjLUaANDeKGIA2kWvXh01ceKhCqFxvdjGjTlFUdDYsYM1evRA73gA4II1YgDazYQJIzRyZD/dffcrqqvLavz4vXTkkYOUSJh3NABwQRED0K6GDeujYcP6eMcAgILA1CQAAIATihgAAIATihgAAIATihgAAIATihgAAIATihgAAIATihgAAIATihgAAIATihgAAIATihgAAIATihgAAIATihgAAIATiliZydbVae2SJYqyWe8oAACUvZR3ALSPKJfTYz/8oeb+7neSmZLptEZ973v68sUXe0cDAKBscUSsTDxx9dV64fbbJUmWSCifyeivEydq/h/+4JwMAIDyRRErA1Eupxduv12WTMqSSUlSIpVSiGM9df31zukAAChfFLEykNu4UVEmI0u03N2WSmnD8uVOqUpXZt06rZo/Xw0ffugdBQBQ4ChiZaCyc2d16N1bcS7XYjzOZtW3psYpVekJcazZP/yhbhw8WL8/4gjdOHiwHr3iCsX5vHc0AECBooiVATPTkdde27g2rL5ecT6vfH29UtXVOnTiRO94JeP53/xGc26+ufGOmSyR0IuTJunpn//cNxgAoGBRxMrEnsc
fr9Pvv1+7HnqoOvbqpT3Hj9c5s2ap1377eUcrGc/eeKOCtGkdniUSMjM9f9NNvsEAAAWLy1eUkf6jRqn/qFHeMUpW/Zo1SqRavqQsmVTDBx8oxPGn1ugBAMBvBqCV9B0+XPlMpsVYlMmo1wEHUMIAAJvFbweglRz+4x8rXVWlfH29olxO+fp6JSsqdOR113lHAwAUKIoY0Er6Dh+uc2bP1t6nnaadBg3SXiedpHNmzlS/gw/2jgYAKFCsEQNaUY8hQ3T8Lbd4xwAAFAmOiAEAADihiAEAADhhahIA0G5CCKqtfVePP/6WOneu1Pjxe6pXr47esQA3FDEAQLuI46CLLnpQDzywUA0NkdLphCZOfEz/+78n6YgjBnnHA1wwNQkAaBcPPfS6/vSnhUokTB06pFVRkVQuF+tb33pAmQyfyYryRBEDALSLe++dp1wukpltGvuojD377DLHZIAfpiYB4AuI46CnnlqqBQtWaeDAHTV69AAlk7y3/SzJZEIhbP6xRMI2/wBQ4ihiAPA5rV+f0Ykn3qtXXlmpfD5WOp1Qv35dNH36WerZs4N3vIJ15pn7avr01xTHYVPxymYjdepUqQMP3Nk5HeCDt28A8Dn9+Md/09y57ymRkCork0okTG+8sUaXXfaId7SCdtRRg3T22fsrjoMymUhRFKuyMqX//d+vKZ1OescDXFjY0nHiAlJTUxNqa2u9YwCAJGngwF+qri6nVOrj97JxHJTLxVqx4t9ajOPT5s9fpccfX6IuXao0btwe6ty50jsS0KrMbE4IoWZbtmVqEgA+pygKsk8saTJrLGPF8ObW295799Dee/fwjgEUBN62AcDndOKJeymXi1uMNTREGj16AFNsaHP19TktWbJW9fU57yhoBRQxAPicrrrqUA0Y0FVRFFRXl1UUBfXosYN+/vNjvKOhhIUQdO21f9egQTfooINu16BBv9LVVz+uOOYobDFjahIAPqfu3XfQ00+fr+nTX9fLL6/Q4MHdNH78nurQocI7GkrYrbfO0fXXP61EovFSIFEU64YbnlOXLlW69NIDvePhC2KxPgAARWDIkN9o9eq6FtPfuVyszp0r9cYblzomwyd9nsX6TE0CAFAEVq4756VDAAAPdUlEQVSs+9QZuamUafXqjZwkUsQoYgAAFIEDDuilTCZqMZbJRNp3354tPjYKxYUiBgBAEbjmmsNVUZFUfX1euVys+vq80umkfvKTI7yjYTtQxAAAKAIHH9xPDz30dY0Zs7t69+6oo48epBkzztJXv7qrdzRsB86aBACgSAwb1kd3332ydwy0Io6IAQAAOKGIAQAAOKGIAQAAOKGIAQAAOKGIAQAAOKGIAQAAOKGIAQAAOKGIAQAAOKGIAQAAOKGIAQAAOKGIAQAAOKGIAQAAOKGIAQAAOKGIAQAAOKGIAQAAOKGIAQAAOKGIAQCAsrB8+Xq9/faHCiF4R9kk5R0AAACgLS1Zslbnn/+AXnpphcxM/fp10a23jtPw4X29o3FEDAAAlK5cLtJxx03Wiy++p1QqoWTStHjxBzrxxHu0evVG73gUMQAAULoee2yJ3n9/o6qqUjIzmZmqqlKqr89rypR53vEoYgAAoHQtX75eUfTpNWG5XKylSz90SNQSRQwAAJSs4cP7ykwtFuiHEFRVldLBB/dzTNaIIgYAAErWvvv21Jgxuyufj5XJRMpmI+Vysfbaq7vGjt3dOx5nTQIAgNI2adJ43XnnXP3ud3OVzeZ16qn76OKLv6x0OukdTVZI19LYkpqamlBbW+sdAwAAYKvMbE4IoWZbtmVqEgAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwEmbFTEz+w8zW2Zmc5v+HNvsse+Z2SIze9XMjmmrDAAAAIWsrT9r8hchhJ81HzCzvSWdIWkfSX0lzTSzPUIIURtnAQAAKCgeU5PjJd0TQsiEEBZLWiRphEMOAAAAV21
dxCaY2UtmNsnMdmwa21nS0mbbvNM0BgAAUFa2a2rSzGZK6r2Zh34g6beSrpYUmr5eL+m8z/G9L5B0gST1799/e2ICaAdvv/2hpk9/TSFIY8bsrkGDdtz6kwCgzG1XEQshHLkt25nZbZIebLq7TFK/Zg/v0jT2ye99q6RbJammpiZsT04Abet3v3tRV1wxU1EUKwTpP/7jr/rBD76if/7ng7yjoYytXduge+55RQsWrNIBB/TWqafurU6dKr1jAS1YCG3TccysTwhhedPtf5V0YAjhDDPbR9JkNa4L6ytplqTBn7VYv6amJtTW1rZJTgDb591312vo0JslSalU42qHjwrZk0+ep8GDu3nGQ5l64401Ouqo/9GGDVlls5EqKpLq1q1as2adq759O3nHQ4kzszkhhJpt2bYt14j9t5m9bGYvSRot6V8lKYQwT9IUSfMlPSzpEs6YBIrXww8vUggflzBJSiYTyuViPfjga47JUM6++91H9MEHDUqlEtphh7RSqYTee69OV131mHc0oIU2u3xFCOHsz3jsGknXtNXfDaD9mG3pEVYUtIYQghYsWK1MJq/99uvVovBi8/L5WE888ZaqqpItxisqEpox43WnVMDm8YoGsF3GjNldZo2//D4SRbHS6aTGjdvDMVnxW7BglYYPv1WHH36Xjj12svbY40Y99thi71gFL5EwJZOmT668CUGqrExu/kmAE4oYgO3Sp08n/fSnR0mSMplImUykEKSrrvoq68O2QzYb6fjj79aSJWtl1njkcd26jM46634tW7bOO15BSyRMJ588RLlcpI/WQYcQFEVBZ599gHM6oKW2vrI+gDJw7rlDdfjhAzVjxuuKoqCxY3fXwIFcvmJ7zJr1purqsqqq+vjHdEVFUplMpHvvnafvfvdgx3SF77rrjtJrr72vBQtWK46DpKCRI/vpyitHeUcDWqCIAWgV/fp10YUXbtNJQtgG779frygKSqVaLsLL52OtWLHBKVXx6Nq1SrNnn6vnn39Xb775gfbcs5uGDu0t2/KiRsAFRQwACtDBB+8iSYrjoESisTyEEFRVldJhhw1wTFY8zEwjRuysESP48BYULtaIAUAB2m23nXT22fsrjoMaGvJqaMgrnw+qqemjo47azTsegFbCETEAKFA/+9nR+upXd9Wdd/5D9fU5nXbaPvrGN/bnEhZACaGIAUCBMjONH7+Xxo/fyzsKgDbC2yoAAAAnFDEAAAAnFDEAAAAnFDEAAAAnFDEAAAAnFDEAAAAnFDEAAAAnFDEAAAAnFDEAAAAnFDEAAAAnFDEAAAAnFDEAAAAnFDEAAAAnFDEAAAAnFDEAAAAnFDEAAAAnFDEAAAAnFDEAAAAnFDEAAAAnFDEAAAAnFDEAAAAnFDEAAAAnFDEAAAAnFDEAAAAnFDEAAAAnFDEAAAAnFDEAAAAnKe8AAACgOGSzkf72t7dUX5/XqFH91bVrlXekokcRAwAAW1Vb+65OO+0+1dfnJJniOOi6647UN7851DtaUWNqEgAAfKZMJq9TTpmiDz9skJnJTAoh6PLLH9WCBau84xU1ihgAAPhMf/3rEmUykSorP55IS6USyuViTZ78smOy4kcRAwAAn6muLqcQwqfGQwhaty7rkKh0UMQAAMBnGjWqv+I4KIriTWMhBFVWpjRu3GDHZMWPIgYAAD5Tz54dNHHioQpBqq/PaePGnKIo6KijBumIIwZ5xytqnDUJAAC26pJLRujgg/tp8uSXtWFDVuPH76mjj95NiYR5RytqFDEAALBNvvSlPvrSl/p4xygpTE0CAAA4oYgBAAA4oYgBAAA4oYgBAAA4oYgBAAA4oYgBAAA4oYgBAAA4oYgBAAA4oYgBAAA4oYgBAAA4oYgBAAA4oYgBAAA4oYgBAAA4oYgBAAA4oYgBAAA4oYgBAAA4oYgBAAA4oYgBAAA4oYgBAAA4oYgBAAA4oYgBAAA4oYgBAEpWfX1O8+at1MqVdd5RgM1KeQcAAKA
t3HJLrf7zPx9XHEv5fKwxY3bTzTePU4cOFd7RgE04IgYAKDkPP7xIP/zhY8rlYplJqZRpxoxF+qd/esg7GtACRQwAUHJ+9atnlM/HSqUaf82ZmdLphP7859e0dm2DczrgYxQxAEDJWb58g5JJazGWSJiSSdOaNfVOqYBPo4gBAErO6NEDlM+HFmO5XKTq6rT69evsEwrYDIoYAKDkfPe7B2vHHavU0JBXNhupvj4nM9N11x2hdDrpHQ/YhLMmAaCNLViwSo89tkSdOlVo3Lg9tOOO1d6RSl6/fl30t799Szfc8KyeeOJt9e/fWZdeeqBGjuzvHQ1owUIIW9/KWU1NTaitrfWOAQCfSwhBV1wxU3feOVdRFCuZTCiVSmjy5JN12GEDvOMBaCNmNieEULMt2zI1CQBt5LHHlujOO+cqkTBVVqaUSiWUzUY655ypymTy3vEAFACKGAC0kXvvfUXZbKRE4uOz9yoqksrlYj355FLHZAAKBUUMANpIHG956UcxLAsB0PYoYgDQRk49dR9VVCRblK5cLlIyaTrkkH6OyQAUCooYALSRo44apNNP30dRFLRxY07ZbKRkMqFJk8arujrtHQ9AAeDyFQDQRsxMN9wwVuedN0yzZi1W586VGj9+T/Xq1dE7GoACQREDgDZkZho2rI+GDevjHQVAAWJqEgAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwMl2FTEzO9XM5plZbGY1n3jse2a2yMxeNbNjmo2PaRpbZGZXbs/fDwAAUMy294jYK5JOkvRE80Ez21vSGZL2kTRG0k1mljSzpKTfSBoraW9JZzZtCwAAUHa268r6IYQFUuOVoz9hvKR7QggZSYvNbJGkEU2PLQohvNn0vHuatp2/PTkAAACKUVutEdtZ0tJm999pGtvS+KeY2QVmVmtmtatWrWqjmAAAAH62ekTMzGZK6r2Zh34QQnig9SM1CiHcKulWSaqpqQlt9fcAAAB42WoRCyEc+QW+7zJJ/Zrd36VpTJ8xDgAAUFbaampymqQzzKzSzAZKGizpOUnPSxpsZgPNrEKNC/qntVEGAACAgrZdi/XN7GuSbpTUQ9J0M5sbQjgmhDDPzKaocRF+XtIlIYSo6TkTJP1FUlLSpBDCvO36LwAAAChSFkLhL78ys1WS3vLO8Tl1l7TaOwTaHPu5fLCvywP7uXy05b7eNYTQY1s2LIoiVozMrDaEULP1LVHM2M/lg31dHtjP5aNQ9jUfcQQAAOCEIgYAAOCEItZ2bvUOgHbBfi4f7OvywH4uHwWxr1kjBgAA4IQjYgAAAE4oYgAAAE4oYtvJzE41s3lmFptZzSce+56ZLTKzV83smGbjY5rGFpnZle2fGtvLzP7DzJaZ2dymP8c2e2yz+x3FiddraTOzJWb2ctPruLZpbCcze9TMXm/6uqN3Tnw+ZjbJzFaa2SvNxja7X63RDU2v8ZfM7EvtmZUitv1ekXSSpCeaD5rZ3mr8CKd9JI2RdJOZJc0sKek3ksZK2lvSmU3bovj8IoQwtOnPDGnL+90zJL44Xq9lY3TT6/ijN9NXSpoVQhgsaVbTfRSXO9X4M7i5Le3XsWr8KMbBki6Q9Nt2yiiJIrbdQggLQgivbuah8ZLuCSFkQgiLJS2SNKLpz6IQwpshhKyke5q2RWnY0n5HceL1Wp7GS7qr6fZdkk50zIIvIITwhKQ1nxje0n4dL+n3odEzkrqaWZ/2SUoRa0s7S1ra7P47TWNbGkfxmdB0GHtSs6kL9m9pYX+WviDpETObY2YXNI31CiEsb7r9nqRePtHQyra0X11f59v1od/lwsxmSuq9mYd+EEJ4oL3zoH181n5X46Hrq9X4Q/xqSddLOq/90gFoJaNCCMvMrKekR81sYfMHQwjBzLjOU4kppP1KEdsGIYQjv8DTlknq1+z+Lk1j+oxxFJBt3e9mdpukB5vuftZ+R/Fhf5a4EMKypq8rzWyqGqejV5h
ZnxDC8qYpqpWuIdFatrRfXV/nTE22nWmSzjCzSjMbqMZFgM9Jel7SYDMbaGYValzYPc0xJ76AT6wf+JoaT9qQtrzfUZx4vZYwM+tgZp0+ui3paDW+lqdJOrdps3MlMfNRGra0X6dJOqfp7MmDJH3YbAqzzXFEbDuZ2dck3Siph6TpZjY3hHBMCGGemU2RNF9SXtIlIYSo6TkTJP1FUlLSpBDCPKf4+OL+28yGqnFqcomkCyXps/Y7ik8IIc/rtaT1kjTVzKTG34eTQwgPm9nzkqaY2fmS3pJ0mmNGfAFmdrekwyR1N7N3JE2UdK02v19nSDpWjSdXbZT0rXbNykccAQAA+GBqEgAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwAlFDAAAwMn/B73Tllj6Ek4AAAAAAElFTkSuQmCC\n"
]
},
"metadata": {
"needs_background": "light"
}
}
],
"execution_count": 28,
"metadata": {
"collapsed": false,
"outputHidden": false,
"inputHidden": false
}
},
{
"cell_type": "markdown",
"source": [
"The two clubs separate cleanly in the plot. The split is not 100% correct, though, as a look at the corresponding values of the 'club' property shows."
],
"metadata": {}
},
{
"cell_type": "code",
"source": [
"[g_nx.nodes[i] for i,v in enumerate(node_colours) if v==1]"
],
"outputs": [
{
"output_type": "execute_result",
"execution_count": 34,
"data": {
"text/plain": [
"[{'club': 'Mr. Hi'},\n",
" {'club': 'Mr. Hi'},\n",
" {'club': 'Mr. Hi'},\n",
" {'club': 'Officer'},\n",
" {'club': 'Mr. Hi'},\n",
" {'club': 'Officer'},\n",
" {'club': 'Officer'},\n",
" {'club': 'Officer'},\n",
" {'club': 'Mr. Hi'},\n",
" {'club': 'Officer'},\n",
" {'club': 'Officer'},\n",
" {'club': 'Officer'},\n",
" {'club': 'Officer'},\n",
" {'club': 'Officer'},\n",
" {'club': 'Officer'},\n",
" {'club': 'Officer'},\n",
" {'club': 'Officer'}]"
]
},
"metadata": {}
}
],
"execution_count": 34,
"metadata": {
"collapsed": false,
"outputHidden": false,
"inputHidden": false
}
},
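{
"cell_type": "markdown",
"source": [
"Five out of seventeen are incorrect. Bear in mind that the list above indexes the graph by position in `node_colours`, which follows the Word2Vec vocabulary order rather than the node ids, so the count is indicative only. The separation is still remarkable: the node2vec process knew nothing about the 'club' property, which is an emergent feature of the embedding."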
],
"metadata": {}
}
],
"metadata": {
"kernel_info": {
"name": "python3"
},
"language_info": {
"name": "python",
"version": "3.7.2",
"mimetype": "text/x-python",
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"pygments_lexer": "ipython3",
"nbconvert_exporter": "python",
"file_extension": ".py"
},
"kernelspec": {
"name": "python3",
"language": "python",
"display_name": "Python 3"
}
},
"nbformat": 4,
"nbformat_minor": 0
}
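
To make the roles of p and q in the walk concrete, here is a minimal pure-Python sketch of a node2vec-style biased walk. It is not Stellargraph's implementation: the function `biased_walk`, the toy adjacency dict `adj` and the seed are made up for illustration, and the weights assume the usual node2vec convention (1/p to step back to the node just visited, 1 to reach one of its neighbours, 1/q to move further away).

```python
import random


def biased_walk(adj, start, length, p, q, rng):
    """Return one node2vec-style walk of at most `length` nodes.

    `adj` maps each node to the set of its neighbours (undirected graph).
    """
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        neighbours = sorted(adj[cur])  # sorted for reproducibility
        if not neighbours:
            break  # dead end, stop early
        if len(walk) == 1:
            # First step: there is no previous node yet, so choose uniformly.
            walk.append(rng.choice(neighbours))
            continue
        prev = walk[-2]
        weights = []
        for nxt in neighbours:
            if nxt == prev:
                weights.append(1.0 / p)  # return to the node just visited
            elif nxt in adj[prev]:
                weights.append(1.0)      # stay at distance 1 from prev
            else:
                weights.append(1.0 / q)  # move further away from prev
        walk.append(rng.choices(neighbours, weights=weights, k=1)[0])
    return walk


# Hypothetical toy graph: two triangles joined by the edge 2-3.
adj = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}

rng = random.Random(42)
walks = [
    biased_walk(adj, node, length=10, p=0.5, q=2.0, rng=rng)
    for node in adj
    for _ in range(3)  # three walks per root node
]

# Word2Vec wants string tokens, exactly as in the notebook.
sentences = [list(map(str, walk)) for walk in walks]
print(sentences[0])
```

With p=0.5 and q=2.0, the values used in the notebook, stepping back is favoured and moving away is penalised, so a walk tends to linger in one neighbourhood. Nodes of the same community then share many contexts, which is what word2vec picks up on.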