@lmeyerov
Last active May 29, 2020 01:18
Graphistry 2.0 REST API Tutorial: Login, create dataset, and upload nodes/edges as json, csv, parquet, arrow, and more
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Demos: New Graphistry Upload API\n",
"\n",
"**NOTE**: A production version of the below `Uploader` reference helper class will be built into PyGraphistry\n",
"\n",
"The initial upload API release is REST-only. It enables faster and larger uploads. We recommend using a language client if one is available for you as it will be upgraded to automatically use it and keep your code more maintainable.\n",
"\n",
"The below example shares how to use via the Python `requests` package to directly call the raw REST API for several data types:\n",
"\n",
"**Reference: PyGraphistry API**\n",
"\n",
"**In-memory**:\n",
"* PyGraphistry object\n",
"* JSON dictionary\n",
"* pandas dataframe\n",
"* arrow (fastest)\n",
"\n",
"**File**:\n",
"* json (multiple formats)\n",
"* csv\n",
"* arrow\n",
"* parquet\n",
"* NodeXL\n",
"\n",
"In the demos, `http://nginx` is the name of the upload server, such as an internal servername visible to a notebook kernel, and `http://localhost` is the public web server, which should be accessible through the viewer's browser.\n"
]
},
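{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Raw REST flow (reference sketch)\n",
"\n",
"For orientation, the raw flow that the `Uploader` helper (attached below as `uploader.py`) wraps is: authenticate for a JWT, create a dataset with column bindings, then POST node/edge data. A minimal sketch with `requests`, assuming the `http://nginx` upload server and the `creds` dict from the Config cell below:\n",
"\n",
"```python\n",
"import io, requests, pandas as pd, pyarrow as pa\n",
"\n",
"base = 'http://nginx'\n",
"\n",
"# 1. Login: exchange credentials for a JWT bearer token\n",
"tok = requests.post(f'{base}/api-token-auth/', json=creds).json()['token']\n",
"hdr = {'Authorization': f'Bearer {tok}'}\n",
"\n",
"# 2. Create a dataset and declare edge source/destination bindings\n",
"dataset_id = requests.post(f'{base}/api/v2/upload/datasets/', headers=hdr, json={\n",
"    'node_encodings': {'bindings': {}},\n",
"    'edge_encodings': {'bindings': {'source': 's', 'destination': 'd'}},\n",
"    'metadata': {}, 'name': 'mytestviz'\n",
"}).json()['data']['dataset_id']\n",
"\n",
"# 3. Upload edges as Arrow IPC file bytes in the request body\n",
"table = pa.Table.from_pandas(pd.DataFrame({'s': ['a'], 'd': ['b']}), preserve_index=False)\n",
"buf = io.BytesIO()\n",
"writer = pa.RecordBatchFileWriter(buf, table.schema)\n",
"writer.write_table(table)\n",
"writer.close()\n",
"requests.post(f'{base}/api/v2/upload/datasets/{dataset_id}/edges/arrow',\n",
"              headers=hdr, data=buf.getvalue()).json()\n",
"```"
]
},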
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Config"
]
},
{
"cell_type": "code",
"execution_count": 38,
"metadata": {},
"outputs": [],
"source": [
"creds = {'username': 'my_account', 'password': 'my_pwd'}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Helpers"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [],
"source": [
"import graphistry, io, json, pandas as pd, pyarrow as pa, requests\n",
"from uploader import Uploader"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Sample Data"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [],
"source": [
"df_small = pd.DataFrame({'s': ['a', 'b', 'c'], 'd': ['b', 'c', 'a']})\n",
"df_med = pd.DataFrame({'s': [0, 1, 2, 3, 4] * 90000, 'd': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] * 45000})"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [],
"source": [
"g = graphistry.bind(source='s', destination='d').settings(url_params={'play': 0})\n",
"\n",
"g_small = g.edges(df_small)\n",
"g_med = g.edges(df_med)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Old API (reference)\n",
"\n",
"Subsequent Graphistry releases will use the new API internally"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"CPU times: user 30.8 ms, sys: 237 µs, total: 31.1 ms\n",
"Wall time: 83.7 ms\n"
]
},
{
"data": {
"text/plain": [
"'http://localhost/graph/graph.html?dataset=07f6f10a123854d0a261e693198c0e5b&type=vgraph&viztoken=8ca64dd957bd419392a902a9145433cd&usertag=834f74a8-pygraphistry-0.10.6&splashAfter=1590714139&info=true&play=0&play=0'"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%time\n",
"graphistry.register(api=1)\n",
"'http://localhost' + g_small.plot(render=False) + '&play=0'"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"CPU times: user 1.04 s, sys: 19.5 ms, total: 1.05 s\n",
"Wall time: 1.12 s\n"
]
},
{
"data": {
"text/plain": [
"'http://localhost/graph/graph.html?dataset=137b170328e753a79020a7a7301b9e53&type=jsonMeta&viztoken=c092f5e75cab4398a04b1a60c189d9de&usertag=834f74a8-pygraphistry-0.10.6&splashAfter=1590714141&info=true&play=0&play=0'"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%time\n",
"graphistry.register(api=2)\n",
"'http://localhost' + g_med.plot(render=False) + '&play=0'"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## From a PyGraphistry object"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"4f9a8eca2d3448cd8ded525655ff2e48\n",
"CPU times: user 22.7 ms, sys: 11.9 ms, total: 34.6 ms\n",
"Wall time: 540 ms\n"
]
},
{
"data": {
"text/plain": [
"'http://localhost/graph/graph.html?dataset=4f9a8eca2d3448cd8ded525655ff2e48&info=true&play=0&play=0'"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%time\n",
"\n",
"u = Uploader('http://nginx').login(**creds).post_g(g_med)\n",
"\n",
"print(u.dataset_id)\n",
"u.to_url('http://localhost') + '&play=0'"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Manual with in-memory API: Pandas to Arrow\n",
"\n",
"Convert to an Arrow object, such as from cudf, pandas, or spark"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'data': {'dataset_id': '0705c67c4571457cbad6edead36b9445'}, 'message': 'Dataset created', 'success': True}\n",
"{'data': {'dataset_id': '0705c67c4571457cbad6edead36b9445', 'dtypes': {'d': 'int32', 's': 'int32'}, 'num_cols': 2, 'num_rows': 450000, 'time_parsing_s': 0}, 'message': 'Dataset edges created', 'success': True}\n",
"CPU times: user 18.7 ms, sys: 3.84 ms, total: 22.5 ms\n",
"Wall time: 488 ms\n"
]
},
{
"data": {
"text/plain": [
"'http://localhost/graph/graph.html?dataset=0705c67c4571457cbad6edead36b9445&play=0'"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%time\n",
"\n",
"u = Uploader('http://nginx').login(**creds)\n",
"\n",
"out = u.create_dataset({\n",
" \"node_encodings\": {\"bindings\": {}},\n",
" \"edge_encodings\": {\"bindings\": {\"source\": \"s\", \"destination\": \"d\"}},\n",
" \"metadata\": {},\n",
" \"name\": \"mytestviz\"\n",
"})\n",
"print(out)\n",
"\n",
"arr = pa.Table.from_pandas(g_med._edges, preserve_index=False).replace_schema_metadata({})\n",
"out = u.post_edges_arrow(arr)\n",
"print(out)\n",
"\n",
"u.to_url('http://localhost') + '&play=0'"
]
},
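{
"cell_type": "markdown",
"metadata": {},
"source": [
"The same `post_edges_arrow` call accepts any Arrow table, so the source dataframe library does not matter. As a sketch (assuming the corresponding libraries are installed; `my_cudf_df` and `my_spark_df` are hypothetical dataframes):\n",
"\n",
"```python\n",
"# cudf GPU dataframes convert to Arrow natively\n",
"arr = my_cudf_df.to_arrow()\n",
"\n",
"# Spark dataframes can be collected to pandas, then converted\n",
"arr = pa.Table.from_pandas(my_spark_df.toPandas(), preserve_index=False)\n",
"```"
]
},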
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Manual with in-memory API: Both nodes and edges"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>n</th>\n",
" <th>some_val</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>0</td>\n",
" <td>aa</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>1</td>\n",
" <td>aa</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>2</td>\n",
" <td>aa</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>3</td>\n",
" <td>aa</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>4</td>\n",
" <td>aa</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>5</td>\n",
" <td>aa</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>6</td>\n",
" <td>aa</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td>7</td>\n",
" <td>aa</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td>8</td>\n",
" <td>aa</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
" <td>9</td>\n",
" <td>aa</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" n some_val\n",
"0 0 aa\n",
"1 1 aa\n",
"2 2 aa\n",
"3 3 aa\n",
"4 4 aa\n",
"5 5 aa\n",
"6 6 aa\n",
"7 7 aa\n",
"8 8 aa\n",
"9 9 aa"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ns = pd.concat([\n",
" g_med._edges[g_med._source],\n",
" g_med._edges[g_med._destination]\n",
" ], ignore_index=True, sort=False).unique()\n",
"nodes_df_med = pd.DataFrame({'n': ns, 'some_val': ['aa'] * len(ns)})\n",
"nodes_df_med"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"dataset_id 321e80d5b8ca43228eb962de8c494df8\n",
"CPU times: user 21.6 ms, sys: 11.5 ms, total: 33.1 ms\n",
"Wall time: 1.32 s\n"
]
},
{
"data": {
"text/plain": [
"'http://localhost/graph/graph.html?dataset=321e80d5b8ca43228eb962de8c494df8&play=0'"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%time\n",
"\n",
"u = Uploader('http://nginx').login(**creds)\n",
"\n",
"u.create_dataset({\n",
" \"node_encodings\": {\"bindings\": {\"node\": \"n\"}},\n",
" \"edge_encodings\": {\"bindings\": {\"source\": \"s\", \"destination\": \"d\"}},\n",
" \"metadata\": {},\n",
" \"name\": \"mytestviz\"\n",
"})\n",
"print('dataset_id', u.dataset_id)\n",
"\n",
"arr = pa.Table.from_pandas(g_med._edges, preserve_index=False).replace_schema_metadata({})\n",
"u.post_edges_arrow(arr)\n",
"\n",
"arr = pa.Table.from_pandas(nodes_df_med, preserve_index=False).replace_schema_metadata({})\n",
"u.post_nodes_arrow(arr)\n",
"\n",
"u.to_url('http://localhost') + '&play=0'"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Manual with in-memory API: JSON to Arrow"
]
},
{
"cell_type": "code",
"execution_count": 39,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[{'s': 0, 'd': 0}, {'s': 1, 'd': 1}, {'s': 2, 'd': 2}]"
]
},
"execution_count": 39,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"sample_json = df_med.to_dict(orient='rows')\n",
"sample_json[:3]"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>s</th>\n",
" <th>d</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>2</td>\n",
" <td>2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>3</td>\n",
" <td>3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>4</td>\n",
" <td>4</td>\n",
" </tr>\n",
" <tr>\n",
" <th>...</th>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>449995</th>\n",
" <td>0</td>\n",
" <td>5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>449996</th>\n",
" <td>1</td>\n",
" <td>6</td>\n",
" </tr>\n",
" <tr>\n",
" <th>449997</th>\n",
" <td>2</td>\n",
" <td>7</td>\n",
" </tr>\n",
" <tr>\n",
" <th>449998</th>\n",
" <td>3</td>\n",
" <td>8</td>\n",
" </tr>\n",
" <tr>\n",
" <th>449999</th>\n",
" <td>4</td>\n",
" <td>9</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>450000 rows × 2 columns</p>\n",
"</div>"
],
"text/plain": [
" s d\n",
"0 0 0\n",
"1 1 1\n",
"2 2 2\n",
"3 3 3\n",
"4 4 4\n",
"... .. ..\n",
"449995 0 5\n",
"449996 1 6\n",
"449997 2 7\n",
"449998 3 8\n",
"449999 4 9\n",
"\n",
"[450000 rows x 2 columns]"
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df = pd.DataFrame(sample_json)\n",
"df"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'data': {'dataset_id': 'b3c045c759cd433ea33a447cbc619b41'}, 'message': 'Dataset created', 'success': True}\n",
"{'data': {'dataset_id': 'b3c045c759cd433ea33a447cbc619b41', 'dtypes': {'d': 'int32', 's': 'int32'}, 'num_cols': 2, 'num_rows': 450000, 'time_parsing_s': 0}, 'message': 'Dataset edges created', 'success': True}\n",
"CPU times: user 19.4 ms, sys: 4.41 ms, total: 23.8 ms\n",
"Wall time: 480 ms\n"
]
},
{
"data": {
"text/plain": [
"'http://localhost/graph/graph.html?dataset=b3c045c759cd433ea33a447cbc619b41&play=0'"
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%time\n",
"\n",
"u = Uploader('http://nginx').login(**creds)\n",
"\n",
"out = u.create_dataset({\n",
" \"node_encodings\": {\"bindings\": {}},\n",
" \"edge_encodings\": {\"bindings\": {\"source\": \"s\", \"destination\": \"d\"}},\n",
" \"metadata\": {},\n",
" \"name\": \"mytestviz\"\n",
"})\n",
"print(out)\n",
"\n",
"arr = pa.Table.from_pandas(g_med._edges, preserve_index=False).replace_schema_metadata({})\n",
"out = u.post_edges_arrow(arr)\n",
"print(out)\n",
"\n",
"u.to_url('http://localhost') + '&play=0'"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## From a file"
]
},
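{
"cell_type": "markdown",
"metadata": {},
"source": [
"File uploads send the raw file bytes to `/api/v2/upload/datasets/<dataset_id>/{edges|nodes}/<format>`; the helper's `post_edges_file` / `post_nodes_file` wrap this. A sketch for uploading a node table alongside edges (assuming a dataset was created with a node binding, as in the 'Both nodes and edges' section, and that a `nodes.csv` file exists):\n",
"\n",
"```python\n",
"u.post_edges_file('./edges.csv', 'csv')\n",
"u.post_nodes_file('./nodes.csv', 'csv')\n",
"```"
]
},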
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### CSV"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [],
"source": [
"df_med.to_csv('./edges.csv', index=False)"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'data': {'dataset_id': '73ba4f91df344ceb9bddd183bb39c0a0'}, 'message': 'Dataset created', 'success': True}\n",
"{'data': {'dataset_id': '73ba4f91df344ceb9bddd183bb39c0a0', 'dtypes': {'d': 'int32', 's': 'int32'}, 'num_cols': 2, 'num_rows': 450000, 'time_parsing_s': 0}, 'message': 'Dataset edges created', 'success': True}\n",
"CPU times: user 16.1 ms, sys: 4.01 ms, total: 20.1 ms\n",
"Wall time: 524 ms\n"
]
},
{
"data": {
"text/plain": [
"'http://localhost/graph/graph.html?dataset=73ba4f91df344ceb9bddd183bb39c0a0&play=0'"
]
},
"execution_count": 24,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%time\n",
"\n",
"u = Uploader('http://nginx').login(**creds)\n",
"\n",
"out = u.create_dataset({\n",
" \"node_encodings\": {\"bindings\": {}},\n",
" \"edge_encodings\": {\"bindings\": {\"source\": \"s\", \"destination\": \"d\"}},\n",
" \"metadata\": {},\n",
" \"name\": \"mytestviz\"\n",
"})\n",
"print(out)\n",
"\n",
"out = u.post_edges_file('./edges.csv', 'csv')\n",
"print(out)\n",
"\n",
"u.to_url('http://localhost') + '&play=0'"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### JSON"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Columnar"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'s': ['a', 'b', 'c'], 'd': ['b', 'c', 'a'], 'ccc': [5, 5, 5]}"
]
},
"execution_count": 25,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"json_dict = df_small.assign(ccc=5).to_dict(orient='list')\n",
"json_dict"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [],
"source": [
"with open('df_med.json', 'w') as outfile:\n",
" json.dump(json_dict, outfile)"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'data': {'dataset_id': 'f8b94415c1ca4241a68dc8cfdc89ea32'}, 'message': 'Dataset created', 'success': True}\n",
"{'data': {'dataset_id': 'f8b94415c1ca4241a68dc8cfdc89ea32', 'dtypes': {'ccc': 'int32', 'd': 'object', 's': 'object'}, 'num_cols': 3, 'num_rows': 3, 'time_parsing_s': 0}, 'message': 'Dataset edges created', 'success': True}\n",
"CPU times: user 16.1 ms, sys: 2 µs, total: 16.1 ms\n",
"Wall time: 526 ms\n"
]
},
{
"data": {
"text/plain": [
"'http://localhost/graph/graph.html?dataset=f8b94415c1ca4241a68dc8cfdc89ea32&play=0'"
]
},
"execution_count": 27,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%time\n",
"\n",
"u = Uploader('http://nginx').login(**creds)\n",
"\n",
"out = u.create_dataset({\n",
" \"node_encodings\": {\"bindings\": {}},\n",
" \"edge_encodings\": {\"bindings\": {\"source\": \"s\", \"destination\": \"d\", \"edge_color\": \"ccc\"}},\n",
" \"metadata\": {},\n",
" \"name\": \"mytestviz\"\n",
"})\n",
"print(out)\n",
"\n",
"out = u.post_edges_file('./df_med.json', 'json')\n",
"print(out)\n",
"\n",
"u.to_url('http://localhost') + '&play=0'"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Rows"
]
},
{
"cell_type": "code",
"execution_count": 40,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[{'s': 0, 'd': 0}, {'s': 1, 'd': 1}, {'s': 2, 'd': 2}]"
]
},
"execution_count": 40,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"json_rows = df_med.to_dict(orient='rows')\n",
"json_rows[:3]"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [],
"source": [
"with open('df_med_rows.json', 'w') as outfile:\n",
" json.dump(json_rows, outfile)"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'data': {'dataset_id': '5a1110c07aae43109027df39bf8fc645'}, 'message': 'Dataset created', 'success': True}\n",
"{'data': {'dataset_id': '5a1110c07aae43109027df39bf8fc645', 'dtypes': {'d': 'int32', 's': 'int32'}, 'num_cols': 2, 'num_rows': 450000, 'time_parsing_s': 2}, 'message': 'Dataset edges created', 'success': True}\n"
]
},
{
"data": {
"text/plain": [
"'http://localhost/graph/graph.html?dataset=5a1110c07aae43109027df39bf8fc645&play=0'"
]
},
"execution_count": 30,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"u = Uploader('http://nginx').login(**creds)\n",
"\n",
"out = u.create_dataset({\n",
" \"node_encodings\": {\"bindings\": {}},\n",
" \"edge_encodings\": {\"bindings\": {\"source\": \"s\", \"destination\": \"d\"}},\n",
" \"metadata\": {},\n",
" \"name\": \"mytestviz\"\n",
"})\n",
"print(out)\n",
"\n",
"out = u.post_edges_file('./df_med_rows.json', 'json')\n",
"print(out)\n",
"\n",
"u.to_url('http://localhost') + '&play=0'"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Parquet"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {},
"outputs": [],
"source": [
"df_med.to_parquet('./edges.parquet')"
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"-rw-r--r-- 1 graphistry graphistry 21K May 29 01:02 edges.parquet\r\n"
]
}
],
"source": [
"! ls -alh edges.parquet"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'data': {'dataset_id': '51b51f6b9f664c50ba1f6c4ca073e200'}, 'message': 'Dataset created', 'success': True}\n",
"{'data': {'dataset_id': '51b51f6b9f664c50ba1f6c4ca073e200', 'dtypes': {'d': 'int32', 's': 'int32'}, 'num_cols': 2, 'num_rows': 450000, 'time_parsing_s': 0}, 'message': 'Dataset edges created', 'success': True}\n",
"CPU times: user 24.4 ms, sys: 4.02 ms, total: 28.4 ms\n",
"Wall time: 624 ms\n"
]
},
{
"data": {
"text/plain": [
"'http://localhost/graph/graph.html?dataset=51b51f6b9f664c50ba1f6c4ca073e200&play=0'"
]
},
"execution_count": 33,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%time\n",
"\n",
"u = Uploader('http://nginx').login(**creds)\n",
"\n",
"out = u.create_dataset({\n",
" \"node_encodings\": {\"bindings\": {}},\n",
" \"edge_encodings\": {\"bindings\": {\"source\": \"s\", \"destination\": \"d\"}},\n",
" \"metadata\": {},\n",
" \"name\": \"mytestviz\"\n",
"})\n",
"print(out)\n",
"\n",
"out = u.post_edges_file('./edges.parquet', 'parquet')\n",
"print(out)\n",
"\n",
"u.to_url('http://localhost') + '&play=0'"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Arrow"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {},
"outputs": [],
"source": [
"arr = pa.Table.from_pandas(g_med._edges, preserve_index=False).replace_schema_metadata({})\n",
"writer = pa.RecordBatchFileWriter('./edges.arrow', arr.schema)\n",
"writer.write_table(arr)\n",
"writer.close()"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"-rw-r--r-- 1 graphistry graphistry 6.9M May 29 01:02 ./edges.arrow\r\n"
]
}
],
"source": [
"! ls -alh ./edges.arrow "
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'data': {'dataset_id': '59f20c76617a4da2a44dc844d75351ac'}, 'message': 'Dataset created', 'success': True}\n",
"{'data': {'dataset_id': '59f20c76617a4da2a44dc844d75351ac', 'dtypes': {'d': 'int32', 's': 'int32'}, 'num_cols': 2, 'num_rows': 450000, 'time_parsing_s': 0}, 'message': 'Dataset edges created', 'success': True}\n"
]
},
{
"data": {
"text/plain": [
"'http://localhost/graph/graph.html?dataset=59f20c76617a4da2a44dc844d75351ac&play=0'"
]
},
"execution_count": 36,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"u = Uploader('http://nginx').login(**creds)\n",
"\n",
"out = u.create_dataset({\n",
" \"node_encodings\": {\"bindings\": {}},\n",
" \"edge_encodings\": {\"bindings\": {\"source\": \"s\", \"destination\": \"d\"}},\n",
" \"metadata\": {},\n",
" \"name\": \"mytestviz\"\n",
"})\n",
"print(out)\n",
"\n",
"out = u.post_edges_file('./edges.arrow', 'arrow')\n",
"print(out)\n",
"\n",
"u.to_url('http://localhost') + '&play=0'"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### NodeXL: From a public URL\n",
"See also:\n",
" * `POST /upload/datasets/<dataset_id>/nodexl/file`\n",
" * `POST /upload/datasets/<dataset_id>/nodexl/url`\n",
" * `GET+POST /upload/nodexl/url`"
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"request header {'User-Agent': 'python-requests/2.23.0', 'Accept-Encoding': 'gzip, deflate', 'Accept': '*/*', 'Connection': 'keep-alive', 'Authorization': 'Bearer eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJ1c2VybmFtZSI6Imxlb3Rlc3QiLCJpYXQiOjE1OTA3MTQxNTUsImV4cCI6MTU5MDcxNzc1NSwidXNlcl9pZCI6MSwib3JpZ19pYXQiOjE1OTA3MTQxNTV9.6CX2TpZOlppKBUQQ-4hiI8b3xQNh8nSl6cutwOgVmNA'}\n",
"{'data': {'dataset_id': '554756890172494c9692a1f71c893574', 'edges': {'dtypes': {'Add Your Own Columns Here': 'float32', 'Added By Extended Analysis': 'object', 'Color': 'object', 'ColorInt': 'int32', 'Corrected By Extended Analysis': 'object', 'Date': 'datetime64[ns]', 'Domains in Tweet': 'object', 'Dynamic Filter': 'float32', 'Edge Content Word Count': 'object', 'Edge Weight': 'object', 'Favorite Count': 'object', 'Favorited': 'bool', 'Hashtags in Tweet': 'object', 'ID': 'object', 'Imported ID': 'object', 'Imported Tweet Type': 'object', 'In-Reply-To Tweet ID': 'object', 'In-Reply-To User ID': 'object', 'Is Quote Status': 'bool', 'Label': 'float32', 'Label Font Size': 'float32', 'Label Text Color': 'float32', 'Language': 'object', 'Latitude': 'float32', 'Longitude': 'float32', 'Media in Tweet': 'object', 'Non-categorized Word Count': 'object', 'Non-categorized Word Percentage (%)': 'object', 'Opacity': 'object', 'Place Bounding Box': 'object', 'Place Country': 'object', 'Place Country Code': 'object', 'Place Full Name': 'object', 'Place ID': 'object', 'Place Name': 'object', 'Place Type': 'object', 'Place URL': 'object', 'Possibly Sensitive': 'float32', 'Quoted Status ID': 'object', 'Reciprocated?': 'object', 'Relationship': 'object', 'Relationship Date (UTC)': 'datetime64[ns]', 'Retweet Count': 'object', 'Retweet ID': 'object', 'Retweeted': 'bool', 'Sentiment List #1: List1 Word Count': 'object', 'Sentiment List #1: List1 Word Percentage (%)': 'object', 'Sentiment List #2: List2 Word Count': 'object', 'Sentiment List #2: List2 Word Percentage (%)': 'object', 'Sentiment List #3: List3 Word Count': 'object', 'Sentiment List #3: List3 Word Percentage (%)': 'object', 'Source': 'object', 'Style': 'object', 'Time': 'datetime64[ns]', 'Truncated': 'bool', 'Tweet': 'object', 'Tweet Date (UTC)': 'datetime64[ns]', 'Tweet Image File': 'object', 'Twitter Page for Tweet': 'object', 'URLs in Tweet': 'object', 'Unified Twitter ID': 'object', 'Vertex 1': 'object', 'Vertex 1 Group': 'object', 'Vertex 2': 'object', 'Vertex 2 Group': 'object', 'Visibility': 'float32', 'Width': 'object'}, 'num_cols': 67, 'num_rows': 3142}, 'nodes': {'dtypes': {'Add Your Own Columns Here': 'float32', 'Betweenness Centrality': 'object', 'Closeness Centrality': 'object', 'Clustering Coefficient': 'object', 'Color': 'float32', 'Color2': 'int32', 'Custom Menu Item': 'object', 'Default Profile': 'bool', 'Default Profile Image': 'bool', 'Degree': 'float32', 'Description': 'object', 'Domains in Tweet by Count': 'object', 'Domains in Tweet by Salience': 'object', 'Dynamic Filter': 'float32', 'Eigenvector Centrality': 'object', 'Favorites': 'object', 'Followed': 'object', 'Followers': 'object', 'Geo Enabled': 'bool', 'Hashtags in Tweet by Count': 'object', 'Hashtags in Tweet by Salience': 'object', 'ID': 'object', 'Image File': 'object', 'In-Degree': 'object', 'Joined Twitter Date (UTC)': 'datetime64[ns]', 'Label': 'object', 'Label Fill Color': 'float32', 'Label Position': 'object', 'Language': 'object', 'Layout Order': 'object', 'Listed Count': 'object', 'Location': 'object', 'Locked?': 'float32', 'Name': 'object', 'Non-categorized Word Count': 'object', 'Non-categorized Word Percentage (%)': 'object', 'Opacity': 'float32', 'Out-Degree': 'object', 'PageRank': 'object', 'Polar Angle': 'float32', 'Polar R': 'float32', 'Profile Background Image Url': 'object', 'Profile Banner Url': 'object', 'Reciprocated Vertex Pair Ratio': 'object', 'Sentiment List #1: List1 Word Count': 'object', 'Sentiment List #1: List1 Word Percentage (%)': 
'object', 'Sentiment List #2: List2 Word Count': 'object', 'Sentiment List #2: List2 Word Percentage (%)': 'object', 'Sentiment List #3: List3 Word Count': 'object', 'Sentiment List #3: List3 Word Percentage (%)': 'object', 'Shape': 'object', 'Size': 'object', 'Time Zone': 'object', 'Time Zone UTC Offset (Seconds)': 'object', 'Tooltip': 'object', 'Top Word Pairs in Tweet by Count': 'object', 'Top Word Pairs in Tweet by Salience': 'object', 'Top Words in Tweet by Count': 'object', 'Top Words in Tweet by Salience': 'object', 'Tweeted Search Term?': 'object', 'Tweets': 'object', 'URLs in Tweet by Count': 'object', 'URLs in Tweet by Salience': 'object', 'Verified': 'bool', 'Vertex': 'object', 'Vertex Content Word Count': 'object', 'Vertex Group': 'object', 'Visibility': 'float32', 'Web': 'object', 'x': 'object', 'y': 'object'}, 'num_cols': 71, 'num_rows': 529}, 'url': '/graph/graph.html?dataset=554756890172494c9692a1f71c893574&splashAfter=1590714179&play=0'}, 'message': 'Dataset created', 'success': True}\n",
"CPU times: user 20.4 ms, sys: 2.92 ms, total: 23.3 ms\n",
"Wall time: 23.7 s\n"
]
},
{
"data": {
"text/plain": [
"'http://localhost/graph/graph.html?dataset=554756890172494c9692a1f71c893574&splashAfter=1590714179&play=0'"
]
},
"execution_count": 37,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%time\n",
"\n",
"import urllib.parse\n",
"#urllib.parse.quote(\n",
"\n",
"file_url = urllib.parse.quote(\"https://nodexlgraphgallery.org/Pages/Workbook.ashx?graphID=227114\", safe='')\n",
"base_path = 'http://nginx'\n",
"\n",
"u = Uploader(base_path).login(**creds)\n",
"tok = u.token\n",
"\n",
"resp = requests.get(\n",
" f'{base_path}/api/v2/upload/nodexl/url?template=twitter&url={file_url}',\n",
" headers={'Authorization': f'Bearer {tok}'})\n",
"\n",
"print('request header', resp.request.headers)\n",
"\n",
"out = resp.json()\n",
"print(out)\n",
"\n",
"subpath = out['data']['url']\n",
"f'http://localhost{subpath}'"
]
},
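{
"cell_type": "markdown",
"metadata": {},
"source": [
"The per-dataset variant `POST /upload/datasets/<dataset_id>/nodexl/file` listed above takes a local workbook instead of a URL. A sketch by analogy with `post_file`, after creating a dataset as in the earlier cells (the exact body format is an assumption; `my_workbook.xlsx` is a hypothetical file):\n",
"\n",
"```python\n",
"with open('./my_workbook.xlsx', 'rb') as f:\n",
"    out = requests.post(\n",
"        f'{base_path}/api/v2/upload/datasets/{u.dataset_id}/nodexl/file?template=twitter',\n",
"        headers={'Authorization': f'Bearer {u.token}'},\n",
"        data=f.read()).json()\n",
"```"
]
},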
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3.7 (RAPIDS)",
"language": "python",
"name": "rapids"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.3"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
# uploader.py -- reference helper imported above via `from uploader import Uploader`
import graphistry, io, json, logging, pandas as pd, pyarrow as pa, requests

logger = logging.getLogger('Uploader')


class Uploader:

    @property
    def token(self) -> str:
        if self.__token is None:
            raise Exception("Not logged in")
        return self.__token

    @property
    def dataset_id(self) -> str:
        if self.__dataset_id is None:
            raise Exception("Must first create a dataset")
        return self.__dataset_id

    @property
    def base_path(self) -> str:
        return self.__base_path

    @property
    def view_base_path(self) -> str:
        return self.__view_base_path

    @property
    def url_params(self) -> dict:
        if self.__url_params is None:
            return {}
        else:
            return self.__url_params

    def settings(self, url_params=None):
        if url_params is not None:
            self.__url_params = url_params
        return self

    def __init__(self, base_path='http://nginx', view_base_path='http://localhost'):
        self.__base_path = base_path
        self.__view_base_path = view_base_path
        self.__token = None
        self.__dataset_id = None
        self.__url_params = None

    def login(self, username, password):
        base_path = self.base_path
        out = requests.post(
            f'{base_path}/api-token-auth/',
            json={'username': username, 'password': password})
        json_response = None
        try:
            json_response = out.json()
            if not ('token' in json_response):
                raise Exception(out.text)
        except Exception as e:
            logger.error('Error: %s', out)
            raise Exception(out.text)
        self.__token = out.json()['token']
        return self

    def create_dataset(self, json):
        tok = self.token
        out = requests.post(
            self.base_path + '/api/v2/upload/datasets/',
            headers={'Authorization': f'Bearer {tok}'},
            json=json).json()
        if not out['success']:
            raise Exception(out)
        self.__dataset_id = out['data']['dataset_id']
        return out

    # PyArrow's table.getvalues().to_pybytes() fails to hydrate for some reason,
    # so work around it by consolidating into a virtual file and sending that
    def arrow_to_buffer(self, table: pa.Table):
        b = io.BytesIO()
        writer = pa.RecordBatchFileWriter(b, table.schema)
        writer.write_table(table)
        writer.close()
        return b.getvalue()

    def post_g(self, g, name=None):
        def maybe_bindings(g, bindings):
            out = {}
            for old_field_name, new_field_name in bindings:
                try:
                    val = getattr(g, old_field_name)
                    if val is None:
                        continue
                    else:
                        out[new_field_name] = val
                except AttributeError:
                    continue
            logger.debug('bindings: %s', out)
            return out
        self.__url_params = g._url_params if g._url_params is not None else {}
        node_encodings = maybe_bindings(
            g,
            [
                ['_node', 'node'],
                ['_point_color', 'node_color'],
                ['_point_label', 'node_label'],
                ['_point_opacity', 'node_opacity'],
                ['_point_size', 'node_size'],
                ['_point_title', 'node_title'],
                ['_point_weight', 'node_weight']
            ])
        if g._nodes is not None:
            if 'x' in g._nodes:
                node_encodings['x'] = 'x'
            if 'y' in g._nodes:
                node_encodings['y'] = 'y'
        self.create_dataset({
            "node_encodings": {"bindings": node_encodings},
            "edge_encodings": {"bindings": maybe_bindings(
                g,
                [
                    ['_source', 'source'],
                    ['_destination', 'destination'],
                    ['_edge_color', 'edge_color'],
                    ['_edge_label', 'edge_label'],
                    ['_edge_opacity', 'edge_opacity'],
                    ['_edge_size', 'edge_size'],
                    ['_edge_title', 'edge_title'],
                    ['_edge_weight', 'edge_weight']
                ])
            },
            "metadata": {},
            "name": ("mytestviz" if name is None else name)
        })
        self.g_post_edges(g)
        if g._nodes is not None:
            self.g_post_nodes(g)
        return self

    def to_url(self, view_base_path=None):
        path = view_base_path if view_base_path is not None else self.view_base_path
        dataset_id = self.dataset_id
        params = [str(k) + '=' + str(v) for k, v in self.url_params.items()]
        url_params = ('&' + '&'.join(params)) if len(params) > 0 else ''
        return f'{path}/graph/graph.html?dataset={dataset_id}{url_params}'

    def plot(self, render=True):
        if render:
            try:
                from IPython.core.display import display, HTML
                url = self.to_url()
                logger.debug('url: %s', url)
                return display(HTML(f'<iframe src="{url}" width="100%" height="600"/>'))
            except Exception as e:
                logger.debug(e)
        return self.to_url()

    def g_post_edges(self, g):
        arr = pa.Table.from_pandas(g._edges, preserve_index=False).replace_schema_metadata({})
        buf = self.arrow_to_buffer(arr)
        dataset_id = self.dataset_id
        tok = self.token
        base_path = self.base_path
        out = requests.post(
            f'{base_path}/api/v2/upload/datasets/{dataset_id}/edges/arrow',
            headers={'Authorization': f'Bearer {tok}'},
            data=buf).json()
        if not out['success']:
            raise Exception(out)
        return out

    def g_post_nodes(self, g):
        arr = pa.Table.from_pandas(g._nodes, preserve_index=False).replace_schema_metadata({})
        buf = self.arrow_to_buffer(arr)
        dataset_id = self.dataset_id
        tok = self.token
        base_path = self.base_path
        out = requests.post(
            f'{base_path}/api/v2/upload/datasets/{dataset_id}/nodes/arrow',
            headers={'Authorization': f'Bearer {tok}'},
            data=buf).json()
        if not out['success']:
            raise Exception(out)
        return out

    def post_edges_arrow(self, arr, opts=''):
        return self.post_arrow(arr, 'edges', opts)

    def post_nodes_arrow(self, arr, opts=''):
        return self.post_arrow(arr, 'nodes', opts)

    def post_arrow(self, arr, graph_type, opts=''):
        buf = self.arrow_to_buffer(arr)
        dataset_id = self.dataset_id
        tok = self.token
        base_path = self.base_path
        url = f'{base_path}/api/v2/upload/datasets/{dataset_id}/{graph_type}/arrow'
        if len(opts) > 0:
            url = f'{url}?{opts}'
        out = requests.post(
            url,
            headers={'Authorization': f'Bearer {tok}'},
            data=buf).json()
        if not out['success']:
            raise Exception(out)
        return out

    def post_edges_file(self, file_path, file_type='csv'):
        return self.post_file(file_path, 'edges', file_type)

    def post_nodes_file(self, file_path, file_type='csv'):
        return self.post_file(file_path, 'nodes', file_type)

    def post_file(self, file_path, graph_type='edges', file_type='csv'):
        dataset_id = self.dataset_id
        tok = self.token
        base_path = self.base_path
        with open(file_path, 'rb') as file:
            out = requests.post(
                f'{base_path}/api/v2/upload/datasets/{dataset_id}/{graph_type}/{file_type}',
                headers={'Authorization': f'Bearer {tok}'},
                data=file.read()).json()
        if not out['success']:
            raise Exception(out)
        return out
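# Example usage (sketch; assumes a reachable upload server and valid credentials,
# as in the notebook above; `my_edges_df` is a hypothetical edge dataframe):
#
#   u = Uploader('http://nginx', 'http://localhost').login('my_account', 'my_pwd')
#   u.post_g(graphistry.bind(source='s', destination='d').edges(my_edges_df))
#   print(u.to_url())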