bumatic/PyCatFlow.ipynb

## apt.txt
libcairo2

## PyCatFlow.ipynb
{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "eb905db4-6b79-4018-b454-5dcd370985e7",
   "metadata": {},
   "source": [
    "# [PyCatFlow](https://github.com/bumatic/PyCatFlow)\n",
    "\n",
    "This notebook lets you create a [PyCatFlow](https://github.com/bumatic/PyCatFlow) visualization. Run the notebook cell by cell and fill in the requested information. **You do not have to change any code; all inputs are made through interactive widgets.** Most of the code in this notebook implements these interface elements. Using the tool in plain Python requires far less code; an example is included in the [PyCatFlow](https://github.com/bumatic/PyCatFlow) code repository.\n",
    "\n",
    "A file with sample data can be downloaded [here](https://raw.githubusercontent.com/bumatic/PyCatFlow/main/example/sample_data_ChatterBot_Requirements.csv). If you want to experiment with this data, make sure it is saved with the extension '.csv'."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "65622b3d-803e-40cd-b2fe-95e4231e080e",
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "import pycatflow as pcf\n",
    "import ipywidgets as widgets\n",
    "from io import StringIO\n",
    "from IPython.display import display"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e7cfb77d-36b4-4afc-98b0-9427c9a81276",
   "metadata": {},
   "source": [
    "## Step 1: Loading data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a0606af9-1c1f-4436-98a6-78c3f9e92ecb",
   "metadata": {},
   "outputs": [],
   "source": [
    "uploader = widgets.FileUpload(accept='.csv', description='Select CSV file')\n",
    "display(uploader)\n",
    "\n",
    "separator = widgets.Dropdown(options=[('Tab', '\\t'), ('Comma', ','), ('Semicolon', ';')],\n",
    "                             value='\\t',\n",
    "                             description='Separator:',\n",
    "                             disabled=False)\n",
    "display(separator)\n",
    "\n",
    "# After running this cell and selecting a file with the upload widget,\n",
    "# do NOT rerun this cell: rerunning resets the selection. Proceed with the next cell."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "da54714d-6aaa-4f38-8972-ce7d6ec8e26f",
   "metadata": {},
   "outputs": [],
   "source": [
    "data_loaded = False\n",
    "# Assumes ipywidgets 7.x, where uploader.value is a dict keyed by file name.\n",
    "if len(uploader.value) == 1:\n",
    "    data_file = next(iter(uploader.value))\n",
    "    raw_data = uploader.value[data_file]['content'].decode('UTF-8')\n",
    "    data = pd.read_csv(StringIO(raw_data), sep=separator.value)\n",
    "    columns = list(data.columns)\n",
    "    print('First 5 rows of loaded data:')\n",
    "    print()\n",
    "    print(data.head(5))\n",
    "    print()\n",
    "    data_loaded = True\n",
    "else:\n",
    "    print('Please select a data file before running this cell.')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c551a553-c1db-4fba-8899-98b5de51a4fe",
   "metadata": {},
   "source": [
    "## Step 2: Mapping data columns to the visualization"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9d4f61d6-738a-4d6b-916c-b7b5d4e50c7a",
   "metadata": {},
   "outputs": [],
   "source": [
    "if data_loaded:\n",
    "    style = {'description_width': 'initial'}\n",
    "    print()\n",
    "    print('Select the data columns for the visualization. (Leave a field blank if it does not apply.)')\n",
    "    print()\n",
    "    viz_columns = widgets.Dropdown(options=columns, value=columns[0], description='Viz columns*: ', disabled=False, style=style)\n",
    "    display(viz_columns)\n",
    "\n",
    "    viz_nodes = widgets.Dropdown(options=columns, value=columns[1], description='Viz nodes*: ', disabled=False, style=style)\n",
    "    display(viz_nodes)\n",
    "\n",
    "    viz_category = widgets.Dropdown(options=columns, value=None, description='Viz category: ', disabled=False, style=style)\n",
    "    display(viz_category)\n",
    "\n",
    "    viz_col_order = widgets.Dropdown(options=columns, value=None, description='Column order: ', disabled=False, style=style)\n",
    "    display(viz_col_order)\n",
    "else:\n",
    "    print('No data loaded. Start in the cells above.')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c02676cb-d223-4ca8-8f44-cee891a871ce",
   "metadata": {},
   "source": [
    "## Step 3: Set properties of the visualization"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9a387d29-b2b5-4eea-b361-1c9310b6cdda",
   "metadata": {},
   "outputs": [],
   "source": [
    "data = pcf.read(raw_data, columns=viz_columns.value, \n",
    "                nodes=viz_nodes.value, categories=viz_category.value, \n",
    "                column_order=viz_col_order.value, delimiter=separator.value)\n",
    "\n",
    "if data:\n",
    "    viz_width = widgets.IntText(value=800, description='Width:', disabled=False, style=style)\n",
    "    display(viz_width)\n",
    "    viz_node_min = widgets.IntText(value=2, description='Node min size:', disabled=False, style=style) \n",
    "    display(viz_node_min)\n",
    "    viz_node_max = widgets.IntText(value=20, description='Node max size:', disabled=False, style=style) \n",
    "    display(viz_node_max)\n",
    "    viz_spacing = widgets.IntText(value=20, description='Node spacing:', disabled=False, style=style)\n",
    "    display(viz_spacing)\n",
    "    viz_connection_type = widgets.Dropdown(options=['semi-curved', 'curved', 'straight'], description='Connection type: ', \n",
    "                                           disabled=False, style=style )\n",
    "    display(viz_connection_type)\n",
    "    viz_order = widgets.Dropdown(options=['frequency', 'alphabetical', 'category'], description='Sort nodes by: ', \n",
    "                                           disabled=False, style=style )\n",
    "    display(viz_order)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "169a5d01-f810-4448-8f4d-fae36ee19beb",
   "metadata": {},
   "outputs": [],
   "source": [
    "viz = pcf.visualize(data, spacing=viz_spacing.value, width=viz_width.value, maxValue=viz_node_max.value, \n",
    "                    minValue=viz_node_min.value, connection_type=viz_connection_type.value, sort_by=viz_order.value)\n",
    "viz"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2a4981eb-0f39-48aa-9d62-54df0a20c550",
   "metadata": {},
   "source": [
    "## Step 4: Save the result"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4938f8be-e04d-40bf-94e9-a4a207c1e3f4",
   "metadata": {},
   "outputs": [],
   "source": [
    "viz_file_name = widgets.Text(value='CatFlow_visual', description='File name: ')\n",
    "display(viz_file_name)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c4e8f24a-6162-4c07-913a-f10bd94269c0",
   "metadata": {},
   "outputs": [],
   "source": [
    "fname = viz_file_name.value+'.svg'\n",
    "viz.saveSvg(fname)\n",
    "fname = viz_file_name.value+'.png'\n",
    "viz.savePng(fname) "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "21faf2f0-be22-4b78-ac66-6aa42f820667",
   "metadata": {},
   "source": [
    "**To download the result files, select them in the file list on the left-hand side, open the context menu with a right click or two-finger tap, and choose \"Download\" from the options.**"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "491323c0-3dbf-4ba6-a5dd-29a5290b44f6",
   "metadata": {},
   "source": [
    "## Advanced settings\n",
    "\n",
    "PyCatFlow offers more settings to adjust the visualization. To use them, generate graphs by invoking the `visualize` function with your custom settings. The following example contains all parameters that can be passed to adjust the graph. Copy the code into a new code cell to run it and customize your visualization.\n",
    "\n",
    "\n",
    "```Python\n",
    "viz = pcf.visualize(data, spacing=50, node_size=10, width=None, height=None, minValue=1,\n",
    "                    maxValue=10, node_scaling=\"linear\", connection_type=\"semi-curved\",\n",
    "                    color_startEnd=True, color_categories=True, nodes_color=\"gray\",\n",
    "                    start_node_color=\"green\", end_node_color=\"red\", palette=None, show_labels=True,\n",
    "                    label_text=\"item\", label_font=\"sans-serif\", label_color=\"black\", label_size=5,\n",
    "                    label_shortening=\"clip\", label_position=\"nodes\", line_opacity=0.5,\n",
    "                    line_stroke_color=\"white\", line_stroke_width=0.5,\n",
    "                    line_stroke_thick=0.5, legend=True, sort_by=\"frequency\")\n",
    "```\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7dd17601-33cc-4fb2-bc8f-ff7315c63ce1",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6279beef-a789-424d-a5f0-189fb96749d2",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.8"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}

## requirements.txt
pycatflow
pandas