@infinite-Joy
Last active August 29, 2017 12:21
description for principal component analysis
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"I have downloaded data for three companies listed on the Indian stock market from [Quandl](https://www.quandl.com/). We will use this data to try to understand the Indian market ecosystem."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The data are for Hitech Corporation Ltd., Housing and Urban Development Corporation Limited (HUDCO), and Bhagyanagar India Limited. The features of each dataset are the date, the opening price, the highest price reached during the day, the lowest price during the day, the last traded price before closing, the closing price, the total traded quantity, and the turnover in lakhs."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To load the datasets we will use the pandas library. I am not sharing the data with this notebook because it has to be downloaded from the Quandl website. If you are interested, you can create a free account there and download the datasets you want."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style>\n",
" .dataframe thead tr:only-child th {\n",
" text-align: right;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: left;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Date</th>\n",
" <th>Open</th>\n",
" <th>High</th>\n",
" <th>Low</th>\n",
" <th>Last</th>\n",
" <th>Close</th>\n",
" <th>Total Trade Quantity</th>\n",
" <th>Turnover (Lacs)</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>2017-08-02</td>\n",
" <td>209.10</td>\n",
" <td>209.95</td>\n",
" <td>204.1</td>\n",
" <td>205.80</td>\n",
" <td>206.70</td>\n",
" <td>2935.0</td>\n",
" <td>6.09</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>2017-08-01</td>\n",
" <td>212.00</td>\n",
" <td>214.95</td>\n",
" <td>208.3</td>\n",
" <td>208.30</td>\n",
" <td>209.15</td>\n",
" <td>5094.0</td>\n",
" <td>10.78</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>2017-07-31</td>\n",
" <td>212.00</td>\n",
" <td>215.00</td>\n",
" <td>210.2</td>\n",
" <td>210.35</td>\n",
" <td>212.00</td>\n",
" <td>6803.0</td>\n",
" <td>14.47</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>2017-07-28</td>\n",
" <td>208.00</td>\n",
" <td>213.95</td>\n",
" <td>208.0</td>\n",
" <td>211.95</td>\n",
" <td>211.25</td>\n",
" <td>2023.0</td>\n",
" <td>4.28</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>2017-07-27</td>\n",
" <td>213.05</td>\n",
" <td>215.30</td>\n",
" <td>209.0</td>\n",
" <td>210.95</td>\n",
" <td>210.50</td>\n",
" <td>6714.0</td>\n",
" <td>14.23</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Date Open High Low Last Close Total Trade Quantity \\\n",
"0 2017-08-02 209.10 209.95 204.1 205.80 206.70 2935.0 \n",
"1 2017-08-01 212.00 214.95 208.3 208.30 209.15 5094.0 \n",
"2 2017-07-31 212.00 215.00 210.2 210.35 212.00 6803.0 \n",
"3 2017-07-28 208.00 213.95 208.0 211.95 211.25 2023.0 \n",
"4 2017-07-27 213.05 215.30 209.0 210.95 210.50 6714.0 \n",
"\n",
" Turnover (Lacs) \n",
"0 6.09 \n",
"1 10.78 \n",
"2 14.47 \n",
"3 4.28 \n",
"4 14.23 "
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import pandas as pd\n",
"df_hitech = pd.read_csv(\"NSE-HITECHCORP.csv\")\n",
"df_hitech.head()"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style>\n",
" .dataframe thead tr:only-child th {\n",
" text-align: right;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: left;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Date</th>\n",
" <th>Open</th>\n",
" <th>High</th>\n",
" <th>Low</th>\n",
" <th>Last</th>\n",
" <th>Close</th>\n",
" <th>Total Trade Quantity</th>\n",
" <th>Turnover (Lacs)</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>2017-08-02</td>\n",
" <td>24.50</td>\n",
" <td>24.60</td>\n",
" <td>23.80</td>\n",
" <td>23.85</td>\n",
" <td>24.00</td>\n",
" <td>5857.0</td>\n",
" <td>1.42</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>2017-08-01</td>\n",
" <td>24.75</td>\n",
" <td>25.45</td>\n",
" <td>24.20</td>\n",
" <td>24.25</td>\n",
" <td>24.30</td>\n",
" <td>12622.0</td>\n",
" <td>3.10</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>2017-07-31</td>\n",
" <td>25.05</td>\n",
" <td>25.50</td>\n",
" <td>24.20</td>\n",
" <td>24.80</td>\n",
" <td>24.85</td>\n",
" <td>31276.0</td>\n",
" <td>7.80</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>2017-07-28</td>\n",
" <td>25.15</td>\n",
" <td>25.40</td>\n",
" <td>24.60</td>\n",
" <td>24.80</td>\n",
" <td>24.90</td>\n",
" <td>11483.0</td>\n",
" <td>2.87</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>2017-07-27</td>\n",
" <td>26.00</td>\n",
" <td>26.85</td>\n",
" <td>25.05</td>\n",
" <td>25.15</td>\n",
" <td>25.45</td>\n",
" <td>19290.0</td>\n",
" <td>4.94</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Date Open High Low Last Close Total Trade Quantity \\\n",
"0 2017-08-02 24.50 24.60 23.80 23.85 24.00 5857.0 \n",
"1 2017-08-01 24.75 25.45 24.20 24.25 24.30 12622.0 \n",
"2 2017-07-31 25.05 25.50 24.20 24.80 24.85 31276.0 \n",
"3 2017-07-28 25.15 25.40 24.60 24.80 24.90 11483.0 \n",
"4 2017-07-27 26.00 26.85 25.05 25.15 25.45 19290.0 \n",
"\n",
" Turnover (Lacs) \n",
"0 1.42 \n",
"1 3.10 \n",
"2 7.80 \n",
"3 2.87 \n",
"4 4.94 "
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_bhagyanar = pd.read_csv(\"NSE-BHAGYANGR.csv\")\n",
"df_bhagyanar.head()"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style>\n",
" .dataframe thead tr:only-child th {\n",
" text-align: right;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: left;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Date</th>\n",
" <th>Open</th>\n",
" <th>High</th>\n",
" <th>Low</th>\n",
" <th>Last</th>\n",
" <th>Close</th>\n",
" <th>Total Trade Quantity</th>\n",
" <th>Turnover (Lacs)</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>2017-08-02</td>\n",
" <td>81.30</td>\n",
" <td>83.30</td>\n",
" <td>80.80</td>\n",
" <td>82.50</td>\n",
" <td>82.65</td>\n",
" <td>8086692.0</td>\n",
" <td>6663.36</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>2017-08-01</td>\n",
" <td>82.05</td>\n",
" <td>82.45</td>\n",
" <td>80.75</td>\n",
" <td>81.00</td>\n",
" <td>81.05</td>\n",
" <td>4152813.0</td>\n",
" <td>3382.64</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>2017-07-31</td>\n",
" <td>83.40</td>\n",
" <td>83.55</td>\n",
" <td>81.40</td>\n",
" <td>81.60</td>\n",
" <td>81.80</td>\n",
" <td>6389486.0</td>\n",
" <td>5269.27</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>2017-07-28</td>\n",
" <td>82.20</td>\n",
" <td>84.40</td>\n",
" <td>80.75</td>\n",
" <td>82.95</td>\n",
" <td>83.30</td>\n",
" <td>19771547.0</td>\n",
" <td>16383.13</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>2017-07-27</td>\n",
" <td>83.75</td>\n",
" <td>85.45</td>\n",
" <td>82.35</td>\n",
" <td>82.85</td>\n",
" <td>82.70</td>\n",
" <td>12314817.0</td>\n",
" <td>10310.98</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Date Open High Low Last Close Total Trade Quantity \\\n",
"0 2017-08-02 81.30 83.30 80.80 82.50 82.65 8086692.0 \n",
"1 2017-08-01 82.05 82.45 80.75 81.00 81.05 4152813.0 \n",
"2 2017-07-31 83.40 83.55 81.40 81.60 81.80 6389486.0 \n",
"3 2017-07-28 82.20 84.40 80.75 82.95 83.30 19771547.0 \n",
"4 2017-07-27 83.75 85.45 82.35 82.85 82.70 12314817.0 \n",
"\n",
" Turnover (Lacs) \n",
"0 6663.36 \n",
"1 3382.64 \n",
"2 5269.27 \n",
"3 16383.13 \n",
"4 10310.98 "
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_hudco = pd.read_csv(\"NSE-HUDCO.csv\")\n",
"df_hudco.head()"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"((51, 8), (55, 8), (53, 8))"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_hitech.shape, df_bhagyanar.shape, df_hudco.shape"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"So our three datasets are matrices of dimensions 51 x 8, 55 x 8 and 53 x 8, where the columns are the different features and every row is the sample for one trading date."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Visualization\n",
"\n",
"To get a feel for how the different stocks behave in terms of opening price and total trade quantity, let us look at a few recent trading days and plot the opening prices as a line chart."
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"import numpy as np"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Get the relevant data. Right now we are only interested in visualisation, and since we cannot show all the features in one plot, we will focus on a few specific columns for the six most recent trading days."
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [],
"source": [
"X1 = df_hitech.loc[0:5, ['Date', 'Open', 'Total Trade Quantity']].values\n",
"X2 = df_bhagyanar.loc[0:5,['Date', 'Open', 'Total Trade Quantity']].values\n",
"X3 = df_hudco.loc[0:5, ['Date', 'Open', 'Total Trade Quantity']].values"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# bokeh.charts is the high-level charts API bundled with older Bokeh releases (this notebook ran on 0.12.5).\n",
"from bokeh.charts import Area, Line, TimeSeries, output_file, show\n",
"from bokeh.io import output_notebook"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
],
"source": [
"output_notebook()"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[array(['2017-08-02', 209.1, 2935.0], dtype=object), array(['2017-08-01', 212.0, 5094.0], dtype=object), array(['2017-07-31', 212.0, 6803.0], dtype=object), array(['2017-07-28', 208.0, 2023.0], dtype=object), array(['2017-07-27', 213.05, 6714.0], dtype=object), array(['2017-07-26', 218.95, 3389.0], dtype=object)]\n",
"[array(['2017-08-02', 24.5, 5857.0], dtype=object), array(['2017-08-01', 24.75, 12622.0], dtype=object), array(['2017-07-31', 25.05, 31276.0], dtype=object), array(['2017-07-28', 25.15, 11483.0], dtype=object), array(['2017-07-27', 26.0, 19290.0], dtype=object), array(['2017-07-26', 26.5, 17808.0], dtype=object)]\n"
]
}
],
"source": [
"# Get a feel for the data.\n",
"\n",
"print([x for x in X1])\n",
"print([x for x in X2])"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[Bokeh line chart: opening price by date, 2017-07-26 to 2017-08-02, with one line each for the series hitech and bhagyagar]"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Plot the opening prices of two of the companies over the selected dates.\n",
"\n",
"data = dict(\n",
"    hitech=[x[1] for x in X1],\n",
"    bhagyagar=[x[1] for x in X2],\n",
"    date=[x[0] for x in X1],\n",
")\n",
"\n",
"line_plot = Line(data, x='date', title=\"Opening prices\", legend=\"top_left\",\n",
"                 xlabel='date', ylabel='opening price')\n",
"show(line_plot)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Standardizing\n",
"\n",
"We want to compare apples to apples and oranges to oranges. Generally we care more about how the values change than about the scale on which each feature was published. The features in this dataset are measured on very different scales (prices, traded quantity, turnover), so it is wise to standardise each feature to unit scale, with mean 0 and variance 1."
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(159, 7)\n"
]
}
],
"source": [
"from sklearn.preprocessing import StandardScaler\n",
"\n",
"column_names = ['Open', 'High', 'Low', 'Last', 'Close', 'Total Trade Quantity', 'Turnover (Lacs)']\n",
"\n",
"X_hitech = df_hitech.loc[:, column_names].values\n",
"X_bhagyanagar = df_bhagyanar.loc[:, column_names].values\n",
"X_hudco = df_hudco.loc[:, column_names].values\n",
"\n",
"X = np.concatenate([X_hitech, X_bhagyanagar, X_hudco], axis=0)\n",
"\n",
"X_std = StandardScaler().fit_transform(X)\n",
"\n",
"print(X_std.shape)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Covariance matrix\n",
"\n",
"--------------\n",
"\n",
"One route is to perform the eigen decomposition on the covariance matrix.\n",
"\n",
"In code, the covariance matrix can be computed as shown below.\n",
"\n",
"Please note that the shape of the standardized matrix is 159 x 7, so we should get a covariance matrix of shape 7 x 7."
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Covariance matrix \n",
"[[ 1.00632911 1.00550536 1.00597911 1.00542205 1.00545316 -0.06315605\n",
" -0.05901863]\n",
" [ 1.00550536 1.00632911 1.00541311 1.00598897 1.00602285 -0.0542054\n",
" -0.0497978 ]\n",
" [ 1.00597911 1.00541311 1.00632911 1.00562847 1.00565687 -0.06570958\n",
" -0.06198619]\n",
" [ 1.00542205 1.00598897 1.00562847 1.00632911 1.00628689 -0.06051605\n",
" -0.05646143]\n",
" [ 1.00545316 1.00602285 1.00565687 1.00628689 1.00632911 -0.05997139\n",
" -0.05587568]\n",
" [-0.06315605 -0.0542054 -0.06570958 -0.06051605 -0.05997139 1.00632911\n",
" 0.99963087]\n",
" [-0.05901863 -0.0497978 -0.06198619 -0.05646143 -0.05587568 0.99963087\n",
" 1.00632911]]\n",
"Covariance matrix shape is: (7, 7)\n"
]
}
],
"source": [
"mean_vec = np.mean(X_std, axis=0)\n",
"\n",
"cov_mat = (X_std - mean_vec).T.dot((X_std - mean_vec)) / (X_std.shape[0]-1)\n",
"\n",
"print('Covariance matrix \\n{}'.format(cov_mat))\n",
"print('Covariance matrix shape is: {}'.format(cov_mat.shape))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The code above is only meant to show how the covariance is computed. In practice, NumPy's `cov` function is used. It expects one variable per row by default, which is why we pass the transpose."
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[[ 1.00632911 1.00550536 1.00597911 1.00542205 1.00545316 -0.06315605\n",
" -0.05901863]\n",
" [ 1.00550536 1.00632911 1.00541311 1.00598897 1.00602285 -0.0542054\n",
" -0.0497978 ]\n",
" [ 1.00597911 1.00541311 1.00632911 1.00562847 1.00565687 -0.06570958\n",
" -0.06198619]\n",
" [ 1.00542205 1.00598897 1.00562847 1.00632911 1.00628689 -0.06051605\n",
" -0.05646143]\n",
" [ 1.00545316 1.00602285 1.00565687 1.00628689 1.00632911 -0.05997139\n",
" -0.05587568]\n",
" [-0.06315605 -0.0542054 -0.06570958 -0.06051605 -0.05997139 1.00632911\n",
" 0.99963087]\n",
" [-0.05901863 -0.0497978 -0.06198619 -0.05646143 -0.05587568 0.99963087\n",
" 1.00632911]]\n"
]
}
],
"source": [
"covariance_matrix = np.cov(X_std.T)\n",
"print(covariance_matrix)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The next step is to perform an eigen decomposition on the above matrix. Eigen decomposition is the method by which we decompose a square matrix into its eigen vectors and eigen values. The eigen vectors are the directions (axes) onto which the data gets projected. Take a look at [wikipedia](https://en.wikipedia.org/wiki/Eigendecomposition_of_a_matrix) to learn more."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We find the eigen vectors and the eigen values using the numpy library. Note that `np.linalg.eig` returns the eigen vectors as the columns of the second array. In PCA the eigen vectors are called loadings."
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[ 5.04062300e+00 1.99469506e+00 6.70989466e-03 1.52221819e-03\n",
" 5.08344416e-04 2.04592597e-04 4.06913481e-05]\n",
"[[-0.44636953 -0.0255697 -0.00233638 0.61091357 0.39241169 -0.52200819\n",
" 0.02000256]\n",
" [-0.44626454 -0.03201096 -0.03801748 -0.31356635 0.66065443 0.51116882\n",
" -0.04791904]\n",
" [-0.44644488 -0.02362002 0.04386902 0.47716468 -0.50791923 0.55804311\n",
" -0.03353175]\n",
" [-0.44641063 -0.02742468 0.0018537 -0.3959805 -0.28922772 -0.3236239\n",
" -0.67437107]\n",
" [-0.44640923 -0.02782594 -0.0024592 -0.37808739 -0.25605417 -0.22367243\n",
" 0.73579569]\n",
" [ 0.04405545 -0.70563431 0.70623273 -0.01919751 0.03123937 -0.00518154\n",
" 0.0018351 ]\n",
" [ 0.04224484 -0.705916 -0.70558526 0.02560183 -0.03707468 0.00362514\n",
" -0.00206853]]\n"
]
}
],
"source": [
"eigen_vals, eigen_vecs = np.linalg.eig(covariance_matrix)\n",
"\n",
"print(eigen_vals)\n",
"print(eigen_vecs)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Why are the eigen values and eigen vectors important? Take a look at the math below. If this sounds too heady, you can skip this section and just focus on the code.\n",
"\n",
"We know that after performing the eigen decomposition we get the eigen values and the eigen vectors. Let's call them $\\lambda$ and $W$, where the columns of $W$ are the eigen vectors. In terms of principal component analysis, the scores $T$ are the product of the matrices $X$ and $W$, i.e.,\n",
"\n",
"\\begin{equation*}\n",
"T = X W\n",
"\\end{equation*}"
]
},
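{
"cell_type": "markdown",
"metadata": {},
"source": [
"To see why the eigen values matter, here is a short derivation sketch. It assumes $X$ is the standardized (and therefore centered) data matrix with $n$ rows. The symbols $C$ for the covariance matrix and $\\Lambda$ for the diagonal matrix of eigen values are notation introduced here only for illustration.\n",
"\n",
"\\begin{equation*}\n",
"C = \\frac{1}{n-1} X^{\\top} X, \\qquad C W = W \\Lambda, \\qquad W^{\\top} W = I\n",
"\\end{equation*}\n",
"\n",
"\\begin{equation*}\n",
"\\frac{1}{n-1} T^{\\top} T = \\frac{1}{n-1} W^{\\top} X^{\\top} X W = W^{\\top} C W = W^{\\top} W \\Lambda = \\Lambda\n",
"\\end{equation*}\n",
"\n",
"So the columns of $T$ are uncorrelated with each other, and the variance of the $i$-th column of $T$ is the $i$-th eigen value $\\lambda_i$."
]
},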
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Singular Value Decomposition\n",
"\n",
"-----------\n",
"\n",
"Computing the eigen values and eigen vectors of the covariance matrix is not the most computationally efficient way of finding the loadings, i.e. the matrix $W$. There is a shortcut that gives the same result: Singular Value Decomposition, or SVD.\n",
"\n",
"The idea is that the same input matrix $X$ can be broken down into three matrices $U$, $\\Sigma$, and $V^{*}$.\n",
"\n",
"\\begin{equation*}\n",
"X = U \\Sigma V^{*} \\quad\\quad\\quad\\quad\\quad\\quad \\text{- equation (1)}\n",
"\\end{equation*}\n",
"\n",
"The columns of $U$ are the left singular vectors and the columns of $V$ are the right singular vectors. $\\Sigma$ contains the singular values on its diagonal, ordered from largest to smallest. Please note that although $V^{*}$ denotes the conjugate transpose of $V$, we are dealing with real-valued data, so you can think of it as simply the transpose of $V$."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Interestingly, the matrix $V$ turns out to be identical to the loadings $W$, up to the sign of each column."
]
},
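{
"cell_type": "markdown",
"metadata": {},
"source": [
"The relation between $V$ and $W$ can be checked with a short derivation. This is a sketch that assumes $X$ is real and centered with $n$ rows, so that the covariance matrix is $C = X^{\\top} X / (n-1)$. The symbols $C$, $n$ and $\\sigma_i$ (the $i$-th singular value) are notation added here for illustration. Using $X = U \\Sigma V^{\\top}$ and $U^{\\top} U = I$:\n",
"\n",
"\\begin{equation*}\n",
"C = \\frac{1}{n-1} X^{\\top} X = \\frac{1}{n-1} V \\Sigma^{\\top} U^{\\top} U \\Sigma V^{\\top} = V \\left( \\frac{\\Sigma^{\\top} \\Sigma}{n-1} \\right) V^{\\top}\n",
"\\end{equation*}\n",
"\n",
"This has the form of an eigen decomposition of $C$: the columns of $V$ are its eigen vectors, and the eigen values are $\\lambda_i = \\sigma_i^{2} / (n-1)$. Eigen vectors are only determined up to sign, so individual columns of $V$ and $W$ may differ by a factor of $-1$."
]
},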
{
"cell_type": "markdown",
"metadata": {},
"source": [
"How is this transformation beneficial to us? Let's explore that below. We know that the scores $T$ are given by the equation\n",
"\n",
"\\begin{equation*}\n",
"T = X W\n",
"\\end{equation*}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Since $W$ is the same as $V$, we can write the scores as -\n",
"\n",
"\\begin{equation*}\n",
"T = X V\n",
"\\end{equation*}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Substituting equation (1) for $X$, we can write -\n",
"\n",
"\\begin{equation*}\n",
"X V = U \\Sigma V^{*} V\n",
"\\end{equation*}\n",
"\n",
"Since the columns of $V$ are orthonormal, $V^{*} V$ is the identity matrix, and we get -\n",
"\n",
"\\begin{equation*}\n",
"T = U \\Sigma\n",
"\\end{equation*}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"So we can easily find the scores once we have computed $U$ and $\\Sigma$, and standard linear algebra libraries provide the SVD directly. If you are interested in working through an example by hand, take a look at [this SVD computation example](http://www.d.umn.edu/~mhampton/m4326svd_example.pdf)."
]
},
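{
"cell_type": "markdown",
"metadata": {},
"source": [
"The identity $T = U \\Sigma = X V$ can be checked numerically. The following is a minimal sketch, not part of the analysis itself: it uses randomly generated stand-in data of the same shape as our matrix (the names `X_demo`, `scores_from_svd`, `scores_from_proj` and `eigen_vals_demo` are hypothetical), and relies only on `np.linalg.svd`, `np.linalg.eigvalsh` and `np.cov`.\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"# Hypothetical stand-in data: 159 rows, 7 features (not the stock dataset).\n",
"rng = np.random.RandomState(0)\n",
"X_demo = rng.normal(size=(159, 7))\n",
"# Standardize each column to mean 0 and variance 1.\n",
"X_demo = (X_demo - X_demo.mean(axis=0)) / X_demo.std(axis=0)\n",
"\n",
"# Thin SVD: X = U @ diag(s) @ Vt, singular values in descending order.\n",
"U, s, Vt = np.linalg.svd(X_demo, full_matrices=False)\n",
"\n",
"scores_from_svd = U * s               # T = U Sigma (scales column j of U by s[j])\n",
"scores_from_proj = X_demo.dot(Vt.T)   # T = X V\n",
"print(np.allclose(scores_from_svd, scores_from_proj))\n",
"\n",
"# Eigen values of the covariance matrix equal s**2 / (n - 1).\n",
"eigen_vals_demo = np.sort(np.linalg.eigvalsh(np.cov(X_demo.T)))[::-1]\n",
"print(np.allclose(eigen_vals_demo, s**2 / (X_demo.shape[0] - 1)))\n",
"```\n",
"\n",
"Both checks should print `True`. Replacing `X_demo` with the standardized matrix from this notebook should give scores that agree with the eigen decomposition route, possibly with flipped signs on some columns."
]
},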
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Selecting Principal Components\n",
"\n",
"Now we will select the principal components. This is mostly based on human judgement, but there are several rules of thumb that can be followed. A feel for how many components to keep ultimately comes through experience and experimentation."
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": true
},
"source": [
"So far, with the help of the eigen decomposition, we have found a new coordinate system for our data, with the eigen vectors as the axes."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We need to decide which eigen vectors can be dropped without losing too much information. The eigen vectors with the smallest eigen values carry the least information about how the data is distributed. Why is that so? Remember the definition of eigen vectors and eigen values.\n",
"\n",
"\\begin{equation*}\n",
"A \\nu = \\lambda \\nu\n",
"\\end{equation*}\n",
"\n",
"The eigen value $\\lambda$ is the factor by which the matrix $A$ stretches or shrinks the eigen vector $\\nu$. Here $A$ is the covariance matrix, so $\\lambda$ is the variance of the data along the direction $\\nu$. If $\\lambda$ is small, the data barely varies along $\\nu$, and dropping that direction loses little information."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let us see how much each eigen value contributes to the overall variance."
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[71.556013843924504, 99.872439612636256, 99.967692385037282, 99.989301591992117, 99.996517981727536, 99.999422351033218, 99.999999999999986]\n",
"7\n",
"[0, 71.556013843924504, 99.872439612636256, 99.967692385037282, 99.989301591992117, 99.996517981727536, 99.999422351033218]\n",
"7\n",
"[0, 2, 4, 6, 8, 10, 12]\n",
"[1, 3, 5, 7, 9, 11, 13]\n"
]
}
],
"source": [
"total = sum(eigen_vals)\n",
"distribution = [100 * (i/total) for i in sorted(eigen_vals, reverse=True)]\n",
"\n",
"cumulative_distribution = []\n",
"running_total = 0  # do not name this `sum`: that would shadow the built-in used above\n",
"for i in distribution:\n",
"    running_total += i\n",
"    cumulative_distribution.append(running_total)\n",
"\n",
"bottom = [0] + cumulative_distribution[:-1]\n",
"left = [2*x for x in range(7)]\n",
"right = [x + 1 for x in left]\n",
"print(cumulative_distribution)\n",
"print(len(cumulative_distribution))\n",
"print(bottom)\n",
"print(len(bottom))\n",
"print(list(left))\n",
"print(list(right))"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
" <div class=\"bk-root\">\n",
" <div class=\"bk-plotdiv\" id=\"c635add8-7ce1-4772-afd0-31b7f0a04aa4\"></div>\n",
" </div>\n",
"<script type=\"text/javascript\">\n",
" \n",
" (function(global) {\n",
" function now() {\n",
" return new Date();\n",
" }\n",
" \n",
" var force = false;\n",
" \n",
" if (typeof (window._bokeh_onload_callbacks) === \"undefined\" || force === true) {\n",
" window._bokeh_onload_callbacks = [];\n",
" window._bokeh_is_loading = undefined;\n",
" }\n",
" \n",
" \n",
" \n",
" if (typeof (window._bokeh_timeout) === \"undefined\" || force === true) {\n",
" window._bokeh_timeout = Date.now() + 0;\n",
" window._bokeh_failed_load = false;\n",
" }\n",
" \n",
" var NB_LOAD_WARNING = {'data': {'text/html':\n",
" \"<div style='background-color: #fdd'>\\n\"+\n",
" \"<p>\\n\"+\n",
" \"BokehJS does not appear to have successfully loaded. If loading BokehJS from CDN, this \\n\"+\n",
" \"may be due to a slow or bad network connection. Possible fixes:\\n\"+\n",
" \"</p>\\n\"+\n",
" \"<ul>\\n\"+\n",
" \"<li>re-rerun `output_notebook()` to attempt to load from CDN again, or</li>\\n\"+\n",
" \"<li>use INLINE resources instead, as so:</li>\\n\"+\n",
" \"</ul>\\n\"+\n",
" \"<code>\\n\"+\n",
" \"from bokeh.resources import INLINE\\n\"+\n",
" \"output_notebook(resources=INLINE)\\n\"+\n",
" \"</code>\\n\"+\n",
" \"</div>\"}};\n",
" \n",
" function display_loaded() {\n",
" if (window.Bokeh !== undefined) {\n",
" var el = document.getElementById(\"c635add8-7ce1-4772-afd0-31b7f0a04aa4\");\n",
" el.textContent = \"BokehJS \" + Bokeh.version + \" successfully loaded.\";\n",
" } else if (Date.now() < window._bokeh_timeout) {\n",
" setTimeout(display_loaded, 100)\n",
" }\n",
" }\n",
" \n",
" function run_callbacks() {\n",
" window._bokeh_onload_callbacks.forEach(function(callback) { callback() });\n",
" delete window._bokeh_onload_callbacks\n",
" console.info(\"Bokeh: all callbacks have finished\");\n",
" }\n",
" \n",
" function load_libs(js_urls, callback) {\n",
" window._bokeh_onload_callbacks.push(callback);\n",
" if (window._bokeh_is_loading > 0) {\n",
" console.log(\"Bokeh: BokehJS is being loaded, scheduling callback at\", now());\n",
" return null;\n",
" }\n",
" if (js_urls == null || js_urls.length === 0) {\n",
" run_callbacks();\n",
" return null;\n",
" }\n",
" console.log(\"Bokeh: BokehJS not loaded, scheduling load and callback at\", now());\n",
" window._bokeh_is_loading = js_urls.length;\n",
" for (var i = 0; i < js_urls.length; i++) {\n",
" var url = js_urls[i];\n",
" var s = document.createElement('script');\n",
" s.src = url;\n",
" s.async = false;\n",
" s.onreadystatechange = s.onload = function() {\n",
" window._bokeh_is_loading--;\n",
" if (window._bokeh_is_loading === 0) {\n",
" console.log(\"Bokeh: all BokehJS libraries loaded\");\n",
" run_callbacks()\n",
" }\n",
" };\n",
" s.onerror = function() {\n",
" console.warn(\"failed to load library \" + url);\n",
" };\n",
" console.log(\"Bokeh: injecting script tag for BokehJS library: \", url);\n",
" document.getElementsByTagName(\"head\")[0].appendChild(s);\n",
" }\n",
" };var element = document.getElementById(\"c635add8-7ce1-4772-afd0-31b7f0a04aa4\");\n",
" if (element == null) {\n",
" console.log(\"Bokeh: ERROR: autoload.js configured with elementid 'c635add8-7ce1-4772-afd0-31b7f0a04aa4' but no matching script tag was found. \")\n",
" return false;\n",
" }\n",
" \n",
" var js_urls = [];\n",
" \n",
" var inline_js = [\n",
" function(Bokeh) {\n",
" (function() {\n",
" var fn = function() {\n",
" var docs_json = {\"502285b9-bd66-4ecf-850e-3acb09e4995b\":{\"roots\":{\"references\":[{\"attributes\":{\"formatter\":{\"id\":\"1f844584-ecbc-4854-b2ec-1316a03430b9\",\"type\":\"BasicTickFormatter\"},\"plot\":{\"id\":\"44703491-df68-456e-ac5b-5a840d161af1\",\"subtype\":\"Figure\",\"type\":\"Plot\"},\"ticker\":{\"id\":\"fb34d75e-a9c3-494f-a14a-9381956744fe\",\"type\":\"BasicTicker\"}},\"id\":\"f4c86cc3-2d73-4cd7-8167-fe001907746d\",\"type\":\"LinearAxis\"},{\"attributes\":{\"active_drag\":\"auto\",\"active_scroll\":\"auto\",\"active_tap\":\"auto\",\"tools\":[{\"id\":\"fa666ec0-385c-42e6-907d-5eb611573353\",\"type\":\"PanTool\"},{\"id\":\"cb75a9d6-93ae-4b39-837b-7a1fca435538\",\"type\":\"WheelZoomTool\"},{\"id\":\"0a859ea4-5058-4dd5-9de0-76eabafcb6e6\",\"type\":\"BoxZoomTool\"},{\"id\":\"541dcc06-602f-43e3-a9cd-6f8a5814866a\",\"type\":\"SaveTool\"},{\"id\":\"3d9d1b5e-9e9e-479e-ab12-9b82e1a3d435\",\"type\":\"ResetTool\"},{\"id\":\"39738093-a1bb-4703-90a0-19be5f4777ea\",\"type\":\"HelpTool\"}]},\"id\":\"efba7139-3076-4e9c-9bc4-800627dd2f2c\",\"type\":\"Toolbar\"},{\"attributes\":{\"plot\":{\"id\":\"44703491-df68-456e-ac5b-5a840d161af1\",\"subtype\":\"Figure\",\"type\":\"Plot\"},\"ticker\":{\"id\":\"e6d9e86f-f0c8-4312-9192-ae151b92e9d3\",\"type\":\"BasicTicker\"}},\"id\":\"7baf36e6-6600-4645-8c06-750cba24b80e\",\"type\":\"Grid\"},{\"attributes\":{},\"id\":\"4a77aeb7-189f-44c4-8ea5-5244e6851666\",\"type\":\"ToolEvents\"},{\"attributes\":{},\"id\":\"1f844584-ecbc-4854-b2ec-1316a03430b9\",\"type\":\"BasicTickFormatter\"},{\"attributes\":{},\"id\":\"fb34d75e-a9c3-494f-a14a-9381956744fe\",\"type\":\"BasicTicker\"},{\"attributes\":{\"dimension\":1,\"plot\":{\"id\":\"44703491-df68-456e-ac5b-5a840d161af1\",\"subtype\":\"Figure\",\"type\":\"Plot\"},\"ticker\":{\"id\":\"fb34d75e-a9c3-494f-a14a-9381956744fe\",\"type\":\"BasicTicker\"}},\"id\":\"96a7745b-bbc2-4510-86a9-2b6b7a97d329\",\"type\":\"Grid\"},{\"attributes\":{\"bottom_units\":\"screen\",\"fill_alpha\":{\"value\":0.5},\"fi
ll_color\":{\"value\":\"lightgrey\"},\"left_units\":\"screen\",\"level\":\"overlay\",\"line_alpha\":{\"value\":1.0},\"line_color\":{\"value\":\"black\"},\"line_dash\":[4,4],\"line_width\":{\"value\":2},\"plot\":null,\"render_mode\":\"css\",\"right_units\":\"screen\",\"top_units\":\"screen\"},\"id\":\"66b20522-f8c6-49e5-b878-b9760ef8e7ef\",\"type\":\"BoxAnnotation\"},{\"attributes\":{\"callback\":null,\"column_names\":[\"left\",\"right\",\"top\",\"bottom\"],\"data\":{\"bottom\":[0,71.5560138439245,99.87243961263626,99.96769238503728,99.98930159199212,99.99651798172754,99.99942235103322],\"left\":[1,2,3,4,5,6,7],\"right\":[1.2,2.2,3.2,4.2,5.2,6.2,7.2],\"top\":[71.5560138439245,99.87243961263626,99.96769238503728,99.98930159199212,99.99651798172754,99.99942235103322,99.99999999999999]}},\"id\":\"e299804a-f399-486f-84d5-0a828ea8dd5e\",\"type\":\"ColumnDataSource\"},{\"attributes\":{\"plot\":{\"id\":\"44703491-df68-456e-ac5b-5a840d161af1\",\"subtype\":\"Figure\",\"type\":\"Plot\"}},\"id\":\"fa666ec0-385c-42e6-907d-5eb611573353\",\"type\":\"PanTool\"},{\"attributes\":{\"bottom\":{\"field\":\"bottom\"},\"fill_color\":{\"value\":\"#B3DE69\"},\"left\":{\"field\":\"left\"},\"line_color\":{\"value\":\"#B3DE69\"},\"right\":{\"field\":\"right\"},\"top\":{\"field\":\"top\"}},\"id\":\"4fc59f1c-b4aa-4592-a6cf-67ed993e200e\",\"type\":\"Quad\"},{\"attributes\":{\"overlay\":{\"id\":\"66b20522-f8c6-49e5-b878-b9760ef8e7ef\",\"type\":\"BoxAnnotation\"},\"plot\":{\"id\":\"44703491-df68-456e-ac5b-5a840d161af1\",\"subtype\":\"Figure\",\"type\":\"Plot\"}},\"id\":\"0a859ea4-5058-4dd5-9de0-76eabafcb6e6\",\"type\":\"BoxZoomTool\"},{\"attributes\":{\"plot\":{\"id\":\"44703491-df68-456e-ac5b-5a840d161af1\",\"subtype\":\"Figure\",\"type\":\"Plot\"}},\"id\":\"cb75a9d6-93ae-4b39-837b-7a1fca435538\",\"type\":\"WheelZoomTool\"},{\"attributes\":{\"plot\":{\"id\":\"44703491-df68-456e-ac5b-5a840d161af1\",\"subtype\":\"Figure\",\"type\":\"Plot\"}},\"id\":\"541dcc06-602f-43e3-a9cd-6f8a5814866a\",\"type\":\
"SaveTool\"},{\"attributes\":{\"data_source\":{\"id\":\"e299804a-f399-486f-84d5-0a828ea8dd5e\",\"type\":\"ColumnDataSource\"},\"glyph\":{\"id\":\"4fc59f1c-b4aa-4592-a6cf-67ed993e200e\",\"type\":\"Quad\"},\"hover_glyph\":null,\"muted_glyph\":null,\"nonselection_glyph\":{\"id\":\"935f00a4-432e-475a-ab32-851b2188938a\",\"type\":\"Quad\"},\"selection_glyph\":null},\"id\":\"ef3ec58e-7a0c-4ac0-ad3c-489d52f5a4c7\",\"type\":\"GlyphRenderer\"},{\"attributes\":{\"plot\":{\"id\":\"44703491-df68-456e-ac5b-5a840d161af1\",\"subtype\":\"Figure\",\"type\":\"Plot\"}},\"id\":\"3d9d1b5e-9e9e-479e-ab12-9b82e1a3d435\",\"type\":\"ResetTool\"},{\"attributes\":{\"plot\":{\"id\":\"44703491-df68-456e-ac5b-5a840d161af1\",\"subtype\":\"Figure\",\"type\":\"Plot\"}},\"id\":\"39738093-a1bb-4703-90a0-19be5f4777ea\",\"type\":\"HelpTool\"},{\"attributes\":{\"bottom\":{\"field\":\"bottom\"},\"fill_alpha\":{\"value\":0.1},\"fill_color\":{\"value\":\"#1f77b4\"},\"left\":{\"field\":\"left\"},\"line_alpha\":{\"value\":0.1},\"line_color\":{\"value\":\"#1f77b4\"},\"right\":{\"field\":\"right\"},\"top\":{\"field\":\"top\"}},\"id\":\"935f00a4-432e-475a-ab32-851b2188938a\",\"type\":\"Quad\"},{\"attributes\":{\"plot\":null,\"text\":\"\"},\"id\":\"43cbbf17-ae1e-455d-a2a5-1eb9712486d5\",\"type\":\"Title\"},{\"attributes\":{},\"id\":\"17e9891c-2270-4963-9a51-3211183520d8\",\"type\":\"BasicTickFormatter\"},{\"attributes\":{\"below\":[{\"id\":\"1c681cfe-593c-4d45-b4a9-b7826c62931c\",\"type\":\"LinearAxis\"}],\"left\":[{\"id\":\"f4c86cc3-2d73-4cd7-8167-fe001907746d\",\"type\":\"LinearAxis\"}],\"plot_height\":400,\"plot_width\":400,\"renderers\":[{\"id\":\"1c681cfe-593c-4d45-b4a9-b7826c62931c\",\"type\":\"LinearAxis\"},{\"id\":\"7baf36e6-6600-4645-8c06-750cba24b80e\",\"type\":\"Grid\"},{\"id\":\"f4c86cc3-2d73-4cd7-8167-fe001907746d\",\"type\":\"LinearAxis\"},{\"id\":\"96a7745b-bbc2-4510-86a9-2b6b7a97d329\",\"type\":\"Grid\"},{\"id\":\"66b20522-f8c6-49e5-b878-b9760ef8e7ef\",\"type\":\"BoxAnnotation\"},{\"id\":\"ef3ec5
8e-7a0c-4ac0-ad3c-489d52f5a4c7\",\"type\":\"GlyphRenderer\"}],\"title\":{\"id\":\"43cbbf17-ae1e-455d-a2a5-1eb9712486d5\",\"type\":\"Title\"},\"tool_events\":{\"id\":\"4a77aeb7-189f-44c4-8ea5-5244e6851666\",\"type\":\"ToolEvents\"},\"toolbar\":{\"id\":\"efba7139-3076-4e9c-9bc4-800627dd2f2c\",\"type\":\"Toolbar\"},\"x_range\":{\"id\":\"db286247-63f4-4cf0-81a8-47b93270114b\",\"type\":\"DataRange1d\"},\"y_range\":{\"id\":\"9faae2aa-d12f-493f-b5bd-3da08a14abb7\",\"type\":\"DataRange1d\"}},\"id\":\"44703491-df68-456e-ac5b-5a840d161af1\",\"subtype\":\"Figure\",\"type\":\"Plot\"},{\"attributes\":{\"callback\":null},\"id\":\"db286247-63f4-4cf0-81a8-47b93270114b\",\"type\":\"DataRange1d\"},{\"attributes\":{\"callback\":null},\"id\":\"9faae2aa-d12f-493f-b5bd-3da08a14abb7\",\"type\":\"DataRange1d\"},{\"attributes\":{\"formatter\":{\"id\":\"17e9891c-2270-4963-9a51-3211183520d8\",\"type\":\"BasicTickFormatter\"},\"plot\":{\"id\":\"44703491-df68-456e-ac5b-5a840d161af1\",\"subtype\":\"Figure\",\"type\":\"Plot\"},\"ticker\":{\"id\":\"e6d9e86f-f0c8-4312-9192-ae151b92e9d3\",\"type\":\"BasicTicker\"}},\"id\":\"1c681cfe-593c-4d45-b4a9-b7826c62931c\",\"type\":\"LinearAxis\"},{\"attributes\":{},\"id\":\"e6d9e86f-f0c8-4312-9192-ae151b92e9d3\",\"type\":\"BasicTicker\"}],\"root_ids\":[\"44703491-df68-456e-ac5b-5a840d161af1\"]},\"title\":\"Bokeh Application\",\"version\":\"0.12.5\"}};\n",
" var render_items = [{\"docid\":\"502285b9-bd66-4ecf-850e-3acb09e4995b\",\"elementid\":\"c635add8-7ce1-4772-afd0-31b7f0a04aa4\",\"modelid\":\"44703491-df68-456e-ac5b-5a840d161af1\"}];\n",
" \n",
" Bokeh.embed.embed_items(docs_json, render_items);\n",
" };\n",
" if (document.readyState != \"loading\") fn();\n",
" else document.addEventListener(\"DOMContentLoaded\", fn);\n",
" })();\n",
" },\n",
" function(Bokeh) {\n",
" }\n",
" ];\n",
" \n",
" function run_inline_js() {\n",
" \n",
" if ((window.Bokeh !== undefined) || (force === true)) {\n",
" for (var i = 0; i < inline_js.length; i++) {\n",
" inline_js[i](window.Bokeh);\n",
" }if (force === true) {\n",
" display_loaded();\n",
" }} else if (Date.now() < window._bokeh_timeout) {\n",
" setTimeout(run_inline_js, 100);\n",
" } else if (!window._bokeh_failed_load) {\n",
" console.log(\"Bokeh: BokehJS failed to load within specified timeout.\");\n",
" window._bokeh_failed_load = true;\n",
" } else if (force !== true) {\n",
" var cell = $(document.getElementById(\"c635add8-7ce1-4772-afd0-31b7f0a04aa4\")).parents('.cell').data().cell;\n",
" cell.output_area.append_execute_result(NB_LOAD_WARNING)\n",
" }\n",
" \n",
" }\n",
" \n",
" if (window._bokeh_is_loading === 0) {\n",
" console.log(\"Bokeh: BokehJS loaded, going straight to plotting\");\n",
" run_inline_js();\n",
" } else {\n",
" load_libs(js_urls, function() {\n",
" console.log(\"Bokeh: BokehJS plotting callback run at\", now());\n",
" run_inline_js();\n",
" });\n",
" }\n",
" }(this));\n",
"</script>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from bokeh.plotting import figure, show\n",
"\n",
"p = figure(width=400, height=400)\n",
"p.quad(top=cumulative_distribution,\n",
" bottom=bottom,\n",
" left=[1, 2, 3, 4, 5, 6, 7],\n",
" right=[1.2, 2.2, 3.2, 4.2, 5.2, 6.2, 7.2],\n",
" color=\"#B3DE69\")\n",
"\n",
"show(p)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can see that after the first two components, the contribution of the remaining components to the total is so small that it cannot be seen on the chart. They are there, and you can zoom in to see them. The first two principal components together explain more than 99% of the variance in our data."
]
},
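{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a worked example using the eigen values printed earlier (rounded here to four decimals), the share of variance explained by the first $k$ components is the sum of the $k$ largest eigen values divided by the sum of all seven:\n",
"\n",
"\\begin{equation*}\n",
"\\frac{\\lambda_1}{\\sum_{i=1}^{7} \\lambda_i} \\approx \\frac{5.0406}{7.0443} \\approx 71.56\\%, \\qquad \\frac{\\lambda_1 + \\lambda_2}{\\sum_{i=1}^{7} \\lambda_i} \\approx \\frac{5.0406 + 1.9947}{7.0443} \\approx 99.87\\%\n",
"\\end{equation*}\n",
"\n",
"These agree with the first two entries of the cumulative distribution computed above."
]
},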
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now that you know where most of the variance in your dataset comes from, it becomes natural to trim the extra dimensions. Effectively, we construct a projection matrix to transform our data onto a new feature subspace. In this case we keep only the first two principal components and drop the rest, reducing our 7-dimensional feature space to a 2-dimensional feature subspace. We do this by choosing the 2 eigen vectors with the highest eigen values to construct our (7 x 2) eigen vector matrix W."
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[5.0406230005245538, 1.9946950557326693, 0.0067098946634397017, 0.0015222181861223813, 0.00050834441617230038, 0.00020459259729253709, 4.0691348102287684e-05]\n"
]
}
],
"source": [
"# Mapping eigen value to eigen vector\n",
"eigen_pairs = [(np.abs(eigen_vals[i]), eigen_vecs[:,i]) for i in range(len(eigen_vals))]\n",
"\n",
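"# Sort the (eigenvalue, eigenvector) tuples from high to low by eigenvalue only,\n",
"# so that ties never fall back to comparing the eigenvector arrays\n",
"eigen_pairs.sort(key=lambda pair: pair[0], reverse=True)\n",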
"\n",
"# check if the value is decreasing\n",
"print([e[0] for e in eigen_pairs])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can see that the values are in descending order."
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[[-0.44636953 -0.0255697 ]\n",
" [-0.44626454 -0.03201096]\n",
" [-0.44644488 -0.02362002]\n",
" [-0.44641063 -0.02742468]\n",
" [-0.44640923 -0.02782594]\n",
" [ 0.04405545 -0.70563431]\n",
" [ 0.04224484 -0.705916 ]]\n"
]
}
],
"source": [
"matrix_w = np.hstack((eigen_pairs[0][1].reshape(7,1), \n",
" eigen_pairs[1][1].reshape(7,1)))\n",
"\n",
"print(matrix_w)"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": true
},
"source": [
"# Dimensionality Reduction\n",
"\n",
"Using this new loading matrix W, we can reduce the dimensions of the original matrix X. This can be done using the matrix multiplication property, whereby if you multiply two matrices of dimensions (m x n) and (n x p), you get a new matrix of dimensions (m x p). So in this case we multiply matrix X, which is (159 x 7) and matrix W, which is (7 x 2) to get a new matrix (159 x 2)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\\begin{equation*}\n",
"Y_{159 \\times 2} = X_{159 \\times 7} \\times W_{7 \\times 2}\n",
"\\end{equation*}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Take a look at the code below."
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"Y = X_std.dot(matrix_w)"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[-3.12570525, 0.1778062 ],\n",
" [-3.22460186, 0.17146108],\n",
" [-3.26460636, 0.16888734],\n",
" [-3.227314 , 0.17157395]])"
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"Y[:4]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Recall that the shapes of the hitech, bhagyanagar and hudco datasets were (51, 8), (55, 8) and (53, 8). To map each row of Y back to its company, we create a Z vector/list with shape (159, 1)."
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"from six.moves import zip\n",
"from bokeh.plotting import figure, show"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
" <div class=\"bk-root\">\n",
" <a href=\"http://bokeh.pydata.org\" target=\"_blank\" class=\"bk-logo bk-logo-small bk-logo-notebook\"></a>\n",
" <span id=\"a5af2be0-5fc5-496f-be86-9125a993041c\">Loading BokehJS ...</span>\n",
" </div>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/javascript": [
"\n",
"(function(global) {\n",
" function now() {\n",
" return new Date();\n",
" }\n",
"\n",
" var force = true;\n",
"\n",
" if (typeof (window._bokeh_onload_callbacks) === \"undefined\" || force === true) {\n",
" window._bokeh_onload_callbacks = [];\n",
" window._bokeh_is_loading = undefined;\n",
" }\n",
"\n",
"\n",
" \n",
" if (typeof (window._bokeh_timeout) === \"undefined\" || force === true) {\n",
" window._bokeh_timeout = Date.now() + 5000;\n",
" window._bokeh_failed_load = false;\n",
" }\n",
"\n",
" var NB_LOAD_WARNING = {'data': {'text/html':\n",
" \"<div style='background-color: #fdd'>\\n\"+\n",
" \"<p>\\n\"+\n",
" \"BokehJS does not appear to have successfully loaded. If loading BokehJS from CDN, this \\n\"+\n",
" \"may be due to a slow or bad network connection. Possible fixes:\\n\"+\n",
" \"</p>\\n\"+\n",
" \"<ul>\\n\"+\n",
" \"<li>re-rerun `output_notebook()` to attempt to load from CDN again, or</li>\\n\"+\n",
" \"<li>use INLINE resources instead, as so:</li>\\n\"+\n",
" \"</ul>\\n\"+\n",
" \"<code>\\n\"+\n",
" \"from bokeh.resources import INLINE\\n\"+\n",
" \"output_notebook(resources=INLINE)\\n\"+\n",
" \"</code>\\n\"+\n",
" \"</div>\"}};\n",
"\n",
" function display_loaded() {\n",
" if (window.Bokeh !== undefined) {\n",
" var el = document.getElementById(\"a5af2be0-5fc5-496f-be86-9125a993041c\");\n",
" el.textContent = \"BokehJS \" + Bokeh.version + \" successfully loaded.\";\n",
" } else if (Date.now() < window._bokeh_timeout) {\n",
" setTimeout(display_loaded, 100)\n",
" }\n",
" }\n",
"\n",
" function run_callbacks() {\n",
" window._bokeh_onload_callbacks.forEach(function(callback) { callback() });\n",
" delete window._bokeh_onload_callbacks\n",
" console.info(\"Bokeh: all callbacks have finished\");\n",
" }\n",
"\n",
" function load_libs(js_urls, callback) {\n",
" window._bokeh_onload_callbacks.push(callback);\n",
" if (window._bokeh_is_loading > 0) {\n",
" console.log(\"Bokeh: BokehJS is being loaded, scheduling callback at\", now());\n",
" return null;\n",
" }\n",
" if (js_urls == null || js_urls.length === 0) {\n",
" run_callbacks();\n",
" return null;\n",
" }\n",
" console.log(\"Bokeh: BokehJS not loaded, scheduling load and callback at\", now());\n",
" window._bokeh_is_loading = js_urls.length;\n",
" for (var i = 0; i < js_urls.length; i++) {\n",
" var url = js_urls[i];\n",
" var s = document.createElement('script');\n",
" s.src = url;\n",
" s.async = false;\n",
" s.onreadystatechange = s.onload = function() {\n",
" window._bokeh_is_loading--;\n",
" if (window._bokeh_is_loading === 0) {\n",
" console.log(\"Bokeh: all BokehJS libraries loaded\");\n",
" run_callbacks()\n",
" }\n",
" };\n",
" s.onerror = function() {\n",
" console.warn(\"failed to load library \" + url);\n",
" };\n",
" console.log(\"Bokeh: injecting script tag for BokehJS library: \", url);\n",
" document.getElementsByTagName(\"head\")[0].appendChild(s);\n",
" }\n",
" };var element = document.getElementById(\"a5af2be0-5fc5-496f-be86-9125a993041c\");\n",
" if (element == null) {\n",
" console.log(\"Bokeh: ERROR: autoload.js configured with elementid 'a5af2be0-5fc5-496f-be86-9125a993041c' but no matching script tag was found. \")\n",
" return false;\n",
" }\n",
"\n",
" var js_urls = [\"https://cdn.pydata.org/bokeh/release/bokeh-0.12.5.min.js\", \"https://cdn.pydata.org/bokeh/release/bokeh-widgets-0.12.5.min.js\"];\n",
"\n",
" var inline_js = [\n",
" function(Bokeh) {\n",
" Bokeh.set_log_level(\"info\");\n",
" },\n",
" \n",
" function(Bokeh) {\n",
" \n",
" },\n",
" \n",
" function(Bokeh) {\n",
" \n",
" document.getElementById(\"a5af2be0-5fc5-496f-be86-9125a993041c\").textContent = \"BokehJS is loading...\";\n",
" },\n",
" function(Bokeh) {\n",
" console.log(\"Bokeh: injecting CSS: https://cdn.pydata.org/bokeh/release/bokeh-0.12.5.min.css\");\n",
" Bokeh.embed.inject_css(\"https://cdn.pydata.org/bokeh/release/bokeh-0.12.5.min.css\");\n",
" console.log(\"Bokeh: injecting CSS: https://cdn.pydata.org/bokeh/release/bokeh-widgets-0.12.5.min.css\");\n",
" Bokeh.embed.inject_css(\"https://cdn.pydata.org/bokeh/release/bokeh-widgets-0.12.5.min.css\");\n",
" }\n",
" ];\n",
"\n",
" function run_inline_js() {\n",
" \n",
" if ((window.Bokeh !== undefined) || (force === true)) {\n",
" for (var i = 0; i < inline_js.length; i++) {\n",
" inline_js[i](window.Bokeh);\n",
" }if (force === true) {\n",
" display_loaded();\n",
" }} else if (Date.now() < window._bokeh_timeout) {\n",
" setTimeout(run_inline_js, 100);\n",
" } else if (!window._bokeh_failed_load) {\n",
" console.log(\"Bokeh: BokehJS failed to load within specified timeout.\");\n",
" window._bokeh_failed_load = true;\n",
" } else if (force !== true) {\n",
" var cell = $(document.getElementById(\"a5af2be0-5fc5-496f-be86-9125a993041c\")).parents('.cell').data().cell;\n",
" cell.output_area.append_execute_result(NB_LOAD_WARNING)\n",
" }\n",
"\n",
" }\n",
"\n",
" if (window._bokeh_is_loading === 0) {\n",
" console.log(\"Bokeh: BokehJS loaded, going straight to plotting\");\n",
" run_inline_js();\n",
" } else {\n",
" load_libs(js_urls, function() {\n",
" console.log(\"Bokeh: BokehJS plotting callback run at\", now());\n",
" run_inline_js();\n",
" });\n",
" }\n",
"}(this));"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"output_notebook()"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
" <div class=\"bk-root\">\n",
" <div class=\"bk-plotdiv\" id=\"a8ccc9c9-4c49-4c4d-b835-97ca1d75e9b4\"></div>\n",
" </div>\n",
"<script type=\"text/javascript\">\n",
" \n",
" (function(global) {\n",
" function now() {\n",
" return new Date();\n",
" }\n",
" \n",
" var force = false;\n",
" \n",
" if (typeof (window._bokeh_onload_callbacks) === \"undefined\" || force === true) {\n",
" window._bokeh_onload_callbacks = [];\n",
" window._bokeh_is_loading = undefined;\n",
" }\n",
" \n",
" \n",
" \n",
" if (typeof (window._bokeh_timeout) === \"undefined\" || force === true) {\n",
" window._bokeh_timeout = Date.now() + 0;\n",
" window._bokeh_failed_load = false;\n",
" }\n",
" \n",
" var NB_LOAD_WARNING = {'data': {'text/html':\n",
" \"<div style='background-color: #fdd'>\\n\"+\n",
" \"<p>\\n\"+\n",
" \"BokehJS does not appear to have successfully loaded. If loading BokehJS from CDN, this \\n\"+\n",
" \"may be due to a slow or bad network connection. Possible fixes:\\n\"+\n",
" \"</p>\\n\"+\n",
" \"<ul>\\n\"+\n",
" \"<li>re-rerun `output_notebook()` to attempt to load from CDN again, or</li>\\n\"+\n",
" \"<li>use INLINE resources instead, as so:</li>\\n\"+\n",
" \"</ul>\\n\"+\n",
" \"<code>\\n\"+\n",
" \"from bokeh.resources import INLINE\\n\"+\n",
" \"output_notebook(resources=INLINE)\\n\"+\n",
" \"</code>\\n\"+\n",
" \"</div>\"}};\n",
" \n",
" function display_loaded() {\n",
" if (window.Bokeh !== undefined) {\n",
" var el = document.getElementById(\"a8ccc9c9-4c49-4c4d-b835-97ca1d75e9b4\");\n",
" el.textContent = \"BokehJS \" + Bokeh.version + \" successfully loaded.\";\n",
" } else if (Date.now() < window._bokeh_timeout) {\n",
" setTimeout(display_loaded, 100)\n",
" }\n",
" }\n",
" \n",
" function run_callbacks() {\n",
" window._bokeh_onload_callbacks.forEach(function(callback) { callback() });\n",
" delete window._bokeh_onload_callbacks\n",
" console.info(\"Bokeh: all callbacks have finished\");\n",
" }\n",
" \n",
" function load_libs(js_urls, callback) {\n",
" window._bokeh_onload_callbacks.push(callback);\n",
" if (window._bokeh_is_loading > 0) {\n",
" console.log(\"Bokeh: BokehJS is being loaded, scheduling callback at\", now());\n",
" return null;\n",
" }\n",
" if (js_urls == null || js_urls.length === 0) {\n",
" run_callbacks();\n",
" return null;\n",
" }\n",
" console.log(\"Bokeh: BokehJS not loaded, scheduling load and callback at\", now());\n",
" window._bokeh_is_loading = js_urls.length;\n",
" for (var i = 0; i < js_urls.length; i++) {\n",
" var url = js_urls[i];\n",
" var s = document.createElement('script');\n",
" s.src = url;\n",
" s.async = false;\n",
" s.onreadystatechange = s.onload = function() {\n",
" window._bokeh_is_loading--;\n",
" if (window._bokeh_is_loading === 0) {\n",
" console.log(\"Bokeh: all BokehJS libraries loaded\");\n",
" run_callbacks()\n",
" }\n",
" };\n",
" s.onerror = function() {\n",
" console.warn(\"failed to load library \" + url);\n",
" };\n",
" console.log(\"Bokeh: injecting script tag for BokehJS library: \", url);\n",
" document.getElementsByTagName(\"head\")[0].appendChild(s);\n",
" }\n",
" };var element = document.getElementById(\"a8ccc9c9-4c49-4c4d-b835-97ca1d75e9b4\");\n",
" if (element == null) {\n",
" console.log(\"Bokeh: ERROR: autoload.js configured with elementid 'a8ccc9c9-4c49-4c4d-b835-97ca1d75e9b4' but no matching script tag was found. \")\n",
" return false;\n",
" }\n",
" \n",
" var js_urls = [];\n",
" \n",
" var inline_js = [\n",
" function(Bokeh) {\n",
" (function() {\n",
" var fn = function() {\n",
" var docs_json = {\"678e78cd-8de4-435c-9e90-4db36ca1341f\":{\"roots\":{\"references\":[{\"attributes\":{\"fill_alpha\":{\"value\":0.1},\"fill_color\":{\"value\":\"#1f77b4\"},\"line_alpha\":{\"value\":0.1},\"line_color\":{\"value\":\"#1f77b4\"},\"radius\":{\"units\":\"data\",\"value\":0.1},\"x\":{\"field\":\"x\"},\"y\":{\"field\":\"y\"}},\"id\":\"7af5e933-a84a-4757-b4cd-694bf3cf31ce\",\"type\":\"Circle\"},{\"attributes\":{\"callback\":null},\"id\":\"adedf9e9-5ba3-4285-b0f8-7617ebdc3bcf\",\"type\":\"DataRange1d\"},{\"attributes\":{\"formatter\":{\"id\":\"75674d76-518d-4175-b467-0e8cf856fcd6\",\"type\":\"BasicTickFormatter\"},\"plot\":{\"id\":\"1ac25a9a-40d6-4057-ac8f-7d4d78d61ff4\",\"subtype\":\"Figure\",\"type\":\"Plot\"},\"ticker\":{\"id\":\"2b9e6861-833a-48f0-972b-7df0b5d7827a\",\"type\":\"BasicTicker\"}},\"id\":\"c0b29853-f29a-4c79-b553-f84ea270d023\",\"type\":\"LinearAxis\"},{\"attributes\":{\"callback\":null},\"id\":\"8ef2d6f2-56b6-4fe7-b07c-0bfe5e1a6fa8\",\"type\":\"DataRange1d\"},{\"attributes\":{},\"id\":\"19971597-9eed-4b02-ad36-19d1241d9889\",\"type\":\"ToolEvents\"},{\"attributes\":{},\"id\":\"2b9e6861-833a-48f0-972b-7df0b5d7827a\",\"type\":\"BasicTicker\"},{\"attributes\":{},\"id\":\"f53936d3-4220-4849-bc98-00301739b411\",\"type\":\"BasicTicker\"},{\"attributes\":{\"plot\":{\"id\":\"1ac25a9a-40d6-4057-ac8f-7d4d78d61ff4\",\"subtype\":\"Figure\",\"type\":\"Plot\"},\"ticker\":{\"id\":\"2b9e6861-833a-48f0-972b-7df0b5d7827a\",\"type\":\"BasicTicker\"}},\"id\":\"9cf23c74-7b67-4ce6-bdfa-86a5fade06cd\",\"type\":\"Grid\"},{\"attributes\":{\"formatter\":{\"id\":\"09b11739-3ed0-433f-9244-707c52f5e57c\",\"type\":\"BasicTickFormatter\"},\"plot\":{\"id\":\"1ac25a9a-40d6-4057-ac8f-7d4d78d61ff4\",\"subtype\":\"Figure\",\"type\":\"Plot\"},\"ticker\":{\"id\":\"f53936d3-4220-4849-bc98-00301739b411\",\"type\":\"BasicTicker\"}},\"id\":\"61b82ade-afe0-4fd2-b4bf-0428ec2e58d3\",\"type\":\"LinearAxis\"},{\"attributes\":{\"plot\":{\"id\":\"1ac25a9a-40d6-4057-ac8f-7d4d78d61ff4\",\
"subtype\":\"Figure\",\"type\":\"Plot\"}},\"id\":\"2a14dbb9-b046-494d-bef9-58a2c98671bd\",\"type\":\"CrosshairTool\"},{\"attributes\":{\"dimension\":1,\"plot\":{\"id\":\"1ac25a9a-40d6-4057-ac8f-7d4d78d61ff4\",\"subtype\":\"Figure\",\"type\":\"Plot\"},\"ticker\":{\"id\":\"f53936d3-4220-4849-bc98-00301739b411\",\"type\":\"BasicTicker\"}},\"id\":\"e3faba8b-9262-4225-84e4-a7dc8adbb1d0\",\"type\":\"Grid\"},{\"attributes\":{\"data_source\":{\"id\":\"d150e53d-dd38-47a7-9722-e1cc47203952\",\"type\":\"ColumnDataSource\"},\"glyph\":{\"id\":\"9d7ef3bd-d4b3-4006-9d1b-3f3ba816043c\",\"type\":\"Circle\"},\"hover_glyph\":null,\"muted_glyph\":null,\"nonselection_glyph\":{\"id\":\"7af5e933-a84a-4757-b4cd-694bf3cf31ce\",\"type\":\"Circle\"},\"selection_glyph\":null},\"id\":\"2af25d98-6ba4-48c2-aac6-c0b1581854a9\",\"type\":\"GlyphRenderer\"},{\"attributes\":{\"fill_alpha\":{\"value\":0.6},\"fill_color\":{\"field\":\"fill_color\"},\"line_color\":{\"value\":null},\"radius\":{\"units\":\"data\",\"value\":0.1},\"x\":{\"field\":\"x\"},\"y\":{\"field\":\"y\"}},\"id\":\"9d7ef3bd-d4b3-4006-9d1b-3f3ba816043c\",\"type\":\"Circle\"},{\"attributes\":{\"bottom_units\":\"screen\",\"fill_alpha\":{\"value\":0.5},\"fill_color\":{\"value\":\"lightgrey\"},\"left_units\":\"screen\",\"level\":\"overlay\",\"line_alpha\":{\"value\":1.0},\"line_color\":{\"value\":\"black\"},\"line_dash\":[4,4],\"line_width\":{\"value\":2},\"plot\":null,\"render_mode\":\"css\",\"right_units\":\"screen\",\"top_units\":\"screen\"},\"id\":\"ba5cc905-77e9-468c-9968-7f071e4f8d04\",\"type\":\"BoxAnnotation\"},{\"attributes\":{\"plot\":{\"id\":\"1ac25a9a-40d6-4057-ac8f-7d4d78d61ff4\",\"subtype\":\"Figure\",\"type\":\"Plot\"}},\"id\":\"a6b3fed1-4e60-4163-a0db-2e04c12a5c56\",\"type\":\"ResizeTool\"},{\"attributes\":{\"plot\":{\"id\":\"1ac25a9a-40d6-4057-ac8f-7d4d78d61ff4\",\"subtype\":\"Figure\",\"type\":\"Plot\"}},\"id\":\"e46c5bf5-293f-46d5-b2dd-395b79528383\",\"type\":\"PanTool\"},{\"attributes\":{\"plot\":{\"id\":\"1ac25a9a-40d6-4
057-ac8f-7d4d78d61ff4\",\"subtype\":\"Figure\",\"type\":\"Plot\"}},\"id\":\"ffc8cc6f-95a5-4c22-b856-3210d86d4811\",\"type\":\"WheelZoomTool\"},{\"attributes\":{\"plot\":null,\"text\":\"\"},\"id\":\"46c89e89-4c95-4345-a4af-4c66c66a599a\",\"type\":\"Title\"},{\"attributes\":{\"plot\":{\"id\":\"1ac25a9a-40d6-4057-ac8f-7d4d78d61ff4\",\"subtype\":\"Figure\",\"type\":\"Plot\"}},\"id\":\"6c3f7196-bf1a-414e-830f-a629270cc563\",\"type\":\"SaveTool\"},{\"attributes\":{\"overlay\":{\"id\":\"ba5cc905-77e9-468c-9968-7f071e4f8d04\",\"type\":\"BoxAnnotation\"},\"plot\":{\"id\":\"1ac25a9a-40d6-4057-ac8f-7d4d78d61ff4\",\"subtype\":\"Figure\",\"type\":\"Plot\"}},\"id\":\"56b8824e-7697-4049-a67c-fdf9f06b0d1a\",\"type\":\"BoxZoomTool\"},{\"attributes\":{\"plot\":{\"id\":\"1ac25a9a-40d6-4057-ac8f-7d4d78d61ff4\",\"subtype\":\"Figure\",\"type\":\"Plot\"}},\"id\":\"c769b085-5fed-4e81-b948-86cc05faf6dc\",\"type\":\"ResetTool\"},{\"attributes\":{\"callback\":null,\"plot\":{\"id\":\"1ac25a9a-40d6-4057-ac8f-7d4d78d61ff4\",\"subtype\":\"Figure\",\"type\":\"Plot\"}},\"id\":\"a9423a26-9f4b-4eb4-a0cc-a29265871f41\",\"type\":\"TapTool\"},{\"attributes\":{\"callback\":null,\"overlay\":{\"id\":\"4bd1bbf3-c822-4b5e-8317-704642cebe74\",\"type\":\"BoxAnnotation\"},\"plot\":{\"id\":\"1ac25a9a-40d6-4057-ac8f-7d4d78d61ff4\",\"subtype\":\"Figure\",\"type\":\"Plot\"},\"renderers\":[{\"id\":\"2af25d98-6ba4-48c2-aac6-c0b1581854a9\",\"type\":\"GlyphRenderer\"}]},\"id\":\"bd8cdc0e-306a-4006-aa08-cb8c566108dd\",\"type\":\"BoxSelectTool\"},{\"attributes\":{\"overlay\":{\"id\":\"870572b2-d387-477a-b144-9632e78af3c9\",\"type\":\"PolyAnnotation\"},\"plot\":{\"id\":\"1ac25a9a-40d6-4057-ac8f-7d4d78d61ff4\",\"subtype\":\"Figure\",\"type\":\"Plot\"}},\"id\":\"adcebf04-865d-416f-bb66-cb72912e76e9\",\"type\":\"PolySelectTool\"},{\"attributes\":{\"callback\":null,\"overlay\":{\"id\":\"0af952fa-dfc2-43e7-b144-7c48dd080626\",\"type\":\"PolyAnnotation\"},\"plot\":{\"id\":\"1ac25a9a-40d6-4057-ac8f-7d4d78d61ff4\",\"subtype\":\"F
igure\",\"type\":\"Plot\"}},\"id\":\"ce485aac-5919-4e9c-a253-8f93db96dfc9\",\"type\":\"LassoSelectTool\"},{\"attributes\":{\"bottom_units\":\"screen\",\"fill_alpha\":{\"value\":0.5},\"fill_color\":{\"value\":\"lightgrey\"},\"left_units\":\"screen\",\"level\":\"overlay\",\"line_alpha\":{\"value\":1.0},\"line_color\":{\"value\":\"black\"},\"line_dash\":[4,4],\"line_width\":{\"value\":2},\"plot\":null,\"render_mode\":\"css\",\"right_units\":\"screen\",\"top_units\":\"screen\"},\"id\":\"4bd1bbf3-c822-4b5e-8317-704642cebe74\",\"type\":\"BoxAnnotation\"},{\"attributes\":{},\"id\":\"09b11739-3ed0-433f-9244-707c52f5e57c\",\"type\":\"BasicTickFormatter\"},{\"attributes\":{},\"id\":\"75674d76-518d-4175-b467-0e8cf856fcd6\",\"type\":\"BasicTickFormatter\"},{\"attributes\":{\"fill_alpha\":{\"value\":0.5},\"fill_color\":{\"value\":\"lightgrey\"},\"level\":\"overlay\",\"line_alpha\":{\"value\":1.0},\"line_color\":{\"value\":\"black\"},\"line_dash\":[4,4],\"line_width\":{\"value\":2},\"plot\":null,\"xs_units\":\"screen\",\"ys_units\":\"screen\"},\"id\":\"870572b2-d387-477a-b144-9632e78af3c9\",\"type\":\"PolyAnnotation\"},{\"attributes\":{\"fill_alpha\":{\"value\":0.5},\"fill_color\":{\"value\":\"lightgrey\"},\"level\":\"overlay\",\"line_alpha\":{\"value\":1.0},\"line_color\":{\"value\":\"black\"},\"line_dash\":[4,4],\"line_width\":{\"value\":2},\"plot\":null,\"xs_units\":\"screen\",\"ys_units\":\"screen\"},\"id\":\"0af952fa-dfc2-43e7-b144-7c48dd080626\",\"type\":\"PolyAnnotation\"},{\"attributes\":{\"callback\":null,\"column_names\":[\"x\",\"y\",\"fill_color\"],\"data\":{\"fill_color\":[\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4
496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\"],\"x\":{\"__ndarray__\":\"r1sTwXEBCcCj89UP/MsJwDZHDPHpHQrAlwkjAIrRCcDkn7EJsBQKwLn3bpKRyArApLo1GL15C8DjCLEUHcsMwEKYFR0ZOw3ABDFuqtQYDcBcWk5S0PAMwMwpM9VVqwzAQhVGoCQJDMANMWJ1rvgIwGHTcuK6VAjAeTCioBe+CMC9KvXb8QoJwDQ3cSyQSgrAxJ5a06ElCsC8rYburlMKwFelV/UqmArA45EJ3tFtCsDBEFFBNUcLwOenf51HpgvA5D/URgnBC8DplhFiBPgLwMU9nA5yhgzAqnuawS3KCcAOiCA9d6EHwGfqpjiqgAbAUVEy137RBsAQeKSN1tgGwKccEXryvQbAQaFXydzCBsBiKSOb4g8HwLD/wLY7CAfAAE7CWJGUBsCD3z5jGWQGwNJwusDL8AXA4Hw+ZlcKBsDiXhrzkSAHwMa8LunwRAfA+somp6ohB8D2RtMvXwoH
wFdMO3CJ3gbAF+xWlg8LB8DNfU+GfZcIwFcqhsgLwwfAJSW3/8NLBsBSabxsMboGwKik2alvZwfAR5Ub/VWDAUDr/eopUmkBQB5cobQnWAFA0Qw7JZVSAUDFcdy+cycBQIypGd8qBwFAcgCNKdPrAEB04ogl0tIAQM4iSDg33ABA6W0XquTWAEBmb/rI3RABQHUI/5pOPwFAJD98Cib+AEBwBYeMlPYAQKGRrO4UzQBA5HWMym/zAEBpI3GwzUwBQHhLyq7HfwFAnmZvN+JcAUAtUjrbq1sBQJrkbo45awFAjxx6f11qAUBih9fpiU4BQL8QLsbBagFA0zUV5uhDAUD9l+ZLPTwBQJ6edBN0awFAmOhuETVMAUC6iL5ozAoBQLSWzOwKLQFA4HZqGXt1AUDeEnMlb4MBQDtI+amaYwFAfSWx4+ZaAUBzKaXBfU4BQDLgK45+OwFAklH0gdQbAUAs8fQh82IBQAQjB3v4pgFArdrjJDp+AUDoyqJpEEsBQDhQPbktMgFAhlDz3B1JAUBEUMoxR/oAQDcBRb+L/gBAi43wEPqeAUD9kMFj7JsBQBd8SNdvygFAJkmMqaXAAUDAqkUBZqkBQLVsAFIDjwFAxiC3LtNTAUCDlylraUUBQOVC/g64PgFAai3/UWvQAECstBAniCvhPw7AuaJlSeE/tO8rheW84D+CEb/rRgviP6+R+DT6e+A//XJsg9ga3j+uxKH9TkTaP5SOz6Ct99Y/PQW9L0nU1T+pCcOIXtraPzyIjCF979g/pGVASXg92j/7DpkzUIHePwesz+y/COM/bLOjmxu34D+w9Guok+noPyaAw1NC0uw/Cgmbpz2g6T9q0r2LbJ3sP3tgCsWrJu0/XfUiOZm97T+0dIkxC9jtP+oWJ/iT5e0/JwK1Ypoj7j/Y7ATygynuP7cn5Gx03u4/HfdRD5re7T+I7mmXOMLsP5gdy9O5ZO0/rCAVC71u7T/gr+aUY3/tP9Vh7wdMY+0/OMSaUiaf7D/G2a+juafsP1/WClWjHe0/7jRzwsXt7D+s2K0hvVzsP9kM4cwwc+s/Psai5GQV6z8dLohg+AHqP1WqdJMVzuk/rcflMn9N6T8co02az+boP+PlTeX15Og/KgqL5ZFZ6T9C+PyhqN7pP4jx8g8uSuk/jlfEjCbM6j8qv3NOXVLtP57mQm0Bsu4/j9wiXgKP7j8KEhONKpHsP1IdYqghffg/\",\"dtype\":\"float64\",\"shape\":[159]},\"y\":{\"__ndarray__\":\"oNTjgVrCxj/OGZLDb/LFP3Vx5aoZnsU/QSygpiL2xT8PhstEF6bFP3BhMQmXAcU/mmj/yhBTxD8FKgDFhwfDP60ArVJ1k8I/5N6E1y2swj9f4//9FL3CP9yZLbul3sI/3BWZkPcFwz+4+7bzAkHGP+x/zlSvU8c/9LWJZpEFxz8Dlt4acYPGPwiiHJPLgcU/Ga/jy0aXxT8/WSLBelbFP5kx++GsBMU/nVd8s+YKxT/xK+jOYGLEP+8qvwOO28M/3rPkiSv1wz8whroBVCjDP8pGDskzZcE/6u+ue61xxD+W6creb8jHP3v6ZDicJMk/HAm3P8HPyD/yGAm1JbjIPzEbVTyI18g/0oDAx4XcyD/LiPkiMaPIP/xdbfZxo8g/pp7Ic8b0yD+V7dCaeDrJP7col3ffsMk/S34Si9FHyT9oVXtBqH7IP3/kzkWmdMg/4Fq4J3WayD93WqGZj6fIP0FiauFQzMg/r0vQAhSvyD+cHpMmEifHP9Whju3kysc/JzwvYOVYyT/TbAIODefIP8KdKoG2J8g/GHaS8p0V4D891WqcwAzgP8/6KGdGAuA/KtGDzMUH4D+WGVmfTvTfP8yTRcr45d8/Q46dUV3f3z/SlzqHysjfP5Nn1anoW98/A+m7jirP3z9aB8zGnenfP9/akoq08N8//6wZtlDp3z/Cni/zsuDfP9A+FRCVtd8/t
8Of6IBg3z++kdtBdgjgPwvw4nIyFuA/XUgq24kL4D+8WCmb7wvgP5wTFcn2DuA/cfZYTIAN4D8Z7ocIcQfgP2pQk06gBuA/2bZtnNgD4D9vGScqt6vfP/jshBJ2DOA/PU6wp4oI4D8N7X885+3fPzbW8Cg94d8/d741/rUQ4D/R/Vsh7BXgP9MQ1C3sDeA/50aDsXQK4D/oEjOhvgngP2MIgcRj9d8/QGY0k9XB3z/lQ2A+ugzgP1p9XbLMHeA/cKo5+8EU4D9PJKVN+QjgP2Izi/UqAOA/ywJTQBgG4D8hoI7tc+ffP2HvENx24t8/S7swN2QX4D+0tfJTHx3gP8ehKuxNJ+A/DhO6XSgm4D+NbBIvpiDgP96G+razGuA/gTImaWYL4D+YBU1PFPzfPz4XiCq4BeA/AivybajR3z9nPtJHOx+7v0dzDZe5ScI/f/sCxp9IM7+GJtXhBO7qvx51WBlFQdi/3BIQqrkf0L/tWq26rUvjv5PRsD8mzsq/477YL0nm27/ClhsNr9H4v4bYYDhF7tO/dzk0VFzm6r+TCcvNL3wHwHAkQ2VkCxvAUlnSnyu1GcALpkHrOHMZwIea50kUaQbAqGjVuAIz2T8HErWRxy/TP1CWue0WNco/FWzbdJup2D9xWmFvG9HUP4XjInQzi9c/kHOx15to1j+paoasYX7UP29D+W4aF9U/lfepkIJU1D8Dj/ONDAPNPw170TmV9dU/rtgGrZHP1D+XwBQHF2zUP3Ozo78zNNM/Q3nwM2f61T/ayH8tZ1PSP5vZxFV7Idc/QqgXHyOP0z9kOXCbtDHTP0fONbfE5NQ/v9lWypJb0j/RZ2kxk/nVP6ID8kwo7NI/pXbV3LVb1D9yZCuXtKnNP1CdDZLdcsk/YRZ8KC7ptT8w8ytxx1jJv2TRFi1MtLy/rhsA7d+M0r87rNbDCwXWvzpd2GC4iLG/jKnrii+32r9iUqEDpY/mv69PH3LJSSjA\",\"dtype\":\"float64\",\"shape\":[159]}}},\"id\":\"d150e53d-dd38-47a7-9722-e1cc47203952\",\"type\":\"ColumnDataSource\"},{\"attributes\":{\"active_drag\":\"auto\",\"active_scroll\":\"auto\",\"active_tap\":\"auto\",\"tools\":[{\"id\":\"a6b3fed1-4e60-4163-a0db-2e04c12a5c56\",\"type\":\"ResizeTool\"},{\"id\":\"2a14dbb9-b046-494d-bef9-58a2c98671bd\",\"type\":\"CrosshairTool\"},{\"id\":\"e46c5bf5-293f-46d5-b2dd-395b79528383\",\"type\":\"PanTool\"},{\"id\":\"ffc8cc6f-95a5-4c22-b856-3210d86d4811\",\"type\":\"WheelZoomTool\"},{\"id\":\"56b8824e-7697-4049-a67c-fdf9f06b0d1a\",\"type\":\"BoxZoomTool\"},{\"id\":\"c769b085-5fed-4e81-b948-86cc05faf6dc\",\"type\":\"ResetTool\"},{\"id\":\"a9423a26-9f4b-4eb4-a0cc-a29265871f41\",\"type\":\"TapTool\"},{\"id\":\"6c3f7196-bf1a-414e-830f-a629270cc563\",\"type\":\"SaveTool\"},{\"id\":\"bd8cdc0e-306a-4006-aa08-cb8c566108dd\",\"type\":\"BoxSelectTool\"},{\"id\":\"adcebf04-865d-416f-bb66-cb72912e76e9\",\"type\":\"PolySelectTool\"},{\"id\":\"ce485aac-5919-4e9c-a253-8f9
3db96dfc9\",\"type\":\"LassoSelectTool\"}]},\"id\":\"d5302aa2-170b-431c-a824-db567bc969f9\",\"type\":\"Toolbar\"},{\"attributes\":{\"below\":[{\"id\":\"c0b29853-f29a-4c79-b553-f84ea270d023\",\"type\":\"LinearAxis\"}],\"left\":[{\"id\":\"61b82ade-afe0-4fd2-b4bf-0428ec2e58d3\",\"type\":\"LinearAxis\"}],\"renderers\":[{\"id\":\"c0b29853-f29a-4c79-b553-f84ea270d023\",\"type\":\"LinearAxis\"},{\"id\":\"9cf23c74-7b67-4ce6-bdfa-86a5fade06cd\",\"type\":\"Grid\"},{\"id\":\"61b82ade-afe0-4fd2-b4bf-0428ec2e58d3\",\"type\":\"LinearAxis\"},{\"id\":\"e3faba8b-9262-4225-84e4-a7dc8adbb1d0\",\"type\":\"Grid\"},{\"id\":\"ba5cc905-77e9-468c-9968-7f071e4f8d04\",\"type\":\"BoxAnnotation\"},{\"id\":\"4bd1bbf3-c822-4b5e-8317-704642cebe74\",\"type\":\"BoxAnnotation\"},{\"id\":\"870572b2-d387-477a-b144-9632e78af3c9\",\"type\":\"PolyAnnotation\"},{\"id\":\"0af952fa-dfc2-43e7-b144-7c48dd080626\",\"type\":\"PolyAnnotation\"},{\"id\":\"2af25d98-6ba4-48c2-aac6-c0b1581854a9\",\"type\":\"GlyphRenderer\"}],\"title\":{\"id\":\"46c89e89-4c95-4345-a4af-4c66c66a599a\",\"type\":\"Title\"},\"tool_events\":{\"id\":\"19971597-9eed-4b02-ad36-19d1241d9889\",\"type\":\"ToolEvents\"},\"toolbar\":{\"id\":\"d5302aa2-170b-431c-a824-db567bc969f9\",\"type\":\"Toolbar\"},\"x_range\":{\"id\":\"adedf9e9-5ba3-4285-b0f8-7617ebdc3bcf\",\"type\":\"DataRange1d\"},\"y_range\":{\"id\":\"8ef2d6f2-56b6-4fe7-b07c-0bfe5e1a6fa8\",\"type\":\"DataRange1d\"}},\"id\":\"1ac25a9a-40d6-4057-ac8f-7d4d78d61ff4\",\"subtype\":\"Figure\",\"type\":\"Plot\"}],\"root_ids\":[\"1ac25a9a-40d6-4057-ac8f-7d4d78d61ff4\"]},\"title\":\"Bokeh Application\",\"version\":\"0.12.5\"}};\n",
" var render_items = [{\"docid\":\"678e78cd-8de4-435c-9e90-4db36ca1341f\",\"elementid\":\"a8ccc9c9-4c49-4c4d-b835-97ca1d75e9b4\",\"modelid\":\"1ac25a9a-40d6-4057-ac8f-7d4d78d61ff4\"}];\n",
" \n",
" Bokeh.embed.embed_items(docs_json, render_items);\n",
" };\n",
" if (document.readyState != \"loading\") fn();\n",
" else document.addEventListener(\"DOMContentLoaded\", fn);\n",
" })();\n",
" },\n",
" function(Bokeh) {\n",
" }\n",
" ];\n",
" \n",
" function run_inline_js() {\n",
" \n",
" if ((window.Bokeh !== undefined) || (force === true)) {\n",
" for (var i = 0; i < inline_js.length; i++) {\n",
" inline_js[i](window.Bokeh);\n",
" }if (force === true) {\n",
" display_loaded();\n",
" }} else if (Date.now() < window._bokeh_timeout) {\n",
" setTimeout(run_inline_js, 100);\n",
" } else if (!window._bokeh_failed_load) {\n",
" console.log(\"Bokeh: BokehJS failed to load within specified timeout.\");\n",
" window._bokeh_failed_load = true;\n",
" } else if (force !== true) {\n",
" var cell = $(document.getElementById(\"a8ccc9c9-4c49-4c4d-b835-97ca1d75e9b4\")).parents('.cell').data().cell;\n",
" cell.output_area.append_execute_result(NB_LOAD_WARNING)\n",
" }\n",
" \n",
" }\n",
" \n",
" if (window._bokeh_is_loading === 0) {\n",
" console.log(\"Bokeh: BokehJS loaded, going straight to plotting\");\n",
" run_inline_js();\n",
" } else {\n",
" load_libs(js_urls, function() {\n",
" console.log(\"Bokeh: BokehJS plotting callback run at\", now());\n",
" run_inline_js();\n",
" });\n",
" }\n",
" }(this));\n",
"</script>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"x = Y[:, 0]\n",
"y = Y[:, 1]\n",
"radii = 0.1\n",
"\n",
"colors = ['#9f4496'] * 51 + ['#59b996'] * 55 + ['#3288bd'] * 53\n",
"\n",
"TOOLS=\"resize,crosshair,pan,wheel_zoom,box_zoom,reset,tap,previewsave,box_select,poly_select,lasso_select\"\n",
"\n",
"p = figure(tools=TOOLS)\n",
"p.scatter(x, y, radius=radii, fill_color=colors, fill_alpha=0.6, line_color=None)\n",
"\n",
"show(p)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can see that two companies are very tighly packed, saying that there is not much variation in them while the third company has a lot of variation. This company has had a rocky year."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# In production\n",
"\n",
"The code that is shown above is mainly for learning purposes only. In production, we will be using the inbuilt function from the sklearn library."
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[[ 3.12570525 -0.1778062 ]\n",
" [ 3.22460186 -0.17146108]]\n",
"[[-3.12570525 0.1778062 ]\n",
" [-3.22460186 0.17146108]]\n"
]
}
],
"source": [
"from sklearn.decomposition import PCA as productionPCA\n",
"prod_pca = productionPCA(n_components=2)\n",
"Y_prod = prod_pca.fit_transform(X_std)\n",
"\n",
"print(Y_prod[:2])\n",
"\n",
"print(Y[:2])"
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
" <div class=\"bk-root\">\n",
" <div class=\"bk-plotdiv\" id=\"a773ecfa-47f7-41e5-beb5-c3386da4a45e\"></div>\n",
" </div>\n",
"<script type=\"text/javascript\">\n",
" \n",
" (function(global) {\n",
" function now() {\n",
" return new Date();\n",
" }\n",
" \n",
" var force = false;\n",
" \n",
" if (typeof (window._bokeh_onload_callbacks) === \"undefined\" || force === true) {\n",
" window._bokeh_onload_callbacks = [];\n",
" window._bokeh_is_loading = undefined;\n",
" }\n",
" \n",
" \n",
" \n",
" if (typeof (window._bokeh_timeout) === \"undefined\" || force === true) {\n",
" window._bokeh_timeout = Date.now() + 0;\n",
" window._bokeh_failed_load = false;\n",
" }\n",
" \n",
" var NB_LOAD_WARNING = {'data': {'text/html':\n",
" \"<div style='background-color: #fdd'>\\n\"+\n",
" \"<p>\\n\"+\n",
" \"BokehJS does not appear to have successfully loaded. If loading BokehJS from CDN, this \\n\"+\n",
" \"may be due to a slow or bad network connection. Possible fixes:\\n\"+\n",
" \"</p>\\n\"+\n",
" \"<ul>\\n\"+\n",
" \"<li>re-rerun `output_notebook()` to attempt to load from CDN again, or</li>\\n\"+\n",
" \"<li>use INLINE resources instead, as so:</li>\\n\"+\n",
" \"</ul>\\n\"+\n",
" \"<code>\\n\"+\n",
" \"from bokeh.resources import INLINE\\n\"+\n",
" \"output_notebook(resources=INLINE)\\n\"+\n",
" \"</code>\\n\"+\n",
" \"</div>\"}};\n",
" \n",
" function display_loaded() {\n",
" if (window.Bokeh !== undefined) {\n",
" var el = document.getElementById(\"a773ecfa-47f7-41e5-beb5-c3386da4a45e\");\n",
" el.textContent = \"BokehJS \" + Bokeh.version + \" successfully loaded.\";\n",
" } else if (Date.now() < window._bokeh_timeout) {\n",
" setTimeout(display_loaded, 100)\n",
" }\n",
" }\n",
" \n",
" function run_callbacks() {\n",
" window._bokeh_onload_callbacks.forEach(function(callback) { callback() });\n",
" delete window._bokeh_onload_callbacks\n",
" console.info(\"Bokeh: all callbacks have finished\");\n",
" }\n",
" \n",
" function load_libs(js_urls, callback) {\n",
" window._bokeh_onload_callbacks.push(callback);\n",
" if (window._bokeh_is_loading > 0) {\n",
" console.log(\"Bokeh: BokehJS is being loaded, scheduling callback at\", now());\n",
" return null;\n",
" }\n",
" if (js_urls == null || js_urls.length === 0) {\n",
" run_callbacks();\n",
" return null;\n",
" }\n",
" console.log(\"Bokeh: BokehJS not loaded, scheduling load and callback at\", now());\n",
" window._bokeh_is_loading = js_urls.length;\n",
" for (var i = 0; i < js_urls.length; i++) {\n",
" var url = js_urls[i];\n",
" var s = document.createElement('script');\n",
" s.src = url;\n",
" s.async = false;\n",
" s.onreadystatechange = s.onload = function() {\n",
" window._bokeh_is_loading--;\n",
" if (window._bokeh_is_loading === 0) {\n",
" console.log(\"Bokeh: all BokehJS libraries loaded\");\n",
" run_callbacks()\n",
" }\n",
" };\n",
" s.onerror = function() {\n",
" console.warn(\"failed to load library \" + url);\n",
" };\n",
" console.log(\"Bokeh: injecting script tag for BokehJS library: \", url);\n",
" document.getElementsByTagName(\"head\")[0].appendChild(s);\n",
" }\n",
" };var element = document.getElementById(\"a773ecfa-47f7-41e5-beb5-c3386da4a45e\");\n",
" if (element == null) {\n",
" console.log(\"Bokeh: ERROR: autoload.js configured with elementid 'a773ecfa-47f7-41e5-beb5-c3386da4a45e' but no matching script tag was found. \")\n",
" return false;\n",
" }\n",
" \n",
" var js_urls = [];\n",
" \n",
" var inline_js = [\n",
" function(Bokeh) {\n",
" (function() {\n",
" var fn = function() {\n",
" var docs_json = {\"7ad8d395-7667-4342-847c-c3734e5ba013\":{\"roots\":{\"references\":[{\"attributes\":{\"callback\":null,\"column_names\":[\"x\",\"y\",\"fill_color\"],\"data\":{\"fill_color\":[\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#9f4496\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#59b996\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#32
88bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\",\"#3288bd\"],\"x\":{\"__ndarray__\":\"olsTwXEBCUCh89UP/MsJQDRHDPHpHQpAlwkjAIrRCUDjn7EJsBQKQLn3bpKRyApAoro1GL15C0DiCLEUHcsMQD+YFR0ZOw1AAjFuqtQYDUBcWk5S0PAMQMopM9VVqwxAQhVGoCQJDEALMWJ1rvgIQGDTcuK6VAhAeTCioBe+CEC8KvXb8QoJQDI3cSyQSgpAw55a06ElCkC8rYburlMKQFalV/UqmApA4pEJ3tFtCkDAEFFBNUcLQOanf51HpgtA4j/URgnBC0DolhFiBPgLQMY9nA5yhgxAqHuawS3KCUANiCA9d6EHQGfqpjiqgAZAUlEy137RBkAPeKSN1tgGQKYcEXryvQZAQaFXydzCBkBiKSOb4g8HQK//wLY7CAdA/03CWJGUBkCD3z5jGWQGQNFwusDL8AVA33w+ZlcKBkDhXhrzkSAHQMW8LunwRAdA+comp6ohB0D1RtMvXwoHQFdMO3CJ3gZAFexWlg8LB0DLfU+GfZcIQFUqhsgLwwdAIiW3/8NLBkBSabxsMboGQKik2alvZwdAR5Ub/VWDAcDr/eopUmkBwB1cobQnWAHA0Aw7JZVSAcDEcdy+cycBwIqpGd8qBwHAcQCNKdPrAMBz4ogl0tIAwM8iSDg33ADA6G0XquTWAMBmb/rI3RABwHUI/5pOPwHAJD98Cib+AMBwBYeMlPYAwKGRrO4UzQDA5HWMym/zAMBpI3GwzUwBwHhLyq7HfwHAnGZvN+JcAcAsUjrbq1sBwJnkbo45awHAkBx6f11qAcBgh9fpiU4BwMAQLsbBagHA1DUV5uhDAcD8l+ZLPTwBwJ2edBN0awHAmOhuETVMAcC4iL5ozAoBwLSWzOwKLQHA4XZqGXt1AcDeEnMlb4MBwDtI+amaYwHAfCWx4+ZaAcB0KaXBfU4BwDLgK45+OwHAkVH0gdQbAcAs8fQh82IBwAQjB3v4pgHArtrjJDp+AcDpyqJpEEsBwDdQPbktMgHAhVDz3B1JAcBDUMoxR/oAwDQBRb+L/gDAiI3wEPqeAcD7kMFj7JsBwBZ8SNdvygHAJUmMqaXAAcC/qkUBZqkBwLRsAFIDjwHAxSC3LtNTAcCClylraUUBwOVC/g64PgHAai3/UWvQAMCrtBAniCvhvw3AuaJlSeG/s+8rheW84L+BEb/rRgviv7CR+DT6e+C//XJsg9ga3r+uxKH9TkTav5WOz6Ct99a/PAW9L0nU1b+qCcOIXtravz2IjCF979i/pWVASXg92r/6DpkzUIHevwesz+y/COO/bLOjmxu34L+x9Guok+novymAw1NC0uy/Bwmbpz2g6b9r0r2LbJ3sv3lgCsWrJu2/XPUiOZm97b+wdIkxC9jtv+wWJ/iT5e2/JwK1Ypoj7r/W7ATygynuv7Yn5Gx03u6/H/dRD5re7b+J7mmXOMLsv5Udy9O5ZO2/qiAVC71u7b/gr+aUY3/tv9hh7wdMY+2/OMSaUiaf7L/G2a+juafsv2DWClWjHe2/7zRzwsXt7L+o2K0hvVzsv9gM4cwwc+u/Pcai5GQV678dLohg+AHqv1OqdJMVzum/rcflMn9N6b8ao02az+bov+PlTeX15Oi/KgqL5ZFZ6b9D+PyhqN7pv4fx8g8uSum/jlfEjCbM6r8pv3NOXVLtv5/mQm0Bsu6/jdwiXgKP7r8KEhONKpHsv1EdYqghffi/\",\"dtype\":\"float64\",\"shape\":[159]},\"y\":{\"__ndarray__\":\"yNTjgVrCxr/JGZLDb/LFv1Nx5aoZnsW/HCygpiL2xb8khstEF6bFv2lhMQmXAcW/umj/yhBTxL8WKgDFhwfDv6MArVJ1k8K/5N6E1y2swr9L
4//9FL3Cv9WZLbul3sK/wxWZkPcFw7+Z+7bzAkHGv/h/zlSvU8e/+LWJZpEFx78Wlt4acYPGvwWiHJPLgcW/EK/jy0aXxb9BWSLBelbFv7gx++GsBMW/sld8s+YKxb/vK+jOYGLEv+0qvwOO28O/5LPkiSv1w79BhroBVCjDv/VGDskzZcG/wO+ue61xxL+H6creb8jHv3b6ZDicJMm/Nwm3P8HPyL8FGQm1JbjIvzobVTyI18i/zIDAx4XcyL/diPkiMaPIvwJebfZxo8i/uJ7Ic8b0yL+H7dCaeDrJv7Aol3ffsMm/gH4Si9FHyb9cVXtBqH7Iv37kzkWmdMi/4Fq4J3WayL90WqGZj6fIvzViauFQzMi/tkvQAhSvyL/FHpMmEifHv7qhju3kyse/DzwvYOVYyb/TbAIODefIv+WdKoG2J8i/GnaS8p0V4L8+1WqcwAzgv8/6KGdGAuC/KtGDzMUH4L+ZGVmfTvTfv8+TRcr45d+/Q46dUV3f37/TlzqHysjfv5Fn1anoW9+/Bem7jirP379XB8zGnenfv+Hakoq08N+/Aa0ZtlDp37/Gni/zsuDfv88+FRCVtd+/ucOf6IBg37+8kdtBdgjgvwzw4nIyFuC/XEgq24kL4L+9WCmb7wvgv5sTFcn2DuC/cvZYTIAN4L8Z7ocIcQfgv2tQk06gBuC/2rZtnNgD4L9tGScqt6vfv/jshBJ2DOC/QE6wp4oI4L8O7X885+3fvzjW8Cg94d+/d741/rUQ4L/Q/Vsh7BXgv9UQ1C3sDeC/7UaDsXQK4L/pEjOhvgngv2cIgcRj9d+/PGY0k9XB37/nQ2A+ugzgv1x9XbLMHeC/cKo5+8EU4L9QJKVN+Qjgv2kzi/UqAOC/zgJTQBgG4L8koI7tc+ffv2PvENx24t+/SbswN2QX4L+2tfJTHx3gv8ShKuxNJ+C/EBO6XSgm4L+MbBIvpiDgv96G+razGuC/gzImaWYL4L+XBU1PFPzfvz4XiCq4BeC/BSvybajR3797PtJHOx+7P0lzDZe5ScK/dPACxp9IMz+EJtXhBO7qPxp1WBlFQdg/0xIQqrkf0D/oWq26rUvjP5HRsD8mzso/3r7YL0nm2z/FlhsNr9H4P4bYYDhF7tM/djk0VFzm6j+UCcvNL3wHQG4kQ2VkCxtAU1nSnyu1GUANpkHrOHMZQIua50kUaQZAnWjVuAIz2b8FErWRxy/Tv0uWue0WNcq/FWzbdJup2L9vWmFvG9HUv4bjInQzi9e/kHOx15to1r+laoasYX7Uv3BD+W4aF9W/mfepkIJU1L8Ij/ONDAPNvwx70TmV9dW/rNgGrZHP1L+WwBQHF2zUv3Kzo78zNNO/RXnwM2f61b/XyH8tZ1PSv5vZxFV7Ide/QqgXHyOP079mOXCbtDHTv0XONbfE5NS/wtlWypJb0r/RZ2kxk/nVv6ID8kwo7NK/pnbV3LVb1L9vZCuXtKnNv0+dDZLdcsm/WRZ8KC7ptb808ytxx1jJP2fRFi1MtLw/sxsA7d+M0j9ErNbDCwXWPy5d2GC4iLE/jKnrii+32j9gUqEDpY/mP65PH3LJSShA\",\"dtype\":\"float64\",\"shape\":[159]}}},\"id\":\"d7156353-3042-4162-9e4b-d3a2f2382ef4\",\"type\":\"ColumnDataSource\"},{\"attributes\":{\"active_drag\":\"auto\",\"active_scroll\":\"auto\",\"active_tap\":\"auto\",\"tools\":[{\"id\":\"d67db79d-5836-417e-90c2-9bd7ad05e438\",\"type\":\"ResizeTool\"},{\"id\":\"cb4e8bcb-b082-46b6-b34a-5e3ea666ee2b\",\"type\":\"CrosshairTool\"},{\"id\":\"62cf2027-dd82-46e1-8fc9-
e47f45a640f0\",\"type\":\"PanTool\"},{\"id\":\"4a66dee5-46bb-4f4a-96c0-6427f16b4c43\",\"type\":\"WheelZoomTool\"},{\"id\":\"9b38fb6e-2fcd-497a-9e97-5a05731a9f2f\",\"type\":\"BoxZoomTool\"},{\"id\":\"f3914643-7564-44c6-b26a-b35e2aa1d7f8\",\"type\":\"ResetTool\"},{\"id\":\"74e27e14-9fe5-4b76-af55-ba920a8303c1\",\"type\":\"TapTool\"},{\"id\":\"4692caa2-0fa1-4e96-aeda-7be7b96cd953\",\"type\":\"SaveTool\"},{\"id\":\"af1bf063-1b59-4087-931b-26423951a276\",\"type\":\"BoxSelectTool\"},{\"id\":\"bb2d7428-96f9-4119-8622-9a66baf0fa4d\",\"type\":\"PolySelectTool\"},{\"id\":\"e9a76766-56d4-41b9-b7ac-34eab68b15f7\",\"type\":\"LassoSelectTool\"}]},\"id\":\"7be3034b-9678-4230-9534-01b6b945bc46\",\"type\":\"Toolbar\"},{\"attributes\":{\"fill_alpha\":{\"value\":0.5},\"fill_color\":{\"value\":\"lightgrey\"},\"level\":\"overlay\",\"line_alpha\":{\"value\":1.0},\"line_color\":{\"value\":\"black\"},\"line_dash\":[4,4],\"line_width\":{\"value\":2},\"plot\":null,\"xs_units\":\"screen\",\"ys_units\":\"screen\"},\"id\":\"29fd2745-f9c2-423e-95b0-0c8f29c9d09d\",\"type\":\"PolyAnnotation\"},{\"attributes\":{\"fill_alpha\":{\"value\":0.5},\"fill_color\":{\"value\":\"lightgrey\"},\"level\":\"overlay\",\"line_alpha\":{\"value\":1.0},\"line_color\":{\"value\":\"black\"},\"line_dash\":[4,4],\"line_width\":{\"value\":2},\"plot\":null,\"xs_units\":\"screen\",\"ys_units\":\"screen\"},\"id\":\"57472218-1de7-42a5-8c0e-14cf1282912b\",\"type\":\"PolyAnnotation\"},{\"attributes\":{\"below\":[{\"id\":\"b167ab8b-b14f-4432-bd8f-8fa8f623de57\",\"type\":\"LinearAxis\"}],\"left\":[{\"id\":\"87699db9-3cc3-4e68-9f83-cc52c6dab916\",\"type\":\"LinearAxis\"}],\"renderers\":[{\"id\":\"b167ab8b-b14f-4432-bd8f-8fa8f623de57\",\"type\":\"LinearAxis\"},{\"id\":\"6a5e908f-c008-4877-aa68-cc9a65a043d0\",\"type\":\"Grid\"},{\"id\":\"87699db9-3cc3-4e68-9f83-cc52c6dab916\",\"type\":\"LinearAxis\"},{\"id\":\"08f159ef-3848-4e87-8a41-2f42bc77fb3f\",\"type\":\"Grid\"},{\"id\":\"5f5beb84-743b-4d6a-bac5-bdfd4068bb94\",\"type\":\"BoxAnn
otation\"},{\"id\":\"84c8d476-2c77-4934-9903-4c580be07c9f\",\"type\":\"BoxAnnotation\"},{\"id\":\"29fd2745-f9c2-423e-95b0-0c8f29c9d09d\",\"type\":\"PolyAnnotation\"},{\"id\":\"57472218-1de7-42a5-8c0e-14cf1282912b\",\"type\":\"PolyAnnotation\"},{\"id\":\"e8929947-5eee-4630-962c-f56d35f72b87\",\"type\":\"GlyphRenderer\"}],\"title\":{\"id\":\"9c811201-05ac-45db-b024-4eff4ab7a15c\",\"type\":\"Title\"},\"tool_events\":{\"id\":\"a4d6ca70-caeb-41a4-9661-fdede0724103\",\"type\":\"ToolEvents\"},\"toolbar\":{\"id\":\"7be3034b-9678-4230-9534-01b6b945bc46\",\"type\":\"Toolbar\"},\"x_range\":{\"id\":\"2231e079-aff2-44a2-b622-c673a5446bee\",\"type\":\"DataRange1d\"},\"y_range\":{\"id\":\"9f1a4237-baaa-4f4b-bc0e-b38e7b9c7000\",\"type\":\"DataRange1d\"}},\"id\":\"3280ffff-43a1-4c2e-b42a-72db10d771a2\",\"subtype\":\"Figure\",\"type\":\"Plot\"},{\"attributes\":{},\"id\":\"d71ae252-bedf-4d7b-bbbf-2d4f0db6f53d\",\"type\":\"BasicTickFormatter\"},{\"attributes\":{\"fill_alpha\":{\"value\":0.6},\"fill_color\":{\"field\":\"fill_color\"},\"line_color\":{\"value\":null},\"radius\":{\"units\":\"data\",\"value\":0.1},\"x\":{\"field\":\"x\"},\"y\":{\"field\":\"y\"}},\"id\":\"710360b2-97a3-4bd3-be95-4fdad571422c\",\"type\":\"Circle\"},{\"attributes\":{\"callback\":null,\"plot\":{\"id\":\"3280ffff-43a1-4c2e-b42a-72db10d771a2\",\"subtype\":\"Figure\",\"type\":\"Plot\"}},\"id\":\"74e27e14-9fe5-4b76-af55-ba920a8303c1\",\"type\":\"TapTool\"},{\"attributes\":{\"callback\":null},\"id\":\"2231e079-aff2-44a2-b622-c673a5446bee\",\"type\":\"DataRange1d\"},{\"attributes\":{\"plot\":{\"id\":\"3280ffff-43a1-4c2e-b42a-72db10d771a2\",\"subtype\":\"Figure\",\"type\":\"Plot\"}},\"id\":\"4692caa2-0fa1-4e96-aeda-7be7b96cd953\",\"type\":\"SaveTool\"},{\"attributes\":{},\"id\":\"a4d6ca70-caeb-41a4-9661-fdede0724103\",\"type\":\"ToolEvents\"},{\"attributes\":{\"callback\":null,\"overlay\":{\"id\":\"84c8d476-2c77-4934-9903-4c580be07c9f\",\"type\":\"BoxAnnotation\"},\"plot\":{\"id\":\"3280ffff-43a1-4c2e-b42a-72db10d771a
2\",\"subtype\":\"Figure\",\"type\":\"Plot\"},\"renderers\":[{\"id\":\"e8929947-5eee-4630-962c-f56d35f72b87\",\"type\":\"GlyphRenderer\"}]},\"id\":\"af1bf063-1b59-4087-931b-26423951a276\",\"type\":\"BoxSelectTool\"},{\"attributes\":{\"plot\":{\"id\":\"3280ffff-43a1-4c2e-b42a-72db10d771a2\",\"subtype\":\"Figure\",\"type\":\"Plot\"}},\"id\":\"d67db79d-5836-417e-90c2-9bd7ad05e438\",\"type\":\"ResizeTool\"},{\"attributes\":{\"callback\":null},\"id\":\"9f1a4237-baaa-4f4b-bc0e-b38e7b9c7000\",\"type\":\"DataRange1d\"},{\"attributes\":{\"overlay\":{\"id\":\"29fd2745-f9c2-423e-95b0-0c8f29c9d09d\",\"type\":\"PolyAnnotation\"},\"plot\":{\"id\":\"3280ffff-43a1-4c2e-b42a-72db10d771a2\",\"subtype\":\"Figure\",\"type\":\"Plot\"}},\"id\":\"bb2d7428-96f9-4119-8622-9a66baf0fa4d\",\"type\":\"PolySelectTool\"},{\"attributes\":{\"plot\":{\"id\":\"3280ffff-43a1-4c2e-b42a-72db10d771a2\",\"subtype\":\"Figure\",\"type\":\"Plot\"},\"ticker\":{\"id\":\"11749384-2e0d-4bf8-ac3e-3f43011b1db0\",\"type\":\"BasicTicker\"}},\"id\":\"6a5e908f-c008-4877-aa68-cc9a65a043d0\",\"type\":\"Grid\"},{\"attributes\":{\"fill_alpha\":{\"value\":0.1},\"fill_color\":{\"value\":\"#1f77b4\"},\"line_alpha\":{\"value\":0.1},\"line_color\":{\"value\":\"#1f77b4\"},\"radius\":{\"units\":\"data\",\"value\":0.1},\"x\":{\"field\":\"x\"},\"y\":{\"field\":\"y\"}},\"id\":\"e4fe3571-fe6e-4047-8e57-3c2fa07a5267\",\"type\":\"Circle\"},{\"attributes\":{\"data_source\":{\"id\":\"d7156353-3042-4162-9e4b-d3a2f2382ef4\",\"type\":\"ColumnDataSource\"},\"glyph\":{\"id\":\"710360b2-97a3-4bd3-be95-4fdad571422c\",\"type\":\"Circle\"},\"hover_glyph\":null,\"muted_glyph\":null,\"nonselection_glyph\":{\"id\":\"e4fe3571-fe6e-4047-8e57-3c2fa07a5267\",\"type\":\"Circle\"},\"selection_glyph\":null},\"id\":\"e8929947-5eee-4630-962c-f56d35f72b87\",\"type\":\"GlyphRenderer\"},{\"attributes\":{\"bottom_units\":\"screen\",\"fill_alpha\":{\"value\":0.5},\"fill_color\":{\"value\":\"lightgrey\"},\"left_units\":\"screen\",\"level\":\"overlay\",\"line_alph
a\":{\"value\":1.0},\"line_color\":{\"value\":\"black\"},\"line_dash\":[4,4],\"line_width\":{\"value\":2},\"plot\":null,\"render_mode\":\"css\",\"right_units\":\"screen\",\"top_units\":\"screen\"},\"id\":\"5f5beb84-743b-4d6a-bac5-bdfd4068bb94\",\"type\":\"BoxAnnotation\"},{\"attributes\":{\"bottom_units\":\"screen\",\"fill_alpha\":{\"value\":0.5},\"fill_color\":{\"value\":\"lightgrey\"},\"left_units\":\"screen\",\"level\":\"overlay\",\"line_alpha\":{\"value\":1.0},\"line_color\":{\"value\":\"black\"},\"line_dash\":[4,4],\"line_width\":{\"value\":2},\"plot\":null,\"render_mode\":\"css\",\"right_units\":\"screen\",\"top_units\":\"screen\"},\"id\":\"84c8d476-2c77-4934-9903-4c580be07c9f\",\"type\":\"BoxAnnotation\"},{\"attributes\":{},\"id\":\"11749384-2e0d-4bf8-ac3e-3f43011b1db0\",\"type\":\"BasicTicker\"},{\"attributes\":{\"formatter\":{\"id\":\"309c9bdb-ca0d-4da2-b388-3db24d20b798\",\"type\":\"BasicTickFormatter\"},\"plot\":{\"id\":\"3280ffff-43a1-4c2e-b42a-72db10d771a2\",\"subtype\":\"Figure\",\"type\":\"Plot\"},\"ticker\":{\"id\":\"11749384-2e0d-4bf8-ac3e-3f43011b1db0\",\"type\":\"BasicTicker\"}},\"id\":\"b167ab8b-b14f-4432-bd8f-8fa8f623de57\",\"type\":\"LinearAxis\"},{\"attributes\":{\"formatter\":{\"id\":\"d71ae252-bedf-4d7b-bbbf-2d4f0db6f53d\",\"type\":\"BasicTickFormatter\"},\"plot\":{\"id\":\"3280ffff-43a1-4c2e-b42a-72db10d771a2\",\"subtype\":\"Figure\",\"type\":\"Plot\"},\"ticker\":{\"id\":\"03d58e40-79b0-4ccd-a219-0bd7f7b748e9\",\"type\":\"BasicTicker\"}},\"id\":\"87699db9-3cc3-4e68-9f83-cc52c6dab916\",\"type\":\"LinearAxis\"},{\"attributes\":{},\"id\":\"03d58e40-79b0-4ccd-a219-0bd7f7b748e9\",\"type\":\"BasicTicker\"},{\"attributes\":{},\"id\":\"309c9bdb-ca0d-4da2-b388-3db24d20b798\",\"type\":\"BasicTickFormatter\"},{\"attributes\":{\"dimension\":1,\"plot\":{\"id\":\"3280ffff-43a1-4c2e-b42a-72db10d771a2\",\"subtype\":\"Figure\",\"type\":\"Plot\"},\"ticker\":{\"id\":\"03d58e40-79b0-4ccd-a219-0bd7f7b748e9\",\"type\":\"BasicTicker\"}},\"id\":\"08f159ef-3848-4e8
7-8a41-2f42bc77fb3f\",\"type\":\"Grid\"},{\"attributes\":{\"plot\":null,\"text\":\"\"},\"id\":\"9c811201-05ac-45db-b024-4eff4ab7a15c\",\"type\":\"Title\"},{\"attributes\":{\"plot\":{\"id\":\"3280ffff-43a1-4c2e-b42a-72db10d771a2\",\"subtype\":\"Figure\",\"type\":\"Plot\"}},\"id\":\"cb4e8bcb-b082-46b6-b34a-5e3ea666ee2b\",\"type\":\"CrosshairTool\"},{\"attributes\":{\"plot\":{\"id\":\"3280ffff-43a1-4c2e-b42a-72db10d771a2\",\"subtype\":\"Figure\",\"type\":\"Plot\"}},\"id\":\"4a66dee5-46bb-4f4a-96c0-6427f16b4c43\",\"type\":\"WheelZoomTool\"},{\"attributes\":{\"plot\":{\"id\":\"3280ffff-43a1-4c2e-b42a-72db10d771a2\",\"subtype\":\"Figure\",\"type\":\"Plot\"}},\"id\":\"62cf2027-dd82-46e1-8fc9-e47f45a640f0\",\"type\":\"PanTool\"},{\"attributes\":{\"overlay\":{\"id\":\"5f5beb84-743b-4d6a-bac5-bdfd4068bb94\",\"type\":\"BoxAnnotation\"},\"plot\":{\"id\":\"3280ffff-43a1-4c2e-b42a-72db10d771a2\",\"subtype\":\"Figure\",\"type\":\"Plot\"}},\"id\":\"9b38fb6e-2fcd-497a-9e97-5a05731a9f2f\",\"type\":\"BoxZoomTool\"},{\"attributes\":{\"plot\":{\"id\":\"3280ffff-43a1-4c2e-b42a-72db10d771a2\",\"subtype\":\"Figure\",\"type\":\"Plot\"}},\"id\":\"f3914643-7564-44c6-b26a-b35e2aa1d7f8\",\"type\":\"ResetTool\"},{\"attributes\":{\"callback\":null,\"overlay\":{\"id\":\"57472218-1de7-42a5-8c0e-14cf1282912b\",\"type\":\"PolyAnnotation\"},\"plot\":{\"id\":\"3280ffff-43a1-4c2e-b42a-72db10d771a2\",\"subtype\":\"Figure\",\"type\":\"Plot\"}},\"id\":\"e9a76766-56d4-41b9-b7ac-34eab68b15f7\",\"type\":\"LassoSelectTool\"}],\"root_ids\":[\"3280ffff-43a1-4c2e-b42a-72db10d771a2\"]},\"title\":\"Bokeh Application\",\"version\":\"0.12.5\"}};\n",
" var render_items = [{\"docid\":\"7ad8d395-7667-4342-847c-c3734e5ba013\",\"elementid\":\"a773ecfa-47f7-41e5-beb5-c3386da4a45e\",\"modelid\":\"3280ffff-43a1-4c2e-b42a-72db10d771a2\"}];\n",
" \n",
" Bokeh.embed.embed_items(docs_json, render_items);\n",
" };\n",
" if (document.readyState != \"loading\") fn();\n",
" else document.addEventListener(\"DOMContentLoaded\", fn);\n",
" })();\n",
" },\n",
" function(Bokeh) {\n",
" }\n",
" ];\n",
" \n",
" function run_inline_js() {\n",
" \n",
" if ((window.Bokeh !== undefined) || (force === true)) {\n",
" for (var i = 0; i < inline_js.length; i++) {\n",
" inline_js[i](window.Bokeh);\n",
" }if (force === true) {\n",
" display_loaded();\n",
" }} else if (Date.now() < window._bokeh_timeout) {\n",
" setTimeout(run_inline_js, 100);\n",
" } else if (!window._bokeh_failed_load) {\n",
" console.log(\"Bokeh: BokehJS failed to load within specified timeout.\");\n",
" window._bokeh_failed_load = true;\n",
" } else if (force !== true) {\n",
" var cell = $(document.getElementById(\"a773ecfa-47f7-41e5-beb5-c3386da4a45e\")).parents('.cell').data().cell;\n",
" cell.output_area.append_execute_result(NB_LOAD_WARNING)\n",
" }\n",
" \n",
" }\n",
" \n",
" if (window._bokeh_is_loading === 0) {\n",
" console.log(\"Bokeh: BokehJS loaded, going straight to plotting\");\n",
" run_inline_js();\n",
" } else {\n",
" load_libs(js_urls, function() {\n",
" console.log(\"Bokeh: BokehJS plotting callback run at\", now());\n",
" run_inline_js();\n",
" });\n",
" }\n",
" }(this));\n",
"</script>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Columns of Y_prod hold the scores on the first and second principal components\n",
"x = Y_prod[:, 0]\n",
"y = Y_prod[:, 1]\n",
"radii = 0.1  # marker radius, in data units\n",
"\n",
"# One color per block of rows: 51 + 55 + 53 = 159 points\n",
"colors = ['#9f4496'] * 51 + ['#59b996'] * 55 + ['#3288bd'] * 53\n",
"\n",
"TOOLS = \"resize,crosshair,pan,wheel_zoom,box_zoom,reset,tap,previewsave,box_select,poly_select,lasso_select\"\n",
"\n",
"p = figure(tools=TOOLS)\n",
"p.scatter(x, y, radius=radii, fill_color=colors, fill_alpha=0.6, line_color=None)\n",
"\n",
"show(p)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The graphs are flipped relative to each other because the projection matrices have opposite signs. The sign of an eigenvector is arbitrary, so both projections describe the same subspace. In production we would usually not compute the eigenvalues by hand, as was done earlier, to check whether the majority of the variance in the dataset is captured. Instead, to check whether the new feature subspace captures the intended amount of variance, we can use the `explained_variance_ratio_` attribute of the fitted PCA object."
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[ 5.00892097 1.9821498 ]\n",
"[ 0.71556014 0.28316426]\n"
]
}
],
"source": [
"print(prod_pca.explained_variance_)\n",
"print(prod_pca.explained_variance_ratio_)"
]
},
{
"cell_type": "code",
"execution_count": 42,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0.99872439612636232"
]
},
"execution_count": 42,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"prod_pca.explained_variance_ratio_.sum()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.1"
}
},
"nbformat": 4,
"nbformat_minor": 2
}