{ "cells": [  {   "cell_type": "markdown",   "metadata": {},   "source": [    "# Classifying wine dataset using pipelines\n",    "## Notebook by kumar Reddy\n",    "### [Persistent Systems Ltd]\n",    "### Data Source: UCI ML Repository\n",    "# Table of contents\n",    "### Step 1: Analyzing Data\n",    "\n",    "### Step 2: Applying Classification Techniques\n",    "\n",    "### Step 3: Standardization\n",    "\n",    "### Step 4: Using Pipelines\n",    "\n",    "### Step 5: Conclusion"   ]  },  {   "cell_type": "markdown",   "metadata": {},   "source": [    "## libraries\n",    "\n",    "#### NumPy: >= V 1.11.1\n",    "#### pandas: >= V 0.18.1\n",    "#### scikit-learn: >= V 0.17.1"   ]  },  {   "cell_type": "markdown",   "metadata": {},   "source": [    "## Step 1: Analyzing Data\n",    "\n",    "\n",    "#### About Wine Dataset: These data are the results of a chemical analysis of wines grown in the same region in Italy but derived from three different cultivars. The analysis determined the quantities of 13 constituen
{ "cells": [  {   "cell_type": "markdown",   "metadata": {},   "source": [    "# Classifying wine dataset using pipelines\n",    "\n",    "### Notebook by [Aashish K Tiwari](https://gist.github.com/AashishTiwari)\n",    "#### You can see all my public gists @ https://gist.github.com/AashishTiwari\n",    "\n",    "#### [Persistent Systems Ltd]\n",    "#### Data Source: UCI ML Repository"   ]  },  {   "cell_type": "markdown",   "metadata": {},   "source": [    "## Table of contents\n",    "\n",    "\n",    "1. [Step 1: Analyzing Data](#Step-1:-Analyzing-data)\n",    "\n",    "2. [Step 2: Applying Classification Techniques](#Step-2:-Applying-Classification-Techniques)\n",    "\n",    "3. [Step 3: Standardization](#Step-3:-Standardization)\n",    "\n",    "4. [Step 4: Using Pipelines](#Step-4:-Using-Pipelines)\n",    "\n",    "5. [Step 5: Conclusion](#Step-5:-Conclusion)"   ]  },  {   "cell_type": "markdown",   "metadata": {},   "source": [    "##  libraries\n",    "\n",    "[[ go back to the top ]](#Table-of-con
{ "cells": [  {   "cell_type": "markdown",   "metadata": {},   "source": [    "# Classifying wine dataset using pipelines\n",   "\n",    "#### [Persistent Systems Ltd]\n",    "#### Data Source: UCI ML Repository"   ]  },  {   "cell_type": "markdown",   "metadata": {},   "source": [    "## Table of contents\n",    "\n",    "\n",    "1. [Step 1: Analyzing Data](#Step-1:-Analyzing-data)\n",    "\n",    "2. [Step 2: Applying Classification Techniques](#Step-2:-Applying-Classification-Techniques)\n",    "\n",    "3. [Step 3: Standardization](#Step-3:-Standardization)\n",    "\n",    "4. [Step 4: Using Pipelines](#Step-4:-Using-Pipelines)\n",    "\n",    "5. [Step 5: Conclusion](#Step-5:-Conclusion)"   ]  },  {   "cell_type": "markdown",   "metadata": {},   "source": [    "##  libraries\n",    "\n",    "[[ go back to the top ]](#Table-of-contents)\n",    "\n",    "\n",    "* **NumPy**: >= V 1.11.1\n",    "* **pandas**: >= V 0.18.1\n",    "* **scikit-learn**:  >= V 0.17.1"   ]  },  {   "cell_type": "markdown",   "me
{ "cells": [  {   "cell_type": "markdown",   "metadata": {},   "source": [    "# Classifying wine dataset using pipelines\n",    "\n",    "### Notebook by [kumar reddy](https://gist.github.com/kumarreddy)\n",    "#### You can see all my public gists @ https://gist.github.com/kumarreddy\n",    "\n",    "#### [Persistent Systems Ltd]\n",    "#### Data Source: UCI ML Repository"   ]  },  {   "cell_type": "markdown",   "metadata": {},   "source": [    "## Table of contents\n",    "\n",    "\n",    "1. [Step 1: Analyzing Data](#Step-1:-Analyzing-data)\n",    "\n",    "2. [Step 2: Applying Classification Techniques](#Step-2:-Applying-Classification-Techniques)\n",    "\n",    "3. [Step 3: Standardization](#Step-3:-Standardization)\n",    "\n",    "4. [Step 4: Using Pipelines](#Step-4:-Using-Pipelines)\n",    "\n",    "5. [Step 5: Conclusion](#Step-5:-Conclusion)"   ]  },  {   "cell_type": "markdown",   "metadata": {},   "source": [    "##  libraries\n",    "\n",    "[[ go back to the top ]](#Table-of-contents)\n", 
This file has been truncated, but you can view the full file.
{ "cells": [  {   "cell_type": "markdown",   "metadata": {},   "source": [    "\n",    "## Data Science and (Unsupervised) Machine Learning with scikit-learn \n",    "\n",    "### By kumar Reddy \n",    "### Presented Jan 04, 2017 "   ]  },  {   "cell_type": "code",   "execution_count": 1,   "metadata": {    "collapsed": false   },   "outputs": [    {     "data": {      "text/html": [       "\n",       "        <iframe\n",       "            width=\"400\"\n",       "            height=\"300\"\n",       "            src=\"https://www.youtube.com/embed/2lpS6gUwiJQ\"\n",       "            frameborder=\"0\"\n",       "            allowfullscreen\n",       "        ></iframe>\n",       "        "      ],      "text/plain": [       "<IPython.lib.display.YouTubeVideo at 0x3b61cc0>"      ]     },     "execution_count": 1,     "metadata": {},     "output_type": "execute_result"    }   ],   "source": [    "from IPython.display import YouTubeVideo\n",    "YouTubeVideo('2lpS6gUwiJQ')"   ]  },  {   "cell_type": "markdown"
{ "cells": [  {   "cell_type": "code",   "execution_count": 1,   "metadata": {    "collapsed": false   },   "outputs": [],   "source": [    "#from IPython.display import YouTubeVideo\n",    "#YouTubeVideo('2lpS6gUwiJQ')"   ]  },  {   "cell_type": "markdown",   "metadata": {},   "source": [    "## This talk\n",    "* A different way to look at graph analysis and visualization,\n",    "* as an introduction to a few cool algorithms: Truncated SVD, K-Means and t-SNE\n",    "* with a practical walkthrough using scikit-learn and friends numpy and bokeh,\n",    "* and finishing off with some more general commentary on this approach to data analysis\n"   ]  },  {   "cell_type": "markdown",   "metadata": {},   "source": [    "## A map of Reddit\n",    "* Reddit is \"the front page of the internet\"\n",    "* Basically a discussion board, with sub-boards called subreddits \n",    "* Figure from this paper: Navigating the massive world of reddit: Using backbone networks to map user #int
{ "cells": [  {   "cell_type": "code",   "execution_count": 1,   "metadata": {    "collapsed": false   },   "outputs": [    {     "data": {      "text/html": [       "<div>\n",       "<table border=\"1\" class=\"dataframe\">\n",       "  <thead>\n",       "    <tr style=\"text-align: right;\">\n",       "      <th></th>\n",       "      <th>user</th>\n",       "      <th>0</th>\n",       "      <th>1</th>\n",       "      <th>2</th>\n",       "      <th>3</th>\n",       "      <th>4</th>\n",       "      <th>5</th>\n",       "      <th>6</th>\n",       "      <th>7</th>\n",       "      <th>8</th>\n",       "      <th>...</th>\n",       "      <th>15</th>\n",       "      <th>16</th>\n",       "      <th>17</th>\n",       "      <th>18</th>\n",       "      <th>19</th>\n",       "      <th>20</th>\n",       "      <th>21</th>\n",       "      <th>22</th>\n",       "      <th>23</th>\n",       "      <th>24</th>\n",       "    </tr>\n",       "  </thead>\n",       "  <tbody>\n",       "    <tr>\n",       "    
{ "cells": [  {   "cell_type": "code",   "execution_count": 1,   "metadata": {    "collapsed": true   },   "outputs": [],   "source": [    "# from IPython.display import YouTubeVideo\n",    "# YouTubeVideo('2lpS6gUwiJQ')"   ]  },  {   "cell_type": "code",   "execution_count": 2,   "metadata": {    "collapsed": false   },   "outputs": [    {     "data": {      "text/html": [       "<div>\n",       "<table border=\"1\" class=\"dataframe\">\n",       "  <thead>\n",       "    <tr style=\"text-align: right;\">\n",       "      <th></th>\n",       "      <th>user</th>\n",       "      <th>0</th>\n",       "      <th>1</th>\n",       "      <th>2</th>\n",       "      <th>3</th>\n",       "      <th>4</th>\n",       "      <th>5</th>\n",       "      <th>6</th>\n",       "      <th>7</th>\n",       "      <th>8</th>\n",       "      <th>...</th>\n",       "      <th>15</th>\n",       "      <th>16</th>\n",       "      <th>17</th>\n",       "      <th>18</th>\n",       "      <th>19</th>\n",       "      <th>20</th