@shas043
Created April 8, 2019 18:36
{
"nbformat_minor": 1,
"cells": [
{
"source": "<img class=\"irc_mi\" src=\"https://www.telegraph.co.uk/content/dam/health-fitness/2018/05/21/TELEMMGLPICT000152502167_trans_NvBQzQNjv4BqOHNs0Y5vwBZmXiYbjSVrpOPQJBoXPtkictzS7v68rAc.jpeg?imwidth=450\" onload=\"typeof google==='object'&amp;&amp;google.aft&amp;&amp;google.aft(this)\" alt=\"Image result for heart test\" width=\"480\" height=\"300\" style=\"margin-top: 27px;\" data-iml=\"1553803585244\">",
"cell_type": "markdown",
"metadata": {}
},
{
"source": "## <center>Data Analysis of heart patients</center>",
"cell_type": "markdown",
"metadata": {
"collapsed": true
}
},
{
"source": "### Introduction:\n\nThis notebook shows the data analysis done on test records of heart patients from Cleveland.\n\n### Table Of Contents\n\n**1. [Get data](#data_set)** \n**2. [Load data](#load_data)** \n**3. [Access data](#access_data)** \n**4. [Explore data](#explore_data)** \n**5. [Plot](#Plot)** \n**6. [Acknoledgement](#ack_data)**",
"cell_type": "markdown",
"metadata": {}
},
{
"source": "<a id=\"data_set\"></a> \n**1. Get Data** \n\nData set used in the analysis was obtained from kaggle (https://www.kaggle.com). It contains 14 attributes which are used for analysis the potential chances of heart disease in a patient. \n\nThe attributes used in dataset are the following: \n\n**age** ---------------- age in years \n**sex** ---------------- (1 = male; 0 = female) \n**cp** ----------------- chest pain type \n**trestbps** ----------- resting blood pressure (in mm Hg on admission to the hospital) \n**chol** --------------- serum cholestoral in mg/dl \n**fbs** ---------------- (fasting blood sugar > 120 mg/dl) (1 = true; 0 = false) \n**restecg** ------------ resting electrocardiographic results \n**thalach** ------------ maximum heart rate achieved \n**exang** ------------- exercise induced angina (1 = yes; 0 = no) \n**oldpeak** ------------ ST depression induced by exercise relative to rest \n**slope** -------------- the slope of the peak exercise ST segment \n**ca** ----------------- number of major vessels (0-3) colored by flourosopy \n**thal** --------------- 3 = normal; 6 = fixed defect; 7 = reversable defect \n**target** ------------- 1 or 0 ",
"cell_type": "markdown",
"metadata": {}
},
{
"source": "<a id=\"load_data\"></a> \n**2. Load data**",
"cell_type": "markdown",
"metadata": {}
},
{
"execution_count": 16,
"cell_type": "code",
"metadata": {},
"outputs": [],
"source": "# The code was removed by Watson Studio for sharing."
},
{
"source": "<a id=\"access_data\"></a> \n**3. Access data**",
"cell_type": "markdown",
"metadata": {}
},
{
"execution_count": 71,
"cell_type": "code",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": " age sex cp trestbps chol fbs restecg thalach exang oldpeak \\\n0 63 1 3 145 233 1 0 150 0 2.3 \n1 37 1 2 130 250 0 1 187 0 3.5 \n2 41 0 1 130 204 0 0 172 0 1.4 \n3 56 1 1 120 236 0 1 178 0 0.8 \n4 57 0 0 120 354 0 1 163 1 0.6 \n5 57 1 0 140 192 0 1 148 0 0.4 \n6 56 0 1 140 294 0 0 153 0 1.3 \n7 44 1 1 120 263 0 1 173 0 0.0 \n8 52 1 2 172 199 1 1 162 0 0.5 \n9 57 1 2 150 168 0 1 174 0 1.6 \n10 54 1 0 140 239 0 1 160 0 1.2 \n11 48 0 2 130 275 0 1 139 0 0.2 \n12 49 1 1 130 266 0 1 171 0 0.6 \n13 64 1 3 110 211 0 0 144 1 1.8 \n14 58 0 3 150 283 1 0 162 0 1.0 \n15 50 0 2 120 219 0 1 158 0 1.6 \n16 58 0 2 120 340 0 1 172 0 0.0 \n17 66 0 3 150 226 0 1 114 0 2.6 \n18 43 1 0 150 247 0 1 171 0 1.5 \n19 69 0 3 140 239 0 1 151 0 1.8 \n20 59 1 0 135 234 0 1 161 0 0.5 \n21 44 1 2 130 233 0 1 179 1 0.4 \n22 42 1 0 140 226 0 1 178 0 0.0 \n23 61 1 2 150 243 1 1 137 1 1.0 \n24 40 1 3 140 199 0 1 178 1 1.4 \n25 71 0 1 160 302 0 1 162 0 0.4 \n26 59 1 2 150 212 1 1 157 0 1.6 \n27 51 1 2 110 175 0 1 123 0 0.6 \n28 65 0 2 140 417 1 0 157 0 0.8 \n29 53 1 2 130 197 1 0 152 0 1.2 \n.. ... ... .. ... ... ... ... ... ... ... 
\n273 58 1 0 100 234 0 1 156 0 0.1 \n274 47 1 0 110 275 0 0 118 1 1.0 \n275 52 1 0 125 212 0 1 168 0 1.0 \n276 58 1 0 146 218 0 1 105 0 2.0 \n277 57 1 1 124 261 0 1 141 0 0.3 \n278 58 0 1 136 319 1 0 152 0 0.0 \n279 61 1 0 138 166 0 0 125 1 3.6 \n280 42 1 0 136 315 0 1 125 1 1.8 \n281 52 1 0 128 204 1 1 156 1 1.0 \n282 59 1 2 126 218 1 1 134 0 2.2 \n283 40 1 0 152 223 0 1 181 0 0.0 \n284 61 1 0 140 207 0 0 138 1 1.9 \n285 46 1 0 140 311 0 1 120 1 1.8 \n286 59 1 3 134 204 0 1 162 0 0.8 \n287 57 1 1 154 232 0 0 164 0 0.0 \n288 57 1 0 110 335 0 1 143 1 3.0 \n289 55 0 0 128 205 0 2 130 1 2.0 \n290 61 1 0 148 203 0 1 161 0 0.0 \n291 58 1 0 114 318 0 2 140 0 4.4 \n292 58 0 0 170 225 1 0 146 1 2.8 \n293 67 1 2 152 212 0 0 150 0 0.8 \n294 44 1 0 120 169 0 1 144 1 2.8 \n295 63 1 0 140 187 0 0 144 1 4.0 \n296 63 0 0 124 197 0 1 136 1 0.0 \n297 59 1 0 164 176 1 0 90 0 1.0 \n298 57 0 0 140 241 0 1 123 1 0.2 \n299 45 1 3 110 264 0 1 132 0 1.2 \n300 68 1 0 144 193 1 1 141 0 3.4 \n301 57 1 0 130 131 0 1 115 1 1.2 \n302 57 0 1 130 236 0 0 174 0 0.0 \n\n slope ca thal target \n0 0 0 1 1 \n1 0 0 2 1 \n2 2 0 2 1 \n3 2 0 2 1 \n4 2 0 2 1 \n5 1 0 1 1 \n6 1 0 2 1 \n7 2 0 3 1 \n8 2 0 3 1 \n9 2 0 2 1 \n10 2 0 2 1 \n11 2 0 2 1 \n12 2 0 2 1 \n13 1 0 2 1 \n14 2 0 2 1 \n15 1 0 2 1 \n16 2 0 2 1 \n17 0 0 2 1 \n18 2 0 2 1 \n19 2 2 2 1 \n20 1 0 3 1 \n21 2 0 2 1 \n22 2 0 2 1 \n23 1 0 2 1 \n24 2 0 3 1 \n25 2 2 2 1 \n26 2 0 2 1 \n27 2 0 2 1 \n28 2 1 2 1 \n29 0 0 2 1 \n.. ... .. ... ... \n273 2 1 3 0 \n274 1 1 2 0 \n275 2 2 3 0 \n276 1 1 3 0 \n277 2 0 3 0 \n278 2 2 2 0 \n279 1 1 2 0 \n280 1 0 1 0 \n281 1 0 0 0 \n282 1 1 1 0 \n283 2 0 3 0 \n284 2 1 3 0 \n285 1 2 3 0 \n286 2 2 2 0 \n287 2 1 2 0 \n288 1 1 3 0 \n289 1 1 3 0 \n290 2 1 3 0 \n291 0 3 1 0 \n292 1 2 1 0 \n293 1 0 3 0 \n294 0 0 1 0 \n295 2 2 3 0 \n296 1 0 2 0 \n297 1 2 1 0 \n298 1 0 3 0 \n299 1 0 3 0 \n300 1 2 3 0 \n301 1 1 3 0 \n302 1 1 2 0 \n\n[303 rows x 14 columns]\n"
}
],
"source": "import pandas as pd\ndf = pd.read_csv('heart.csv')\nprint (df)"
},
{
"source": "<a id=\"explore_data\"></a> \n**4. Explore data**",
"cell_type": "markdown",
"metadata": {}
},
{
"execution_count": 18,
"cell_type": "code",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": " age sex cp trestbps chol fbs restecg thalach exang oldpeak slope \\\n0 63 1 3 145 233 1 0 150 0 2.3 0 \n1 37 1 2 130 250 0 1 187 0 3.5 0 \n2 41 0 1 130 204 0 0 172 0 1.4 2 \n3 56 1 1 120 236 0 1 178 0 0.8 2 \n4 57 0 0 120 354 0 1 163 1 0.6 2 \n\n ca thal target \n0 0 1 1 \n1 0 2 1 \n2 0 2 1 \n3 0 2 1 \n4 0 2 1 \n"
}
],
"source": "print (df.head())"
},
{
"execution_count": 19,
"cell_type": "code",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": " age sex cp trestbps chol fbs restecg thalach exang oldpeak \\\n298 57 0 0 140 241 0 1 123 1 0.2 \n299 45 1 3 110 264 0 1 132 0 1.2 \n300 68 1 0 144 193 1 1 141 0 3.4 \n301 57 1 0 130 131 0 1 115 1 1.2 \n302 57 0 1 130 236 0 0 174 0 0.0 \n\n slope ca thal target \n298 1 0 3 0 \n299 1 0 3 0 \n300 1 2 3 0 \n301 1 1 3 0 \n302 1 1 2 0 \n"
}
],
"source": "print (df.tail())"
},
{
"execution_count": 70,
"cell_type": "code",
"metadata": {},
"outputs": [
{
"execution_count": 70,
"metadata": {},
"data": {
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>sex</th>\n <th>cp</th>\n <th>trestbps</th>\n <th>chol</th>\n <th>fbs</th>\n <th>restecg</th>\n <th>thalach</th>\n <th>exang</th>\n <th>oldpeak</th>\n <th>slope</th>\n <th>ca</th>\n <th>thal</th>\n <th>target</th>\n </tr>\n <tr>\n <th>age</th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>29</th>\n <td>1.00</td>\n <td>1.000000</td>\n <td>130.0</td>\n <td>204.000000</td>\n <td>0.0</td>\n <td>0.00</td>\n <td>202.0</td>\n <td>0.000000</td>\n <td>0.000000</td>\n <td>2.000000</td>\n <td>0.000000</td>\n <td>2.000000</td>\n <td>1.000000</td>\n </tr>\n <tr>\n <th>34</th>\n <td>0.50</td>\n <td>2.000000</td>\n <td>118.0</td>\n <td>196.000000</td>\n <td>0.0</td>\n <td>0.50</td>\n <td>183.0</td>\n <td>0.000000</td>\n <td>0.350000</td>\n <td>2.000000</td>\n <td>0.000000</td>\n <td>2.000000</td>\n <td>1.000000</td>\n </tr>\n <tr>\n <th>35</th>\n <td>0.75</td>\n <td>0.250000</td>\n <td>126.5</td>\n <td>213.750000</td>\n <td>0.0</td>\n <td>0.75</td>\n <td>160.5</td>\n <td>0.500000</td>\n <td>0.750000</td>\n <td>1.750000</td>\n <td>0.000000</td>\n <td>2.500000</td>\n <td>0.500000</td>\n </tr>\n <tr>\n <th>37</th>\n <td>0.50</td>\n <td>2.000000</td>\n <td>125.0</td>\n <td>232.500000</td>\n <td>0.0</td>\n <td>1.00</td>\n <td>178.5</td>\n <td>0.000000</td>\n <td>1.750000</td>\n <td>1.000000</td>\n <td>0.000000</td>\n <td>2.000000</td>\n <td>1.000000</td>\n </tr>\n <tr>\n <th>38</th>\n <td>1.00</td>\n <td>2.333333</td>\n <td>132.0</td>\n <td>193.666667</td>\n <td>0.0</td>\n <td>1.00</td>\n <td>176.0</td>\n 
<td>0.333333</td>\n <td>1.266667</td>\n <td>1.666667</td>\n <td>2.666667</td>\n <td>2.333333</td>\n <td>0.666667</td>\n </tr>\n </tbody>\n</table>\n</div>",
"text/plain": " sex cp trestbps chol fbs restecg thalach exang \\\nage \n29 1.00 1.000000 130.0 204.000000 0.0 0.00 202.0 0.000000 \n34 0.50 2.000000 118.0 196.000000 0.0 0.50 183.0 0.000000 \n35 0.75 0.250000 126.5 213.750000 0.0 0.75 160.5 0.500000 \n37 0.50 2.000000 125.0 232.500000 0.0 1.00 178.5 0.000000 \n38 1.00 2.333333 132.0 193.666667 0.0 1.00 176.0 0.333333 \n\n oldpeak slope ca thal target \nage \n29 0.000000 2.000000 0.000000 2.000000 1.000000 \n34 0.350000 2.000000 0.000000 2.000000 1.000000 \n35 0.750000 1.750000 0.000000 2.500000 0.500000 \n37 1.750000 1.000000 0.000000 2.000000 1.000000 \n38 1.266667 1.666667 2.666667 2.333333 0.666667 "
},
"output_type": "execute_result"
}
],
"source": "# Mean values of all attributes for 1st 5 patients of different age \ndf.groupby('age').mean().head()"
},
{
"execution_count": 53,
"cell_type": "code",
"metadata": {},
"outputs": [
{
"execution_count": 53,
"metadata": {},
"data": {
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>min</th>\n <th>max</th>\n <th>mean</th>\n </tr>\n <tr>\n <th>age</th>\n <th></th>\n <th></th>\n <th></th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>65</th>\n <td>177</td>\n <td>417</td>\n <td>279.000000</td>\n </tr>\n <tr>\n <th>66</th>\n <td>212</td>\n <td>302</td>\n <td>245.714286</td>\n </tr>\n <tr>\n <th>67</th>\n <td>212</td>\n <td>564</td>\n <td>286.777778</td>\n </tr>\n <tr>\n <th>68</th>\n <td>193</td>\n <td>277</td>\n <td>238.750000</td>\n </tr>\n <tr>\n <th>69</th>\n <td>234</td>\n <td>254</td>\n <td>242.333333</td>\n </tr>\n <tr>\n <th>70</th>\n <td>174</td>\n <td>322</td>\n <td>252.500000</td>\n </tr>\n <tr>\n <th>71</th>\n <td>149</td>\n <td>302</td>\n <td>238.666667</td>\n </tr>\n <tr>\n <th>74</th>\n <td>269</td>\n <td>269</td>\n <td>269.000000</td>\n </tr>\n <tr>\n <th>76</th>\n <td>197</td>\n <td>197</td>\n <td>197.000000</td>\n </tr>\n <tr>\n <th>77</th>\n <td>304</td>\n <td>304</td>\n <td>304.000000</td>\n </tr>\n </tbody>\n</table>\n</div>",
"text/plain": " min max mean\nage \n65 177 417 279.000000\n66 212 302 245.714286\n67 212 564 286.777778\n68 193 277 238.750000\n69 234 254 242.333333\n70 174 322 252.500000\n71 149 302 238.666667\n74 269 269 269.000000\n76 197 197 197.000000\n77 304 304 304.000000"
},
"output_type": "execute_result"
}
],
"source": "# min, max, mean cholestrol level of patients with 10 different age in desc order\ndf.groupby('age').chol.agg(['min', 'max', 'mean']).tail(10)"
},
{
"source": "<a id=\"Plot\"></a> \n**5. Plot** ",
"cell_type": "markdown",
"metadata": {}
},
{
"execution_count": null,
"cell_type": "code",
"metadata": {},
"outputs": [],
"source": "%matplotlib inline"
},
{
"execution_count": 68,
"cell_type": "code",
"metadata": {},
"outputs": [
{
"execution_count": 68,
"metadata": {},
"data": {
"text/plain": "<matplotlib.axes._subplots.AxesSubplot at 0x7ff4c7512278>"
},
"output_type": "execute_result"
},
{
"output_type": "display_data",
"data": {
"image/png": "\n",
"text/plain": "<matplotlib.figure.Figure at 0x7ff4c7502278>"
},
"metadata": {}
}
],
"source": "# Plot the min, max and mean values of cholestrol \ndf.groupby('age').chol.agg(['min', 'max', 'mean']).plot(kind = 'bar', figsize=(20,5))"
},
{
"source": "<a id='ack_data'></a>\n**6. Acknoledgement** ",
"cell_type": "markdown",
"metadata": {}
},
{
"source": "Creators: \n1. Hungarian Institute of Cardiology. Budapest: Andras Janosi, M.D. \n2. University Hospital, Zurich, Switzerland: William Steinbrunn, M.D. \n3. University Hospital, Basel, Switzerland: Matthias Pfisterer, M.D. \n4. V.A. Medical Center, Long Beach and Cleveland Clinic Foundation: Robert Detrano, M.D., Ph.D.\n\nDonor: David W. Aha (aha '@' ics.uci.edu) (714) 156-8*$9",
"cell_type": "markdown",
"metadata": {}
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3.5",
"name": "python3",
"language": "python"
},
"language_info": {
"mimetype": "text/x-python",
"nbconvert_exporter": "python",
"version": "3.5.5",
"name": "python",
"file_extension": ".py",
"pygments_lexer": "ipython3",
"codemirror_mode": {
"version": 3,
"name": "ipython"
}
}
},
"nbformat": 4
}
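
As a rough illustration of what `df.groupby('age').chol.agg(['min', 'max', 'mean'])` computes in the "Explore data" step of the notebook, here is a minimal plain-Python sketch. It deliberately avoids pandas so it runs without `heart.csv`; the helper name `chol_stats_by_age` is hypothetical, and the seven `(age, chol)` pairs are copied from rows 0-6 of the printed DataFrame, so the result covers only that small sample and not all 303 rows.

```python
from collections import defaultdict
from statistics import mean

# (age, chol) pairs copied from rows 0-6 of the DataFrame printed in the notebook.
rows = [(63, 233), (37, 250), (41, 204), (56, 236), (57, 354), (57, 192), (56, 294)]


def chol_stats_by_age(pairs):
    """Group cholesterol readings by age and return {age: (min, max, mean)}.

    Hypothetical helper that mirrors the grouping and aggregation step;
    it is not part of pandas or of the original notebook.
    """
    groups = defaultdict(list)
    for age, chol in pairs:
        groups[age].append(chol)
    # Sort by age so the result is in ascending order, like the groupby output.
    return {age: (min(v), max(v), mean(v)) for age, v in sorted(groups.items())}


stats = chol_stats_by_age(rows)
for age, (lowest, highest, average) in stats.items():
    print(age, lowest, highest, average)
```

Ages that occur only once (37, 41, and 63 in this sample) have identical min, max, and mean values, which is the same pattern as the single-patient ages 74, 76, and 77 in the notebook's output.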