@Rishit-dagli
Last active June 17, 2021 09:25
Calculate skewness and kurtosis of data in Python
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"name": "Skewness and Kurtosis.ipynb",
"provenance": [],
"authorship_tag": "ABX9TyO0JqNdqpHX6M4QuzZ5NffB",
"include_colab_link": true
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
},
"language_info": {
"name": "python"
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/gist/Rishit-dagli/86b8acbe5fed93a0e33eb1201e85ed8d/skewness-and-kurtosis.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "mdOH_AyLTY3t"
},
"source": [
"# Skewness and Kurtosis\n",
"\n",
"This notebook is a follow-along for [my blog post](https://www.freecodecamp.org/news/skewness-and-kurtosis-in-statistics-explained/) on freeCodeCamp. We start by importing the [Boston Housing Dataset](https://www.cs.toronto.edu/~delve/data/boston/bostonDetail.html) and then measure the skewness and kurtosis of one column in this dataset."
]
},
{
"cell_type": "code",
"metadata": {
"id": "PKqsDGxGQPgg"
},
"source": [
"import pandas as pd\n",
"from scipy.stats import skew\n",
"from scipy.stats import kurtosis"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 205
},
"id": "Nov_ttfxQqRS",
"outputId": "66f112c8-bff4-4e5f-dd11-f63eb7adadbf"
},
"source": [
"column_names = ['CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE', 'DIS', 'RAD', 'TAX', 'PTRATIO', 'B', 'LSTAT', 'MEDV']\n",
"data = pd.read_csv('https://gist.githubusercontent.com/Rishit-dagli/61922d4f6ef284877b6600163cabc681/raw/eeb1b4191b998cdd63a39f82f8031913b7590ff1/housing.csv', \n",
" header=None, \n",
" delimiter=r\"\\s+\", \n",
" names=column_names)\n",
"data.head()"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>CRIM</th>\n",
" <th>ZN</th>\n",
" <th>INDUS</th>\n",
" <th>CHAS</th>\n",
" <th>NOX</th>\n",
" <th>RM</th>\n",
" <th>AGE</th>\n",
" <th>DIS</th>\n",
" <th>RAD</th>\n",
" <th>TAX</th>\n",
" <th>PTRATIO</th>\n",
" <th>B</th>\n",
" <th>LSTAT</th>\n",
" <th>MEDV</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>0.00632</td>\n",
" <td>18.0</td>\n",
" <td>2.31</td>\n",
" <td>0</td>\n",
" <td>0.538</td>\n",
" <td>6.575</td>\n",
" <td>65.2</td>\n",
" <td>4.0900</td>\n",
" <td>1</td>\n",
" <td>296.0</td>\n",
" <td>15.3</td>\n",
" <td>396.90</td>\n",
" <td>4.98</td>\n",
" <td>24.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>0.02731</td>\n",
" <td>0.0</td>\n",
" <td>7.07</td>\n",
" <td>0</td>\n",
" <td>0.469</td>\n",
" <td>6.421</td>\n",
" <td>78.9</td>\n",
" <td>4.9671</td>\n",
" <td>2</td>\n",
" <td>242.0</td>\n",
" <td>17.8</td>\n",
" <td>396.90</td>\n",
" <td>9.14</td>\n",
" <td>21.6</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>0.02729</td>\n",
" <td>0.0</td>\n",
" <td>7.07</td>\n",
" <td>0</td>\n",
" <td>0.469</td>\n",
" <td>7.185</td>\n",
" <td>61.1</td>\n",
" <td>4.9671</td>\n",
" <td>2</td>\n",
" <td>242.0</td>\n",
" <td>17.8</td>\n",
" <td>392.83</td>\n",
" <td>4.03</td>\n",
" <td>34.7</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>0.03237</td>\n",
" <td>0.0</td>\n",
" <td>2.18</td>\n",
" <td>0</td>\n",
" <td>0.458</td>\n",
" <td>6.998</td>\n",
" <td>45.8</td>\n",
" <td>6.0622</td>\n",
" <td>3</td>\n",
" <td>222.0</td>\n",
" <td>18.7</td>\n",
" <td>394.63</td>\n",
" <td>2.94</td>\n",
" <td>33.4</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>0.06905</td>\n",
" <td>0.0</td>\n",
" <td>2.18</td>\n",
" <td>0</td>\n",
" <td>0.458</td>\n",
" <td>7.147</td>\n",
" <td>54.2</td>\n",
" <td>6.0622</td>\n",
" <td>3</td>\n",
" <td>222.0</td>\n",
" <td>18.7</td>\n",
" <td>396.90</td>\n",
" <td>5.33</td>\n",
" <td>36.2</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" CRIM ZN INDUS CHAS NOX ... TAX PTRATIO B LSTAT MEDV\n",
"0 0.00632 18.0 2.31 0 0.538 ... 296.0 15.3 396.90 4.98 24.0\n",
"1 0.02731 0.0 7.07 0 0.469 ... 242.0 17.8 396.90 9.14 21.6\n",
"2 0.02729 0.0 7.07 0 0.469 ... 242.0 17.8 392.83 4.03 34.7\n",
"3 0.03237 0.0 2.18 0 0.458 ... 222.0 18.7 394.63 2.94 33.4\n",
"4 0.06905 0.0 2.18 0 0.458 ... 222.0 18.7 396.90 5.33 36.2\n",
"\n",
"[5 rows x 14 columns]"
]
},
"metadata": {
"tags": []
},
"execution_count": 2
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "6p_Xme0rSwUG"
},
"source": [
"Let's calculate the skewness and kurtosis of one column in this dataset, `MEDV`."
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "xUYoqFUASr9Y",
"outputId": "cd3f0931-15e9-4e3e-a027-95b189400508"
},
"source": [
"skew(data[\"MEDV\"].dropna())"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"1.104810822864635"
]
},
"metadata": {
"tags": []
},
"execution_count": 3
}
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "ZpER4zjzS_LI",
"outputId": "62ebaafb-9080-403d-fdb6-1daa6aaba7d3"
},
"source": [
"kurtosis(data[\"MEDV\"].dropna())"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"1.4686287722747462"
]
},
"metadata": {
"tags": []
},
"execution_count": 4
}
]
}
]
}
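To make the two numbers above easier to interpret, here is a minimal sketch of what `skew` and `kurtosis` compute with their default arguments. It assumes SciPy's defaults (`bias=True` for both, and `fisher=True` for `kurtosis`, so the result is excess kurtosis, which is 0 for a normal distribution). The helper `moment_skew_kurtosis` and the `sample` values are hypothetical and only for illustration; they are not part of the notebook or the dataset.

```python
import numpy as np
from scipy.stats import kurtosis, skew


def moment_skew_kurtosis(values):
    """Return (skewness, excess kurtosis) from biased central moments.

    Hypothetical helper for illustration only.
    """
    x = np.asarray(values, dtype=float)
    deviations = x - x.mean()
    m2 = np.mean(deviations ** 2)  # second central moment (biased variance)
    m3 = np.mean(deviations ** 3)  # third central moment
    m4 = np.mean(deviations ** 4)  # fourth central moment
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0  # subtract 3 so a normal gives 0
    return skewness, excess_kurtosis


# Made-up values with a long right tail, not taken from the housing data.
sample = np.array([5.0, 10.5, 13.0, 17.5, 20.0, 23.5, 31.0, 50.0])

manual_skew, manual_kurt = moment_skew_kurtosis(sample)

# Each pair should agree up to floating-point error.
print(manual_skew, skew(sample))
print(manual_kurt, kurtosis(sample))

# Pearson's definition (normal distribution gives 3) adds the 3 back.
print(kurtosis(sample, fisher=False))
```

Read this way, the positive skewness reported for `MEDV` indicates a longer right tail, and the positive excess kurtosis indicates heavier tails than a normal distribution. Passing `bias=False` to either function applies a small-sample correction instead of the plain moment ratios shown here.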