@zhuang-hao-ming
Last active July 15, 2018 14:29
pandas cheat sheet
{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style>\n",
" .dataframe thead tr:only-child th {\n",
" text-align: right;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: left;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>RespondentID</th>\n",
" <th>Do you celebrate Thanksgiving?</th>\n",
" <th>What is typically the main dish at your Thanksgiving dinner?</th>\n",
" <th>What is typically the main dish at your Thanksgiving dinner? - Other (please specify)</th>\n",
" <th>How is the main dish typically cooked?</th>\n",
" <th>How is the main dish typically cooked? - Other (please specify)</th>\n",
" <th>What kind of stuffing/dressing do you typically have?</th>\n",
" <th>What kind of stuffing/dressing do you typically have? - Other (please specify)</th>\n",
" <th>What type of cranberry saucedo you typically have?</th>\n",
" <th>What type of cranberry saucedo you typically have? - Other (please specify)</th>\n",
" <th>...</th>\n",
" <th>Have you ever tried to meet up with hometown friends on Thanksgiving night?</th>\n",
" <th>Have you ever attended a \"Friendsgiving?\"</th>\n",
" <th>Will you shop any Black Friday sales on Thanksgiving Day?</th>\n",
" <th>Do you work in retail?</th>\n",
" <th>Will you employer make you work on Black Friday?</th>\n",
" <th>How would you describe where you live?</th>\n",
" <th>Age</th>\n",
" <th>What is your gender?</th>\n",
" <th>How much total combined money did all members of your HOUSEHOLD earn last year?</th>\n",
" <th>US Region</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>4337954960</td>\n",
" <td>Yes</td>\n",
" <td>Turkey</td>\n",
" <td>NaN</td>\n",
" <td>Baked</td>\n",
" <td>NaN</td>\n",
" <td>Bread-based</td>\n",
" <td>NaN</td>\n",
" <td>None</td>\n",
" <td>NaN</td>\n",
" <td>...</td>\n",
" <td>Yes</td>\n",
" <td>No</td>\n",
" <td>No</td>\n",
" <td>No</td>\n",
" <td>NaN</td>\n",
" <td>Suburban</td>\n",
" <td>18 - 29</td>\n",
" <td>Male</td>\n",
" <td>$75,000 to $99,999</td>\n",
" <td>Middle Atlantic</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>4337951949</td>\n",
" <td>Yes</td>\n",
" <td>Turkey</td>\n",
" <td>NaN</td>\n",
" <td>Baked</td>\n",
" <td>NaN</td>\n",
" <td>Bread-based</td>\n",
" <td>NaN</td>\n",
" <td>Other (please specify)</td>\n",
" <td>Homemade cranberry gelatin ring</td>\n",
" <td>...</td>\n",
" <td>No</td>\n",
" <td>No</td>\n",
" <td>Yes</td>\n",
" <td>No</td>\n",
" <td>NaN</td>\n",
" <td>Rural</td>\n",
" <td>18 - 29</td>\n",
" <td>Female</td>\n",
" <td>$50,000 to $74,999</td>\n",
" <td>East South Central</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>4337935621</td>\n",
" <td>Yes</td>\n",
" <td>Turkey</td>\n",
" <td>NaN</td>\n",
" <td>Roasted</td>\n",
" <td>NaN</td>\n",
" <td>Rice-based</td>\n",
" <td>NaN</td>\n",
" <td>Homemade</td>\n",
" <td>NaN</td>\n",
" <td>...</td>\n",
" <td>Yes</td>\n",
" <td>Yes</td>\n",
" <td>Yes</td>\n",
" <td>No</td>\n",
" <td>NaN</td>\n",
" <td>Suburban</td>\n",
" <td>18 - 29</td>\n",
" <td>Male</td>\n",
" <td>$0 to $9,999</td>\n",
" <td>Mountain</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>4337933040</td>\n",
" <td>Yes</td>\n",
" <td>Turkey</td>\n",
" <td>NaN</td>\n",
" <td>Baked</td>\n",
" <td>NaN</td>\n",
" <td>Bread-based</td>\n",
" <td>NaN</td>\n",
" <td>Homemade</td>\n",
" <td>NaN</td>\n",
" <td>...</td>\n",
" <td>Yes</td>\n",
" <td>No</td>\n",
" <td>No</td>\n",
" <td>No</td>\n",
" <td>NaN</td>\n",
" <td>Urban</td>\n",
" <td>30 - 44</td>\n",
" <td>Male</td>\n",
" <td>$200,000 and up</td>\n",
" <td>Pacific</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>4337931983</td>\n",
" <td>Yes</td>\n",
" <td>Tofurkey</td>\n",
" <td>NaN</td>\n",
" <td>Baked</td>\n",
" <td>NaN</td>\n",
" <td>Bread-based</td>\n",
" <td>NaN</td>\n",
" <td>Canned</td>\n",
" <td>NaN</td>\n",
" <td>...</td>\n",
" <td>Yes</td>\n",
" <td>No</td>\n",
" <td>No</td>\n",
" <td>No</td>\n",
" <td>NaN</td>\n",
" <td>Urban</td>\n",
" <td>30 - 44</td>\n",
" <td>Male</td>\n",
" <td>$100,000 to $124,999</td>\n",
" <td>Pacific</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>5 rows × 65 columns</p>\n",
"</div>"
],
"text/plain": [
" RespondentID Do you celebrate Thanksgiving? \\\n",
"0 4337954960 Yes \n",
"1 4337951949 Yes \n",
"2 4337935621 Yes \n",
"3 4337933040 Yes \n",
"4 4337931983 Yes \n",
"\n",
" What is typically the main dish at your Thanksgiving dinner? \\\n",
"0 Turkey \n",
"1 Turkey \n",
"2 Turkey \n",
"3 Turkey \n",
"4 Tofurkey \n",
"\n",
" What is typically the main dish at your Thanksgiving dinner? - Other (please specify) \\\n",
"0 NaN \n",
"1 NaN \n",
"2 NaN \n",
"3 NaN \n",
"4 NaN \n",
"\n",
" How is the main dish typically cooked? \\\n",
"0 Baked \n",
"1 Baked \n",
"2 Roasted \n",
"3 Baked \n",
"4 Baked \n",
"\n",
" How is the main dish typically cooked? - Other (please specify) \\\n",
"0 NaN \n",
"1 NaN \n",
"2 NaN \n",
"3 NaN \n",
"4 NaN \n",
"\n",
" What kind of stuffing/dressing do you typically have? \\\n",
"0 Bread-based \n",
"1 Bread-based \n",
"2 Rice-based \n",
"3 Bread-based \n",
"4 Bread-based \n",
"\n",
" What kind of stuffing/dressing do you typically have? - Other (please specify) \\\n",
"0 NaN \n",
"1 NaN \n",
"2 NaN \n",
"3 NaN \n",
"4 NaN \n",
"\n",
" What type of cranberry saucedo you typically have? \\\n",
"0 None \n",
"1 Other (please specify) \n",
"2 Homemade \n",
"3 Homemade \n",
"4 Canned \n",
"\n",
" What type of cranberry saucedo you typically have? - Other (please specify) \\\n",
"0 NaN \n",
"1 Homemade cranberry gelatin ring \n",
"2 NaN \n",
"3 NaN \n",
"4 NaN \n",
"\n",
" ... \\\n",
"0 ... \n",
"1 ... \n",
"2 ... \n",
"3 ... \n",
"4 ... \n",
"\n",
" Have you ever tried to meet up with hometown friends on Thanksgiving night? \\\n",
"0 Yes \n",
"1 No \n",
"2 Yes \n",
"3 Yes \n",
"4 Yes \n",
"\n",
" Have you ever attended a \"Friendsgiving?\" \\\n",
"0 No \n",
"1 No \n",
"2 Yes \n",
"3 No \n",
"4 No \n",
"\n",
" Will you shop any Black Friday sales on Thanksgiving Day? \\\n",
"0 No \n",
"1 Yes \n",
"2 Yes \n",
"3 No \n",
"4 No \n",
"\n",
" Do you work in retail? Will you employer make you work on Black Friday? \\\n",
"0 No NaN \n",
"1 No NaN \n",
"2 No NaN \n",
"3 No NaN \n",
"4 No NaN \n",
"\n",
" How would you describe where you live? Age What is your gender? \\\n",
"0 Suburban 18 - 29 Male \n",
"1 Rural 18 - 29 Female \n",
"2 Suburban 18 - 29 Male \n",
"3 Urban 30 - 44 Male \n",
"4 Urban 30 - 44 Male \n",
"\n",
" How much total combined money did all members of your HOUSEHOLD earn last year? \\\n",
"0 $75,000 to $99,999 \n",
"1 $50,000 to $74,999 \n",
"2 $0 to $9,999 \n",
"3 $200,000 and up \n",
"4 $100,000 to $124,999 \n",
"\n",
" US Region \n",
"0 Middle Atlantic \n",
"1 East South Central \n",
"2 Mountain \n",
"3 Pacific \n",
"4 Pacific \n",
"\n",
"[5 rows x 65 columns]"
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import pandas as pd\n",
"\n",
"data = pd.read_csv('thanksgiving_poll.csv', encoding='utf-8')\n",
"\n",
"data.head()"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array(['Yes', 'No'], dtype=object)"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Get the unique values in a Series\n",
"a=data['Do you celebrate Thanksgiving?'].unique()\n",
"a"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Index(['RespondentID', 'Do you celebrate Thanksgiving?',\n",
" 'What is typically the main dish at your Thanksgiving dinner?',\n",
" 'What is typically the main dish at your Thanksgiving dinner? - Other (please specify)',\n",
" 'How is the main dish typically cooked?',\n",
" 'How is the main dish typically cooked? - Other (please specify)',\n",
" 'What kind of stuffing/dressing do you typically have?',\n",
" 'What kind of stuffing/dressing do you typically have? - Other (please specify)',\n",
" 'What type of cranberry saucedo you typically have?',\n",
" 'What type of cranberry saucedo you typically have? - Other (please specify)',\n",
" 'Do you typically have gravy?',\n",
" 'Which of these side dishes aretypically served at your Thanksgiving dinner? Please select all that apply. - Brussel sprouts',\n",
" 'Which of these side dishes aretypically served at your Thanksgiving dinner? Please select all that apply. - Carrots',\n",
" 'Which of these side dishes aretypically served at your Thanksgiving dinner? Please select all that apply. - Cauliflower',\n",
" 'Which of these side dishes aretypically served at your Thanksgiving dinner? Please select all that apply. - Corn',\n",
" 'Which of these side dishes aretypically served at your Thanksgiving dinner? Please select all that apply. - Cornbread',\n",
" 'Which of these side dishes aretypically served at your Thanksgiving dinner? Please select all that apply. - Fruit salad',\n",
" 'Which of these side dishes aretypically served at your Thanksgiving dinner? Please select all that apply. - Green beans/green bean casserole',\n",
" 'Which of these side dishes aretypically served at your Thanksgiving dinner? Please select all that apply. - Macaroni and cheese',\n",
" 'Which of these side dishes aretypically served at your Thanksgiving dinner? Please select all that apply. - Mashed potatoes',\n",
" 'Which of these side dishes aretypically served at your Thanksgiving dinner? Please select all that apply. - Rolls/biscuits',\n",
" 'Which of these side dishes aretypically served at your Thanksgiving dinner? Please select all that apply. - Squash',\n",
" 'Which of these side dishes aretypically served at your Thanksgiving dinner? Please select all that apply. - Vegetable salad',\n",
" 'Which of these side dishes aretypically served at your Thanksgiving dinner? Please select all that apply. - Yams/sweet potato casserole',\n",
" 'Which of these side dishes aretypically served at your Thanksgiving dinner? Please select all that apply. - Other (please specify)',\n",
" 'Which of these side dishes aretypically served at your Thanksgiving dinner? Please select all that apply. - Other (please specify).1',\n",
" 'Which type of pie is typically served at your Thanksgiving dinner? Please select all that apply. - Apple',\n",
" 'Which type of pie is typically served at your Thanksgiving dinner? Please select all that apply. - Buttermilk',\n",
" 'Which type of pie is typically served at your Thanksgiving dinner? Please select all that apply. - Cherry',\n",
" 'Which type of pie is typically served at your Thanksgiving dinner? Please select all that apply. - Chocolate',\n",
" 'Which type of pie is typically served at your Thanksgiving dinner? Please select all that apply. - Coconut cream',\n",
" 'Which type of pie is typically served at your Thanksgiving dinner? Please select all that apply. - Key lime',\n",
" 'Which type of pie is typically served at your Thanksgiving dinner? Please select all that apply. - Peach',\n",
" 'Which type of pie is typically served at your Thanksgiving dinner? Please select all that apply. - Pecan',\n",
" 'Which type of pie is typically served at your Thanksgiving dinner? Please select all that apply. - Pumpkin',\n",
" 'Which type of pie is typically served at your Thanksgiving dinner? Please select all that apply. - Sweet Potato',\n",
" 'Which type of pie is typically served at your Thanksgiving dinner? Please select all that apply. - None',\n",
" 'Which type of pie is typically served at your Thanksgiving dinner? Please select all that apply. - Other (please specify)',\n",
" 'Which type of pie is typically served at your Thanksgiving dinner? Please select all that apply. - Other (please specify).1',\n",
" 'Which of these desserts do you typically have at Thanksgiving dinner? Please select all that apply. - Apple cobbler',\n",
" 'Which of these desserts do you typically have at Thanksgiving dinner? Please select all that apply. - Blondies',\n",
" 'Which of these desserts do you typically have at Thanksgiving dinner? Please select all that apply. - Brownies',\n",
" 'Which of these desserts do you typically have at Thanksgiving dinner? Please select all that apply. - Carrot cake',\n",
" 'Which of these desserts do you typically have at Thanksgiving dinner? Please select all that apply. - Cheesecake',\n",
" 'Which of these desserts do you typically have at Thanksgiving dinner? Please select all that apply. - Cookies',\n",
" 'Which of these desserts do you typically have at Thanksgiving dinner? Please select all that apply. - Fudge',\n",
" 'Which of these desserts do you typically have at Thanksgiving dinner? Please select all that apply. - Ice cream',\n",
" 'Which of these desserts do you typically have at Thanksgiving dinner? Please select all that apply. - Peach cobbler',\n",
" 'Which of these desserts do you typically have at Thanksgiving dinner? Please select all that apply. - None',\n",
" 'Which of these desserts do you typically have at Thanksgiving dinner? Please select all that apply. - Other (please specify)',\n",
" 'Which of these desserts do you typically have at Thanksgiving dinner? Please select all that apply. - Other (please specify).1',\n",
" 'Do you typically pray before or after the Thanksgiving meal?',\n",
" 'How far will you travel for Thanksgiving?',\n",
" 'Will you watch any of the following programs on Thanksgiving? Please select all that apply. - Macy's Parade',\n",
" 'What's the age cutoff at your \"kids' table\" at Thanksgiving?',\n",
" 'Have you ever tried to meet up with hometown friends on Thanksgiving night?',\n",
" 'Have you ever attended a \"Friendsgiving?\"',\n",
" 'Will you shop any Black Friday sales on Thanksgiving Day?',\n",
" 'Do you work in retail?',\n",
" 'Will you employer make you work on Black Friday?',\n",
" 'How would you describe where you live?', 'Age', 'What is your gender?',\n",
" 'How much total combined money did all members of your HOUSEHOLD earn last year?',\n",
" 'US Region'],\n",
" dtype='object')"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Get the column labels\n",
"data.columns"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Female 544\n",
"Male 481\n",
"NaN 33\n",
"Name: What is your gender?, dtype: int64"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Count how often each value occurs, including NaN\n",
"data['What is your gender?'].value_counts(dropna=False)"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"import math\n",
"\n",
"def gender_code(gender_string):\n",
"    # Leave missing values (NaN) untouched\n",
"    if isinstance(gender_string, float) and math.isnan(gender_string):\n",
"        return gender_string\n",
"    # Encode \"Female\" as 1 and any other answer as 0\n",
"    return int(gender_string == \"Female\")"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"pandas.core.series.Series"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Apply a function to every element of a Series\n",
"# and store the result as a new column\n",
"data[\"gender\"] = data[\"What is your gender?\"].apply(gender_code)\n",
"data[\"gender\"].value_counts(dropna=False)\n",
"type(data['gender'].value_counts(dropna=False))"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"RespondentID object\n",
"Do you celebrate Thanksgiving? object\n",
"What is typically the main dish at your Thanksgiving dinner? object\n",
"What is typically the main dish at your Thanksgiving dinner? - Other (please specify) object\n",
"How is the main dish typically cooked? object\n",
"dtype: object"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Apply a function to each column of a DataFrame\n",
"data.apply(lambda x: x.dtype).head()"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"$25,000 to $49,999 180\n",
"Prefer not to answer 136\n",
"$50,000 to $74,999 135\n",
"$75,000 to $99,999 133\n",
"$100,000 to $124,999 111\n",
"$200,000 and up 80\n",
"$10,000 to $24,999 68\n",
"$0 to $9,999 66\n",
"$125,000 to $149,999 49\n",
"$150,000 to $174,999 40\n",
"NaN 33\n",
"$175,000 to $199,999 27\n",
"Name: How much total combined money did all members of your HOUSEHOLD earn last year?, dtype: int64"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Count how often each value occurs, including NaN\n",
"data[\"How much total combined money did all members of your HOUSEHOLD earn last year?\"].value_counts(dropna=False)"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"import math\n",
"import numpy as np\n",
"\n",
"def clean_income(value):\n",
"    # The open-ended top bracket is mapped to its lower bound\n",
"    if value == \"$200,000 and up\":\n",
"        return 200000\n",
"    elif value == \"Prefer not to answer\":\n",
"        return np.nan\n",
"    elif isinstance(value, float) and math.isnan(value):\n",
"        return np.nan\n",
"    # Strip \"$\" and \",\" and return the midpoint of the bracket\n",
"    value = value.replace(\",\", \"\").replace(\"$\", \"\")\n",
"    income_low, income_high = value.split(\" to \")\n",
"    return (int(income_low) + int(income_high)) / 2"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0 87499.5\n",
"1 62499.5\n",
"2 4999.5\n",
"3 200000.0\n",
"4 112499.5\n",
"Name: income, dtype: float64"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# apply on a Series returns a new Series; assign it back to the DataFrame as a column\n",
"data[\"income\"] = data[\"How much total combined money did all members of your HOUSEHOLD earn last year?\"].apply(clean_income)\n",
"data[\"income\"].head()"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Canned 502\n",
"Homemade 301\n",
"None 146\n",
"Other (please specify) 25\n",
"Name: What type of cranberry saucedo you typically have?, dtype: int64"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Count how often each value occurs (NaN is excluded by default)\n",
"data[\"What type of cranberry saucedo you typically have?\"].value_counts()"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"homemade = data[data[\"What type of cranberry saucedo you typically have?\"] == \"Homemade\"]\n",
"canned = data[data[\"What type of cranberry saucedo you typically have?\"] == \"Canned\"]"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"94878.1072874494\n",
"83823.40340909091\n"
]
}
],
"source": [
"print(homemade[\"income\"].mean())\n",
"print(canned[\"income\"].mean())"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<pandas.core.groupby.DataFrameGroupBy object at 0x000001D87DB2F710>"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
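],
"source": [
"# Group by the given column. The result is a DataFrameGroupBy object; iterating over it\n",
"# yields (name, DataFrame) pairs, where every row of a DataFrame shares one value in that column\n",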
"grouped = data.groupby(\"What type of cranberry saucedo you typically have?\")\n",
"grouped"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Canned\n",
"(502, 67)\n",
"<class 'pandas.core.frame.DataFrame'>\n",
"Homemade\n",
"(301, 67)\n",
"<class 'pandas.core.frame.DataFrame'>\n",
"None\n",
"(146, 67)\n",
"<class 'pandas.core.frame.DataFrame'>\n",
"Other (please specify)\n",
"(25, 67)\n",
"<class 'pandas.core.frame.DataFrame'>\n"
]
},
{
"data": {
"text/plain": [
"<pandas.core.groupby.SeriesGroupBy object at 0x000001D87E3F0EB8>"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"grouped.groups\n",
"grouped.size()\n",
"# Iterate over the groups; each group is a DataFrame\n",
"for name, group in grouped:\n",
"    print(name)\n",
"    print(group.shape)\n",
"    print(type(group))\n",
"grouped[\"income\"]"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"What type of cranberry saucedo you typically have?\n",
"Canned 83823.403409\n",
"Homemade 94878.107287\n",
"None 78886.084034\n",
"Other (please specify) 86629.978261\n",
"Name: income, dtype: float64"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
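],
"source": [
"# Select a SeriesGroupBy from the DataFrameGroupBy and call agg on it: the function is applied\n",
"# to each group's Series, and the per-group results are combined into a single Series\n",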
"a=grouped[\"income\"].agg(np.mean)\n",
"a"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<matplotlib.axes._subplots.AxesSubplot at 0x1d87e38b4e0>"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"%matplotlib inline\n",
"\n",
"# agg到dataframe group上,返回dataframe, 没一行对应一个group, 每一列对应一个属性列\n",
"sauce = grouped.agg(np.mean)\n",
"# 取回dataframe的一列,绘图\n",
"sauce[\"income\"].plot(kind=\"bar\")"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# 使用两个列分组\n",
"grouped = data.groupby([\"What type of cranberry saucedo you typically have?\", \"What is typically the main dish at your Thanksgiving dinner?\"])"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style>\n",
" .dataframe thead tr:only-child th {\n",
" text-align: right;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: left;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th></th>\n",
" <th>RespondentID</th>\n",
" <th>gender</th>\n",
" <th>income</th>\n",
" </tr>\n",
" <tr>\n",
" <th>What type of cranberry saucedo you typically have?</th>\n",
" <th>What is typically the main dish at your Thanksgiving dinner?</th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th rowspan=\"7\" valign=\"top\">Canned</th>\n",
" <th>Chicken</th>\n",
" <td>4.336354e+09</td>\n",
" <td>0.333333</td>\n",
" <td>80999.600000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Ham/Pork</th>\n",
" <td>4.336757e+09</td>\n",
" <td>0.642857</td>\n",
" <td>77499.535714</td>\n",
" </tr>\n",
" <tr>\n",
" <th>I don't know</th>\n",
" <td>4.335987e+09</td>\n",
" <td>0.000000</td>\n",
" <td>4999.500000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Other (please specify)</th>\n",
" <td>4.336682e+09</td>\n",
" <td>1.000000</td>\n",
" <td>53213.785714</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Roast beef</th>\n",
" <td>4.336254e+09</td>\n",
" <td>0.571429</td>\n",
" <td>25499.500000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Tofurkey</th>\n",
" <td>4.337157e+09</td>\n",
" <td>0.714286</td>\n",
" <td>100713.857143</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Turkey</th>\n",
" <td>4.336705e+09</td>\n",
" <td>0.544444</td>\n",
" <td>85242.682045</td>\n",
" </tr>\n",
" <tr>\n",
" <th rowspan=\"8\" valign=\"top\">Homemade</th>\n",
" <th>Chicken</th>\n",
" <td>4.336540e+09</td>\n",
" <td>0.750000</td>\n",
" <td>19999.500000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Ham/Pork</th>\n",
" <td>4.337253e+09</td>\n",
" <td>0.250000</td>\n",
" <td>96874.625000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>I don't know</th>\n",
" <td>4.336084e+09</td>\n",
" <td>1.000000</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Other (please specify)</th>\n",
" <td>4.336863e+09</td>\n",
" <td>0.600000</td>\n",
" <td>55356.642857</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Roast beef</th>\n",
" <td>4.336174e+09</td>\n",
" <td>0.000000</td>\n",
" <td>33749.500000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Tofurkey</th>\n",
" <td>4.336790e+09</td>\n",
" <td>0.666667</td>\n",
" <td>57916.166667</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Turducken</th>\n",
" <td>4.337475e+09</td>\n",
" <td>0.500000</td>\n",
" <td>200000.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Turkey</th>\n",
" <td>4.336791e+09</td>\n",
" <td>0.531008</td>\n",
" <td>97690.147982</td>\n",
" </tr>\n",
" <tr>\n",
" <th rowspan=\"8\" valign=\"top\">None</th>\n",
" <th>Chicken</th>\n",
" <td>4.336151e+09</td>\n",
" <td>0.500000</td>\n",
" <td>11249.500000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Ham/Pork</th>\n",
" <td>4.336680e+09</td>\n",
" <td>0.444444</td>\n",
" <td>61249.500000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>I don't know</th>\n",
" <td>4.336412e+09</td>\n",
" <td>0.500000</td>\n",
" <td>33749.500000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Other (please specify)</th>\n",
" <td>4.336688e+09</td>\n",
" <td>0.600000</td>\n",
" <td>119106.678571</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Roast beef</th>\n",
" <td>4.337424e+09</td>\n",
" <td>0.000000</td>\n",
" <td>162499.500000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Tofurkey</th>\n",
" <td>4.336950e+09</td>\n",
" <td>0.500000</td>\n",
" <td>112499.500000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Turducken</th>\n",
" <td>4.336739e+09</td>\n",
" <td>0.000000</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Turkey</th>\n",
" <td>4.336784e+09</td>\n",
" <td>0.523364</td>\n",
" <td>74606.275281</td>\n",
" </tr>\n",
" <tr>\n",
" <th rowspan=\"4\" valign=\"top\">Other (please specify)</th>\n",
" <th>Ham/Pork</th>\n",
" <td>4.336465e+09</td>\n",
" <td>1.000000</td>\n",
" <td>87499.500000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Other (please specify)</th>\n",
" <td>4.337335e+09</td>\n",
" <td>0.000000</td>\n",
" <td>124999.666667</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Tofurkey</th>\n",
" <td>4.336122e+09</td>\n",
" <td>1.000000</td>\n",
" <td>37499.500000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Turkey</th>\n",
" <td>4.336724e+09</td>\n",
" <td>0.700000</td>\n",
" <td>82916.194444</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" RespondentID \\\n",
"What type of cranberry saucedo you typically have? What is typically the main dish at your Thanksg... \n",
"Canned Chicken 4.336354e+09 \n",
" Ham/Pork 4.336757e+09 \n",
" I don't know 4.335987e+09 \n",
" Other (please specify) 4.336682e+09 \n",
" Roast beef 4.336254e+09 \n",
" Tofurkey 4.337157e+09 \n",
" Turkey 4.336705e+09 \n",
"Homemade Chicken 4.336540e+09 \n",
" Ham/Pork 4.337253e+09 \n",
" I don't know 4.336084e+09 \n",
" Other (please specify) 4.336863e+09 \n",
" Roast beef 4.336174e+09 \n",
" Tofurkey 4.336790e+09 \n",
" Turducken 4.337475e+09 \n",
" Turkey 4.336791e+09 \n",
"None Chicken 4.336151e+09 \n",
" Ham/Pork 4.336680e+09 \n",
" I don't know 4.336412e+09 \n",
" Other (please specify) 4.336688e+09 \n",
" Roast beef 4.337424e+09 \n",
" Tofurkey 4.336950e+09 \n",
" Turducken 4.336739e+09 \n",
" Turkey 4.336784e+09 \n",
"Other (please specify) Ham/Pork 4.336465e+09 \n",
" Other (please specify) 4.337335e+09 \n",
" Tofurkey 4.336122e+09 \n",
" Turkey 4.336724e+09 \n",
"\n",
" gender \\\n",
"What type of cranberry saucedo you typically have? What is typically the main dish at your Thanksg... \n",
"Canned Chicken 0.333333 \n",
" Ham/Pork 0.642857 \n",
" I don't know 0.000000 \n",
" Other (please specify) 1.000000 \n",
" Roast beef 0.571429 \n",
" Tofurkey 0.714286 \n",
" Turkey 0.544444 \n",
"Homemade Chicken 0.750000 \n",
" Ham/Pork 0.250000 \n",
" I don't know 1.000000 \n",
" Other (please specify) 0.600000 \n",
" Roast beef 0.000000 \n",
" Tofurkey 0.666667 \n",
" Turducken 0.500000 \n",
" Turkey 0.531008 \n",
"None Chicken 0.500000 \n",
" Ham/Pork 0.444444 \n",
" I don't know 0.500000 \n",
" Other (please specify) 0.600000 \n",
" Roast beef 0.000000 \n",
" Tofurkey 0.500000 \n",
" Turducken 0.000000 \n",
" Turkey 0.523364 \n",
"Other (please specify) Ham/Pork 1.000000 \n",
" Other (please specify) 0.000000 \n",
" Tofurkey 1.000000 \n",
" Turkey 0.700000 \n",
"\n",
" income \n",
"What type of cranberry saucedo you typically have? What is typically the main dish at your Thanksg... \n",
"Canned Chicken 80999.600000 \n",
" Ham/Pork 77499.535714 \n",
" I don't know 4999.500000 \n",
" Other (please specify) 53213.785714 \n",
" Roast beef 25499.500000 \n",
" Tofurkey 100713.857143 \n",
" Turkey 85242.682045 \n",
"Homemade Chicken 19999.500000 \n",
" Ham/Pork 96874.625000 \n",
" I don't know NaN \n",
" Other (please specify) 55356.642857 \n",
" Roast beef 33749.500000 \n",
" Tofurkey 57916.166667 \n",
" Turducken 200000.000000 \n",
" Turkey 97690.147982 \n",
"None Chicken 11249.500000 \n",
" Ham/Pork 61249.500000 \n",
" I don't know 33749.500000 \n",
" Other (please specify) 119106.678571 \n",
" Roast beef 162499.500000 \n",
" Tofurkey 112499.500000 \n",
" Turducken NaN \n",
" Turkey 74606.275281 \n",
"Other (please specify) Ham/Pork 87499.500000 \n",
" Other (please specify) 124999.666667 \n",
" Tofurkey 37499.500000 \n",
" Turkey 82916.194444 "
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# 对dataframe group使用agg,返沪dataframe\n",
"a=grouped.agg(np.mean)\n",
"a"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [],
"source": [
"# 对series group使用agg, 可以传入多个函数\n",
"a=grouped[\"income\"].agg([np.mean, np.sum, np.std]).head(10)"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"How would you describe where you live? \n",
"Rural Turkey 189\n",
" Other (please specify) 9\n",
" Ham/Pork 7\n",
" I don't know 3\n",
" Tofurkey 3\n",
" Chicken 2\n",
" Turducken 2\n",
" Roast beef 1\n",
"Suburban Turkey 449\n",
" Ham/Pork 17\n",
" Other (please specify) 13\n",
" Tofurkey 9\n",
" Chicken 3\n",
" Roast beef 3\n",
" I don't know 1\n",
" Turducken 1\n",
"Urban Turkey 198\n",
" Other (please specify) 13\n",
" Tofurkey 8\n",
" Chicken 7\n",
" Roast beef 6\n",
" Ham/Pork 4\n",
"Name: What is typically the main dish at your Thanksgiving dinner?, dtype: int64"
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# 对series group使用apply, 对每一个series使用apply, 返回多个series, 每个series 作为列垂直拼接\n",
"grouped = data.groupby(\"How would you describe where you live?\")[\"What is typically the main dish at your Thanksgiving dinner?\"]\n",
"grouped.apply(lambda x:x.value_counts())"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.3"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
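The three groupby patterns used in the notebook (agg on a DataFrameGroupBy, agg with several functions on a SeriesGroupBy, and apply on a SeriesGroupBy) can be tried on a small table. This is only a minimal sketch: the `data` frame and its short column names `sauce`, `dish`, and `income` are hypothetical stand-ins for the survey data, and string aggregator names are used instead of the NumPy functions from the notebook.

```python
import pandas as pd

# Hypothetical stand-in for the Thanksgiving survey data (not the real dataset).
data = pd.DataFrame({
    "sauce": ["Canned", "Canned", "Homemade", "Homemade", "Homemade"],
    "dish": ["Turkey", "Ham/Pork", "Turkey", "Turkey", "Chicken"],
    "income": [50000.0, 70000.0, 90000.0, 110000.0, 20000.0],
})

# agg on a DataFrameGroupBy: one row per group, one column per selected column.
mean_by_sauce = data.groupby("sauce")[["income"]].agg("mean")

# agg on a SeriesGroupBy with a list of functions: one output column per function.
# Grouping by two columns gives the result a two-level row index.
stats = data.groupby(["sauce", "dish"])["income"].agg(["mean", "sum", "count"])

# apply on a SeriesGroupBy: value_counts runs once per group and the
# per-group results are stacked into a single Series with a MultiIndex.
counts = data.groupby("sauce")["dish"].apply(lambda s: s.value_counts())

print(mean_by_sauce)
print(stats)
print(counts)
```

Selecting the numeric column before aggregating keeps the example independent of how a given pandas version treats non-numeric columns in `agg`.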