@Gonzillaaa
Created November 13, 2012 07:09
{
"metadata": {
"name": "Study #1 OU Group C"
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Behaviour Change Analysis: Sales vs Expected Probability\n",
"\n",
"### Move to working directory"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"cd /Users/gonzillaaa/Dropbox/Code/thesis-analysis/2012-08-07thesis-analysis/analysis"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"/Users/gonzillaaa/Dropbox/Code/thesis-analysis/2012-08-07thesis-analysis/analysis\n"
]
}
],
"prompt_number": 19
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Import and run the sales module code written so far"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import numpy as np  # used below as np.sum\n",
"import pandas as pd\n",
"from behavior import change\n",
"pd.set_printoptions(notebook_repr_html=True, max_columns=45)"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 20
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%run behavior/change.py"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 21
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Access the OU and Arup data via the ou_ and arup_ prefixes (type the prefix, then press Tab)"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"sales_data = ou_sales_analysis\n",
"organisation_name = \"OU\""
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 22
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2 - Focus on one group, create an easy-access dataframe (df) for it, and look at the overall change"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"group = \"df_group_c_sales\"\n",
"df = sales_data.sales_per_group[group]\n",
"print df.keys()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"array([participant_id, experiment_id, sale_bottle, vote_time, sale_color,\n",
" sale_order, voted_ingroup, questionnaire_user_id,\n",
" questionnaire_phase, questionnaire_user_name,\n",
" questionnaire_user_surname, questionnaire_experiment_id,\n",
" questionnaire_user_age, questionnaire_user_gender,\n",
" how_often_buy_wine, where_buy_wine, how_many_bottles, average_spend,\n",
" wine_knowledge, change_discover, change_quantity, change_type,\n",
" change_amount, bottle_id, wine_type, table_id, bottle_display_id,\n",
" pixel, probability_score, pink, green, blue, nocolor], dtype=object)\n"
]
}
],
"prompt_number": 23
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# keep only the columns we need, then sum them per experiment\n",
"observed = df[[\"experiment_id\", \"participant_id\", \"pink\", \"green\", \"blue\", \"nocolor\"]]\n",
"observed = observed.groupby(\"experiment_id\")\n",
"observed = observed.aggregate(np.sum)\n",
"observed = observed.reset_index()\n",
"\n",
"print\n",
"print \"group size :\", len(df[\"participant_id\"].unique())\n",
"testChangeInGroup(group, observed)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"group size : 25\n",
"Chi-square test p-value: 1.39439385104e-05 \n",
"change is significant for group, df_group_c_sales\n",
"\n"
]
}
],
"prompt_number": 24
},
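{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Sketch: what the chi-square test above might look like\n",
"\n",
"`testChangeInGroup` is defined in `behavior/change.py` and is not shown in this notebook. The cell below is only a minimal sketch of a chi-square goodness-of-fit test on colour counts, assuming SciPy is installed. The helper name `chi_square_sketch`, the example counts and the uniform expected probabilities are all hypothetical; the real analysis may derive its expected values differently (for example from `probability_score`)."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# Hypothetical sketch, not the code in behavior/change.py\n",
"from scipy.stats import chisquare\n",
"\n",
"def chi_square_sketch(observed_counts, expected_probs, alpha=0.05):\n",
"    # scale the expected probabilities to the observed total\n",
"    total = float(sum(observed_counts))\n",
"    expected_counts = [p * total for p in expected_probs]\n",
"    statistic, p_value = chisquare(observed_counts, f_exp=expected_counts)\n",
"    return p_value, p_value < alpha\n",
"\n",
"# made-up counts for pink, green, blue, nocolor (not taken from the study data)\n",
"p_value, significant = chi_square_sketch([12, 5, 4, 4], [0.25, 0.25, 0.25, 0.25])\n",
"print(\"Chi-square test p-value: %s\" % p_value)\n",
"print(\"significant: %s\" % significant)"
],
"language": "python",
"metadata": {},
"outputs": []
},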
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"### Wine knowledge"
]
},
{
"cell_type": "code",
"collapsed": true,
"input": [
"# keep only the columns we need, then group by wine knowledge\n",
"observed = df[[\"experiment_id\", \"participant_id\", \"wine_knowledge\", \"pink\", \"green\", \"blue\", \"nocolor\"]]\n",
"observed = observed.groupby(\"wine_knowledge\")\n",
"\n",
"reportChangeInColumn(group, observed, \"Wine knowledge\")"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"---\n",
"Wine knowledge: 0.0\n",
"group size : 1\n",
"Chi-square test p-value: 6.52311220302e-05 \n",
"change is significant for group, df_group_c_sales\n",
"\n",
"\n",
"---\n",
"Wine knowledge: 1.0\n",
"group size : 7\n",
"Chi-square test p-value: 2.74566262007e-05 \n",
"change is significant for group, df_group_c_sales\n",
"\n",
"\n",
"---\n",
"Wine knowledge: 2.0\n",
"group size : 8\n",
"Chi-square test p-value: 0.000194534471021 \n",
"change is significant for group, df_group_c_sales\n",
"\n",
"\n",
"---\n",
"Wine knowledge: 3.0\n",
"group size : 8\n",
"Chi-square test p-value: 0.504566717012 "
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"\n",
"\n",
"---\n",
"Wine knowledge: 4.0\n",
"group size : 1\n",
"Chi-square test p-value: 0.416957464434 \n",
"\n"
]
}
],
"prompt_number": 25
},
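{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Sketch: a per-value report loop\n",
"\n",
"`reportChangeInColumn` is also defined outside this notebook. Judging only from the printed output, it seems to run the chi-square test once per value of the grouping column. The cell below is a rough sketch of such a loop, assuming SciPy and a pandas GroupBy object; `report_change_sketch`, `COLORS` and the `expected_probs` argument are hypothetical names, and the 0.05 threshold is an assumption that happens to be consistent with the outputs shown here."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# Hypothetical sketch, not the code in behavior/change.py\n",
"from scipy.stats import chisquare\n",
"\n",
"COLORS = [\"pink\", \"green\", \"blue\", \"nocolor\"]\n",
"\n",
"def report_change_sketch(grouped, label, expected_probs, alpha=0.05):\n",
"    # iterating over a pandas GroupBy yields (value, rows) pairs\n",
"    for value, rows in grouped:\n",
"        counts = [float(rows[c].sum()) for c in COLORS]\n",
"        total = sum(counts)\n",
"        if total == 0:\n",
"            continue  # no sales recorded for this value\n",
"        expected = [p * total for p in expected_probs]\n",
"        statistic, p_value = chisquare(counts, f_exp=expected)\n",
"        print(\"---\")\n",
"        print(\"%s: %s\" % (label, value))\n",
"        print(\"group size : %d\" % len(rows[\"participant_id\"].unique()))\n",
"        print(\"Chi-square test p-value: %s\" % p_value)\n",
"        if p_value < alpha:\n",
"            print(\"change is significant\")\n",
"\n",
"# possible usage; the uniform expected probabilities are an assumption\n",
"# report_change_sketch(df.groupby(\"wine_knowledge\"), \"Wine knowledge\", [0.25] * 4)"
],
"language": "python",
"metadata": {},
"outputs": []
},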
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Gender"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# keep only the columns we need, then group by gender\n",
"observed = df[[\"experiment_id\", \"participant_id\", \"questionnaire_user_gender\", \"pink\", \"green\", \"blue\", \"nocolor\"]]\n",
"observed = observed.groupby(\"questionnaire_user_gender\")\n",
"\n",
"reportChangeInColumn(group, observed, \"Gender\")"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"---\n",
"Gender: 1.0\n",
"group size : 16\n",
"Chi-square test p-value: 0.0614677954937 \n",
"\n",
"\n",
"---\n",
"Gender: 2.0\n",
"group size : 9\n",
"Chi-square test p-value: 4.11008904541e-05 \n",
"change is significant for group, df_group_c_sales\n",
"\n"
]
}
],
"prompt_number": 26
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Age"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# keep only the columns we need, then group by age\n",
"observed = df[[\"experiment_id\", \"participant_id\", \"questionnaire_user_age\", \"pink\", \"green\", \"blue\", \"nocolor\"]]\n",
"observed = observed.groupby(\"questionnaire_user_age\")\n",
"reportChangeInColumn(group, observed, \"Age\")"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"---\n",
"Age: 20.0\n",
"group size : 8\n",
"Chi-square test p-value: 0.000387513774371 \n",
"change is significant for group, df_group_c_sales\n",
"\n",
"\n",
"---\n",
"Age: 30.0\n",
"group size : 6\n",
"Chi-square test p-value: 0.0641329719938 \n",
"\n",
"\n",
"---\n",
"Age: 40.0\n",
"group size : 5\n",
"Chi-square test p-value: 0.46967107284 \n",
"\n",
"\n"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"---\n",
"Age: 50.0\n",
"group size : 4\n",
"Chi-square test p-value: 0.896085870503 \n",
"\n",
"\n",
"---\n",
"Age: 60.0\n",
"group size : 2\n",
"Chi-square test p-value: 3.40026554706e-05 \n",
"change is significant for group, df_group_c_sales\n",
"\n"
]
}
],
"prompt_number": 27
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### How often buy wine"
]
},
{
"cell_type": "code",
"collapsed": true,
"input": [
"observed = df[[\"experiment_id\", \"participant_id\", \"how_often_buy_wine\", \"pink\", \"green\", \"blue\", \"nocolor\"]]\n",
"observed = observed.groupby(\"how_often_buy_wine\")\n",
"reportChangeInColumn(group, observed, \"How often buy wine\")"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"---\n",
"How often buy wine: 0.0\n",
"group size : 1\n",
"Chi-square test p-value: 0.418978846474 \n",
"\n",
"\n",
"---\n",
"How often buy wine: 1.0\n",
"group size : 12\n",
"Chi-square test p-value: 0.000203819520233 \n",
"change is significant for group, df_group_c_sales\n",
"\n",
"\n",
"---\n",
"How often buy wine: 2.0\n",
"group size : 4\n",
"Chi-square test p-value: 0.0185576792801 \n",
"change is significant for group, df_group_c_sales\n",
"\n",
"\n",
"---\n",
"How often buy wine: 3.0\n",
"group size : 6\n",
"Chi-square test p-value: 0.112467262178 "
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"\n",
"\n",
"---\n",
"How often buy wine: 4.0\n",
"group size : 1\n",
"Chi-square test p-value: 8.18655595555e-07 \n",
"change is significant for group, df_group_c_sales\n",
"\n",
"\n",
"---\n",
"How often buy wine: 5.0\n",
"group size : 1\n",
"Chi-square test p-value: 0.193779733631 \n",
"\n"
]
}
],
"prompt_number": 28
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### How many bottles"
]
},
{
"cell_type": "code",
"collapsed": true,
"input": [
"observed = df[[\"experiment_id\", \"participant_id\", \"how_many_bottles\", \"pink\", \"green\", \"blue\", \"nocolor\"]]\n",
"observed = observed.groupby(\"how_many_bottles\")\n",
"reportChangeInColumn(group, observed, \"How many bottles\")"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"---\n",
"How many bottles: 1.0\n",
"group size : 11\n",
"Chi-square test p-value: 0.001289195089 \n",
"change is significant for group, df_group_c_sales\n",
"\n",
"\n",
"---\n",
"How many bottles: 2.0\n",
"group size : 12\n",
"Chi-square test p-value: 0.0264578617612 \n",
"change is significant for group, df_group_c_sales\n",
"\n",
"\n",
"---\n",
"How many bottles: 4.0\n",
"group size : 1\n",
"Chi-square test p-value: 0.193779733631 \n",
"\n",
"\n"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"---\n",
"How many bottles: 5.0\n",
"group size : 1\n",
"Chi-square test p-value: 0.416957464434 \n",
"\n"
]
}
],
"prompt_number": 29
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Average spend"
]
},
{
"cell_type": "code",
"collapsed": true,
"input": [
"observed = df[[\"experiment_id\", \"participant_id\", \"average_spend\", \"pink\", \"green\", \"blue\", \"nocolor\"]]\n",
"observed = observed.groupby(\"average_spend\")\n",
"reportChangeInColumn(group, observed, \"Average spend\")"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"---\n",
"Average spend: 0.0\n",
"group size : 1\n",
"Chi-square test p-value: 0.833703116189 \n",
"\n",
"\n",
"---\n",
"Average spend: 1.0\n",
"group size : 5\n",
"Chi-square test p-value: 0.000293529397163 \n",
"change is significant for group, df_group_c_sales\n",
"\n",
"\n",
"---\n",
"Average spend: 2.0\n",
"group size : 13\n",
"Chi-square test p-value: 0.00179266113524 \n",
"change is significant for group, df_group_c_sales\n",
"\n",
"\n",
"---\n",
"Average spend: 3.0\n",
"group size : 3\n",
"Chi-square test p-value: 0.0315546964851 "
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"change is significant for group, df_group_c_sales\n",
"\n",
"\n",
"---\n",
"Average spend: 4.0\n",
"group size : 2\n",
"Chi-square test p-value: 0.0883571486689 \n",
"\n",
"\n",
"---\n",
"Average spend: 5.0\n",
"group size : 1\n",
"Chi-square test p-value: 0.0718977724965 \n",
"\n"
]
}
],
"prompt_number": 30
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Wine type"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# keep only the columns we need, then group by wine type\n",
"observed = df[[\"experiment_id\", \"participant_id\", \"wine_type\", \"pink\", \"green\", \"blue\", \"nocolor\"]]\n",
"observed = observed.groupby(\"wine_type\")\n",
"reportChangeInColumn(group, observed, \"Wine type\")"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"---\n",
"Wine type: 1\n",
"group size : 22\n",
"Chi-square test p-value: 0.000337546498453 \n",
"change is significant for group, df_group_c_sales\n",
"\n",
"\n",
"---\n",
"Wine type: 2\n",
"group size : 12\n",
"Chi-square test p-value: 2.92019120478e-05 \n",
"change is significant for group, df_group_c_sales\n",
"\n"
]
}
],
"prompt_number": 31
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Wine price"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"observed = df[[\"experiment_id\", \"participant_id\", \"table_id\", \"pink\", \"green\", \"blue\", \"nocolor\"]]\n",
"observed = observed.groupby(\"table_id\")\n",
"reportChangeInColumn(group, observed, \"Wine price\")"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"---\n",
"Wine price: 1\n",
"group size : 6\n",
"Chi-square test p-value: 0.233898012605 \n",
"\n",
"\n",
"---\n",
"Wine price: 2\n",
"group size : 8\n",
"Chi-square test p-value: 0.439791727846 \n",
"\n",
"\n",
"---\n",
"Wine price: 3\n",
"group size : 14\n",
"Chi-square test p-value: 0.000132125207844 \n",
"change is significant for group, df_group_c_sales\n",
"\n",
"\n",
"---\n",
"Wine price: 4\n",
"group size : 1\n",
"Chi-square test p-value: 0.28388613076 "
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"\n",
"\n",
"---\n",
"Wine price: 5\n",
"group size : 6\n",
"Chi-square test p-value: 0.943846505205 \n",
"\n",
"\n",
"---\n",
"Wine price: 6\n",
"group size : 7\n",
"Chi-square test p-value: 5.45303414624e-09 \n",
"change is significant for group, df_group_c_sales\n",
"\n"
]
}
],
"prompt_number": 32
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### change_discover"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"observed = df[[\"experiment_id\", \"participant_id\", \"change_discover\", \"pink\", \"green\", \"blue\", \"nocolor\"]]\n",
"observed = observed.groupby(\"change_discover\")\n",
"reportChangeInColumn(group, observed, \"Change Discover\")"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"---\n",
"Change Discover: 0.0\n",
"group size : 1\n",
"Chi-square test p-value: 0.28388613076 \n",
"\n",
"\n",
"---\n",
"Change Discover: 1.0\n",
"group size : 2\n",
"Chi-square test p-value: 0.12065673634 \n",
"\n",
"\n",
"---\n",
"Change Discover: 2.0\n",
"group size : 5\n",
"Chi-square test p-value: 0.0514871680084 \n",
"\n",
"\n",
"---\n",
"Change Discover: 3.0\n",
"group size : 7\n",
"Chi-square test p-value: 0.0380980353744 "
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"change is significant for group, df_group_c_sales\n",
"\n",
"\n",
"---\n",
"Change Discover: 4.0\n",
"group size : 6\n",
"Chi-square test p-value: 0.01044446374 \n",
"change is significant for group, df_group_c_sales\n",
"\n",
"\n",
"---\n",
"Change Discover: 5.0\n",
"group size : 4\n",
"Chi-square test p-value: 0.0116471981457 \n",
"change is significant for group, df_group_c_sales\n",
"\n"
]
}
],
"prompt_number": 33
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### change_quantity"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"observed = df[[\"experiment_id\", \"participant_id\", \"change_quantity\", \"pink\", \"green\", \"blue\", \"nocolor\"]]\n",
"observed = observed.groupby(\"change_quantity\")\n",
"reportChangeInColumn(group, observed, \"Change Quantity\")"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"---\n",
"Change Quantity: 0.0\n",
"group size : 1\n",
"Chi-square test p-value: 0.28388613076 \n",
"\n",
"\n",
"---\n",
"Change Quantity: 1.0\n",
"group size : 10\n",
"Chi-square test p-value: 0.00105111880859 \n",
"change is significant for group, df_group_c_sales\n",
"\n",
"\n",
"---\n",
"Change Quantity: 2.0\n",
"group size : 8\n",
"Chi-square test p-value: 0.171209461002 \n",
"\n",
"\n",
"---\n",
"Change Quantity: 3.0\n",
"group size : 3\n",
"Chi-square test p-value: 0.026325488078 "
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"change is significant for group, df_group_c_sales\n",
"\n",
"\n",
"---\n",
"Change Quantity: 4.0\n",
"group size : 2\n",
"Chi-square test p-value: 0.0676870282371 \n",
"\n",
"\n",
"---\n",
"Change Quantity: 5.0\n",
"group size : 1\n",
"Chi-square test p-value: 0.933378895053 \n",
"\n"
]
}
],
"prompt_number": 34
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### change_type"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"observed = df[[\"experiment_id\", \"participant_id\", \"change_type\", \"pink\", \"green\", \"blue\", \"nocolor\"]]\n",
"observed = observed.groupby(\"change_type\")\n",
"reportChangeInColumn(group, observed, \"Change Type\")"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"---\n",
"Change Type: 0.0\n",
"group size : 2\n",
"Chi-square test p-value: 0.176744402129 \n",
"\n",
"\n",
"---\n",
"Change Type: 1.0\n",
"group size : 5\n",
"Chi-square test p-value: 0.285631923849 \n",
"\n",
"\n",
"---\n",
"Change Type: 2.0\n",
"group size : 7\n",
"Chi-square test p-value: 0.126345696396 \n",
"\n",
"\n"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"---\n",
"Change Type: 3.0\n",
"group size : 6\n",
"Chi-square test p-value: 0.00433414724892 \n",
"change is significant for group, df_group_c_sales\n",
"\n",
"\n",
"---\n",
"Change Type: 4.0\n",
"group size : 3\n",
"Chi-square test p-value: 0.00263113048644 \n",
"change is significant for group, df_group_c_sales\n",
"\n",
"\n",
"---\n",
"Change Type: 5.0\n",
"group size : 2\n",
"Chi-square test p-value: 0.097011103381 \n",
"\n"
]
}
],
"prompt_number": 35
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### change_amount"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"observed = df[[\"experiment_id\", \"participant_id\", \"change_amount\", \"pink\", \"green\", \"blue\", \"nocolor\"]]\n",
"observed = observed.groupby(\"change_amount\")\n",
"reportChangeInColumn(group, observed, \"Change Amount\")"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"---\n",
"Change Amount: 0.0\n",
"group size : 1\n",
"Chi-square test p-value: 0.28388613076 \n",
"\n",
"\n",
"---\n",
"Change Amount: 1.0\n",
"group size : 10\n",
"Chi-square test p-value: 0.00282470145503 \n",
"change is significant for group, df_group_c_sales\n",
"\n",
"\n",
"---\n",
"Change Amount: 2.0\n",
"group size : 9\n",
"Chi-square test p-value: 0.00208534275674 \n",
"change is significant for group, df_group_c_sales\n",
"\n",
"\n",
"---\n",
"Change Amount: 3.0\n",
"group size : 3\n",
"Chi-square test p-value: 5.08651483595e-06 "
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"change is significant for group, df_group_c_sales\n",
"\n",
"\n",
"---\n",
"Change Amount: 4.0\n",
"group size : 1\n",
"Chi-square test p-value: 0.193779733631 \n",
"\n",
"\n",
"---\n",
"Change Amount: 5.0\n",
"group size : 1\n",
"Chi-square test p-value: 0.933378895053 \n",
"\n"
]
}
],
"prompt_number": 36
}
],
"metadata": {}
}
]
}