Last active
October 18, 2019 23:50
Example of writing a Tea program
{ | |
"nbformat": 4, | |
"nbformat_minor": 0, | |
"metadata": { | |
"colab": { | |
"name": "Example of using Tea", | |
"version": "0.3.2", | |
"provenance": [], | |
"collapsed_sections": [], | |
"include_colab_link": true | |
}, | |
"kernelspec": { | |
"display_name": "Python 3", | |
"name": "python3" | |
} | |
}, | |
"cells": [ | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "view-in-github", | |
"colab_type": "text" | |
}, | |
"source": [ | |
"<a href=\"https://colab.research.google.com/gist/emjun/f0ebafc97c208b9f329b1e8c14f90a7f/tea_example_0.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"colab_type": "text", | |
"id": "JyG45Qk3qQLS" | |
}, | |
"source": [ | |
"# Tea\n", | |
"Tea requires users to describe their data, variables, study design, assumptions about the data, and hypotheses at a high level. Tea combines computed properties of the data (e.g., normal distribution) with users' assumptions and hypotheses to infer a set of valid statistical analyses that test those hypotheses. Unlike other statistical analysis tools, Tea focuses on capturing users' *explicit* hypotheses and assumptions about the data.\n" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "Ua--kwl8Ge7x", | |
"colab_type": "text" | |
}, | |
"source": [ | |
"## Example\n", | |
"\n", | |
"Let's walk through an example! Make sure to install Tea first. :)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "VLt-_g2lGtHi", | |
"colab_type": "text" | |
}, | |
"source": [ | |
"## Import Tea" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "-gsAYPFgGxhL", | |
"colab_type": "code", | |
"colab": {} | |
}, | |
"source": [ | |
"import tea" | |
], | |
"execution_count": 0, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"colab_type": "text", | |
"id": "KR921S_OQSHG" | |
}, | |
"source": [ | |
"## Data\n", | |
"\n", | |
"Load the data." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"cellView": "both", | |
"colab_type": "code", | |
"id": "WUtu4316QSHL", | |
"colab": {} | |
}, | |
"source": [ | |
"\n", | |
"tea.data(\"https://homes.cs.washington.edu/~emjun/tea-lang/datasets/UScrime.csv\")\n" | |
], | |
"execution_count": 0, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"colab_type": "text", | |
"id": "Id6tDF1HQSHD" | |
}, | |
"source": [ | |
"## Variables\n", | |
"\n", | |
"Declare and annotate the variables of interest." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "9_3Uxu7cGS0D", | |
"colab_type": "code", | |
"colab": {} | |
}, | |
"source": [ | |
"variables = [\n", | |
" {\n", | |
" 'name' : 'So',\n", | |
" 'data type' : 'nominal',\n", | |
" 'categories' : ['0', '1']\n", | |
" },\n", | |
" {\n", | |
" 'name' : 'Prob',\n", | |
" 'data type' : 'ratio',\n", | |
" 'range' : [0,1]\n", | |
" }\n", | |
"]\n", | |
"\n", | |
"tea.define_variables(variables)" | |
], | |
"execution_count": 0, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "qwy7tpFCHlkw", | |
"colab_type": "text" | |
}, | |
"source": [ | |
"## Assumptions\n", | |
"\n", | |
"Declare any assumptions you may have about the data based on prior visualization or domain knowledge." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "JIrQI3r4NZRI", | |
"colab_type": "code", | |
"colab": {} | |
}, | |
"source": [ | |
"assumptions = {\n", | |
" 'groups normally distributed': [['So', 'Prob']],\n", | |
" 'Type I (False Positive) Error Rate': 0.05,\n", | |
"}\n", | |
"\n", | |
"tea.assume(assumptions)" | |
], | |
"execution_count": 0, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "1fCGhA2DNbHh", | |
"colab_type": "text" | |
}, | |
"source": [ | |
"## Study Design\n", | |
"\n", | |
"Express how the data were collected." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "DiZp2K-rN24X", | |
"colab_type": "code", | |
"colab": {} | |
}, | |
"source": [ | |
"experimental_design = {\n", | |
" 'study type': 'observational study',\n", | |
" 'contributor variables': 'So',\n", | |
" 'outcome variables': 'Prob',\n", | |
" }\n", | |
"tea.define_study_design(experimental_design)" | |
], | |
"execution_count": 0, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "lFc-d10JN3_i", | |
"colab_type": "text" | |
}, | |
"source": [ | |
"## Hypothesis\n", | |
"Explicitly state a hypothesis about the relationship between the variables in the data." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "5mNM6Q8uOE3h", | |
"colab_type": "code" | |
}, | |
"source": [ | |
"tea.hypothesize(['So', 'Prob'], ['So:1 > 0'])" | |
], | |
"execution_count": 0, | |
"outputs": [] | |
} | |
] | |
} |
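The hypothesis string in the last cell, 'So:1 > 0', is compact, so here is one way to read it: for the nominal variable So, category 1 is expected to have a higher value of the outcome (Prob) than category 0. The sketch below parses that notation with plain Python and checks it against the variable declarations copied from the notebook. It does not call Tea, and it is not how Tea parses hypotheses internally; the names PATTERN, parse_comparison, and check_against_declarations are hypothetical helpers made up for illustration.

```python
import re

# Declarations copied from the notebook's "Variables" cell.
variables = [
    {'name': 'So', 'data type': 'nominal', 'categories': ['0', '1']},
    {'name': 'Prob', 'data type': 'ratio', 'range': [0, 1]},
]

# Assumed shape of a one-sided comparison such as 'So:1 > 0':
# variable name, colon, category, '<' or '>', category.
PATTERN = re.compile(r'^\s*(\w+)\s*:\s*(\w+)\s*([<>])\s*(\w+)\s*$')


def parse_comparison(text):
    """Split 'So:1 > 0' into (variable, left category, operator, right category)."""
    match = PATTERN.match(text)
    if match is None:
        raise ValueError(f'not a category comparison: {text!r}')
    return match.groups()


def check_against_declarations(text, declared):
    """Parse the comparison and confirm it only uses declared names and categories."""
    name, left, op, right = parse_comparison(text)
    by_name = {v['name']: v for v in declared}
    if name not in by_name:
        raise ValueError(f'undeclared variable: {name}')
    categories = by_name[name].get('categories')
    if categories is None:
        raise ValueError(f'{name} has no declared categories')
    for category in (left, right):
        if category not in categories:
            raise ValueError(f'{category!r} is not a category of {name}')
    return name, left, op, right


print(check_against_declarations('So:1 > 0', variables))
```

A check like this catches a typo such as 'So:2 > 0' before any analysis runs, because '2' is not among the categories declared for So.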
Great find! 👍
Updated the example!
Thanks :)
Should be
tea.data("https://homes.cs.washington.edu/~emjun/tea-lang/datasets/UScrime.csv")
at the top, correct?