@emjun
Last active October 18, 2019 23:50
Example of writing a Tea program
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"name": "Example of using Tea",
"version": "0.3.2",
"provenance": [],
"collapsed_sections": [],
"include_colab_link": true
},
"kernelspec": {
"display_name": "Python 3",
"name": "python3"
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/gist/emjun/f0ebafc97c208b9f329b1e8c14f90a7f/tea_example_0.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "JyG45Qk3qQLS"
},
"source": [
"# Tea\n",
"Tea requires users to describe their data, variables, study design, assumptions about the data, and hypotheses at a high-level. Tea combined computed properties about the data (e.g., normal distribution) with users' assumptions and hypotheses to infer a set of valid statisitcal analyses that test users' hypotheses. Unlike other statistical analysis tools, Tea focuses on capturing users' *explicit* hypotheses and assumptions about the data. \n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Ua--kwl8Ge7x",
"colab_type": "text"
},
"source": [
"## Example\n",
"\n",
"Let's walk through an example! Make sure to install Tea before. :)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "VLt-_g2lGtHi",
"colab_type": "text"
},
"source": [
"## 1. Import tea"
]
},
{
"cell_type": "code",
"metadata": {
"id": "-gsAYPFgGxhL",
"colab_type": "code",
"colab": {}
},
"source": [
"import tea"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "KR921S_OQSHG"
},
"source": [
"## Data\n",
"\n",
"Load data."
]
},
{
"cell_type": "code",
"metadata": {
"cellView": "both",
"colab_type": "code",
"id": "WUtu4316QSHL",
"colab": {}
},
"source": [
"\n",
"tea.data(\"https://homes.cs.washington.edu/~emjun/tea-lang/datasets/UScrime.csv\")\n"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "Id6tDF1HQSHD"
},
"source": [
"## Variables\n",
"\n",
"Declare and annotate the variables of interest."
]
},
{
"cell_type": "code",
"metadata": {
"id": "9_3Uxu7cGS0D",
"colab_type": "code",
"colab": {}
},
"source": [
"variables = [\n",
" {\n",
" 'name' : 'So',\n",
" 'data type' : 'nominal',\n",
" 'categories' : ['0', '1']\n",
" },\n",
" {\n",
" 'name' : 'Prob',\n",
" 'data type' : 'ratio',\n",
" 'range' : [0,1]\n",
" }\n",
"]\n",
"\n",
"tea.define_variables(variables)"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "qwy7tpFCHlkw",
"colab_type": "text"
},
"source": [
"## Assumptions\n",
"\n",
"Declare any assumptions you may have about the data based on prior visualization or domain knowledge."
]
},
{
"cell_type": "code",
"metadata": {
"id": "JIrQI3r4NZRI",
"colab_type": "code",
"colab": {}
},
"source": [
"assumptions = {\n",
" 'groups normally distributed': [['So', 'Prob']],\n",
" 'Type I (False Positive) Error Rate': 0.05,\n",
"}\n",
"\n",
"tea.assume(assumptions)"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "1fCGhA2DNbHh",
"colab_type": "text"
},
"source": [
"## Study Design\n",
"\n",
"Express how the data were collected."
]
},
{
"cell_type": "code",
"metadata": {
"id": "DiZp2K-rN24X",
"colab_type": "code",
"colab": {}
},
"source": [
"experimental_design = {\n",
" 'study type': 'observational study',\n",
" 'contributor variables': 'So',\n",
" 'outcome variables': 'Prob',\n",
" }\n",
"tea.define_study_design(experimental_design)"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "lFc-d10JN3_i",
"colab_type": "text"
},
"source": [
"## Hypothesis\n",
"Explicitly state a hypothesis about the relationship between the variables in the data."
]
},
{
"cell_type": "code",
"metadata": {
"id": "5mNM6Q8uOE3h",
"colab_type": "code"
},
"source": [
"tea.hypothesize(['So', 'Prob'], ['So:1 > 0'])"
],
"execution_count": 0,
"outputs": []
}
]
}
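
The variable declarations in the notebook are plain dictionaries, so a typo in a key such as 'data type' is easy to miss. Below is a minimal sketch of a pre-check that could be run before calling `tea.define_variables`. The helper `check_variables` is hypothetical and not part of Tea, and the set of accepted type names is an assumption (the two types used in the notebook plus the other standard measurement scales).

```python
# Hypothetical helper, NOT part of Tea: sanity-checks variable declarations
# shaped like the ones in the notebook above.
# Assumption: the valid 'data type' names are the four standard measurement
# scales; only 'nominal' and 'ratio' actually appear in the notebook.
REQUIRED_KEY_BY_TYPE = {
    'nominal': 'categories',
    'ordinal': 'categories',
    'interval': None,
    'ratio': None,
}


def check_variables(variables):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    for i, var in enumerate(variables):
        name = var.get('name')
        if not name:
            problems.append(f"variable {i}: missing 'name'")
            name = f"#{i}"
        dtype = var.get('data type')
        if dtype not in REQUIRED_KEY_BY_TYPE:
            problems.append(f"{name}: unknown or missing 'data type' {dtype!r}")
            continue
        needed = REQUIRED_KEY_BY_TYPE[dtype]
        if needed and not var.get(needed):
            problems.append(f"{name}: {dtype} variables need '{needed}'")
        rng = var.get('range')
        if rng is not None and (len(rng) != 2 or rng[0] > rng[1]):
            problems.append(f"{name}: 'range' must be [low, high]")
    return problems


if __name__ == "__main__":
    # Same declarations as in the notebook.
    variables = [
        {'name': 'So', 'data type': 'nominal', 'categories': ['0', '1']},
        {'name': 'Prob', 'data type': 'ratio', 'range': [0, 1]},
    ]
    for problem in check_variables(variables):
        print(problem)
```

The check only inspects the shape of the dictionaries; it does not look at the dataset or reproduce any validation Tea may do itself.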
@g-simmons2

Should be tea.data("https://homes.cs.washington.edu/~emjun/tea-lang/datasets/UScrime.csv") at the top, correct?

@emjun

emjun commented Sep 23, 2019

Great find! 👍

Updated the example!

Thanks :)
