@jkuruzovich
Created October 17, 2019 18:35
x.ipynb
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"name": "x.ipynb",
"provenance": [],
"collapsed_sections": [],
"include_colab_link": true
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/gist/jkuruzovich/c3f1eca974dbb6c48d443b82396706bb/x.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "vZfe_BrN2782",
"colab_type": "text"
},
"source": [
"## Midterm\n",
"\n",
"![](https://github.com/rpi-techfundamentals/hm-01-starter/blob/master/notsaved.png?raw=1)\n",
"\n",
"**WARNING!!! If you see this icon at the top of your Colab session, your work is not saved automatically.**\n",
"\n",
"**Do not manually upload any files. Use the `wget` command to retrieve files.**\n",
"\n",
"**Save your working file in Google Drive so that all changes will be saved as you work. MAKE SURE that your final version is saved to GitHub.**\n",
"\n",
"Before you turn this in, make sure everything runs as expected. First, restart the kernel (in the menu, select Kernel → Restart) and then run all cells (in the menubar, select Cell → Run All). They should run completely without intervention, **i.e., DO NOT manually upload any files. Use the `wget` command to retrieve files as necessary.**\n",
"\n",
"\n",
"### This is a 45-point assignment.\n",
"\n",
"**You may find it useful to go through the notebooks from the course materials when doing these exercises.**\n",
"\n",
"**If you receive assistance from anyone (other than instructor clarification), it will be considered an ethical violation and referred to the associate dean. By listing your name below, you are confirming that you have received no unauthorized help on this exam and have completed it yourself.**"
]
},
{
"cell_type": "code",
"metadata": {
"id": "7Klp5Qe4_iAk",
"colab_type": "code",
"colab": {}
},
"source": [
"course=\"mgmt6560\" #course number\n",
"section=\"\" #For example 1 or 2\n",
"first_name=\"\" #For example, \"jason\"\n",
"last_name=\"\" #For example, \"kuruzovich\"\n",
"student_id=\"\" #for example \"kuruzj\"\n",
"student_email=\"\" #for example \"kuruzj@rpi.edu\""
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "wAwXOu5EwrwL",
"colab_type": "code",
"colab": {}
},
"source": [
"#get the data\n",
"!wget https://www.dropbox.com/s/02brm5nrxe63ea9/midterm2.csv\n"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "DlqW9xLjyRhV",
"colab_type": "code",
"colab": {}
},
"source": [
"import pandas as pd\n",
"data = pd.read_csv(\"midterm2.csv\")\n",
"data.head()"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "1rX8YKAq0vfJ",
"colab_type": "text"
},
"source": [
"### (10 points) 1. Predict Group Using Different Sets of IVs \n",
"\n",
"Predict the `group` variable from `v1-v5` and then from `v6-v10` using the k-nearest neighbors algorithm with the default hyperparameters and all of the data (i.e., don't do a train/test split). Assign the resulting accuracies to the variables below. **IGNORE the `target` variable for now.**\n",
"\n",
"`accuracy_v1_5`\n",
"\n",
"`accuracy_v6_10`"
]
},
{
"cell_type": "code",
"metadata": {
"id": "RM2YgR6V1wC2",
"colab_type": "code",
"colab": {}
},
"source": [
""
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "8tRVUsE-go9j",
"colab_type": "code",
"colab": {}
},
"source": [
"accuracy_v1_5 = "
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "dhja_qBGgrBG",
"colab_type": "code",
"colab": {}
},
"source": [
"accuracy_v6_10 = "
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "plPZBhRq-DND",
"colab_type": "text"
},
"source": [
"### (10 points) 2. Null model\n",
"\n",
"What would the accuracy of the null/naive model be? Assign it to `accuracy_null`.\n",
"\n",
"How would you interpret the models for `accuracy_v1_5` and `accuracy_v6_10` versus the null model?\n"
]
},
{
"cell_type": "code",
"metadata": {
"id": "GVxdxFNq8dFN",
"colab_type": "code",
"colab": {}
},
"source": [
"# Enter this as a number rounded to 1 decimal place (i.e., not a string).\n",
"accuracy_null = 1.1  # Included as an example.\n",
"\n",
"accuracy_null"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "d_03eMsF-iNm",
"colab_type": "code",
"colab": {}
},
"source": [
"q1 = \"\"\" \n",
"Answer here. \n",
"\"\"\""
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "zBeAGSgcAVGc",
"colab_type": "text"
},
"source": [
"### (10 points) 3. Perform linear regression using scikit-learn.\n",
"\n",
"Perform two regression analyses. \n",
"\n",
"For `analysis1`, select the independent variables `v1-v10` (all v variables) and `group`, and regress on the dependent variable `target`. Calculate the r2 for the linear regression and assign it to the `r2_analysis1` variable.\n",
"\n",
"For `analysis2`, filter to only include rows where `group` is equal to 1. Select the independent variables `v1-v10` (all v variables) and regress on the dependent variable `target`. Calculate the r2 for the linear regression and assign it to the `r2_analysis2` variable.\n",
"\n",
"Print `r2_analysis1` and `r2_analysis2` to make sure they are set.\n",
"\n",
"\n"
]
},
{
"cell_type": "code",
"metadata": {
"id": "2qhESlamBAtt",
"colab_type": "code",
"colab": {}
},
"source": [
"\n"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "0Akeynz4APl5",
"colab_type": "code",
"colab": {}
},
"source": [
"\n"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "GAj0ruboDRn6",
"colab_type": "code",
"colab": {}
},
"source": [
"#Print r2_analysis1 and r2_analysis2 to make sure they are set.\n",
"print(r2_analysis1, r2_analysis2)"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "tnnT1zAFyoKl",
"colab_type": "text"
},
"source": [
"### (10 points) 4. Train Test Split\n",
"Using `random_state=99`, do a 50/50 train/test split using only variables `v1-v10` for X and `target` for y. Your split should create the following:\n",
"\n",
"`train_X`, `test_X`, `train_y`, `test_y`"
]
},
{
"cell_type": "code",
"metadata": {
"id": "XpIWFGohx2wq",
"colab_type": "code",
"colab": {}
},
"source": [
""
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "x3BdJIB50LEt",
"colab_type": "code",
"colab": {}
},
"source": [
"train_X"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "kJNWb2xqNVg8",
"colab_type": "code",
"colab": {}
},
"source": [
""
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "QHtc9iadNgVF",
"colab_type": "text"
},
"source": [
"### (5 points) 5. Describe why the testing or validation set is important for the assessment of a machine learning model."
]
},
{
"cell_type": "code",
"metadata": {
"id": "BJoQVFlhNdAl",
"colab_type": "code",
"colab": {}
},
"source": [
"q5 = \"\"\" \n",
"Answer here. \n",
"\"\"\""
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "MeQK0zX-8BDH",
"colab_type": "code",
"colab": {}
},
"source": [
""
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "PxhmXmqi8Bk9",
"colab_type": "text"
},
"source": [
"### You must commit your final notebook to GitHub by 5:50 PM. Any commits after that time will not be accepted.\n",
"\n",
"Click [this link](https://classroom.github.com/a/LAzmhlFi) to generate the repository for submission. \n",
"\n"
]
}
]
}