@bedohazizsolt
Last active October 4, 2022 16:18
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "L_NT2cWGTvgo"
},
"source": [
"# Linear regression\n",
"\n",
"### 1. Load the provided .npy files. You can load it with numpy.\n",
"\n",
"* each file contains one vector, X and y\n",
"* visualize X vs y on a scatter plot\n",
"* fit an y=w_0 + w_1⋅X + w_2⋅X^2 linear regression using `sklearn`\n",
"\n",
"### 2. Using different features\n",
"\n",
"* plot the residuals (the difference between the prediction and the actual y ) vs the original y \n",
"* a non-random-noise like pattern suggests non-linear connection between the features and the predictions\n",
"* someone told us that the connection between X and y is y=A⋅X+B⋅cos^3(X)+C⋅X^2+D \n",
" * using sklearn's linear regression estimate A,B,C,D !\n",
"* plot the residuals again! is it better now?\n",
"\n",
"### 3. Other methdods than sklearn for linear regression\n",
"\n",
"* using the statsmodels package perform the same linear regression as in 2.) (hint: use statsmodels.api.OLS)\n",
"* is the result the same? if not guess, why? (did you not forget to add the constant term?)\n",
"* try to get the same results with statsmodels as with sklearn!\n",
"* using the analytic solution formula shown during the lecture, calculate the coefficients (A, B, C, D). are they the same compared to the two previous methods?\n",
"\n",
"### 4.\n",
"\n",
"* load the [real_estate](https://gist.github.com/qbeer/f356d7144543cbb09c9792c34b8ad722) data to a pandas dataframe\n",
"drop the ID column and the geographic location columns\n",
"fit a linear regression model to predict the unit price using sklearn\n",
"* interpret the coefficients and their meaning shortly with your own words\n",
"* plot the residuals for the predictions. if you had to decide only on this information, which house would you buy?\n",
"\n",
"### 5.\n",
"* Using the same dataset from task 4) compute the parameters of the multivariate regression model via gradient descent.\n",
"* Compare the calculated parameters with the ones obtained in task 4) via sklearn. Is there any difference? If so give your explanation.\n",
"\n",
"Hint: you can use a function to calculate the loss and a function to perform the gradient descent to learn the parameters. Example:\n",
"\n",
"```python\n",
"def comp_cost(X, y, theta):\n",
" \"\"\"Compute cost given X, y and parameters theta.\"\"\"\n",
" .\n",
" .\n",
" .\n",
" return J\n",
"```\n",
"\n",
"```python\n",
"def grad_descent(X, y, theta, alpha, num_iters):\n",
" \"\"\"Perform gradient descent\"\"\"\n",
" .\n",
" .\n",
" . \n",
" return J_history, theta\n",
"```\n",
"\n",
"---\n",
"\n",
"## Hints:\n",
"\n",
"* On total you can get 10 points for fully completing all tasks.\n",
"* Decorate your notebook with, questions, explanation etc, make it self contained and understandable!\n",
"* Comments you code when necessary\n",
"* Write functions for repetitive tasks!\n",
"* Use the pandas package for data loading and handling\n",
"* Use matplotlib and seaborn for plotting or bokeh and plotly for interactive investigation\n",
"* Use the scikit learn package for almost everything\n",
"* Use for loops only if it is really necessary!\n",
"* Code sharing is not allowed between student! Sharing code will result in zero points.\n",
"* If you use code found on web, it is OK, but, make its source clear!"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"colab": {
"authorship_tag": "ABX9TyO+SGQqiJxE7tqyElTiYeKt",
"collapsed_sections": [],
"include_colab_link": true,
"name": "HW_4.ipynb",
"provenance": []
},
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.10"
}
},
"nbformat": 4,
"nbformat_minor": 1
}