@BastinRobin
Last active March 3, 2021 05:28
Regression Analysis Lab

Regression Analysis

This exercise is designed to make you confident in building multiple regression models in a spreadsheet.

This dataset contains information collected by the U.S. Census Service concerning housing in the area of Boston, Mass. It was obtained from the StatLib archive and has been used extensively throughout the literature to benchmark algorithms. However, these comparisons were primarily done outside of Delve and are thus somewhat suspect. The dataset is small, with only 506 cases.

  • Download the dataset using this link: https://gofile.io/d/YZUprY
  • Use MS Excel to build a multiple regression model.
  • Audit the regression summary and eliminate features that are not significant.
  • Predict the house price MEDV.
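The fit, audit, and predict steps above can also be sketched outside the spreadsheet. The following is a minimal, standard-library-only Python sketch, not the lab's required method: it fits ordinary least squares, computes a t-statistic per coefficient (the "t Stat" column in Excel's regression summary), and repeatedly drops the weakest feature. The helper names (`solve`, `ols`, `backward_eliminate`, `predict`), the cutoff `t_min=2.0` (a rough stand-in for a 5% significance threshold) and the synthetic data are all assumptions made for illustration. The data is randomly generated and is not the Boston dataset; the column labels are borrowed only for readability.

```python
import math
import random

def solve(a, b):
    """Solve the linear system a @ x = b by Gauss-Jordan elimination."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: bring the largest entry in this column to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(X, y):
    """Fit y on X plus an intercept; return (coefficients, t-statistics)."""
    rows = [[1.0] + list(r) for r in X]
    n, k = len(rows), len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    resid = [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(rows, y)]
    sigma2 = sum(e * e for e in resid) / (n - k)  # residual variance
    t_stats = []
    for i in range(k):
        unit = [1.0 if j == i else 0.0 for j in range(k)]
        var_i = sigma2 * solve(xtx, unit)[i]  # i-th diagonal of sigma2 * (X'X)^-1
        t_stats.append(beta[i] / math.sqrt(var_i))
    return beta, t_stats

def backward_eliminate(X, y, names, t_min=2.0):
    """Refit, dropping the feature with the smallest |t|, until all reach t_min."""
    keep = list(range(len(names)))
    while True:
        beta, t_stats = ols([[row[c] for c in keep] for row in X], y)
        if not keep:
            break
        # Index 0 of t_stats is the intercept, which is always kept.
        weakest = min(range(len(keep)), key=lambda i: abs(t_stats[i + 1]))
        if abs(t_stats[weakest + 1]) >= t_min:
            break
        del keep[weakest]
    return [names[c] for c in keep], beta

def predict(beta, features):
    """Predicted response for one row of the retained features."""
    return beta[0] + sum(b * v for b, v in zip(beta[1:], features))

# Synthetic illustration only: two informative columns and one unrelated column.
random.seed(0)
names = ["RM", "LSTAT", "NOISE"]
X = [[random.uniform(4, 8), random.uniform(2, 30), random.gauss(0, 1)]
     for _ in range(200)]
y = [5 + 4 * rm - 0.6 * lstat + random.gauss(0, 1) for rm, lstat, _ in X]

kept, beta = backward_eliminate(X, y, names)
print("kept features:", kept)
print("coefficients (intercept first):", [round(b, 3) for b in beta])
```

In Excel the same audit is usually done by reading the P-value column of the regression summary, removing the least significant feature, and rerunning the regression until every remaining feature is significant.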

Data Description

CRIM: per capita crime rate by town
ZN: proportion of residential land zoned for lots over 25,000 sq.ft.
INDUS: proportion of non-retail business acres per town
CHAS: Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
NOX: nitric oxides concentration (parts per 10 million)
RM: average number of rooms per dwelling
AGE: proportion of owner-occupied units built prior to 1940
DIS: weighted distances to five Boston employment centres
RAD: index of accessibility to radial highways
TAX: full-value property-tax rate per $10,000
PTRATIO: pupil-teacher ratio by town
B: 1000(Bk - 0.63)^2 where Bk is the proportion of blacks by town
LSTAT: % lower status of the population
MEDV: Median value of owner-occupied homes in $1000's
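
With the variables above, the full model that the exercise starts from can be written in the standard multiple linear regression form. The coefficient symbols \beta_j and the error term \varepsilon are notation introduced here, not taken from the dataset description:

```latex
\mathrm{MEDV} = \beta_0 + \beta_1\,\mathrm{CRIM} + \beta_2\,\mathrm{ZN} + \beta_3\,\mathrm{INDUS}
  + \beta_4\,\mathrm{CHAS} + \beta_5\,\mathrm{NOX} + \beta_6\,\mathrm{RM} + \beta_7\,\mathrm{AGE}
  + \beta_8\,\mathrm{DIS} + \beta_9\,\mathrm{RAD} + \beta_{10}\,\mathrm{TAX}
  + \beta_{11}\,\mathrm{PTRATIO} + \beta_{12}\,\mathrm{B} + \beta_{13}\,\mathrm{LSTAT} + \varepsilon
```

Eliminating a feature that is not significant amounts to dropping its term from this equation and refitting the remaining coefficients.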
