When, if ever, should you scale the target in a regression task?
{
"cells": [
{
"cell_type": "markdown",
"id": "ad6b42c5",
"metadata": {},
"source": [
"# Should you scale the target in regression tasks?\n",
"\n",
"It's fairly well known that it's often useful, and sometimes essential, to scale (e.g. normalize or standardize) the features (i.e. `X`) before training a predictive model on them.\n",
"\n",
"The question of when, if ever, one should _also_ scale the target, `y`, sometimes comes up.\n",
"\n",
"I asked ChatGPT for its opinion, and it surprised me a bit because it almost seems to suggest that you should always scale the target:\n",
"\n",
"> Scaling the target variable in a regression task can be beneficial in certain scenarios:\n",
"> \n",
"> 1. When using regularization techniques: If you are using a regression algorithm with regularization, such as Lasso or Ridge regression, scaling the target variable can help ensure that the magnitude of the coefficients and the regularization penalty term are on a similar scale. This can lead to more stable and accurate models.\n",
"> \n",
"> 2. When using neural networks: Neural networks often work better when both input features and target variables are on a similar scale, as it can help with gradient-based optimization and avoid issues like vanishing or exploding gradients. Scaling the target variable in this context can lead to faster convergence and better model performance.\n",
"> \n",
"> 3. When comparing models with different target scales: If you are comparing models across different datasets with varying target variable scales, scaling the target variable can help make performance comparisons more consistent and meaningful.\n",
"> \n",
"> 4. When using gradient-based optimization algorithms: Some machine learning algorithms, like support vector regression (SVR), use gradient-based optimization methods. Scaling the target variable can help improve the convergence of the optimization process.\n",
"> \n",
"> 5. When the target variable has a large range or extreme values: In cases where the target variable has a large range or extreme values, scaling it can help stabilize the learning process and avoid potential issues with numerical stability or model performance.\n",
"> \n",
"> However, it is important to remember that after training the model, you will need to reverse the scaling transformation to interpret the model's predictions in their original scale.\n",
"> \n",
"> In summary, while not always necessary, scaling the target variable in a regression task can be beneficial in specific situations, such as when using regularization, neural networks, or gradient-based optimization algorithms.\n",
"\n",
"And there's [this conversation](https://stats.stackexchange.com/questions/111467/is-it-necessary-to-scale-the-target-value-in-addition-to-scaling-features-for-re) on Cross Validated (the statistics Stack Exchange site), with answers ranging from \"No\" to \"Maybe\" to \"Yes\".\n",
"\n",
"So let's load some data and try some things!\n",
"\n",
"### TLDR\n",
"\n",
"Experimentation suggests that very large or very small target values make SVR and MLPRegressor very unstable and hard to train, even with hyperparameter search. (I probably didn't think hard enough about how to compensate for the target's magnitude; I mostly just played with a couple of arguments.) In each case, standardizing the target solved the problem.\n",
"\n",
"As far as I can tell, scaling the target did not substantially affect the performance of LinearRegression (with or without L2 regularization, see below re L1), KNN regression, SGD regression (surprised me), or RandomForestRegressor.\n",
"\n",
"I could not get Lasso (L1 regularization) to fit a useful model on my data, even when scaling the input and the output. A likely cause is the default `alpha=1`, which may be strong enough to shrink every coefficient to zero; this is unverified on this dataset.\n",
"\n",
"If you're getting unstable results from an SVR or MLP, I think the best strategy may be to just use another algorithm, rather than trying to find the right hyperparameters, or deal with the hassle of scaling the target. If hellbent on using a neural net or SVR, just scale the target."
]
},
{
"cell_type": "markdown",
"id": "a295a42a",
"metadata": {},
"source": [
"## Load some data"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "5150d6bb",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>MD</th>\n",
" <th>GR</th>\n",
" <th>NPOR</th>\n",
" <th>RHOB</th>\n",
" <th>DT4P</th>\n",
" <th>DT4S</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>count</th>\n",
" <td>1196.000000</td>\n",
" <td>1196.000000</td>\n",
" <td>1196.000000</td>\n",
" <td>1196.000000</td>\n",
" <td>1196.000000</td>\n",
" <td>1196.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>mean</th>\n",
" <td>2790.536000</td>\n",
" <td>67.651168</td>\n",
" <td>0.327833</td>\n",
" <td>2492.510252</td>\n",
" <td>279.854858</td>\n",
" <td>462.820733</td>\n",
" </tr>\n",
" <tr>\n",
" <th>std</th>\n",
" <td>345.399768</td>\n",
" <td>16.584102</td>\n",
" <td>0.073247</td>\n",
" <td>113.628417</td>\n",
" <td>25.897177</td>\n",
" <td>396.698555</td>\n",
" </tr>\n",
" <tr>\n",
" <th>min</th>\n",
" <td>2193.036000</td>\n",
" <td>20.533973</td>\n",
" <td>0.063613</td>\n",
" <td>2085.521075</td>\n",
" <td>174.242612</td>\n",
" <td>-3278.379200</td>\n",
" </tr>\n",
" <tr>\n",
" <th>25%</th>\n",
" <td>2491.786000</td>\n",
" <td>55.929214</td>\n",
" <td>0.282957</td>\n",
" <td>2449.629850</td>\n",
" <td>262.940034</td>\n",
" <td>464.979536</td>\n",
" </tr>\n",
" <tr>\n",
" <th>50%</th>\n",
" <td>2790.536000</td>\n",
" <td>72.154001</td>\n",
" <td>0.341788</td>\n",
" <td>2506.717670</td>\n",
" <td>284.703725</td>\n",
" <td>515.470562</td>\n",
" </tr>\n",
" <tr>\n",
" <th>75%</th>\n",
" <td>3089.286000</td>\n",
" <td>80.129819</td>\n",
" <td>0.381460</td>\n",
" <td>2566.703012</td>\n",
" <td>299.693629</td>\n",
" <td>555.194036</td>\n",
" </tr>\n",
" <tr>\n",
" <th>max</th>\n",
" <td>3388.036000</td>\n",
" <td>156.760897</td>\n",
" <td>0.540542</td>\n",
" <td>2948.923828</td>\n",
" <td>342.492104</td>\n",
" <td>590.466475</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" MD GR NPOR RHOB DT4P \\\n",
"count 1196.000000 1196.000000 1196.000000 1196.000000 1196.000000 \n",
"mean 2790.536000 67.651168 0.327833 2492.510252 279.854858 \n",
"std 345.399768 16.584102 0.073247 113.628417 25.897177 \n",
"min 2193.036000 20.533973 0.063613 2085.521075 174.242612 \n",
"25% 2491.786000 55.929214 0.282957 2449.629850 262.940034 \n",
"50% 2790.536000 72.154001 0.341788 2506.717670 284.703725 \n",
"75% 3089.286000 80.129819 0.381460 2566.703012 299.693629 \n",
"max 3388.036000 156.760897 0.540542 2948.923828 342.492104 \n",
"\n",
" DT4S \n",
"count 1196.000000 \n",
"mean 462.820733 \n",
"std 396.698555 \n",
"min -3278.379200 \n",
"25% 464.979536 \n",
"50% 515.470562 \n",
"75% 555.194036 \n",
"max 590.466475 "
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import pandas as pd\n",
"\n",
"df = pd.read_csv('https://geocomp.s3.amazonaws.com/data/MD-GR-NPOR-RHOB-DT4P-DT4S.txt',\n",
" names='MD-GR-NPOR-RHOB-DT4P-DT4S'.split('-'),\n",
" )\n",
"\n",
"df.describe()"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "ed09aa8a",
"metadata": {},
"outputs": [],
"source": [
"# Remove problem samples; copy so the new columns are added to an independent frame.\n",
"df = df[df['DT4S'] > 200].copy()\n",
"\n",
"# Make new targets.\n",
"df['DT4S_scaled'] = df['DT4S'] * 100\n",
"df['DT4S_huge'] = df['DT4S'] * 1e3\n",
"df['DT4S_tiny'] = df['DT4S'] / 1e6"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "804296bf",
"metadata": {},
"outputs": [],
"source": [
"from sklearn.linear_model import LinearRegression, Ridge, SGDRegressor\n",
"from sklearn.svm import SVR\n",
"from sklearn.model_selection import train_test_split\n",
"from sklearn.preprocessing import StandardScaler\n",
"from sklearn.compose import TransformedTargetRegressor\n",
"from sklearn.pipeline import make_pipeline\n",
"\n",
"\n",
"# Make X and y. The target here is the tiny version; swap in 'DT4S',\n",
"# 'DT4S_scaled' or 'DT4S_huge' to try the other magnitudes.\n",
"X = df[['MD', 'GR', 'NPOR', 'RHOB', 'DT4P']].values\n",
"y = df['DT4S_tiny'].values\n",
"\n",
"# Split.\n",
"X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)"
]
},
{
"cell_type": "markdown",
"id": "80e0a31c",
"metadata": {},
"source": [
"## Linear regression\n",
"\n",
"Should not make a difference if I scale the target or not."
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "724bbb33",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0.9477605931787968"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Make and fit pipeline.\n",
"regr = LinearRegression()\n",
"pipe = make_pipeline(StandardScaler(), regr)\n",
"pipe.fit(X_train, y_train)\n",
"pipe.score(X_test, y_test)"
]
},
{
"cell_type": "markdown",
"id": "909653f3",
"metadata": {},
"source": [
"With scaled target:"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "3c1a2cba",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0.9477605931787968"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"regr = TransformedTargetRegressor(LinearRegression(), transformer=StandardScaler())\n",
"pipe = make_pipeline(StandardScaler(), regr)\n",
"pipe.fit(X_train, y_train)\n",
"pipe.score(X_test, y_test)"
]
},
{
"cell_type": "markdown",
"id": "446e4918",
"metadata": {},
"source": [
"Now with regularization, using `Ridge()`. The idea is that if you use regularization then it might be a good idea to scale the target."
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "720397dd",
"metadata": {
"scrolled": false
},
"outputs": [
{
"data": {
"text/plain": [
"0.9478444519961955"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"regr = Ridge()\n",
"pipe = make_pipeline(StandardScaler(), regr)\n",
"pipe.fit(X_train, y_train)\n",
"pipe.score(X_test, y_test)"
]
},
{
"cell_type": "markdown",
"id": "339509ed",
"metadata": {},
"source": [
"With scaled target:"
]
},
{
"cell_type": "code",
"execution_count": 84,
"id": "555606fe",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0.9478444519961955"
]
},
"execution_count": 84,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"regr = TransformedTargetRegressor(Ridge(), transformer=StandardScaler())\n",
"pipe = make_pipeline(StandardScaler(), regr)\n",
"pipe.fit(X_train, y_train)\n",
"pipe.score(X_test, y_test)"
]
},
{
"cell_type": "markdown",
"id": "c3a095ed",
"metadata": {},
"source": [
"Makes no difference in either case, i.e. regularization or not."
]
},
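{
"cell_type": "markdown",
"id": "sketch-linear-equivariance",
"metadata": {},
"source": [
"The result above can be checked on synthetic data. This is a minimal sketch, not part of the original experiment: `X_demo`, `y_demo`, the coefficients and the `1e-6` factor are all made-up assumptions. It compares the predictions of `LinearRegression` and `Ridge` with and without a standardized target, and prints whether they agree to within a relative tolerance."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "sketch-linear-equivariance-code",
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"from sklearn.compose import TransformedTargetRegressor\n",
"from sklearn.linear_model import LinearRegression, Ridge\n",
"from sklearn.preprocessing import StandardScaler\n",
"\n",
"# Synthetic stand-in for the well-log data (assumption: any linear signal will do).\n",
"rng = np.random.default_rng(0)\n",
"X_demo = rng.normal(size=(200, 3))\n",
"signal = X_demo @ np.array([1.0, -2.0, 0.5]) + 3.0\n",
"y_demo = 1e-6 * (signal + 0.1 * rng.normal(size=200))  # Tiny target.\n",
"\n",
"for model in (LinearRegression(), Ridge(alpha=1.0)):\n",
"    # Fit directly on the tiny target.\n",
"    plain = model.fit(X_demo, y_demo).predict(X_demo)\n",
"    # Fit on a standardized target; predictions are mapped back automatically.\n",
"    wrapped = TransformedTargetRegressor(model, transformer=StandardScaler())\n",
"    scaled = wrapped.fit(X_demo, y_demo).predict(X_demo)\n",
"    print(type(model).__name__, np.allclose(plain, scaled, rtol=1e-6, atol=0))"
]
},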
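{
"cell_type": "markdown",
"id": "856402f9",
"metadata": {},
"source": [
"Now with `Lasso()`: first with `alpha=0` (no penalty, which scikit-learn warns against), then with the default `alpha=1`, with and without target scaling."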
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "872327f3",
"metadata": {
"scrolled": false
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/home/data1/bin/miniconda3/envs/redflag/lib/python3.10/site-packages/sklearn/base.py:1151: UserWarning: With alpha=0, this algorithm does not converge well. You are advised to use the LinearRegression estimator\n",
" return fit_method(estimator, *args, **kwargs)\n",
"/home/data1/bin/miniconda3/envs/redflag/lib/python3.10/site-packages/sklearn/linear_model/_coordinate_descent.py:628: UserWarning: Coordinate descent with no regularization may lead to unexpected results and is discouraged.\n",
" model = cd_fast.enet_coordinate_descent(\n",
"/home/data1/bin/miniconda3/envs/redflag/lib/python3.10/site-packages/sklearn/linear_model/_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 1.223e-07, tolerance: 3.498e-10 Linear regression models with null weight for the l1 regularization term are more efficiently fitted using one of the solvers implemented in sklearn.linear_model.Ridge/RidgeCV instead.\n",
" model = cd_fast.enet_coordinate_descent(\n"
]
},
{
"data": {
"text/plain": [
"0.9477605931787968"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from sklearn.linear_model import Lasso\n",
"\n",
"regr = Lasso(alpha=0)\n",
"pipe = make_pipeline(StandardScaler(), regr)\n",
"pipe.fit(X_train, y_train)\n",
"pipe.score(X_test, y_test)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "e3955be9",
"metadata": {
"scrolled": false
},
"outputs": [
{
"data": {
"text/plain": [
"-0.0015762114490540657"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from sklearn.linear_model import Lasso\n",
"\n",
"regr = Lasso()\n",
"pipe = make_pipeline(StandardScaler(), regr)\n",
"pipe.fit(X_train, y_train)\n",
"pipe.score(X_test, y_test)"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "3af3d829",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"-0.0015762114490540657"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"regr = TransformedTargetRegressor(Lasso(), transformer=StandardScaler())\n",
"pipe = make_pipeline(StandardScaler(), regr)\n",
"pipe.fit(X_train, y_train)\n",
"pipe.score(X_test, y_test)"
]
},
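{
"cell_type": "markdown",
"id": "a3b0e018",
"metadata": {},
"source": [
"With the default `alpha=1`, `Lasso` scores close to zero with or without target scaling, and I don't know for sure why. A likely cause is that the penalty is strong enough to shrink every coefficient to zero, leaving a constant prediction, rather than a failure to converge."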
]
},
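{
"cell_type": "markdown",
"id": "sketch-lasso-alpha",
"metadata": {},
"source": [
"One way to probe the `Lasso` scores above is to look at the penalty strength. This is a sketch on synthetic data, so `X_demo`, `y_demo` and the coefficient values are assumptions, and nothing here has been run on the well-log data. In scikit-learn's `Lasso` objective, every coefficient is exactly zero once `alpha` reaches `max(|X.T @ y|) / n_samples` for centered data. With standardized features and a standardized target that threshold is at most 1, so the default `alpha=1` would leave only the intercept."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "sketch-lasso-alpha-code",
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"from sklearn.linear_model import Lasso\n",
"from sklearn.preprocessing import StandardScaler\n",
"\n",
"# Synthetic, standardized features and target (assumed stand-ins).\n",
"rng = np.random.default_rng(0)\n",
"X_demo = StandardScaler().fit_transform(rng.normal(size=(200, 3)))\n",
"y_raw = X_demo @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=200)\n",
"y_demo = (y_raw - y_raw.mean()) / y_raw.std()\n",
"\n",
"# Smallest alpha at which all Lasso coefficients are zero.\n",
"alpha_max = np.abs(X_demo.T @ y_demo).max() / len(y_demo)\n",
"print('alpha_max:', alpha_max)\n",
"\n",
"# Compare the default penalty with a much weaker one.\n",
"for alpha in (1.0, 0.01):\n",
"    lasso = Lasso(alpha=alpha).fit(X_demo, y_demo)\n",
"    print(alpha, lasso.coef_, lasso.score(X_demo, y_demo))"
]
},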
{
"cell_type": "markdown",
"id": "aba26a60",
"metadata": {},
"source": [
"## Random forest"
]
},
{
"cell_type": "markdown",
"id": "0dc49d83",
"metadata": {},
"source": [
"Now with `RandomForestRegressor()`"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "736788f6",
"metadata": {
"scrolled": false
},
"outputs": [
{
"data": {
"text/plain": [
"0.9595482554402164"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from sklearn.ensemble import RandomForestRegressor\n",
"\n",
"regr = RandomForestRegressor()\n",
"pipe = make_pipeline(StandardScaler(), regr)\n",
"pipe.fit(X_train, y_train)\n",
"pipe.score(X_test, y_test)"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "03cfc73d",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0.9622020937029279"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"regr = TransformedTargetRegressor(RandomForestRegressor(), transformer=StandardScaler())\n",
"pipe = make_pipeline(StandardScaler(), regr)\n",
"pipe.fit(X_train, y_train)\n",
"pipe.score(X_test, y_test)"
]
},
{
"cell_type": "markdown",
"id": "dc829c35",
"metadata": {},
"source": [
"## Stochastic gradient descent"
]
},
{
"cell_type": "markdown",
"id": "ffe90c6c",
"metadata": {},
"source": [
"And with SGD: the tiny target does not affect it, but the huge target is bad."
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "42971ceb",
"metadata": {
"scrolled": false
},
"outputs": [
{
"data": {
"text/plain": [
"0.9465742697231702"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"regr = SGDRegressor(penalty=None) # No regularization.\n",
"pipe = make_pipeline(StandardScaler(), regr)\n",
"pipe.fit(X_train, y_train)\n",
"pipe.score(X_test, y_test)"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "dbb9d3c5",
"metadata": {
"scrolled": false
},
"outputs": [
{
"data": {
"text/plain": [
"0.9450423441129018"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"regr = SGDRegressor(alpha=0.01)\n",
"pipe = make_pipeline(StandardScaler(), regr)\n",
"pipe.fit(X_train, y_train)\n",
"pipe.score(X_test, y_test)"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "3512acfd",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0.9482629278774858"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"regr = TransformedTargetRegressor(SGDRegressor(alpha=0.01), transformer=StandardScaler())\n",
"pipe = make_pipeline(StandardScaler(), regr)\n",
"pipe.fit(X_train, y_train)\n",
"pipe.score(X_test, y_test)"
]
},
{
"cell_type": "markdown",
"id": "cf34c5d4",
"metadata": {},
"source": [
"## Neural network"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "128c00c2",
"metadata": {
"scrolled": false
},
"outputs": [
{
"data": {
"text/plain": [
"-170807.59684778232"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from sklearn.neural_network import MLPRegressor\n",
"\n",
"regr = MLPRegressor(alpha=0, max_iter=1000)\n",
"pipe = make_pipeline(StandardScaler(), regr)\n",
"pipe.fit(X_train, y_train)\n",
"pipe.score(X_test, y_test)"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "06b282cf",
"metadata": {
"scrolled": false
},
"outputs": [
{
"data": {
"text/plain": [
"-138922.25067943643"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"regr = MLPRegressor(alpha=0.0001, max_iter=1000)\n",
"pipe = make_pipeline(StandardScaler(), regr)\n",
"pipe.fit(X_train, y_train)\n",
"pipe.score(X_test, y_test)"
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "b485269c",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0.9600540359490983"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"regr = TransformedTargetRegressor(MLPRegressor(), transformer=StandardScaler())\n",
"pipe = make_pipeline(StandardScaler(), regr)\n",
"pipe.fit(X_train, y_train)\n",
"pipe.score(X_test, y_test)"
]
},
{
"cell_type": "markdown",
"id": "711233a7",
"metadata": {},
"source": [
"💥 So this one does blow up without target scaling; standardizing the target fixes it."
]
},
{
"cell_type": "markdown",
"id": "b3849134",
"metadata": {},
"source": [
"## KNN"
]
},
{
"cell_type": "code",
"execution_count": 18,
"id": "1c89038b",
"metadata": {
"scrolled": false
},
"outputs": [
{
"data": {
"text/plain": [
"0.9248696037011152"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from sklearn.neighbors import KNeighborsRegressor\n",
"\n",
"regr = KNeighborsRegressor(n_neighbors=50)\n",
"pipe = make_pipeline(StandardScaler(), regr)\n",
"pipe.fit(X_train, y_train)\n",
"pipe.score(X_test, y_test)"
]
},
{
"cell_type": "code",
"execution_count": 19,
"id": "fbf8e2aa",
"metadata": {
"scrolled": false
},
"outputs": [
{
"data": {
"text/plain": [
"0.9507415190568468"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"regr = KNeighborsRegressor()\n",
"pipe = make_pipeline(StandardScaler(), regr)\n",
"pipe.fit(X_train, y_train)\n",
"pipe.score(X_test, y_test)"
]
},
{
"cell_type": "code",
"execution_count": 20,
"id": "ae1e64f3",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0.9507415190568468"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"regr = TransformedTargetRegressor(KNeighborsRegressor(), transformer=StandardScaler())\n",
"pipe = make_pipeline(StandardScaler(), regr)\n",
"pipe.fit(X_train, y_train)\n",
"pipe.score(X_test, y_test)"
]
},
{
"cell_type": "markdown",
"id": "778f4100",
"metadata": {},
"source": [
"## Support vector machine"
]
},
{
"cell_type": "markdown",
"id": "6b326e64",
"metadata": {},
"source": [
"And with SVR: all of the rescaled targets are affected."
]
},
{
"cell_type": "code",
"execution_count": 21,
"id": "bb08e0fe",
"metadata": {
"scrolled": false
},
"outputs": [
{
"data": {
"text/plain": [
"-1.2176570605683525"
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"regr = SVR(C=1e12) # Almost no regularization.\n",
"pipe = make_pipeline(StandardScaler(), regr)\n",
"pipe.fit(X_train, y_train)\n",
"pipe.score(X_test, y_test)"
]
},
{
"cell_type": "code",
"execution_count": 22,
"id": "f8a68873",
"metadata": {
"scrolled": false
},
"outputs": [
{
"data": {
"text/plain": [
"-1.2176570605683525"
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"regr = SVR(C=1)\n",
"pipe = make_pipeline(StandardScaler(), regr)\n",
"pipe.fit(X_train, y_train)\n",
"pipe.score(X_test, y_test)"
]
},
{
"cell_type": "code",
"execution_count": 23,
"id": "c60d29fc",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0.9628423809808271"
]
},
"execution_count": 23,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"regr = TransformedTargetRegressor(SVR(C=1), transformer=StandardScaler())\n",
"pipe = make_pipeline(StandardScaler(), regr)\n",
"pipe.fit(X_train, y_train)\n",
"pipe.score(X_test, y_test)"
]
},
{
"cell_type": "markdown",
"id": "6fad9d3d",
"metadata": {},
"source": [
"## Question: can you always rescue things with (say) `GridSearchCV`?\n",
"\n",
"**Short answer: Maybe, if you know which parameters to change and really check all the cases (especially near the edges of the grid!). But for SVR at least, tiny values seem to be very difficult to compensate for.**"
]
},
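{
"cell_type": "markdown",
"id": "sketch-svr-epsilon",
"metadata": {},
"source": [
"Some context for why `epsilon` is a natural parameter to search. `SVR` ignores residuals smaller than `epsilon`, and the default is `epsilon=0.1`, which is expressed in the units of the target. The sketch below uses synthetic data (`X_demo`, `y_unit`, `y_small` and the `1e-6` factor are assumptions, not the well-log data) and prints, for a unit-scale target and a tiny one, the number of support vectors and the spread of the predictions."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "sketch-svr-epsilon-code",
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"from sklearn.svm import SVR\n",
"\n",
"# Synthetic data: the same signal at two target magnitudes (assumed values).\n",
"rng = np.random.default_rng(0)\n",
"X_demo = rng.normal(size=(200, 3))\n",
"y_unit = X_demo @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=200)\n",
"y_small = y_unit * 1e-6\n",
"\n",
"for label, y_demo in (('unit-scale target', y_unit), ('tiny target', y_small)):\n",
"    svr = SVR(C=1).fit(X_demo, y_demo)  # Default epsilon=0.1.\n",
"    pred = svr.predict(X_demo)\n",
"    # Few or no support vectors and zero spread mean a constant prediction.\n",
"    print(label, len(svr.support_), np.ptp(pred))"
]
},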
{
"cell_type": "code",
"execution_count": 24,
"id": "045ec64f",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"-1.2176570605683525"
]
},
"execution_count": 24,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"regr = SVR(C=1)\n",
"pipe = make_pipeline(StandardScaler(), regr)\n",
"pipe.fit(X_train, y_train)\n",
"pipe.score(X_test, y_test)"
]
},
{
"cell_type": "code",
"execution_count": 26,
"id": "00a81294",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Fitting 6 folds for each of 13 candidates, totalling 78 fits\n"
]
},
{
"data": {
"text/plain": [
"GridSearchCV(cv=6,\n",
" estimator=Pipeline(steps=[('standardscaler', StandardScaler()),\n",
" ('svr', SVR(C=1))]),\n",
" param_grid={'svr__epsilon': array([1.e-06, 1.e-05, 1.e-04, 1.e-03, 1.e-02, 1.e-01, 1.e+00, 1.e+01,\n",
" 1.e+02, 1.e+03, 1.e+04, 1.e+05, 1.e+06])},\n",
" verbose=1)"
]
},
"execution_count": 26,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from sklearn.model_selection import GridSearchCV\n",
"import numpy as np\n",
"\n",
"\n",
"grid = GridSearchCV(pipe, param_grid={'svr__epsilon': np.logspace(-6, 6, 13)}, cv=6, verbose=1)\n",
"grid.fit(X_train, y_train)"
]
},
{
"cell_type": "code",
"execution_count": 27,
"id": "e5675260",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Fitting 6 folds for each of 169 candidates, totalling 1014 fits\n"
]
},
{
"data": {
"text/html": [
"<pre>GridSearchCV(cv=6,\n",
" estimator=Pipeline(steps=[(&#x27;standardscaler&#x27;, StandardScaler()),\n",
" (&#x27;svr&#x27;, SVR(C=1))]),\n",
" param_grid={&#x27;svr__C&#x27;: array([1.e-06, 1.e-05, 1.e-04, 1.e-03, 1.e-02, 1.e-01, 1.e+00, 1.e+01,\n",
" 1.e+02, 1.e+03, 1.e+04, 1.e+05, 1.e+06]),\n",
" &#x27;svr__epsilon&#x27;: array([1.e-06, 1.e-05, 1.e-04, 1.e-03, 1.e-02, 1.e-01, 1.e+00, 1.e+01,\n",
" 1.e+02, 1.e+03, 1.e+04, 1.e+05, 1.e+06])},\n",
" verbose=1)</pre>"
],
"text/plain": [
"GridSearchCV(cv=6,\n",
" estimator=Pipeline(steps=[('standardscaler', StandardScaler()),\n",
" ('svr', SVR(C=1))]),\n",
" param_grid={'svr__C': array([1.e-06, 1.e-05, 1.e-04, 1.e-03, 1.e-02, 1.e-01, 1.e+00, 1.e+01,\n",
" 1.e+02, 1.e+03, 1.e+04, 1.e+05, 1.e+06]),\n",
" 'svr__epsilon': array([1.e-06, 1.e-05, 1.e-04, 1.e-03, 1.e-02, 1.e-01, 1.e+00, 1.e+01,\n",
" 1.e+02, 1.e+03, 1.e+04, 1.e+05, 1.e+06])},\n",
" verbose=1)"
]
},
"execution_count": 27,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from sklearn.model_selection import GridSearchCV\n",
"\n",
"# Search C and epsilon over 13 decades each (1e-6 to 1e6), with 6-fold cross-validation.\n",
"grid = GridSearchCV(pipe, param_grid={'svr__C': np.logspace(-6, 6, 13),\n",
" 'svr__epsilon': np.logspace(-6, 6, 13)},\n",
" cv=6, verbose=1)\n",
"grid.fit(X_train, y_train)"
]
},
{
"cell_type": "code",
"execution_count": 28,
"id": "5f4aea17",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'svr__C': 1e-06, 'svr__epsilon': 10000.0}"
]
},
"execution_count": 28,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"grid.best_params_"
]
},
{
"cell_type": "code",
"execution_count": 29,
"id": "4eced1fa",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"-1.2176570469087928"
]
},
"execution_count": 29,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"grid.score(X_test, y_test)"
]
},
{
"cell_type": "code",
"execution_count": 30,
"id": "fadda0f9",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"-1.2176570605683525"
]
},
"execution_count": 30,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Refit with a much larger C (weaker regularization) to see whether the score improves.\n",
"regr = SVR(C=1e6)\n",
"pipe = make_pipeline(StandardScaler(), regr)\n",
"pipe.fit(X_train, y_train)\n",
"pipe.score(X_test, y_test)"
]
},
{
"cell_type": "code",
"execution_count": 31,
"id": "01db85d4",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"-1.2176570605683525"
]
},
"execution_count": 31,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Push C higher still, to 1e9, and score on the test set again.\n",
"regr = SVR(C=1e9)\n",
"pipe = make_pipeline(StandardScaler(), regr)\n",
"pipe.fit(X_train, y_train)\n",
"pipe.score(X_test, y_test)"
]
},
{
"cell_type": "markdown",
"id": "c2c3e477",
"metadata": {},
"source": [
"Conclusion: it's probably better to just use another algorithm than to try to rescue stability from SVR or MLP."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "redflag",
"language": "python",
"name": "redflag"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.8"
}
},
"nbformat": 4,
"nbformat_minor": 5
}