
| Model | Boundary | Needs Scaling | Probabilities | Strengths | Weak Spots |
| --- | --- | --- | --- | --- | --- |
| Logistic Reg | Linear | | ✅ (softmax) | Fast, interpretable, calibrated | Nonlinear patterns |
| SVM (Linear) | Linear max-margin | | ⚠️ (enable) | Robust separator, few params | Overlap regions |
| SVM (RBF) | Nonlinear | | ⚠️ | Powerful on small data | Tune C, gamma |
| KNN | Nonlinear (data-driven) | | | Simple, local structure | Slow predict, noisy, scales matter |
| Decision Tree | Nonlinear | | ✅ (leaf freq) | Explainable | |

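
| ID | Outlook | Temperature | Play Tennis |
| --- | --- | --- | --- |
| 1 | Sunny | Hot | No |
| 2 | Sunny | Hot | No |
| 3 | Overcast | Hot | Yes |
| 4 | Rain | Mild | Yes |
| 5 | Rain | Cool | Yes |
| 6 | Rain | Hot | No |
| 7 | Overcast | Cool | Yes |
| 8 | Sunny | Mild | No |
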
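
Assuming the Play Tennis table is meant as a small decision-tree example, the sketch below shows one common way a tree could choose its first split: compute the information gain (entropy reduction) for each feature and pick the larger. The function names `entropy` and `information_gain` are illustrative and not taken from the source, and real libraries may use other criteria such as Gini impurity.

```python
from collections import Counter
from math import log2

# Rows copied from the table above: (Outlook, Temperature, Play Tennis)
ROWS = [
    ("Sunny", "Hot", "No"),
    ("Sunny", "Hot", "No"),
    ("Overcast", "Hot", "Yes"),
    ("Rain", "Mild", "Yes"),
    ("Rain", "Cool", "Yes"),
    ("Rain", "Hot", "No"),
    ("Overcast", "Cool", "Yes"),
    ("Sunny", "Mild", "No"),
]


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, feature_index):
    """Label entropy minus the size-weighted entropy after splitting on one feature."""
    labels = [row[-1] for row in rows]
    groups = {}
    for row in rows:
        groups.setdefault(row[feature_index], []).append(row[-1])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


gain_outlook = information_gain(ROWS, 0)
gain_temperature = information_gain(ROWS, 1)
print(f"Outlook: {gain_outlook:.3f} bits, Temperature: {gain_temperature:.3f} bits")
```

On these eight rows, Outlook yields roughly 0.656 bits of gain against roughly 0.344 for Temperature, so an entropy-based tree would split on Outlook first.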

| Kernel | Best For | Parameters to Tune | Risk |
| --- | --- | --- | --- |
| Linear | Linearly separable data | None | Poor for complex data |
| Polynomial | Structured patterns | c, d | Overfitting with high d |
| RBF | Complex, non-linear data | γ | Sensitive to γ |
| Sigmoid | Neural network-like problems | α | Can be unstable |

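
To make the parameters in the kernel table concrete, here is a minimal sketch of the four kernels in plain Python. The formulas are the usual textbook forms and are an assumption, since the table names only the parameters (c, d, γ, α); the default values and the sigmoid offset `c` are illustrative choices, not recommendations.

```python
from math import exp, tanh


def dot(x, y):
    """Dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x, y):
    # No parameters to tune.
    return dot(x, y)


def polynomial_kernel(x, y, c=1.0, d=3):
    # c shifts the dot product, d is the polynomial degree.
    return (dot(x, y) + c) ** d


def rbf_kernel(x, y, gamma=0.5):
    # gamma controls how quickly similarity decays with squared distance.
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return exp(-gamma * sq_dist)


def sigmoid_kernel(x, y, alpha=0.1, c=0.0):
    # alpha scales the dot product before the tanh squashing.
    return tanh(alpha * dot(x, y) + c)


x, y = (1.0, 2.0), (3.0, 4.0)
print("linear:", linear_kernel(x, y))
print("polynomial:", polynomial_kernel(x, y))
print("sigmoid:", sigmoid_kernel(x, y))

# The same pair of points looks similar or dissimilar depending on gamma.
for gamma in (0.01, 0.5, 10.0):
    print(f"rbf (gamma={gamma}):", rbf_kernel(x, y, gamma=gamma))
```

The loop at the end illustrates the "Sensitive to γ" entry: a small γ makes distant points look similar, while a large γ drives the kernel value toward zero for anything but near-identical points.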

| Model Family | Scale? | Notes |
| --- | --- | --- |
| KNN / K-Means | Yes | Distances must be comparable across features |
| SVM (linear/RBF) | Yes | Margin and RBF kernel are sensitive to scale |
| Logistic/Linear (with penalties) | Yes | Coefficients and penalties become comparable |
| Neural Nets | Yes | Helps optimization and stability |
| Trees / Random Forest / Gradient-boosted decision trees | No | Split thresholds are rank-based |
| Naive Bayes | No | Usually fine unscaled (Gaussian NB is fine either way) |

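
The note that distances must be comparable across features can be illustrated with a small nearest-neighbor sketch. It borrows the min and max of `wheel-base` and `length` from the summary statistics elsewhere in this document; the three cars themselves are hypothetical values made up for the illustration, and the helper names are not from the source.

```python
from math import sqrt

# (min, max) per feature, taken from the summary statistics in this document.
RANGES = {"wheel-base": (86.6, 120.9), "length": (0.678039, 1.0)}

# Hypothetical cars as (wheel-base, length); invented for illustration only.
query = (95.0, 0.70)
candidates = {"A": (96.0, 0.98), "B": (99.0, 0.71)}


def euclidean(a, b):
    """Plain Euclidean distance between two points."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def min_max_scale(point):
    """Map each feature to [0, 1] using the ranges above."""
    bounds = (RANGES["wheel-base"], RANGES["length"])
    return tuple((v - lo) / (hi - lo) for v, (lo, hi) in zip(point, bounds))


def nearest(query_point, points, transform=lambda p: p):
    """Name of the candidate closest to the query after applying transform."""
    q = transform(query_point)
    return min(points, key=lambda name: euclidean(q, transform(points[name])))


nearest_raw = nearest(query, candidates)
nearest_scaled = nearest(query, candidates, min_max_scale)
print("nearest without scaling:", nearest_raw)
print("nearest with min-max scaling:", nearest_scaled)
```

Without scaling, wheel-base dominates the distance and car A wins even though its length is very different. After scaling both features to [0, 1], car B becomes the nearest neighbor.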

| Scaler Type | Outlier Robust? | Distribution Goal | Range/Bias | Sparse-safe |
| --- | --- | --- | --- | --- |
| StandardScaler | | ~Gaussian (z-scores) | Mean 0, Var 1 | ⚠️ (with_mean=False for sparse) |
| MinMaxScaler | | Preserve shape | [0,1] (or custom) | |
| RobustScaler | | Median/IQR-based | No fixed range | ⚠️ |
| MaxAbsScaler | | Preserve signs | [-1,1] | |
| PowerTransformer | ⚠️ | More Gaussian (de-skew) | No fixed range | |
| QuantileTransformer | | Uniform/Normal via ranks | [0,1] or Normal | |
| Normalizer (row-wise) | | | | |


| | count | mean | std | min | 25% | 50% | 75% | max | skew | kurtosis |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| symboling | 201.0 | 0.840796 | 1.254802 | -2.000000 | 0.000000 | 1.000000 | 2.000000 | 3.000000 | 0.197370 | -0.707178 |
| normalized-losses | 201.0 | 122.000000 | 31.996250 | 65.000000 | 101.000000 | 122.000000 | 137.000000 | 256.000000 | 0.846546 | 1.319068 |
| wheel-base | 201.0 | 98.797015 | 6.066366 | 86.600000 | 94.500000 | 97.000000 | 102.400000 | 120.900000 | 1.031261 | 0.948445 |
| length | 201.0 | 0.837102 | 0.059213 | 0.678039 | 0.801538 | 0.832292 | 0.881788 | 1.000000 | 0.154446 | -0.065192 |
| width | 201.0 | 0.915126 | 0.029187 | 0.837500 | 0.890278 | 0.909722 | 0.925000 | 1.000000 | 0.875029 | 0.678655 |

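
As a rough illustration of how the linear scalers differ, the sketch below applies the standard textbook formulas to a single `wheel-base` value, using the summary statistics from the `wheel-base` row above. It is hand-rolled arithmetic, not the scikit-learn implementation. It also assumes the `std` column is a sample standard deviation (the pandas default), whereas scikit-learn's `StandardScaler` uses the population standard deviation, so results would differ slightly.

```python
# Summary statistics for wheel-base, copied from the table above.
WB = {
    "mean": 98.797015,
    "std": 6.066366,   # assumed to be the sample std (ddof=1)
    "min": 86.6,
    "q25": 94.5,
    "median": 97.0,
    "q75": 102.4,
    "max": 120.9,
}


def standard_scale(x, s=WB):
    """z-score: center on the mean, divide by the standard deviation."""
    return (x - s["mean"]) / s["std"]


def min_max_scale(x, s=WB):
    """Map the observed [min, max] range onto [0, 1]."""
    return (x - s["min"]) / (s["max"] - s["min"])


def robust_scale(x, s=WB):
    """Center on the median, divide by the interquartile range."""
    return (x - s["median"]) / (s["q75"] - s["q25"])


def max_abs_scale(x, s=WB):
    """Divide by the largest absolute value so results fall in [-1, 1]."""
    return x / max(abs(s["min"]), abs(s["max"]))


x = 88.6  # wheel-base value from the first data row in this document
for name, fn in [
    ("standard", standard_scale),
    ("min-max", min_max_scale),
    ("robust", robust_scale),
    ("max-abs", max_abs_scale),
]:
    print(f"{name:>8}: {fn(x):.4f}")
```

Only the mean/std and min/max variants depend on quantities that a single extreme value can shift substantially. The median and IQR used by the robust variant move much less, which is the point of the "Median/IQR-based" entry in the scaler table.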
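
| symboling | normalized-losses | make | aspiration | num-of-doors | body-style | drive-wheels | engine-location | wheel-base | length | ... | compression-ratio | horsepower | peak-rpm | city-mpg | highway-mpg | price | city-L/100km | horsepower-binned | diesel | gas |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 3 | 122 | alfa-romero | std | two | convertible | rwd | front | 88.6 | 0.81115 | ... | 9.0 | 111.0 | 5000.0 | 21 | 27 | 13495.0 | 11.190476 | Medium | 0 | 1 |
| 3 | 122 | alfa-romero | std | two | convertible | rwd | front | 88.6 | 0.81115 | ... | 9.0 | 111.0 | | | | | | | | |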