Created September 23, 2021 01:09
Warnings:
TOTAL: 5 warning(s)
Priority 1: 1 warning(s)
Priority 2: 4 warning(s)
Priority 1 - heavy impact expected:
* [DUPLICATES - DUPLICATE COLUMNS] Found 1 columns with exactly the same feature values as other columns.
Priority 2 - usage allowed, limited human intelligibility:
* [DATA RELATIONS - HIGH COLLINEARITY - NUMERICAL] Found 3 numerical variables with high Variance Inflation Factor (VIF>5.0). The variables listed in results are highly collinear with other variables in the dataset. These will make model explainability harder and potentially give way to issues like overfitting. Depending on your end goal you might want to remove the highest VIF variables.
* [ERRONEOUS DATA - PREDEFINED ERRONEOUS DATA] Found 1960 ED values in the dataset.
* [DATA RELATIONS - HIGH COLLINEARITY - CATEGORICAL] Found 10 categorical variables with significant collinearity (p-value < 0.05). The variables listed in results are highly collinear with other variables in the dataset and sorted descending according to propensity. These will make model explainability harder and potentially give way to issues like overfitting. Depending on your end goal you might want to remove variables following the provided order.
* [DUPLICATES - EXACT DUPLICATES] Found 3 instances with exact duplicate feature values.
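
For readers unfamiliar with the VIF>5.0 threshold in the numerical collinearity warning, here is a minimal sketch of how a Variance Inflation Factor can be computed. It is not the code that produced the report above. It relies on the standard identity that the VIF of variable j equals the j-th diagonal entry of the inverse of the correlation matrix. The helper names (`pearson`, `invert`, `vif`) and the sample columns `x1`, `x2`, `x3` are hypothetical and chosen only for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment the matrix with the identity: [A | I] -> [I | A^-1].
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: some columns are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif(columns):
    """Return {column name: VIF} for a dict of numerical columns."""
    names = list(columns)
    corr = [[pearson(columns[a], columns[b]) for b in names] for a in names]
    inverse = invert(corr)
    return {name: inverse[i][i] for i, name in enumerate(names)}


# Hypothetical data: x2 is roughly 2 * x1, x3 is unrelated to both.
data = {
    "x1": [1, 2, 3, 4, 5, 6],
    "x2": [2.1, 3.9, 6.2, 7.8, 10.1, 12.0],
    "x3": [5, 1, 4, 2, 6, 3],
}
result = vif(data)
# Apply the same threshold as the warning above (VIF > 5.0).
flagged = [name for name, value in result.items() if value > 5.0]
print(result)
print("High VIF:", flagged)
```

Exactly duplicated columns, like the one in the Priority 1 warning, make the correlation matrix singular, so `invert` raises an error for them. Under that assumption, duplicate columns would need to be dropped before computing VIF this way.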