To measure the drift of each feature, we first convert each feature to a probability distribution using:
- Kernel density estimation (KDE) for continuous features
- Frequency counts (one bin per category) for categorical features
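The two conversions above can be sketched in plain Python. This is a minimal illustration, not a definitive implementation: the function names (`shared_grid`, `kde_on_grid`, `category_frequencies`), the Gaussian kernel, the Scott's-rule bandwidth, and the 50-point grid are all assumptions that do not come from the notes above. In practice a library routine such as `scipy.stats.gaussian_kde` would likely replace the hand-written KDE.

```python
import math
from collections import Counter
from statistics import stdev


def shared_grid(a, b, points=50):
    """Evenly spaced grid covering the range of both samples (hypothetical helper)."""
    lo, hi = min(min(a), min(b)), max(max(a), max(b))
    step = (hi - lo) / (points - 1)
    return [lo + i * step for i in range(points)]


def kde_on_grid(values, grid, bandwidth=None):
    """Evaluate a Gaussian KDE of `values` on `grid`, normalized to sum to 1."""
    n = len(values)
    if bandwidth is None:
        # Assumed default: Scott's rule for 1-D data, h = sigma * n^(-1/5).
        # This fails for constant columns (sigma = 0); pass a bandwidth then.
        bandwidth = stdev(values) * n ** (-1 / 5)
    norm = n * bandwidth * math.sqrt(2 * math.pi)
    density = [
        sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values) / norm
        for x in grid
    ]
    total = sum(density)
    # Normalize so the grid values form a discrete probability distribution.
    return [d / total for d in density]


def category_frequencies(values, categories):
    """Relative frequency of each category, in the order given by `categories`."""
    counts = Counter(values)
    return [counts[c] / len(values) for c in categories]


# Illustrative data only: one continuous and one categorical column.
train_age = [23.0, 31.0, 35.0, 42.0, 47.0, 51.0]
log_age = [29.0, 38.0, 44.0, 55.0, 61.0, 66.0]
grid = shared_grid(train_age, log_age)
train_age_dist = kde_on_grid(train_age, grid)
log_age_dist = kde_on_grid(log_age, grid)

plans = ["basic", "free", "pro"]
train_plan_dist = category_frequencies(["free", "free", "pro", "basic"], plans)
log_plan_dist = category_frequencies(["pro", "pro", "free", "pro"], plans)
```

Evaluating both KDEs on one shared grid, and both frequency vectors over one shared category list, keeps the train and log distributions aligned bin by bin, which any later bin-wise comparison relies on.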
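With a distribution for each training column and for the matching column of the uploaded log, we can compare them column by column using KL divergence. Both distributions must be defined over the same support (the same grid points or the same list of categories) for the comparison to be meaningful.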
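- KL divergence measures how much one distribution differs from another; it is non-negative, zero only when the two are identical, and not symmetric, so the direction of comparison should be kept fixed
- Higher KL divergence = more drift
- Use this metric to classify drift, for example by comparing it against a chosen threshold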
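As a rough sketch of the comparison and classification steps, the snippet below computes the discrete KL divergence D_KL(P || Q) = sum_i p_i * log(p_i / q_i) and maps it to a drift label. Several things here are assumptions rather than part of the notes above: treating the training distribution as P and the log distribution as Q, the `eps` smoothing that avoids division by zero on empty bins, the function names, and the threshold values 0.1 and 0.5, which are placeholders that would need tuning on real data.

```python
import math


def kl_divergence(p, q, eps=1e-9):
    """D_KL(P || Q) in nats for two discrete distributions over the same bins."""
    if len(p) != len(q):
        raise ValueError("p and q must be defined over the same bins")
    # Assumed smoothing: add eps to every bin so empty bins do not produce
    # log(0) or division by zero, then renormalize each distribution.
    p = [x + eps for x in p]
    q = [x + eps for x in q]
    sum_p, sum_q = sum(p), sum(q)
    return sum(
        (a / sum_p) * math.log((a / sum_p) / (b / sum_q))
        for a, b in zip(p, q)
    )


def classify_drift(kl, low=0.1, high=0.5):
    """Map a KL value to a label; the thresholds are hypothetical placeholders."""
    if kl < low:
        return "none"
    if kl < high:
        return "moderate"
    return "severe"


# Illustrative distributions over the same two bins.
train_dist = [0.9, 0.1]  # P: assumed to be the training column
log_dist = [0.5, 0.5]    # Q: assumed to be the uploaded log column
label = classify_drift(kl_divergence(train_dist, log_dist))
```

Because KL divergence is asymmetric, swapping `train_dist` and `log_dist` gives a different value, so whichever direction is chosen should be used consistently across all columns.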