Sorted Split - Create train, validation, and test datasets using fast_ml's train_valid_test_split
import pandas as pd
from fast_ml.model_development import train_valid_test_split

# Load the Blue Book for Bulldozers data, parsing the sale date as a datetime
df = pd.read_csv('/kaggle/input/bluebook-for-bulldozers/TrainAndValid.csv',
                 parse_dates=['saledate'], low_memory=False)

# Sorted split on the sale date: 80% train, 10% validation, 10% test
X_train, y_train, X_valid, y_valid, X_test, y_test = train_valid_test_split(
    df, target='SalePrice',
    method='sorted', sort_by_col='saledate',
    train_size=0.8, valid_size=0.1, test_size=0.1)

# Print the shape of each feature/target pair
print(X_train.shape, y_train.shape)
print(X_valid.shape, y_valid.shape)
print(X_test.shape, y_test.shape)
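
For readers without fast_ml installed, the idea of a sorted (chronological) split can be sketched in plain pandas. This is a minimal illustration, not fast_ml's implementation: the helper name `sorted_split` and the toy DataFrame are hypothetical, and details such as rounding of the split boundaries or handling of the sort column may differ from the library.

```python
import pandas as pd

def sorted_split(df, target, sort_by_col, train_size=0.8, valid_size=0.1):
    """Hypothetical helper: order rows by sort_by_col, then cut into
    train / valid / test slices so later rows never leak into earlier sets."""
    ordered = df.sort_values(sort_by_col).reset_index(drop=True)
    n = len(ordered)
    train_end = int(round(n * train_size))
    valid_end = train_end + int(round(n * valid_size))

    X = ordered.drop(columns=[target])  # features
    y = ordered[target]                 # target column
    return (X.iloc[:train_end], y.iloc[:train_end],
            X.iloc[train_end:valid_end], y.iloc[train_end:valid_end],
            X.iloc[valid_end:], y.iloc[valid_end:])

# Toy data (made up for illustration), deliberately in reverse date order
toy = pd.DataFrame({
    "saledate": pd.date_range("2020-01-01", periods=10, freq="D")[::-1],
    "SalePrice": range(10),
})

X_tr, y_tr, X_va, y_va, X_te, y_te = sorted_split(toy, "SalePrice", "saledate")
print(X_tr.shape, X_va.shape, X_te.shape)
```

Because the rows are ordered before slicing, every date in the training set is no later than any date in the validation set, which in turn precedes the test set. That ordering is what makes this kind of split a reasonable fit for time-dependent data such as sale records.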