@thomasaarholt
Created July 24, 2022 13:06
Train and test split function for polars dataframes
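from typing import Tuple

import polars as pl


def train_test_split(
    df: pl.DataFrame, train_fraction: float = 0.75
) -> Tuple[pl.DataFrame, pl.DataFrame]:
    """Split polars dataframe into two sets.

    Args:
        df (pl.DataFrame): Dataframe to split
        train_fraction (float, optional): Fraction that goes to train. Defaults to 0.75.

    Returns:
        Tuple[pl.DataFrame, pl.DataFrame]: Tuple of train and test dataframes
    """
    # Shuffle every column with the same seed so each column gets the same
    # permutation and rows stay intact. `with_columns` replaces the older
    # `with_column`, which is no longer available in current polars releases.
    df = df.with_columns(pl.all().shuffle(seed=1))
    # The first `split_index` shuffled rows go to train, the rest to test.
    split_index = int(train_fraction * len(df))
    df_train = df[:split_index]
    df_test = df[split_index:]
    return (df_train, df_test)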