@kvnkho
Last active August 2, 2022 19:25
Example of Fugue Transform
import pandas as pd
from fugue import transform
from sklearn.preprocessing import minmax_scale

df = pd.DataFrame({"col1": ["A", "A", "A", "B", "B", "B"], "col2": [1, 2, 3, 4, 5, 6]})

def normalize(df: pd.DataFrame) -> pd.DataFrame:
    # scale col2 to the [0, 1] range over whatever rows are passed in
    return df.assign(scaled=minmax_scale(df["col2"]))

# run on Pandas; normalize is applied to each col1 partition separately
pdf = transform(df.copy(), normalize,
                schema="*,scaled:float", partition={"by": "col1"})

# run on Dask; same function and schema, only the engine changes
ddf = transform(df.copy(), normalize,
                schema="*,scaled:float", partition={"by": "col1"}, engine="dask")
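
As a rough cross-check of what the partitioned transform above should produce, here is a plain pandas sketch that computes the same per-group min-max scaling without Fugue or scikit-learn. It assumes that partitioning by col1 means each group is scaled independently; the names lo, hi, and expected are illustrative and not part of the original example.

```python
import pandas as pd

df = pd.DataFrame({"col1": ["A", "A", "A", "B", "B", "B"], "col2": [1, 2, 3, 4, 5, 6]})

# per-group minimum and maximum, broadcast back to the original row order
grouped = df.groupby("col1")["col2"]
lo = grouped.transform("min")
hi = grouped.transform("max")

# min-max scaling within each col1 group
# (a group with a single distinct value would give NaN here, since hi - lo is 0)
expected = df.assign(scaled=(df["col2"] - lo) / (hi - lo))

print(expected["scaled"].tolist())
# → [0.0, 0.5, 1.0, 0.0, 0.5, 1.0]
```

Each group runs from 0 to 1 on its own, which is the difference from calling normalize on the whole frame, where col2 would be scaled across all six rows at once.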