@csbailey5t
Last active September 28, 2022 13:19
Code for Conditional Data Gen in 5 lines
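# Imports are not shown in the original gist; these are assumed
# (pandas, plus the `trainer` module from the gretel-trainer package).
import pandas as pd
from gretel_trainer import trainer

# Sample health dataset hosted by Gretel.
DATASET_PATH = 'https://gretel-public-website.s3.amazonaws.com/datasets/mitre-synthea-health.csv'

# Train a model that can later be conditioned on the seed fields.
model = trainer.Trainer()
model.train(DATASET_PATH, seed_fields=["RACE", "ETHNICITY", "GENDER"])

# Seed rows: 4 black/african/F and 5 asian/chinese/F.
seed_df = pd.DataFrame(data=[
    ["black", "african", "F"],
    ["black", "african", "F"],
    ["black", "african", "F"],
    ["black", "african", "F"],
    ["asian", "chinese", "F"],
    ["asian", "chinese", "F"],
    ["asian", "chinese", "F"],
    ["asian", "chinese", "F"],
    ["asian", "chinese", "F"],
], columns=["RACE", "ETHNICITY", "GENDER"])

# Generate synthetic records conditioned on the seed rows.
model.generate(seed_df=seed_df)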
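The seed rows above repeat the same value combination several times, so they can also be built from a table of counts. The sketch below is one possible way to do that in plain Python. The helper `build_seed_rows` and the `SEED_FIELDS` constant are hypothetical names introduced for illustration; they are not part of the gist or of any Gretel API.

```python
from collections import Counter

# Hypothetical constant: mirrors the seed_fields used when training.
SEED_FIELDS = ["RACE", "ETHNICITY", "GENDER"]


def build_seed_rows(counts):
    """Expand {(race, ethnicity, gender): n} into n repeated rows per combination.

    Hypothetical helper for illustration only.
    """
    rows = []
    for combo, n in counts.items():
        if len(combo) != len(SEED_FIELDS):
            raise ValueError(f"expected {len(SEED_FIELDS)} values, got {combo!r}")
        # Repeat each combination n times, one list per seed row.
        rows.extend([list(combo) for _ in range(n)])
    return rows


seed_rows = build_seed_rows({
    ("black", "african", "F"): 4,
    ("asian", "chinese", "F"): 5,
})

# Sanity check: tally how many rows each combination contributes.
tally = Counter(tuple(row) for row in seed_rows)
```

The resulting rows can be wrapped as pd.DataFrame(seed_rows, columns=SEED_FIELDS) to give a frame with the same contents as seed_df. Counting the rows before generation is a cheap way to confirm the requested mix of combinations.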