@C-EB
Created February 6, 2025 16:01
tokenization application
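# Assumes `dataset` is a DatasetDict with a 'train' split and that
# `preprocess_function` is defined elsewhere.
tokenized_datasets = dataset.map(
    preprocess_function,
    batched=True,  # pass whole batches (dict of lists) to preprocess_function
    remove_columns=dataset['train'].column_names,  # drop the original columns, keep only the new ones
)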
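
The snippet above does not show `preprocess_function`, so here is a minimal sketch of the shape such a function could take when used with `batched=True`. It is not the author's implementation: the toy `VOCAB`, the whitespace splitting, and the `"text"` column name are all assumptions made for illustration, and a real pipeline would normally call a trained tokenizer instead.

```python
from typing import Dict, List

# Hypothetical toy vocabulary (assumption); a real pipeline would use a trained tokenizer.
VOCAB = {"<unk>": 0, "hello": 1, "world": 2}


def preprocess_function(batch: Dict[str, List[str]]) -> Dict[str, List[List[int]]]:
    """Turn a batch of raw strings into lists of token ids.

    In batched mode the function receives a dict mapping each column name
    to a list of values, and returns a dict mapping each new column name
    to a list with one entry per example.
    """
    input_ids = [
        [VOCAB.get(token, VOCAB["<unk>"]) for token in text.lower().split()]
        for text in batch["text"]  # "text" is an assumed column name
    ]
    return {"input_ids": input_ids}


# Call the function directly on a hand-built batch to check its output shape.
batch = {"text": ["Hello world", "hello there"]}
encoded = preprocess_function(batch)
print(encoded)
```

Because `remove_columns` drops every original column, only the keys returned by the function (here `input_ids`) would remain in the mapped dataset.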