@toandaominh1997
Created January 24, 2024 08:13
import pandas as pd


def split_dataframe(df, chunk_size, file_name="shopee"):
    """
    Splits a DataFrame into smaller chunks of a specified size and saves each chunk as a separate CSV file.

    Parameters:
    - df (pandas.DataFrame): The DataFrame to be split.
    - chunk_size (int): The maximum number of rows per chunk; the last chunk may contain fewer.
    - file_name (str, optional): The base name for the output CSV files. Defaults to "shopee".

    The function does not return any value. It saves the chunks directly to the current working directory.

    Example Usage:
        import pandas as pd
        df = pd.DataFrame(...)
        # Split 'df' into chunks of 100 rows each, and save them as 'shopee_1.csv', 'shopee_2.csv', etc.
        split_dataframe(df, 100, "shopee")
    """
    # enumerate() supplies the 0-based chunk index used to number the output files
    for idx, start in enumerate(range(0, len(df), chunk_size)):
        # .iloc slices by row position, regardless of the DataFrame's index labels
        chunk = df.iloc[start:start + chunk_size]
        chunk.to_csv(f"./{file_name}_{idx + 1}.csv", index=False)


# Pass df as dataframe, chunk_size, file_name (df must already be defined as a DataFrame)
split_dataframe(df, chunk_size=60000, file_name="shopee")
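To see which rows land in which file before writing anything, the chunk arithmetic can be sketched without pandas. The helper name `chunk_bounds` and the 150,000-row count are assumptions made for illustration; they are not part of the gist.

```python
import math


def chunk_bounds(n_rows, chunk_size):
    """Return (file_number, start, stop) per chunk; stop is exclusive, files are numbered from 1."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be a positive integer")
    return [
        (idx + 1, start, min(start + chunk_size, n_rows))
        for idx, start in enumerate(range(0, n_rows, chunk_size))
    ]


# Hypothetical input: 150,000 rows split with the gist's chunk_size of 60,000
bounds = chunk_bounds(150_000, 60_000)
for file_number, start, stop in bounds:
    print(f"shopee_{file_number}.csv: rows {start}..{stop - 1}")

# The number of files equals ceil(n_rows / chunk_size)
assert len(bounds) == math.ceil(150_000 / 60_000)
```

With these assumed numbers the split yields three files, and the last one holds the remaining 30,000 rows rather than a full chunk.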