Downsample large data file
import pandas as pd

# 1. Read raw CSV
df = pd.read_csv("raw_data.csv")

# 2. Convert the timestamp column to datetime (adjust column name/format as necessary)
df["timestamp"] = pd.to_datetime(df["timestamp"])

# 3. Make the timestamp the index
df.set_index("timestamp", inplace=True)

# 4. Resample to 10-second intervals and compute mean, min, max for each numeric column
#    (non-numeric columns are dropped first so the aggregation does not fail)
df_agg = df.select_dtypes(include="number").resample("10s").agg(["mean", "min", "max"])

# 5. Optional: flatten the multi-level column names produced by agg()
df_agg.columns = ["_".join(col).strip() for col in df_agg.columns.values]

# 6. Save the aggregated data to a new CSV
df_agg.to_csv("aggregated_data_10s.csv")
print("Done! Aggregated file saved as aggregated_data_10s.csv")