@zh3389
Created April 28, 2024 08:36
Data cleaning automation: use the Pandas library to automate complex data processing and cleaning.
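import pandas as pd


def clean_data(file_path):
    """Load a CSV file and apply a few basic cleaning steps."""
    df = pd.read_csv(file_path)
    # Example: handle missing values (numeric columns that receive the
    # 'N/A' placeholder become object dtype)
    df = df.fillna('N/A')
    # Example: remove duplicate rows
    df = df.drop_duplicates()
    # Example: convert a column's type; errors='coerce' turns the 'N/A'
    # placeholder and any unparseable value into NaT instead of raising
    df['date_column'] = pd.to_datetime(df['date_column'], errors='coerce')
    return df


# Usage example:
if __name__ == '__main__':
    cleaned_df = clean_data('data.csv')
    print("Data cleaning complete, ready to go!")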
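The steps above can be tried without a `data.csv` on disk. The sketch below is a minimal, self-contained illustration, not the gist author's code: the helper name `clean_frame`, its parameters, the sample rows, and the `name` column are all assumptions made for the example. It also differs from the gist in one respect: duplicates are dropped and dates are parsed before missing values are filled, so the placeholder never reaches the date column.

```python
import io

import pandas as pd


def clean_frame(df, date_column="date_column", fill_value="N/A"):
    """Hypothetical helper: the same three cleaning steps, applied to a DataFrame."""
    # Drop exact duplicate rows first, while missing values are still NaN.
    df = df.drop_duplicates().copy()
    # Parse dates; missing or unparseable entries become NaT instead of raising.
    df[date_column] = pd.to_datetime(df[date_column], errors="coerce")
    # Fill missing values in every column except the date column.
    other_cols = [c for c in df.columns if c != date_column]
    df[other_cols] = df[other_cols].fillna(fill_value)
    return df.reset_index(drop=True)


# Made-up sample data: one duplicate row, one missing name, one bad date.
sample_csv = io.StringIO(
    "name,date_column\n"
    "alice,2024-04-28\n"
    "alice,2024-04-28\n"
    ",2024-04-29\n"
    "bob,not a date\n"
)

cleaned = clean_frame(pd.read_csv(sample_csv))
print(cleaned)
```

With this sample, the duplicate `alice` row is dropped, the empty name is replaced by the placeholder, and the unparseable date ends up as `NaT`, which can then be filtered with `cleaned["date_column"].isna()`.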