@louwersj
Last active January 23, 2023 15:33
# Developed to take a number of .csv files with flight data from flightradar24 and prepare them for use in kepler.gl to
# visualize flight paths over time.
#
# This script uses the Pandas library to manipulate CSV files. It starts by using the glob library to find all CSV files in
# the current directory, and stores them in a list called "csv_files".
#
# It then creates an empty list called "df_list" which will be used to store the modified DataFrames.
#
# A for loop then iterates over each file in the "csv_files" list, using the Pandas function "pd.read_csv()" to read the file
# and store it as a DataFrame. It then calls "df["Position"].str.replace('"','')" to remove any double quotes from the
# "Position" column.
#
# It then runs "df[["Latitude", "Longitude"]] = df["Position"].str.split(",", expand=True)" to split the "Position" column
# into two separate columns, "Latitude" and "Longitude", using "," as the separator.
#
# The modified DataFrame is then appended to the "df_list" list.
#
# After all files have been processed, the script uses the Pandas function "pd.concat(df_list, ignore_index=True)" to
# concatenate all the DataFrames in the "df_list" list into a single DataFrame called "result".
#
# Finally, the script uses the Pandas function "result.to_csv("cleaned_merged_file.csv", index=False)" to save the "result"
# DataFrame as a new CSV file called "cleaned_merged_file.csv", without writing the DataFrame index as a column.
import pandas as pd
import glob
# Find all CSV files in the current directory
csv_files = glob.glob("*.csv")
df_list = []
# Iterate over each file and split the "Position" column into separate "Latitude" and "Longitude" columns
for file in csv_files:
    df = pd.read_csv(file)
    df["Position"] = df["Position"].str.replace('"','')
    df[["Latitude", "Longitude"]] = df["Position"].str.split(",", expand=True)
    df_list.append(df)
# Concatenate all the DataFrames into a single DataFrame
result = pd.concat(df_list, ignore_index=True)
# Write the merged result to a new file called "cleaned_merged_file.csv"
result.to_csv("cleaned_merged_file.csv", index=False)
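One thing to watch when re-running the script above: `glob.glob("*.csv")` will also match `cleaned_merged_file.csv` once it exists in the current directory, so the previous output would be merged back in. The following is a minimal sketch of one way to guard against that, and to store the coordinates as numbers rather than strings. The helper names `input_files` and `split_position` and the sample rows are hypothetical and used only for illustration; they are not part of the original script.

```python
import glob
import io

import pandas as pd

OUTPUT_NAME = "cleaned_merged_file.csv"


def input_files(pattern="*.csv", output_name=OUTPUT_NAME):
    """List matching CSV files, skipping the merged output from an earlier run."""
    return sorted(f for f in glob.glob(pattern) if f != output_name)


def split_position(df):
    """Return a copy of df with numeric Latitude and Longitude columns."""
    out = df.copy()
    # Drop any stray double quotes, then split "lat,lon" into two columns
    position = out["Position"].str.replace('"', "", regex=False)
    parts = position.str.split(",", expand=True)
    # Values that cannot be parsed become NaN instead of raising
    out["Latitude"] = pd.to_numeric(parts[0].str.strip(), errors="coerce")
    out["Longitude"] = pd.to_numeric(parts[1].str.strip(), errors="coerce")
    return out


# Hypothetical sample rows; the column names other than "Position" are assumptions
sample_csv = io.StringIO(
    "Timestamp,Position,Altitude\n"
    '1674485000,"52.3086,4.7639",0\n'
    '1674485060,"52.3501,4.8120",1500\n'
)
sample = split_position(pd.read_csv(sample_csv))
print(sample[["Latitude", "Longitude"]])
```

With these helpers, the loop in the script could iterate over `input_files()` instead of `glob.glob("*.csv")` and append `split_position(df)` to `df_list`; the rest of the script would stay the same.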