Last active
January 23, 2023 15:33
# Takes a number of .csv files with flight data from flightradar24 and prepares them for use in kepler.gl to
# visualize flight paths over time.
#
# The script uses the pandas library to manipulate the CSV files. It starts by using the glob module to find all CSV
# files in the current directory and stores their names in a list called "csv_files".
#
# It then creates an empty list called "df_list", which is used to collect the modified DataFrames.
#
# A for loop iterates over each file in "csv_files" and reads it into a DataFrame with "pd.read_csv()". The string
# method "df["Position"].str.replace()" then removes any double quotes from the "Position" column.
#
# Next, "df["Position"].str.split()" with "," as the separator and expand=True splits the "Position" column into two
# new columns, "Latitude" and "Longitude". The original "Position" column is kept.
#
# The modified DataFrame is then appended to "df_list".
#
# After all files have been processed, "pd.concat(df_list, ignore_index=True)" concatenates the DataFrames in
# "df_list" into a single DataFrame called "result" with a fresh, continuous row index.
#
# Finally, "result.to_csv("cleaned_merged_file.csv", index=False)" saves "result" as a new CSV file called
# "cleaned_merged_file.csv". Passing index=False leaves the row index out of the file.
import glob

import pandas as pd

# Find all CSV files in the current directory. The output file of an earlier run is skipped,
# otherwise re-running the script would merge its own previous result back in.
csv_files = [f for f in glob.glob("*.csv") if f != "cleaned_merged_file.csv"]
df_list = []

# Iterate over each file and split the "Position" column into separate "Latitude" and "Longitude" columns
for file in csv_files:
    df = pd.read_csv(file)
    # Remove stray double quotes; regex=False treats the quote as a literal character
    df["Position"] = df["Position"].str.replace('"', "", regex=False)
    # n=1 splits on the first comma only, so at most two columns are produced
    df[["Latitude", "Longitude"]] = df["Position"].str.split(",", n=1, expand=True)
    df_list.append(df)

# Concatenate all the DataFrames into a single DataFrame
result = pd.concat(df_list, ignore_index=True)

# Write the merged data to "cleaned_merged_file.csv" without the index column
result.to_csv("cleaned_merged_file.csv", index=False)
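
The split step in the script leaves "Latitude" and "Longitude" as text columns. The sketch below is one possible variation, not part of the original script: it wraps the same quote removal and split in a helper and converts both columns to numbers. The function name `split_position`, the "Callsign" column, and the sample coordinates are hypothetical and only serve the illustration; the only column name taken from the script is "Position".

```python
import pandas as pd


def split_position(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with numeric "Latitude" and "Longitude" columns.

    Hypothetical helper: assumes a "Position" column holding "lat,lon" text,
    optionally wrapped in double quotes, as in the script above.
    """
    out = df.copy()
    # Remove literal double quotes, then split on the first comma only
    position = out["Position"].str.replace('"', "", regex=False)
    parts = position.str.split(",", n=1, expand=True)
    # errors="coerce" turns values that cannot be parsed into NaN instead of raising
    out["Latitude"] = pd.to_numeric(parts[0], errors="coerce")
    out["Longitude"] = pd.to_numeric(parts[1], errors="coerce")
    return out


# Made-up sample rows standing in for one input file
sample = pd.DataFrame(
    {
        "Callsign": ["ABC123", "ABC123"],
        "Position": ["52.3086,4.7639", '"52.4001,4.9012"'],
    }
)

cleaned = split_position(sample)
print(cleaned[["Latitude", "Longitude"]])
```

With numeric columns, rows whose coordinates could not be parsed show up as NaN and can be located with `cleaned["Latitude"].isna()` before the merged file is written.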