@Shiina18
Created January 11, 2024 07:20
Merge all CSV files that share the same columns. Usage: `./merge_csv.sh -s source_dir -t target_csv_filepath`. Requires `pv` for the progress bar. The script needs modification if filenames may contain spaces.
#!/bin/bash
# Parse command line arguments using getopts
while getopts ":s:t:" opt; do
    case $opt in
        s)
            source_dir="$OPTARG"
            ;;
        t)
            target_path="$OPTARG"
            ;;
        \?)
            echo "Invalid option: -$OPTARG" >&2
            exit 1
            ;;
        :)
            echo "Option -$OPTARG requires an argument." >&2
            exit 1
            ;;
    esac
done
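# Both options are required; stop early if either is missing
if [ -z "$source_dir" ] || [ -z "$target_path" ]; then
    echo "Usage: $0 -s source_dir -t target_csv_filepath" >&2
    exit 1
fi
# Print source_dir and target_path for debugging
echo "Source Directory: $source_dir"
echo "Target Path: $target_path"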
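# Get all CSV files in the source directory
# (the list is split on whitespace later, so filenames must not contain spaces)
csv_files=$(ls "$source_dir"/*.csv 2>/dev/null)
# Count the number of CSV files
num_files=$(echo $csv_files | wc -w)
# Stop if nothing matched; head and pv below need at least one file
if [ "$num_files" -eq 0 ]; then
    echo "No CSV files found in $source_dir" >&2
    exit 1
fi
# Write the header of the first file to the target file
head -n 1 "$(echo $csv_files | cut -d' ' -f1)" > "$target_path"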
# Append the content of all CSV files to the target file, excluding the header
i=0
for file in $csv_files
do
    tail -n +2 "$file" >> "$target_path"
    i=$((i+1))
    # Emit one line per processed file; pv counts these lines (-l) against num_files
    echo $((i*100/num_files))
done | pv -N "Processing files" -l -s "$num_files" > /dev/null
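
The description notes that the script needs changes when filenames may contain spaces. One possible way to do that, as a minimal sketch rather than the author's method, is to loop over the glob directly instead of capturing `ls` output in a string. The function name `merge_csvs` is hypothetical, the `pv` progress bar is left out for brevity, and skipping a target file that already sits inside the source directory is an added assumption.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original script): merge every *.csv in
# a directory into one file, keeping only the header of the first file.
merge_csvs() {
    local source_dir="$1"
    local target_path="$2"
    local header_written=0
    local file

    # Iterating over the quoted glob keeps each filename intact,
    # even when it contains spaces.
    for file in "$source_dir"/*.csv; do
        # An unmatched glob stays literal, so skip anything that is not a file
        [ -f "$file" ] || continue
        # Assumption: never merge the target into itself
        [ "$file" -ef "$target_path" ] && continue

        if [ "$header_written" -eq 0 ]; then
            # The first file supplies the header
            head -n 1 "$file" > "$target_path"
            header_written=1
        fi
        # Append the data rows, excluding the header
        tail -n +2 "$file" >> "$target_path"
    done

    if [ "$header_written" -eq 0 ]; then
        echo "No CSV files found in $source_dir" >&2
        return 1
    fi
}
```

A call such as `merge_csvs "./my data" merged.csv` would then replace the `ls`-based listing and the loop above. Like the original, this sketch assumes every input file ends with a newline; otherwise the last row of one file runs into the first row of the next.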