Skip to content

Instantly share code, notes, and snippets.

@kissmygritts
Created May 3, 2022 22:38
Show Gist options
  • Save kissmygritts/637d741354cf10c118c95a1aa42e0dab to your computer and use it in GitHub Desktop.
Save kissmygritts/637d741354cf10c118c95a1aa42e0dab to your computer and use it in GitHub Desktop.
Load a directory of csvs into postgres
#!/bin/sh
dir_path=$1
directory=$(cd "$dir_path" && pwd)
temp_dir=$(mktemp -d)
build_script="$directory/build.sql"
files=$(dir $dir_path)
# delete build script if exists
if [ -f "$build_script" ];
then
rm $build_script
fi
# read csv header and create etl tables
for file in $files
do
tablename=$(basename $file | cut -f 1 -d '_')
echo "Processing table: ${tablename}"
file_path="$directory/$file"
temp_path="$temp_dir/$tablename.csv"
# remove N/A and NULL values from file & save to temp_dir
sed 's/N\/A//g;s/NULL//g' $file_path > $temp_path
head $temp_path |\
csvsql \
-i postgresql \
-y 0 \
-I \
--no-constraints \
--db-schema 'etl' \
--tables $tablename >> $build_script
done
# dropping, then creating etl schema
psql draw_validation -q -c "drop schema if exists etl cascade"
psql draw_validation -q -c "create schema etl"
# running build script based on data in the directory
psql draw_validation -q --single-transaction -f $build_script
# read csv data into etl schema
for file in $files
do
tablename=$(basename $file | cut -f 1 -d '_')
file_path="$temp_dir/$tablename.csv"
echo "Loading table: ${tablename}"
# load files into database from temp_dir
psql draw_validation \
-q -c "\copy etl.${tablename} from '${file_path}' with csv header"
done
# clean up
rm $build_script
rm -rf $temp_dir
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment