@RolandColored
Last active March 13, 2024 13:18
Transforms dumps created by `pg_restore --data-only -t table db.pg_dump -f table.tsv` into regular TSV files (a header row of column names followed by the data rows) that Spark can read
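#!/bin/bash
# Convert every pg_restore data-only dump in the current directory; results go to ./fixed
mkdir -p fixed
for filename in *.tsv; do
  echo "$filename"
  # Line 23 is expected to hold the COPY statement: turn its "(a, b, c)" column list into a tab-separated header
  tail -n +23 "$filename" | head -n 1 | sed -e 's/.*(\(.*\)).*/\1/' -e 's/, /\t/g' > "fixed/$filename"
  # Data rows start on line 24; drop the 7 trailing non-data lines (negative counts need GNU head)
  tail -n +24 "$filename" | head -n -7 >> "fixed/$filename"
done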
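The script above depends on fixed offsets (header on line 23, 7 trailing lines), which may shift between PostgreSQL versions. As a possible alternative, here is a minimal sketch that locates the header and data by content instead. `convert_dump` is a hypothetical helper, not part of the original gist, and it assumes each file contains a single `COPY ... (col1, col2, ...) FROM stdin;` statement whose data block ends with a line containing only `\.`.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original gist): convert one dump to TSV
# by locating the COPY statement instead of relying on fixed line offsets.
# Assumes a single "COPY ... (col1, col2, ...) FROM stdin;" block ended by "\.".
convert_dump() {
  awk '
    # Header: take the column list between the parentheses of the COPY line.
    !in_data && /^COPY .*\(.*\) FROM stdin;$/ {
      cols = $0
      sub(/^[^(]*\(/, "", cols)      # drop everything up to the opening parenthesis
      sub(/\)[^)]*$/, "", cols)      # drop the closing parenthesis and the rest
      gsub(/, /, "\t", cols)         # separate column names with tabs
      print cols
      in_data = 1
      next
    }
    # The end-of-data marker closes the block; ignore everything after it.
    in_data && $0 == "\\." { exit }
    # Everything between the COPY line and the marker is a data row.
    in_data { print }
  ' "$1"
}
```

It could be used in the same loop as above, for example `mkdir -p fixed; for f in *.tsv; do convert_dump "$f" > "fixed/$f"; done`. Files without a matching COPY line produce empty output, so it is worth checking the results before loading them into Spark.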