Skip to content

Instantly share code, notes, and snippets.

@runekaagaard
Last active September 18, 2018 14:09
Show Gist options
  • Save runekaagaard/7924963 to your computer and use it in GitHub Desktop.
Save runekaagaard/7924963 to your computer and use it in GitHub Desktop.
Bash scripts for extracting individual databases from a sql file dumped using pg_dumpall

Postgresql Xtra commands

  • pgx_list_dbs: Lists the names of databases in an .sql file dumped using pg_dumpall.
  • pgx_extract_db: Extracts a single database from a sql file dumped with pg_dumpall and outputs its content to stdout.

Installation

Download the (re)installer by doing:

sudo wget -O /usr/local/bin/pgx_installer \
     https://gist.github.com/runekaagaard/7924963/raw/pgx_installer \
     && sudo chmod +x /usr/local/bin/pgx_installer

Then run it with:

pgx_installer

Acknowledgements

Based on:

http://madssj.com/blog/2010/04/09/extracting-a-single-database-from-a-pg_dumpall-postgresql-dump/
#!/bin/bash
if [ $# -ne 2 ]; then
cat << EOF
usage: $0 [FILE]... [DBNAME]...
Extracts a single database from a sql file dumped with pg_dumpall and outputs
its content to stdout.
EOF
exit 14
fi
db_file="$1"
db_name="$2"
if [ ! -f "$db_file" -o ! -r "$db_file" ]; then
echo "error: $db_file not found or not readable" >&2
exit 14
fi
while read line; do
bytes=$(echo $line | cut -d: -f1)
if [ -z "$start_point" ]; then
start_point=$bytes
else
end_point=$bytes
fi
done < <(grep -b '^\\\connect' "$db_file" | grep -m 1 -A 1 "$db_name$")
if [ -n "$start_point" -a -z "$end_point" ]; then
end_point=`wc -c < $db_file`
fi
if [ -z "$start_point" -o -z "$end_point" ]; then
echo "error: start or end not found" >&2
exit 14
fi
db_length=$(($end_point - $start_point))
tail -c +$start_point $db_file | head -c $db_length | tail -n +3 \
| grep -v "OWNER TO $db_name" | grep -v -E "^(REVOKE|GRANT|--)"
#!/bin/bash
files=(pgx_extract_db pgx_list_dbs)
deleted_files=(pgx_dumped_dbs)
for file in "${deleted_files[@]}"
do
rm -f /usr/local/bin/$file
done
for file in "${files[@]}"
do
rm -f /usr/local/bin/$file
sudo wget -O /usr/local/bin/$file https://gist.github.com/runekaagaard/7924963/raw/$file
sudo chmod +x /usr/local/bin/$file
done
sudo wget -O /usr/local/bin/pgx_installer https://gist.github.com/runekaagaard/7924963/raw/pgx_installer
sudo chmod +x /usr/local/bin/pgx_installer
#!/bin/bash
if [ $# -ne 1 ]; then
cat << EOF
usage: $0 [FILE]...
Lists the names of databases in an .sql file dumped using pg_dumpall.
EOF
exit 14
fi
db_file="$1"
echo "The databases dumped in the file $db_file are:"
grep '^\\\connect' "$db_file" | sed 's/\\connect //g' | \
grep -vE "postgres|template[0-9]" | sort -u | uniq
@zstar69
Copy link

zstar69 commented Sep 25, 2014

Your code seems to have some errors:

root@db-1 backup]# sh extract_dbs.sh test123db testing.db
extract_dbs.sh: line 26: syntax error near unexpected token <' extract_dbs.sh: line 26: done < <(grep -b '^\\\connect' "$db_file" | grep -m 1 -A 1 "$db_name$")'

@srstsavage
Copy link

Works for me (Debian 7, zsh). Thanks!

@ceztko
Copy link

ceztko commented Jan 20, 2015

@zstar69 Have you tried just chmoding +x the script and running it? The chosen interpreter is /bin/bash, not a default (legacy) interpreter /bin/sh can point to.

@Abdull
Copy link

Abdull commented Jun 23, 2016

I had problems running the following command:

pgx_extract_db big-dump-file.sql my-database >my-database.sql

I got the following error message: "error: start or end not found". Notice the database name my-database. It includes a dash character.

Iff a database name includes dash characters, pg_dumpall will generate a \connect statement with its database name parameter in double quotes, e.g. the full dump won't include the statement \connect my-database, but \connect "my-database".

The solution / workaround then is to run pg_extract_db with the database name in (shell-escaped) double quotes:

pgx_extract_db big-dump-file.sql \"my-database\" >my-database.sql

@Abdull
Copy link

Abdull commented May 12, 2017

I ran into a new problem: Since some time, pg_dumps may add a -reuse-previous=on|off and "dbname='my-database'" to \connect statements, e.g.: \connect -reuse-previous=on "dbname='my-database'".

Notice the surrounding double quotations marks in parameter "dbname=bla". They let pgx_extract_db's end-of-line match "$db_name$" in done < <(grep -b '^\\\connect' "$db_file" | grep -m 1 -A 1 "$db_name$") fail.

My woraround is to change line
done < <(grep -b '^\\\connect' "$db_file" | grep -m 1 -A 1 "$db_name$")
to
done < <(grep -b '^\\\connect' "$db_file" | grep -m 1 -A 1 "$db_name")
(notice the removed dollar sign at the end)

For \connect statements that use "dbname='my-database'", my previous comment becomes invalid. Instead, use the usual statement pgx_extract_db big-dump-file.sql my-database >my-database.sql (ie., no double quote escapes).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment