Skip to content

Instantly share code, notes, and snippets.

@sanp
Last active February 21, 2017 21:43
Show Gist options
  • Save sanp/e77be072c7f515a8051fbbe0e4e06e8c to your computer and use it in GitHub Desktop.
Save sanp/e77be072c7f515a8051fbbe0e4e06e8c to your computer and use it in GitHub Desktop.
Lightening Talk for Centro Tech Team on 2/15/17

SSH Tunneling

Problem

You want to query a DB and get a result set, but you don't have access to that DB directly from your localhost.

  • Bad solution A: Cry
  • Bad solution B: Ask someone who does have access to run your query for you
  • Bad solution C: ssh into a box that has access, then psql into the DB and run your query (then scp your result set from the remote box to your localhost)
  • (Probably least) Bad solution D: Connect to a VPN (if one exists) to access the DB directly from your machine

Good solution

SSH tunnelling. It's as easy as:

  1. Create a tunnel
    • Forward connections from your local port to the remote
  2. psql into the remote database, but use localhost as the host rather than the DB's host
  3. Have a beer!

Create a tunnel in one terminal:

ssh -L <local port>:<remote private ip>:<remote port> <ssh user>@<remote public host name>

Then query your db in another terminal:

psql -h localhost -p <remote port> -d <db name> -U <db user> -c "<some query>"

Note: If you already have postgres running on your local machine, you may need to stop your postgres server for this to work.

Two practical examples

The Security/Firewall Use Case

"I want to get data from the DW production slave db, but I don't have any fancy VPN access."

If I run this in my local terminal:

psql -h 10.0.2.203 -d dm_production -U dm_production

It times out with a cannot connect error message.

psql: could not connect to server: Operation timed out
        Is the server running on host "10.0.2.203" and accepting
        TCP/IP connections on port 5432?

We can tunnel in through, e.g., cmm-staging:

ssh -L 5432:10.0.2.203:5432 centro@staging01.cmm.ourcentro.net

This says: Forward connections from local port 5432 to 10.0.2.203:5432

Then (in another terminal):

psql -h localhost -d dm_production -U dm_production -c "select campaign_id from centro_direct.fact_monthly_flight_contract_delivery limit 2;"
Password for user dm_production:
             campaign_id
--------------------------------------
 02767f6b-b72c-4edc-a01d-92c7d1541b8b
 07b6bf45-51be-474c-ba55-96dc2e3a0b23
(2 rows)

No fuss access!

Note: If you want to add to your .pgpass the JDBC connection that you're accessing via an ssh tunnel, you must also use localhost in place of the db host in the connection string:

localhost:5432:dm_production:dm_production:<password>

The Data Size Use Case

"OK, but I can ssh into a box that has access, run my query, and then scp the result to my local, so why not do that?"

BC scping large files is the worst...

Instead, tunnel through, e.g., cmm-staging again (that box seems to have access to a lot of things...):

ssh -L 5432:10.0.2.203:5432 centro@staging01.cmm.ourcentro.net

Then:

psql -h localhost -d dm_production -U datasci -F ',' -A < some_large_query.sql > output.csv

Note: You can also use ssh tunnelling to access a remote DB through a GUI postgres tool. When you make your DB connection, simply replace the host IP with localhost. Some - like Postico - even have options to connect via ssh

More Info

This talk focused on one use of ssh tunnelling: accessing a remote databse. There are plenty of other uses. For more info, see extra reading.

@ddrscott
Copy link

psql -h localhost -d dm_production -U datasci -F ',' -A < some_large_query.sql > output.csv

can also be done as:

ssh centro@staging01.cmm.ourcentro.net "psql -U datasci -F ',' -A" < some_large_query.sql > output.csv

In this way, you'd utilize the remote psql client. This is also nice when you don't have a client installed locally.

@sanp
Copy link
Author

sanp commented Feb 21, 2017

@ddrscott That's a nice solution as well! We talked about this offline, but to follow up for posterity: One limitation to this approach is that it doesn't seem to work if the DB your accessing doesn't have an entry in the .pgpass for the box you're tunneling through. In this case, for example, cmm-staging doesn't have an entry for the dm_production DB. So when you psql from that box, it prompts you for a password. Since the command is then redirected (< some_large_query.sql), you can't enter in the password, and the command fails. Maybe you could get around this with some clever piping or something - but for this case, I like ssh tunneling.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment