Skip to content

Instantly share code, notes, and snippets.

@jakebrinkmann
Created July 3, 2017 14:19
Show Gist options
  • Save jakebrinkmann/de7fd185efe9a1f459946cf72def057e to your computer and use it in GitHub Desktop.
Save jakebrinkmann/de7fd185efe9a1f459946cf72def057e to your computer and use it in GitHub Desktop.
Read SQL query from psycopg2 into pandas dataframe
import pandas as pd
import pandas.io.sql as sqlio
import psycopg2
conn = psycopg2.connect("host='{}' port={} dbname='{}' user={} password={}".format(host, port, dbname, username, pwd))
sql = "select count(*) from table;"
dat = sqlio.read_sql_query(sql, conn)
conn = None
@cgi1
Copy link

cgi1 commented Aug 18, 2019

Straightforward - Thanks (=

@mmeasic
Copy link

mmeasic commented Sep 2, 2019

I think conn = None will close and clean up the database connection. The same thing can be accomplished using a with block:

import pandas as pd
import psycopg2

with psycopg2.connect("host='{}' port={} dbname='{}' user={} password={}".format(host, port, dbname, username, pwd)) as conn:
    sql = "select count(*) from table;"
    dat = pd.read_sql_query(sql, conn)

However, you can still access the conn object and create cursors from it. In this case, the context manager does not work. Check this:

with pg.connect(host='localhost', port=54320, dbname='ht_db', user='postgres') as connection:
    df_task1 = pd.read_sql_query(query, connection)

cur = connection.cursor()
cur.execute('SELECT COUNT(1) FROM users')
print(cur.rowcount)
1

@paulkernfeld
Copy link

@mmeasic thanks for correcting me! Do you happen to know what this context manager does on exit, if anything? Based on attempting to look at the psycopg2 source code, it appears to commit or roll back.

@mmeasic
Copy link

mmeasic commented Sep 4, 2019

@paulkernfeld No problem! It seems the same thing happens if you try to wrap a cursor, the context manager does not automatically clean up the state of the transaction (commit if success/rollback if exception). I decorate my connect() function with @contextmanager and then do it there, calling close() after yield.

@thomascapote
Copy link

@mmeasic Good catch regarding the behavior of conn in the with block! Would you mind sharing the relevant bits of code when using @contextmanager and connect()? I'm not sure I understand what you mean.

@Lolologist
Copy link

Thank you!

@Shuaib-8
Copy link

Very concise and understandable. Thanks for sharing this standard SQL/pandas workflow!

@denisesenguel
Copy link

this is so helpful, thank you :)

@seanchon
Copy link

🙌

@Bezzmozzg
Copy link

Здоровья тебе и твоим детям

@aaryapatel007
Copy link

Good!

@mezza
Copy link

mezza commented Sep 11, 2020

👏

@adityasatalkar
Copy link

Thanks a ton!

@marselositorus1997
Copy link

nice!!

@Grossmend
Copy link

Thanks!

@chiragsanghvi10
Copy link

How can I handle this in python pandas if my data frame has 4000K rows?

@SwapnilShirke
Copy link

nice , this quick solution and helpful

@naolip
Copy link

naolip commented May 28, 2021

Thx

@Exganza
Copy link

Exganza commented Jul 19, 2021

Very helpful
THANK YOU

@xanderladd
Copy link

thanks 👍

@jackycwwang
Copy link

Thanks for sharing the knowledge!

@Witomi
Copy link

Witomi commented Dec 14, 2021

Thanks a lot!

@evanrrees
Copy link

Perfect, thank you!

@marymlucas
Copy link

Most excellent! Thank you.

@saarthaksangam
Copy link

❤️

@efaraneda
Copy link

beautiful :D

Copy link

ghost commented Aug 24, 2022

this is awesome
Thanks alot.

@moenk
Copy link

moenk commented Dec 5, 2022

The example does not work for me:
UserWarning: pandas only supports SQLAlchemy connectable (engine/connection) or database string URI or sqlite3 DBAPI2 connection. Other DBAPI2 objects are not tested. Please consider using SQLAlchemy. df = pd.read_sql_query(sql, conn)

@aymenkrifa
Copy link

aymenkrifa commented Mar 2, 2023

> The example does not work for me: `UserWarning: pandas only supports SQLAlchemy connectable (engine/connection) or database string URI or sqlite3 DBAPI2 connection. Other DBAPI2 objects are not tested. Please consider using SQLAlchemy. df = pd.read_sql_query(sql, conn)

It should actually work and extract your table though because it's just a UserWarning, a warning that you can ignore (but it's not recommended to ignore it)
In Pandas, Psycopg2 is not tested as the warning implies, and it prefers using SQLAlchemy instead.

You can remake the example above by running the following code:

import pandas as pd
from sqlalchemy import create_engine
from sqlalchemy.sql import text

engine = create_engine(
    "{dialect}+{driver}://{username}:{password}@{host}:{port}/{database}".format(
        dialect="postgresql",
        driver="psycopg2",
        username="john",
        password="12345",
        host="localhost",
        port=5432,
        database="items"
    )
)

with engine.connect() as db_conn:
    sql_query = "SELECT COUNT(*) FROM table_name"
    df = pd.read_sql_query(sql=text(sql_query), con=db_conn)

You can actually create an engine by passing a URL object instead of the string:

from sqlalchemy import create_engine, URL

url_object = URL.create(
    drivername="DIALECT+DRIVER",
    username="username",
    password="password",
    host="host",
    database="database_name",
)

engine = create_engine(url=url_object)

I'm not an expert but I hope it helps!

@dss010101
Copy link

does anyone now if pd.read_sql blocks? im have a multi-thread process that uses it, and it seems to block

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment