@mkaranasou
Last active March 21, 2020 13:01
PySpark read from database (Postgres)
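from pyspark.sql import SparkSession

# The PostgreSQL JDBC driver jar must be on Spark's classpath (e.g. via --jars or spark.jars).
spark = SparkSession.builder.appName('read_from_postgres').getOrCreate()

user = 'postgres'
password = 'secret'
db_driver = 'org.postgresql.Driver'
host = '127.0.0.1'
# Credentials are passed as options below, so they are not repeated in the URL.
db_url = f'jdbc:postgresql://{host}:5432/dbname'
df = spark.read.format(
    'jdbc'
).options(
    url=db_url,
    driver=db_driver,
    dbtable='table_name',
    user=user,
    password=password,
    fetchsize=1000,  # optional: rows fetched per round trip; a larger value means fewer round trips
).load()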
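
The read above goes through a single JDBC connection, and `fetchsize` only changes how many rows come back per round trip. For larger tables, Spark's JDBC source can also split the read across several connections using the `partitionColumn`, `lowerBound`, `upperBound` and `numPartitions` options. Below is a minimal sketch of how those options might be assembled. `build_jdbc_options` is a hypothetical helper, not part of PySpark, and the `id` column and its bounds are assumptions for illustration only.

```python
def build_jdbc_options(url, table, user, password,
                       driver='org.postgresql.Driver', fetchsize=1000,
                       partition_column=None, lower_bound=None,
                       upper_bound=None, num_partitions=None):
    """Hypothetical helper: build the options dict for spark.read.format('jdbc')."""
    options = {
        'url': url,
        'driver': driver,
        'dbtable': table,
        'user': user,
        'password': password,
        'fetchsize': str(fetchsize),
    }
    partitioning = (partition_column, lower_bound, upper_bound, num_partitions)
    if any(value is not None for value in partitioning):
        # Spark expects the partitioning options to be given together.
        if any(value is None for value in partitioning):
            raise ValueError(
                'partition_column, lower_bound, upper_bound and '
                'num_partitions must all be set together'
            )
        options.update({
            'partitionColumn': partition_column,
            'lowerBound': str(lower_bound),
            'upperBound': str(upper_bound),
            'numPartitions': str(num_partitions),
        })
    return options


opts = build_jdbc_options(
    url='jdbc:postgresql://127.0.0.1:5432/dbname',
    table='table_name',
    user='postgres',
    password='secret',
    partition_column='id',  # assumption: a numeric column named "id" exists
    lower_bound=1,          # assumption: illustrative bounds, not real data
    upper_bound=1000000,
    num_partitions=8,
)

# With a live SparkSession and the PostgreSQL driver on the classpath:
# df = spark.read.format('jdbc').options(**opts).load()
```

The bounds only determine how the column range is divided into partitions. They do not filter rows, so values outside the range still end up in the first or last partition.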