@rvaidya
rvaidya / database_to_parquet.py
Last active November 5, 2023 11:15
Dump a database table to a Parquet file using SQLAlchemy and fastparquet. Useful for loading large tables into pandas or Dask, since read_sql_table will hammer the server with queries if the number of partitions/chunks is high. With this approach you write a temporary Parquet file, then use read_parquet to load the data into a DataFrame.
import pandas as pd
import numpy as np
import fastparquet
from sqlalchemy import create_engine, schema, Table
# Copied from pandas with modifications
def __get_dtype(column, sqltype):
    import sqlalchemy.dialects as sqld
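The preview above cuts off before the dump logic, so the following is only a minimal sketch of the general idea in the description, not the gist's actual code: run one query, stream the result in batches, and append each batch to a single Parquet file. It assumes a plain DB-API connection (the standard-library `sqlite3` module stands in for a SQLAlchemy connection here), and the names `iter_chunks` and `dump_query_to_parquet` are hypothetical. Column dtypes are left to pandas' inference, whereas the gist appears to map SQL types explicitly.

```python
import sqlite3
from typing import Iterator, List, Tuple


def iter_chunks(conn, query: str, chunksize: int) -> Iterator[Tuple[List[str], list]]:
    """Execute `query` once and yield (column_names, rows) in batches.

    Only one query hits the server; later batches are fetched from the
    same open cursor with fetchmany().
    """
    cur = conn.cursor()
    cur.execute(query)
    columns = [desc[0] for desc in cur.description]
    while True:
        rows = cur.fetchmany(chunksize)
        if not rows:
            break
        yield columns, rows


def dump_query_to_parquet(conn, query: str, path: str, chunksize: int = 100_000) -> int:
    """Write the result of `query` to a Parquet file at `path`, batch by batch.

    Hypothetical helper. pandas and fastparquet are imported lazily so that
    iter_chunks() stays usable without them. Returns the number of rows written.
    """
    import pandas as pd
    import fastparquet

    total = 0
    first = True
    for columns, rows in iter_chunks(conn, query, chunksize):
        frame = pd.DataFrame.from_records(rows, columns=columns)
        # The first batch creates the file; later batches are appended to it.
        fastparquet.write(path, frame, append=not first, write_index=False)
        first = False
        total += len(frame)
    return total
```

Once the file is written, `pandas.read_parquet(path)` (or the Dask equivalent) can load it without touching the database again. Note that appending assumes every batch infers the same column dtypes; a column that is entirely NULL in one batch may need an explicit dtype.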
@topheman
topheman / git-notes.md
Created June 29, 2015 17:39
Git notes cheat sheet