Skip to content

Instantly share code, notes, and snippets.

@mmatkinson
Created November 4, 2016 13:39
Show Gist options
  • Save mmatkinson/bff6e06da00e93ceb5c905bfe05f1ea0 to your computer and use it in GitHub Desktop.
Save mmatkinson/bff6e06da00e93ceb5c905bfe05f1ea0 to your computer and use it in GitHub Desktop.
take in a dataframe and output (redshift) DDL For creating a table of that format.
def df_to_ddl(df, tablename='test.mytable'):
data_dtypes = df.dtypes.reset_index().rename(columns = {'index':'colname',0:'datatype'})
# Map pandas datatypes into SQL
data_dtypes['sql_dtype'] = data_dtypes.datatype.astype(str).map(
{'object':'varchar(24)',
'float64':'float',
'int64':'int',
'bool':'boolean'} )
ddl = ", ".join([ "{colname} {sql_dtype} \n".format(**row)
for row in
data_dtypes[['colname','sql_dtype']].to_dict('records')])
ddl_statement = """
create table {tablename}
(
{ddl}
)
DISTSTYLE ALL
;
""".format(tablename=tablename,ddl=ddl)
return ddl_statement
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment