Skip to content

Instantly share code, notes, and snippets.

@olooney
Created November 17, 2015 22:28
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save olooney/7e1593246c4e23ca7357 to your computer and use it in GitHub Desktop.
Save olooney/7e1593246c4e23ca7357 to your computer and use it in GitHub Desktop.
def pipe(inputCursor, outputCursor, outputTablename):
'''
Read rows from the inputCursor and immediately INSERT them into the
outputTablename using the supplied outputCursor. There are faster
mechanisms like copy_from() for particular databases but this is extremely
portable: not only do the inputCursor and outputCursor not have to be for
the same database, they can be for different database engines without
problem.
'''
# if the query returned no results (e.g, an INSERT or UPDATE) then just ignore it.
if not inputCursor.description:
return
# DB API 2.0 specifies the description layout, so it's pretty portable.
columnNames = [c[0] for c in inputCursor.description]
# generate a generic INSERT statement to map columns in the input to columns
# in the output. Note that this happens by column name; the order doesn't matter.
# use AS to name the columns in the input query to match their destinations in the
# target table.
insertSql = 'INSERT INTO %(tableName)s (%(columns)s) VALUES(%(placeholders)s)' % {
"tableName": outputTablename,
"columns": ",".join(columnNames),
"placeholders": ",".join( ['%s'] * len(columnNames) ),
}
# this is where the magic happens. executemany() takes an iterator of bound parameters
# and executes the statement for each one. The inputCursor itself is an iterator that
# yields rows. By simply passing one into the other, we have the writer iterating directly
# over the input, like true co-routines.
try:
outputCursor.executemany(insertSql, inputCursor)
finally:
outputCursor.close()
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment