@gwax
Created May 10, 2016 17:35
SQLAlchemy result batching for inserts
def batch_results(results, batch_size=10000):
    batch = results.fetchmany(batch_size)
    while batch:
        yield batch
        batch = results.fetchmany(batch_size)

for batch in batch_results(conn.execute(query)):
    new_rows = [dict(row) for row in batch]
    # Modify new_rows here, if needed
    conn.execute(table.insert(), new_rows)
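The generator above works against anything exposing a DB-API-style `fetchmany`; here is a minimal, self-contained sketch of the same batching pattern using the stdlib `sqlite3` module in place of the SQLAlchemy connection the gist assumes (the table name `src` and batch size are illustrative):

```python
import sqlite3

def batch_results(results, batch_size=10000):
    """Yield lists of rows from a cursor, at most batch_size at a time."""
    batch = results.fetchmany(batch_size)
    while batch:
        yield batch
        batch = results.fetchmany(batch_size)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE src (n INTEGER)")
conn.executemany("INSERT INTO src VALUES (?)", [(i,) for i in range(25)])

# 25 rows in batches of 10 -> batch sizes 10, 10, 5
batches = list(batch_results(conn.execute("SELECT n FROM src"), batch_size=10))
print([len(b) for b in batches])  # → [10, 10, 5]
```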
@jiffyclub
I came up with this solution that is shorter and delegates more work to Python. It uses the two-argument, sentinel form of iter and the fact that fetchmany returns an empty list when all rows have been exhausted:

def chunk_results(results, size):
    """
    Yields chunks of a SQL select result.
    Arguments:
        results (sqlalchemy.ResultProxy): Execution result
        size (int): Chunk size
    Yields:
        List[sqlalchemy.RowProxy]: List of size results.
    """
    yield from iter(lambda: results.fetchmany(size=size), [])
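The two-argument form of `iter(callable, sentinel)` calls the callable repeatedly and stops as soon as it returns the sentinel, which is exactly how `fetchmany` signals exhaustion. A quick illustration of the same idiom outside of SQL, with a hypothetical `next_chunk` helper standing in for `fetchmany`:

```python
from collections import deque

items = deque(range(7))

def next_chunk(size=3):
    # Mimics fetchmany: returns up to `size` items, [] when exhausted.
    return [items.popleft() for _ in range(min(size, len(items)))]

# iter() keeps calling next_chunk until it returns the sentinel [].
chunks = list(iter(lambda: next_chunk(3), []))
print(chunks)  # → [[0, 1, 2], [3, 4, 5], [6]]
```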
