@GaryLee
Last active November 25, 2015 00:34
Batch data processing example. Two methods included, generator and cmdlet.
#!python
# coding: utf-8
# --------------------------------------------
# Create test data: the integers 0..999, one per line.
with open('g:\\data.txt', 'w') as fd:
    for i in range(1000):
        fd.write('%d\n' % i)
# The with block closes the file, so no explicit close() is needed.
# --------------------------------------------
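# Use a generator for batch processing.
def get_data(filename, num_of_data):
    """Yield lists of up to num_of_data stripped lines from filename."""
    data_collection = []
    with open(filename, 'r') as fd:
        for i, ln in enumerate(fd, 1):
            data_collection.append(ln.strip())
            if i % num_of_data == 0:
                yield data_collection
                data_collection = []
    # Yield the final partial batch so trailing lines are not dropped.
    if data_collection:
        yield data_collection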
for data_50 in get_data('g:\\data.txt', 50):
    # Process each batch of 50 lines here.
    data_sum = sum(map(int, data_50))
    print(data_sum)
# --------------------------------------------
# Use the cmdlet package to handle the same task.
from cmdlet.cmds import *
for data_50 in (readline('g:\\data.txt') | pack(50)):
    # Process each batch of 50 lines here.
    data_sum = sum(map(int, data_50))
    print(data_sum)
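
Both methods above group the lines of a file into fixed-size batches. As a minimal sketch of the same idea using only the standard library, the batching step could also be written with `itertools.islice`. The helper name `batched_lines` and the in-memory sample data are assumptions made for illustration; they are not part of the gist or of the cmdlet package. This version also yields a shorter final batch when the number of lines is not a multiple of the batch size.

```python
from itertools import islice


def batched_lines(lines, size):
    """Yield lists of up to `size` stripped lines from any iterable of strings.

    `batched_lines` is a hypothetical helper name, not part of the gist or
    of the cmdlet package.
    """
    if size < 1:
        raise ValueError('size must be at least 1')
    iterator = iter(lines)
    while True:
        # islice pulls at most `size` items and does not read ahead.
        batch = [ln.strip() for ln in islice(iterator, size)]
        if not batch:
            return
        yield batch


# Usage with in-memory sample lines instead of a file, so the sketch runs anywhere.
sample = ['%d\n' % i for i in range(7)]
sums = [sum(map(int, batch)) for batch in batched_lines(sample, 3)]
print(sums)
```

Because an open file object is itself an iterable of lines, the same helper should work on a real file by passing the file object as `lines`.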