@dipta007
Created June 7, 2020 13:45
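def data_generator():
    """Yield (x, y) batches forever, reading the CSV files in lockstep."""
    # NUMBER_OF_ROWS, batch_size, get_all and merge_all are not defined in
    # this snippet; they must exist in the enclosing module.
    total_row = NUMBER_OF_ROWS
    files = [
        'a.csv',
        'b.csv',
        'c.csv',
        'd.csv',
    ]
    # One chunked reader per file.
    pds = get_all(files, chunksize=batch_size)
    cnt = 0
    while True:
        # Pull the next chunk from every reader and combine them into one frame.
        data_frames = []
        for reader in pds:
            data_frames.append(reader.get_chunk())
        cnt += batch_size
        merged = merge_all(data_frames)
        # Column 0 is the target; the remaining columns are the features.
        x = merged.iloc[:, 1:].to_numpy()
        y = merged.iloc[:, 0].to_numpy()
        # All rows consumed: reopen the readers so the next pass starts from the top.
        if cnt >= total_row:
            pds = get_all(files, chunksize=batch_size)
            cnt = 0
        yield x, y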
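
The generator relies on two helpers, get_all and merge_all, that the gist does not include. Below is a minimal sketch of what they might look like. It assumes get_all opens one pandas chunked reader per file via pd.read_csv(..., chunksize=...) and that merge_all places the chunks side by side, column-wise. Both bodies are guesses rather than the author's code, and demo() with its column names is a hypothetical harness added only to exercise them.

```python
import os
import tempfile

import pandas as pd


def get_all(files, chunksize):
    """Hypothetical helper: open one chunked reader per CSV file."""
    return [pd.read_csv(path, chunksize=chunksize) for path in files]


def merge_all(data_frames):
    """Hypothetical helper: place the chunks side by side, column-wise."""
    # Reset each index so rows line up by position within the batch.
    return pd.concat([df.reset_index(drop=True) for df in data_frames], axis=1)


def demo():
    """Write two tiny CSV files and merge their first chunks."""
    with tempfile.TemporaryDirectory() as tmp:
        paths = []
        for name, cols in [("a.csv", ["label", "f1"]), ("b.csv", ["f2", "f3"])]:
            path = os.path.join(tmp, name)
            pd.DataFrame({c: range(5) for c in cols}).to_csv(path, index=False)
            paths.append(path)
        readers = get_all(paths, chunksize=2)
        merged = merge_all([r.get_chunk() for r in readers])
        # Close the readers before the temporary directory is removed.
        for r in readers:
            r.close()
        return merged
```

If the files instead share a key column, merge_all would more likely join on that key than concatenate by position; the gist gives no indication either way.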