@timehaven
Created July 19, 2017 15:48
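while True:
    ...
    df = df.sample(frac=1)  # shuffle all rows; index labels stay attached to their rows
    ...
    i, j = 0, batch_size
    for _ in range(nbatches):
        sub = df.iloc[i:j]      # next batch_size rows of the shuffled frame
        idx = sub.index.values  # index labels of those rows
        X2 = bcolz.open(bcolz_dir)[idx]  # fetch the matching rows from the on-disk bcolz array
        ...
        # Calculate X and Y appropriately
        ...
        yield [X, X2], Y
        i = j
        j += batch_size
    ...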
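The fragment above elides how `X`, `Y`, and `nbatches` are computed, so the following is only a minimal, self-contained sketch of the same shuffle-and-slice pattern, not the author's full generator. Everything beyond the loop itself is an assumption made for illustration: an in-memory NumPy array named `features` stands in for `bcolz.open(bcolz_dir)`, the columns `x` and `y` are hypothetical, `nbatches` is taken to be `len(df) // batch_size`, and the `max_epochs` and `seed` parameters exist only so the example terminates and is reproducible. It also assumes the DataFrame keeps its default integer index, so that each index label equals the row position in the second array.

```python
import numpy as np
import pandas as pd


def batch_generator(df, features, batch_size, max_epochs=None, seed=0):
    """Yield ([X, X2], Y) batches, reshuffling the rows every epoch.

    `features` is a stand-in for the on-disk bcolz array: row k of
    `features` belongs to the row of `df` whose index label is k.
    """
    nbatches = len(df) // batch_size  # assumption: drop a final partial batch
    epoch = 0
    while max_epochs is None or epoch < max_epochs:
        # Shuffle all rows; the index labels travel with their rows.
        df = df.sample(frac=1, random_state=seed + epoch)
        i, j = 0, batch_size
        for _ in range(nbatches):
            sub = df.iloc[i:j]
            idx = sub.index.values     # index labels of the rows in this batch
            X2 = features[idx]         # fancy-index the matching rows
            X = sub[["x"]].to_numpy()  # hypothetical input column
            Y = sub["y"].to_numpy()    # hypothetical target column
            yield [X, X2], Y
            i = j
            j += batch_size
        epoch += 1


# Toy data: 10 rows, with features[k] == [k, k] so alignment is easy to check.
df = pd.DataFrame({"x": np.arange(10.0), "y": np.arange(10) % 2})
features = np.repeat(np.arange(10)[:, None], 2, axis=1)

batches = list(batch_generator(df, features, batch_size=4, max_epochs=1))
```

With `max_epochs=None` the generator loops forever, like the `while` loop in the original fragment. Because `idx` is read from the shuffled frame's index, `X2` stays aligned with `X` and `Y` however the rows are reordered.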