Skip to content

Instantly share code, notes, and snippets.

@bbeck
Last active August 9, 2016 13:46
Show Gist options
  • Save bbeck/4553588 to your computer and use it in GitHub Desktop.
Save bbeck/4553588 to your computer and use it in GitHub Desktop.
k-way merge sort in python
def merge(generators):
r"Perform a k-way merge of the data contained in the generators."
heap = []
# The last known value for a given generator
values = [None for g in generators]
# Initialize the heap with the first value from each generator
for i, g in enumerate(generators):
heapq.heappush(heap, (g.next(), g, i))
while heap:
# Peek and get the current timestamp
tm = heap[0][0][0]
# Pull off all of the data for now, saving the values along the way
while heap and heap[0][0][0] == tm:
(tm, value), g, i = heapq.heappop(heap)
values[i] = value
# Get the next data point for this generator and enqueue it
try:
next_tm, next_value = g.next()
heapq.heappush(heap, ((next_tm, next_value), g, i))
except StopIteration:
# This generator is finished
pass
yield tm, values
@glangford
Copy link

glangford commented Aug 9, 2016

Why not use heapq.merge, which can merge sorted iterables. For example:

>>> import heapq
>>> x = [1,4,7,11]
>>> y = [2,3,9,13]
>>> heapq.merge(x,y)
<generator object merge at 0x101310888>
>>> [a for a in heapq.merge(x,y)]
[1, 2, 3, 4, 7, 9, 11, 13]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment