Skip to content

Instantly share code, notes, and snippets.

@sudheesh001
Last active December 19, 2015 12:39
Show Gist options
  • Save sudheesh001/5956512 to your computer and use it in GitHub Desktop.
Save sudheesh001/5956512 to your computer and use it in GitHub Desktop.
Complexity and Pseudocode for a K-Way merge algorithm.

For the base case we have something a little less trivial since our smallest merge will have k elements in it, which for a large data set could have 10 to 50 partitions OR MORE. To solve this we can sort by putting them in to a heap real quick and pulling them back out, in the given source code below, I didn't write the heap, but it's pretty trivial to do so.

if (high < low + k) {
    // the subarray has k or fewer elements
    // just make one big heap and do deleteMins on it
    Comparable[] subarray = new MergesortHeapNode[high - low + 1];
    for (int i = 0, j = low; i < subarray.length; i++, j++) {
        subarray[i] = new MergesortHeapNode(data[j], 0);
    }
    BinaryHeap heap = BinaryHeap.buildHeap(subarray);
    for (int j = low; j <= high; j++) {
        try {
            data[j] = ((MergesortHeapNode) heap.deleteMin()).getKey();
        }
        catch (EmptyHeapException e) {
            System.out.println ("Tried to delete from an empty heap.");
        }
    }
    
}

And for the non-base case where we have several sorted arrays we run in to a similar problem, we can no longer do a simple, is left less than right. We have to put the first index of all arrays in to a heap to figure out which array has the lowest first element. And once we figure out which array has the lowest from the heap we have to pull the element from that array and put it where it goes and then we have to refill the heap with the next element in the small array we just took from.

else {
    // divide the array into k subarrays and do a k-way merge
    final int subarrSize = high-low+1;
    final int[] tempArray = new int[subarrSize];
    
    // Make temp array
    for (int i = low; i < high + 1; i++)
        tempArray[i-low] = data[i];
    
    // Keep subarray index to keep track of where we are in each subarray
    final int[] subarrayIndex = new int[k];
    for (int i = 0; i < k; i++)
        subarrayIndex[i] = i*(subarrSize)/k;
    
    // Build heap
    Comparable[] subarray = new MergesortHeapNode[k];
    for (int i = 0; i < k; i++)
        subarray[i] = new MergesortHeapNode(tempArray[subarrayIndex[i]++], i);

    BinaryHeap heap = BinaryHeap.buildHeap(subarray);
    
    // For each element low to high, find the lowest in each k subarray
    for (int i = low; i < high + 1; i++)
    {
        
        // Take lowest element and add back in to original array
        try
        {
            MergesortHeapNode a = ((MergesortHeapNode) heap.deleteMin());
            data[i] = a.getKey();
            if (subarrayIndex[a.getWhichSubarray()] < (a.getWhichSubarray()+1)*(subarrSize)/k)
            {
                heap.insert(new MergesortHeapNode(tempArray[subarrayIndex[a.getWhichSubarray()]]++, a.getWhichSubarray()));
                
                // Increment the subarray index where the lowest element resides
                subarrayIndex[a.getWhichSubarray()]++;
            }
        } catch (EmptyHeapException e)
        {
            System.out.println ("Tried to delete from an empty heap.");
        }
    }
}

Time complexity can be mistaken to be O(kn) looking at the processing, carefully look at it again and you'll notice its O(nk2)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment