Given two sorted arrays, A of length N where N in [0, inf), and B of length M in [0, inf), find the kth smallest element in the union U = A u B in sub O(k) time.
Most of the solutions that I have googled do not consider the case where N and M can be as small as 0, but this solution does.
The strategy to this problem is to chop off part of the arrays that we know can't be the kth element. In doing so, when we recurse on the smaller arrays, the number 'k' will change to reflect tossing out elements smaller than the kth element. The base case is when either one of the arrays is empty, the kth smallest element is the the non-empty array at index k-1:
def kth_smallest(k, a, b) return b[k-1] if a.empty? return a[k-1] if b.empty? # ... end
We just have to be smart in which elements to toss out. The trivial elements to throw out are the ones of index k or greater in either arrays, since the kth smallest in the union of the two arrays can't be one of those elements. In ruby, we can use
slice! to toss out parts of the array.
a.slice!(k..-1) if a.length > k b.slice!(k..-1) if b.length > k
Now since we are looking for sub O(k) time, we can probably do some sort of mid point thingy on both arrays and try to figure out what can stay and what can't. Let's grab the two midpoints of A and B, call the indices i and j respectively. It would be nice if A[i] is always greater than B[j], so we can just swap names if this isn't the case:
i = (a.length - 1)/2 j = (b.length - 1)/2 unless a[i] >= b[j] a, b = b, a i, j = j, i end
Now it's better to work with visual examples. Consider A with 6 elements, B with 3, and k = 6. Here's the sketch
i v A: [*,*,*,*,*,*] B: [*,*,*] ^ j
Remember that A[i] > B[j] and A and B are both sorted. What's the least position in U that A[i] can be? Well looking at A there are i elements that are less than A[i] and in B there are at least j + 1 elements less than A[i]. We don't know whether B[j+1] is greater than A[i] or not. So the least position (not index) A[i] can be is i - j + 2. Similarly, what's the greatest position (not index) in U that A[i] can be? This is when everything in B is smaller than A[i], so we got i + B.length + 1.
Do the same for B[j]. What's the least position it can be? If everything in A was greater, then the least position is just the number of elements to the left of B[j] + 1, which is j + 1. What's the most it can be? If everythign before A[i] was smaller than B[j], then it would be i + j + 1.
a_lower = i + j + 2 a_upper = b.length + i + 1 b_lower = j + 1 b_upper = i + j + 1
Remember, we are finding the kth element in U. If k is not in these ranges, then there is no way A[i] or B[j] can be the kth smallest element. If k IS in the ranges, then we actually do nothing to the array. So let's look at the case when k is not in the range. If a_lower is greater than k, then we know that A[i] and everything after that can't work. The other case is if k is bigger than a_greater, in which we know that A[i] and everything before it can't work, so we throw those out. But how many we throw out will change what k we are looking for in the next recursion, so we have to change k to k - amount of stuff we throw out which is k - i + 1. Do the same for B and with j instead.
unless a_lower <= k && k <= a_upper if a_lower > k a.slice!(i..-1) else a.shift(i+1) k = k - i - 1 end end
Finally recurse and we have the full solution:
def kth_smallest(k, a, b) return b[k-1] if a.empty? return a[k-1] if b.empty? a.slice!(k..-1) if a.length > k b.slice!(k..-1) if b.length > k i = (a.length - 1)/2 j = (b.length - 1)/2 unless a[i] >= b[j] a, b = b, a i, j = j, i end a_lower = i + j + 2 a_upper = b.length + i + 1 b_lower = j + 1 b_upper = i + j + 1 unless a_lower <= k && k <= a_upper if a_lower > k a.slice!(i..-1) else a.shift(i+1) k = k - i - 1 end end unless b_lower <= k && k <= b_upper if b_lower > k b.slice!(j..-1) else b.shift(j+1) k = k - j - 1 end end return kth_smallest(k, a, b) end
You might be wondering, what if k is in both ranges? Well that's not possible. Look at a_lower and b_upper. The don't overlap so it's not possible that k is in both.
Then you might be wondering, what if k is in neither ranges? Well then both arrays get some parts of it cut off.