ahmadsb86/maxMedian.cpp

## maxMedian.cpp
/*

Given the constraints in the problem, our algorithm has to run in at most O(n log n) time. We can achieve this bound using binary search.

Suppose we have already created a function f(x) which, in O(n) time, can compute whether or not a subarray exists with median greater than (or equal to) x. Then, answering this question becomes as simple as finding the largest value of x such that f(x) is true. Finding this largest value of x takes only O(log n) calls to f(x) with the method explained below.

If the maximum median of a test case is 5, the outputs of f(x) would look something like the following:

x:          1   2   3   4   5   6   7   8   9
f(x):       T   T   T   T   T   F   F   F   F

Notice how f(x) is true for all inputs less than (or equal to) 5 and false for all inputs greater than 5. This is because
1. f(x) implies f(x-1) since if a subarray exists with median >= x, then a subarray must also exist with median >= x-1  (i.e. the same subarray)
2. !f(x) implies !f(x+1) since if no subarray exists with median >= x, then no subarray can exist with median >= x+1

Now to find the largest value of x such that f(x) is still true, we simply have to find the point where the output of f(x) switches from true to false. This can be done via binary search. We can pick some input value near the middle of the input range as a test value and pass it into f(). If the output is true, we know the answer is this test value or something to the right of (i.e. greater than) it. If the output is false, we know the answer is to the left of this test value. With this information we can discard one half of the input range and search the other half in the same way. Since each of the O(log n) steps calls f() once at a cost of O(n), the total is O(n log n).
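
A minimal sketch of this search as a standalone helper. The name lastTrue and the std::function parameter are assumptions made for illustration and do not appear in the solution below. The sketch assumes f(lo) is true and that f never becomes true again once it has turned false.

```cpp
#include <cassert>
#include <functional>

// Hypothetical helper: returns the largest x in [lo, hi] for which f(x) is true.
// Assumes f(lo) is true and f is true up to some point, then false afterwards.
int lastTrue(int lo, int hi, const std::function<bool(int)>& f) {
    assert(lo <= hi);
    while (lo < hi) {
        // Upper midpoint: since a success sets lo = mid, a lower midpoint
        // could leave the range unchanged and loop forever.
        int mid = lo + (hi - lo + 1) / 2;
        if (f(mid)) lo = mid;      // the answer is mid or something greater
        else hi = mid - 1;         // the answer is strictly less than mid
    }
    return lo;
}
```

For the table above, calling lastTrue(1, 9, ...) with a predicate that is true exactly for x <= 5 returns 5.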


Now comes the fun (and hard) part: creating the supposed function f(x) which can determine, in O(n) time, whether a subarray exists within a[] that has a median >= x. This can be done using some modified prefix sum magic.

The key observation here is that if the majority of numbers in a subarray are greater than (or equal to) x, then intuitively the median must be greater than (or equal to) x. Therefore if a subarray exists in which the majority of the numbers are greater than or equal to x, f(x) is true. If such a subarray doesn't exist, f(x) is false. More formally, if and only if there exists a subarray a[l...r] such that the number of elements greater than (or equal to) x in a[l...r] is strictly greater than (r-l+1)/2 (half of the size of the subarray), f(x) is true. Otherwise f(x) is false. (The strict inequality matches the convention where the median of an even-length subarray is the smaller of its two middle elements.)

A modified prefix sum can help us head in the right direction here. If we create an array such that p[i] stores the number of elements greater than (or equal to) x from index 0 to index i in a[], we can compute in O(1) time the number of elements greater than (or equal to) x in any subarray. For example, the number of elements greater than (or equal to) x in the subarray a[i...j] can be computed by doing p[j] - p[i-1] (treating p[-1] as 0). p[] can be built incrementally like any other prefix sum in O(n) time, and afterwards each such query takes O(1).
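
A small sketch of such a counting prefix array. The helper names buildCounts and countAtLeast are hypothetical. To avoid the special case at index -1, this sketch shifts the array by one position, so that p[0] = 0 and p[i] counts the elements of a[0...i-1]; the solution below uses the same shift.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: p[i] = number of elements among a[0..i-1] that are >= x.
std::vector<int> buildCounts(const std::vector<int>& a, int x) {
    int n = (int)a.size();
    std::vector<int> p(n + 1, 0);
    for (int i = 0; i < n; i++)
        p[i + 1] = p[i] + (a[i] >= x ? 1 : 0);
    return p;
}

// Number of elements >= x in a[l..r] (0-indexed, inclusive), answered in O(1).
int countAtLeast(const std::vector<int>& p, int l, int r) {
    assert(0 <= l && l <= r && r + 1 < (int)p.size());
    return p[r + 1] - p[l];
}
```

With the shifted array, the count for a[l...r] is p[r+1] - p[l], which is the same quantity as p[r] - p[l-1] in the unshifted notation used in this explanation.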


This helps us simplify the problem to the following: Create a function f(x) which, in O(n) time, finds out whether values of l and r exist such that p[r]-p[l-1] > (r-l+1)/2. Now we can use perhaps my favorite technique in CP - rearranging the mathematical equation.

If we want to check if an r and l exist such that
p[r]-p[l-1] > (r-l+1)/2
we can split the right-hand side as (r-l+1)/2 = r/2 - (l-1)/2 and instead check if an r and l exist such that
p[r]-r/2 > p[l-1] - (l-1)/2

If we create an array g such that g[i] = p[i]-i/2 (a real division, not an integer one; creating this array can trivially be done in O(n) ), the question becomes

Do values of r and l exist such that
g[r] > g[l-1]

The only problem here is that the above statement is a slight lie. The question also specifies that the subarray must be of length at least k and so r-l+1 must be greater than (or equal to) k.

In reality, the question has become
is there a g[r] > g[l-1] such that r-l+1 >= k?

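This can be computed easily in O(n) time, using a suffix max array. To keep the indices simple, shift p by one position (as the code below does) so that p[0] = 0 and p[i] counts the elements of a[0...i-1]; g then has n+1 entries, g[0...n], and a subarray of length at least k corresponds to a pair of indices i and j with j >= i+k and g[j] > g[i]. All we have to do is iterate over every possible value of i (i.e. all values from 0 to n-k) and, with this value of i fixed, check whether a corresponding j exists (i.e. if a value exists that is greater than g[i] and is at least k positions ahead in the array). For example if the array is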

i:      0   1   2   3   4
g[i]:   2   0   1   3   0

we can loop over all g[i] and check if there exists a number larger than g[i] that is at least k positions ahead. This can be done by comparing g[i] with the largest number in the subarray g[i+k...n] (i.e. all numbers after and including index i+k in g[]). With k = 2, for instance, the largest number in g[2...4] is 3, which is larger than g[0] = 2, so the check already succeeds at i = 0. We can query the largest number in g[i+k...n] for any i in O(1) with a suffix max array. This allows every g[i] to be checked in O(1) time and since g has O(n) entries, we can compute the result of f(x) in O(n) time.
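
The whole check can be sketched as a standalone function. The name hasMedianAtLeast is hypothetical. As a deliberate substitution, the sketch stores 2*p[i] - i instead of p[i] - i/2 so that everything stays in integers; the solution below uses floating-point values instead. Doubling both sides does not change any comparison.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical name: true if some subarray of length >= k has median >= x,
// where the median of an even-length subarray is its smaller middle element.
bool hasMedianAtLeast(const std::vector<int>& a, int k, int x) {
    int n = (int)a.size();
    assert(k >= 1 && k <= n);
    // g[i] = 2 * (number of elements >= x in a[0..i-1]) - i.
    // Each element therefore changes g by +1 (if >= x) or -1 (otherwise).
    std::vector<int> g(n + 1, 0);
    for (int i = 0; i < n; i++)
        g[i + 1] = g[i] + (a[i] >= x ? 1 : -1);
    // sufMax[i] = largest value among g[i..n].
    std::vector<int> sufMax(n + 1);
    sufMax[n] = g[n];
    for (int i = n - 1; i >= 0; i--)
        sufMax[i] = std::max(sufMax[i + 1], g[i]);
    // Look for some j >= i + k with g[j] > g[i].
    for (int i = 0; i + k <= n; i++)
        if (sufMax[i + k] > g[i]) return true;
    return false;
}
```

For a = {1, 2, 3, 2, 1} and k = 3, this returns true for x = 2 and false for x = 3, so the binary search would settle on 2.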

ezpz

*/


#include <bits/stdc++.h>
using namespace std;

int main() {

    //take input
    int n, k; cin >> n >> k;
    vector<int> a(n);   //std::vector instead of a variable-length array, which is not standard C++
    for (int i = 0; i < n; i++) cin >> a[i];


    //initialize binary search bounds
    int lo = 1;
    int hi = n;

    lo--;
    while (lo < hi) {
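        int test = lo + (hi - lo + 1) / 2;      //upper midpoint, so that lo = test always shrinks the range

        //p[i] = number of elements among a[0..i-1] that are >= test (so p[0] = 0)
        vector<int> p(n + 1);
        //x[i] = p[i] - (i - 1) / 2.0, which is g from the explanation above shifted by a constant 0.5
        vector<double> x(n + 1);
        //m[i] = largest value among x[i..n] (suffix max)
        vector<double> m(n + 1);

        p[0] = 0;
        for (int i = 0; i < n; i++) {
            if (a[i] >= test) p[i + 1] = p[i] + 1;
            else p[i + 1] = p[i];
        }

        for (int i = 0; i < n + 1; i++) {
            x[i] = p[i] - (i - 1) / 2.0;
        }

        m[n] = x[n];
        for (int i = n - 1; i >= 0; i--) {
            m[i] = max(m[i + 1], x[i]);
        }

        //f(test) is true if some x[j] with j >= i + k is strictly greater than x[i]
        bool works = false;
        for (int i = 0; i < (n + 1) - k; i++) {
            if (m[i + k] > x[i]) works = true;
        }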


        //update binary search bounds
        if (works) {
            lo = test;
        }
        else {
            hi = test - 1;
        }
    }

    cout << lo;

}