Skip to content

Instantly share code, notes, and snippets.

@elsdrium
Created March 28, 2014 08:11
Show Gist options
  • Save elsdrium/9827713 to your computer and use it in GitHub Desktop.
Save elsdrium/9827713 to your computer and use it in GitHub Desktop.
{
"metadata": {
"name": "Algo HW5"
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": "<font size=5>8.2-4</font>"
},
{
"cell_type": "markdown",
"metadata": {},
"source": "Let $A[1...n]$ be input integers in the range $0$ to $k$."
},
{
"cell_type": "markdown",
"metadata": {},
"source": "<b>Design</b>:<br />\n<b>Preprocess</b>($A$):<br />\n&emsp;&emsp;Create array $R[-1...k]$ which all values are $0$<br />\n&emsp;&emsp;for $i$ from $1$ to $n$:<br />\n&emsp;&emsp;&emsp;&emsp;$R[ A[i] ] := R[ A[i] ] + 1$<br />\n\n&emsp;&emsp;for $j$ from $0$ to $k$:<br />\n&emsp;&emsp;&emsp;&emsp;$R[j] := R[j-1] + R[j]$<br />\n\n&emsp;&emsp;Return $R$<br />\n\n<br />\n$c$ $:=$ Preprocess($A$)<br />\n<b>Query</b>($a$, $b$):<br />\n&emsp;&emsp;Return $c[b] - c[a-1]$"
},
{
"cell_type": "markdown",
"metadata": {},
"source": "<b>Correctness</b>:<br />\nFirstly, we are gonna consider <i>Preprocess</i>. Observe that the first for-loop will never access $R$ out of range, since $0 \\leq A[i] \\leq k$ for all $1 \\leq i \\leq n$.<br />\nWhen the first for-loop terminates, it's easy to see that $R[j]$ records how many elements in $A$ equals to $j$.<br />\nTo complete the proof, we have to prove $R[j] = \\sharp \\{ A[i] : A[i] \\leq j \\}$. &emsp; $(*)$ <br />\nBecause $(*)$ is not so obvious, we are gonna use loop invariant to prove it.<br />\n&emsp;<b>Loop Invariant</b>: At the start of the $l$-th iteration, $(*)$ is true for all $-1 \\leq j < l$.<br />\n&emsp;<b>Initialization</b>: When the first iteration starts, $R[-1]$ is $0$, so loop invariant holds.<br />\n&emsp;<b>Maintenance</b>:<br />\n&emsp;&emsp;Suppose loop invariant holds when the $l$-th iteration starts. Observe that $R[j]$ is unmodified for all $l \\leq j \\leq k$ until the start of current iteration.<br />\n&emsp;&emsp;It implies that $R[l]$ stores how many $l$'s in $A[1...n]$. Therefore, when the current iteration ends, $R[l]$ equals to $\\sharp \\{ A[i] : A[i] \\leq l \\}$, and $(*)$ is true for all $-1 \\leq j \\leq l$.<br />\n&emsp;<b>Termination</b>: As our discussion in previous part, when the final iteration ends, $(*)$ is true for all $-1 \\leq j \\leq k$.\n\nThe correctness of <i>Query</i> follows immediately from $(*)$."
},
{
"cell_type": "markdown",
"metadata": {},
"source": "<b>Analysis</b>:<br />\n1. <i>Preprocess</i>:<br />\n&emsp;(i) The construction and initialization for $R[-1...k]$ consumes $(k+2)t_1$ time, where $t_1$ is a constant. <br />\n&emsp;(ii) The first for-loop iterates $n$ times, it takes constant time operation in each iteration and totally consumes $n t_2$ time for some constant $t_2$.<br />\n&emsp;(iii) The second for-loop iterates $k+1$ times, it takes constant time operation in each iteration and totally consumes $(k+1) t_3$ time for some constant $t_3$.<br />\n&emsp;(iv) Return operation consumes constant time $t_4$.\n\n&emsp;Hence, the time consumption of <i>Preprocess</i> is $T(n,k) = (k+2)t_1 + n t_2 + (k+1)t_3 + t_4$.<br />\n&emsp;And we have $\\min\\{ t_1, t_2, t_3\\} \\cdot (n + k) \\leq T(n,k) \\leq 2\\max\\{ t_1, t_2, t_3\\} \\cdot (n + k)$. Therefore, $T(n,k) = \\Theta(n+k)$.\n<br />\n2. <i>Query</i>:<br />\n&emsp;Clearly, its time complexity is $\\Theta(1)$."
},
{
"cell_type": "markdown",
"metadata": {},
"source": "<br />\n<font size=5>8-5</font>"
},
{
"cell_type": "markdown",
"metadata": {},
"source": "<b><font size=3.5>a.</font></b><br />\nIf $k=1$, then $k$-sorted implies that $$A[i] = \\sum_{j=i}^{i+1-1}{A[j]} \\leq \\sum_{j=i+1}^{i+1}{A[j]} = A[i+1]$$ for all $i = 1, 2, ..., n-1$. $\\implies$ $A$ is sorted."
},
{
"cell_type": "markdown",
"metadata": {},
"source": "<b><font size=3.5>b.</font></b><br />\n$1$, $6$, $2$, $7$, $3$, $8$, $4$, $9$, $5$, $10$"
},
{
"cell_type": "markdown",
"metadata": {},
"source": "<b><font size=3.5>c.</font></b><br />\nSuppose $A$ is $k$-sorted. Then\n$$\\frac{\\sum_{j=i}^{i+k-1}{A[j]}}{k} \\leq \\frac{\\sum_{j=i+1}^{i+k}{A[j]}}{k} \\iff \\sum_{j=i}^{i+k-1}{A[j]} \\leq \\sum_{j=i+1}^{i+k}{A[j]} \\iff 0 \\leq A[i+k] - A[i] \\iff A[i] \\leq A[i+k]$$\nfor all $i = 1, 2, ..., n-1$."
},
{
"cell_type": "markdown",
"metadata": {},
"source": "<b><font size=3.5>d.</font></b><br />\nAs our disscussion in <b>c.</b> An array $A$ is $k$-sorted if and only if $A[i] \\leq A[i+k]$ for all $i = 1, 2, ..., n-1$.<br />\n\n<b>Algorithm</b>($A$, $n$, $k$):<br />\n&emsp;&emsp;Partition the $n$ elements into $k$ groups of $\\lceil \\frac{n}{k} \\rceil$ elements each($G_j[1 ... \\lceil \\frac{n}{k} \\rceil]$, $j=1, ... , k-1$), <br />\n&emsp;&emsp;&emsp;&emsp;and the last group made up of $n - (k-1)\\lceil \\frac{n}{k} \\rceil$ elements.($G_j[1 ... (k-1)\\lceil \\frac{n}{k} \\rceil]$, $j = k$)<br />\n&emsp;&emsp;for $j$ from $1$ to $k$:<br />\n&emsp;&emsp;&emsp;&emsp;Sort $G_j$<br />\n&emsp;&emsp;for $j$ from $1$ to $k$:<br />\n&emsp;&emsp;&emsp;&emsp;for $l$ from $1$ to $G_j.length$:<br />\n&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;$A[k(j-1)+l] := G_j[l]$\n\nEvaluate the time complexity of our algorithm, <br />\n&emsp;(i) partition part takes $O(n)$ time <br />\n&emsp;(ii) the first for-loop takes $O(k \\cdot \\lceil \\frac{n}{k} \\rceil \\lg {\\lceil \\frac{n}{k} \\rceil}) = O(n \\lg{\\frac{n}{k}})$ <br />\n&emsp;(iii) the second for-loop takes $O(n)$ since there are $n$ elements in total.\n\nHence, the time complexity is $O(n \\lg{\\frac{n}{k}})$."
},
{
"cell_type": "markdown",
"metadata": {},
"source": "<b><font size=3.5>f.</font></b><br />\nLet $k$ be constant. By our disscussion in <b>c.</b> , we have to sort a subsequence $A[1+0k]$, $A[1+1k]$, $...$ , $A[1+(\\lfloor \\frac{n}{k} \\rfloor -1)k]$, and it takes $\\Omega(\\lfloor \\frac{n}{k} \\rfloor \\lg{\\lfloor \\frac{n}{k} \\rfloor}) = \\Omega(n \\lg n)$ since $k$ is constant. This completes the proof."
},
{
"cell_type": "markdown",
"metadata": {},
"source": "<br />\n<font size=5>9.3-1</font>"
},
{
"cell_type": "markdown",
"metadata": {},
"source": "<b>For groups of 7</b>:<br />\nConsider the same algorithm by replacing with \"groups of 7\", the number of elements $\\geq x$ is $4( \\lceil \\frac{1}{2} \\lceil \\frac{n}{7} \\rceil \\rceil -2 ) \\geq \\frac{2n}{7} -8$. Then we have the recurrence relation of time consumption:\n$$T(n) \\leq T(\\lceil \\frac{n}{7} \\rceil) + T(\\frac{5n}{7} + 8) + O(n)$$\nProve $T(n) = O(n)$ by substitution method:<br />\nGuess $T(n) = O(n)$, then by induction hypothesis, we have<br />\n$T(n) \\leq c \\lceil \\frac{n}{7} \\rceil + c(\\frac{5n}{7} + 8) + an$ <br />\n$\\leq c \\frac{n}{7} + c + c(\\frac{5n}{7} + 8) + an$<br />\n$\\leq (\\frac{6c}{7}+a)n + 9c$<br />\n$\\implies$ By choosing $c > 14a$ and $n \\geq n_0 \\geq \\frac{9c}{14}$, therefore we have $T(n) \\leq cn$ and hence $T(n) = O(n)$ by induction.\n\n\n<b>For groups of 3</b>:<br />\nSuppose $T(n) = O(n)$, then $T(n) \\leq T(\\lceil \\frac{n}{3} \\rceil) + T(\\frac{2n}{3} + 4) + O(n) \\leq c \\frac{n}{3} + c + c(\\frac{2n}{3} + 4) + an$ is impossible to be $\\leq cn$."
},
{
"cell_type": "markdown",
"metadata": {},
"source": "<br />\n<font size=5>9.3-8</font>"
},
{
"cell_type": "markdown",
"metadata": {},
"source": "<b>Design</b>:<br />\n<b>Algorithm</b>($X$, $Y$, $n$):<br />\n&emsp;&emsp;$p_x := 1$, $p_y := 1$<br />\n&emsp;&emsp;$q_x := n$, $q_y := n$<br />\n\n&emsp;&emsp;while( True ):<br />\n&emsp;&emsp;&emsp;&emsp;$m_x := \\lfloor \\frac{p_x+q_x}{2} \\rfloor$, $m_y := \\lfloor \\frac{p_y+q_y}{2} \\rfloor$<br />\n&emsp;&emsp;&emsp;&emsp;if $X[m_x] > Y[m_y]$:<br />\n&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;if $p_x$ == $q_x$:<br />\n&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;Return $Y[m_y]$<br />\n&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;$q_x := m_x$<br />\n&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;$p_y := m_y$<br />\n&emsp;&emsp;&emsp;&emsp;else:<br />\n&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;if $p_y$ == $q_y$:<br />\n&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;Return $X[m_x]$<br />\n&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;$p_x := m_x$<br />\n&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;$q_y := m_y$<br />\n"
},
{
"cell_type": "markdown",
"metadata": {},
"source": "<b>Correctness</b>:<br />\nWe are gonna use loop invariant to prove correctness.<br />\n&emsp;<b>Loop Invariant</b>:<br />\n&emsp;&emsp;At the start of each iteration, the median(we will use \"the Median\" for latter discussion) of all $2n$ elements in arrays $X$ and $Y$ is contained in $X[p_x...q_x] \\cup Y[p_y...q_y]$. $(*)$<br />\n&emsp;<b>Initialization</b>:<br />\n&emsp;&emsp;When the while-loop starts, $X[p_x...q_x] = X[1...n]$ and $Y[p_y...q_y] = Y[1...n]$. $\\implies (*)$ holds.<br />\n&emsp;<b>Maintenance</b>: <br />\n&emsp;&emsp;Suppose $(*)$ holds for current iteration.<br />\n&emsp;&emsp;$\\because$ $X$ and $Y$ are sorted.<br />\n&emsp;&emsp;$\\therefore$ Any subarray of $X$, $Y$ are sorted also.\nNote that the Median must lie between the median of $X[p_x...q_x]$ and the median of $Y[p_y...q_y]$, <br />\n&emsp;&emsp;&emsp;ie. $\\min\\{ X[\\lfloor \\frac{p_x+q_x}{2} \\rfloor], Y[\\lfloor \\frac{p_y+q_y}{2} \\rfloor] \\} \\leq$ the Median $\\leq \\max\\{ X[\\lfloor \\frac{p_x+q_x}{2} \\rfloor], Y[\\lfloor \\frac{p_y+q_y}{2} \\rfloor] \\}$.<br />\n&emsp;&emsp;Suppose $(*)$ holds for current iteration, that is, the Median is contained in $X[p_x...q_x] \\cup Y[p_y...q_y]$.<br />\n&emsp;&emsp;&emsp;If $X[m_x] > Y[m_y]$, then the removal of $Y[p_y ... m_y-1] \\cup X[m_x+1 ... q_x]$ keeps $(*)$ hold for next iteration.<br />\n&emsp;&emsp;&emsp;If $X[m_x] \\leq Y[m_y]$, then the removal of $X[p_x ... m_x-1] \\cup Y[m_y+1 ... q_y]$ keeps $(*)$ hold for next iteration.<br />\n&emsp;<b>Termination</b>:<br />\n&emsp;&emsp;Note that $p_x$ == $q_x \\iff p_y$ == $q_y$ since $q_x-p_x$ always equals to $q_y-p_y$.<br />\n&emsp;&emsp;Therefore, if $p_x$ == $q_x$ or $p_y$ == $q_y$, then it only remains $X[p_x]$ and $Y[p_y]$ two elements, then we return the smaller one.<br />\n\nBecause it has $2n$ elements and we removes even number elements in each iteration, it must terminate and remain only two elements in the final iteration, the smaller one must be the Median."
},
{
"cell_type": "markdown",
"metadata": {},
"source": "<b>Analysis</b>:<br />\nIn each iteration, $q_x-p_x$ == $q_y-p_y$, it removes $\\lfloor\\frac{q_x - p_x}{2}\\rfloor + \\lceil\\frac{q_y - p_y}{2}\\rceil = \\lfloor\\frac{q_y - p_y}{2}\\rfloor + \\lceil\\frac{q_x - p_x}{2}\\rceil = q_x - p_x = q_y - p_y$ elements of $X[p_x...q_x] \\cup Y[p_y...q_y]$. <br />\n$\\implies$ It remains half of remainder of previous iteration.<br />\n$\\implies$ Since the time consumption of each iteration is constant, the algorithm's time complexity is $O(\\lg n)$."
},
{
"cell_type": "markdown",
"metadata": {},
"source": "<br />"
}
],
"metadata": {}
}
]
}
@CoffeeHacking
Copy link

very interesting!!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment