
# dhermes/advancing_front.py Last active Sep 17, 2018

import itertools
import operator

# NOTE: Python 2 source (``xrange`` and the builtin ``reduce``).


def is_canonical(n_tuple):
    # |b1| <= |bn| ==> -b1 <= bn ==> b1 + bn >= 0
    return n_tuple[-1] + n_tuple[0] >= 0


def beta_values_to_p(n_tuple, n):
    # beta1 = -1 - p1 ==> p1 = -1 - beta1
    p = [None] * n
    p[0] = -1 - n_tuple[0]
    for i in xrange(1, n):
        p[i] = n_tuple[i] - n_tuple[i - 1] - 1
    return p


def advance_tuple(n_tuple, n):
    # Yield the neighbors of ``n_tuple`` on the advancing front: shift
    # every entry down by one, then bump up each suffix by one.
    first_update = n_tuple[:]
    for tuple_index in xrange(n):
        first_update[tuple_index] -= 1
    yield first_update
    for index in xrange(1, n):
        next_update = n_tuple[:]
        for tuple_index in xrange(index, n):
            next_update[tuple_index] += 1
        yield next_update


def advance_stage(stage, n):
    # Collect the de-duplicated neighbors of every tuple in the stage.
    new_stage = set()
    for n_tuple in stage:
        for new_tuple in advance_tuple(n_tuple, n):
            new_stage.add(tuple(new_tuple))
    return sorted(map(list, new_stage), reverse=True)


def compute_values(curr_stage, stage, sd, sd_plus_one):
    print('Stage: %d' % (curr_stage,))
    print('-' * 40)
    for n_tuple in stage:
        sd_val = sd(n_tuple)
        sd_plus_one_val = sd_plus_one(n_tuple)
        message = '%20s -> %s' % (n_tuple, sd_plus_one_val)
        # Only report canonical tuples on the zero set of ``sd``.
        if sd_val == 0:
            if is_canonical(n_tuple):
                print(message)


def advance_front(n, sd, sd_plus_one, num_stages=3):
    stage = [range(-1, -1 + n)]
    compute_values(0, stage, sd, sd_plus_one)
    for curr_stage in xrange(num_stages):
        stage = advance_stage(stage, n)
        compute_values(curr_stage + 1, stage, sd, sd_plus_one)


def make_sj(j):
    # Elementary symmetric polynomial s_j as a function of a tuple.
    def sj(n_tuple):
        result = 0
        for val_tuple in itertools.combinations(n_tuple, j):
            result += reduce(operator.mul, val_tuple, 1)
        return result
    return sj


if __name__ == '__main__':
    s1 = make_sj(1)
    s2 = make_sj(2)
    print('')
    advance_front(5, s1, s2, num_stages=11)
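A note on usage, inferred from the `__main__` block: under Python 2, running `python advancing_front.py` prints, for stages 0 through 11, the canonical 5-tuples satisfying `s1 == 0` together with their `s2` values. This appears to be the search routine behind the tables of minima in the notes below (their n = 5, k = 4 case).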
\documentclass[a4paper,10pt]{article}
\usepackage[margin=1in]{geometry}
\usepackage{amsmath,amsthm,amssymb}
\usepackage{hyperref}
\usepackage{fancyhdr}
\renewcommand{\qed}{$$\blacksquare$$}
\hypersetup{colorlinks=true,urlcolor=blue}
\pagestyle{fancy}
\lhead{}
\rhead{Toying Around with Finite Difference Approximations \\ Danny Hermes}
\renewcommand{\headrulewidth}{0pt}

\begin{document}

\section*{Some fun facts about approximations}
\ \vspace{-.2in}

\paragraph{Putting a max on the order of approximation of $$u'$$} If we seek to approximate the value $$u'(c)$$ from $$n$$ points, we'll show that we cannot achieve an $$\mathcal{O}(h^{n + 1})$$ approximation. Consider the approximation given by $\sum_{i = 1}^n \alpha_i u(c + \beta_i h)$ where both the coefficients of the outputs $$u(\cdot)$$ and the distances of the points ($$\beta_i h$$) from the center $$c$$ may vary. Currently the only conditions on the points are that the $$\beta_i$$ are distinct (else we would have fewer than $$n$$ points) and that the differences $$\beta_i h$$ are $$\mathcal{O}(h)$$ (which we've indicated clearly by the presence of $$h$$ in the difference) so that we can use a Taylor approximation. Taylor expanding $$u(c + \beta_i h)$$ about $$c$$ we have $u(c + \beta_i h) = \sum_{k = 0}^{n + 1} u^{(k)}(c) \frac{\left(\beta_i h\right)^k}{k!} + \mathcal{O}(h^{n + 2}).$ Since we want $$\sum_{i = 1}^n \alpha_i u(c + \beta_i h)$$ to approximate $$u'(c)$$ we'll want to set $u'(c) = \sum_{i = 1}^n \alpha_i \left[u'(c) \beta_i h\right]$ so that the $$\alpha_i = \mathcal{O}\left(\frac{1}{h}\right)$$. Due to this, in order to have an $$\mathcal{O}(h^{n + 1})$$ approximation we'd need $\sum_{i = 1}^n \alpha_i u^{(k)}(c) \frac{\left(\beta_i h\right)^k}{k!} = 0$ for $$k = 0$$ and $$2 \leq k \leq n + 1$$. These encompass all the remaining terms, and the error would be $$\mathcal{O}(\alpha_i h^{n + 2}) = \mathcal{O}(h^{n + 1})$$. Since we want to do this for generic $$u(\cdot)$$ we assume that $$u^{(k)}(c) \neq 0$$ so that dividing by $$u^{(k)}(c) h^k / k!$$ we have the system $\sum_{i = 1}^n \alpha_i \beta_i = \frac{1}{h} \quad \text{ and } \quad \sum_{i = 1}^n \alpha_i \beta_i^k = 0 \quad \text{for } k \neq 1, \; 0 \leq k \leq n + 1.$ This is a system of $$n + 2$$ equations in $$2n$$ variables, so it seems on the surface that we should have a solution, hence should be able to get an $$\mathcal{O}(h^{n + 1})$$ approximation of $$u'(c)$$ with $$n$$ points. However, we'll show this system does not have a solution.
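As a quick numeric sanity check of this claim (a sketch of mine, assuming \texttt{numpy} and Python 3; no such snippet appears in the original notes), for a random choice of distinct $$\beta_i$$ the $$n + 2$$ equations are already inconsistent: the coefficient matrix has rank $$n$$ while the augmented matrix has rank $$n + 1$$.
\begin{verbatim}
import numpy as np

np.random.seed(42)
n = 5
beta = np.sort(np.random.randn(n))  # random distinct beta_i
h = 0.1

# Rows are the powers 1, beta, ..., beta^(n + 1): (n + 2) x n.
M = np.vander(beta, N=n + 2, increasing=True).T
rhs = np.zeros(n + 2)
rhs[1] = 1.0 / h  # the lone nonzero entry, from the u'(c) row

print(np.linalg.matrix_rank(M))                          # 5
print(np.linalg.matrix_rank(np.column_stack([M, rhs])))  # 6: no solution
\end{verbatim}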
Writing this as a matrix equation, we have $\left[ \begin{array}{c c c} 1 & \cdots & 1 \\ \beta_1 & \cdots & \beta_n \\ \beta_1^2 & \cdots & \beta_n^2 \\ \vdots & \ddots & \vdots \\ \beta_1^{n + 1} & \cdots & \beta_n^{n + 1} \end{array}\right] \left[ \begin{array}{c} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_n \end{array}\right] = \left[ \begin{array}{c} 0 \\ \frac{1}{h} \\ 0 \\ \vdots \\ 0 \end{array}\right].$ Dropping the second and third equations from this system, we have a square system given by $\left[ \begin{array}{c c c} 1 & \cdots & 1 \\ \beta_1^3 & \cdots & \beta_n^3 \\ \vdots & \ddots & \vdots \\ \beta_1^{n + 1} & \cdots & \beta_n^{n + 1} \end{array}\right] \left[ \begin{array}{c} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_n \end{array}\right] = \left[ \begin{array}{c} 0 \\ 0 \\ \vdots \\ 0 \end{array}\right].$ If we define the augmented Vandermonde matrix $$A_n$$ by $\left[ \begin{array}{c c c} 1 & \cdots & 1 \\ \beta_1^3 & \cdots & \beta_n^3 \\ \vdots & \ddots & \vdots \\ \beta_1^{n + 1} & \cdots & \beta_n^{n + 1} \end{array}\right],$ then our system is $$A_n \boldsymbol{\alpha} = \mathbf{0}$$. (For notation's sake, we should probably just introduce a notation like $A_n(0, 3, \ldots, n + 1)$ to represent the exponents of each row and we should also probably note that this depends on the $$\beta_i$$, but we didn't -- so there.) If we can show that $$A_n$$ is invertible then we'd have $$\boldsymbol{\alpha} = A_n^{-1} \mathbf{0} = \mathbf{0}$$, forcing our approximation of $$u'(c)$$ to be uniformly $$0$$. This clearly is not $$\mathcal{O}(h^{n + 1})$$.

\paragraph{Showing that this system is degenerate} With the same points $$\beta_1, \ldots, \beta_n$$, define the Vandermonde matrix $$V_n$$ as the matrix $\left[ \begin{array}{c c c} 1 & \cdots & 1 \\ \beta_1 & \cdots & \beta_n \\ \vdots & \ddots & \vdots \\ \beta_1^{n - 1} & \cdots & \beta_n^{n - 1} \end{array}\right].$ (This is the transpose of what's typically called the Vandermonde but it's fine this way, we're over it.) To show that $$A_n$$ is invertible we'll define a fancy matrix $$F_n$$ and show that $$A_n = F_n V_n$$. This will be important, since the determinant of $$V_n$$ is well-known and (we'll see) the determinant of $$F_n$$ can be easily computed. But first, the definition: $\left[ \begin{array}{c c c c c c c} 1 & 0 & 0 & 0 & \cdots & 0 & 0 \\ 0 & 0 & 0 & 1 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & 0 & \cdots & 1 & 0 \\ 0 & 0 & 0 & 0 & \cdots & 0 & 1 \\ f_{n - 1,1} & f_{n - 1, 2} & f_{n - 1, 3} & f_{n - 1, 4} & \cdots & f_{n - 1, n - 1} & f_{n - 1, n} \\ f_{n,1} & f_{n, 2} & f_{n, 3} & f_{n, 4} & \cdots & f_{n, n - 1} & f_{n, n} \end{array}\right].$ This display alone is not a complete definition, but it indicates that the last two rows are where the real content is. The first $$n - 2$$ rows make plenty of sense: row $$1$$ is equal to $$e_1^T$$ and then row $$i$$ for $$2 \leq i \leq n - 2$$ is equal to $$e_{i + 2}^T$$. Before defining what the unnamed $$f_{i, j}$$ are, we can compute both $$\det(F_n)$$ and the product $$F_n V_n$$ in terms of these. First, since we have a shifted identity in the first $$n - 2$$ rows, the only terms in the cofactor expansion that are nonzero will be from the second and third columns and the last two rows. Hence, up to a sign we know $\pm \det(F_n) = f_{n - 1, 2} f_{n, 3} - f_{n, 2} f_{n - 1, 3}.$ (See the \hyperref[det-sign]{note} below to see that this is the correct sign.)
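To make the shape concrete (an illustration of mine, not part of the original notes), the $$n = 4$$ instance is $F_4 = \left[ \begin{array}{c c c c} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ f_{3, 1} & f_{3, 2} & f_{3, 3} & f_{3, 4} \\ f_{4, 1} & f_{4, 2} & f_{4, 3} & f_{4, 4} \end{array}\right],$ where the ``shifted identity'' occupies the first $$n - 2 = 2$$ rows and only columns $$2$$ and $$3$$ meet the last two rows in the cofactor expansion.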
To compute some parts of $$F_n V_n$$ note that the product $$e_i^T M$$ represents the $$i$$th row of $$M$$. Hence, the majority of the multiplication just moves rows: $F_n V_n = \left[ \begin{array}{c} e_1^T \\ e_4^T \\ \vdots \\ e_n^T \\ f_{n - 1}^T \\ f_n^T \end{array}\right] \left[ \begin{array}{c c c} 1 & \cdots & 1 \\ \beta_1 & \cdots & \beta_n \\ \vdots & \ddots & \vdots \\ \beta_1^{n - 1} & \cdots & \beta_n^{n - 1} \end{array}\right] = \left[ \begin{array}{c c c} 1 & \cdots & 1 \\ \beta_1^3 & \cdots & \beta_n^3 \\ \vdots & \ddots & \vdots \\ \beta_1^{n - 1} & \cdots & \beta_n^{n - 1} \\ \multicolumn{3}{ c }{f_{n - 1}^T V_n} \\ \multicolumn{3}{ c }{f_n^T V_n} \end{array}\right].$ The first $$n - 2$$ rows of $$F_n V_n$$ match those of $$A_n$$ so to show equality we need to show the last $$2$$ match as well. Hence at this point we'll define the last $$2$$ rows of $$F_n$$. These will be defined in terms of the elementary symmetric polynomials. We'll use $$s_k$$ to denote the $$k$$th symmetric polynomial; for example $$s_1 = \beta_1 + \cdots + \beta_n$$, $$s_2 = \sum_{1 \leq i < j \leq n} \beta_i \beta_j$$ and $$s_n = \beta_1 \cdots \beta_n$$. For row $$n - 1$$, we have $f_{n - 1, k} = (-1)^{n - k} s_{n - k + 1}.$ For row $$n$$, we first have $f_{n, 1} = (-1)^{n - 1} s_1 s_n$ and $f_{n, k} = (-1)^{n - k} \left[s_1 s_{n - k + 1} - s_{n - k + 2}\right]$ for $$2 \leq k \leq n$$. Viewing this, it makes sense that we defined $$f_{n, 1}$$ separately since for $$k = 1$$, $$s_{n - k + 2} = s_{n + 1}$$, which is not defined since we are drawing from only $$n$$ indeterminates.

With this, we are ready to compute $$\det(F_n)$$ and confirm the last two rows of the product. If we define $g(X) = (X - \beta_1) \cdots (X - \beta_n)$ then we know by Vieta's relations that $g(X) = X^n - s_1 X^{n - 1} + s_2 X^{n - 2} + \cdots + (-1)^{n} s_n$ (and by construction that $$g(\beta_i) = 0$$ for each $$i$$). To compute row $$n - 1$$ of $$F_n V_n$$ note that since each column of $$V_n$$ is $$\left[\begin{array}{c c c c} 1 & \beta_i & \cdots & \beta_i^{n - 1} \end{array}\right]^T$$, entry $$i$$ in row $$n - 1$$ is \begin{align*} \sum_{k = 1}^n f_{n - 1, k} \cdot \beta_i^{k - 1} &= \sum_{k = 1}^n (-1)^{n - k} s_{n - k + 1} \beta_i^{k - 1} \\ &= (-1)^{n - 1} s_n + (-1)^{n - 2} s_{n - 1} \beta_i + \cdots + (-1)^0 s_1 \beta_i^{n - 1} \\ &= -\left(g(\beta_i) - \beta_i^n\right) = \beta_i^n. \end{align*} By similar logic, entry $$i$$ in row $$n$$ is \begin{align*} \sum_{k = 1}^n f_{n, k} \cdot \beta_i^{k - 1} &= (-1)^{n - 1} s_1 s_n + \sum_{k = 2}^n (-1)^{n - k} \left[s_1 s_{n - k + 1} - s_{n - k + 2}\right] \beta_i^{k - 1} \\ &= s_1 \left[(-1)^{n - 1} s_n + (-1)^{n - 2} s_{n - 1} \beta_i + \cdots + (-1)^0 s_1 \beta_i^{n - 1}\right] \\ &- \left[(-1)^{n - 2} s_n \beta_i + \cdots + (-1)^0 s_2 \beta_i^{n - 1}\right] \\ &= s_1 \left[\beta_i^n\right] - \beta_i \left[(-1)^n s_n + \cdots + (-1)^2 s_2 \beta_i^{n - 2}\right] \\ &= s_1 \beta_i^n - \beta_i \left[g(\beta_i) - \beta_i^n + s_1 \beta_i^{n - 1} \right] \\ &= s_1 \left[\beta_i^n - \beta_i \cdot \beta_i^{n - 1}\right] + \beta_i \cdot \beta_i^n \\ &= \beta_i^{n + 1}.
\end{align*} To compute $$\det(F_n)$$, note that it involves the following: \begin{align*} f_{n - 1, 2} &= (-1)^{n - 2} s_{n - 1} = (-1)^n s_{n - 1} \\ f_{n - 1, 3} &= (-1)^{n - 3} s_{n - 2} = (-1)^{n - 1} s_{n - 2} \\ f_{n, 2} &= (-1)^{n - 2}\left[s_1 s_{n - 1} - s_n\right] = (-1)^n \left[s_1 s_{n - 1} - s_n\right] \\ f_{n, 3} &= (-1)^{n - 3}\left[s_1 s_{n - 2} - s_{n - 1}\right] = (-1)^{n - 1} \left[s_1 s_{n - 2} - s_{n - 1}\right]. \end{align*} Combining these with our previous expression, \begin{align*} \det(F_n) &= f_{n - 1, 2} f_{n, 3} - f_{n, 2} f_{n - 1, 3} \\ &= (-1)^{2n - 1} s_{n - 1} \left[s_1 s_{n - 2} - s_{n - 1}\right] - (-1)^{2n - 1} \left[s_1 s_{n - 1} - s_n\right] s_{n - 2} \\ &= s_1 s_{n - 1} s_{n - 2} - s_n s_{n - 2} - \left[s_1 s_{n - 1} s_{n - 2} - s_{n - 1}^2\right] \\ &= s_{n - 1}^2 - s_n s_{n - 2}. \end{align*}

\paragraph{Showing $$A_n$$ is invertible} Assuming the $$\beta_i$$ are distinct, we can show $$A_n$$ is invertible by showing that it has nonzero determinant. Since we showed $$A_n = F_n V_n$$ we know that $\det(A_n) = (s_{n - 1}^2 - s_n s_{n - 2}) \prod_{1 \leq i < j \leq n} ( \beta_j - \beta_i)$ where the product is the well-known Vandermonde determinant. Since the $$\beta_i$$ are distinct, the only way this can be zero is if $$s_{n - 1}^2 - s_n s_{n - 2} = 0$$. We'll rewrite this expression as a sum of squares and show that it can't be $$0$$ when the $$\beta_i$$ are distinct. Since the subsets of size $$k$$ and $$n - k$$ are in bijection (via complements) we can write $s_{n - 1} = \sum_{i = 1}^n \prod_{j \neq i} \beta_j.$ If we define the formal sum $s_{-1} = \sum_{i = 1}^n \frac{1}{\beta_i}$ then we see that $$s_{n - 1} = s_n s_{-1}$$. This sum may not necessarily be defined (e.g. if some $$\beta_i = 0$$), but using it to manipulate sums formally is perfectly fine so long as it is multiplied by $$s_n$$, since this will remove all the potentially problematic terms from the denominator. Similarly, define $$s_{-2}$$ so that $$s_{n - 2} = s_n s_{-2}$$: $s_{-2} = \sum_{1 \leq i < j \leq n} \frac{1}{\beta_i \beta_j}.$ With this, $$s_{n - 1}^2 - s_n s_{n - 2}$$ becomes $$s_n^2\left(s_{-1}^2 - s_{-2}\right)$$. To analyze this, note that (formally) $s_{-1}^2 - s_{-2} = \left(\sum_{i = 1}^n \frac{1}{\beta_i^2} + 2 s_{-2} \right) - s_{-2} = \sum_{i = 1}^n \frac{1}{\beta_i^2} + s_{-2}.$ To write this as a sum of squares, we combine the above fact about $$s_{-1}^2$$ with another formal sum: $\sum_{1 \leq i < j \leq n} \left(\frac{1}{\beta_i} + \frac{1}{\beta_j} \right)^2 = (n - 1) \sum_{i = 1}^n \frac{1}{\beta_i^2} + 2 s_{-2}$ (where the coefficient is $$n - 1$$ since each $$\beta_i$$ is paired with $$\binom{n - 1}{1}$$ other $$j$$). Combining these (formally) we have a sum of squares: \begin{align*} & (n - 3)s_{-1}^2 + \sum_{1 \leq i < j \leq n} \left(\frac{1}{\beta_i} + \frac{1}{\beta_j}\right)^2 \\ &= ((n - 3) + (n - 1)) \sum_{i = 1}^n \frac{1}{\beta_i^2} + (2(n - 3) + 2) s_{-2} \\ &= (2n - 4)\left[\sum_{i = 1}^n \frac{1}{\beta_i^2} + s_{-2}\right] \\ &= 2(n - 2)\left[s_{-1}^2 - s_{-2}\right]. \end{align*} Multiplying back by $$s_n^2$$, we have $s_{n - 1}^2 - s_n s_{n - 2} = \frac{(n - 3) s_{n - 1}^2 + \displaystyle \sum_{1 \leq i < j \leq n} \left(\frac{s_n}{\beta_i} + \frac{s_n}{\beta_j}\right)^2}{2(n - 2)}.$ (Note that $$n = 2$$ will cause problems, but we won't in the end be considering $$n$$ below $$3$$.)
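Before putting this identity to use, here is a quick symbolic spot check of the determinant factorization at $$n = 4$$ (a sketch of mine, assuming \texttt{sympy} and Python 3; not part of the original notes):
\begin{verbatim}
import itertools
import sympy as sp

b = sp.symbols('b1:5')  # beta_1, ..., beta_4, fully symbolic

def s(j):
    # Elementary symmetric polynomial s_j in the beta_i.
    return sum(sp.prod(c) for c in itertools.combinations(b, j))

# A_4 has rows of exponents 0, 3, 4, 5.
A = sp.Matrix([[bi**e for bi in b] for e in (0, 3, 4, 5)])
vandermonde = sp.prod(b[j] - b[i]
                      for i in range(4) for j in range(i + 1, 4))
claimed = (s(3)**2 - s(4) * s(2)) * vandermonde
print(sp.expand(A.det() - claimed))  # 0
\end{verbatim}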
Since the right-hand side above is a sum of squares (divided by $$2(n - 2)$$), $$s_{n - 1}^2 - s_n s_{n - 2}$$ can only be $$0$$ when $$s_{n - 1} = 0$$ (so long as $$n > 3$$) and $\frac{s_n}{\beta_i} + \frac{s_n}{\beta_j} = 0 \text{ for all } 1 \leq i < j \leq n.$ Introducing the intermediate terms $$S_i = \frac{s_n}{\beta_i} = \prod_{k \neq i} \beta_k$$, we know that $$s_{n - 1} = \sum_{k = 1}^n S_k$$ and the other condition becomes $$S_i = - S_j$$ for all $$j \neq i$$. This forces $0 = \sum_{k = 1}^n S_k = S_i + \sum_{k \neq i} S_k = S_i + (n - 1)(-S_i) = (2 - n) S_i.$ Again, since $$n > 2$$, this forces $$-S_j = S_i = 0$$. Now since $$\prod_{k \neq i} \beta_k = S_i = 0$$, we must have $$\beta_{k_0} = 0$$ for some $$k_0 \neq i$$. But we also know that $$0 = S_{k_0} = \prod_{k \neq k_0} \beta_k$$ hence there is some $$k_1 \neq k_0$$ such that $$\beta_{k_1} = 0 = \beta_{k_0}$$. But since $$k_1 \neq k_0$$, this would give us nondistinct $$\beta_i$$ (both $$0$$), which is not allowed. Hence, the sum of squares $$s_{n - 1}^2 - s_n s_{n - 2}$$ can never be $$0$$ and so neither can $$\det(A_n)$$.

\paragraph{Note about Sign of Determinant} \label{det-sign} We can show the sign is correct by considering the two permutations $$\tau, \sigma$$ that result in a nonzero product in $\det F_n = \sum_{\sigma' \in S_n} \operatorname{sgn}(\sigma') \prod_{i = 1}^n f_{i, \sigma'(i)}.$ We need $$\sigma(1) = \tau(1) = 1$$ (so $$1$$ is a fixed point, not part of any nontrivial cycle). We also need $$\sigma(i) = \tau(i) = i + 2$$ for $$2 \leq i \leq n - 2$$ due to the column shift. This leaves us with only two choices for the remaining inputs. If we let $$\tau(n - 1) = 2, \tau(n) = 3$$ and $$\sigma(n - 1) = 3, \sigma(n) = 2$$, then we have $\det F_n = \operatorname{sgn}(\tau) f_{n - 1, 2} f_{n, 3} + \operatorname{sgn}(\sigma) f_{n - 1, 3} f_{n, 2}$ since $$f_{1, 1} = f_{i, i + 2} = 1$$ for all other $$i$$. Clearly, since $$\sigma = (2 3) \cdot \tau$$ we know that $$\operatorname{sgn}(\tau) = - \operatorname{sgn}(\sigma)$$ so it's enough to find $$\operatorname{sgn}(\tau)$$. If $$n$$ is even, then we have $2 \xrightarrow{\tau} 4 \xrightarrow{\tau} \cdots \xrightarrow{\tau} n \xrightarrow{\tau} 3 \xrightarrow{\tau} 5 \xrightarrow{\tau} \cdots \xrightarrow{\tau} (n - 1) \xrightarrow{\tau} 2$ hence $$\tau = (2 \, 4 \cdots n \, 3 \cdots (n - 1))$$ is a cycle of length $$n - 1$$ (it contains every number except $$1$$), hence $$\operatorname{sgn}(\tau) = 1$$ (cycles of odd length are even permutations). Similarly, if $$n$$ is odd, then $2 \xrightarrow{\tau} 4 \xrightarrow{\tau} \cdots \xrightarrow{\tau} (n - 1) \xrightarrow{\tau} 2; \quad 3 \xrightarrow{\tau} 5 \xrightarrow{\tau} \cdots \xrightarrow{\tau} n \xrightarrow{\tau} 3.$ Hence $$\tau$$ is the product of two disjoint cycles of the same length, so it has sign $$(\pm 1)^2 = 1$$ here as well.
Hence we always have $1 = \operatorname{sgn}(\tau) = - \operatorname{sgn}(\sigma)$ forcing $\det(F_n) = f_{n - 1, 2} f_{n, 3} - f_{n, 2} f_{n - 1, 3}.$
\end{document}

\documentclass[letterpaper,10pt]{article}
% http://stackoverflow.com/a/11331659/1068170
\usepackage{upquote}
\usepackage[margin=1in]{geometry}
\usepackage{amsmath,amsthm,amssymb,hyperref,fancyhdr}
%% https://tex.stackexchange.com/a/396829/32270
\allowdisplaybreaks
\renewcommand*{\today}{November 27, 2013}
\renewcommand{\qed}{$$\blacksquare$$}
\hypersetup{colorlinks=true,urlcolor=blue}
\pagestyle{fancy}
\lhead{}
\rhead{Toying Around with Finite Difference Approximations \\ Danny Hermes}
\renewcommand{\headrulewidth}{0pt}
\DeclareMathOperator{\rank}{rank}
\title{Finite Approximations of $$u^{(k)}(c)$$}

\begin{document}
\maketitle
\pagebreak
\tableofcontents

\section{Introduction}
If we seek to approximate the derivative $$u^{(k)}$$ (of a function of one real variable) with $$n$$ points, we can do so by taking advantage of Taylor's theorem with well chosen points. Doing so is in many ways quite natural, and a few natural questions arise when creating these approximations.

\section{Motivation}
Take the following as motivation. Due to the definition of the derivative, we can approximate $$u'(c)$$ with $$2$$ points via quantities like $D_{+} u(c) = \frac{u(c + h) - u(c)}{h}, \, D_{-} u(c) = \frac{u(c) - u(c - h)}{h}, \text{ and } D_0 u(c) =\frac{u(c + h) - u(c - h)}{2h}.$ Taylor's theorem tells us that $u(c \pm h) = u(c) \pm h u'(c) + \frac{h^2}{2} u''(c) \pm \frac{h^3}{6} u'''(c) + \frac{h^4}{24} u^{(4)}(c) + \mathcal{O}(h^5),$ so that $D_{\pm} u(c) - u'(c) = \frac{1}{h} \mathcal{O}\left(\pm \frac{h^2}{2} u''(c) \right) = \mathcal{O}(h) \text{ and } D_0 u(c) - u'(c) = \frac{1}{h} \mathcal{O}\left(\frac{2 h^3}{6} u'''(c)\right) = \mathcal{O}(h^2).$ Additionally, in a mean-value theorem inspired mode, we define $D^2 u(c) = \frac{\frac{u(c + h) - u(c)}{h} - \frac{u(c) - u(c - h)}{h}}{h} = \frac{u(c + h) - 2 u(c) + u(c - h)}{h^2}$ as an approximation for $$u''(c)$$ so that $D^2 u(c) - u''(c) = \frac{1}{h^2} \mathcal{O}\left(\frac{2 h^4}{24} u^{(4)}(c)\right) = \mathcal{O}(h^2).$ We see that we can (easily) find three different approximations for $$u'(c)$$ using only \textbf{two} points and with orders of accuracy $$\mathcal{O}(h)$$ and $$\mathcal{O}(h^2)$$. Similarly, we can find an approximation for $$u''(c)$$ using \textbf{three} points with order of accuracy $$\mathcal{O}(h^2)$$.

\section{Natural Questions}
Along with the conclusions of our motivating examples, a few natural questions arise: \begin{itemize} \item Given an approximation of $$u^{(k)}(c)$$ with $$n$$ points, what is the maximum order of accuracy? \item What is the largest $$k$$ such that $$u^{(k)}(c)$$ can be approximated with $$n$$ points? \item What conditions on the points must be satisfied to achieve the maximum order of accuracy? \end{itemize}

\section{Framework for approximations}
Before answering these questions, we'll introduce some notation and explicitly write down what it means to say ``approximate with $$n$$ points''.
In general, if we want to approximate $$u^{(k)}(c)$$ with $$n$$ points, we seek to find coefficients $$\left\{\alpha_i\right\}_{i = 1}^n$$ and \textbf{distinct} points $$\left\{c_i\right\}_{i = 1}^n$$ so that $u^{(k)}(c) \approx \sum_{i = 1}^n \alpha_i u(c_i).$ In order to get the order in terms of $$\mathcal{O}(h^p)$$ we instead write our points as $$\left\{c + \beta_i h\right\}_{i = 1}^n$$, which are centered around $$c$$ and whose differences from the center $$\left\{c - c_i\right\}$$ are $$\mathcal{O}(h)$$. The condition that the $$c_i$$ be distinct also means that the $$\beta_i$$ must be distinct. Using these points in a Taylor approximation of order $$p$$ we have: $u(c + \beta_i h) = \sum_{j = 0}^p u^{(j)}(c) \frac{\left( \beta_i h\right)^j}{j!} + \mathcal{O}\left(h^{p + 1}\right).$ Expanding these within $$\sum_{i = 1}^n \alpha_i u(c + \beta_i h)$$ we have $u^{(k)}(c) \approx \sum_{j = 0}^p u^{(j)}(c) \left[\sum_{i = 1}^n \alpha_i \beta_i^j\right] \frac{h^j}{j!} + \mathcal{O}\left(h^{p + 1} \sum_{i = 1}^n \alpha_i \right).$ We approximate $$u^{(k)}(c)$$ by choosing the coefficients of the $$u^{(j)}(c)$$ in the sum; for the $$u^{(k)}(c)$$ term to appear at all we need $$p \geq k$$. In order for all the terms in the Taylor expansion to approximate our derivative, we need to satisfy the system: $\left[ \begin{array}{c c c} 1 & \cdots & 1 \\ \beta_1 & \cdots & \beta_n \\ \vdots & \ddots & \vdots \\ \beta_1^p & \cdots & \beta_n^p \end{array}\right] \left[ \begin{array}{c} \alpha_1 \\ \vdots \\ \alpha_n \end{array}\right] = \left[\underbrace{0 \, \cdots \, 0}_{\text{first } k \text{ terms}} \, \frac{k!}{h^k} \, \underbrace{0 \, \cdots \, 0}_{\text{last } (p - k) \text{ terms}}\right]^T.$
%% H/T: http://tex.stackexchange.com/a/180269/32270
\section[\texorpdfstring{Largest $k$ given $n$}{Largest k given n}]{Largest $$k$$ given $$n$$}
Using the above description we can show that we must have $$k < n$$. Assuming the contrary --- $$k \geq n$$ --- the right-hand sides of the first $$n$$ equations of the system must be $$0$$. This is because they correspond to the terms $u(c) h^0, u'(c) h^1, \ldots, u^{(n - 1)}(c) h^{n - 1},$ all of which come before $$u^{(k)}(c) = u^{(n)}(c)$$. These first $$n$$ rows result in the square system $$V_n \boldsymbol{\alpha} = \mathbf{0}$$ where $$V_n$$ is the Vandermonde matrix $\left[ \begin{array}{c c c} 1 & \cdots & 1 \\ \beta_1 & \cdots & \beta_n \\ \vdots & \ddots & \vdots \\ \beta_1^{n - 1} & \cdots & \beta_n^{n - 1} \end{array}\right].$ (This is the transpose of what's typically called the Vandermonde but this is the matrix that arises for our problem.) However, it's well-known that $$\det(V_n) = \displaystyle \prod_{1 \leq i < j \leq n} (\beta_j - \beta_i)$$. Since our $$\beta_i$$ are distinct, we see $$\det(V_n) \neq 0$$, which tells us that $$\boldsymbol{\alpha} = V_n^{-1} \mathbf{0} = \mathbf{0}$$. This of course results in the approximation $u^{(k)}(c) \approx 0$ which is $$\mathcal{O}(1)$$ and hence a very poor approximation. Effectively, this means that we can't approximate (in any reasonable way) $$u^{(k)}(c)$$ using $$n$$ points when $$k \geq n$$.
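To make the framework concrete before continuing, here is a small numeric sketch (mine, assuming \texttt{numpy} and Python 3; it does not appear in the original notes) that solves the square system $$V_n \boldsymbol{\alpha} = \frac{k!}{h^k} e_{k + 1}$$ for $$n = 3$$, $$k = 1$$, $$\beta = (-1, 0, 1)$$. It recovers the centered difference $$D_0$$ from the motivation, and the printed errors shrink like $$h^2$$:
\begin{verbatim}
import math
import numpy as np

beta = np.array([-1.0, 0.0, 1.0])
n, k = 3, 1

for h in (0.1, 0.05, 0.025):
    # Rows of V_n are 1, beta, beta^2; the RHS puts k!/h^k in slot k.
    V = np.vander(beta, N=n, increasing=True).T
    rhs = np.zeros(n)
    rhs[k] = math.factorial(k) / h**k
    alpha = np.linalg.solve(V, rhs)          # -> [-1/(2h), 0, 1/(2h)]
    approx = alpha @ np.sin(1.0 + beta * h)  # u = sin, c = 1
    print(h, abs(approx - math.cos(1.0)))    # error shrinks like h^2
\end{verbatim}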
\section{Maximum order of accuracy}
To determine the order of accuracy of a given approximation, assume there is a solution $$\left\{\alpha_i\right\}, \left\{\beta_i\right\}$$ of $\left[ \begin{array}{c c c} 1 & \cdots & 1 \\ \beta_1 & \cdots & \beta_n \\ \vdots & \ddots & \vdots \\ \beta_1^p & \cdots & \beta_n^p \end{array}\right] \left[ \begin{array}{c} \alpha_1 \\ \vdots \\ \alpha_n \end{array}\right] = \left[\underbrace{0 \, \cdots \, 0}_{\text{first } k \text{ terms}} \, \frac{k!}{h^k} \, \underbrace{0 \, \cdots \, 0}_{\text{last } (p - k) \text{ terms}}\right]^T.$ If we consider the $$\beta_i$$ to be fixed, we have $$\alpha_i = \mathcal{O}\left(\frac{1}{h^k}\right)$$ due to the strong influence of the nonzero entry on the right-hand side of the system. This results in the approximation \begin{align*} & u^{(k)}(c) \left[\sum_{i = 1}^n \alpha_i \beta_i^k\right] \frac{h^k}{k!} + \sum_{0 \leq j \leq p, j \neq k} u^{(j)}(c) \left[\sum_{i = 1}^n \alpha_i \beta_i^j\right] \frac{h^j}{j!} + \mathcal{O}\left(h^{p + 1} \sum_{i = 1}^n \alpha_i\right) \\ &= u^{(k)}(c) \left[\frac{k!}{h^k}\right] \frac{h^k}{k!} + \sum_{0 \leq j \leq p, j \neq k} u^{(j)}(c) \left[0\right] \frac{h^j}{j!} + \mathcal{O}\left( h^{p + 1} \frac{1}{h^k}\right) \\ &= u^{(k)}(c) + \mathcal{O}\left(h^{p - k + 1}\right). \end{align*} This of course depends on the existence of a solution. We've already said that we need $$p \geq k$$ and $$n > k$$. We claim (and will show) that \begin{enumerate} \item No solution exists for $$p \geq n + 1$$. \item A family of solutions does exist for $$p = n$$. \end{enumerate} Assuming both of the above facts, this actually tells us that the best approximation we can get is $\sum_{i = 1}^n \alpha_i u(c + \beta_i h) - u^{(k)}(c) = \mathcal{O} \left(h^{n - k + 1}\right).$

\section{A preliminary fact}
In order to actually show that no solution exists when $$p \geq n + 1$$ and that a solution does exist for $$p = n$$, we will later show that some matrix is invertible and that some other matrix has a one-dimensional kernel. As we'll see, doing this will involve factoring the Vandermonde matrix $$V_n$$ out of some other matrix and hence will require representing powers $$\beta_i^j$$ in terms of the columns of $$V_n$$; these must be linear combinations of $$\left\{1, \beta_i, \ldots, \beta_i^{n - 1}\right\}$$. For $$0 \leq j \leq n - 1$$, $$\beta_i^j$$ is trivially such a linear combination. However, for higher powers we'll need to utilize Vieta's relations. To do so, we define $g(X) = (X - \beta_1) \cdots (X - \beta_n)$ and we know by Vieta's relations that $g(X) = X^n - s_1 X^{n - 1} + s_2 X^{n - 2} + \cdots + (-1)^{n} s_n$ and by construction that $$g(\beta_i) = 0$$ for each $$i$$. In the above, the $$s_j$$ are the elementary symmetric polynomials in the $$\beta_i$$. For example $s_1 = \beta_1 + \cdots + \beta_n, \, s_2 = \sum_{1 \leq i < j \leq n} \beta_i \beta_j, \text{ and } s_n = \beta_1 \cdots \beta_n.$ Using these, we can write both $$\beta_i^n$$ and $$\beta_i^{n + 1}$$ as linear combinations (with coefficients determined by the $$s_j$$) of the lower powers of $$\beta_i$$, as desired.
First, $0 = g(\beta_i) = \beta_i^n - s_1 \beta_i^{n - 1} + s_2 \beta_i^{n - 2} + \cdots + (-1)^{n} s_n \Rightarrow \beta_i^n = s_1 \beta_i^{n - 1} - s_2 \beta_i^{n - 2} + \cdots + (-1)^{n - 1} s_n.$ If we wanted to compute this in terms of $$V_n$$, we can see this as the dot product $\left[\begin{array}{c c c c c} (-1)^{n - 1} s_n & (-1)^{n - 2} s_{n - 1} & \cdots & -s_2 & s_1 \end{array}\right] (V_n)_i$ (where $$(V_n)_i$$ is the $$i$$th column). For future use, we'll denote this row vector as $$R_1^T$$ (we use $$1$$ since it's the vector which produces $$1$$ extra power of $$\beta_i$$ from the first $$n$$ nonnegative powers). By similar logic, to compute $$\beta_i^{n + 1}$$ we see that \begin{align*} 0 = \beta_i g(\beta_i) &= \beta_i^{n + 1} - s_1 \beta_i^n + s_2 \beta_i^{n - 1} + \cdots + (-1)^{n} s_n \beta_i \\ \Rightarrow \beta_i^{n + 1} &= s_1 \beta_i^n - s_2 \beta_i^{n - 1} + \cdots + (-1)^{n - 1} s_n \beta_i \\ &= s_1 \left[s_1 \beta_i^{n - 1} - s_2 \beta_i^{n - 2} + \cdots + (-1)^{n - 1} s_n\right] \\ &- s_2 \beta_i^{n - 1} + \cdots + (-1)^{n - 1} s_n \beta_i \\ &= \left[s_1 s_1 - s_2\right] \beta_i^{n - 1} - \left[s_1 s_2 - s_3\right] \beta_i^{n - 2} + \cdots \\ &+ (-1)^{n - 2} \left[s_1 s_{n - 1} - s_n\right] \beta_i + (-1)^{n - 1} s_1 s_n. \end{align*} Again, viewing this in terms of matrix computations we see it as the dot product $\left[\begin{array}{c c c c c} (-1)^{n - 1} s_1 s_n & (-1)^{n - 2} \left( s_1 s_{n - 1} - s_n\right) & \cdots & -\left(s_1 s_2 - s_3\right) & \left( s_1 s_1 - s_2\right) \end{array}\right] (V_n)_i.$ For future use, we'll denote this row vector as $$R_2^T$$.

\section[\texorpdfstring{No solution exists for $p \geq n + 1$}{No solution exists for p geq n + 1}]{No solution exists for $$p \geq n + 1$$}
Now with the machinery we need, we're ready to prove that accuracy greater than or equal to $$(n + 1) - k + 1$$ is not possible. For such $$p$$, the system $\left[ \begin{array}{c c c} 1 & \cdots & 1 \\ \beta_1 & \cdots & \beta_n \\ \vdots & \ddots & \vdots \\ \beta_1^p & \cdots & \beta_n^p \end{array}\right] \left[ \begin{array}{c} \alpha_1 \\ \vdots \\ \alpha_n \end{array}\right] = \left[\underbrace{0 \, \cdots \, 0}_{\text{first } k \text{ terms}} \, \frac{k!}{h^k} \, \underbrace{0 \, \cdots \, 0}_{\text{last } (p - k) \text{ terms}}\right]^T$ must contain the square subsystem $\left[ \begin{array}{c c c} 1 & \cdots & 1 \\ \vdots & \ddots & \vdots \\ \beta_1^n & \cdots & \beta_n^n \\ \beta_1^{n + 1} & \cdots & \beta_n^{n + 1} \end{array}\right] \left[ \begin{array}{c} \alpha_1 \\ \vdots \\ \alpha_n \end{array}\right] = \left[ \begin{array}{c} 0 \\ \vdots \\ 0 \end{array}\right]$ where the matrix on the left does not contain the row corresponding to $$u^{(k)}(c)$$ or the row directly after it. (This last condition is to make the system square since the exponents $$0, \ldots, n + 1$$ occupy $$n + 2$$ rows and we need to remove $$2$$ of those rows, one of which must be the nonzero row corresponding to $$k < n$$.) When $$k = n - 1$$, removing the row directly after $$u^{(k)}(c)$$ would remove the powers $$\beta_i^n$$, so in this case we remove the row directly before it since we'll end up working with those powers. We introduce the notation $$A_n(k)$$ for the ``augmented'' Vandermonde matrix described above. Note that the last statement is equivalent to $$A_n(n - 1) = A_n(n - 2)$$ and for $$k < n - 1$$ the rows of $$A_n(k)$$ correspond to the powers $$\left\{0, \ldots, k - 1, k + 2, \ldots, n, n + 1\right\}$$ of the $$\beta_i$$.
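For concreteness, a quick spot check of mine (assuming \texttt{sympy} and Python 3; not part of the original notes): for $$n = 4$$, $$k = 1$$ the rows of $$A_4(1)$$ carry the exponents $$\left\{0, 3, 4, 5\right\}$$, and its determinant is nonzero for an arbitrary choice of distinct $$\beta_i$$, in line with the invertibility claim proved next.
\begin{verbatim}
import sympy as sp

# n = 4, k = 1: keep exponents {0, 3, 4, 5} (drop k = 1 and k + 1 = 2).
beta = (1, 2, 3, 4)
A = sp.Matrix([[b**e for b in beta] for e in (0, 3, 4, 5)])
print(A.det())  # 19920, nonzero as the invertibility argument requires
\end{verbatim}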
Since we seek to show that no solution exists for $$p \geq n + 1$$, we'll show that $$A_n(k)$$ is invertible, which means the system $$A_n(k) \boldsymbol{\alpha} = \mathbf{0}$$ has only the trivial solution $$\boldsymbol{\alpha} = \mathbf{0}$$. As we saw above, this would mean our approximation was $$u^{(k)}(c) \approx 0$$, which is $$\mathcal{O}(1)$$ and not a valuable approximation.

\section{Invertibility of first augmented Vandermonde}
To show that $$A_n(k)$$ is invertible, we proceed much like we did before, but first show that we can factor it as $$A_n(k) = C_n(k) V_n$$ for some coefficient matrix $$C_n(k)$$ which depends on $$k$$. By construction, the first $$n - 2$$ rows of $$A_n(k)$$ are powers of $$\beta_i$$ lower than $$n$$, hence they are rows of $$V_n$$. Clearly $$e_j^T V_n$$ produces the $$j$$th row of $$V_n$$, which is the row of $$\beta_i^{j - 1}$$. As we showed above, $$R_1^T V_n$$ will produce the row of $$\beta_i^n$$ and $$R_2^T V_n$$ will produce the row of $$\beta_i^{n + 1}$$. Altogether, the row exponents $$\left\{0, \ldots, k - 1, k + 2, \ldots, n, n + 1 \right\}$$ of $$A_n(k)$$ correspond to multiplications by $$\left\{e_1^T, \ldots, e_k^T, e_{k + 3}^T, \ldots, R_1^T, R_2^T\right\}$$. So if we define $C_n(k) = \left[\begin{array}{c c c c c c c} e_1 & \cdots & e_k & e_{k + 3} & \cdots & R_1 & R_2 \end{array}\right]^T$ then we must have $$A_n(k) = C_n(k) V_n$$. (As before, since we define $$A_n(n - 1) = A_n(n - 2)$$ we'll also define $$C_n(n - 1) = C_n(n - 2)$$. Also, in the case that $$k = n - 2$$, $$e_{k + 3} = e_{n + 1}$$, so the description of $$C_n(n - 2)$$ shouldn't actually include $$e_{k + 3}$$. This makes perfect sense looking back at the exponents since the exponent $$k + 2 = n$$ corresponds to the row given by $$R_1$$ rather than the row given by $$e_{k + 3}$$.)

In order to show that $$A_n(k)$$ is invertible, it's enough to show that it has nonzero determinant for $$1 \leq k \leq n - 2$$. But $\det(A_n(k)) = \det(C_n(k)) \det(V_n).$ As we showed above, $$\det(V_n) \neq 0$$ since the $$\beta_i$$ are distinct, so it remains to show that $$\det(C_n(k)) \neq 0$$. Since the determinant is multilinear, in general $\det(\ldots, e_j, \ldots, v, \ldots) = \det(\ldots, e_j, \ldots, v - v_j e_j, \ldots);$ in other words, when computing the determinant, if one of the columns is the basis vector $$e_j$$ then we can assume all the other entries in row $$j$$ are $$0$$. We know the first $$n - 2$$ columns of $$C_n(k)^T$$ are $$e_j$$ for $$1 \leq j \leq n$$ except for $$j = k + 1, k + 2$$. Hence, the only entries of $$R_1, R_2$$ that contribute to the determinant will be in rows $$k + 1, k + 2$$ of $$C_n(k)^T$$. After removing all entries of $$R_1, R_2$$ but those in rows $$k + 1, k + 2$$ we have a $$2 \times 2$$ square matrix in the final two columns and an augmented copy of $$I_{n - 2}$$ in the first $$n - 2$$ columns, hence $\pm \det(C_n(k)) = R_{1, k + 1} R_{2, k + 2} - R_{1, k + 2} R_{2, k + 1}$ where the sign may be $$\pm 1$$ depending on the cofactor expansion. (We can actually show it's always $$1$$ but it's not needed here.) Hence to show $$\det(A_n(k)) \neq 0$$ it's enough to show that $$R_{1, k + 1} R_{2, k + 2} - R_{1, k + 2} R_{2, k + 1} \neq 0$$.
Converting this to the symmetric polynomials involved, since $$k \leq n - 2$$ we have $R_{1, k + 1} = (-1)^{n - k - 1} s_{n - k}, R_{1, k + 2} = (-1)^{n - k - 2} s_{n - k - 1}$ and $R_{2, k + 1} = (-1)^{n - k - 1} \left(s_1 s_{n - k} - s_{n - k + 1}\right), R_{2, k + 2} = (-1)^{n - k - 2} \left(s_1 s_{n - k - 1} - s_{n - k}\right).$ (We don't need to worry about $$R_{2, 1} = (-1)^{n - 1} s_1 s_n$$ --- which has a different form than the rest of $$R_2$$ --- since $$k + 1 \geq 2$$. Also since $$k \leq n - 2$$ the terms $$R_{r, k + 2}$$ will always be defined.) With these, we have \begin{align*} R_{1, k + 1} R_{2, k + 2} - R_{1, k + 2} R_{2, k + 1} &= (-1)^{n - k - 1} s_{n - k} (-1)^{n - k - 2} \left(s_1 s_{n - k - 1} - s_{n - k}\right) \\ &- (-1)^{n - k - 2} s_{n - k - 1} (-1)^{n - k - 1} \left(s_1 s_{n - k} - s_{n - k + 1}\right) \\ &= (-1)^{-3} \left(s_{n - k} s_1 s_{n - k - 1} - s_{n - k}^2\right)\\ &- (-1)^{-3} \left(s_{n - k - 1} s_1 s_{n - k} - s_{n - k - 1} s_{n - k + 1}\right) \\ &= s_{n - k}^2 - s_{n - k - 1} s_{n - k + 1}. \end{align*} However, by \href{http://en.wikipedia.org/wiki/Newton's\_inequalities}{Newton's inequalities}, we know that $\left(\frac{s_{n - k}}{\binom{n}{n - k}}\right)^2 \geq \frac{s_{n - k - 1}}{\binom{n}{n - k - 1}} \frac{s_{n - k + 1}}{\binom{n} {n - k + 1}} \Longleftrightarrow s_{n - k}^2 \geq \left(\frac{k + 1}{n - k} s_{n - k - 1}\right) \left(\frac{n - k + 1}{k} s_{n - k + 1}\right).$ Since both of the fractions $$\frac{k + 1}{k}$$ and $$\frac{n - k + 1} {n - k}$$ are greater than $$1$$ when $$1 \leq k \leq n - 2$$, we actually see that $$s_{n - k}^2 > s_{n - k - 1} s_{n - k + 1}$$ whenever $$s_{n - k - 1} s_{n - k + 1} > 0$$. If $$s_{n - k - 1} s_{n - k + 1} < 0$$ then $$s_{n - k}^2 \geq 0 > s_{n - k - 1} s_{n - k + 1}$$ so the case $$s_{n - k - 1} s_{n - k + 1} = 0$$ is the only other potentially problematic case. We'd like to show that $$s_{n - k}^2 - s_{n - k - 1} s_{n - k + 1} = 0$$ can't occur when $$s_{n - k - 1} s_{n - k + 1} = 0$$ and will show this by contradiction. Assuming the two hold, then we also have $$s_{n - k}^2 = 0$$ forcing $$s_{n - k} = s_{n - k - 1} = 0$$ or $$s_{n - k} = s_{n - k + 1} = 0$$. So it's enough to show that $$s_{j - 1} = s_j = 0$$ forces $$\beta_{i_0} = \beta_{i_1}$$ for some $$i_0 \neq i_1$$, which contradicts our requirement of distinct $$\beta_i$$. (This assumes $$n \geq 2$$. If $$n = 1$$ we can't approximate any derivatives by the above so it's not worth addressing.)

\section[\texorpdfstring{The equality $s_{j - 1} = s_j = 0$ can't occur for distinct $\beta_i$}{The equality s\{j-1\} = sj = 0 can't occur for distinct beta}]{The equality $$s_{j - 1} = s_j = 0$$ can't occur for distinct $$\beta_i$$}
If $$j = n$$ then $$0 = s_j = s_n$$ means that the product of all the $$\beta_i$$ is $$0$$, forcing $$\beta_{i_0} = 0$$ for some $$i_0$$. Since all the $$\binom{n}{n - 1}$$ terms in the sum $$s_{n - 1}$$ involve $$\beta_{i_0}$$ except for $$\binom{n - 1}{n - 1} = 1$$ of them, $$\beta_{i_0} = 0$$ combined with $$s_{n - 1} = 0$$ tells us that $0 = s_{n - 1} = \beta_{i_0} \cdot \left(\text{something}\right) + \prod_{i \neq i_0} \beta_i = \prod_{i \neq i_0} \beta_i.$ Hence, there is also $$i_1 \neq i_0$$ with $$\beta_{i_1} = 0$$. To do this for $$j < n$$, we'll need a \hyperref[reduction-lemma]{lemma} which reduces symmetric polynomials of $$n$$ points to symmetric polynomials of $$n - 1$$ points.
The lemma shows that for a given $$j < n$$ $\frac{s_j(\beta_1, \ldots, \beta_n)}{\binom{n}{j}} = \frac{s_j\left( \beta^{(1)}_1, \ldots, \beta^{(1)}_{n - 1}\right)}{\binom{n - 1}{j}}$ where $$\left\{\beta^{(1)}_i\right\}$$ is a set of $$n - 1$$ points, each of which lies between two consecutive $$\beta_i$$. (Note we have used $$s_j(\, \cdot \,)$$ without defining it. This is simply the elementary symmetric polynomial with respect to the variables listed.) Hence we see that $s_j(\beta_1, \ldots, \beta_n) = 0 \Longleftrightarrow s_j\left( \beta^{(1)}_1, \ldots, \beta^{(1)}_{n - 1}\right) = 0.$ From the $$\beta^{(1)}_{i}$$ we can get another set of $$n - 2$$ points that also have $$s_j = 0$$ and so on to even lower numbers of points. Let $$\beta^{(r)}_1, \ldots, \beta^{(r)}_{n - r}$$ denote this set of points after $$r$$ reductions (and for completeness of the notation let $$\beta^{(0)}_i = \beta_i$$). We'll show that $$s_{j - 1} = s_j = 0$$ forces at least two members of the $$\beta^{(n - j)}_i$$ to be zero, which in turn forces at least $$N_r = n - j - r + 2$$ members of the $$\beta^{(r)}_i$$ to be zero for $$0 \leq r \leq n - j$$. Hence when $$r = 0$$ we have at least $$N_0 = n - j + 2 \geq 1 + 2 = 3$$ members equal to $$0$$, which is not possible since we have distinct $$\beta^{(0)}_i = \beta_i$$. First, when $$r = n - j$$, we have $$n - r = j$$ points with $s_{j - 1}\left(\beta^{(r)}_1, \ldots, \beta^{(r)}_j\right) = s_j\left( \beta^{(r)}_1, \ldots, \beta^{(r)}_j\right) = 0.$ As we showed above, this means that at least two members of the $$\beta^{(r)}_i$$ are equal to $$0$$. Since we know that each $$\beta^{(r)}_i$$ lies between two of the $$\beta^{(r - 1)}_i$$, this also means that at least one of the $$\beta^{(r - 1)}_i$$ lies between the two $$\beta^{(r)}_i$$ which are equal to $$0$$. Clearly the only point $$p$$ with $$0 \leq p \leq 0$$ is $$p = 0$$, hence we see at least one of the $$\beta^{(r - 1)}_i$$ is $$0$$, say $$\beta^{(r - 1)}_{i_0} = 0$$. Since $$n - (r - 1) = j + 1$$, we have $$j + 1$$ points at this level and of the $$\binom{j + 1}{j}$$ terms in the sum $$s_j$$ only $$\binom{j}{j} = 1$$ of them doesn't involve $$\beta^{(r - 1)}_{i_0}$$. Hence the elementary symmetric polynomial $$\widehat{s}_j$$ for the $$j$$ points excluding $$\beta^{(r - 1)}_{i_0}$$ satisfies $$\widehat{s}_j = 0$$ as well since $0 = s_j = \beta^{(r - 1)}_{i_0} \cdot \left(\text{something}\right) + \widehat{s}_j = 0 + \widehat{s}_j = \widehat{s}_j.$ Similarly, of the $$\binom{j + 1}{j - 1}$$ terms in the sum $$s_{j - 1}$$ only $$\binom{j}{j - 1} = j$$ of them don't involve $$i_0$$. Hence the elementary symmetric polynomial $$\widehat{s}_{j - 1}$$ for the $$j$$ points excluding $$\beta^{(r - 1)}_{i_0}$$ satisfies $$\widehat{s}_{j - 1} = 0$$ as well. Thus we are again reduced to the situation where $$\widehat{s}_{j - 1} = \widehat{s}_j = 0$$ with $$j$$ points and so find distinct $$i_1, i_2$$ from the set excluding $$i_0$$ with $$\beta^{(r - 1)}_{i_1} = \beta^{(r - 1)}_{i_2} = 0$$. This gives us $$3$$ points, which agrees with $$N_r = n - (n - j - 1) - j + 2 = 3$$ when $$r = n - j - 1$$. Carrying forward, we see that if we have at least $$N_r$$ of the $$\beta^{(r)}_i$$ equal to $$0$$ then backing up a step will result in $$N_r - 1$$ of the $$\beta^{(r - 1)}_i$$ equal to $$0$$ since the points at the $$r$$ reduction must lie between the points at the $$r - 1$$ reduction. But this reduces the $$n - (r - 1)$$ points $$\beta^{(r - 1)}_i$$ to $$n - (r - 1) - (N_r - 1) = n - r - N_r + 2$$ potentially nonzero points.
For $$r = n - j$$ and $$r = n - j - 1$$ we saw that $$N_r = n - r - j + 2$$ so that at each stage we reduce to $$n - r - N_r + 2 = j$$ potentially nonzero points and have $$\widehat{s}_{j - 1} = \widehat{s}_j = 0$$. Again, by the same logic as above this gives two more $$\beta^{(r - 1)}_i = 0$$, bringing the total to $$N_{r - 1} = (N_r - 1) + 2 = N_r + 1$$. So in general, reducing $$r$$ by $$1$$ raises $$N_r$$ by $$1$$ and the value $j = n - r - N_r + 2$ remains invariant (this is needed to reduce to $$n - r - N_r + 2 = j$$ points when backing up a step/reducing $$r$$).

\section[\texorpdfstring{A family of solutions exists for $p = n$}{A family of solutions exists for p = n}]{A family of solutions exists for $$p = n$$}
Since we showed the system is degenerate when $$p > n$$, we know the maximal order for our approximation which has not yet been ruled out occurs when $$p = n$$. We'll show both that it's possible for the system to have a solution and give a condition on such a solution. When $$p = n$$ we have a system of $$p + 1 = n + 1$$ equations in $$2n$$ indeterminates: $\left[ \begin{array}{c c c} 1 & \cdots & 1 \\ \beta_1 & \cdots & \beta_n \\ \vdots & \ddots & \vdots \\ \beta_1^n & \cdots & \beta_n^n \end{array}\right] \left[ \begin{array}{c} \alpha_1 \\ \vdots \\ \alpha_n \end{array}\right] = \left[\underbrace{0 \, \cdots \, 0}_{\text{first } k \text{ terms}} \, \frac{k!}{h^k} \, \underbrace{0 \, \cdots \, 0}_{\text{last } (n - k) \text{ terms}}\right]^T.$ Before, to get a square system with all zeroes on the right, we had to remove \textbf{two} of the first $$n + 2$$ rows but now we have only one to remove and a unique choice at that: the row corresponding to $$\beta_i^k$$. As before we introduce the notation $$B_n(k)$$ for the ``augmented'' Vandermonde matrix where the rows of $$B_n(k)$$ correspond to the powers $$\left\{0, \ldots, k - 1, k + 1, \ldots, n\right\}$$ of the $$\beta_i$$. By similar calculations as above we must have $B_n(k) = \left[\begin{array}{c c c c c c c} e_1 & \cdots & e_k & e_{k + 2} & \cdots & R_1 \end{array}\right]^T V_n = D_n(k) V_n$ for some coefficient matrix $$D_n(k)$$. (In the case that $$k = n - 1$$, the $$e_{k + 2}$$ above does not belong since the exponent $$k + 1 = n$$ corresponds to $$R_1$$ directly and the first exponents $$0, \ldots, k - 1$$ are the shifted identity. As before we have $$k \leq n - 1$$ since we can't approximate any higher derivatives with $$n$$ points.) In order for a nontrivial solution to exist (the opposite of what we wanted above), $$B_n(k)$$ can't be invertible (also the opposite). Hence we need $$\det(B_n(k)) = 0$$. As we've seen, for distinct $$\beta_i$$, $$\det(V_n) \neq 0$$ so we need $$\det(D_n(k)) = 0$$. Again, due to multilinearity of the determinant, all the entries in the last row --- $$R_1$$ --- of $$D_n(k)$$ can be considered $$0$$ except for $$R_{1,k + 1}$$ since $$e_{k + 1}$$ is the only basis vector missing in the first $$n - 1$$ rows. Hence $\pm \det(D_n(k)) = R_{1, k + 1}$ where the sign may be $$\pm 1$$ depending on the cofactor expansion. (We can actually show it's always $$(-1)^{n - k - 1}$$ but it's not needed here.) Since $$R_{1, k + 1} = (-1)^{n - k - 1} s_{n - k}$$, we see that a \fbox{nontrivial solution can only exist if $$s_{n - k} = 0$$}.

\section{Inspecting the solutions}
Having $$s_{n - k} = 0$$ means that we can pick any $$\beta_1, \ldots, \beta_{n - 1}$$ we like and $$\beta_n$$ will be uniquely determined.
To see this, denote by $$\widehat{s}_j$$ the $$j$$th symmetric sum in $$\beta_1, \ldots, \beta_{n - 1}$$ (note that $$\beta_n$$ is excluded). With this, we have $s_{n - k} = \beta_n \widehat{s}_{n - k - 1} + \widehat{s}_{n - k}$ so that $$s_{n - k} = 0$$ gives us the unique (for this $$k$$) choice $\beta_n = - \frac{\widehat{s}_{n - k}}{\widehat{s}_{n - k - 1}}.$ (We define $$\widehat{s}_0 = 1$$. This is needed for the case $$s_1 = \beta_n \widehat{s}_0 + \widehat{s}_1$$ to make sense and also fits with the general convention that $$1$$ is the value of an empty product.) We see that if $$\widehat{s}_{n - k - 1} = 0$$ then there is no solution, so there is some part of $$\mathbf{R}^{n - 1}$$ where no solution exists, but this is a measure $$0$$ set.

Once we've picked our $$\left(\beta_1, \ldots, \beta_{n - 1}\right)$$, to find $$\boldsymbol{\alpha}$$ we need to determine the kernel of $$B_n(k)$$ since $$B_n(k) \boldsymbol{\alpha} = \mathbf{0}$$. Since the first $$n - 1$$ rows of $$D_n(k)$$ are linearly independent standard basis vectors we have $$\rank(D_n(k)) \geq n - 1$$. Since $$\det(D_n(k)) = 0$$ we know it can't have full rank, hence the rank is exactly $$n - 1$$, meaning the kernel of $$D_n(k)$$ is one dimensional. Hence the same is true of $$B_n(k)$$ since $$V_n$$ is invertible. This is quite helpful since it limits our choices of $$\boldsymbol{\alpha}$$ to a single scalar multiple of a basis vector of the kernel. To find a choice for this basis vector we start with $B_n(k) v = 0 \Longleftrightarrow (D_n(k) V_n)v = 0 \Longleftrightarrow D_n(k) w = 0, w = V_n v.$ But since the first $$n - 1$$ rows of $$D_n(k)$$ are the standard basis vectors this means $$0 = e_j^T w = w_j$$ for $$j \neq k + 1$$, making $$w_{k + 1}$$ the only nonzero component. Since $$R_1^T e_{k + 1} = R_{1, k + 1} = 0$$ by construction, $$w = e_{k + 1}$$ will do the trick as our basis for the kernel of $$D_n(k)$$, hence $$v = V_n^{-1} w = V_n^{-1} e_{k + 1}$$ works as a basis for the kernel of $$B_n(k)$$. So we have $$\boldsymbol{\alpha} = \ell V_n^{-1} e_{k + 1}$$ for some scalar $$\ell$$ and the final condition is $\left[\begin{array}{c c c} \beta_1^k & \cdots & \beta_n^k \end{array} \right] \boldsymbol{\alpha} = \frac{k!}{h^k} \Longleftrightarrow \ell \left[\begin{array}{c c c} \beta_1^k & \cdots & \beta_n^k \end{array} \right] V_n^{-1} e_{k + 1} = \frac{k!}{h^k}.$ As it turns out $$\left[\begin{array}{c c c} \beta_1^k & \cdots & \beta_n^k \end{array} \right] V_n^{-1} e_{k + 1} = 1$$, which tells us that $$\ell = \frac{k!}{h^k}$$. To see this note that $$e_j^T V_n$$ gives the $$j$$th row of $$V_n$$, hence $e_{k + 1}^T V_n = \left[\begin{array}{c c c} \beta_1^k & \cdots & \beta_n^k \end{array} \right] \Rightarrow \left(\left[\begin{array}{c c c} \beta_1^k & \cdots & \beta_n^k \end{array} \right] V_n^{-1}\right) e_{k + 1} = e_{k + 1}^T e_{k + 1} = 1.$ Thus the solution is $$\boldsymbol{\alpha} = \frac{k!}{h^k} V_n^{-1} e_{k + 1}$$; for one, this confirms our notion that $$\alpha_i = \mathcal{O} \left(\frac{1}{h^k}\right)$$. Alternatively, this is the same as the solution we would have obtained had we just dropped the $$(n + 1)$$st row from the system and tried to solve $$V_n \boldsymbol{\alpha} = \frac{k!}{h^k} e_{k + 1}$$ (this would not have guaranteed the order condition, which we needed $$s_{n - k} = 0$$ for).
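As a worked instance of these formulas (a sketch of mine, assuming \texttt{numpy} and Python 3; the choice $$n = 3$$, $$k = 1$$, $$\beta_1 = -2$$, $$\beta_2 = 3$$ is for illustration only), note that $$\beta_3 = -\widehat{s}_2 / \widehat{s}_1 = 6$$, recovering the $$(-2, 3, 6)$$ tuple that appears in the tables below:
\begin{verbatim}
import numpy as np

# n = 3, k = 1: pick beta_1, beta_2 freely, then the zero condition
# s_2 = 0 forces beta_3 = -s2_hat / s1_hat = -(b1 * b2) / (b1 + b2).
b1, b2 = -2.0, 3.0
beta = np.array([b1, b2, -(b1 * b2) / (b1 + b2)])  # -> (-2, 3, 6)

h = 0.01
V = np.vander(beta, N=3, increasing=True).T
alpha = np.linalg.solve(V, np.array([0.0, 1.0 / h, 0.0]))

print(alpha @ beta**3)             # ~0: the p = n row comes for free
approx = alpha @ np.exp(beta * h)  # u = exp, c = 0, so u'(0) = 1
print(abs(approx - 1.0))           # error is O(h^{n - k + 1}) = O(h^3)
\end{verbatim}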
\section[\texorpdfstring{Choosing the $n$ points}{Choosing the n points}]{Choosing the $$n$$ points}
Though we know the theoretical bound on the estimate and a condition on points to achieve this bound, this doesn't narrow down the choices for our $$\beta_i$$. Given an approximation for $$u^{(k)}(c)$$, we know by the above that the $$u^{(n + 1)}(c)$$ terms in the Taylor expansion can never sum to $$0$$. However, we may be able to make the coefficient of this term small to get a better estimate. For $$\left\{\alpha_i\right\}$$, $$\left\{\beta_i\right\}$$ which give a maximum order approximation of $$u^{(k)}(c)$$ we have $u^{(k)}(c) \approx \sum_{i = 1}^n \alpha_i u(c + \beta_i h) = u^{(k)}(c) + u^{(n + 1)}(c) \frac{h^{n + 1}}{(n + 1)!} \sum_{i = 1}^n \alpha_i \beta_i^{n + 1} + \mathcal{O}\left(h^{n + 2} \sum_{i = 1}^n \alpha_i \right).$ So the part of the coefficient we can control is determined by $\sum_{i = 1}^n \alpha_i \beta_i^{n + 1} = \left[\begin{array}{c c c} \beta_1^{n + 1} & \cdots & \beta_n^{n + 1}\end{array}\right] \left[ \begin{array}{c} \alpha_1 \\ \vdots \\ \alpha_n \end{array}\right].$ However, as we showed above $$R_2^T V_n = \left[\begin{array}{c c c} \beta_1^{n + 1} & \cdots & \beta_n^{n + 1}\end{array} \right]$$, hence the coefficient is $R_2^T V_n \boldsymbol{\alpha} = R_2^T V_n \left(\frac{k!}{h^k} V_n^{-1} e_{k + 1}\right) = \frac{k!}{h^k} R_2^T e_{k + 1} = \frac{k!}{h^k} R_{2, k + 1}.$ Since $$k + 1 \geq 2$$ ($$R_{2, 1}$$ is different than the rest of the entries in $$R_2$$) we know $$R_{2, k + 1}$$ has the form $R_{2, k + 1} = (-1)^{n - k - 1} \left(s_1 s_{n - k} - s_{n - k + 1} \right) = (-1)^{n - k} s_{n - k + 1}$ since we already have $$s_{n - k} = 0$$. Since $$s_{n - k}, s_{n - k + 1}$$ are homogeneous we can make $$s_{n - k + 1}$$ as small as we want by simply scaling the $$\beta_i$$ that make $$s_{n - k} = 0$$. Thus the infimum of $$\left|s_{n - k + 1}\right|$$ is $$0$$, but it can never be attained since we showed before that the distinctness of the $$\beta_i$$ makes $$s_{n - k} = s_{n - k + 1} = 0$$ impossible. However, it's still worth noting that our approximation takes the form $u^{(k)}(c) - \frac{k! \cdot u^{(n + 1)}(c) \cdot s_{n - k + 1}}{(n + 1)!} (-h)^{n - k + 1} + \mathcal{O}\left(h^{n - k + 2}\right).$

\section{Reasonable Choices}
Since we'd be approximating on a grid, it's much more reasonable to assume that $$(\beta_1, \ldots, \beta_n) \in \mathbf{Z}^n$$ than the much less restricted $$\mathbf{R}^n$$. Now the equality $\beta_n = - \frac{\widehat{s}_{n - k}}{\widehat{s}_{n - k - 1}}$ has some more meaning. The computations iterate over tuples which satisfy $$s_{n - k} = 0$$ and a few more constraints. Since we seek to minimize $$s_{n - k + 1}$$, we require that the tuple has no common divisor $$d > 1$$, for if it did we could divide by it and reduce the size of $$s_{n - k + 1}$$ by a factor of $$d^{n - k + 1}$$. In addition, since tuples are the same up to re-ordering we require that $$\beta_1 < \cdots < \beta_n$$ (with the strictness due to the distinctness constraint on the $$\beta_i$$). Finally, since scaling by $$-1$$ has no impact on the zero condition we require that $\left|\beta_n\right| = \max_i \left|\beta_i\right|$ (really only $$\left|\beta_1\right|$$ and $$\left|\beta_n\right|$$ were candidates, one being the smallest negative, the other the largest positive).
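The gist file at the top of this page advances a front of such tuples stage by stage; as a self-contained alternative (a sketch of mine in Python 3, not the method used by the gist), a bounded brute force over increasing integer tuples reproduces the small rows of the table that follows:
\begin{verbatim}
import itertools
import math
from functools import reduce

def s(j, t):
    # Elementary symmetric polynomial s_j of the tuple t.
    total = 0
    for combo in itertools.combinations(t, j):
        prod = 1
        for v in combo:
            prod *= v
        total += prod
    return total

def minimum(n, k, bound):
    # Increasing integer tuples in [-bound, bound] with s_{n-k} = 0,
    # no common divisor d > 1, and |beta_n| = max |beta_i|.
    best = None
    for t in itertools.combinations(range(-bound, bound + 1), n):
        if s(n - k, t) != 0:
            continue
        if abs(t[-1]) != max(abs(v) for v in t):
            continue
        if reduce(math.gcd, (abs(v) for v in t)) > 1:
            continue
        val = abs(s(n - k + 1, t))
        if best is None or val < best[0]:
            best = (val, t)
    return best

print(minimum(3, 1, 8))  # (36, (-2, 3, 6)): the n = 3, k = 1 row
print(minimum(3, 2, 8))  # (1, (-1, 0, 1)): the n = 3, k = 2 row
\end{verbatim}
(Raising \texttt{bound} and \texttt{n} should reproduce further rows, though this brute force slows down quickly.)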
With these tuples in hand we compute some minima of $$\left|s_{n - k + 1}\right|$$:
\begin{align*}
n &= 2 \\
\left|s_{2}\left(-1, 1\right)\right| &= 1 \text{ when } k = 1 \\
n &= 3 \\
\left|s_{3}\left(-2, 3, 6\right)\right| &= 36 \text{ when } k = 1 \\
\left|s_{2}\left(-1, 0, 1\right)\right| &= 1 \text{ when } k = 2 \\
n &= 4 \\
\left|s_{4}\left(-2, -1, 1, 2\right)\right| &= 4 \text{ when } k = 1 \\
\left|s_{3}\left(-2, 1, 2, 4\right)\right| &= 20 \text{ when } k = 2 \\
\left|s_{2}\left(-2, -1, 1, 2\right)\right| &= 5 \text{ when } k = 3 \\
n &= 5 \\
\left|s_{5}\left(-2, -1, 1, 3, 6\right)\right| &= 36 \text{ when } k = 1 \\
\left|s_{4}\left(-2, -1, 0, 1, 2\right)\right| &= 4 \text{ when } k = 2 \\
\left|s_{3}\left(-2, 0, 1, 2, 4\right)\right| &= 20 \text{ when } k = 3 \\
\left|s_{2}\left(-2, -1, 0, 1, 2\right)\right| &= 5 \text{ when } k = 4 \\
n &= 6 \\
\left|s_{6}\left(-3, -2, -1, 1, 2, 3\right)\right| &= 36 \text{ when } k = 1 \\
\left|s_{5}\left(-2, -1, 0, 1, 3, 6\right)\right| &= 36 \text{ when } k = 2 \\
\left|s_{4}\left(-3, -2, -1, 1, 2, 3\right)\right| &= 49 \text{ when } k = 3 \\
\left|s_{3}\left(-2, -1, 0, 1, 3, 7\right)\right| &= 50 \text{ when } k = 4 \\
\left|s_{2}\left(-3, -2, -1, 1, 2, 3\right)\right| &= 14 \text{ when } k = 5 \\
n &= 7 \\
\left|s_{7}\left(-3, -2, -1, 1, 2, 4, 12\right)\right| = \left|s_{7}\left(-4, -2, -1, 1, 3, 4, 6\right)\right| &= 576 \text{ when } k = 1 \\
\left|s_{6}\left(-3, -2, -1, 0, 1, 2, 3\right)\right| &= 36 \text{ when } k = 2 \\
\left|s_{5}\left(-5, -4, -3, -1, 0, 2, 14\right)\right| &= 2036 \text{ when } k = 3 \\
\left|s_{4}\left(-3, -2, -1, 0, 1, 2, 3\right)\right| &= 49 \text{ when } k = 4 \\
\left|s_{3}\left(-3, -1, 0, 1, 2, 3, 5\right)\right| &= 70 \text{ when } k = 5 \\
\left|s_{2}\left(-3, -2, -1, 0, 1, 2, 3\right)\right| &= 14 \text{ when } k = 6 \\
n &= 8 \\
\left|s_{8}\left(-4, -3, -2, -1, 1, 2, 3, 4\right)\right| &= 576 \text{ when } k = 1 \\
\left|s_{7}\left(-3, -2, -1, 0, 1, 2, 4, 12\right)\right| = \left|s_{7}\left(-4, -2, -1, 0, 1, 3, 4, 6\right)\right| &= 576 \text{ when } k = 2 \\
\left|s_{6}\left(-4, -3, -2, -1, 1, 2, 3, 4\right)\right| &= 820 \text{ when } k = 3 \\
\left|s_{5}\left(-8, -4, -3, -2, 0, 1, 2, 11\right)\right| &= 3276 \text{ when } k = 4 \\
\left|s_{4}\left(-2, -1, 0, 1, 2, 3, 4, 5\right)\right| &= 231 \text{ when } k = 5 \text{ (this broke the pattern for } s_{2m}\text{)} \\
\left|s_{3}\left(-3, -2, -1, 0, 1, 3, 4, 9\right)\right| &= 182 \text{ when } k = 6 \\
\left|s_{2}\left(-4, -3, -2, -1, 1, 2, 3, 4\right)\right| &= 30 \text{ when } k = 7
\end{align*}

\section{Finding Minimizers: Programming}
We can freely abstract our problem as $\min_{\boldsymbol{\beta}} \left|s_{d + 1}\right| \qquad \text{such that } s_d = 0, \boldsymbol{\beta} \in \mathbf{Z}^n, \; \beta_i \neq \beta_j \; \forall i \neq j.$ This is only well-defined if $$d + 1 \leq n$$ since there are no more elementary symmetric polynomials with only $$n$$ points. We also must have $$0 < d$$ since $$s_0 = 1$$ can never equal $$0$$. Since these are symmetric, we can WLOG say $\beta_1 \leq \beta_2 \leq \cdots \leq \beta_n.$ Since the $$\left\{\beta_j\right\}$$ must be distinct integers, we can say \begin{align*} \beta_1 + 1 &\leq \beta_2 \\ \beta_2 + 1 &\leq \beta_3 \\ &\vdots \\ \beta_{n - 1} + 1 &\leq \beta_n. \end{align*} What's more, we must have $$\beta_1 < 0$$. (If not, then we'd have $$s_d > 0$$ since there would be no negative terms to cancel with.) With this, we can move the problem into the nonnegative orthant $$\mathbf{Z}_{\geq 0}^n$$. We write $\beta_1 = - 1 - p_1 \leq -1 \qquad \text{for } p_1 \geq 0$ and further define \begin{align*} p_2 = \beta_2 - \beta_1 - 1 &\geq 0 \\ p_3 = \beta_3 - \beta_2 - 1 &\geq 0 \\ &\vdots \\ p_n = \beta_n - \beta_{n - 1} - 1 &\geq 0. \end{align*} Now by construction we have $$\mathbf{p} = \left(p_1, \ldots, p_n\right) \in \mathbf{Z}_{\geq 0}^n$$.

\subsection{Example: $n = 2$}
We have $\beta_1 = -1 - p_1, \quad \beta_2 = 1 + \beta_1 + p_2 = -p_1 + p_2.$ The only allowable value is $$d = 1$$. We have $s_1 = \beta_1 + \beta_2 = -1 - 2 p_1 + p_2$ and $s_2 = \beta_1 \beta_2 = \left(-1 - p_1\right)\left(-p_1 + p_2\right).$ Solving: $s_1 = 0 \Longrightarrow p_2 = 1 + 2 p_1 \Longrightarrow \left|s_2\right| = \left|\left(-1 - p_1\right)\left(1 + p_1\right)\right| = \left(1 + p_1\right)^2.$ For $$p_1 \geq 0 > -1$$, this function is increasing, so choosing $$p_1 = 0$$ minimizes it, which gives $$p_2 = 1 + 2(0) = 1$$. Converting back to $$\left\{\beta_j\right\}$$, we confirm the solution we've seen before: $\boxed{\beta_1 = -1 - 0 = -1, \qquad \beta_2 = -0 + 1 = 1}.$

\subsection{Example: $n = 3$}
We add to the above with $\beta_3 = 1 + \beta_2 + p_3 = 1 - p_1 + p_2 + p_3$ and \begin{align*} s_1 &= \left[-1 - 2 p_1 + p_2\right] + \beta_3 = - 3 p_1 + 2 p_2 + p_3 \\ s_2 &= \left[\left(-1 - p_1\right)\left(-p_1 + p_2\right)\right] + \beta_3 \left[-1 - 2 p_1 + p_2\right] \\ &= 3 p_{1}^{2} - 4 p_{1} p_{2} - 2 p_{1} p_{3} + p_{2}^{2} + p_{2} p_{3} - p_{2} - p_{3} - 1 \\ s_3 &= \left(-1 - p_1\right)\left(-p_1 + p_2\right)\left(1 - p_1 + p_2 + p_3\right) \end{align*}

\paragraph{When $$d = 1$$:} We can use \begin{align*} s_1 &= 0 \\ \Longrightarrow p_3 &= 3 p_1 - 2 p_2 \\ \Longrightarrow s_2 &= - 3 p_{1}^{2} + 3 p_{1} p_{2} - 3 p_{1} - p_{2}^{2} + p_{2} - 1. \end{align*} Viewing this as a parabola in $$p_2$$, it is downward opening (lead coefficient $$-1$$). Since $\frac{\partial s_2}{\partial p_2} = 3 p_1 - 2 p_2 + 1,$ the max value of $$s_2$$ (the downward-opening parabola) occurs when $$p_2 = \frac{3 p_1 + 1}{2}$$, at which point $s_2\left(p_1, \frac{3 p_1 + 1}{2}\right) = - \frac{3}{4}\left(p_1 + 1\right)^2.$ Thus $$s_2$$ is non-positive for \textbf{any} (real) choice of $$p_1$$, hence $\left|s_2\right| = 3 p_{1}^{2} - 3 p_{1} p_{2} + 3 p_{1} + p_{2}^{2} - p_{2} + 1.$ To minimize this, we first look to the value above.
\subsection{Example: $n = 3$}

We add to the above with
$\beta_3 = 1 + \beta_2 + p_3 = 1 - p_1 + p_2 + p_3$
and
\begin{align*}
s_1 &= \left[-1 - 2 p_1 + p_2\right] + \beta_3 = - 3 p_1 + 2 p_2 + p_3 \\
s_2 &= \left[\left(-1 - p_1\right)\left(-p_1 + p_2\right)\right] + \beta_3 \left[-1 - 2 p_1 + p_2\right] \\
&= 3 p_{1}^{2} - 4 p_{1} p_{2} - 2 p_{1} p_{3} + p_{2}^{2} + p_{2} p_{3} - p_{2} - p_{3} - 1 \\
s_3 &= \left(-1 - p_1\right)\left(-p_1 + p_2\right)\left(1 - p_1 + p_2 + p_3\right).
\end{align*}

\paragraph{When $$d = 1$$:} We can use
\begin{align*}
s_1 &= 0 \\
\Longrightarrow p_3 &= 3 p_1 - 2 p_2 \\
\Longrightarrow s_2 &= - 3 p_{1}^{2} + 3 p_{1} p_{2} - 3 p_{1} - p_{2}^{2} + p_{2} - 1.
\end{align*}
Viewing this as a parabola in $$p_2$$, it is downward opening (leading coefficient $$-1$$). Since
$\frac{\partial s_2}{\partial p_2} = 3 p_1 - 2 p_2 + 1$
the max value of $$s_2$$ (the down-opening parabola) occurs when $$p_2 = \frac{3 p_1 + 1}{2}$$, at which point
$s_2\left(p_1, \frac{3 p_1 + 1}{2}\right) = - \frac{3}{4}\left(p_1 + 1\right)^2.$
Thus $$s_2$$ is non-positive for \textbf{any} (real) choice of $$p_1$$ and $$p_2$$, hence
$\left|s_2\right| = 3 p_{1}^{2} - 3 p_{1} p_{2} + 3 p_{1} + p_{2}^{2} - p_{2} + 1.$
To minimize $$\left|s_2\right|$$, we first consider the maximizing choice of $$p_2$$ above. For a fixed $$p_1$$ the ``best'' choice for $$p_2$$ gives
$p_3 = 3 p_1 - 2 \left(\frac{3 p_1 + 1}{2}\right) = -1.$
This is not allowed, so we need to consider other choices of $$p_2$$ (again for a fixed $$p_1$$). Letting
$p_2 = \frac{3 p_1 + 1}{2} + \delta \Longrightarrow p_3 = -1 - 2 \delta, \; \left|s_2(p_1, p_2)\right| = \frac{3}{4}\left(p_1 + 1\right)^2 + \delta^2.$
Thus to minimize $$\left|s_2\right|$$ (for a fixed $$p_1$$) we want an acceptable $$\delta$$ that is as close to zero as possible. We see that
$0 \leq p_3 = -1 - 2 \delta \Longrightarrow \delta \leq -\frac{1}{2}.$
The parity of $$p_1$$ determines the value of $$\delta$$ nearest to $$-\frac{1}{2}$$ such that $$p_2 \in \mathbf{Z}$$:
$\delta = \begin{cases} -\frac{1}{2} & \text{if } p_1 \equiv 0 \bmod{2} \\ -1 & \text{if } p_1 \equiv 1 \bmod{2} \end{cases} \Longrightarrow \min \left|s_2\right| = \begin{cases} \frac{3}{4}\left(p_1 + 1\right)^2 + \frac{1}{4} & \text{if } p_1 \in \left\{0, 2, 4, \ldots\right\} \\ \frac{3}{4}\left(p_1 + 1\right)^2 + 1 & \text{if } p_1 \in \left\{1, 3, 5, \ldots\right\} \end{cases}.$
Noting that $$\left(p_1 + 1\right)^2$$ is increasing for $$p_1 \geq 0 > -1$$, we have two candidates for $$\min \left|s_2\right|$$:
$\frac{3}{4}\left(0 + 1\right)^2 + \frac{1}{4} = 1 \quad \text{and} \quad \frac{3}{4}\left(1 + 1\right)^2 + 1 = 4.$
Thus we choose
$p_1 = 0 \Longrightarrow p_2 = \frac{3 \cdot 0 + 1}{2} - \frac{1}{2} = 0 \Longrightarrow p_3 = 3 (0) - 2(0) = 0.$
From this we confirm the solution we've seen before:
$\boxed{\beta_1 = -1 - 0 = -1, \qquad \beta_2 = -0 + 0 = 0, \qquad \beta_3 = 1 - 0 + 0 + 0 = 1}.$
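The completing-the-square computation above is easy to get wrong by hand; here is a quick symbolic check, assuming \texttt{sympy} is available:
\begin{verbatim}
import sympy

p1, p2, delta = sympy.symbols('p1 p2 delta')

# n = 3 in p-coordinates, with p3 = 3*p1 - 2*p2 forced by s1 = 0
beta = [-1 - p1, -p1 + p2, 1 - p1 + p2 + (3*p1 - 2*p2)]
s2 = sum(beta[i] * beta[j] for i in range(3) for j in range(i + 1, 3))

vertex = (3*p1 + 1) / 2                 # maximizing choice of p2
print(sympy.factor(s2.subs(p2, vertex)))
# expect: -3*(p1 + 1)**2/4
print(sympy.expand(s2.subs(p2, vertex + delta) - s2.subs(p2, vertex)))
# expect: -delta**2
\end{verbatim}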
\paragraph{When $$d = 2$$:} We can use
\begin{align*}
0 &= s_2 = 3 p_{1}^{2} - 4 p_{1} p_{2} - 2 p_{1} p_{3} + p_{2}^{2} + p_{2} p_{3} - p_{2} - p_{3} - 1 \\
\Longrightarrow p_3 &= \frac{3 p_1^2 - 4 p_1 p_2 + p_2^2 - p_2 - 1}{2 p_1 - p_2 + 1} \\
&= \frac{\left(2 p_1 - p_2 + 1\right)^2 - \left(p_1 + 1\right)^2 - \left(2 p_1 - p_2 + 1\right)}{2 p_1 - p_2 + 1} \\
&= \left(2 p_1 - p_2\right) - \frac{\left(p_1 + 1\right)^2}{2 p_1 - p_2 + 1}.
\end{align*}
This fails only when the denominator vanishes, i.e. when
$0 = 2 p_1 - p_2 + 1,$
which in turn forces (via $$s_2 = 0$$):
$3 p_1^2 - 4 p_1 p_2 + p_2^2 - p_2 - 1 = 0.$
Solving this system of two equations in two unknowns gives $$p_1 = p_2 = -1$$. Thus we know that no triple satisfying $$s_2 = 0$$ can have $$2 p_1 - p_2 + 1 = 0$$ since by definition $$p_1, p_2 \geq 0$$. One can also show that $$p_3 \geq 0$$ forces
$\frac{p_2 - 1}{3} \leq p_1 \leq \frac{p_2 - 2}{2} \quad \text{or} \quad p_2 + 1 \leq p_1.$
With this, we have
$s_3 = \frac{\left(p_1 + 1\right)^2 \left(p_1 - p_2\right)^2}{2 p_1 - p_2 + 1} = \frac{E^2 \left(E - D\right)^2}{D}$
where
$D = 2 p_1 - p_2 + 1, \quad E = p_1 + 1 \Longrightarrow p_2 - p_1 = E - D.$
When minimizing $$\left|s_3\right|$$, neither $$E = 0$$ nor $$E - D = 0$$ is allowed: the first requires $$p_1 = -1$$, which isn't nonnegative, and the second forces $$p_2 = p_1$$, which in turn forces $$p_3 = -1$$. With this substitution we have
$p_3 = D - 1 - \frac{E^2}{D}.$
In particular this tells us $$D \mid E^2$$ since $$\frac{E^2}{D} = D - 1 - p_3 \in \mathbf{Z}$$. To minimize $$\left|s_3\right|$$, we consider the cases $$D$$ positive and negative separately and then try to minimize for a fixed $$E$$:
$\left|s_3\right| = \begin{cases} \frac{E^2 \left(E - D'\right)^2}{D'} & \text{if } D' = D > 0 \\ \frac{E^2 \left(E + D'\right)^2}{D'} & \text{if } -D' = D < 0 \end{cases} = \begin{cases} E^2 \left(\frac{E^2}{D'} + D' - 2 E\right) & \text{if } D' = D > 0 \\ E^2 \left(\frac{E^2}{D'} + D' + 2 E\right) & \text{if } -D' = D < 0 \end{cases}$
Thus minimizing corresponds to positive $$D'$$ that minimize
$0 < \frac{\left(E - D'\right)^2}{D'} = \frac{E^2}{D'} + D' - 2 E \quad \text{and} \quad 0 < \frac{\left(E + D'\right)^2}{D'} = \frac{E^2}{D'} + D' + 2 E.$
Though $$\frac{E^2}{D'} + D' - 2 E < \frac{E^2}{D'} + D' + 2 E$$ (since $$E = p_1 + 1$$ is always positive), values of $$D'$$ that are usable for one expression may not be usable for the other. This is because we need to satisfy
$0 \leq p_2 = \begin{cases} 2E - 1 - D' & \text{if } D' = D > 0 \\ 2E - 1 + D' & \text{if } -D' = D < 0 \end{cases} \quad \text{and} \quad 0 \leq p_3 = \begin{cases} \left(D' - \frac{E^2}{D'}\right) - 1 & \text{if } D' = D > 0 \\ \left(\frac{E^2}{D'} - D'\right) - 1 & \text{if } -D' = D < 0 \end{cases}.$
These inequalities mean that
$E < D' \leq 2E - 1 \text{ when } D' = D \quad \text{and} \quad 1 \leq D' < E \text{ when } D' = -D.$
We know $$\frac{E^2}{D'} + D'$$ is minimized at $$D' = E$$ (by the AM-GM inequality, for example) but we've seen above that $$D' = E$$ is not allowed in either case. As $$D'$$ moves out from $$E$$ in either direction, that value increases from the minimum value $$2E$$. Thus the best value of $$D'$$ in either $$\left[1, E\right)$$ or $$\left(E, 2E - 1\right]$$ is the one closest to $$E$$. Since we must have $$D' \mid E^2$$, there is a unique choice $$D' \in \left[1, E\right)$$ that is closest to $$E$$. Conversely, the equivalent value $$D'' = \frac{E^2}{D'}$$ is the unique choice $$D'' \in \left(E, E^2\right]$$ that is closest to $$E$$. If $$D'' \in \left(E, 2E - 1\right]$$, then we must have
$\frac{E^2}{D''} + D'' - 2E = D' + \frac{E^2}{D'} - 2E < \frac{E^2}{D'} + D' + 2E.$
Hence for a fixed $$E$$, we have
$\min \left|s_3\right| = \begin{cases} E^2 \left(\frac{E^2}{D'} + D' - 2 E\right) & \text{if } \exists E < D' \leq 2E - 1 \text{ such that } D' \mid E^2 \\ E^2 \left(\frac{E^2}{D'} + D' + 2 E\right) & \text{otherwise} \end{cases}$
where the values of $$D'$$ above are understood to be the factors of $$E^2$$ nearest to $$E$$ (above and below). We \textbf{claim} that $$\min \left|s_3\right| = 36$$. For $$E \geq 7$$, writing $$E^2 = D' D''$$ with $$D' \neq E$$ (hence $$D' \neq D''$$), AM-GM plus integrality give $$D' + D'' \geq 2E + 1$$, so in either case $$\left|s_3\right| \geq E^2\left(D' + D'' - 2E\right) \geq E^2 \geq 49 > 36$$. Thus it suffices to check all values $$1 \leq E \leq 6$$.
\begin{itemize}
\item $$E = 1$$: No possible values since $$\left[1, E\right) = \left(E, 2E - 1\right] = \varnothing$$.
\item $$E = 2$$: The only possible values are $$D' = 1 \in \left[1, E\right)$$ and $$D' = 3 \in \left(E, 2E - 1\right]$$. But since $$3 \nmid E^2$$ our only choice is $$-D = D' = 1$$. This yields
$\left|s_3\right| = 2^2\left(\frac{2^2}{1} + 1 + 2(2)\right) = 36.$
\item $$E = 3$$: The only value in $$\left[1, E\right) = \left[1, 3\right)$$ dividing $$E^2 = 9$$ is $$D' = 1$$ and no value in $$\left(E, 2E - 1\right] = \left(3, 5\right]$$ divides $$E^2$$. Thus our only choice gives
$\left|s_3\right| = 3^2\left(\frac{3^2}{1} + 1 + 2(3)\right) = 144.$
\item $$E = 4$$: The values in $$\left[1, E\right) = \left[1, 4\right)$$ dividing $$E^2 = 16$$ are $$D' = 1, 2$$ and no value in $$\left(E, 2E - 1\right] = \left(4, 7\right]$$ divides $$E^2$$. Thus our choice nearest $$E = 4$$ gives
$\left|s_3\right| = 4^2\left(\frac{4^2}{2} + 2 + 2(4)\right) = 288.$
\item $$E = 5$$: The only value in $$\left[1, E\right) = \left[1, 5\right)$$ dividing $$E^2 = 25$$ is $$D' = 1$$ and no value in $$\left(E, 2E - 1\right] = \left(5, 9\right]$$ divides $$E^2$$. Thus our only choice gives
$\left|s_3\right| = 5^2\left(\frac{5^2}{1} + 1 + 2(5)\right) = 900.$
\item $$E = 6$$: The values in $$\left[1, E\right) = \left[1, 6\right)$$ dividing $$E^2 = 36$$ are $$D' = 1, 2, 3, 4$$ and the only value in $$\left(E, 2E - 1\right] = \left(6, 11\right]$$ dividing $$E^2$$ is $$D' = 9$$. Thus we take the choice above $$E = 6$$, which gives
$\left|s_3\right| = 6^2\left(\frac{6^2}{9} + 9 - 2(6)\right) = 36.$
\end{itemize}
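The case analysis reduces, for each fixed $$E$$, to scanning the divisors of $$E^2$$ in $$\left[1, E\right) \cup \left(E, 2E - 1\right]$$. A short Python~3 sketch (the helper \texttt{min\_s3\_for\_E} is our own) reproduces the values in the list above:
\begin{verbatim}
def min_s3_for_E(E):
    # Scan admissible divisors D' of E^2: D' in [1, E) corresponds
    # to D = -D' < 0 and D' in (E, 2E - 1] to D = D' > 0 (cf. the
    # case analysis above); D' = E itself is excluded.
    values = []
    for Dp in range(1, 2 * E):
        if Dp == E or (E * E) % Dp != 0:
            continue
        if Dp < E:   # D = -D' branch: |s3| = E^2 (E + D')^2 / D'
            values.append(E * E * (E + Dp) ** 2 // Dp)
        else:        # D = D'  branch: |s3| = E^2 (E - D')^2 / D'
            values.append(E * E * (E - Dp) ** 2 // Dp)
    return min(values) if values else None

for E in range(1, 7):
    print(E, min_s3_for_E(E))
# expect: 1 None / 2 36 / 3 144 / 4 288 / 5 900 / 6 36
\end{verbatim}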
Thus the minimum $$\left|s_3\right| = 36$$ occurs when
$E = 2, D = -D' = -1 \Longrightarrow p_1 = 1, p_2 = 2E - D - 1 = 4, p_3 = D - 1 - \frac{E^2}{D} = 2$
and
$E = 6, D = D' = 9 \Longrightarrow p_1 = 5, p_2 = 2E - D - 1 = 2, p_3 = D - 1 - \frac{E^2}{D} = 4.$
From this we confirm the solution(s) we've seen before:
$\boxed{\beta_1 = -1 - 1 = -2, \qquad \beta_2 = -1 + 4 = 3, \qquad \beta_3 = 1 - 1 + 4 + 2 = 6}$
and
$\boxed{\beta_1 = -1 - 5 = -6, \qquad \beta_2 = -5 + 2 = -3, \qquad \beta_3 = 1 - 5 + 2 + 4 = 2}.$
\textbf{NOTE}: These are actually the same since multiplying by $$-1$$ preserves $$s_d = 0$$ and also preserves the value of $$\left|s_{d + 1}\right|$$. If we seek a \textbf{unique} solution we could require that
$\left|\beta_1\right| \leq \left|\beta_n\right| \Longleftrightarrow -\beta_1 \leq \beta_n \Longleftrightarrow 0 \leq \beta_1 + \beta_n$
(noting that just as $$\beta_1 < 0$$ is required for cancellation, $$\beta_n > 0$$ is also required).

\subsection{Example: $n = 4$}

We add to the above with
$\beta_4 = 1 + \beta_3 + p_4 = 2 - p_1 + p_2 + p_3 + p_4$
and
\begin{align*}
s_1 &= \left[- 3 p_1 + 2 p_2 + p_3\right] + \beta_4 = 2 - 4 p_1 + 3 p_2 + 2 p_3 + p_4 \\
s_2 &= \left[\beta_1 \beta_2 + \beta_2 \beta_3 + \beta_3 \beta_1\right] + \beta_4\left[\beta_1 + \beta_2 + \beta_3\right] \\
&= \left[3 p_{1}^{2} - 4 p_{1} p_{2} - 2 p_{1} p_{3} + p_{2}^{2} + p_{2} p_{3} - p_{2} - p_{3} - 1\right] \\
&\qquad + \left(2 - p_1 + p_2 + p_3 + p_4\right)\left[- 3 p_1 + 2 p_2 + p_3\right]
\end{align*}
(We'll leave out the rest for now.)

\paragraph{When $$d = 1$$:} We have
$s_1 = 0 \Longrightarrow \beta_4 = -\left(\beta_1 + \beta_2 + \beta_3\right)$
hence
\begin{align*}
s_2 &= \left[\beta_1 \beta_2 + \beta_2 \beta_3 + \beta_3 \beta_1\right] + \beta_4\left[\beta_1 + \beta_2 + \beta_3\right] \\
&= \left(\beta_1 \beta_2 + \beta_2 \beta_3 + \beta_3 \beta_1\right) - \left(\beta_1 + \beta_2 + \beta_3\right)^2 \\
&= -\left(\beta_1^2 + \beta_1 \beta_2 + \beta_1 \beta_3 + \beta_2^2 + \beta_2 \beta_3 + \beta_3^2\right) \\
&= - \frac{1}{2} \left(\beta_1 + \beta_2\right)^2 - \frac{1}{2} \left(\beta_1 + \beta_3\right)^2 - \frac{1}{2} \left(\beta_2 + \beta_3\right)^2.
\end{align*}
Luckily this means we can just deal with minimizing
\begin{align*}
2\left|s_2\right| &= \left(\beta_1 + \beta_2\right)^2 + \left(\beta_1 + \beta_3\right)^2 + \left(\beta_2 + \beta_3\right)^2 \\
&= \left(- 2 p_{1} + p_{2} - 1\right)^{2} + \left(- 2 p_{1} + p_{2} + p_{3}\right)^{2} + \left(- 2 p_{1} + 2 p_{2} + p_{3} + 1\right)^{2}.
\end{align*}
Setting $$A = -2 p_1 + p_2$$, this becomes
$2\left|s_2\right| = \left(A - 1\right)^2 + \left(A + p_3\right)^2 + \left(A + p_2 + p_3 + 1\right)^2.$
Note also that $$s_1 = 0$$ gives
$\beta_4 = 3 p_{1} - 2 p_{2} - p_{3} \Longrightarrow p_4 = 4 p_{1} - 3 p_{2} - 2 p_{3} - 2,$
so feasibility additionally requires $$4 p_1 \geq 3 p_2 + 2 p_3 + 2$$.

\subsection{Algorithm: Advancing Front}

We use an advancing front
\begin{align*}
p_1 + \cdots + p_n &= 0 \\
p_1 + \cdots + p_n &= 1 \\
p_1 + \cdots + p_n &= 2 \\
& \vdots
\end{align*}
until some maximum value is reached. As a proxy, when $$n = 2m$$ is even, note that
$\boldsymbol{\beta} = \left(-m, -(m - 1), \ldots, -1, 1, \ldots, m - 1, m\right)$
gives $$s_{2j - 1} = 0$$, since multiplying each entry by $$-1$$ does nothing to the tuple but scales $$s_{2j - 1}$$ by a factor of $$(-1)^{2j - 1} = -1$$. Similarly when $$n = 2m + 1$$ is odd, we can use
$\boldsymbol{\beta} = \left(-m, -(m - 1), \ldots, -1, 0, 1, \ldots, m - 1, m\right)$
to get $$s_{2j - 1} = 0$$. Both of these will give us worst cases (upper bounds) for $$\min \left|s_{2j}\right|$$ and will tell us when to \textbf{stop advancing our front}.
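This is essentially what the companion \texttt{advancing\_front.py} does (in Python~2). Below is a condensed Python~3 sketch of the same idea, including the canonicalization $$\beta_1 + \beta_n \geq 0$$ from the note above; the stage bookkeeping is simplified relative to the script:
\begin{verbatim}
import itertools
import math

def sym_poly(j, values):
    # elementary symmetric polynomial s_j of `values`
    return sum(math.prod(c) for c in itertools.combinations(values, j))

def is_canonical(beta):
    # |beta_1| <= |beta_n|  <=>  beta_1 + beta_n >= 0 (beta sorted)
    return beta[0] + beta[-1] >= 0

def advance(front):
    # Each move bumps exactly one p_i by 1: decrementing every entry
    # raises p_1; incrementing the suffix beta_i, ..., beta_n raises p_i.
    new_front = set()
    for beta in front:
        new_front.add(tuple(b - 1 for b in beta))
        for i in range(1, len(beta)):
            new_front.add(beta[:i] + tuple(b + 1 for b in beta[i:]))
    return new_front

def advance_front(n, d, num_stages):
    # stage k holds exactly the tuples with p_1 + ... + p_n = k
    front = {tuple(range(-1, n - 1))}   # p = (0, ..., 0)
    for stage in range(num_stages + 1):
        for beta in sorted(front):
            if sym_poly(d, beta) == 0 and is_canonical(beta):
                print(stage, beta, abs(sym_poly(d + 1, beta)))
        front = advance(front)

advance_front(3, 2, num_stages=9)   # reaches (-2, 3, 6) -> 36 at stage 7
\end{verbatim}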
\section{Lemma: Symmetric Polynomial Order Reduction}
\label{reduction-lemma}

For real values $$X_1 \leq \cdots \leq X_n$$, define the $$j$$th symmetric polynomial
$s_j(X_1, \ldots, X_n) = \sum_{1 \leq i_1 < \cdots < i_j \leq n} X_{i_1} \cdots X_{i_j}.$
\textit{Claim}: There exist $$n - 1$$ points $$Y_1, \ldots, Y_{n - 1}$$ \textbf{between} the $$\left\{X_j\right\}$$, i.e. satisfying
$X_1 \leq Y_1 \leq X_2 \leq \cdots \leq X_{n - 1} \leq Y_{n - 1} \leq X_n,$
such that
$\frac{s_j(X_1, \ldots, X_n)}{\binom{n}{j}} = \frac{s_j(Y_1, \ldots, Y_{n - 1})}{\binom{n - 1}{j}}.$
\begin{proof}
We know the polynomial
$g(X) = (X - X_1) \cdots (X - X_n)$
takes the form
\begin{align*}
g(X) &= X^n - s_1(X_1, \ldots, X_n) X^{n - 1} + \cdots + (-1)^n s_n(X_1, \ldots, X_n) \\
&= X^n + \sum_{i = 1}^n (-1)^i s_i(X_1, \ldots, X_n) X^{n - i}
\end{align*}
by Vieta's relations. Since $$g$$ is degree $$n$$ and has $$n$$ real roots we know that $$g'(X)$$ has $$n - 1$$ real roots. Between any two distinct roots $$X_i < X_{i + 1}$$, Rolle's Theorem tells us that there exists $$X_i < Y_i < X_{i + 1}$$ such that $$g'(Y_i) = 0$$. For nondistinct $$X_i$$, if $$X_i = \cdots = X_{i + r - 1}$$ for some number of repetitions $$r$$ then $$(X - X_i)^r$$ divides $$g(X)$$, hence $$(X - X_i)^{r - 1}$$ divides $$g'(X)$$. Thus, whether the roots are distinct, fully repeated, or somewhere in between, there are always $$r - 1$$ roots of $$g'$$ between $$X_i$$ and $$X_{i + r - 1}$$. Letting $$i = 1$$, $$r = n$$, we see that we have roots $$Y_i$$ of $$g'(X)$$ satisfying:
$X_1 \leq Y_1 \leq \cdots \leq Y_{n - 1} \leq X_n.$
Since $$g'(X) = n X^{n - 1} + \cdots$$ we know that
$g'(X) = n (X - Y_1) \cdots (X - Y_{n - 1})$
and by Vieta
$\frac{g'(X)}{n} = X^{n - 1} - s_1\left(Y_1, \ldots, Y_{n - 1}\right) X^{n - 2} + \cdots + (-1)^{n - 1} s_{n - 1}\left(Y_1, \ldots, Y_{n - 1}\right).$
On the other hand, differentiating the original form gives
$g'(X) = n X^{n - 1} - (n - 1) s_1(X_1, \ldots, X_n) X^{n - 2} + \cdots + (-1)^{n - 1} s_{n - 1}\left(X_1, \ldots, X_n\right).$
So for $$1 \leq j \leq n - 1$$ (i.e. $$s_j$$ among $$s_1, \ldots, s_{n - 1}$$), matching the coefficients of $$X^{n - 1 - j}$$ gives
$(n - j) s_j\left(X_1, \ldots, X_n\right) = n s_j\left(Y_1, \ldots, Y_{n - 1}\right).$
Noting that $$(n - j) \binom{n}{j} = n \binom{n - 1}{j}$$, we conclude that
$\frac{s_j(X_1, \ldots, X_n)}{\binom{n}{j}} = \frac{s_j(Y_1, \ldots, Y_{n - 1})}{\binom{n - 1}{j}}. \tag*{\qedhere}$
\end{proof}

\end{document}