# From proof to simpler proof

Solve $\sin(x)=2x$ for $x$.

By mere inspection, we have $x=0$.

Visually, it appears that $0$ is the only solution (see Fig.1 or Fig. 2)

Fig. 1

Fig. 2

To show that $0$ is the only solution of $\sin(x)=2x$ analytically, let

$f(x) = \sin(x)-2x$

$0$ is a solution of $\sin(x)=2x$ means

$f(0) = 0\quad\quad\quad(1)$

Suppose there is a solution

$x^* \neq 0\quad\quad\quad(2)$

then,

$f(x^*)=\sin(x^*)-2x^*=0\quad\quad\quad(3)$

Since $f(x)$ is an function continuous and differentiable on $(-\infty, +\infty)$,

by Lagrange’s Mean-Value Theorem (see “A Sprint to FTC“), there $\exists c \in (0, x^*)$ such that

$f(x^*)-f(0)=f'(c)(x^*-0)$

$\overset{(1), (3)}{\implies} 0 = f'(c) x^*\quad\quad\quad(4)$

$\overset{(2)}{\implies} f'(c)=0\quad\quad\quad(5)$.

We know

$f'(x) = \cos(x)-2$.

From (5), we have

$\cos(c)-2=0$.

i.e.,

$\cos(c)=2$.

This is not possible since $\forall x \in R, -1 \le \cos(x) \le 1$.

A simpler alternative without direct applying Lagrange’s Mean-Value Theorem is:

$f(x) = \sin(x)-2x \implies f'(x) = \cos(x) - 2 \overset{-1 \le \cos(x) \le 1}{\implies} f'(x) < 0\implies$

$f(x) =\sin(x)-2x$ is a strictly decreasing function.

Since $f(0) = 0, \forall x<0, f(x) > 0$ and $\forall x>0, f(x) < 0$.

Therefore, $0$ is the only solution of $f(x)=0$. i.e.,

$0$ is the only solution of $\sin(x) = 2x$.

# Analyzing Heron’s Algorithm

Finding the value of $\sqrt{a}$ for $a>0$ is equivalent to seeking a positive number $x$ whose square yields $a$: $x^2=a$. In other words, the solution to $x^2-a=0$.

We can find $\sqrt{4}$ by solving $x^2-4=0$. It is $2$ by mere inspection. $\sqrt{9}$ can also be obtained by solving $x^2-9=0$, again, by mere inspection. But ‘mere inspection’ can only go so far. For Example, what is $\sqrt{15241578750190521}$?

Method for finding the square root of a positive real number date back to the acient Greek and Babylonian eras. Heron’s algorithm, also known as the Babylonian method, is an algorithm named after the $1^{st}$ – century Greek mathematician Hero of Alexandria, It proceeds as follows:

1. Begin with an arbitrary positive starting value $x_0$.
2. Let $x_{k+1}$ be the average of $x_k$ and $\frac{a}{x_k}$
3. Repeat step 2 until the desired accuracy is achieved.

This post offers an analysis of Heron’s algorithm. Our aim is a better understanding of the algorithm through mathematics.

Let us begin by arbitrarily choosing a number $x_0>0$. If $x_0^2-a=0$, then $x_0$ is $\sqrt{a}$, and we have guessed the exact value of the square root. Otherwise, we are in one of the following cases:

Case 1: $x_0^2-a > 0 \implies x_0^2 > a \implies \sqrt{a} x_0 > a \implies \sqrt{a} > \frac{a}{x_0} \implies \frac{a}{x_0} < \sqrt{a} < x_0$

Case 2: $x_0^2-a <0 \implies x_0 < \sqrt{a} \implies \sqrt{a} x_0 < a \implies \sqrt{a} < \frac{a}{x_0} \implies x_0 < \sqrt{a} < \frac{a}{x_0}$

Both cases indicate that $\sqrt{a}$ lies somewhere between $\frac{a}{x_0}$ and $x_0$.

Let us define $e_0$ as the relative error of approximating $\sqrt{a}$ by $x_0$:

$e_0 =\frac{x_0-\sqrt{a}}{\sqrt{a}}$

The closer $e_0$ is to $0$, the better $x_0$ is as an approximation of $\sqrt{a}$.

Since $x_0>0, \sqrt{a}>0$,

$e_0 + 1 = \frac{x_0-\sqrt{a}}{\sqrt{a}} + 1 = \frac{x_0}{\sqrt{a}} > 0\quad\quad\quad(1)$

By (1),

$x_0 = \sqrt{a} (e_0+1)\quad\quad\quad(2)$

Let $x_1$ be the mid-point of $x_0$ and $\frac{a}{x_0}$:

$x_1 = \frac{1}{2} (x_0 + \frac{a}{x_0})\quad\quad\quad(3)$

and, $e_1$ the relative error of $x_1$ approximating $\sqrt{a}$,

$e_1 = \frac{x_1-\sqrt{a}}{\sqrt{a}}\quad\quad\quad(4)$

We have

$x_1-\sqrt{a}\overset{(3)}{=}\frac{1}{2}(x_0+\frac{a}{x_0}) - \sqrt{a} \overset{(2)}{=}\frac{1}{2}(\sqrt{a}(e_0+1)+\frac{a}{\sqrt{a}(e_0+1)})-\sqrt{a}=\frac{\sqrt{a}}{2}\frac{e_0^2}{e_0+1}\overset{(1)}{>}0\quad(5)$

$e_1 = \frac{x_1-\sqrt{a}}{\sqrt{a}} \overset{(5)}{=} \frac{\frac{\sqrt{a}}{2}\frac{e_0^2}{e_0+1}}{\sqrt{a}} = \frac{1}{2}\frac{e_0^2}{e_0+1}\overset{(1)}{ > }0\quad\quad\quad(6)$

By (5),

$x_1 > \sqrt{a} \implies \sqrt{a} x_1 > a \implies \sqrt{a} > \frac{a}{x_1}\implies \frac{\sqrt{a}}{x_1} < \sqrt{a} < x_1$

i.e., $\sqrt{a}$ lies between $\frac{a}{x_1}$ and $x_1$.

We can generate more values in stages by

$x_{k+1} = \frac{1}{2}(x_k + \frac{a}{x_k}), \;k=1, 2, 3, ...\quad\quad(8)$

Clearly,

$\forall k, x_k >0$.

Let

$e_k = \frac{x_k-\sqrt{a}}{\sqrt{a}}$.

We have

$x_k = \sqrt{a}(e_k+1)\quad\quad\quad(9)$

and

$e_{k+1} = \frac{x_{k+1}-\sqrt{a}}{\sqrt{a}}=\frac{\frac{1}{2}(\sqrt{a}(e_k+1) + \frac{a}{\sqrt{a}(e_k+1)})-\sqrt{a}}{\sqrt{a}}=\frac{1}{2}\frac{e_k^2}{e_k + 1}\quad\quad\quad(10)$

Consequently, we can prove that

$\forall k \ge 1, e_{k+1} > 0\quad\quad\quad(11)$

by induction:

When $k = 1, e_{1+1} \overset{(10)}{=} \frac{1}{2}\frac{e_1^2}{e_1+1} \overset{(6)} {>} 0$.

Assume when $k = p$,

$e_{p+1} =\frac{1}{2}\frac{e_p^2}{e_p^2+1} >0\quad\quad\quad(12)$

When $k = p + 1, e_{(p+1)+1} \overset{(10)}{=}\frac{1}{2}\frac{e_{p+1}^2}{e_{p+1} + 1}\overset{(12)}{>}0$.

Since (11) implies

$\forall k \ge 2, e_k > 0\quad\quad\quad(13)$

(10) and (13) together implies

$\forall k \ge 1, e_k > 0\quad\quad\quad(14)$

It follows that

$\forall k \ge 1, 0 \overset{(14)}{<} e_{k+1} \overset{(10)}{=} \frac{1}{2}\frac{e_k^2}{e_k+1} \overset{(14)}{<} \frac{1}{2}\frac{e_k^2}{e_k}=\frac{1}{2}e_k\quad\quad\quad(15)$

and

$\forall k \ge 1, 0 \overset{(14)}{<} e_{k+1} \overset{(10)}{=} \frac{1}{2}\frac{e_k^2}{e_k+1}\overset{(14)}{<}\frac{1}{2}\frac{e_k^2}{1}

(15) indicates that the error is cut at least in half at each stage after the first.

Therefore, we will reach a stage $n$ that

$0 < e_n < 1\quad\quad\quad(17)$

It can be shown that

$\forall k \ge 1, 0< e_{n+k} < e_n^{2^k}\quad\quad\quad(18)$

by induction:

When $k = 1$,

$0 \overset{(14)}{<} e_{n+1} \overset{(10)}{=}\frac{1}{2}\frac{e_n^2}{e_n+1} \overset{}{<} e_n^2=e_n^{2^1}\quad\quad\quad(19)$

Assume when k = p,

$0 < e_{n+p} < e_n^{2^p}\quad\quad\quad(20)$

When $k = p+1$,

$0 \overset{(14)}{<} e_{n+(p+1)} = e_{(n+p)+1} \overset{(16)}{<}e_{n+p}^2\overset{(20)}{<}(e_n^{2^p})^2 = e_n^{2^p\cdot 2} = e_n^{2^{p+1}}$

From (17) and (18), we see that eventually, the error decreases at an exponential rate.

Illustrated below is a script that implements the Heron’s Algorithm:

/*
a   - the positive number whose square root we are seeking
x0  - the initial guess
eps - the desired accuracy
*/
square_root(a, x0, eps) := block(
[xk],
numer:true,
xk:x0,
loop,
if abs(xk^2-a) < eps then return(xk),
xk:(xk+a/xk)/2,
go(loop)
)\$


Suppose we want to find $\sqrt{2}$ and start with $x_0 = 99$. Fig. 2 shows the Heron’s algorithm in action.

Fig. 1

# Déjà vu!

The stages of a two stage rocket have initial masses $m_1$ and $m_2$ respectively and carry a payload of mass $P$. Both stages have equal structure factors and equal relative exhaust speeds. If the rocket mass, $m_1+m_2$, is fixed, show that the condition for maximal final speed is

$m_2^2 + P m_2 = P m_1$.

Find the optimal ratio $\frac{m_1}{m_2}$ when $\frac{P}{m_1+m_2} = b$.

According to multi-stage rocket’s flight equation (see “Viva Rocketry! Part 2“), the final speed of a two stage rocket is

$v = -c \log(1-\frac{e \cdot m_1}{m_1 + m_2 +P})- c\log(1-\frac{e \cdot m_2}{m_2+P})$

Let $m_0 = m_1 + m_2$, we have

$m_1 = m_0-m_2$

and,

$v = -c \log(1-\frac{e \cdot (m_0-m_2)}{m_0 +P})- c\log(1-\frac{e \cdot m_2}{m_2+P})$

Differentiate $v$ with respect to $m_2$ gives

$v' = \frac{c(e-1)e(2m_2 P-m_0 P +m_2^2)}{(P+m_2)(P-e m_2+m_2)(P+e m_2-e m_0+m_0)}= \frac{c(e-1)e(2m_2-m_0 P+m_2^2)}{(P+m_2)(P+(1-e)m_2)(P+e m_2 + (1-e)m_0)}\quad\quad\quad(1)$

It follows that $v'=0$ implies

$2m_2 P-m_0 P + m_2^2 =0$.

That is, $2 m_2 P - (m_1+m_2) P + m_2^2 = (m_2-m_1) P + m_2^2=0$. i.e.,

$m_2^2 + P m_2 = P m_1\quad\quad\quad(2)$

It is the condition for an extreme value of $v$. Specifically, the condition to attain a maximum (see Exercise-2)

When $\frac{P}{m_1+m_2} = b$, solving

$\begin{cases} (m_2-m_1)P+m_2^2=0\\ \frac{P}{m_1+m_2} = b\end{cases}$

yields two pairs:

$\begin{cases} m_1=\frac{(\sqrt{b}\sqrt{b+1}+b+1)P}{b}\\m_2= -\frac{\sqrt{b^2+b}P+bP}{b}\end{cases}$

and

$\begin{cases} m_1= - \frac{(\sqrt{b}\sqrt{b+1}-b-1)P}{b}\\ m_2=\frac{\sqrt{b^2+b}P-bP}{b}\end{cases}\quad\quad\quad(3)$

Only (3) is valid (see Exercise-1)

Hence

$\frac{m_1}{m_2} = -\frac{(\sqrt{b}\sqrt{b+1}-b-1)P}{\sqrt{b^2+b}P-b P} = \frac{\sqrt{b+1}}{\sqrt{b}} = \sqrt{1+\frac{1}{b}}$

The entire process is captured in Fig. 2.

Fig. 2

Exercise-1 Given $b>0, 00$, prove:

1. $- \frac{(\sqrt{b}\sqrt{b+1}-b-1)P}{b}>0$
2. $\frac{\sqrt{b^2+b}P-bP}{b}>0$

Exercise-2 From (1), prove the extreme value attained under (2) is a maximum.

# Boosting rocket flight performance without calculus (Viva Rocketry! Part 2.1)

Fig. 1

Given

$v = c\log(\frac{(a+1)(b+1)((a+1)b+1)}{(1-e+(a+1)b)((1-e)a+(a+1)b+1})$

where $b>0, 00, a>0$, maximize $v$ with appropriate $a$.

The above optimization problem is solved using calculus (see “Viva Rocketry! Part 2“). However, there is an alternative that requires only high school mathematics with the help of a Computer Algebra System (CAS). This non-calculus approach places more emphasis on problem solving through mathematical thinking, as all symbolic calculations are carried out by the CAS (e.g., see Fig. 2). It also makes a range of interesting problems readily tackled with minimum mathematical prerequisites.

The fact that

$\log$ is a monotonic increasing function $\implies v_{max} = c\log(w_{max})$

where

$w = \frac{(a+1)(b+1)((a+1)b+1)}{(1-e+(a+1)b)((1-e)a+(a+1)b+1)}$

or

$w(1-e+(a+1)b)((1-e)a+(a+1)b+1) - (a+1)(b+1)((a+1)b+1)=0\quad\quad\quad(1)$

(1) can be written as

$A_1 a^2 + B_1 a +C_1= 0$

where

$A_1 = -bew+b^2w+bw-b^2-b$

$B_1 = e^2w-2bew-2ew+2b^2w+3bw+w-2b^2-3b-1$,

$C_1 = -bew-ew+b^2w+2bw+w-b^2-2b-1$.

Since $A_1 = 0$ means

$-b e w + b^2 w + b w - b^2 - b =0$.

That is

$w = -\frac{b+1}{e-b-1}$.

Solve

$\frac{(a+1)(b+1)((a+1)b+1)}{(1-e+(a+1)b)((1-e)a+(a+1)b+1)} = -\frac{b+1}{e-b-1}$

for $a$ gives $a = 0 \implies A_1 \neq 0$ if $a > 0$.

Hence, (1) is a quadratic equation. For it to have solution, its discriminant $B_1^2-4A_1C_1$ must be nonnegative, i.e.,

$(e^2-2be-2e+b+1)^2 w^2-2(b+1)(2be^2+e^2-2be-2e+b+1) w +(b+1)^2 \geq 0\quad(2)$

Consider

$(e^2-2be-2e+b+1)^2 w^2-2(b+1)(2be^2+e^2-2be-2e+b+1)w+(b+1)^2 = 0\quad(3)$

If $e^2-2be-2e+b+1 \neq 0$, (3) is a quadratic equation.

Solving (3) yields two solutions

$w_1 = -\frac{(b+1)(2\sqrt{b(b+1)}e^2-2be^2-e^2-2\sqrt{b(b+1)}e+2be+2e-b-1)}{(e^2-2be-2e+b+1)^2}$,

$w_2=\frac{(b+1)(2\sqrt{b(b+1)}e^2+2be^2+e^2-2\sqrt{b(b+1)}e-2be-2e+b+1)}{(e^2-2be-2e+b+1)^2}$.

Since $0 < e < 1$,

$w_1 - w_2 = -\frac{4(b+1)\sqrt{b(b+1)}(e-1)e}{(e^2-2be-2e+b+1)^2} > 0\quad(4)$

(4) implies

$w_1 > w_2$

and, the solution to (2) is

$w \leq w_2$ or $w \ge w_1$

i.e.,

$w \leq \frac{(b+1)(2\sqrt{b(b+1)}e^2+2be^2+e^2-2\sqrt{b(b+1)}e-2be-2e+b+1)}{(e^2-2be-2e+b+1)^2}\quad\quad\quad(4)$

or

$w \ge -\frac{(b+1)(2\sqrt{b(b+1)}e^2-2be^2-e^2-2\sqrt{b(b+1)}e+2be+2e-b-1)}{(e^2-2be-2e+b+1)^2}\quad\quad\quad(5)$

We prove that (4) is true by showing (5) is false:

Consider $w - w_1= 0$:

$\frac{(b+1)(e-1) e \cdot f(a)}{(1-e+ab+b)(a(1-e)+ab+b+1)(e^2-2be-2e+b+1)^2} = 0\quad\quad\quad(6)$

where

$f(a) = 2a\sqrt{b(b+1)}e^2+a^2be^2+be^2+e^2-2a^2b\sqrt{b(b+1)}$

$-4ab\sqrt{b(b+1)}e - 2b\sqrt{b(b+1)}e - 4a\sqrt{b(b+1)}e$

$-2\sqrt{b(b+1)}e-2a^2b^2e-4ab^2e-2a^2be-4abe-4be$

$-2e+2a^2b^2\sqrt{b(b+1)}+4ab^2\sqrt{b(b+1)} + 2b^2\sqrt{b(b+1)} +2a^2b\sqrt{b(b+1)}$

$+6ab\sqrt{b(b+1)} + 4b\sqrt{b(b+1)} + 2a\sqrt{b(b+1)} + 2\sqrt{b(b+1)}$

$+2a^2b^3 + 4ab^3 + 2b^3 +3a^2b^2 + 8ab^2 +5b^2+a^2b+4ab+4b+1$.

It can be written as

$A_2a^2 + B_2a + C_2\quad\quad\quad(7)$

where

$A_2 = be^2-2b\sqrt{b(b+1)}e-2b^2e-2be+2b^2\sqrt{b(b+1)}+2b\sqrt{b(b+1)}$

$+2b^3+3b^2+b$,

$B_2 = 2\sqrt{b(b+1)}e^2-4b\sqrt{b(b+1)}e -4\sqrt{b(b+1)}e$

$-4b^2e-4be+4b^2\sqrt{b(b+1)}+6b\sqrt{b(b+1)}+2\sqrt{b(b+1)}+4b^3+8b^2+4b$,

$C_2=be^2+e^2-2b\sqrt{b(b+1)}e-2\sqrt{b(b+1)}e-2b^2e-4be$

$-2e+2b^2\sqrt{b(b+1)}+4b\sqrt{b(b+1)}+2\sqrt{b(b+1)}+2b^3+5b^2+4b+1$.

Since $A_2 > 0$ (see Exercise 1) and,

solve (7) for $a$ yields

$a = -\sqrt{1+\frac{1}{b}}$.

It follows that for $a > 0, f(a) > 0$.

Consequently, $w-w_1$ is a negative quantity. i.e.,

$w-w1 < 0$

which tells that (5) is false.

Hence, when $e^2-2be-2e+b+1 \neq 0$, the global maximum $w_{max}$ is $w_2$.

Solving $w = w_2$ for $a$:

$\frac{(a+1)(b+1)((a+1)b+1}{(1-e+(a+1)b)((1-e)a+(a+1)b+1)} = \frac{(b+1)(2\sqrt{b(b+1)}e^2+2be^2+e^2-2\sqrt{b(b+1)}e-2be-2e+b+1)}{(e^2-2be-2e+b+1)^2}$,

we have

$a = \sqrt{1+ \frac{1}{b}}$.

Therefore,

$e^2-2be-2e+b+1 \neq 0 \implies w$ attains maximum at $a = \sqrt{1+ \frac{1}{b}}$.

In fact, $w$ attains maxima at $a = \sqrt{1+\frac{1}{b}}$ even when $e^2-2be-2e+b+1 = 0$, as shown below:

Solving $e^2-2be-2e+b+1 = 0$ for $e$, we have

$e_1= -\sqrt{b(b+1)}+b+1$ or $e_2 = \sqrt{b(b+1)} + b + 1$.

Only $e_1$ is valid (see Exercise-2),

When $e = e_1$,

$w(\sqrt{1+\frac{1}{b}} )- w(a) = - \frac{(b+1)g(a)}{4 \sqrt{b(b+1)} (\sqrt{b(b+1)}+ab) (a\sqrt{b(b+1)}+b+1)}\quad(8)$

where

$g(a) = (2a^2b+4ab+2b+2a+2)\sqrt{b(b+1)}-2a^2b^2-4ab^2-2b^2-a^2b-4ab-3b-1$

Solve quadratic equation $g(a) = 0$ for $a$ yields

$a = \sqrt{1+\frac{1}{b}}$.

The coefficient of $a^2$ in $g(a)$ is $2b\sqrt{b(b+1)}-2b^2-b$, a negative quantity (see Exercise-3).

The implication is that $g(a)$ is a negative quantity when $a \neq \sqrt{1 + \frac{1}{b}}$.

Hence, (8) is a positive quantity, i.e.,

$e^2-2be-2e+b+1 = 0, a \neq \sqrt{1+\frac{1}{b}} \implies w(\sqrt{1+\frac{1}{b}})-w(a) > 0$

We therefore conclude

$\forall 0 < e < 1, b > 0, w$ attains its maximum at $a = \sqrt{1+\frac{1}{b}}$.

Fig. 2

Exercise-1 Prove:$00 \implies$

$be^2-2b\sqrt{b(b+1)}e-2b^2e-2be+2b^2\sqrt{b(b+1)}+2b\sqrt{b(b+1)}+2b^3+3b^2+b > 0$

Exercise-2 Prove: $b > 0 \implies 0 <-\sqrt{b(b+1)} + b +1 <1$

Exercise-3 Prove: $b > 0 \implies 2b\sqrt{b(b+1)}-2b^2-b < 0$

# Prelude to Taylor’s theorem

As an application of derivative, we may prove the Binomial theorem that concerns the expansion of $(1+x)^n$ as a polynomial. Namely,

$(1+x)^n = 1+ a_1 x + a_2 x^2 + ... + a_n x^n\quad\quad\quad(1)$

where $a_k = \frac{n(n-1)(n-2)...(n-k+1)}{k!}, 1 \leq k \leq n, n \in N$.

There are two steps:

Step 1) Prove $(1+x)^n$ can be expressed as a polynomial $1+a_1 x + a_2 x^2 + ... + a_n x^n$, i.e.,

$(1+x)^n = 1 + \sum\limits_{i=1}^{n}a_i x^i$

where $a_k$s are constants.

Step 2) Show that

$a_k = \frac{n(n-1)(n-2)...(n-k+1)}{k!}$

We use mathematical induction first.

When $n = 1$,

$(1+x)^1 = 1+x = 1 +a_1 x$

where $a_1 = 1$.

Assume when $n=k, (1+x)^k$is a polynomial:

$(1+x)^k = 1+b_1 x + b_2 x + ... +b_k x^k\quad\quad\quad(2)$

When $n = k+1$,

$(1+x)^{k+1} = (1+x)^k (1+x)$.

By (2), it is

$(1+b_1 x+b_2 x^2+ ...+ b_k x^k) (1+x)$

$= (1+\sum\limits_{i=1}^{k} b_i x^i)(1+x)$

$= 1+ x + \sum\limits_{i=1}^{k}b_i x^i (1+x)$

$= 1+x + \sum\limits_{i=1}^{k}(b_i x^i +b_i x^{i+1})$

$= 1+x +\sum\limits_{i=1}^{k}b_i x^i + \sum\limits_{i=1}^{k} b_i x^{i+1}$

$= 1+x + (b_1 x + b_2 x^2 +... +b_k x^k) + (b_1 x^2 + ... + b_{k-1} x^k + b_k x^{k+1})$

$= 1+ (b_1+1)x + (b_2 + b_1) x^2 + ... + (b_k + b_{k-1}) x^k + b_k x^{k+1}$

$= 1 +a_1 x + a_2 x^2 + ... + a_k x^k + a_{k+1} x^{k+1}$

where $a_1 = b_1+1, a_2=b_2+b_1, ..., a_k = b_k + b_{k-1}, a_{k+1} = b_k$.

Once (1) is established, we proceed to step 2) to construct $a_k$:

From (1),

$\frac{d^k}{dx^k}(1+x)^n = \frac{d^k}{dx^k}(1+a_1 x + a_2 x^2 + ... + a_k x^k + ... +a_n x^n)$.

That is

$n(n-1)(n-2)...(n-k+1)(1+x)^{n-k} = k(k-1)(k-2)...1 \cdot {a_k} + \sum\limits_{i=1}^{n-k} c_i x^i\quad\quad\quad(3)$

where $c_i$s are constants.

Let $x = 0$, (3) becomes

$n(n-1)(n-2)...(n-k+1) \cdot 1= k(k-1)(k-2)...1 \cdot {a_k} + 0$

i.e.,

$n(n-1)(n-2)...(n-k+1) = k!\;a_k\quad\quad\quad(4)$

Solving (4) for $a_k$ gives

$a_k = \frac{n(n-1)(n-2)...(n-k+1)}{k!}$.

Since

$\frac{n(n-1)(n-2)...(n-k+1)}{k!} = \frac{n(n-1)(n-k+1)\boxed{(n-k)(n-k-1)...1}}{\boxed{(n-k)(n-k-1)...1}\;k!}=\frac{n!}{(n-k)!\;k!}$,

$a_k$ is often expressed as

$a_k = \frac{n!}{(n-k)!\;k!}$