Restate Feynman's "Great Identity"

In a letter dated July 7, 1948, Richard Feynman, who would later receive the Nobel Prize in Physics (1965), made a claim:

I am the possessor of a swanky new scheme to do each problem in terms of one with one less energy denominator. It is based on the great identity \frac{1}{ab} = \int \limits_{0}^{1} \frac{1}{(ax+b(1-x))^2}\;dx.

(The Beat of a Different Drum: The Life and Science of Richard Feynman, by Jagdish Mehra, page 262)

Assuming non-zero constants a, b are both real but otherwise arbitrary, let’s check the validity of Feynman’s “great identity”.

If a \ne b,

\int \frac{1}{(ax+b(1-x))^2}\;dx

= \int \frac{1}{((a-b)x+b)^2}\;dx

=  \int \frac{1}{a-b}\cdot \frac{a-b}{((a-b)x+b)^2}\;dx

= \frac{1}{a-b}\int \frac{((a-b)x+b)'}{((a-b)x+b)^2}\;dx

= \frac{1}{a-b}\cdot \frac{-1}{(a-b)x+b}.

Using Leibniz’s rule,

\int\limits_{0}^{1}\frac{1}{(ax+b(1-x))^2}\;dx = \frac{1}{a-b}\cdot\frac{-1}{(a-b)x+b}\bigg| _{0}^{1} =\frac{1}{a-b}(\frac{-1}{a}-\frac{-1}{b})=\frac{1}{ab}.

When a=b,

\int \frac{1}{(ax+b(1-x))^2}\;dx = \int \frac{1}{b^2}\;dx =\frac{1}{b^2}x


\int \limits_{0}^{1}\frac{1}{(ax+b(1-x))^2}\;dx = \frac{1}{b^2}x\bigg|_{0}^{1} = \frac{1}{b^2}=\frac{1}{ab}.

All is as Feynman claimed:

\int\limits_{0}^{1}\frac{1}{(ax+b(1-x))^2}\;dx = \frac{1}{ab}\quad\quad\quad(\star)

Yet there is something amiss:

If a and b have opposite signs, i.e., ab<0, then the right-hand side of (\star) is negative. But the integrand is a square, so the integral on the left-hand side of (\star) is never negative, no matter what a and b may be.

Let’s figure it out !

In its full glory, the Leibniz rule used to obtain (\star) is:

If the real-valued function F on an open interval I in R has the continuous derivative f and a, b \in I then \int\limits_{a}^{b} f(x)\; dx = F(b)-F(a).

Essentially, the rule requires the integrand f to be a continuous function on an open interval that contains a and b.

Solving ax+b(1-x)=0 for x yields

x = \frac{b}{b-a},

the singularity of the integrand \frac{1}{(ax+b(1-x))^2} in (\star).

For ab<0, we consider the following two cases:

Case (1-1) (a>0, b<0) \implies (a>0, b<0, b-a<0)\implies (\frac{b}{b-a}>0,

\frac{b}{b-a} = \frac{b-a+a}{b-a} = 1+\frac{a}{b-a}<1)\implies 0<\frac{b}{b-a}<1

Case (1-2) (a<0, b>0) \implies (a<0, b>0, b-a>0) \implies (\frac{b}{b-a}>0,

\frac{b}{b-a} = \frac{b-a+a}{b-a} = 1+\frac{a}{b-a}<1) \implies  0<\frac{b}{b-a}<1

From both cases, we see that

when ab<0, \frac{1}{(ax+b(1-x))^2} has a singularity in (0, 1) \implies \frac{1}{(ax+b(1-x))^2} is not continuous in (0, 1).

Applying Leibniz’s rule to \int\limits_{0}^{1} \frac{1}{(ax+b(1-x))^2}\;dx in spite of the integrand’s singularity is what guaranteed an absurd outcome.

However, ab>0 paints a different picture.

Since ab>0 \implies (a>0, b>0) or (a<0, b<0), we have

Case (2-0) a=b \implies \frac{1}{(ax+b(1-x))^2}=\frac{1}{b^2} \implies no singularity

Case (2-1) (a>0, b>0, a>b) \implies (a>0, b>0, b-a<0) \implies \frac{b}{b-a}<0

Case (2-2) (a>0, b>0, a<b) \implies (a>0, b>0, b-a>0)

\implies \frac{b}{b-a} = \frac{b-a+a}{b-a}=1+\frac{a}{b-a}>1

Case (2-3) (a<0, b<0, a>b) \implies (a<0, b<0, b-a<0) \implies \frac{b}{b-a}=1+\frac{a}{b-a}>1

Case (2-4) (a<0, b<0, a<b) \implies (a<0, b<0, b-a>0) \implies \frac{b}{b-a} <0

All cases show that when ab>0, the integrand has no singularity in (0,1).

It means that \frac{1}{(ax+b(1-x))^2} is continuous in (0, 1) and therefore, Leibniz’s rule applies.

So let’s restate Feynman’s “great identity”:

a\cdot b > 0 \iff \frac{1}{ab} = \int\limits_{0}^{1}\frac{1}{(ax+b(1-x))^2}\;dx
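The restated identity can be spot-checked numerically. The following is a sketch using a plain midpoint rule rather than a CAS; the function and sample pairs (a, b) are arbitrary choices, not part of the original post:

```python
# Midpoint-rule spot-check of the restated identity (a sketch; the helper
# names and the sample (a, b) pairs below are illustrative choices).

def integrand(x, a, b):
    return 1.0 / (a * x + b * (1 - x)) ** 2

def midpoint_integral(a, b, n):
    # midpoint-rule approximation of the integral over [0, 1] with n panels
    h = 1.0 / n
    return h * sum(integrand((i + 0.5) * h, a, b) for i in range(n))

# ab > 0: the quadrature agrees with 1/(ab)
for a, b in ((2.0, 3.0), (-2.0, -5.0), (7.0, 7.0)):
    assert abs(midpoint_integral(a, b, 100000) - 1.0 / (a * b)) < 1e-6

# ab < 0: the integrand has a non-integrable singularity inside (0, 1);
# the midpoint sums grow without bound instead of approaching 1/(ab) < 0
assert midpoint_integral(1.0, -1.0, 20000) > midpoint_integral(1.0, -1.0, 5000) > 1000
```

The blow-up of the sums for ab<0 is exactly the singularity at x=\frac{b}{b-a} discussed above.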

Exercise-1 Evaluate \int\limits_{0}^{1}\frac{1}{(ax+b(1-x))^2}\;dx using Omega CAS Explorer.

(hint : for ab>0, specify a > b or b>a)

Oh! Matryoshka!

Given polynomial f(x) = a_0 + a_1 x+a_2 x^2 + ... + a_{n-1}x^{n-1}+a_n x^n, we wish to evaluate integral

\int \frac{f(x)}{(x-a)^p}\;dx, \quad p \in N^+\quad\quad\quad(1)

When p = 1,

\int \frac{f(x)}{x-a} \;dx= \int \frac{f(x)-f(a)+f(a)}{x-a}\;dx

= \int \frac{f(x)-f(a)}{x-a}\;dx + \int \frac{f(a)}{x-a}\;dx

=\int \frac{f(x)-f(a)}{x-a}\;dx + f(a)\cdot \log(x-a).


f(x) = a_0 + a_1x + a_2x^2 + ... + a_{n-1}x^{n-1} + a_n x^n


f(a) = a_0 + a_1 a + a_2 a^2 + ... + a_{n-1}a^{n-1} + a_n a^n

It follows that

f(x)-f(a) = a_1(x-a) + a_2(x^2-a^2) + ... + a_{n-1}(x^{n-1}-a^{n-1}) + a_n (x^n-a^n).

That is

f(x)-f(a) = \sum\limits_{k=1}^{n}a_k(x^k-a^k)

By the fact (see “Every dog has its day“) that

x^k-a^k =(x-a)\sum\limits_{i=1}^{k}x^{k-i}a^{i-1},

we have

f(x)-f(a) = \sum\limits_{k=1}^{n}a_k(x-a)\sum\limits_{i=1}^{k}x^{k-i}a^{i-1}=(x-a)\sum\limits_{k=1}^{n}(a_k\sum\limits_{i=1}^{k}x^{k-i}a^{i-1})


\frac{f(x)-f(a)}{x-a}=  \sum\limits_{k=1}^{n}(a_k\sum\limits_{i=1}^{k}x^{k-i}a^{i-1})\quad\quad\quad(2)


\int\frac{f(x)}{x-a}\;dx = \int \sum\limits_{k=1}^{n}(a_k \sum\limits_{i=1}^{k}x^{k-i}a^{i-1})\;dx + f(a)\log(x-a)

=\sum\limits_{k=1}^{n}(a_k \sum\limits_{i=1}^{k}\int x^{k-i}a^{i-1}\; dx)+ f(a)\log(x-a)


\int \frac{f(x)}{x-a}\;dx = \sum\limits_{k=1}^{n}(a_k\sum\limits_{i=1}^{k}\frac{x^{k-i+1}}{k-i+1}a^{i-1})+ f(a)\log(x-a)

Let us now consider the case when p>1:

\int \frac{f(x)}{(x-a)^p}\; dx

=\int \frac{f(x)-f(a)+f(a)}{(x-a)^p}\;dx

=\int  \frac{f(x)-f(a)}{(x-a)^p} + \frac{f(a)}{(x-a)^p}\;dx

=\int \frac{f(x)-f(a)}{(x-a)}\cdot\frac{1}{(x-a)^{p-1}} + \frac{f(a)}{(x-a)^p}\;dx

= \int \frac{f(x)-f(a)}{x-a}\cdot\frac{1}{(x-a)^{p-1}}\;dx + \int\frac{f(a)}{(x-a)^p}\; dx

\overset{(2)}{=}\int \frac{g(x)}{(x-a)^{p-1}}\;dx + \frac{f(a)(x-a)^{1-p}}{1-p}


g(x) = \frac{f(x)-f(a)}{x-a}=\sum\limits_{k=1}^{n}(a_k\sum\limits_{i=1}^{k}x^{k-i}a^{i-1}), a polynomial of degree n-1.

What emerges from the two cases of p is a recursive algorithm for evaluating (1):

Given polynomial f(x) = \sum\limits_{k=0}^{n} a_k x^k,

\int \frac{f(x)}{(x-a)^p} \;dx, \; p \in N^+= \begin{cases}p=1: \sum\limits_{k=1}^{n}(a_k\sum\limits_{i=1}^{k}\frac{x^{k-i+1}}{k-i+1}a^{i-1})+ f(a)\log(x-a) \\p>1:  \int \frac{g(x)}{(x-a)^{p-1}}\;dx + \frac{f(a)(x-a)^{1-p}}{1-p}, \\ \quad\quad\quad g(x) = \frac{f(x)-f(a)}{x-a}=\sum\limits_{k=1}^{n}(a_k\sum\limits_{i=1}^{k}x^{k-i}a^{i-1}). \end{cases}
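The recursion translates directly into code. Below is a Python sketch (all names illustrative; a polynomial is represented by its coefficient list [a_0, ..., a_n]) that builds the antiderivative as a callable and checks it by numerical differentiation:

```python
import math

def poly_eval(c, x):
    """Horner evaluation of a0 + a1*x + ... + an*x^n."""
    r = 0.0
    for coef in reversed(c):
        r = r * x + coef
    return r

def quotient(c, a):
    """Coefficients of g(x) = (f(x) - f(a))/(x - a), by synthetic division."""
    out, b = [], 0.0
    for coef in reversed(c):
        b = coef + a * b
        out.append(b)
    return list(reversed(out[:-1]))  # drop the remainder, which equals f(a)

def integrate_poly(c):
    """Coefficients of the term-by-term antiderivative (constant of 0)."""
    return [0.0] + [coef / (k + 1) for k, coef in enumerate(c)]

def antiderivative(c, a, p):
    """A callable F with F'(x) = f(x)/(x - a)^p, following the recursion."""
    fa = poly_eval(c, a)
    if p == 1:
        P = integrate_poly(quotient(c, a))
        return lambda x: poly_eval(P, x) + fa * math.log(abs(x - a))
    rest = antiderivative(quotient(c, a), a, p - 1)
    return lambda x: rest(x) + fa * (x - a) ** (1 - p) / (1 - p)

# check F' against the integrand by a central difference
c, a, p = [1.0, 2.0, 3.0], 2.0, 3          # f(x) = 1 + 2x + 3x^2
F = antiderivative(c, a, p)
x, h = 5.0, 1e-6
dF = (F(x + h) - F(x - h)) / (2 * h)
assert abs(dF - poly_eval(c, x) / (x - a) ** p) < 1e-5
```

Synthetic division replaces the double sum (2) here; both produce the same quotient g.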

Exercise-1 Optimize the above recursive algorithm (hint: examine how it handles the case when f(x)=0)

Integration of Trigonometric Expressions

We will introduce an algorithm for obtaining indefinite integrals such as

\int \frac{(1+\sin(x))}{\sin(x)(1+\cos(x))}\;dx

or, in general, integral of the form

\int R(\sin(x), \cos(x))\;dx\quad\quad\quad(1)

where R is any rational function R(p, q), with p=\sin(x), q=\cos(x).


The algorithm is built on the substitution

t = \tan(\frac{x}{2})\quad\quad(2)

Solving (2) for x, we have

x = 2\cdot\arctan(t)\quad\quad\quad(3)

which provides

\frac{dx}{dt} = \frac{2}{1+t^2}\quad\quad\quad(4)


\sin(x) =2\sin(\frac{x}{2})\cos(\frac{x}{2})\overset{\cos^2(\frac{x}{2})+\sin^2(\frac{x}{2})=1}{=}\frac{2\sin(\frac{x}{2})\cos(\frac{x}{2})}{\cos^2(\frac{x}{2})+\sin^2(\frac{x}{2})}=\frac{2\frac{\sin(\frac{x}{2})}{\cos(\frac{x}{2})}}{1+\frac{\sin^2(\frac{x}{2})}{\cos^2(\frac{x}{2})}}=\frac{2\tan(\frac{x}{2})}{1+\tan^2(\frac{x}{2})}


\sin(x) = \frac{2 t}{1+t^2}\quad\quad\quad(5)


\cos(x) = \cos^2(\frac{x}{2})-\sin^2(\frac{x}{2})=\frac{\cos^2(\frac{x}{2})-\sin^2(\frac{x}{2})}{\cos^2(\frac{x}{2})+\sin^2(\frac{x}{2})}=\frac{1-\frac{\sin^2(\frac{x}{2})}{\cos^2(\frac{x}{2})}}{1+\frac{\sin^2(\frac{x}{2})}{\cos^2(\frac{x}{2})}}=\frac{1-\tan^2(\frac{x}{2})}{1+\tan^2(\frac{x}{2})}

i.e.,

\cos(x) = \frac{1-t^2}{1+t^2}\quad\quad\quad(6)

We also have (see “Finding Indefinite Integrals” )

\int f(x)\;dx \overset{x=\phi(t)}{=} \int f(\phi(t))\cdot\frac{d\phi(t)}{dt}\;dt.


\int R(\cos(x), \sin(x))\;dx \overset{(2), (4), (5), (6)}{=} \int R(\frac{1-t^2}{1+t^2}, \frac{2t}{1+t^2})\cdot\frac{2}{1+t^2}\;dt,

and (1) is reduced to an integral of rational functions in t.
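The pieces of the substitution can be spot-checked numerically. A sketch, with an arbitrary sample point in (0, \pi):

```python
import math

# Spot-check of (2), (4), (5), (6) at an arbitrary sample point.
x = 1.234
t = math.tan(x / 2)                       # the substitution (2)

assert abs(math.sin(x) - 2 * t / (1 + t * t)) < 1e-12        # (5)
assert abs(math.cos(x) - (1 - t * t) / (1 + t * t)) < 1e-12  # (6)

# (4): dx/dt = 2/(1+t^2), checked by a central difference on x = 2*arctan(t)
h = 1e-6
dx_dt = (2 * math.atan(t + h) - 2 * math.atan(t - h)) / (2 * h)
assert abs(dx_dt - 2 / (1 + t * t)) < 1e-8
```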

Example-1 Evaluate \int \csc(x)\;dx.

Solution: \csc(x) = \frac{1}{\sin(x)}\implies \int \csc(x)\;dx = \int \frac{1}{\sin(x)}\;dx

= \int \frac{1}{\frac{2t}{1+t^2}}\cdot\frac{2}{1+t^2}\;dt=\int\frac{1}{t}\;dt = \log(t) = \log(\tan(\frac{x}{2})).

Example-2 Evaluate \int \sec(x)\;dx.

Solution: \sec(x) = \frac{1}{\cos(x)}\implies \int \sec(x)\; dx =\int \frac{1}{\cos(x)}\;dx

=  \int \frac{1}{\frac{1-t^2}{1+t^2}}\cdot \frac{2}{1+t^2}\; dt=\int \frac{2}{1-t^2}\;dt=\int \frac{2}{(1+t)(1-t)}\;dt=\int \frac{1}{1+t} + \frac{1}{1-t}\;dt

=\int \frac{1}{1+t}\;dt - \int \frac{-1}{1-t}\;dt

=\log(1+t) -\log(1-t) =\log\frac{1+t}{1-t}=\log(\frac{1+\tan(\frac{x}{2})}{1-\tan(\frac{x}{2})}).
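Both antiderivatives can be sanity-checked by numerical differentiation. A sketch, restricted to (0, \pi/2) where both logarithms are defined:

```python
import math

# Differentiate the antiderivatives from Example-1 and Example-2 numerically
# and compare with csc and sec (a sketch; sample points are arbitrary).

def F_csc(x):
    return math.log(math.tan(x / 2))

def F_sec(x):
    return math.log((1 + math.tan(x / 2)) / (1 - math.tan(x / 2)))

h = 1e-6
for x in (0.5, 1.0, 1.3):
    d_csc = (F_csc(x + h) - F_csc(x - h)) / (2 * h)
    d_sec = (F_sec(x + h) - F_sec(x - h)) / (2 * h)
    assert abs(d_csc - 1 / math.sin(x)) < 1e-6
    assert abs(d_sec - 1 / math.cos(x)) < 1e-6
```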

According to CAS (see Fig. 1),

Fig. 1

However, the two results are equivalent as a CAS-aided verification (see Fig. 2) confirms their difference is a constant (see Corollary 2 in “Sprint to FTC“).

Fig. 2

Exercise-1 According to CAS,

Show that it is equivalent to the result obtained in Example-1.

Exercise-2 Try

\int \frac{1}{\sin(x)+1}\;dx

\int \frac{1}{\sin(x)+\cos(x)}\;dx

\int \frac{1}{(2+\cos(x))\sin(x)}\;dx

\int \frac{1}{5+4\sin(x)}\;dx

\int \frac{1}{2\sin(x)-\cos(x)+5}\;dx

and of course,

\int \frac{1+\sin(x)}{\sin(x)(1+\cos(x))}\;dx

Deriving Lagrange's multiplier

We wish to consider a special type of optimization problem:

Find the maximum (or minimum) of a function f(x,y) subject to the condition g(x,y)=0\quad\quad(1)

If it is possible to solve g(x, y)=0 for y so that y is expressed explicitly as y=\psi(x), then substituting for y in (1) turns the problem into:

Find the maximum (or minimum) of a single variable function f(x, \psi(x)).

In the case that y can not be obtained from solving g(x,y)=0, we re-state the problem as:

Find the maximum (or minimum) of a single variable function z=f(x,y) where y is a function of x, implicitly defined by g(x, y)=0\quad\quad\quad(2)

Following the traditional procedure of finding the maximum (or minimum) of a single variable function, we differentiate z with respect to x:

\frac{dz}{dx} = f_x(x,y) + f_y(x,y)\cdot \frac{dy}{dx}\quad\quad\quad(3)


Differentiating g(x, y)=0 with respect to x gives

g_x(x,y) + g_y(x,y)\cdot \frac{dy}{dx}=0\quad\quad\quad(4)

By grouping g(x, y)=0 and (3), we have

\begin{cases} \frac{dz}{dx}= f_x(x, y)+f_y(x, y)\cdot \frac{dy}{dx}\\ g(x,y) = 0\end{cases}\quad\quad\quad(5)

The fact that \frac{dz}{dx}= 0 at any stationary point x^* means for all (x^*, y^*) where g(x^*, y^*)=0,

\begin{cases} f_x(x^*, y^*)+f_y(x^*, y^*)\cdot \frac{dy}{dx}\vert_{x=x^*}=0 \\ g(x^*,y^*) = 0\end{cases}\quad\quad\quad(6)

If g_y(x^*,y^*) \ne 0 then from (4),

\frac{dy}{dx}\vert_{x=x^*} = \frac{-g_x(x^*, y^*)}{g_y(x^*, y^*)}

Substituting it into (6) gives

\begin{cases} f_x(x^*, y^*)+f_y(x^*, y^*)\cdot (\frac{-g_x(x^*, y^*)}{g_y(x^*, y^*)})=f_x(x^*, y^*)+g_x(x^*, y^*)\cdot (\frac{-f_y(x^*, y^*)}{g_y(x^*, y^*)})=0\\ g(x^*,y^*) = 0\end{cases}\quad\quad\quad(7)

Letting \lambda = \frac{-f_y(x^*, y^*)}{g_y(x^*, y^*)}, we have

f_y(x^*, y^*) + \lambda g_y(x^*, y^*) =0\quad\quad\quad(8)

Combining (7) and (8) gives

\begin{cases} f_x(x^*, y^*)+\lambda g_x(x^*, y^*) = 0 \\ f_y(x^*, y^*)+\lambda g_y(x^*, y^*)=0 \\ g(x^*, y^*) = 0\end{cases}

It follows that to find the stationary points of z, we solve

\begin{cases} f_x(x, y)+\lambda g_x(x, y) = 0 \\ f_y(x, y)+\lambda g_y(x, y)=0 \\ g(x, y) = 0\end{cases}\quad\quad\quad(9)

for x, y and \lambda.

This is known as the method of Lagrange’s multiplier.

Let F(x,y,\lambda) = f(x,y) + \lambda g(x,y). Since


F_x(x,y,\lambda) = f_x(x,y) + \lambda g_x(x,y),

F_y(x,y,\lambda)=f_y(x,y) + \lambda  g_y(x,y),

F_{\lambda}(x,y,\lambda) = g(x, y),

(9) is equivalent to

\begin{cases} F_x(x, y, \lambda)=0 \\ F_y(x,y,\lambda)=0 \\ F_{\lambda}(x, y, \lambda) = 0\end{cases}\quad\quad\quad(10)

Let’s look at some examples.

Example-1 Find the minimum of f(x, y) = x^2+y^2 subject to the condition that x+y=4.

Let F(x, y, \lambda) = x^2+y^2+\lambda(x+y-4).


Solving

\begin{cases}F_x=2x+\lambda=0  \\ F_y = 2y+\lambda = 0 \\ F_{\lambda} = x+y-4=0\end{cases}

for x, y, \lambda gives x=y=2, \lambda=-4.
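With F(x, y, \lambda) = x^2+y^2+\lambda(x+y-4) as defined above, the gradient components are 2x+\lambda, 2y+\lambda, and x+y-4. This particular system is linear, but (9) is nonlinear in general, so here is a small Newton iteration as a sketch (finite-difference Jacobian, hand-rolled 3x3 elimination; all names are illustrative):

```python
# Newton's method for system (10), sketched on Example-1's
# F(x, y, lam) = x^2 + y^2 + lam*(x + y - 4).

def grad_F(v):
    x, y, lam = v
    return [2*x + lam, 2*y + lam, x + y - 4]   # F_x, F_y, F_lambda

def solve3(A, b):
    """Gaussian elimination with partial pivoting for a 3x3 system."""
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for k in range(3):
        p = max(range(k, 3), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, 3):
            f = M[i][k] / M[k][k]
            for j in range(k, 4):
                M[i][j] -= f * M[k][j]
    x = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        x[i] = (M[i][3] - sum(M[i][j] * x[j] for j in range(i + 1, 3))) / M[i][i]
    return x

def newton(G, v, steps=20, h=1e-6):
    for _ in range(steps):
        g = G(v)
        # forward-difference Jacobian, built column by column
        J = [[0.0] * 3 for _ in range(3)]
        for k in range(3):
            vk = v[:]
            vk[k] += h
            gk = G(vk)
            for i in range(3):
                J[i][k] = (gk[i] - g[i]) / h
        d = solve3(J, [-gi for gi in g])
        v = [v[i] + d[i] for i in range(3)]
    return v

x, y, lam = newton(grad_F, [0.0, 0.0, 0.0])
assert abs(x - 2) < 1e-6 and abs(y - 2) < 1e-6 and abs(lam + 4) < 1e-6
```

The same `newton` helper applies unchanged to nonlinear systems such as the one in Example-2 below, with `grad_F` swapped out.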

When x=2, y=2, x^2+y^2=2^2+2^2=8.

\forall (x, y) \ne (2, 2) with x+y=4, we have

(x-2)^2 + (y-2)^2 > 0.

That is,

x^2-4x+4 + y^2-4y+4 = x^2+y^2-4(x+y)+8 \overset{x+y=4}{=} x^2+y^2-16+8>0.


Hence

x^2+y^2>8, \quad (x,y) \ne (2,2).

The target function x^2+y^2 with constraint x+y=4 indeed attains its global minimum at (x, y) = (2, 2).

I first encountered this problem during junior high school and solved it:

(x-y)^2 \ge 0 \implies x^2+y^2 \ge 2xy

x+y=4\implies (x+y)^2=16\implies x^2+2xy +y^2=16\implies 2xy=16-(x^2+y^2)

x^2+y^2 \ge 16-(x^2+y^2) \implies x^2+y^2 \ge 8\implies z_{min} = 8.

I solved it again in high school when quadratic equations were discussed:

x+y=4 \implies y =4-x

z=x^2+y^2 \implies z = x^2+(4-x)^2 \implies 2x^2-8x+16-z=0

\Delta =  64-4 \cdot 2\cdot (16-z) \ge 0 \implies z \ge 8\implies z_{min} = 8

In my freshman calculus class, I solved it yet again:

x+y=4 \implies y=4-x

z = x^2+(4-x)^2

\frac{dz}{dx} = 2x+2(4-x)(-1)=2x-8+2x=4x-8

\frac{dz}{dx} =0  \implies x=2

\frac{d^2 z}{dx^2} = 4 > 0 \implies x=2, z_{min}=2^2+(4-2)^2=8

Example-2 Find the shortest distance from the point (1,0) to the parabola y^2=4x.

We minimize f = (x-1)^2+y^2 where y^2=4x.

If we eliminate y^2 in f, then f = (x-1)^2+4x. Solving \frac{df}{dx} = 2x+2=0 gives x=-1. Clearly, this is not valid, for it would suggest that y^2=-4 from y^2=4x, an absurdity.

By Lagrange’s method, we solve

\begin{cases} 2(x-1)-4\lambda=0 \\2y\lambda+2y = 0 \\y^2-4x=0\end{cases}

Fig. 1

The only valid solution is x=0, y=0, \lambda=-\frac{1}{2}. At (x, y) = (0, 0), f=(0-1)^2+0^2=1. It is the global minimum:

\forall (x, y) \ne (0, 0), y^2=4x \implies x>0.

(x-1)^2+y^2 \overset{y^2=4x}{=}(x-1)^2+4x=x^2-2x+1+4x=x^2+2x+1\overset{x>0}{>}1=f(0,0)

Example-3 Find the shortest distance from the point (a, b) to the line Ax+By+C=0.

We want to find a point (x_0, y_0) on the line Ax+By+C=0 so that the distance between (a, b) and (x_0, y_0) is minimal.

To this end, we minimize (x_0-a)^2+(y_0-b)^2 where Ax_0+By_0+C=0 (see Fig. 2)

Fig. 2

We found that

x_0=\frac{aB^2-bAB-AC}{A^2+B^2}, y_0=\frac{bA^2-aAB-BC}{A^2+B^2}

and the distance between (a, b) and (x_0, y_0) is

\sqrt{(x_0-a)^2+(y_0-b)^2} = \frac{|Aa+Bb+C|}{\sqrt{A^2+B^2}}.\quad\quad\quad(11)

To show that (11) is the minimal distance, let (x, y) \ne (x_0, y_0) be any point with Ax+By+C=0.

Letting d_1 = x-x_0, d_2=y-y_0, we have

x = x_0 + d_1, \; y=y_0 + d_2, \; (d_1, d_2) \ne (0, 0).

Since Ax+By+C=0,

A(x_0+d_1)+B(y_0+d_2)+C=0.

That is,

(Ax_0+By_0+C)+Ad_1+Bd_2=0.

By the fact that Ax_0+By_0+C=0, we have

Ad_1 + Bd_2 =0\quad\quad\quad(12)

Compute (x-a)^2+(y-b)^2 - ((x_0-a)^2+(y_0-b)^2) (see Fig. 3)

Fig. 3



After some rearrangement and factoring, it becomes

\frac{-2C}{A^2+B^2}(Ad_1+Bd_2)+\frac{-2bB}{A^2+B^2}(Ad_1+Bd_2)+\frac{-2aA}{A^2+B^2}(Ad_1+Bd_2) + d_1^2+d_2^2

By (12), it reduces to

d_1^2 + d_2^2.

This is clearly a positive quantity. Therefore,

\forall (x, y) \ne (x_0, y_0), Ax+By+C=0 \implies (x-a)^2+(y-b)^2> (x_0-a)^2+(y_0-b)^2
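Example-3 admits a quick numeric check. A sketch; the particular line and point below are arbitrary choices:

```python
import math

A, B, C = 3.0, -4.0, 7.0      # an arbitrary line Ax + By + C = 0
a, b = 2.0, 1.0               # an arbitrary point

# the closed-form foot of the perpendicular from (a, b)
x0 = (a*B*B - b*A*B - A*C) / (A*A + B*B)
y0 = (b*A*A - a*A*B - B*C) / (A*A + B*B)

# (x0, y0) is on the line, and its distance to (a, b) matches (11)
assert abs(A*x0 + B*y0 + C) < 1e-12
d = math.hypot(x0 - a, y0 - b)
assert abs(d - abs(A*a + B*b + C) / math.hypot(A, B)) < 1e-12

# every other point on the line, parametrized along direction (B, -A),
# is strictly farther from (a, b)
for t in (-2.0, 0.5, 3.0):
    x, y = x0 + B*t, y0 - A*t
    assert abs(A*x + B*y + C) < 1e-9
    assert math.hypot(x - a, y - b) > d
```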

Introducing Operator Delta

The r^{th} order finite difference of a function f(x) is defined by

\Delta^r f(x) = \begin{cases} f(x+1)-f(x), r=1\\ \Delta(\Delta^{r-1}f(x)), r > 1\end{cases}

From this definition, we have

\Delta f(x) = \Delta^1 f(x) = f(x+1)-f(x)


\Delta^2 f(x) = \Delta (\Delta^{2-1} f(x))

= \Delta (\Delta f(x))

= \Delta( f(x+1)-f(x))

= (f(x+2)-f(x+1)) - (f(x+1)-f(x))

= f(x+2)-2f(x+1)+f(x)

as well as

\Delta^3 f(x) = \Delta (\Delta^2 f(x))

= \Delta (f(x+2)-2f(x+1)+f(x))

= (f(x+3)-2f(x+2)+f(x+1)) - (f(x+2)-2f(x+1)+f(x))

= f(x+3)-3f(x+2)+3f(x+1)-f(x)

The function shown below generates \Delta^r f(x), r:1\rightarrow 5 (see Fig. 1).

delta_(g, n) := block([d : g(x+1) - g(x)],
    disp(expand(d)),
    for i : 2 thru n do (d : subst(x+1, x, d) - d, disp(expand(d)))
);

Fig. 1

Compare to the result of expanding (f(x)-1)^r=\sum\limits_{i=0}^r(-1)^i \binom{r}{i} f(x)^{r-i}, r:1\rightarrow 5 (see Fig. 2)

Fig. 2

It seems that

\Delta^r f(x) = \sum\limits_{i=0}^r(-1)^i \binom{r}{i} f(x+r-i)\quad\quad\quad(1)
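Before proving (1), it can be spot-checked numerically. A sketch; the choice f(x) = x^3 + 1 and the sample points are arbitrary:

```python
from math import comb

def f(x):
    return x**3 + 1

def delta(g):
    # the first-order difference operator
    return lambda x: g(x + 1) - g(x)

def delta_r(g, r):
    # r-fold application, per the recursive definition
    for _ in range(r):
        g = delta(g)
    return g

# compare the recursion with the binomial-sum formula (1)
for r in range(1, 6):
    for x in range(5):
        rhs = sum((-1)**i * comb(r, i) * f(x + r - i) for i in range(r + 1))
        assert delta_r(f, r)(x) == rhs
```

Note that \Delta^3 of a cubic is the constant 3!\cdot a_3 and \Delta^4 of it vanishes, which the check covers.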

Let’s prove it!

We have already shown that (1) is true for r= 1, 2, 3.

Assuming (1) is true when r=k-1 \ge 3:

\Delta^{k-1} f(x) = \sum\limits_{i=0}^{k-1}(-1)^i \binom{k-1}{i} f(x+k-1-i)\quad\quad\quad(2)

When r=k,

\Delta^k f(x) = \Delta(\Delta^{k-1} f(x))

\overset{(2)}{=}\Delta (\sum\limits_{i=0}^{k-1}(-1)^i \binom{k-1}{i}f(x+k-1-i))

=\sum\limits_{i=0}^{k-1}(-1)^i \binom{k-1}{i}f(x+1+k-1-i)-\sum\limits_{i=0}^{k-1}(-1)^i \binom{k-1}{i}f(x+k-1-i)

=(-1)^0 \binom{k-1}{0}f(x+k-0)

+\sum\limits_{i=1}^{k-1}(-1)^i \binom{k-1}{i}f(x+k-i)-\sum\limits_{i=0}^{k-2}(-1)^i \binom{k-1}{i}f(x+k-1-i)

-(-1)^{k-1}\binom{k-1}{k-1}f(x)


\overset{\binom{k-1}{0} = \binom{k-1}{k-1}=1}{=}

f(x+k)+ \sum\limits_{i=1}^{k-1}(-1)^i \binom{k-1}{i}f(x+k-i)-\sum\limits_{i=0}^{k-2}(-1)^i \binom{k-1}{i}f(x+k-1-i) -(-1)^{k-1}f(x)

=f(x+k)+ \sum\limits_{i=1}^{k-1}(-1)^i \binom{k-1}{i}f(x+k-i)+\sum\limits_{i=0}^{k-2}(-1)^{i+1}\binom{k-1}{i}f(x+k-1-i) -(-1)^{k-1}f(x)

\overset{j=i+1, i:0 \rightarrow k-2\implies j:1 \rightarrow k-1}{=}

f(x+k)+ \sum\limits_{i=1}^{k-1}(-1)^i\binom{k-1}{i}f(x+k-i) + \sum\limits_{j=1}^{k-1}(-1)^j \binom{k-1}{j-1}f(x+k-j)-(-1)^{k-1}f(x)

= f(x+k)+ \sum\limits_{i=1}^{k-1}(-1)^i\binom{k-1}{i}f(x+k-i) + \sum\limits_{i=1}^{k-1}(-1)^i\binom{k-1}{i-1}f(x+k-i)+(-1)^k f(x)

= f(x+k) + \sum\limits_{i=1}^{k-1}(-1)^i f(x+k-i) (\binom{k-1}{i} + \binom{k-1}{i-1})+(-1)^k f(x)

\overset{\binom{k-1}{i} + \binom{k-1}{i-1}=\binom{k}{i}}{=}

f(x+k)+ \sum\limits_{i=1}^{k-1}(-1)^i \binom{k}{i} f(x+k-i)+(-1)^k f(x)

= (-1)^0 \binom{k}{0}f(x+k-0)+\sum\limits_{i=1}^{k-1}(-1)^i \binom{k}{i} f(x+k-i)+(-1)^k \binom{k}{k} f(x+k-k)

= \sum\limits_{i=0}^{k}(-1)^i \binom{k}{i} f(x+k-i).

This is exactly (1) with r=k, so by induction, (1) holds for all r \in N^+.