Notes on the partial fraction decomposition: why it always works

If you’ve taken any intro to Calculus class, you’re probably familiar with partial fraction decomposition.

In case you’re not, the idea is that you’re given some rational function with an awful denominator that you want to integrate, like:

\frac{4x-2}{x^2+2x-8}
And you break it up into smaller, simpler fractions:

\frac{1}{x-2} +\frac{3}{x+4}

This is the idea. If we get into the details, it gets fairly ugly — in a typical calculus textbook, you’ll find a plethora of rules regarding what to do in all sorts of cases: what to do when there are repeated linear factors, quadratic factors, repeated quadratic factors, and so on.
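As a sanity check, we can verify that \frac{1}{x-2} +\frac{3}{x+4} really does recombine into the single fraction \frac{4x-2}{x^2+2x-8}. A minimal sketch using exact rational arithmetic:

```python
from fractions import Fraction

def original(x):
    # the single fraction: (4x - 2) / ((x - 2)(x + 4))
    return Fraction(4 * x - 2, (x - 2) * (x + 4))

def decomposed(x):
    # the partial fraction form: 1/(x - 2) + 3/(x + 4)
    return Fraction(1, x - 2) + Fraction(3, x + 4)

# the two forms agree at every integer point away from the poles
for x in range(-50, 51):
    if x not in (2, -4):
        assert original(x) == decomposed(x)
print("decomposition verified")
```

Since two rational functions of this degree that agree at more than a handful of points must be identical, checking many integer points is convincing evidence the algebra is right.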

Since the textbooks generously cover this for us, we’ll assume that we know what to do with a rational polynomial with some polynomial as the numerator, and some number of linear or quadratic factors in the denominator. We can do partial fraction decomposition on this. If we like, we could integrate it too. I’m talking about anything of this form:

\frac{P(x)}{((ax+b)(cx+d) \cdots)((ex^2+fx+g)(hx^2+ix+j) \cdots)}

Although we won’t prove this, this seems fairly believable. We’ll assume that once we get a fraction into this form, we’re done and we can let existing partial fraction methods take care of the rest.

Can Partial Fractions Fail?

What if the denominator is a polynomial of degree greater than two? So let’s say:


Fortunately, here the denominator can be factored, giving us a form we can deal with:


But we were lucky that time. After all, not all polynomials can be factored, right? What if we have this:

\frac{1}{x^3+5}
We can’t factor this using integer coefficients. What can we do?

It turns out that this isn’t a huge problem: we never required the coefficients of the factors to be integers! Although the result is awkward, the denominator can still be factored:

\frac{1}{(x + 5^{1/3})(x^2-5^{1/3}x+5^{2/3})}

Other than making the next step somewhat algebraically tedious, this decomposition is perfectly valid. The coefficients need not be integers, or even be expressed with radicals. As long as every coefficient is real, partial fraction decomposition will work fine.
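We can spot-check this factorization numerically: the product of the two factors should recover x^3 + 5 at every point, up to floating-point error.

```python
import math

r = 5 ** (1 / 3)  # the real cube root of 5

def factored(x):
    # (x + 5^(1/3)) * (x^2 - 5^(1/3) x + 5^(2/3))
    return (x + r) * (x * x - r * x + r * r)

# check several sample points against x^3 + 5
for x in (-3.5, -1.0, 0.0, 0.25, 2.0, 10.0):
    assert math.isclose(factored(x), x ** 3 + 5, abs_tol=1e-9)
print("factorization matches x^3 + 5")
```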

Universality of Partial Fractions

The logical next question would be: can every rational function be written in the previous partial fraction decomposition-suitable form? Looking through my calculus textbooks, none seemed to provide a proof of this — and failing to find a proof on the internet, I’ll give the proof here.

We need to prove that any polynomial that might appear in the denominator of a rational function, say Q(x), can be broken down into linear or quadratic factors with real coefficients.

In order to prove this, we’ll need the following two theorems:

  • Fundamental Theorem of Algebra — any polynomial of degree n can be written as its leading coefficient a times a product of n linear complex factors: Q(x) = a(x-z_1) (x-z_2) \cdots (x-z_n)
  • Complex Conjugate Root Theorem — if some complex number a + bi is a root of some polynomial with real coefficients, then its conjugate a-bi is also a root.

Starting with the denominator polynomial Q(x), we break it down using the Fundamental Theorem of Algebra into complex factors. Of these factors, some will be real, while others will be complex.

Consider the complex factors of Q(x). By the complex conjugate root theorem, for every complex root z = a+bi we have, its conjugate \bar{z} = a-bi is also a root. Hence we can take all of the complex factors and pair each one up with its conjugate. Why? Because multiplying a conjugate pair of factors always produces a quadratic with real coefficients: (x-z)(x-\bar{z}) = x^2 - (z+\bar{z})x + z\bar{z} = x^2 - 2ax + (a^2+b^2).

We are now left with real linear factors and conjugate pairs of complex factors. Each conjugate pair multiplies to a quadratic polynomial with real coefficients, so we are done.

At least in theory, partial fraction decomposition always works. The catch is that we relied on the Fundamental Theorem of Algebra to hand us the roots of our polynomial. Often, these roots aren’t simple integers or radicals; often they can’t really be expressed exactly at all. So we should say: partial fraction decomposition always works, if you’re fine with having infinitely long decimals in the decomposition.

Simplifying a sum of consecutive cubes

If you were asked to find the sum of the first four cubes, you would easily be able to do so:

1^3 + 2^3 + 3^3 + 4^3 = 100

But here you may notice something interesting:

1^3 + 2^3 + 3^3 + 4^3 = (1+2+3+4)^2

It turns out that this holds for the sum of the first n cubes for any n. The identity can also be written as:

\displaystyle \sum_{i=1}^n i^3 = \left( \displaystyle \sum_{i=1}^n i \right)^2

A second way of expressing the same identity simplifies the right side, writing it as a binomial coefficient:

\displaystyle \sum_{i=1}^n i^3 = \binom{n+1}{2}^2
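Before proving it, the identity is easy to check by brute force for small n:

```python
from math import comb

# brute-force check of  1^3 + 2^3 + ... + n^3 == C(n+1, 2)^2
for n in range(1, 200):
    assert sum(i ** 3 for i in range(1, n + 1)) == comb(n + 1, 2) ** 2

# the n = 4 example from above
assert 1 ** 3 + 2 ** 3 + 3 ** 3 + 4 ** 3 == comb(5, 2) ** 2 == 100
print("identity verified for n = 1..199")
```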

A proof by bijection

This identity can be proven combinatorially by noticing that \binom{n+1}{2} is the number of ways to choose 2 elements with repetition from the set {1..n} — the following is a proof by Benjamin and Orrison.

In this proof, we prove the binomial form of the identity

\displaystyle \sum_{i=1}^n i^3 = \binom{n+1}{2}^2

by constructing one set of size \displaystyle \sum_{i=1}^n i^3 and constructing another set of size \binom{n+1}{2}^2, then constructing a bijection between the two sets.

Let S be the set of ordered quadruples (a,b,c,d) such that 1 \leq a,b,c \leq d \leq n — that is, the fourth integer of the quadruple is greater than or equal to each of the other three. If we fix d, the number of possibilities for the variables a,b,c is d^3. As d ranges from 1 to n, the total number of elements in S is equal to \displaystyle \sum_{i=1}^n i^3.

Also, since \binom{n+1}{2} is the number of ways to choose 2 elements with repetition from the set {1..n}, there are \binom{n+1}{2} pairs of integers (h,i) satisfying 1 \leq h \leq i \leq n. So let T be the set of pairs of such pairs ((h,i),(j,k)) where 1 \leq h \leq i \leq n and 1 \leq j \leq k \leq n. The number of elements in T is then \binom{n+1}{2}^2.

We further partition the sets as follows: let S_1 be the subset of S where a \leq b and let S_2 be the subset of S where a > b. Similarly, let T_1 be the subset of T where i \leq k and let T_2 be the subset of T where i > k. We can construct a bijection between S and T by constructing two bijections:

S_1 \leftrightarrow T_1

S_2 \leftrightarrow T_2

The following is trivially a bijection from S_1 to T_1 — the case where a \leq b:

(a,b,c,d) \rightarrow ((a,b),(c,d))

The equivalent bijection from S_2 to T_2 — the case where a > b:

(a,b,c,d) \rightarrow ((c,d),(b,a-1))

We can confirm that the two maps are indeed bijections: both always land in the claimed target set (for the second map, a > b guarantees b \leq a-1, and a \leq d guarantees d > a-1), and both are easily inverted. This proves the identity.
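The whole argument can also be verified computationally for small n: build S and T, apply the map, and confirm the result is a bijection. A sketch:

```python
from itertools import product

def check_bijection(n):
    # S: quadruples (a, b, c, d) with 1 <= a, b, c <= d <= n
    S = [(a, b, c, d)
         for d in range(1, n + 1)
         for a, b, c in product(range(1, d + 1), repeat=3)]
    # T: pairs of pairs ((h, i), (j, k)) with h <= i and j <= k
    pairs = [(h, i) for i in range(1, n + 1) for h in range(1, i + 1)]
    T = set(product(pairs, repeat=2))

    def f(quad):
        a, b, c, d = quad
        # a <= b goes to T_1; a > b goes to T_2
        return ((a, b), (c, d)) if a <= b else ((c, d), (b, a - 1))

    image = {f(q) for q in S}
    assert len(image) == len(S)  # injective: no collisions
    assert image == T            # surjective: covers all of T

for n in range(1, 6):
    check_bijection(n)
print("bijection verified for n = 1..5")
```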

The hockey stick theorem: an animated proof

An interesting theorem related to Pascal’s triangle is the hockey stick theorem, also called the Christmas stocking theorem. This theorem states that the sum of a diagonal in Pascal’s triangle is the entry perpendicularly under its last term (image from Wolfram MathWorld):

So here the sum of the blue numbers, 1+4+10+20+35+56, is equal to the red number, 126. It is easy to see why it’s called the hockey stick theorem: the shape looks like a hockey stick!

An alternative, algebraic formulation of the hockey stick theorem is as follows:

\displaystyle\sum_{i=0}^{k} \binom{n+i}{i} = \binom{n+k+1}{k}

But this works in two ways, considering the symmetry of Pascal’s triangle. The flipped version would be:

\displaystyle\sum_{i=0}^{k} \binom{n+i}{n} = \binom{n+k+1}{n+1}
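Both versions can be spot-checked by brute force; here is the diagonal-sum form \sum_{i=0}^{k} \binom{n+i}{i} = \binom{n+k+1}{k}:

```python
from math import comb

# check the identity over a grid of n, k
for n in range(20):
    for k in range(20):
        assert sum(comb(n + i, i) for i in range(k + 1)) == comb(n + k + 1, k)

# the example from the picture: 1 + 4 + 10 + 20 + 35 + 56 = 126
assert sum(comb(3 + i, i) for i in range(6)) == 126
print("hockey stick identity verified")
```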

Using Pascal’s identity, it is fairly trivial to prove either identity by induction. Instead I present an intuitive, animated version of the proof:

Digit sum divisibility rule proofs in bases other than 10

The divisibility test for 3 and 9 is fairly well known: to find out whether a given integer is divisible by 9, add up the digits of the number; if the result is divisible by 9, then so is the number.

What’s a bit less well known is that in any base b, the same trick applies for b-1 and all of its divisors.

For example, if we are doing arithmetic in hexadecimal (base 16), the divisibility rule we use for base 10 applies not for 3 and 9, but instead for 3, 5, and 15.

Suppose we were to confirm that \textrm{831f3f}_{16} is divisible by \textrm{f}_{16} . Adding up the digits:

\begin{array}{l} 8_{16} + 3_{16} + 1_{16} + \textrm{f}_{16} + 3_{16} + \textrm{f}_{16} \\ = \textrm{2d}_{16} \end{array}

And since \textrm{2d}_{16} is divisible by \textrm{f}_{16} , so is the original number.


Let n be an integer, b the base, and m the number of digits of n in base b. Then we can write n as:

n = d_{m-1} b^{m-1} + d_{m-2} b^{m-2} + \cdots + d_1 b + d_0

We can rearrange this:

\begin{array}{rl} n =& [d_{m-1} (b^{m-1}-1) + \cdots + d_1 (b-1)] \\ &+ (d_{m-1} + d_{m-2} + \cdots + d_1 + d_0) \end{array}

Each of the expressions b-1 , b^2-1 , up to b^{m-1}-1 is divisible by b-1 . I’ve explored the proof of this in an earlier blog post.

Thus the entire first group of the above expression is divisible by b-1. What remains is the second group, which happens to be the sum of digits in n.

The sum of digits in n is divisible by b-1 if and only if n is divisible by b-1, which is what we wanted to prove.

This also works for any factor of b-1, since the entire first group is divisible by every factor of b-1 as well.
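The rule (including the hex example above) can be checked with a short script; digit_sum here is an illustrative helper, not something from the original post:

```python
def digit_sum(n, b):
    """Sum of the base-b digits of n."""
    total = 0
    while n:
        total += n % b
        n //= b
    return total

# the worked example: 0x831f3f has hex digit sum 0x2d = 45
assert digit_sum(0x831F3F, 16) == 0x2D
assert 0x831F3F % 0xF == 0 and 0x2D % 0xF == 0

# general rule: for any divisor d of b-1, n is divisible by d
# exactly when its base-b digit sum is
for b in (2, 8, 10, 16):
    for d in range(2, b):
        if (b - 1) % d == 0:
            for n in range(1, 2000):
                assert (n % d == 0) == (digit_sum(n, b) % d == 0)
print("digit-sum divisibility rule verified")
```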

The AM-GM inequality

One of the most basic inequalities, the AM-GM inequality states that the AM (arithmetic mean) of a list of nonnegative real numbers is always greater than or equal to their GM (geometric mean).


\frac{x_1 + x_2 + \cdots + x_n}{n} \geq \sqrt[n]{x_1 x_2 \cdots x_n}

A simplified case

If there are two elements in the series, the AM-GM states this:

\frac{x+y}{2} \geq \sqrt{xy}

The proof for this simplified case is quite simple:

\begin{array}{l} \frac{x+y}{2} - \sqrt{xy} \\ = \frac{x - 2\sqrt{xy} + y}{2} \\ = \frac{(\sqrt{x} - \sqrt{y})^2}{2} \geq 0 \end{array}

Now, however, it is not so obvious how to extend this proof to three or more elements. But there is a way, by induction, to prove this inequality for every case where the number of elements is a power of two.

The 2^k subcase

The base case is that the AM-GM inequality holds for n=2, that is, for two elements — which we have just proved.

The inductive step is to show that if the inequality holds for n = 2^k, then it also holds for n = 2^{k+1}. Together, these prove the statement for every power of two.

To illustrate the step, we prove the inequality for n = 2^2, or when there are 4 elements in the set: x_1, x_2, x_3, x_4.

The arithmetic mean, or AM of the set is

\frac{x_1 + x_2 + x_3 + x_4}{4} .

We can split this set into two halves. One contains x_1, x_2, and the other contains x_3, x_4.

The point is that the arithmetic mean of the arithmetic means of the two halves is equal to the arithmetic mean of the whole. Or,

\textrm{AM} = \frac{\frac{x_1+x_2}{2}+\frac{x_3+x_4}{2}}{2}

Notice that both halves have length 2, or 2^{k-1} in the general case. Having already proved the two-element case, we can say,

\textrm{AM} \geq \frac{\sqrt{x_1 x_2} + \sqrt{x_3 x_4}}{2}

This itself is another arithmetic mean. Applying the base case again:

\begin{array}{rcl} \textrm{AM} &\geq& \sqrt{\sqrt{x_1 x_2} \sqrt{x_3 x_4}} \\ &=& \sqrt[4]{x_1 x_2 x_3 x_4} \end{array}

So we have

\frac{x_1 + x_2 + x_3 + x_4}{4} \geq \sqrt[4]{x_1 x_2 x_3 x_4}

which was what we wanted to prove.

The process for proving this for higher powers of 2, such as 8, 16, and so on, is very similar to what we have just done.

The n-1 subcase

If the number of elements in the set is not a power of two, the above proof does not apply. Instead, we use induction again, but this time going downward.

We try to prove that if the inequality is true for a set with n elements, then it is also true for a set with n-1 elements.

Let’s first prove a lemma.


If the arithmetic mean of a set, x_1, x_2, \cdots, x_n is a, then

\frac{x_1 + \cdots + x_n + a}{n+1} = \frac{x_1 + \cdots + x_n}{n} .


\begin{array}{l} \frac{x_1 + \cdots + x_n + a}{n+1} \\ =\frac{x_1 + \cdots + x_n + \frac{x_1 + \cdots + x_n}{n}}{n+1} \\ = \frac{\frac{(n+1)x_1 + \cdots + (n+1)x_n}{n}}{n+1} \\ = \frac{x_1 + \cdots + x_n}{n} \end{array}

This lemma simply states that if you have the mean of a set, adding the mean to the set itself and finding the mean again would result in the same mean.

Now let’s prove the inequality when n=3.

Let a be the arithmetic mean of the set. Then

\frac{x_1 + x_2 + x_3 + a}{4} \geq \sqrt[4]{x_1 x_2 x_3 a}

as we proved before. According to the lemma, we can substitute:

\begin{array}{rcl} \frac{x_1 + x_2 + x_3}{3} &\geq& \sqrt[4]{x_1 x_2 x_3 a} \\ a &\geq& \sqrt[4]{x_1 x_2 x_3 a} \\ a^4 &\geq& x_1 x_2 x_3 a \\ a^3 &\geq& x_1 x_2 x_3 \\ a &\geq& \sqrt[3]{x_1 x_2 x_3} \end{array}

This completes our proof. (If a = 0, every element is zero and the inequality holds trivially; otherwise a > 0 and we may divide by a in the second-to-last step.)

Of course I’ve proved it only for n=3, but the same method derives the case n from the case n+1 for any n.

To prove this for an arbitrary integer n, first prove by induction that it is true for some power of two 2^k \geq n. Then prove by downward induction that it is true for 2^k-1, 2^k-2, \cdots, n.

This method of proof is called forward-backward induction; this proof is due to Augustin-Louis Cauchy.
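Although the proof is what matters, the inequality itself is easy to spot-check on random nonnegative inputs:

```python
import math
import random

random.seed(0)  # reproducible spot check

for _ in range(1000):
    n = random.randint(1, 10)
    xs = [random.uniform(0.0, 100.0) for _ in range(n)]
    am = sum(xs) / n                  # arithmetic mean
    gm = math.prod(xs) ** (1.0 / n)   # geometric mean
    assert am >= gm or math.isclose(am, gm)
print("AM >= GM held on all random samples")
```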

The two-circles method of proving the Pythagorean Theorem

The Pythagorean theorem, stating that the square of the hypotenuse is equal to the sum of the squares of the sides in a right triangle, is a fundamental concept in geometry and trigonometry.

There are many, many ways to prove the Pythagorean theorem. This page contains 84 different proofs, and The Pythagorean Proposition by Loomis contains another 367 proofs.

Here is a proof I find particularly interesting, using two circles. It is number 89 in Loomis’s book.

Starting with a right triangle, we draw two circles:


Here, \angle ACB is right, and A and B are the centers of circles \Omega_1 and \Omega_2 respectively.

Now we draw some additional lines, extending AB to F and G:


In this diagram, we can prove that \triangle ADC \sim \triangle ACG:

  1. \angle DCG is right. This is an application of Thales’ theorem since DG is a diameter.
  2. \angle ACB is right. This is given.
  3. \angle ACD = \angle BCG. Since \angle DCG = \angle ACB, \angle DCG - \angle DCB = \angle ACB - \angle DCB.
  4. \angle BCG = \angle BGC. This is because BC and BG are both radii of \Omega_2 and \triangle BCG is isosceles.
  5. \angle ACD = \angle BGC = \angle AGC.
  6. \therefore \triangle ADC \sim \triangle ACG. The two triangles have two shared angles: \angle CAG and \angle ACD = \angle AGC.

In a similar way, we can prove that \triangle BEC \sim \triangle BCF.

The rest of the proof is algebraic rather than geometric. Let AC = b, BC = a, and AB = c.

From the similar triangles, we have the following ratios:

\frac{AG}{AC} = \frac{AC}{AD} (or, b^2 = AG \cdot AD)

\frac{BF}{BC} = \frac{BC}{EB} (or, a^2 = BF \cdot EB)

Adding the two equations, we get:

a^2 + b^2 = BF \cdot EB + AG \cdot AD

The segment BF splits into BA and AF; since AF is a radius of \Omega_1, AF = AC = b, and so BF = c + b.

The segment EB is the difference between AB and AE; since AE = AC = b as well, EB = c - b.

Similarly, AG = AB + BG = c + a and AD = AB - DB = c - a, since BG and DB are radii of \Omega_2. By substitution:

\begin{array}{rcl} a^2 + b^2 &=& (c+b) (c-b) + (c+a) (c-a) \\ &=& c^2 - b^2 + c^2 - a^2 \\ &=& 2c^2 - a^2 - b^2 \\ 2a^2 + 2b^2 &=& 2c^2 \\ a^2 + b^2 &=& c^2 \end{array}
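The two length relations used above, b^2 = AG \cdot AD and a^2 = BF \cdot EB, can be sanity-checked on a concrete 3-4-5 triangle, computing the hypotenuse from coordinates rather than assuming the theorem:

```python
import math

# right angle at C: put C at the origin with the legs on the axes
C, A, B = (0.0, 0.0), (0.0, 4.0), (3.0, 0.0)
b = math.dist(A, C)   # radius of circle O1 (center A)
a = math.dist(B, C)   # radius of circle O2 (center B)
c = math.dist(A, B)   # hypotenuse, computed from coordinates

# D, G are where line AB meets O2, so AD = c - a and AG = c + a;
# E, F are where line AB meets O1, so EB = c - b and BF = c + b
assert math.isclose(b * b, (c + a) * (c - a))   # b^2 = AG * AD
assert math.isclose(a * a, (c + b) * (c - b))   # a^2 = BF * EB
print("two-circle relations hold for the 3-4-5 triangle")
```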


Proofs involving the Remainder theorem

The Remainder theorem, sometimes called the little Bézout theorem, states the following:

If P(x) is a polynomial function, then the remainder of P(x) when divided by x-k is equal to P(k).


Say we wanted to find the remainder of P(x) = x^3 - 2x^2 - 6x - 7 divided by x-4. We evaluate P(4) which gives 1.

The other way to find the remainder would be by polynomial long division:


This long division is largely wasted effort if we only want the remainder, as it takes considerably more work than evaluating the polynomial.

Notice, however, that if the goal is to obtain the remainder, synthetic division would be the better way to divide: calculating the remainder of an n^{th}-degree polynomial takes n multiplications and n additions whether we use the remainder theorem (evaluating via Horner’s rule) or synthetic division.

In addition, division gives both the quotient and the remainder, while evaluating the polynomial gives only the remainder.
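Synthetic division is short enough to sketch in code. The helper below is illustrative rather than from the original post, and reproduces the worked example:

```python
def synthetic_division(coeffs, k):
    """Divide the polynomial with the given coefficients
    (highest degree first) by x - k.
    Returns (quotient coefficients, remainder)."""
    acc = coeffs[0]
    quotient = [acc]
    for c in coeffs[1:]:
        acc = acc * k + c   # the Horner step
        quotient.append(acc)
    return quotient[:-1], quotient[-1]

# P(x) = x^3 - 2x^2 - 6x - 7 divided by x - 4
quotient, remainder = synthetic_division([1, -2, -6, -7], 4)
assert quotient == [1, 2, 2]   # quotient is x^2 + 2x + 2
assert remainder == 1          # equals P(4), per the remainder theorem
```

Note that the remainder is exactly the running Horner evaluation of P(4), which is why the two methods cost the same arithmetic.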

Proof of the Remainder Theorem

Consider that a polynomial P(x), after division, can be written as D(x) \cdot Q(x) + R, where D is the divisor, Q is the quotient, and R is the remainder.

In this case, the polynomial is divided by x-k, where k is a constant, so D(x) = x-k. Since the divisor is linear, the remainder R is a constant.

Substituting the divisor into the polynomial, we have:

\begin{array}{l} P(x) = D(x) \cdot Q(x) + R \\ P(x) = (x-k) \cdot Q(x) + R \end{array}

From here, you can easily see that P(k) = R:

\begin{array}{l}P(k) = (k-k) \cdot Q(k) + R \\ P(k) = R \end{array}

Q.E.D.

Why x^n - y^n is divisible by x-y


We start with the polynomial P(x) = x^n - y^n, treating y as a constant: it takes the position of k in the remainder theorem.

Evaluating P(y), we get y^n - y^n = 0. Therefore the remainder of P(x) when divided by x-y is zero, meaning x-y is a factor of P(x).
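For integer values of x and y, this means x - y always divides x^n - y^n exactly; a brute-force spot check:

```python
# spot-check: x - y divides x^n - y^n for distinct positive integers x, y
for x in range(1, 15):
    for y in range(1, 15):
        if x == y:
            continue
        for n in range(1, 8):
            assert (x ** n - y ** n) % (x - y) == 0
print("x^n - y^n divisibility verified")
```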

Why x^n + y^n is sometimes but not always divisible by x+y


Similarly, for the polynomial P(x) = x^n + y^n, we have x-k as a factor exactly when P(k)=0.

So x+y divides P(x) only when P(-y) = 0. In other words:

\begin{array}{l} (-y)^n + y^n = 0 \\ (-y)^n = -y^n \end{array}

If n is even:

(-y)^n = y^n

However if n is odd:

(-y)^n = -y^n

Therefore, it’s only when n is odd that x^n + y^n is divisible by x+y.
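Again a spot check on integers: for odd n the division is always exact, while for even n it fails in general.

```python
# odd n: x + y always divides x^n + y^n
for x in range(1, 15):
    for y in range(1, 15):
        for n in (1, 3, 5, 7):
            assert (x ** n + y ** n) % (x + y) == 0

# even n: fails in general, e.g. 1^2 + 2^2 = 5 is not divisible by 3
assert (1 ** 2 + 2 ** 2) % (1 + 2) != 0
print("x^n + y^n divisibility pattern verified")
```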