Notes on the partial fraction decomposition: why it always works

If you’ve taken any intro to Calculus class, you’re probably familiar with partial fraction decomposition.

In case you’re not, the idea is that you’re given some rational function with an awful denominator that you want to integrate, like:


And you break it up into smaller, simpler fractions:

\frac{1}{x-2} +\frac{3}{x+4}

This is the idea. If we get into the details, it gets fairly ugly — in a typical calculus textbook, you’ll find a plethora of rules regarding what to do in all sorts of cases: what to do when there are repeated linear factors, quadratic factors, repeated quadratic factors, and so on.

Since the textbooks generously cover this for us, we’ll assume that we know what to do with a rational polynomial with some polynomial as the numerator, and some number of linear or quadratic factors in the denominator. We can do partial fraction decomposition on this. If we like, we could integrate it too. I’m talking about anything of this form:

\frac{P(x)}{((ax+b)(cx+d) \cdots)((ex^2+fx+g)(hx^2+ix+j) \cdots)}

Although we won’t prove this, this seems fairly believable. We’ll assume that once we get a fraction into this form, we’re done and we can let existing partial fraction methods take care of the rest.

Can Partial Fractions Fail?

What if we have a polynomial greater than a quadratic in the denominator? So let’s say:


Fortunately, here the denominator can be factored, giving us a form we can deal with:


But we were lucky that time. After all, not all polynomials can be factored, right? What if we have this:


We can’t factor this. What can we do?

It turns out that this isn’t a huge problem. We never required the coefficients of the factors to be integers! Although the factorization is awkward, it can still be factored:

\frac{1}{(x + 5^{1/3})(x^2-5^{1/3}x+5^{2/3})}

Other than making the next step somewhat algebraically tedious, this decomposition is perfectly valid. The coefficients need not be integers, or even be expressed with radicals. As long as every coefficient is real, partial fraction decomposition will work fine.

Universality of Partial Fractions

The logical next question would be, can all radical functions be written in the previous partial fraction decomposition-suitable form? Looking through my calculus textbooks, none seemed to provide a proof of this — and failing to find a proof on the internet, I’ll give the proof here.

We need to prove that any polynomial that might appear in the denominator of a rational function, say Q(x), can be broken down into linear or quadratic factors with real coefficients.

In order to prove this, we’ll need the following two theorems:

  • Fundamental Theorem of Algebra — any polynomial of degree n can be written as a product of n linear complex factors: Q(x) = (x-z_1) (x-z_2) \cdots (x-z_n)
  • Complex Conjugate Root Theorem — if some complex number a + bi is a root of some polynomial with real coefficients, then its conjugate a-bi is also a root.

Starting with the denominator polynomial Q(x), we break it down using the Fundamental Theorem of Algebra into complex factors. Of these factors, some will be real, while others will be complex.

Consider the complex factors of Q(x). By the complex conjugate root theorem, for every complex factor we have, its conjugate is also a factor. Hence we can take all of the complex factors and pair them up with their conjugates. Why? If we multiply a complex root by its complex conjugate root: (x-z)(x-\bar{z}) — we always end up with a quadratic with real coefficients. (you can check this for yourself if you want)

Before, we were left with real linear factors and pairs of complex factors. The pairs of complex factors multiply to form quadratic polynomials with real coefficients, so we are done.

At least in theory — partial fraction decomposition always works. The problem is just that we relied on the Fundamental Theorem of Algebra to hand us the roots of our polynomial. Often, these roots aren’t simple integers or radicals — often they can’t really be expressed exactly at all. So we should say — partial fraction decomposition always works, if you’re fine with having infinitely long decimals in the decomposed product.

Calculus magic

You have probably seen some version of this curve at some time or another:

The blue lines are drawn by putting n evenly spaced points in the interval [0,1] for both the horizontal and vertical axis and connecting them.

As n approaches infinity, the question is, what is the area occupied by the blue lines? The area is clearly finite since the entire shape is bounded by a 1×1 square, but how much of that square is occupied?

Intuitively it might seem that the blue curve is a circle, however that intuition is wrong. In the above diagram, the red curve is the actual circle, and clearly the blue curve is outside the circle. But fortunately the area can be found with some calculus.

If we choose some point (a,0) on the x-axis, then obviously that point is connected to point (0,1-a) on the y-axis. The slope of this line is \frac{a-1}{a} and the equation of this line is y=\frac{a-1}{a}x+1-a. More conveniently, we can this expression as g(a,x), intuitively the line formed when we choose a as our x-axis point.

Let f(x) be the bounding curve for the blue lines. To find f(x), we want to choose the value a, given x, so that g(a,x) is maximized. To do this, we take the partial derivative \frac{\partial y}{\partial a}:

\frac{\partial y}{\partial a} = x(\frac{a-a+1}{a^2})-1=\frac{x}{a^2}-1

The optimal value of a is the one for which \frac{\partial y}{\partial a}=0, so solving for a we get a=\sqrt{x}. Then we have

f(x) = g(\sqrt{x},x) = x \frac{\sqrt{x}-1}{\sqrt{x}}+1-\sqrt{x}

f(x)= 1+x-2\sqrt{x}

Integrating f(x) from 0 to 1 gives us the answer:

\displaystyle \int_0^1 (1+x-2\sqrt{x}) dx = \frac{1}{6}

Notes on Mercator’s Projection

The following work is my math essay assignment (an essay on any topic related to the Math 30 curriculum). I found it interesting enough that I’ll post it here on my math blog. Yay.


How do you make a world map? We find world maps everywhere — but the making of world maps is more complicated than most people realize. The earth is round, and a map is flat. How should one represent the surface of the earth, a sphere, on a flat sheet of paper?

The problem of displaying the surface of a sphere on a map has concerned cartographers, or map makers, for centuries. Over the years, cartographers have come up with a multitude of map projections, or systematic methods of transferring the surface of the earth onto flat paper.

There are many types of map projections, but one of the most common types is the cylindrical projection. Imagine a semi-transparent, hollow globe with a light inside of it. Now wrap a piece of blank map paper around it, like a cylinder:

Let the earth be represented as a unit sphere. The longitude (east-west location) shall be denoted as \lambda, and the latitude (north-south location) as \theta. A is the center of the earth, and B a point on the equator.

Now a light source is placed at point A; a point D on the surface of the globe will be projected in a straight line to point C on the cylinder such that points A, D, and C are collinear. Using some trigonometry, we have \tan \theta = \frac{BC}{AB}; since AB=1, the length of BC is \tan \theta.

After repeating this procedure for every point on the sphere, the cylinder is unrolled into a flat sheet. The result is the following map:

Here, if a point on the earth is known (we know its latitude and longitude), we know its position on the map. The position coordinates of the map, (x,y) can be described in terms of \lambda and \theta in the following equations:

x = \lambda

y = \tan \theta

This projection has several major drawbacks: most notably, as the latitude \theta increases toward 90^\circ, the slope of \tan(\theta) becomes increasingly steep, and the y-coordinates of the point of the map increases towards infinity. At high latitudes, the map is very distorted.

The map is not equal-area — in an equal-area map, shapes of equal area on the sphere have equal area on the projection. Looking at the map we created, this is clearly not the case. Neither is the map conformal, or shape preserving. In a conformal map, the angle between any two lines must be the same as their counterparts on the map.

Mercator’s Projection

One of the most well known cylindrical map projections is Mercator’s projection. Invented by Belgian cartographer Gerardus Mercator in 1569, Mercator addresses some of the problems in older cylindrical projections. Unlike our projection, Mercator’s projection is conformal, and it has been used for navigation for centuries:

However, the equations for Mercator’s projection are rather complicated. The position on Mercator’s projection, when given the latitude and longitude, is:

x = \lambda

y = \ln ( \sec \theta + \tan \theta )

This is a strange and seemingly random formula. But when we use this formula, we get a conformal map. How does this work?

Deriving Mercator’s Projection

Before deriving the formula for Mercator’s Projection, we take a step back and ask ourselves — what was the projection for in the first place?

Unlike most maps today which are used to decorate walls, Mercator’s map was more than that — it was used for nautical navigation. Suppose you were to travel by ship from Vancouver to Honolulu, equipped with nothing more than a compass. Which direction should you travel?

Now if you have a copy of Mercator’s map, this becomes simple. You draw a straight line from Vancouver to Honolulu, and with a protractor you find that the direction is 40^\circ south to the parallel. With a compass, it would be easy for you to maintain this constant heading for the entire duration of the trip.

(Footnote — The straight line on Mercator’s map is not straight on the Earth — it is actually a curved line, called a rhumb line. That is, after taking an initial bearing, one proceeds along the same bearing (relative to the north pole) for the duration of the trip. In the example, you would have to adjust your heading intermittently so that you are always heading at the direction 40^\circ south to the parallel.)

This technique is possible because it relies on one crucial property of Mercator’s map: that it is conformal. It would fail if we tried the same technique on the cylindrical projection given in the introduction.

However, notice that on Mercator’s map, all parallels have the same length: a parallel is simply a line across the width of the entire map. But on Earth, a sphere, parallels are longer when they are closer to the equator and shorter when near the poles:

Let the Earth be a unit sphere with center A. D is a point on the surface of the Earth and B is its corresponding point on the equator. Let C be a point such that CD is parallel to AB and AC is perpendicular to AB. As AB = 1, the circumference of the Earth at the equator is 2 \pi. But the circumference at latitude \theta is only 2\pi \cos \theta.

It is easy to see why: as AB || CD in the diagram, it follows that angles \angle DAB and \angle ADC are equal. Next, \cos \angle ADC = \frac{CD}{AD} = \frac{CD}{1} = CD, therefore CD = \cos \theta. The result follows.

On the map, the length of the equator and the parallel at latitude \theta are equal. But on Earth, the parallel at latitude \theta is smaller than the equator by a factor of \cos \theta; in the projection any line at latitude \theta ends up being stretched horizontally by a factor of \frac{1}{\cos \theta} or \sec \theta.

Finally, in order to satisfy the property of conformality, any part of the map stretched horizontally by some factor, say k, must be stretched vertically by the same factor k. Doing so preserves the angles of that part of the map:

Imagine a very small piece of land at latitude \theta. When it is represented on the map, it is stretched horizontally by \sec \theta. Hence to preserve conformity, it must also be stretched vertically by \sec \theta.

But this can only work if the piece of land has no area. But as all pieces of land, no matter how small, has some area, the latitude of the piece of land is not \theta, but goes from \theta to \theta + \Delta \theta for some small number \Delta \theta. Similarly, the space this piece of land occupies on the map is not just y, but goes from y + \Delta y for some small number \Delta y. Now as \Delta \theta approaches 0, the ratio of \Delta y to \Delta \theta approaches \sec \theta:

\lim_{\Delta \theta \to 0} \frac{\Delta y}{\Delta \theta} = \sec \theta

Or equivalently,

\frac{dy}{d \theta} = \sec \theta

Solving for dy and integrating both sides gives:

\begin{array}{rl}y & = \int \sec \theta \, d \theta \\ & = \int \left( \sec \theta \cdot \frac{\sec \theta + \tan \theta}{\sec \theta + \tan \theta} \right) d \theta \\ & = \int \frac{\sec^2 \theta + \sec \theta \tan \theta}{\sec \theta + \tan \theta} \, d \theta\end{array}

If we let u be the denominator, u = \sec \theta + \tan \theta, it turns out that the numerator is the derivative of u, or du = \sec^2 \theta + \sec \theta \tan \theta. Thus we can substitute u into the integral:

\begin{array}{rl} y & = \int \frac{du}{u} \\ & = \ln |u| + C \\ & = \ln | \sec \theta + \tan \theta | + C \end{array}

We defined the map so that the equator lies on the x-axis, therefore when \theta = 0, y=0. Substituting and solving for C, we find that C=0. Also, since \sec \theta + \tan \theta > 0 for -\frac{\pi}{2} < \theta < \frac{\pi}{2}, we can remove the absolute value brackets. Therefore,

y = \ln ( \sec \theta + \tan \theta )

This is what is desired.

An Alternative Map Projection

Mercator’s Projection is merely a conformal map, not a perfect map. It has several drawbacks — the scale varies greatly from place to place: as the latitude increases, the map gets stretched more and more. This severely distorts larger figures, especially areas near the poles. At a latitude greater than about 70^\circ, the map is practically unusable.

The Gall-Peters Projection is an alternative cylindrical projection:

When a map is used in navigation, it is important that angles are preserved in the map. But when a map is used to present statistical data, it’s often more important to preserve areas in the map: otherwise one may be deceived into thinking one country is bigger than another when they are actually the same size.

The Gall-Peters projection does this, trading conformity for equal-area — two countries of the same area will have the same area on the map, no matter where they are on the map. This map projection is produced by the following equations:

x = \lambda \cos 45^\circ = \frac{\lambda}{\sqrt{2}}

y = \frac{\sin \theta}{\cos 45^\circ} = \sqrt{2} \sin \theta

A discussion on why this set of equations produces an equal-area map is beyond the scope of this essay.


It is a tricky matter to accurately depict the surface of the earth on a flat sheet of paper. The method of cylindrical projection produces several considerably different maps.

The first map we considered uses only simple trigonometry, but the map produced is not very useful. The second map we investigate, Mercator’s Projection, preserves angles on the map. The last map we consider, the Gall-Peters projection, preserves areas on the map.

Each of these map projections are both similar and very different. They are similar in that they are all produced in some way by wrapping a cylinder around a spherical Earth. But the mathematics needed to produce them differ.

Throwing a rock off a cliff: Calculating the optimal angle

You want to throw a rock far, as far as you can possibly throw. Using all your energy, you throw the rock into the sky, and it lands some distance away a few seconds later. But at what angle should you throw it?

It is easy to see that the optimal angle to throw a rock when you are on flat ground is 45^\circ. That is, to make the rock travel as far as possible, one should throw the rock halfway between vertical and horizontal. Any other angle and you do worse than throwing it at a 45^\circ angle.

But what about throwing a rock off a cliff into, say, the ocean? Is 45^\circ still the optimal angle, or is it different? Perhaps it varies?

Suppose you are standing on the edge of a cliff overlooking the sea. It is h meters above the sea, and you can throw a rock with a velocity v, at an angle \theta between 0^\circ and 90^\circ. You are trying to maximize d, the distance thrown.

We assume that there is no friction or air resistance, and we let g be the gravitational constant, 9.81 m/s^2.

When throwing a rock at angle \theta, the velocity of the rock is a vector with a horizontal component of v \cos \theta and a vertical component of v \sin \theta.

Then the distance traveled can be expressed partially in terms of \theta:

d = vt \cos \theta .

The next equation we can get by substituting our values into one of the motion equations:

-h = (v \sin \theta)t - \frac{1}{2}gt^2

The unknown in the two equations is t, or the total time traveled. From the first equation we have

t = \frac{d}{v \cos \theta} .

Substituting into the second equation, we get

-h = (v \sin \theta) (\frac{d}{v \cos \theta}) - \frac{1}{2}g(\frac{d}{v \cos \theta})^2

-h = d \tan \theta - \frac{gd^2 \sec^2 \theta}{2v^2}

From here we want to find the maximum for d, or when \frac{dd}{d \theta}=0. To solve for \frac{dd}{d \theta}, we first differentiate both sides of the equation with respect to \theta:

0 = \frac{d}{d \theta} (d \tan \theta) - \frac{g}{2v^2} \frac{d}{d \theta} (d^2 \sec^2 \theta)

0 = d \sec^2 \theta + \tan \theta \frac{dd}{d \theta} - \frac{g}{2v^2}(\frac{dd^2}{d \theta} \sec^2 \theta + 2d^2 \tan \theta \sec^2 \theta)

As \frac{dd^2}{d \theta} = 2d \frac{dd}{d \theta}, we can isolate \frac{dd}{d \theta}:

\frac{dd}{d \theta} = \frac{\frac{gd^2 \tan \theta \sec^2 \theta}{v^2}-d \sec^2 \theta}{\tan \theta - \frac{gd \sec^2 \theta}{v^2}}

Since \frac{dd}{d \theta} = 0, we eliminate the denominator and are left with

0 = \frac{gd^2 \tan \theta \sec^2 \theta}{v^2} - d \sec^2 \theta

0 = d(\frac{gd \tan \theta \sec^2 \theta}{v^2} - \sec^2 \theta)

Ignoring the trivial case where d=0, we get

gd \tan \theta \sec^2 \theta = v^2 \sec^2 \theta

d = \frac{v^2}{g \tan \theta}

We have the optimal distance; we can substitute this back into a previous equation:

-h = (\frac{v^2}{g \tan \theta}) \tan \theta - \frac{g (\frac{v^2}{g \tan \theta})^2 \sec^2 \theta}{2v^2}

-h = \frac{v^2}{g} - \frac{v^2 \sec^2 \theta}{2g \tan^2 \theta}

-h = \frac{v^2}{g}(1 - \frac{1}{2\sin^2 \theta})

Finally we solve for \theta:

\frac{1}{2 \sin^2 \theta} = \frac{hg}{v^2} + 1

\sin^2 \theta = \frac{v^2}{2hg+2v^2}

\theta = \arcsin(\frac{v}{\sqrt{2hg+2v^2}})

This is the final formula for \theta in terms of v, h, and g.

Project Euler 285

Problem 285 of Project Euler was released yesterday; it requires some geometry and probability sense.

The problem goes like this:

Albert chooses two real numbers a and b randomly between 0 and 1, and a positive integer k.

He plays a game with the following formula:

k' = \sqrt{(ka+1)^2 + (kb+1)^2}

The result k’ is rounded to the nearest integer. If k’ rounded is equal to k, then Albert gets k points. Otherwise, he gets nothing.

Find the expected value of the total score if Albert plays this game with k = 1, 2, \cdots , 10^5.

For example (from the problem statement), let k=6, a=0.2, and b=0.85.

Here the expression \sqrt{(ka+1)^2 + (kb+1)^2} gives the number 6.484. When rounded to the nearest integer, it becomes 6, which is equal to k. Therefore, Albert gets 6 points.

Yup – another expected value problem. But the concept of this problem is simple:

Hosted by

For any value of k, there are certain values of a and b where Albert will get points. Otherwise, he does not get points. Here the gray blob is the area where he does get points, and the white area is where he does not.

The red square is the sample space of the random variables a and b, since 0<a<1 and 0<b<1.

The probability of Albert getting the points is the fraction of the square that’s gray.

The area of the red square is always 1. The area of the gray area varies with k. Thus the idea is to calculate the area of the gray area for each value of k from 1 to 10^5.

Now we can get to the details of the problem.

Notice that if round(k') = k, then k-0.5 < k' < k+0.5 (the difference between less-equal and less is irrelevant in this problem):

Hosted by

In a similar way, the equation of the larger circle is (a+\frac{1}{k})^2 + (b+\frac{1}{k})^2 < (\frac{k+0.5}{k})^2.

These are equations of circles. A circle has the equation (x-a)^2 + (y-b)^2 = r^2 where (a,b) is the center and r is the radius.

Thus both the smaller and the larger circles have a center at (-\frac{1}{k},-\frac{1}{k}). The radii of the circles are different: the radius of the smaller circle is \frac{k-0.5}{k} and the radius of the larger circle \frac{k+0.5}{k}.

Plotting the inequalities:

Hosted by

We can see now that this is a calculus problem! Indeed, the shaded area is the area of the larger circle inside the boxarea of the smaller circle inside the box.

Initially I tried integrating the circles directly, but I ran into problems while integrating and I was unable to correctly integrate the equation. Instead, I integrated an equivalent problem:

Hosted by

It’s obvious that the areas do not change with this transformation.

So now for the smaller circle the equation is a^2 + (b+\frac{1}{k})^2 > (\frac{k-0.5}{k})^2. Solving for b, we get:

b > \sqrt{(\frac{k-0.5}{k})^2-a^2} - \frac{1}{k}

.. And this equation for the larger circle:

b < \sqrt{(\frac{k+0.5}{k})^2-a^2} - \frac{1}{k}

The variable of integration here is a, and k is only a constant. Therefore we can replace the radius with r and get this equation to integrate:

b = \sqrt{r^2-a^2} - \frac{1}{k}

With the help of an integration table, this can be integrated fairly easily.

Hosted by

There’s one additional problem – computing the integrating ‘limits’ of the two circles. Let’s call them L_S and L_L for small and large respectively.

Both of them can be computed using the pythagorean theorem.

Hosted by

The formula for L_S (it should be obvious by now what L_L should be):

L_S = \sqrt{(\frac{k-0.5}{k})^2-\frac{1}{k^2}}

Now we can get this really big formula for computing the area for any integer k:

\begin{array}{l} \int_{\frac{1}{k}}^{\sqrt{(\frac{k+0.5}{k})^2-\frac{1}{k^2}}} (\sqrt{(\frac{k+0.5}{k})^2-a^2}-\frac{1}{k}) da \\ - \int_{\frac{1}{k}}^{\sqrt{(\frac{k-0.5}{k})^2-\frac{1}{k^2}}} (\sqrt{(\frac{k-0.5}{k})^2-a^2}-\frac{1}{k}) da \end{array}

Also, fortunately for me, it is impossible for the upper bound to cross the circle at the top or right, I mean, we never have something like this –

Hosted by

Anyways, here’s a Haskell implementation of the algorithm:

integral a r k = 0.5 * (a * sqrt (r^2 - a^2) + r^2 * atan (a/sqrt (r^2-a^2))) - a/k

area k = let area_1 1 = 0
             area_1 k = integral (sqrt (((k-0.5)/k)^2 - (1/k)^2)) ((k-0.5)/k) k - integral (1/k) ((k-0.5)/k) k
             area_2 k = integral (sqrt (((k+0.5)/k)^2 - (1/k)^2)) ((k+0.5)/k) k - integral (1/k) ((k+0.5)/k) k
         in  area_2 k - area_1 k

main = print . sum . zipWith (*) [1..] $ map area [1..10^5]

The code runs in about 3 seconds.

There seems to be an edge case where k=1 (probably an asymptote) so I have to specially make it be zero.

This problem was released 10 pm last night and I was tired, lol, but I still managed to solve it and get into the top 20.

Hosted by