Visualizing Quaternions with Unity

How do you model the position and orientation of an airplane?

Position is easy, just represent it with a point in 3D space. But how do you specify its orientation — which direction it’s pointing?

At first glance, it seems a vector will do. After all, a vector points in some direction, right? If the plane is pointing east, represent its orientation by a unit vector pointing east.

Unfortunately, we quickly run into trouble when we try to roll. If we’re facing east, and we roll 90 degrees, we’re still facing east. Clearly we’re missing something.

Euler Angles

When real pilots talk about their orientation, they talk about roll, yaw, pitch. Pitch is going up or down, yaw is going left or right, roll is, well, roll.

Any change in orientation can be described by some combination of roll, yaw, pitch. This is the basis for Euler Angles. We use three angles to represent the airplane’s orientation.

This is all fine and dandy if we want to represent the orientation of a static object in space. But when we try to adjust our orientation, we start to run into problems.

You’re thinking, this should be simple! When we turn left or right, we just increment the yaw variable, right? Yes, it seems to work, at least initially. You can turn left and right, up and down, and roll around.

Implement it in Unity and play around a bit, however, and you begin to notice that things don’t quite behave the way you expect.

In this animation, I’m holding down the right button:

The plane does rotate to the right, but it’s not rotating relative to itself. Instead it’s rotating around some invisible y-axis. If it was rotating relative to itself, the green arrow shouldn’t be moving.

The problem becomes more and more severe when the pitch of the plane becomes higher and higher. The worst case is when the airplane is pointing straight up: then roll and yaw become the same thing! This is called gimbal lock: we have lost a degree of freedom and we can only rotate in 2 dimensions! Definitely not something desirable if we’re controlling a plane or spaceship.

It turns out that no matter what we do, we will suffer from some form of gimbal lock. As long as we use Euler Angles, there is one direction where if we turn too far, everything starts to screw up.

Practical Introduction to Quaternions

All is not lost, however. There is a way to represent orientation that represents all axes equally and does not suffer from gimbal lock. This mythical structure is called the quaternion. Unlike Euler Angles which describe your orientation relative to a fixed set of axes, quaternions do not rely on any fixed axis.

The drawback is that quaternions are unintuitive to understand for humans. There is no way to “look” at a quaternion and be able to visualize what rotation it represents. Fortunately for us, it’s not that difficult to make use of quaternions, even if we can’t visualize quaternions.

There is a lot of theory behind how quaternions work, but in this article, I will gloss over the theory and give a quick primer to quaternions, just the most common facts you need to use them. At the same time, I will implement the operations I describe in C#, so I can integrate them with Unity. If you don’t know C#, you can freely ignore the code.


A quaternion is an ordered pair of 4 real numbers (w,x,y,z). We write this as


The letters i,j,k are not variables. Rather, they are independent axes. If you like, you can think of the quaternions as a 4 dimensional vector space.

The defining property of quaternions is:

i^2 = j^2 = k^2 = ijk = -1

Play around with it a bit and you can derive 6 more identites:

ij = k

jk = i

ki = j

ji = -k

kj = -i

ik = -j

If you’ve worked with complex numbers, this should seem familiar. Instead of 2 parts of a complex number (the real and imaginary parts), we have 4 parts for a quaternion.

The similarity doesn’t end here. Multiplying complex numbers represents a rotation in 2 dimensions. Similarly, multiplying by a quaternion represents a rotation in 3D.

One curious thing to note: we have ij=k and ji=-k. We switched around the terms and the product changed. This means that multiplying quaternions is kind of like multiplying matrices — the order matters. So multiplication is not commutative.

Here’s a framework for a quaternion in C#:

public class Quat{
	// Represents w + xi + yj + zk
	public float w, x, y, z;
	public Quat(float w, float x, float y, float z){
		this.w = w;
		this.x = x;
		this.y = y;
		this.z = z;

Normalizing Quaternions

The norm of a quaternion is

N(\mathbf{q}) = \sqrt{w^2+x^2+y^2+z^2}

When we use quaternions to represent rotations, we typically want unit quaternions: quaternions with norm 1. This is straightforward: to normalize a quaternion, divide each component by the norm.

In C#:

public float Norm(){
  return Mathf.Sqrt (w * w + x * x + y * y + z * z);

public Quat Normalize(){
  float m = Norm ();
  return new Quat (w / m, x / m, y / m, z / m);

Multiplying Quaternions

Multiplying is simple, just a little tedious. If we have two quaternions:

(w_1 + x_1i + y_1j + z_1k) (w_2+x_2i+y_2j+z_2k)

Then their product is this ugly mess:

\begin{array}{l} w_1w_2-x_1x_2-y_1y_2-z_1z_2 \\ + (w_1x_2+x_1w_2+y_1z_2-z_1y_2)i \\ + (w_1y_2+y_1w_2-x_1z_2+z_1x_2) j \\ + (w_1z_2+z_1w_2+x_1y_2-y_1x_2) k \end{array}

In C#:

// Returns a*b
public static Quat Multiply(Quat a, Quat b){
  float w = a.w * b.w - a.x * b.x - a.y * b.y - a.z * b.z;
  float x = a.w * b.x + a.x * b.w + a.y * b.z - a.z * b.y;
  float y = a.w * b.y + a.y * b.w - a.x * b.z + a.z * b.x;
  float z = a.w * b.z + a.z * b.w + a.x * b.y - a.y * b.x;
  return new Quat (w,x,y,z).Normalize();

Since multiplication is not commutative, I made this function static to avoid confusing left and right multiplication. Also, I normalize the product so that floating point errors don’t accumulate.

Constructing Rotation Quaternions

Every rotation operation can be written as a rotation of some angle, \theta, around some vector (u_x, u_y, u_z):

The following formula gives a quaternion that represents this rotation:

\mathbf{q} = \cos \frac{\theta}{2} + (u_x i + u_y j + u_z k) \sin \frac{\theta}{2}

For our purposes, \theta is a very small number, say 0.01, and we use one of the three basis vectors to rotate around. For example, if we are rotating around (1,0,0) then our quaternion is

\cos \frac{0.01}{2} + \sin \frac{0.01}{2}i

That’s it: given any quaternion, multiplying on the left by our quaternion rotates it slightly around the x axis.

In C#, our code might look like this:

Quat qx = new Quat (Mathf.Cos (0.01 / 2), 0, 0, Mathf.Sin (0.01 / 2));
Quat qy = new Quat (Mathf.Cos (0.01 / 2), 0, Mathf.Sin (0.01 / 2), 0);
Quat qz = new Quat (Mathf.Cos (0.01 / 2), Mathf.Sin (0.01 / 2), 0, 0);

Putting it together

That’s all we need to do interesting things with quaternions. Let’s combine everything we have. Here’s our quaternion class thus far:

public class Quat{
	// Represents w + xi + yj + zk
	public float w, x, y, z;
	public Quat(float w, float x, float y, float z){
		this.w = w;
		this.x = x;
		this.y = y;
		this.z = z;

	public float Norm(){
		return Mathf.Sqrt (w * w + x * x + y * y + z * z);

	public Quat Normalize(){
		float m = Norm ();
		return new Quat (w / m, x / m, y / m, z / m);

	// Returns a*b
	public static Quat Multiply(Quat a, Quat b){
		float w = a.w * b.w - a.x * b.x - a.y * b.y - a.z * b.z;
		float x = a.w * b.x + a.x * b.w + a.y * b.z - a.z * b.y;
		float y = a.w * b.y + a.y * b.w - a.x * b.z + a.z * b.x;
		float z = a.w * b.z + a.z * b.w + a.x * b.y - a.y * b.x;
		return new Quat (w,x,y,z).Normalize();

	public Quaternion ToUnityQuaternion(){
		return new Quaternion (w, x, y, z);

Now we just need to read the input, perform our calculations, and output the rotation quaternion to Unity:

public class Airplane : MonoBehaviour {
  public GameObject airplane;
  public Quat quat = new Quat (0, 0, 0, -1);
  public float speed = 0.01f;

  void FixedUpdate(){
    float inputX = Input.GetAxis("UpDown");
    float inputY = Input.GetAxis("LeftRight");
    float inputZ = Input.GetAxis("Roll");

    Quat qx = new Quat (Mathf.Cos (speed * inputX / 2), 0, 0, Mathf.Sin (speed * inputX / 2));
    Quat qy = new Quat (Mathf.Cos (speed * inputY / 2), 0, Mathf.Sin (speed * inputY / 2), 0);
    Quat qz = new Quat (Mathf.Cos (speed * inputZ / 2), Mathf.Sin (speed * inputZ / 2), 0, 0);

    quat = Quat.Multiply (qx, quat);
    quat = Quat.Multiply (qy, quat);
    quat = Quat.Multiply (qz, quat);

    airplane.transform.rotation = quat.ToUnityQuaternion ();

In Unity, the input is not given to us as a single true/false value, but a float between -1 and 1. So holding right increases the LeftRight input gradually until it reaches 1, avoiding a sudden jump in movement.

What’s ToUnityQuaternion? Well, it turns out that Unity already has a Quaternion class that does everything here and much more, so all this could have literally been implemented in one line if we wanted.

Anyways, let’s see the result.

As you can see, holding right turns the plane relative to itself now, and the green arrow stays still. Hooray!

Notes on the partial fraction decomposition: why it always works

If you’ve taken any intro to Calculus class, you’re probably familiar with partial fraction decomposition.

In case you’re not, the idea is that you’re given some rational function with an awful denominator that you want to integrate, like:


And you break it up into smaller, simpler fractions:

\frac{1}{x-2} +\frac{3}{x+4}

This is the idea. If we get into the details, it gets fairly ugly — in a typical calculus textbook, you’ll find a plethora of rules regarding what to do in all sorts of cases: what to do when there are repeated linear factors, quadratic factors, repeated quadratic factors, and so on.

Since the textbooks generously cover this for us, we’ll assume that we know what to do with a rational polynomial with some polynomial as the numerator, and some number of linear or quadratic factors in the denominator. We can do partial fraction decomposition on this. If we like, we could integrate it too. I’m talking about anything of this form:

\frac{P(x)}{((ax+b)(cx+d) \cdots)((ex^2+fx+g)(hx^2+ix+j) \cdots)}

Although we won’t prove this, this seems fairly believable. We’ll assume that once we get a fraction into this form, we’re done and we can let existing partial fraction methods take care of the rest.

Can Partial Fractions Fail?

What if we have a polynomial greater than a quadratic in the denominator? So let’s say:


Fortunately, here the denominator can be factored, giving us a form we can deal with:


But we were lucky that time. After all, not all polynomials can be factored, right? What if we have this:


We can’t factor this. What can we do?

It turns out that this isn’t a huge problem. We never required the coefficients of the factors to be integers! Although the factorization is awkward, it can still be factored:

\frac{1}{(x + 5^{1/3})(x^2-5^{1/3}x+5^{2/3})}

Other than making the next step somewhat algebraically tedious, this decomposition is perfectly valid. The coefficients need not be integers, or even be expressed with radicals. As long as every coefficient is real, partial fraction decomposition will work fine.

Universality of Partial Fractions

The logical next question would be, can all radical functions be written in the previous partial fraction decomposition-suitable form? Looking through my calculus textbooks, none seemed to provide a proof of this — and failing to find a proof on the internet, I’ll give the proof here.

We need to prove that any polynomial that might appear in the denominator of a rational function, say Q(x), can be broken down into linear or quadratic factors with real coefficients.

In order to prove this, we’ll need the following two theorems:

  • Fundamental Theorem of Algebra — any polynomial of degree n can be written as a product of n linear complex factors: Q(x) = (x-z_1) (x-z_2) \cdots (x-z_n)
  • Complex Conjugate Root Theorem — if some complex number a + bi is a root of some polynomial with real coefficients, then its conjugate a-bi is also a root.

Starting with the denominator polynomial Q(x), we break it down using the Fundamental Theorem of Algebra into complex factors. Of these factors, some will be real, while others will be complex.

Consider the complex factors of Q(x). By the complex conjugate root theorem, for every complex factor we have, its conjugate is also a factor. Hence we can take all of the complex factors and pair them up with their conjugates. Why? If we multiply a complex root by its complex conjugate root: (x-z)(x-\bar{z}) — we always end up with a quadratic with real coefficients. (you can check this for yourself if you want)

Before, we were left with real linear factors and pairs of complex factors. The pairs of complex factors multiply to form quadratic polynomials with real coefficients, so we are done.

At least in theory — partial fraction decomposition always works. The problem is just that we relied on the Fundamental Theorem of Algebra to hand us the roots of our polynomial. Often, these roots aren’t simple integers or radicals — often they can’t really be expressed exactly at all. So we should say — partial fraction decomposition always works, if you’re fine with having infinitely long decimals in the decomposed product.

Simplifying a sum of consecutive cubes

If you were asked to find the sum of the first four cubes, you would easily be able to do so:

1^3 + 2^3 + 3^3 + 4^3 = 100

But here you may notice something interesting:

1^3 + 2^3 + 3^3 + 4^3 = (1+2+3+4)^2

It turns out that this holds for the sum of the first n cubes for any n — an identity. This identity can be written also as:

\displaystyle \sum_{i=1}^n i^3 = \left( \displaystyle \sum_{i=1}^n i \right)^2

A second way of expressing the same identity simplifies the right side and expresses it as a combination:

\displaystyle \sum_{i=1}^n i^3 = \binom{n+1}{2}^2

A proof by bijection

This identity can be proven combinatorially by noticing that \binom{n+1}{2} is the number of ways to choose 2 elements with repetition from the set {1..n} — the following is a proof by Benjamin and Orrison.

In this proof, we prove the third form of the identity

\displaystyle \sum_{i=1}^n i^3 = \binom{n+1}{2}^2

by constructing one set of size \displaystyle \sum_{i=1}^n i^3 and constructing another set of size \binom{n+1}{2}^2, then constructing a bijection between the two sets.

Let S be the set of ordered 4-pairs (a,b,c,d) such that 1 \leq a,b,c \leq d \leq n — that is, the fourth integer of the pair is greater or equal to each of the other three. If we fix d, the number of possibilities for the variables a,b,c is d^3. As d ranges from 1 to n, the total number of elements in S is equal to \displaystyle \sum_{i=1}^n i^3.

Also, since \binom{n+1}{2} is the number of ways to choose 2 elements with repetition from the set {1..n}, there are \binom{n+1}{2} pairs of integers (h,i) satisfying 1 \leq h \leq i \leq n. So let T be the set of pairs of such pairs ((h,i),(j,k)) where 1 \leq h \leq i \leq n and 1 \leq j \leq k \leq n. The number of elements in T is then \binom{n+1}{2}^2.

We further partition the sets as follows: let S_1 be the subset of S where a \leq b and let S_2 be the subset of S where a > b. Similarly, let T_1 be the subset of T where i \leq k and let T_2 be the subset of T where i > k. We can construct a bijection between S and T by constructing two bijections:

S_1 \leftrightarrow T_1

S_2 \leftrightarrow T_2

The following is trivially a bijection from S_1 to T_1 — the case where a \leq b:

(a,b,c,d) \rightarrow ((a,b),(c,d))

The equivalent bijection from S_2 to T_2 — the case where a > b:

(a,b,c,d) \rightarrow ((c,d),(b,a-1))

We can confirm that the two operations we defined are indeed bijections. This proves the identity.

Problem 10 of the 2011 Euclid Math Contest (remix)

I wrote the Euclid contest this year two days ago, on tuesday. There were 10 problems, and the tenth problem was a nice geometry problem. Three subproblems with a nice neat triangle on the right, with the subproblems getting progressively harder and harder. As I proceeded with the problem, I couldn’t help but imagine it as a Project Euler problem — instead of finding one such integer triangle, it would ask you to find the sum of the perimeters of all such integer triangles with perimeter less than 10^12 or some large number.

A modified problem

In the above diagram, \angle CAK = 2 \angle KAB and \angle ABC = 2 \angle KAB. Let a = CB, b = AC, c = AB, d = AK, and x = KB. Write an algorithm to find triangles satisfying these conditions where a, b, c are all integers.

Similar triangles

It is difficult to try to find integer triangles with such strange requirements as these. It seems that the line AK is completely unnecessary, but if we take it out, there doesn’t seem to be any way to relate the angle ratios to integer side lengths.

We can prove that \triangle CAK is similar to \triangle ABC. Being an exterior angle, CKA = 3 \theta, and also \angle CAB = 3 \theta. Both of the triangles have an angle of measure 2 \theta and another angle of measure 3 \theta, thus they are similar.

From the similar triangles ratio

\frac{b}{d} = \frac{a}{c}

We can write d in terms of the three sides of the triangle:

d = \frac{bc}{a}

Similarly, the side CK can be written as a-x. Then we have the ratio

\frac{a}{b} = \frac{b}{a-x}

Solving for x allows us to express it in terms of the three sides of the triangle, again:

x = \frac{a^2 - b^2}{a}

Constructing another similar triangle

Our goal here is to relate the lengths a, b, c with a simple equation, which then the problem turns into a number theory problem. Since we can write the lengths d and x in terms of a, b, and c, we can also relate any of a, b, c, d, x in an equation.

Again, there doesn’t seem to be a way to relate all of the variables together, in a way that any solution implies the original angle ratio required — unless we make a construction.

Here we extend AB to F, so that KB = BF and \triangle KBF is an isosceles triangle.

Again, since the exterior angle here is 2 \theta, both \angle BKF = \angle BFK = \theta. Also with this construction, \triangle AKF \sim \triangle KBF, and is also isosceles, hence d = AK = KF.

With this construction, we can write the ratio

\frac{x}{d} = \frac{d}{c+x}

Perfect! Cross multiplying and then substituting in the original sides of the triangles gives

(\frac{a^2-b^2}{a})(c+\frac{a^2-b^2}{a}) = (\frac{bc}{a})^2

Simplifying this gives

(a^2 - b^2) (a^2 - b^2 + ac) = b^2 c^2

Number theory magic

Now that we have an equation relating the three side lengths — it’s easy to verify that any three integers satisfying the triangle inequality gives a triangle with the required conditions — we can use number theory to try to generate integers that fit the equation.

If we expand the equation, we get

a^4+a^3 c-2 a^2 b^2-a b^2 c+b^4-b^2 c^2 = 0

It makes sense to solve for c, which can be done using just the quadratic formula:

c = \frac{a^3 - ab^2 \pm \sqrt{(ab^2-a^3)^2 + 4b^2(b^4 + a^4 - 2a^2 b^2)}}{2b^2}

We need the discriminant D — the expression inside the square root — to be a perfect square, where we are allowed to have integer values for a and b. If we can get D to be a perfect square, then c will turn out to be a rational number. Then multiplying all three variables by a constant gives integer values for all three.

So we defined D:

D = (ab^2-a^3)^2 + 4b^2(b^4 + a^4 - 2a^2 b^2)

Expanding this gives

D = a^6 - 7a^2 b^4 + 2 a^4 b^2 + 4b^6

Fortunately, this expression has an interesting factorization:

D = (a^2+4b^2) (a+b)^2 (a-b)^2

Or we can also write

\sqrt{D} = (a+b) (a-b) \sqrt{a^2 + 4b^2}

We’ve simplified this problem to finding values where a^2 + 4b^2 is a perfect square, that is:

a^2 + (2b)^2 = k^2

This is just finding Pythagorean triples where one of the two sides are even! For instance, in the triple (3,4,5), we have a=3 and b=2. However, substituting a=3, b=2 into the quadratic formula gives c=5. This is almost a solution, only that the sides have to satisfy the triangle inequality (two sides have to add up to more than the third side).

The next Pythagorean triple (5,12,13) gives a=5 and b=6. Substituting this in gives c=11/9, which does satisfy the triangle inequality. Multiplying everything by 9 gives a=45, b=54, c=11 as the smallest working combination.

With this method, it is possible to quickly find arbitrarily many such triples, using Pythagorean triples as a starting point (which can be generated quickly with known methods).

On some number-theoretic properties of right triangles (Project Euler 218)

The two hundred and eighteenth problem of Project Euler is quite interesting, but different. It resembles more closely a mathematics olympiad problem than a programming challenge. Its answer is somewhat surprising, too.

Original problem

The problem can be stated as follows:

A right triangle is considered perfect if (1): it is primitive (GCD of the three sides is 1) and (2): its hypotenuse is a perfect square.

A perfect triangle is considered superperfect if its area is divisible by the perfect numbers 6 and 28.

How many perfect triangles are there with hypotenuse below 10^{16} that are not also superperfect?


It turns out that every perfect triangle is also superperfect, meaning that for any limit there are no perfect triangles that are not superperfect.

Looking on the forums, it seems that a large group of solvers counted the triangles for a smaller limit, like 10^{9} or 10^{12} , and getting 0, assumed the answer applied for 10^{16} .

In this article I will attempt to prove, mathematically, that it is indeed impossible for a perfect triangle not to be superperfect.


Let’s rephrase this problem a bit: Prove that if a primitive right triangle has a hypotenuse that is a perfect square then its area must be a multiple of 6 and 28.

If the area is a multiple of 6 and 28, then it is a multiple of \mathrm{LCM}(6,28) = 84 . If we let p, q, and c be the sides of the right triangle (with c as the hypotenuse), then the area is \frac{pq}{2} .

Since 84 | \frac{pq}{2} , it follows that 168 | pq . As c is a perfect square, we write c as r^2 and since \mathrm{GCD}(p,q,c)=1 , it also follows that \mathrm{GCD}(p,q,r)=1 . This is what we shall now prove.

Lemma 1

For positive integers p, q, r where p^2 + q^2 = r^4 and \mathrm{GCD}(p,q,r)=1 , then 168|pq .

From the Euclid theorem of Pythagorean triples, the sides of a primitive right triangle with sides a, b, and c can be represented as:

a = u^2 - v^2

b = 2uv

c = u^2 + v^2

.. where u>v and u and v are of opposite parity.

Thus by applying the theorem to our triple:

p = u^2 - v^2

q = 2uv

r^2 = u^2 + v^2

Notice here that the third equation here itself represents a pythagorean triple. We then apply the same formula again, this time using m and n for the integers:

u = m^2 - n^2

v = 2mn

r = m^2 + n^2

Substituting for p, q, r:

\begin{array}{rcl} p &=& u^2 - v^2 \\ &=& (m^2-n^2) - (2mn)^2 \\ &=& m^4 - 2m^2n^2 + n^4 - 4m^2n^2 \\&=& m^4 + n^4 - 6m^2 n^2 \\ \\ q &=& 2uv \\ &=& 2(m^2-n^2)(2mn) \\ &=& 4mn(m^2-n^2) \\ \\ r &=& m^2 + n^2 \end{array}


pq = 4mn(m^2-n^2)(m^4+n^4-6m^2n^2)

.. which we will prove to be divisible by 168. Since 168=8*3*7, the expression must be proved to be divisible by 8, by 3, and by 7.

Lemma 2

8 | 4mn(m^2-n^2)(m^4+n^4-6m^2n^2)

Proof of Lemma 2

The proof is almost trivial. In order for the triple to primitive, m and n must be of opposite parity, meaning mn is even.

Because 2|mn and 4 is already a factor of the expression, it follows that 8|pq .

Lemma 3

3 | mn(m^2-n^2)

Proof of Lemma 3

Rewrite the expression as

mn(m+n)(m-n) .

If 3|m or 3|n , then 3|mn .

If m \equiv n \mod 3 , then m-n \equiv 0 \mod 3 .

The only other scenario is when m \not \equiv n \mod 3 , then either m \equiv 1 \mod 3 and n \equiv 2 \mod 3 or vice versa. Either way, m+n \equiv 0 \mod 3 .

Lemma 4

If 7 \nmid m , then m^2 \equiv 1,2,4 \mod 7 .

Proof of Lemma 4

We construct a table. The first column of this table is m \mod 7 while the second column is m^2 \mod 7:

1 1
2 4
3 2
4 2
5 4
6 1

The only possible values are 1, 2, and 4.

Lemma 5

If 7 \nmid m^2 - n^2 , then either m^2 \equiv 2n^2 \mod 7 or n^2 \equiv 2m^2 \mod 7 .

Proof of Lemma 5

Because 7 \nmid m^2 - n^2 , m^2 \not\equiv n^2 \mod 7 . Then, not counting reflective cases, there are three cases we need to consider:

Case 1: m^2 \equiv 1, n^2 \equiv 2 . Then n^2 \equiv 2m^2 \mod 7 .

Case 2: m^2 \equiv 1, n^2 \equiv 4 . Then m^2 \equiv 2n^2 \mod 7 .

Case 3: m^2 \equiv 2, n^2 \equiv 4 . Then n^2 \equiv 2m^2 \mod 7 .

One of these (or their reflection) apply for whatever value of m and n.

Lemma 6

7 | m^4 + n^4 - 6m^2n^2

Proof of Lemma 6

Without loss of generality, rewrite the expression as a congruence mod 7, in terms of m. Since m^2 \equiv 2n^2 \mod 7,

\begin{array}{rcl} m^4 + n^4 - 6m^2n^2 &=& m^4 + (2m^2)^2 - 6m^2(2m^2) \\ &=& m^4 + 4m^4 - 12m^4 \\ &=& -7m^4 \end{array}

The result follows.


Challenge of the Week 05/11/2010

Update (7/31/2010): This article has been rewritten, with several more elegant solutions added, rather than the coordinate solution presented originally.

This is the second time I’m attempting a Challenge of the Week problem, after solving this one and winning coupons for Baskin Robbins’ Ice Cream (which by the way I can’t use because I’m in Canada).

Well my solution here is pretty inelegant and is a sort of ‘coordinate bash’ geometric solution. Pretty sure a better solution exists, but this is the best I could do for now.

The problem is originally from here (although I expect it will be somewhere else by the end of this week).


In equilateral triangle \triangle ABC, M is a point on BC. Let N be the point not on BA such that \triangle BMN is equilateral. P, Q, and R are midpoints of AB, BN, and CM respectively. Prove that \triangle PQR is equilateral.

Solution by coordinate geometry

As with coordinate geometry, I put the figure on a cartesian grid with B as the origin.

Without losing generality, we can let BC be on the x-axis and C be (4,0). Also let M be (4x,0) for some value x between 0 and 1. So 4x is between 0 and 4 and is any point on BC.

Doing this doesn’t lose generality because any instance of this problem can be rotated and scaled so that B is the origin and C is (4,0).

I choose these values for the initial points in order to end up with easier to work with points for PQR.

Noting the coordinates of the points:

B = (0,0)

C = (4,0)

M = (4x,0)

Since we know B and C, we can calculate A and N:

A = (2,2 \sqrt{3})

N = (2x, -2x \sqrt{3})

It becomes easy to calculate P, Q, and R:

P = (1, \sqrt{3})

Q = (x, -x \sqrt{3})

R = (2x+2, 0)

We now calculate the distances of PQ, QR, and PR using the distance formula:

\begin{array}{rcl} PQ &=& \sqrt{(x-1)^2 + (x \sqrt{3} + \sqrt{3})^2} \\ &=& 2 \sqrt{x^2 + x + 1} \end{array}

\begin{array}{rcl} QR &=& \sqrt{(2x+2-x)^2 + (x \sqrt{3})^2} \\ &=& 2 \sqrt{x^2 + x + 1)} \end{array}

\begin{array}{rcl} PR &=& \sqrt{(2x+2-1)^2 + \sqrt{3}^2} \\ &=& 2 \sqrt{x^2 + x + 1)} \end{array}

We find that the three simplify to the same thing. That means PQ = QR = PR and \triangle PQR is equilateral.

Solution via trigonometry

Rather than a coordinate bash, we can use a similar trigonometry bash. We let a be the larger equilateral side (AB), and b be the smaller equilateral side (BN).

Using the law of cosines, with the fact that \cos 60^\circ = \frac{1}{2} and \cos 120^\circ = -\frac{1}{2}, we can get these equations:

PR^2 = (\frac{a}{2})^2 + (\frac{a+b}{2})^2 - (\frac{a}{2})(\frac{a+b}{2})

QR^2 = (\frac{b}{2})^2 + (\frac{a+b}{2})^2 - (\frac{b}{2})(\frac{a+b}{2})

PQ^2 = (\frac{a}{2})^2 + (\frac{b}{2})^2 + (\frac{a}{2})(\frac{b}{2})

If we just simplify these three expressions, we find that they are exactly the same:

PR^2 = QR^2 = PQ^2 = \frac{a^2+ab+b^2}{4}

which proves the result.

Solution via homothetic triangles

A more elegant and less ‘bashy’ solution exists via homothety.

We expand triangle PQR by a factor of 2 about the point B, giving the new triangle ANK. That is, BA is twice of BP, and BK is twice of BR, and so on.

It is sufficient to prove that \triangle ANK is equilateral, as it is homothetic to \triangle PQR.

Since BK = 2BR, it follows that BM=CK. It is obvious that \angle ABN = \angle ACK = 120^\circ, and also that AB = AC, thus triangles ABN and ACK are congruent. Thus, AN = AK.

As \angle BAC = 60^\circ, it is also true that \angle NAK = 60^\circ, thus proving triangle ANK to be equilateral by SAS.

The two-circles method of proving the Pythagorean Theorem

The Pythagorean theorem, stating that the square of the hypotenuse is equal to the sum of the squares of the sides in a right triangle, is a fundamental concept in geometry and trigonometry.

There are many, many ways to prove the Pythagorean theorem. This page contains 84 different proofs, and The Pythagorean Proposition by Loomis contains another 367 proofs.

This is what I think to be an interesting proof, by the use of two circles. It is number 89 in Loomis’s book.

Starting with a right triangle, we draw two circles:

Hosted by

Here, \angle ACB is right, and A and B are the centers of circles \Omega_1 and \Omega_2 respectively.

Now we draw some additional lines, extending AB to F and G:

Hosted by

In this diagram, we can prove that \triangle ADC \sim \triangle ACG:

  1. \angle DCG is right. This is an application of Thales’ theorem since DG is a diameter.
  2. \angle ACB is right. This is given.
  3. \angle ACD = \angle BCG. Since \angle DCG = \angle ACB, \angle DCG - \angle DCB = \angle ACB - \angle DCB.
  4. \angle BCG = \angle BGC. This is because BC and BG are both radii of \Omega_2 and \triangle BCG is isosceles.
  5. \angle ACD = \angle BGC = \angle AGC.
  6. \therefore \triangle ADC \sim \triangle ACG. The two triangles have two shared angles: \angle CAG and \angle ACD = \angle AGC.

In a similar way, we can prove that \triangle BEC \sim \triangle BCF.

The rest of the proof is algebraic rather than geometric. Let’s call the side AC to be b, BC=a, and AB=c.

From the similar triangles, we have the following ratios:

\frac{AG}{AC} = \frac{AC}{AD} (or, b^2 = AG \cdot AD)

\frac{BF}{BC} = \frac{BC}{EB} (or, a^2 = BF \cdot EB)

Adding the two equations, we get:

a^2 + b^2 = BF \cdot EB + AG \cdot AD

The line BF can be split into AF and AB which is equal to c+b since AF = AC.

The line EB can be considered the difference between AB and AE, which is equal to c-b.

Similarly, AG = AB+BG = c+a, and AD = AB-DB = c-a. By substitution:

\begin{array}{rcl} a^2 + b^2 &=& (c+b) (c-b) + (c+a) (c-a) \\ &=& c^2 - b^2 + c^2 - a^2 \\ &=& 2c^2 - a^2 - b^2 \\ 2a^2 + 2b^2 &=& 2c^2 \\ a^2 + b^2 &=& c^2 \end{array}


A Geometry Exercise

This post was inspired by a problem I came across when studying for a math contest, namely problem 23 in the 2007 Cayley Contest.

I think this is an interesting problem because the solution is not so obvious, and there are several incorrect approaches that I’ve tried (and given up on). I’ve modified the problem slightly to make it more interesting.

Here we have a rectangle, ABCD. It’s rotated to the right by \theta degrees, using A as the pivot. This forms AEFG as the new rectangle. We are given x and y, we are asked to find the area of the shaded region.

Can you come up with a general formula for the area of the shaded region, using the given variables x, y, and \theta?


I will use the notation (ABC) to denote the area of triangle \triangle ABC.

The solution is to draw a line parallel to BC through E, the blue line. This splits the lower shaded area into two triangles: \triangle AXE and \triangle EYH, and a rectangle: BCYX.

Since ABCD is the same as AEFG, the top shaded area is the same as the bottom shaded area.

Let’s find the area of triangle \triangle AXE.

The angle \angle BAE is equal to \theta. Of course, AE is equal to y. Using some basic trigonometry, AX = y \cos \theta, XE = y \sin \theta, and BX = y - y \cos \theta. Now we know the area of triangle \triangle AXE:

\begin{array}{rcl} (AXE) &=& \frac{1}{2} (y \cos \theta) (y \sin \theta) \\ &=& \frac{1}{4} y^2 \sin(2 \theta)\end{array}

Next we find the area of \triangle EYH. EY = x - y \sin \theta, and using trigonometry again, HY = \tan \theta (x - y \sin \theta).

\begin{array}{rcl} (EYH) &=& \frac{1}{2} (x - y \sin \theta) [\tan \theta (x - y \sin \theta)] \\ &=& \frac{1}{2} (x - y\sin \theta)^2 \tan \theta \end{array}

Finally, for the rectangle BCYX:

\begin{array}{rcl} (BCYX) &=& x (y - y \cos \theta) \end{array}

Now we add all this up and simplify (T is total):

\begin{array}{rcl} T &=& 2[(AXE) + (EYH) + (BCYX)] \\ &=& 2[\frac{1}{4} y^2 \sin(2 \theta) + \frac{1}{2}(x - y \sin \theta)^2 \tan \theta + x (y - y \cos \theta)] \\ &=& \tan \theta (x^2+y^2) - 2xy (\sec \theta - 1)\end{array}

It’s rather tedious to get from the second to the last step, but Wolfram Alpha could do it for me.

Notice that we assume that EF intersects DC. If \theta is large enough, then EF would intersect AD, and this formula would no longer work.

Of course it would also fail if y is greater than x, so my formula works for only a rather limited range of values.

Solving systems of linear equations in Haskell

Haskell isn’t normally used for things like this, but it’s quite possible to solve systems of linear equations with Haskell.

There are already several libraries for doing this, and other more advanced matrix manipulating. But here, I’m going to start simple.

In mathematics, systems of linear equations are usually represented by an augmented matrix. A system of n linear equations would be represented by an augmented matrix with n rows and n+1 columns.

For example, we have this system of equations:

\begin{array}{rrrcl} x &+2y &-z &=& -4 \\ 2x &+3y &-z &=& -11 \\ -2x &&-3z &=& 22 \end{array}

This would be represented as an augmented matrix:

\left[ \begin{array}{ccc|c} 1 & 2 & -1 & -4 \\ 2 & 3 & -1 & -11 \\ -2 & 0 & -3 & 22 \end{array} \right]

In Haskell we represent this as a list of lists, like this:

[ [1,2,-1,-4], [2,3,-1,-11], [-2,0,-3,22] ]

Here I’ll store each entry not as an integer, but as a floating point. You could also use Rational in Data.Ratio but both are fine for now.

The advantage of using Rational over Float is that sometimes you will end up with fractions that don’t work very well with floating point numbers. However, I’ve found that printing a list of lists of Rational types makes it difficult to read, unless you implement a custom show function for it.

So this is how we define our matrix types in Haskell:

type Row = [Float]
type Matrix = [Row]

The approach to solving this problem is rather simple. First we reduce whatever matrix we have to REF, or Row Echelon Form and then get the actual roots with some back substitution.

The algorithm used to transform a matrix to its Row Echelon Form is known as the Gaussian Elimination. Here’s what a matrix should look like after Gaussian Elimination (a * represents any value):

\left[ \begin{array}{ccc|c} 1 & * & * & * \\ 0 & 1 & * & * \\ 0 & 0 & 1 & * \end{array} \right]

Our matrix should look like this after Gaussian Elimination:

\left[ \begin{array}{ccc|c} 1 & 2 & -1 & -4 \\ 0 & 1 & -1 & 3 \\ 0 & 0 & 1 & -2 \end{array} \right]

The REF form is not unique, so that is only one of the possible valid outputs for the Gaussian Elimination.

Why do we want to have the matrix in REF form? A matrix in this form can easily be solved using back substitution. Consider this matrix as a series of linear equations, as we did before:

\begin{array}{rrrcl} x &+2y &-z &=& -4 \\ &+y &-z &=& 3 \\ &&z &=& -2 \end{array}

Now it would be very clear how to solve for the three variables.

The Gaussian Elimination Algorithm

Here is a diagram of how Gaussian Elimination works. On each iteration, the element circled in green is considered the pivot element, while the elements enclosed in the red square are the ones we intend to remove (zero) in each iteration.

Removing the red elements in the matrix is actually quite simple. Consider how you would eliminate x in equation B here:

\begin{array}{lrrcl}(A) & x & +2y & = & 4 \\(B) & 2x & +y & = & 5 \end{array}

Probably you would multiply equation A by 2, giving 2x + 4y = 8, then subtract B from it, giving 3y=3, eliminating x.

We can also write that as B = 2A - B. Basically to eliminate a variable, just multiply a row so it matches up, and subtract. This is middle school algebra.

To make things easier for us, we divide the row we are on so that the pivot is always 1. We do this now because we need them to be 1 anyways, and this avoids an unnecessary division in the next step.

We could, of course, not have the pivot always be 1, but we would have to do the divisions later when substituting to get the solutions. More on this later.

So to eliminate the variable under the pivot, multiply the whole row by that number. I have a picture to clarify:

We simply repeat this for all elements under the pivot.

An edge case

This is where it gets a little bit tricky. What if the pivot is 0?

We have no way of making it 1 by any kind of multiplication. Further, we cannot eliminate any elements below the pivot. What do we do now?

Simple. We swap the current row with any other row so that the pivot is not zero. Any row will do, so we’ll just pick the first one that fits.

If there is not a single element below the pivot that is not zero, the matrix is either under-determined or singular; either case it is unsolvable.

Here is my Haskell code on what I just covered:

gaussianReduce :: Matrix -> Matrix
gaussianReduce matrix = fixlastrow $ foldl reduceRow matrix [0..length matrix-1] where

 --swaps element at position a with element at position b.
 swap xs a b
  | a > b = swap xs b a
  | a == b = xs
  | a < b = let
  (p1,p2) = splitAt a xs
  (p3,p4) = splitAt (b-a-1) (tail p2)
  in p1 ++ [xs!!b] ++ p3 ++ [xs!!a] ++ (tail p4)

 reduceRow matrix1 r = let
  --first non-zero element on or below (r,r).
  firstnonzero = head $ filter (\x -> matrix1 !! x !! r /= 0) [r..length matrix1-1]

  --matrix with row swapped (if needed)
  matrix2 = swap matrix1 r firstnonzero

  --row we're working with
  row = matrix2 !! r

  --make it have 1 as the leading coefficient
  row1 = map (\x -> x / (row !! r)) row

  --subtract nr from row1 while multiplying
  subrow nr = let k = nr!!r in zipWith (\a b -> k*a - b) row1 nr

  --apply subrow to all rows below
  nextrows = map subrow $ drop (r+1) matrix2

  --concat the lists and repeat
  in take r matrix2 ++ [row1] ++ nextrows

 fixlastrow matrix' = let
  a = init matrix'; row = last matrix'; z = last row; nz = last (init row)
  in a ++ [init (init row) ++ [1, z / nz]]

Edit: There was a bug in the above code, found by Alan Zimmerman. I think it’s been fixed.

This may be a bit difficult to read because there is no syntax highlighting and the code is cut off. I’ll provide a link to the full source code at the end.

Admittedly Haskell may not have been the best language to implement this algorithm this particular way because there is so much state changing. Any language that allows mutable state would probably perform better than this code.

Notice that at the end, the last row does not get divided. The fixlastrow function corrects this problem.

Let’s test this code:

*Main> gaussianReduce [ [1,2,-1,-4], [2,3,-1,-11], [-2,0,-3,22] ]


Finishing up

The next step of the algorithm is to solve the variables by back substitution. This is pretty easy, I think. My code keeps a list of already-found solutions. Folding from the right, each step it substitutes in the corresponding solution and multiplies & subtracts to get the next solution, adding that to the solution list.

--Solve a matrix (must already be in REF form) by back substitution.
substitute :: Matrix -> Row
substitute matrix = foldr next [last (last matrix)] (init matrix) where

 next row found = let
  subpart = init $ drop (length matrix - length found) row
  solution = last row - sum (zipWith (*) found subpart)
  in solution : found

To get a list of solutions from a matrix, we chain the substitute and gaussianReduce functions:

solve :: Matrix -> Row
solve = substitute . gaussianReduce
*Main> solve [ [1,2,-1,-4], [2,3,-1,-11], [-2,0,-3,22] ]

This means the solutions are (x,y,z) = (-8,1,-2). That seems correct, so we’re done!

The code is far from practical, though. Although it works, I haven’t really tested its performance (probably not very good), and it doesn’t handle all edge cases.