Visualizing Quaternions with Unity

How do you model the position and orientation of an airplane?

Position is easy, just represent it with a point in 3D space. But how do you specify its orientation — which direction it’s pointing?

At first glance, it seems a vector will do. After all, a vector points in some direction, right? If the plane is pointing east, represent its orientation by a unit vector pointing east.

Unfortunately, we quickly run into trouble when we try to roll. If we’re facing east, and we roll 90 degrees, we’re still facing east. Clearly we’re missing something.

Euler Angles

When real pilots talk about their orientation, they talk about roll, yaw, pitch. Pitch is going up or down, yaw is going left or right, roll is, well, roll.

Any change in orientation can be described by some combination of roll, yaw, pitch. This is the basis for Euler Angles. We use three angles to represent the airplane’s orientation.

This is all fine and dandy if we want to represent the orientation of a static object in space. But when we try to adjust our orientation, we start to run into problems.

You’re thinking, this should be simple! When we turn left or right, we just increment the yaw variable, right? Yes, it seems to work, at least initially. You can turn left and right, up and down, and roll around.

Implement it in Unity and play around a bit, however, and you begin to notice that things don’t quite behave the way you expect.

In this animation, I’m holding down the right button:

The plane does rotate to the right, but it’s not rotating relative to itself. Instead it’s rotating around some invisible y-axis. If it was rotating relative to itself, the green arrow shouldn’t be moving.

The problem becomes more and more severe when the pitch of the plane becomes higher and higher. The worst case is when the airplane is pointing straight up: then roll and yaw become the same thing! This is called gimbal lock: we have lost a degree of freedom and we can only rotate in 2 dimensions! Definitely not something desirable if we’re controlling a plane or spaceship.

It turns out that no matter what we do, we will suffer from some form of gimbal lock. As long as we use Euler Angles, there is one direction where if we turn too far, everything starts to screw up.

Practical Introduction to Quaternions

All is not lost, however. There is a way to represent orientation that represents all axes equally and does not suffer from gimbal lock. This mythical structure is called the quaternion. Unlike Euler Angles which describe your orientation relative to a fixed set of axes, quaternions do not rely on any fixed axis.

The drawback is that quaternions are unintuitive to understand for humans. There is no way to “look” at a quaternion and be able to visualize what rotation it represents. Fortunately for us, it’s not that difficult to make use of quaternions, even if we can’t visualize quaternions.

There is a lot of theory behind how quaternions work, but in this article, I will gloss over the theory and give a quick primer to quaternions, just the most common facts you need to use them. At the same time, I will implement the operations I describe in C#, so I can integrate them with Unity. If you don’t know C#, you can freely ignore the code.

Definition

A quaternion is an ordered pair of 4 real numbers (w,x,y,z). We write this as

w+xi+yj+zk

The letters i,j,k are not variables. Rather, they are independent axes. If you like, you can think of the quaternions as a 4 dimensional vector space.

The defining property of quaternions is:

i^2 = j^2 = k^2 = ijk = -1

Play around with it a bit and you can derive 6 more identites:

ij = k

jk = i

ki = j

ji = -k

kj = -i

ik = -j

If you’ve worked with complex numbers, this should seem familiar. Instead of 2 parts of a complex number (the real and imaginary parts), we have 4 parts for a quaternion.

The similarity doesn’t end here. Multiplying complex numbers represents a rotation in 2 dimensions. Similarly, multiplying by a quaternion represents a rotation in 3D.

One curious thing to note: we have ij=k and ji=-k. We switched around the terms and the product changed. This means that multiplying quaternions is kind of like multiplying matrices — the order matters. So multiplication is not commutative.

Here’s a framework for a quaternion in C#:

public class Quat{
	// Represents w + xi + yj + zk
	public float w, x, y, z;
	public Quat(float w, float x, float y, float z){
		this.w = w;
		this.x = x;
		this.y = y;
		this.z = z;
	}
}

Normalizing Quaternions

The norm of a quaternion is

N(\mathbf{q}) = \sqrt{w^2+x^2+y^2+z^2}

When we use quaternions to represent rotations, we typically want unit quaternions: quaternions with norm 1. This is straightforward: to normalize a quaternion, divide each component by the norm.

In C#:

public float Norm(){
  return Mathf.Sqrt (w * w + x * x + y * y + z * z);
}

public Quat Normalize(){
  float m = Norm ();
  return new Quat (w / m, x / m, y / m, z / m);
}

Multiplying Quaternions

Multiplying is simple, just a little tedious. If we have two quaternions:

(w_1 + x_1i + y_1j + z_1k) (w_2+x_2i+y_2j+z_2k)

Then their product is this ugly mess:

\begin{array}{l} w_1w_2-x_1x_2-y_1y_2-z_1z_2 \\ + (w_1x_2+x_1w_2+y_1z_2-z_1y_2)i \\ + (w_1y_2+y_1w_2-x_1z_2+z_1x_2) j \\ + (w_1z_2+z_1w_2+x_1y_2-y_1x_2) k \end{array}

In C#:

// Returns a*b
public static Quat Multiply(Quat a, Quat b){
  float w = a.w * b.w - a.x * b.x - a.y * b.y - a.z * b.z;
  float x = a.w * b.x + a.x * b.w + a.y * b.z - a.z * b.y;
  float y = a.w * b.y + a.y * b.w - a.x * b.z + a.z * b.x;
  float z = a.w * b.z + a.z * b.w + a.x * b.y - a.y * b.x;
  return new Quat (w,x,y,z).Normalize();
}

Since multiplication is not commutative, I made this function static to avoid confusing left and right multiplication. Also, I normalize the product so that floating point errors don’t accumulate.

Constructing Rotation Quaternions

Every rotation operation can be written as a rotation of some angle, \theta, around some vector (u_x, u_y, u_z):

The following formula gives a quaternion that represents this rotation:

\mathbf{q} = \cos \frac{\theta}{2} + (u_x i + u_y j + u_z k) \sin \frac{\theta}{2}

For our purposes, \theta is a very small number, say 0.01, and we use one of the three basis vectors to rotate around. For example, if we are rotating around (1,0,0) then our quaternion is

\cos \frac{0.01}{2} + \sin \frac{0.01}{2}i

That’s it: given any quaternion, multiplying on the left by our quaternion rotates it slightly around the x axis.

In C#, our code might look like this:

Quat qx = new Quat (Mathf.Cos (0.01 / 2), 0, 0, Mathf.Sin (0.01 / 2));
Quat qy = new Quat (Mathf.Cos (0.01 / 2), 0, Mathf.Sin (0.01 / 2), 0);
Quat qz = new Quat (Mathf.Cos (0.01 / 2), Mathf.Sin (0.01 / 2), 0, 0);

Putting it together

That’s all we need to do interesting things with quaternions. Let’s combine everything we have. Here’s our quaternion class thus far:

public class Quat{
	// Represents w + xi + yj + zk
	public float w, x, y, z;
	public Quat(float w, float x, float y, float z){
		this.w = w;
		this.x = x;
		this.y = y;
		this.z = z;
	}

	public float Norm(){
		return Mathf.Sqrt (w * w + x * x + y * y + z * z);
	}

	public Quat Normalize(){
		float m = Norm ();
		return new Quat (w / m, x / m, y / m, z / m);
	}

	// Returns a*b
	public static Quat Multiply(Quat a, Quat b){
		float w = a.w * b.w - a.x * b.x - a.y * b.y - a.z * b.z;
		float x = a.w * b.x + a.x * b.w + a.y * b.z - a.z * b.y;
		float y = a.w * b.y + a.y * b.w - a.x * b.z + a.z * b.x;
		float z = a.w * b.z + a.z * b.w + a.x * b.y - a.y * b.x;
		return new Quat (w,x,y,z).Normalize();
	}

	public Quaternion ToUnityQuaternion(){
		return new Quaternion (w, x, y, z);
	}
}

Now we just need to read the input, perform our calculations, and output the rotation quaternion to Unity:

public class Airplane : MonoBehaviour {
  public GameObject airplane;
  public Quat quat = new Quat (0, 0, 0, -1);
  public float speed = 0.01f;

  void FixedUpdate(){
    float inputX = Input.GetAxis("UpDown");
    float inputY = Input.GetAxis("LeftRight");
    float inputZ = Input.GetAxis("Roll");

    Quat qx = new Quat (Mathf.Cos (speed * inputX / 2), 0, 0, Mathf.Sin (speed * inputX / 2));
    Quat qy = new Quat (Mathf.Cos (speed * inputY / 2), 0, Mathf.Sin (speed * inputY / 2), 0);
    Quat qz = new Quat (Mathf.Cos (speed * inputZ / 2), Mathf.Sin (speed * inputZ / 2), 0, 0);

    quat = Quat.Multiply (qx, quat);
    quat = Quat.Multiply (qy, quat);
    quat = Quat.Multiply (qz, quat);

    airplane.transform.rotation = quat.ToUnityQuaternion ();
  }
}

In Unity, the input is not given to us as a single true/false value, but a float between -1 and 1. So holding right increases the LeftRight input gradually until it reaches 1, avoiding a sudden jump in movement.

What’s ToUnityQuaternion? Well, it turns out that Unity already has a Quaternion class that does everything here and much more, so all this could have literally been implemented in one line if we wanted.

Anyways, let’s see the result.

As you can see, holding right turns the plane relative to itself now, and the green arrow stays still. Hooray!

Minimum quadrilateral inscribed in a square

A problem that I’ve seen lately reduces to the following problem:

We have a square, and we put a point on each side of the square. Then we connect the four points to create a quadrilateral. How can we make this quadrilateral have the smallest possible perimeter?

Intuitively, you may believe that this natural, obvious configuration should produce the least perimeter:

Attempt with Calculus

How can we prove that this indeed gives us the smallest possible perimeter?

A first attempt might be to give variables to the side lengths, and somehow find the minimum perimeter using algebra and calculus tools. So there are four independent points — let’s parameterize them with four variables, and assume the side length of the square is 1:

Then we want to minimize this expression:

\sqrt{a^2+(1-d)^2} + \sqrt{b^2+(1-a)^2}+ \sqrt{c^2+(1-b)^2}+ \sqrt{d^2+(1-c)^2}

At this point, it isn’t clear how to proceed — there doesn’t seem to be any way to minimize this expression of four variables.

Proof by Net

We’ll have to try something different. It’s hard to make sense of anything when there are four independent variables. Instead, if we expand things out a bit, things start to become more manageable:

What we did was reflect the square three times, and each time the square is reflected, the inscribed quadrilateral goes with it. By taking only the relevant parts of the quadrilateral, we get the green path.

Now we might have a solution. If we had a different green path, can we reverse the steps and get the original quadrilateral back? Basically, the following requirements have to be met:

  • The path has to cross all three of the internal lines BC, BA, and DA.
  • The path’s position on the bottom-most line, DC must be the same when reflected onto the top-most line DC.

With these requirements in mind, the shortest green path that satisfies these requirements is a straight line connecting a point on the bottom left to its reflected point on the top right:

Our intuition at the start was well-founded.

Now notice that this isn’t the only possible shortest path. If we move the entire green line to the left or right, we get a different path of the same length!

For instance, the degenerate ‘quadrilateral’ formed by connecting two opposite corners has the same perimeter as the one we get by connecting the midpoints. Neat, huh?

Understanding Harmonic Conjugates (sort of)

For many people (for me at least), the Harmonic Conjugate is a difficult concept to understand. I didn’t really get it the first time I saw it, at Mathcamp. Let’s take the definition of the harmonic conjugate:

AB and CD are harmonic conjugates if this equation holds:

\frac{AC}{BC} = \frac{AD}{BD}

If you’re like me, you’re thinking along the lines of “But why? Why is this defined this way? Why would we spend so much time proving things about this weird concept? What’s the point, what’s the use?”

Even now, I can’t really give you an intuitive explanation of why this equality is so important. On the other hand, I could certainly come up with a few problems in which the concept of the harmonic conjugate turns to be useful.

Apollonius and Fleeing Ships

Apollonius’s problem was this: you are in control of a ship (point A on diagram), and you are in pursuit of another ship (point B). The other ship is fleeing in a straight line in some direction:

Your speed is (obviously) faster than the speed of the other ship: say they’re going at 30 km/h and you’re going at 50 km/h. Additionally, your ship is required to go in a straight line.

In which direction should you set off in order to intercept the fleeing ship?

Solution with Harmonic Conjugates

The first step of the solution is to construct harmonic conjugates CD so that their ratio is the ratio of your speed to the other ship’s speed (we’ll prove later that this is actually possible; assume we can do this for now):

\frac{AC}{BC} = \frac{AD}{BD} = \frac{5}{3}

Next, draw a circle with diameter CD:

There is a point where the ray from B (their ship) intersects this circle. Now go to this point immediately, in a straight line: the ship will be there.

The Proof

In order to prove that this works, we’ll need to take a step back and look at how we constructed the points C and D. The solution turns out to be evident directly from the construction of the harmonic conjugates.

Again, let’s assume our desired ratio is 5/3. Starting with the points A and B, the first step is constructing some point P so that:

\frac{AP}{BP} = \frac{5}{3}

This is fairly easy to do. Draw a circle of radius 5 around A, and draw a circle of radius 3 around B — the intersection P of these two circles forms the correct ratio. (if the circles don’t intersect, just scale everything up and try again)

Next, dropping the internal and external angle bisectors of the new triangle gives the harmonic conjugates C and D:

Why angle bisectors? From the angle bisector theorems (which I won’t prove here):

\frac{AP}{BP} = \frac{AC}{BC} = \frac{5}{3}

\frac{AP}{BP} = \frac{AD}{BD} = \frac{5}{3}

Combining the two proves that C and D are indeed harmonic conjugates to AB.

As a corollary, notice that because of angle bisecting, the angle CPD is always a right angle — hence, the locus of all points P forms a circle with diameter CD.

Returning to the ship problem, since each point P is defined as a point so that \frac{AP}{BP} = \frac{5}{3} , it follows that when both ships travel to such a point P, they will meet at the same time.

Calculating the Plane Angles of Platonic Solids

What is the angle between two planes of a tetrahedron?

The angle between any two edges of the tetrahedron is 60^\circ, so it’s easy to (falsely) conclude that the angle between two faces must also be 60^\circ.

But this isn’t the case: the plane angle is defined as the angle between two lines on the two planes that are both perpendicular to the edge. None of the edges in a tetrahedron is perpendicular to any other, so the answer of 60^\circ is invalid.

We can try to compute the angle using regular Euclidean solid geometry, but things tend to get messy. A different way to approach the problem is by using vector geometry: using vector methods we can easily calculate the plane angle of the tetrahedron (as well as the icosahedron and the dodecahedron).

Assume symmetry. We represent three concurrent edges of the polyhedron as three vectors beginning at the same point: \vec a, \vec b, and \vec c; let \alpha and \beta be the angles between the vectors (by symmetry we’re assuming that the two alpha’s are equal):
(in case my poor drawing skills does not make this apparent, vectors a and b form one face of the polyhedron and c and b form another face)

For simplicity, let’s also say the lengths of each of the three vectors is 1.

We want to compute the angle between the plane formed by \vec a and \vec c, and the plane formed by \vec b and \vec c. Hence let \vec x and \vec y be the perpendiculars to \vec a and \vec b respectively each ending at the same point as their respective vectors:

For any two vectors, the dot product is defined \vec a \cdot \vec b = |\vec a| |\vec b| \cos \theta with \theta being the angle between the vectors. Given that the lengths of the vectors are all 1, we have:

\vec a \cdot \vec c = \vec b \cdot \vec c = \cos \alpha

\vec a \cdot \vec b = \cos \beta

Also, by vector addition:

\vec c \cos \alpha + \vec x = \vec a

\vec c \cos \alpha + \vec y = \vec b

Hence \vec x = \vec a - \vec c \cos \alpha and \vec y = \vec b - \vec c \cos \alpha.

We want to find the angle between \vec x and \vec y — call this angle \theta. Then

\vec x \cdot \vec y = |\vec x| |\vec y| \cos \theta

The dot product of vectors x and y is simply:

\begin{array}{l} (\vec a - \vec c \cos \alpha) \cdot (\vec b - \vec c \cos \alpha) \\ = (\vec a - \vec c (\vec a \cdot \vec c)) \cdot (\vec b - \vec c (\vec b \cdot \vec c)) \\ = \vec a \cdot \vec b - (\vec a \cdot \vec c) (\vec b \cdot \vec c) \\ = \cos \beta - \cos^2 \alpha \end{array}

Additionally |\vec x| = |\vec y| = \sin \alpha. Hence the cosine of the angle is:

\cos \theta = \frac{\vec x \cdot \vec y}{|\vec x| |\vec y|} = \frac{\cos \beta - \cos^2 \alpha}{\sin^2 \alpha}

We can now use this newly derived formula to calculate plane angles! For example…

Tetrahedron

In the tetrahedron, both \alpha and \beta are 60:
So \cos \theta = \frac{\cos 60 - \cos^2 60}{\sin^2 60} = \frac{1}{2}  and \theta = \arccos \frac{1}{3} = 70.8^\circ.

Icosahedron

In the icosahedron, \alpha = 60 but \beta = 108:
The top ‘cap’ is a regular pentagon, which has a vertex angle of 108; each side of the pentagon constitutes a side of an equilateral triangle. Since \cos 108 = \frac{1}{4} (1-\sqrt{5}), \cos \theta = \frac{\cos 108 - \cos^2 60}{\sin^2 60} = -\frac{\sqrt{5}}{3} , and \theta = \arccos (-\frac{\sqrt{5}}{3}) = 138.2^\circ.

Dodecahedron

Computing angles for the dodecahedron works a bit differently from the tetrahedron and icosahedron. Instead of using existing edges as vectors, we construct an equilateral triangle by connecting three vertices:

So \alpha = 36 (since it’s part of a regular pentagon) and \beta = 60. Then \cos \theta = \frac{\cos^2 60 - \cos^2 36}{\sin^2 36} and \theta = 116.6^\circ.

Varignon’s theorem proved in one line with vectors

Today I was reading some math book when the author mentions Varignon’s theorem, and gives a proof. The proof was not very long, but it was somewhat confusing. On Wikipedia, several more short proofs were given, but they were all more confusing than need be.

I remembered seeing the theorem proven using vector geometry before, but I couldn’t find the text (nor any other page / book that proves it this way) —

[image shamelessly taken from Wikipedia]

Varignon’s theorem states that in any quadrilateral, if we join the midpoints of the sides, then we get a parallelogram.

In the diagram, it suffices to prove that vector HG is equal to vector EF — vectors must have both the same orientation and length to be equal. This works since any method that proves HG = EF can also prove HE = GF. The proof goes as follows —

\vec {HG} = \vec{HD} + \vec{DG} = \frac{1}{2} (\vec{AD} + \vec{DC}) = \frac{1}{2} \vec{AC} = \vec{EF}

And we’re done. (the last step is due to symmetry of HG and EF)

Problem 10 of the 2011 Euclid Math Contest (remix)

I wrote the Euclid contest this year two days ago, on tuesday. There were 10 problems, and the tenth problem was a nice geometry problem. Three subproblems with a nice neat triangle on the right, with the subproblems getting progressively harder and harder. As I proceeded with the problem, I couldn’t help but imagine it as a Project Euler problem — instead of finding one such integer triangle, it would ask you to find the sum of the perimeters of all such integer triangles with perimeter less than 10^12 or some large number.

A modified problem

In the above diagram, \angle CAK = 2 \angle KAB and \angle ABC = 2 \angle KAB. Let a = CB, b = AC, c = AB, d = AK, and x = KB. Write an algorithm to find triangles satisfying these conditions where a, b, c are all integers.

Similar triangles

It is difficult to try to find integer triangles with such strange requirements as these. It seems that the line AK is completely unnecessary, but if we take it out, there doesn’t seem to be any way to relate the angle ratios to integer side lengths.

We can prove that \triangle CAK is similar to \triangle ABC. Being an exterior angle, CKA = 3 \theta, and also \angle CAB = 3 \theta. Both of the triangles have an angle of measure 2 \theta and another angle of measure 3 \theta, thus they are similar.

From the similar triangles ratio

\frac{b}{d} = \frac{a}{c}

We can write d in terms of the three sides of the triangle:

d = \frac{bc}{a}

Similarly, the side CK can be written as a-x. Then we have the ratio

\frac{a}{b} = \frac{b}{a-x}

Solving for x allows us to express it in terms of the three sides of the triangle, again:

x = \frac{a^2 - b^2}{a}

Constructing another similar triangle

Our goal here is to relate the lengths a, b, c with a simple equation, which then the problem turns into a number theory problem. Since we can write the lengths d and x in terms of a, b, and c, we can also relate any of a, b, c, d, x in an equation.

Again, there doesn’t seem to be a way to relate all of the variables together, in a way that any solution implies the original angle ratio required — unless we make a construction.

Here we extend AB to F, so that KB = BF and \triangle KBF is an isosceles triangle.

Again, since the exterior angle here is 2 \theta, both \angle BKF = \angle BFK = \theta. Also with this construction, \triangle AKF \sim \triangle KBF, and is also isosceles, hence d = AK = KF.

With this construction, we can write the ratio

\frac{x}{d} = \frac{d}{c+x}

Perfect! Cross multiplying and then substituting in the original sides of the triangles gives

(\frac{a^2-b^2}{a})(c+\frac{a^2-b^2}{a}) = (\frac{bc}{a})^2

Simplifying this gives

(a^2 - b^2) (a^2 - b^2 + ac) = b^2 c^2

Number theory magic

Now that we have an equation relating the three side lengths — it’s easy to verify that any three integers satisfying the triangle inequality gives a triangle with the required conditions — we can use number theory to try to generate integers that fit the equation.

If we expand the equation, we get

a^4+a^3 c-2 a^2 b^2-a b^2 c+b^4-b^2 c^2 = 0

It makes sense to solve for c, which can be done using just the quadratic formula:

c = \frac{a^3 - ab^2 \pm \sqrt{(ab^2-a^3)^2 + 4b^2(b^4 + a^4 - 2a^2 b^2)}}{2b^2}

We need the discriminant D — the expression inside the square root — to be a perfect square, where we are allowed to have integer values for a and b. If we can get D to be a perfect square, then c will turn out to be a rational number. Then multiplying all three variables by a constant gives integer values for all three.

So we defined D:

D = (ab^2-a^3)^2 + 4b^2(b^4 + a^4 - 2a^2 b^2)

Expanding this gives

D = a^6 - 7a^2 b^4 + 2 a^4 b^2 + 4b^6

Fortunately, this expression has an interesting factorization:

D = (a^2+4b^2) (a+b)^2 (a-b)^2

Or we can also write

\sqrt{D} = (a+b) (a-b) \sqrt{a^2 + 4b^2}

We’ve simplified this problem to finding values where a^2 + 4b^2 is a perfect square, that is:

a^2 + (2b)^2 = k^2

This is just finding Pythagorean triples where one of the two sides are even! For instance, in the triple (3,4,5), we have a=3 and b=2. However, substituting a=3, b=2 into the quadratic formula gives c=5. This is almost a solution, only that the sides have to satisfy the triangle inequality (two sides have to add up to more than the third side).

The next Pythagorean triple (5,12,13) gives a=5 and b=6. Substituting this in gives c=11/9, which does satisfy the triangle inequality. Multiplying everything by 9 gives a=45, b=54, c=11 as the smallest working combination.

With this method, it is possible to quickly find arbitrarily many such triples, using Pythagorean triples as a starting point (which can be generated quickly with known methods).

Rotating a Hyperbola

The equation of a hyperbola gives it some very interesting properties under a series of stretches. Consider the hyperbola xy=1:

If we stretch the hyperbola horizontally by a factor of 2 about the x axis — replacing x in the equation xy=1 with \frac{x}{2}, we have the equation xy=2.

Now, if we compress the resulting hyperbola vertically by a factor of 2 — replacing y in the equation xy=2 with 2y and simplifying, we get the original hyperbola, xy=1. So by a horizontal stretch followed by a vertical compression, we get the original hyperbola.

However, consider a point on the hyperbola xy=1, say (x,y). Stretching the hyperbola horizontally by a factor of 2, this point gets transformed to (2x,\frac{y}{2}). Even though each point has a different image point, however, the graph as a whole remains identical.

More generally, if we stretch by a factor of k instead of 2, where k is any positive number, we still obtain the original hyperbola after the two stretches. The point (x,y) gets transformed to the point (kx,\frac{y}{k}). We call this combinations of transformations a hyperbolic rotation. Intuitively, the points on the hyperbola are ‘rotated’ downwards.

There are some interesting things about the hyperbolic rotation. For one, by choosing the right k for the transformation, it is possible to transform any point on the hyperbola into any other point on the same arm of the hyperbola (by allowing k to be smaller than 1). Also, since the hyperbolic rotation is composed of two simple stretches, parallel lines as well as ratios of a line segments are preserved. The area of a figure is also preserved since one stretch reduces the area, while the other multiplies the area by the same amount.

With these facts, we can prove things about the hyperbola itself:

Theorem: Let P be a point on the hyperbola xy=1. Let l be the line passing through P tangent to the hyperbola. Let X be the intersection of l and the x-axis, and Y be the intersection of l with the y-axis. Prove that P is the midpoint of XY.

The proof is very simple with the hyperbolic rotation. Consider when P = (1,1). The theorem obviously holds here, because of symmetry. But by applying a hyperbolic rotation, we can transform the point (1,1) into any other point on the hyperbola. Since ratios between line segments don’t change with a hyperbolic rotation, P is always the midpoint, and we are done.

Calculus magic

You have probably seen some version of this curve at some time or another:

The blue lines are drawn by putting n evenly spaced points in the interval [0,1] for both the horizontal and vertical axis and connecting them.

As n approaches infinity, the question is, what is the area occupied by the blue lines? The area is clearly finite since the entire shape is bounded by a 1×1 square, but how much of that square is occupied?

Intuitively it might seem that the blue curve is a circle, however that intuition is wrong. In the above diagram, the red curve is the actual circle, and clearly the blue curve is outside the circle. But fortunately the area can be found with some calculus.

If we choose some point (a,0) on the x-axis, then obviously that point is connected to point (0,1-a) on the y-axis. The slope of this line is \frac{a-1}{a} and the equation of this line is y=\frac{a-1}{a}x+1-a. More conveniently, we can this expression as g(a,x), intuitively the line formed when we choose a as our x-axis point.

Let f(x) be the bounding curve for the blue lines. To find f(x), we want to choose the value a, given x, so that g(a,x) is maximized. To do this, we take the partial derivative \frac{\partial y}{\partial a}:

\frac{\partial y}{\partial a} = x(\frac{a-a+1}{a^2})-1=\frac{x}{a^2}-1

The optimal value of a is the one for which \frac{\partial y}{\partial a}=0, so solving for a we get a=\sqrt{x}. Then we have

f(x) = g(\sqrt{x},x) = x \frac{\sqrt{x}-1}{\sqrt{x}}+1-\sqrt{x}

f(x)= 1+x-2\sqrt{x}

Integrating f(x) from 0 to 1 gives us the answer:

\displaystyle \int_0^1 (1+x-2\sqrt{x}) dx = \frac{1}{6}

Notes on Mercator’s Projection

The following work is my math essay assignment (an essay on any topic related to the Math 30 curriculum). I found it interesting enough that I’ll post it here on my math blog. Yay.

Introduction

How do you make a world map? We find world maps everywhere — but the making of world maps is more complicated than most people realize. The earth is round, and a map is flat. How should one represent the surface of the earth, a sphere, on a flat sheet of paper?

The problem of displaying the surface of a sphere on a map has concerned cartographers, or map makers, for centuries. Over the years, cartographers have come up with a multitude of map projections, or systematic methods of transferring the surface of the earth onto flat paper.

There are many types of map projections, but one of the most common types is the cylindrical projection. Imagine a semi-transparent, hollow globe with a light inside of it. Now wrap a piece of blank map paper around it, like a cylinder:

Let the earth be represented as a unit sphere. The longitude (east-west location) shall be denoted as \lambda, and the latitude (north-south location) as \theta. A is the center of the earth, and B a point on the equator.

Now a light source is placed at point A; a point D on the surface of the globe will be projected in a straight line to point C on the cylinder such that points A, D, and C are collinear. Using some trigonometry, we have \tan \theta = \frac{BC}{AB}; since AB=1, the length of BC is \tan \theta.

After repeating this procedure for every point on the sphere, the cylinder is unrolled into a flat sheet. The result is the following map:

Here, if a point on the earth is known (we know its latitude and longitude), we know its position on the map. The position coordinates of the map, (x,y) can be described in terms of \lambda and \theta in the following equations:

x = \lambda

y = \tan \theta

This projection has several major drawbacks: most notably, as the latitude \theta increases toward 90^\circ, the slope of \tan(\theta) becomes increasingly steep, and the y-coordinates of the point of the map increases towards infinity. At high latitudes, the map is very distorted.

The map is not equal-area — in an equal-area map, shapes of equal area on the sphere have equal area on the projection. Looking at the map we created, this is clearly not the case. Neither is the map conformal, or shape preserving. In a conformal map, the angle between any two lines must be the same as their counterparts on the map.

Mercator’s Projection

One of the most well known cylindrical map projections is Mercator’s projection. Invented by Belgian cartographer Gerardus Mercator in 1569, Mercator addresses some of the problems in older cylindrical projections. Unlike our projection, Mercator’s projection is conformal, and it has been used for navigation for centuries:

However, the equations for Mercator’s projection are rather complicated. The position on Mercator’s projection, when given the latitude and longitude, is:

x = \lambda

y = \ln ( \sec \theta + \tan \theta )

This is a strange and seemingly random formula. But when we use this formula, we get a conformal map. How does this work?

Deriving Mercator’s Projection

Before deriving the formula for Mercator’s Projection, we take a step back and ask ourselves — what was the projection for in the first place?

Unlike most maps today which are used to decorate walls, Mercator’s map was more than that — it was used for nautical navigation. Suppose you were to travel by ship from Vancouver to Honolulu, equipped with nothing more than a compass. Which direction should you travel?

Now if you have a copy of Mercator’s map, this becomes simple. You draw a straight line from Vancouver to Honolulu, and with a protractor you find that the direction is 40^\circ south to the parallel. With a compass, it would be easy for you to maintain this constant heading for the entire duration of the trip.

(Footnote — The straight line on Mercator’s map is not straight on the Earth — it is actually a curved line, called a rhumb line. That is, after taking an initial bearing, one proceeds along the same bearing (relative to the north pole) for the duration of the trip. In the example, you would have to adjust your heading intermittently so that you are always heading at the direction 40^\circ south to the parallel.)

This technique is possible because it relies on one crucial property of Mercator’s map: that it is conformal. It would fail if we tried the same technique on the cylindrical projection given in the introduction.

However, notice that on Mercator’s map, all parallels have the same length: a parallel is simply a line across the width of the entire map. But on Earth, a sphere, parallels are longer when they are closer to the equator and shorter when near the poles:

Let the Earth be a unit sphere with center A. D is a point on the surface of the Earth and B is its corresponding point on the equator. Let C be a point such that CD is parallel to AB and AC is perpendicular to AB. As AB = 1, the circumference of the Earth at the equator is 2 \pi. But the circumference at latitude \theta is only 2\pi \cos \theta.

It is easy to see why: as AB || CD in the diagram, it follows that angles \angle DAB and \angle ADC are equal. Next, \cos \angle ADC = \frac{CD}{AD} = \frac{CD}{1} = CD, therefore CD = \cos \theta. The result follows.

On the map, the length of the equator and the parallel at latitude \theta are equal. But on Earth, the parallel at latitude \theta is smaller than the equator by a factor of \cos \theta; in the projection any line at latitude \theta ends up being stretched horizontally by a factor of \frac{1}{\cos \theta} or \sec \theta.

Finally, in order to satisfy the property of conformality, any part of the map stretched horizontally by some factor, say k, must be stretched vertically by the same factor k. Doing so preserves the angles of that part of the map:

Imagine a very small piece of land at latitude \theta. When it is represented on the map, it is stretched horizontally by \sec \theta. Hence to preserve conformity, it must also be stretched vertically by \sec \theta.

But this can only work if the piece of land has no area. But as all pieces of land, no matter how small, has some area, the latitude of the piece of land is not \theta, but goes from \theta to \theta + \Delta \theta for some small number \Delta \theta. Similarly, the space this piece of land occupies on the map is not just y, but goes from y + \Delta y for some small number \Delta y. Now as \Delta \theta approaches 0, the ratio of \Delta y to \Delta \theta approaches \sec \theta:

\lim_{\Delta \theta \to 0} \frac{\Delta y}{\Delta \theta} = \sec \theta

Or equivalently,

\frac{dy}{d \theta} = \sec \theta

Solving for dy and integrating both sides gives:

\begin{array}{rl}y & = \int \sec \theta \, d \theta \\ & = \int \left( \sec \theta \cdot \frac{\sec \theta + \tan \theta}{\sec \theta + \tan \theta} \right) d \theta \\ & = \int \frac{\sec^2 \theta + \sec \theta \tan \theta}{\sec \theta + \tan \theta} \, d \theta\end{array}

If we let u be the denominator, u = \sec \theta + \tan \theta, it turns out that the numerator is the derivative of u, or du = \sec^2 \theta + \sec \theta \tan \theta. Thus we can substitute u into the integral:

\begin{array}{rl} y & = \int \frac{du}{u} \\ & = \ln |u| + C \\ & = \ln | \sec \theta + \tan \theta | + C \end{array}

We defined the map so that the equator lies on the x-axis, therefore when \theta = 0, y=0. Substituting and solving for C, we find that C=0. Also, since \sec \theta + \tan \theta > 0 for -\frac{\pi}{2} < \theta < \frac{\pi}{2}, we can remove the absolute value brackets. Therefore,

y = \ln ( \sec \theta + \tan \theta )

This is what is desired.

An Alternative Map Projection

Mercator’s Projection is merely a conformal map, not a perfect map. It has several drawbacks — the scale varies greatly from place to place: as the latitude increases, the map gets stretched more and more. This severely distorts larger figures, especially areas near the poles. At a latitude greater than about 70^\circ, the map is practically unusable.

The Gall-Peters Projection is an alternative cylindrical projection:

When a map is used in navigation, it is important that angles are preserved in the map. But when a map is used to present statistical data, it’s often more important to preserve areas in the map: otherwise one may be deceived into thinking one country is bigger than another when they are actually the same size.

The Gall-Peters projection does this, trading conformity for equal-area — two countries of the same area will have the same area on the map, no matter where they are on the map. This map projection is produced by the following equations:

x = \lambda \cos 45^\circ = \frac{\lambda}{\sqrt{2}}

y = \frac{\sin \theta}{\cos 45^\circ} = \sqrt{2} \sin \theta

A discussion on why this set of equations produces an equal-area map is beyond the scope of this essay.

Conclusion

It is a tricky matter to accurately depict the surface of the earth on a flat sheet of paper. The method of cylindrical projection produces several considerably different maps.

The first map we considered uses only simple trigonometry, but the map produced is not very useful. The second map we investigate, Mercator’s Projection, preserves angles on the map. The last map we consider, the Gall-Peters projection, preserves areas on the map.

Each of these map projections are both similar and very different. They are similar in that they are all produced in some way by wrapping a cylinder around a spherical Earth. But the mathematics needed to produce them differ.

Fermat points and parameterizing the 120 degree integer triangle (Project Euler 143)

Problem 143 of Project Euler is an interesting geometry problem, and one notorious for having one of the lowest solvers to time released ratios of all the problems on the site: under 600 solvers in over 3 years as time of writing.

So what is the problem?

The problem is quite simple. We have a triangle ABC, with all angles smaller than 120^\circ. Point T is inside the triangle such that p+q+r is at its minimum (T can always be determined uniquely).

We define a Torricelli triangle as one where a, b, c, p, q, r are all integers.

Generate all Torricelli triangles with p + q + r \leq 120000.

The Fermat point

In triangle ABC, with P being the point such that PA+PB+PC is minimized, P is called the Fermat point of triangle ABC.

We can use Ptolemy’s theorem to locate the Fermat point in the triangle.

Let D be a point such that BCD is equilateral, then construct the circumcircle of BCD:

We now show that P must be at the intersection of AD and the circumcircle of \triangle BCD.

Assume that P does not lie on the circumcircle of \triangle BCD. Then, according to Ptolemy’s theorem:

BP \times CD + PC \times BD > PD \times BC

Equality only occurs when PCDB is cyclic. Since BCD is equilateral, we have BC = CD = BD so we can divide it out to get

PB + PC > PD

Adding PA to both sides:

PA + PB + PC > PA + PD

Now if P did lie on the circumcircle of \triangle BCD, we would have PA + PB + PC = PA + PD, so P must lie on the circumcircle of \triangle BCD in order to be optimal.

So we know P is on the circle. But now we assume that P is not on AD. As we had before, PA + PB + PC = PA + PD.

Next if P is not on AD, then AP + PD > AD, or by substitution,

PA + PB + PC > AD

Of course if P were actually on AD, then PA + PB + PC = AD, and the sum PA+PB+PC would be optimal.

This proves the optimal place for P to be on the intersection of AD and the circumcircle of \triangle BCD.

If we repeat this for the other three sides, we notice that P is the intersection of the three circumcircles, and also the intersection of AD, BE, and CF:

Further, we see that quadrilaterals BPCD, APCE, and FAPB are all cyclic. As \angle BDC = 60^\circ, \angle BPC = 120^\circ, and similarly \angle APC = 120^\circ and \angle APB = 120^\circ.

Designing an algorithm

We have enough information now to start working on an algorithm. Let us come back to the previous diagram:

So knowing that the central angles are all 120^\circ, we can apply the cosine law (\cos 120^\circ = -\frac{1}{2}):

a^2 = q^2 + r^2 - 2qr \cos 120

a^2 = q^2 + r^2 +qr

A similar formula applies to sides b and c. We call (x,y) a pair if x^2 + xy + y^2 is a perfect square.

We have found a Torricelli triangle if for three integers p, q, r, all of (p,q), (p,r), (q,r) are all pairs.

This leaves us with an outline of an algorithm:

  1. Generate all pairs (x,y) under the limit (with x+y < 120000 and x^2 + xy + y^2 being a square)
  2. Sort the list of pairs and index them (to be explained in step 3)
  3. For each pair (a,b) in the list, search through the list to check if there exists some c where (a,c) is a pair and (b,c) is a pair. We index the list to drastically improve searching time.
  4. If an (a,b,c) triple is found and a+b+c \leq 120000, then mark a+b+c as found. This is easily implemented as a bit array of size 120000 with all bits initially 0 and flipped to 1 when a sum is encountered.
  5. Add up the sums and output

Step 1 takes up most of the time in this algorithm. To do this by ‘brute force’, we need an outer loop from 1 up to 120000, then another inner loop again from 1 to 120000 (actually a bit less), which is essentially 120000^2 operations, which is in the billions. An implementation of this algorithm would take about 3 minutes.

Parameterizing the a^2 + ab + b^2 = c^2 equation

It is well known that all primitive triples of the form a^2 + b^2 = c^2 can be generated by a = m^2 - n^2, b = 2mn, c = m^2 + n^2 where m and n are coprime and are of opposite parity.

Can a similar parameterization be found for the equation a^2 + ab + b^2 equation?

Letting x = \frac{a}{c}, and y = \frac{b}{c} and dividing by c^2, we get

x^2 + xy + y^2 = 1

which is the equation of an ellipse.

Originally we wanted to find integer solutions to the equation a^2 + ab + b^2 = c^2. This is equivalent to finding all rational points on the ellipse x^2 + xy + y^2 = 1:

It is easy to see why: if x and y are rational points such that x^2 + xy + y^2 = 1 then x = \frac{a}{c} and y = \frac{b}{c} where a, b, c are positive integers, and we arrive back at the original form of the equation.

Also in order for a, b, c to be positive, we will only consider points on the ellipse in the first quadrant, with both x and y being positive. We do this by choosing a point on the ellipse, and from there combine the equations of the ellipse and the line.

Let us choose the point (0,-1) to be the first point of the line. Then the equation of the line is y = tx-1, where t is valid only if it is positive and greater than 1.

Substituting into the equation of the ellipse:

x^2 +tx^2 - x + t^2 x^2 - 2tx + 1 = 1

Simplifying:

x (t^2+t+1) = 2t + 1

x = \frac{2t+1}{t^2 + t + 1}

Now evaluating for y:

y = t (\frac{2t+1}{t^2 + t + 1})-1 = \frac{t^2-1}{t^2+t+1}

Notice now that for x and y to be rational, we just need the slope t to be rational. So we can write t as \frac{m}{n} where m and n are positive integers and m>n.

Simplifying the expressions for x and y:

x = \frac{2 \frac{m}{n} + 1}{(\frac{m}{n})^2 + \frac{m}{n} + 1} = \frac{2mn+n^2}{m^2 + mn + n^2}

y = \frac{(\frac{m}{n})^2 - 1}{(\frac{m}{n})^2 + \frac{m}{n} + 1} = \frac{m^2 - n^2}{m^2 + mn + n^2}

Since we defined x = \frac{a}{c} and y = \frac{b}{c}, we have a valid triple if a = 2mn+n^2, b = m^2 - n^2, and c = m^2 + mn + n^2.

If m and n are not coprime, then neither will a, b, and c be coprime as they will share a common divisor. So in order for (a,b,c) to be primitive we will need m and n to be coprime.

Still this isn’t quite sufficient. Consider what would happen if m \equiv n (\mod 3). We have a = 2mn+n^2 which can be written as n(n-m+3m) which is divisible by 3. Next b = (m+n) (m-n) which is divisible by 3. Finally, c = m^2 + mn + n^2 which can be written as (m-n)^2 + 3mn, again divisible by 3! So if m \equiv n \mod 3, then the resulting triple is not primitive!

But it turns out that we can ignore all cases where m \equiv n \mod 3. Given a triple (2mn+n^2, m^2-n^2, m^2+mn+n^2), with m \equiv n \mod 3, it is possible to find an identical parameterization giving (\frac{2mn+n^2}{3}, \frac{m^2-n^2}{3}, \frac{m^2 + mn + n^2}{3}).

As m \equiv n mod 3, we have m+2n \equiv 0 \mod 3 and also m-n \equiv 0 \mod 3. Thus we can have u and v where u and v are positive integers:

u = \frac{m+2n}{3}

v = \frac{m-n}{3}

Multiplying by 3, 3u = m+2n, and 3v = m-n. Combining the two and substituting we get

m = u+2v

n = u-v

If we substitute u and v for m and n in the triple (2mn+n^2, m^2-n^2, m^2+mn+n^2), we get:

2mn + n^2 = 3 (u^2-v^2)

m^2 - n^2 = 3(2uv + v^2)

m^2 + mn + n^2 = 3 (u^2 + uv + v^2)

So this proves that when m \equiv n \mod 3, the case is nonprimitive and has already been covered by a smaller (possibly primitive) case. We are done.

Implementation in C++

Now using the new parameterization scheme, we can generate all pairs in about n operations instead of n^2 operations. So we can loop up to \sqrt{120000}, which cuts down the time down to about 2 seconds:


#include <cstdio>
#include <vector>
#include <algorithm>
#include <utility>
using namespace std;

// Limit and square root of limit
#define L 120000
#define LSQ 347
typedef long long int int64;

int64 gcd(int64 a, int64 b)
{
  int64 c = a % b;
  while(c != 0)
  {
    a = b;
    b = c;
    c = a % b;
  }
  return b;
}

int main()
{
  // Store pairs in here
  vector< pair<int64,int64> > pairs;

  // Use the parameterization
  for(int64 u=1; u<LSQ; u++){
    for(int64 v=1; v<u; v++){
      if(gcd(u,v) != 1) continue;
      if((u-v) % 3 == 0) continue;
      int64 a = 2*u*v + v*v;
      int64 b = u*u - v*v;
      if(a+b > L) break;
      // From coprime pairs make composite pairs
      for(int k=1; k*(a+b)<L; k++){
        pairs.push_back(make_pair(k*a,k*b));
        pairs.push_back(make_pair(k*b,k*a));
      }
    }
  }

  // Sort pairs list
  sort(pairs.begin(),pairs.end());

  // Create index
  int index[L];
  for(int i=0; i<L; i++) index[i] = -1;
  for(int i=0; i<pairs.size(); i++){
    if(index[pairs[i].first] == -1)
      index[pairs[i].first] = i;
  }

  // Which sums have been reached?
  bool sums[L];
  for(int i=0; i<L; i++) sums[i] = false;

  // Iterate through all pairs
  for(int i=0; i<pairs.size(); i++){
    int64 a = pairs[i].first;
    int64 b = pairs[i].second;

    // Construct vectors for indices
    vector<int64> va;
    vector<int64> vb;

    // Fetch indices
    int ia = index[a];
    int ib = index[b];

    while(ia<pairs.size()){
      pair<int64,int64> next = pairs[ia];
      if(next.first != a) break;
      va.push_back(next.second);
      ia++;
    }

    while(ib<pairs.size()){
      pair<int64,int64> next = pairs[ib];
      if(next.first != b) break;
      vb.push_back(next.second);
      ib++;
    }

    // Find common objects between va and vb
    for(int ja=0; ja<va.size(); ja++){
      for(int jb=0; jb<vb.size(); jb++){
        if(va[ja]==vb[jb]){
          // Potential c found
          int64 c = va[ja];
          if(a+b+c<L) sums[a+b+c] = true;
        }
      }
    }
  }

  // Tally up sums
  int64 s = 0;
  for(int i=0; i<L; i++)
    if(sums[i]) s+=i;

  printf("%lld\n",s);
  return 0;
}