CS488 Final Project: OpenGL Boat Game

Here’s something I’ve been working on for the past few weeks for one of my courses, CS488 – Intro to Computer Graphics. For the final project, you’re allowed to do any OpenGL or raytracing project, as long as it has 10 reasonable graphics related objectives. Here’s a video of mine:

A screenshot:

It’s a simple game where you control a boat and go around a lake collecting coins. When you collect a coin, there’s a bomb that spawns and follows you around. You die when you hit a bomb. Also if two bombs collide then they both explode (although you can’t see that in the video).

Everything is implemented in bare-metal OpenGL, so none of those modern game engines or physics engines. It’s around 1000-ish lines of C++ (difficult to count because there’s a lot of donated code).

Edit (8/10/2016) – I received an Honorable Mention for this project!

CS488 – Introduction to Computer Graphics

For those that haven’t heard about CS488, it’s one of the “big three” — fourth year CS courses with the heaviest workload and with large projects (the other two being Real-time and Compilers). It’s one of the hardest courses at Waterloo, but also probably the most rewarding and satisfying course I’ve taken.

There are four assignments, each walking you step by step through graphics techniques, like drawing a cube with OpenGL, or building a puppet with hierarchical modelling, or writing a simple ray tracer. Then there’s the final project, where you can choose to make something with OpenGL or extend your ray tracer. The class is split 50/50, about half the class did OpenGL and the other half did a ray tracer. I personally feel that OpenGL gives you more room to be creative and create something unique whereas ray tracing projects end up implementing a mix of different algorithms.

The first two assignments weren’t too bad (I estimate it took me about 10 hours each), but some time during assignment 3 I realized I was spending a lot of time in the lab, so I got an hours tracking app on my phone to track exactly how much time I was spending working on this course. Assignments 3 and 4 each took me 15 hours. I spent 35 hours on my final project, over a period of 3 weeks. I chose relatively easy objectives that I was confident I could do well, which left time to polish the game and do a few extra objectives. I’m not sure what the average is for time spent on the final project, but it’s common to spend 50-100 hours. Bottom line: you can put in potentially unbounded amounts of time to try to get the gold medal, but the effort actually required to get a good grade is quite reasonable.

Now the bad part about this course (obviously not the instructor’s fault) is OpenGL is so incredibly difficult to work with. Even to draw a line on the screen, you have to deal with a lot of low level concepts like vertex array objects, vertex buffer objects, uniform attributes to pass to shaders, stuff like that. It doesn’t help that when something goes wrong in a shader (which runs on the GPU), there’s no way to pass an error message back to the CPU so you can print out variables and debug it. It also doesn’t help that there’s a lot of incompatible OpenGL versions, and code you find in an online tutorial could be subtly broken for the version you’re using. On the other hand, working with OpenGL really makes you appreciate modern game engines like Unity which takes care of all the low level stuff for you.

Visualizing Quaternions with Unity

How do you model the position and orientation of an airplane?

Position is easy, just represent it with a point in 3D space. But how do you specify its orientation — which direction it’s pointing?

At first glance, it seems a vector will do. After all, a vector points in some direction, right? If the plane is pointing east, represent its orientation by a unit vector pointing east.

Unfortunately, we quickly run into trouble when we try to roll. If we’re facing east, and we roll 90 degrees, we’re still facing east. Clearly we’re missing something.

Euler Angles

When real pilots talk about their orientation, they talk about roll, yaw, pitch. Pitch is going up or down, yaw is going left or right, roll is, well, roll.

Any change in orientation can be described by some combination of roll, yaw, pitch. This is the basis for Euler Angles. We use three angles to represent the airplane’s orientation.

This is all fine and dandy if we want to represent the orientation of a static object in space. But when we try to adjust our orientation, we start to run into problems.

You’re thinking, this should be simple! When we turn left or right, we just increment the yaw variable, right? Yes, it seems to work, at least initially. You can turn left and right, up and down, and roll around.

Implement it in Unity and play around a bit, however, and you begin to notice that things don’t quite behave the way you expect.

In this animation, I’m holding down the right button:

The plane does rotate to the right, but it’s not rotating relative to itself. Instead it’s rotating around some invisible y-axis. If it was rotating relative to itself, the green arrow shouldn’t be moving.

The problem becomes more and more severe when the pitch of the plane becomes higher and higher. The worst case is when the airplane is pointing straight up: then roll and yaw become the same thing! This is called gimbal lock: we have lost a degree of freedom and we can only rotate in 2 dimensions! Definitely not something desirable if we’re controlling a plane or spaceship.

It turns out that no matter what we do, we will suffer from some form of gimbal lock. As long as we use Euler Angles, there is one direction where if we turn too far, everything starts to screw up.

Practical Introduction to Quaternions

All is not lost, however. There is a way to represent orientation that represents all axes equally and does not suffer from gimbal lock. This mythical structure is called the quaternion. Unlike Euler Angles which describe your orientation relative to a fixed set of axes, quaternions do not rely on any fixed axis.

The drawback is that quaternions are unintuitive to understand for humans. There is no way to “look” at a quaternion and be able to visualize what rotation it represents. Fortunately for us, it’s not that difficult to make use of quaternions, even if we can’t visualize quaternions.

There is a lot of theory behind how quaternions work, but in this article, I will gloss over the theory and give a quick primer to quaternions, just the most common facts you need to use them. At the same time, I will implement the operations I describe in C#, so I can integrate them with Unity. If you don’t know C#, you can freely ignore the code.


A quaternion is an ordered pair of 4 real numbers (w,x,y,z). We write this as


The letters i,j,k are not variables. Rather, they are independent axes. If you like, you can think of the quaternions as a 4 dimensional vector space.

The defining property of quaternions is:

i^2 = j^2 = k^2 = ijk = -1

Play around with it a bit and you can derive 6 more identites:

ij = k

jk = i

ki = j

ji = -k

kj = -i

ik = -j

If you’ve worked with complex numbers, this should seem familiar. Instead of 2 parts of a complex number (the real and imaginary parts), we have 4 parts for a quaternion.

The similarity doesn’t end here. Multiplying complex numbers represents a rotation in 2 dimensions. Similarly, multiplying by a quaternion represents a rotation in 3D.

One curious thing to note: we have ij=k and ji=-k. We switched around the terms and the product changed. This means that multiplying quaternions is kind of like multiplying matrices — the order matters. So multiplication is not commutative.

Here’s a framework for a quaternion in C#:

public class Quat{
	// Represents w + xi + yj + zk
	public float w, x, y, z;
	public Quat(float w, float x, float y, float z){
		this.w = w;
		this.x = x;
		this.y = y;
		this.z = z;

Normalizing Quaternions

The norm of a quaternion is

N(\mathbf{q}) = \sqrt{w^2+x^2+y^2+z^2}

When we use quaternions to represent rotations, we typically want unit quaternions: quaternions with norm 1. This is straightforward: to normalize a quaternion, divide each component by the norm.

In C#:

public float Norm(){
  return Mathf.Sqrt (w * w + x * x + y * y + z * z);

public Quat Normalize(){
  float m = Norm ();
  return new Quat (w / m, x / m, y / m, z / m);

Multiplying Quaternions

Multiplying is simple, just a little tedious. If we have two quaternions:

(w_1 + x_1i + y_1j + z_1k) (w_2+x_2i+y_2j+z_2k)

Then their product is this ugly mess:

\begin{array}{l} w_1w_2-x_1x_2-y_1y_2-z_1z_2 \\ + (w_1x_2+x_1w_2+y_1z_2-z_1y_2)i \\ + (w_1y_2+y_1w_2-x_1z_2+z_1x_2) j \\ + (w_1z_2+z_1w_2+x_1y_2-y_1x_2) k \end{array}

In C#:

// Returns a*b
public static Quat Multiply(Quat a, Quat b){
  float w = a.w * b.w - a.x * b.x - a.y * b.y - a.z * b.z;
  float x = a.w * b.x + a.x * b.w + a.y * b.z - a.z * b.y;
  float y = a.w * b.y + a.y * b.w - a.x * b.z + a.z * b.x;
  float z = a.w * b.z + a.z * b.w + a.x * b.y - a.y * b.x;
  return new Quat (w,x,y,z).Normalize();

Since multiplication is not commutative, I made this function static to avoid confusing left and right multiplication. Also, I normalize the product so that floating point errors don’t accumulate.

Constructing Rotation Quaternions

Every rotation operation can be written as a rotation of some angle, \theta, around some vector (u_x, u_y, u_z):

The following formula gives a quaternion that represents this rotation:

\mathbf{q} = \cos \frac{\theta}{2} + (u_x i + u_y j + u_z k) \sin \frac{\theta}{2}

For our purposes, \theta is a very small number, say 0.01, and we use one of the three basis vectors to rotate around. For example, if we are rotating around (1,0,0) then our quaternion is

\cos \frac{0.01}{2} + \sin \frac{0.01}{2}i

That’s it: given any quaternion, multiplying on the left by our quaternion rotates it slightly around the x axis.

In C#, our code might look like this:

Quat qx = new Quat (Mathf.Cos (0.01 / 2), 0, 0, Mathf.Sin (0.01 / 2));
Quat qy = new Quat (Mathf.Cos (0.01 / 2), 0, Mathf.Sin (0.01 / 2), 0);
Quat qz = new Quat (Mathf.Cos (0.01 / 2), Mathf.Sin (0.01 / 2), 0, 0);

Putting it together

That’s all we need to do interesting things with quaternions. Let’s combine everything we have. Here’s our quaternion class thus far:

public class Quat{
	// Represents w + xi + yj + zk
	public float w, x, y, z;
	public Quat(float w, float x, float y, float z){
		this.w = w;
		this.x = x;
		this.y = y;
		this.z = z;

	public float Norm(){
		return Mathf.Sqrt (w * w + x * x + y * y + z * z);

	public Quat Normalize(){
		float m = Norm ();
		return new Quat (w / m, x / m, y / m, z / m);

	// Returns a*b
	public static Quat Multiply(Quat a, Quat b){
		float w = a.w * b.w - a.x * b.x - a.y * b.y - a.z * b.z;
		float x = a.w * b.x + a.x * b.w + a.y * b.z - a.z * b.y;
		float y = a.w * b.y + a.y * b.w - a.x * b.z + a.z * b.x;
		float z = a.w * b.z + a.z * b.w + a.x * b.y - a.y * b.x;
		return new Quat (w,x,y,z).Normalize();

	public Quaternion ToUnityQuaternion(){
		return new Quaternion (w, x, y, z);

Now we just need to read the input, perform our calculations, and output the rotation quaternion to Unity:

public class Airplane : MonoBehaviour {
  public GameObject airplane;
  public Quat quat = new Quat (0, 0, 0, -1);
  public float speed = 0.01f;

  void FixedUpdate(){
    float inputX = Input.GetAxis("UpDown");
    float inputY = Input.GetAxis("LeftRight");
    float inputZ = Input.GetAxis("Roll");

    Quat qx = new Quat (Mathf.Cos (speed * inputX / 2), 0, 0, Mathf.Sin (speed * inputX / 2));
    Quat qy = new Quat (Mathf.Cos (speed * inputY / 2), 0, Mathf.Sin (speed * inputY / 2), 0);
    Quat qz = new Quat (Mathf.Cos (speed * inputZ / 2), Mathf.Sin (speed * inputZ / 2), 0, 0);

    quat = Quat.Multiply (qx, quat);
    quat = Quat.Multiply (qy, quat);
    quat = Quat.Multiply (qz, quat);

    airplane.transform.rotation = quat.ToUnityQuaternion ();

In Unity, the input is not given to us as a single true/false value, but a float between -1 and 1. So holding right increases the LeftRight input gradually until it reaches 1, avoiding a sudden jump in movement.

What’s ToUnityQuaternion? Well, it turns out that Unity already has a Quaternion class that does everything here and much more, so all this could have literally been implemented in one line if we wanted.

Anyways, let’s see the result.

As you can see, holding right turns the plane relative to itself now, and the green arrow stays still. Hooray!

A CMOQR Problem and why not to Trust Brute Force

Recently I was invited to compete in the CMOQR — a qualifier contest for the Canadian Math Olympiad. The contest consisted of eight problems, and contestants were allowed about a week’s time to submit written solutions via email.

After a few days, I was able to solve all of the problems except one — the second part of the seventh problem:

Seven people participate in a tournament, in which each pair of players play one game, and one player is declared the winner and the other the loser. A triplet ABC is considered cyclic if A beats B, B beats C, and C beats A.

Can you always separate the seven players into two rooms, so that neither room contains a cyclic triplet?

(Note: the first half of the problem asked the same question for six people — and it’s not too difficult to prove that no matter what, we can put them into two rooms so that neither the first nor the second room contains a cyclic triplet.)

But what happens when we add another person? Can we still put four people in one room, and three people in the other, so that neither rooms contain a cyclic triplet?

There are two possibilities here:

  • One, it’s always possible. No matter what combinations of wins and losses have occurred, we can always separate them into two rooms in such a way. To prove this, we’ll need to systematically consider all possible combinations, and one by one, verify that the statement is possible for each of the cases.
  • Two, it’s not always possible. Then there is some counterexample — some combination of wins and losses so that no matter how we separate them, one of the rooms has a cyclic triplet. This is easier to prove: provided that we have the counterexample, we just have to verify that indeed, this case is a counterexample to the statement.

But there’s a problem. Which of the cases does the solution fall into? That is, should we look for a quick solution by counterexample, or look for some mathematical invariant that no counterexample can exist?

Brute Force?

It would be really helpful if we knew the counterexample, or knew for sure what the counterexample was. What if we wrote a computer program to check all the cases? After all, there are only 7 people in the problem, and 7 choose 2 or 21 games played. Then since each game is either won by one player or the other, there are only 2^21 combinations overall (although that does count some duplicates). And 2^21 is slightly over two million cases to check — completely within the bounds of brute force.

So I coded up a possibility-checker. Generate all 2^21 possible arrangements, then for each one, check all possible ways to separate them into two rooms. If it turns out that no matter how we arrange them, a cyclic triplet persists, then display the counterexample. Simple.

I ran the program. It quickly cycled through every possible arrangement, three seconds later exiting without producing a counterexample.

Alright. So there’s no counterexample. I would have to find some nice mathematical invariant, showing that no matter what, there is always some way to group the players so that neither room has a cyclic triplet.

But no such invariant came. I tried several things, but in each attempt couldn’t quite show that the statement held for every case. I knew that there was no counterexample, but I couldn’t prove it. But why? There must be some tricky way to show that no counterexample existed; whatever it was, I couldn’t find it.

Brute Force poorly implemented

Reluctantly, as the deadline came and passed, I submitted my set of solutions without solving the problem. When the solutions came out a week later, the solution to this problem did not contain any tricky way to disprove the counterexample. Instead, what I found was this:

Let A_0 \ldots A_6 be seven players. Let A_i beat A_j when the difference i-j \equiv 1,2,4 \mod 7.

Huh? A counterexample, really? Let’s look at it.

Everything is symmetric — we can ‘cycle’ the players around without changing anything. Also, if we take four players, two of them are consecutive. Let them be A_0 and A_1.

At this point everything falls into place: in any subset of four players, three of them are cyclic.

But wait … my program had not found any counterexamples! And right here is a counterexample! The culprit was obvious (the reader may have foreseen this by now) — of course, there had to be a problem with my program.

Running my code through a debugger, I found a logic error in the routine converting binary numbers to array configurations, meaning that not all possible configurations were tried. As a result, the counterexample slipped through the hole.

After fixing the code, the program found not one, but a total of 7520 (although not necessarily distinct) counterexamples. Most of them had no elegant structure, but the solution’s configuration was among them.

For the interested, here is the fixed code.

When to Start Over?

It is true that the program could have been better written, better debugged. But how could you know whether a counterexample existed and your program didn’t find it, or if no counterexample existed at all?

In hindsight, it seems that writing the brute force program made me worse off than if I hadn’t written it at all. After the program ran without finding a single counterexample, I was confident that no counterexample existed, and set out about proving that, instead of looking for counterexamples or symmetry.

When you are stuck on such a math problem — that is, after making a bit of progress you get stuck — it might be profitable to start over. More often than I would like, I prove a series of neat things, without being able to prove the desired result. Then a look at the solutions manual reveals that a very short solution — one or two steps — lay in the opposite direction.

I’ll put an end to my philosophical musings of the day. Fortunately, the cutoff for the CMOQR was low enough that even without solving every single problem, I was still able to meet the cutoff.

Blatantly abusing the Windows search function: a very lazy note-taking idea

Often, as I am minding my daily business, I would suddenly have an interesting thought, or some idea. Being likely to forget it a few minutes later, I would want to write the idea down. But being a lazy and slightly disorganized person, I would grab the nearest pencil and scrap paper, scribble down a few words, and throw the piece of paper aside somewhere.

Usually I would never retrieve that piece of paper again. But occasionally I do want to read the note I wrote. After a few minutes of searching I would usually find the piece of paper I needed. If I didn’t, oh well, it probably wasn’t that important anyway.

But, as you can imagine, this quickly gets out of hand.

Now you would like to do do the note taking on the computer, instead on small pieces of paper. After all, losing files is harder on the computer right?

But how would you do this? In the computer world it’s a bit harder to ‘throw aside’ a file. You have to give the file a name, and put it in some directory, essentially forcing you to be organized. Or if you don’t, files clutter up your computer desktop even faster than sheets of paper clutter up your actual desk.

So being the geek I am, I wrote up my experimental, software solution. Then writing a note would go as follows. You open up command prompt, and enter the note command:

Type up whatever you want, and then hit save:

So your note is saved in some obscure directory of your choosing. Now the Windows 7 search is pretty good (Windows Vista works too) that as long as your directory is indexed, Windows-F and typing pretty much anything in the search bar would quickly bring you to the file (it searches file contents):

Searching for “phone ashley” takes me directly to the note. The program just takes what we’ve written, and saves it in a unique file that also has a datestamp on it. For example the first note written on a day would have _1 appended to it, then _2, and so on.

I’m not yet sure how practical this idea would be, but here’s the C++ code for my program:

#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <ctime>
using namespace std;

// Folder that we are going to write notes in
string NPATH = "C:/home/notes/";

// Convert integer to string
string istr(int i)
  stringstream ss;
  ss << i;
  return ss.str();

// Convert time to displayable form with year, month, day, for example
// 2010-01-01. Always 10 characters long.
string formTime(struct tm * cur)
  char s[11];
  sprintf(s, "%04d-%02d-%02d",
    cur->tm_year + 1900, // Number of years since 1900
    cur->tm_mon + 1,     // Month, starting at 0
    cur->tm_mday);       // Day of month
  return s;

// Does the file at this path exist, or not?
bool existsFile(string path)
  ifstream ifs(path.c_str());
  return ifs;

void write(string data)
  // Get current time
  time_t curr_t = time(NULL);
  tm * curr = localtime(&curr_t);

  // 1 might be already occupied, or 2, so fill first unoccupied one
  int ext = 1;
  string filebase = NPATH + formTime(curr);
  string flocation;
    flocation = filebase + "_" + istr(ext) + ".txt";

  // Write everything to the file
  ofstream fout;
  fout << data;

int main()
  stringstream ss;

  // Accept input until line is "save" or "exit"
  string s;
    if(s == "save") break;
    if(s == "exit") return 0;
    ss << s << '\n';

  // Write everything and exit
  return 0;

Fermat points and parameterizing the 120 degree integer triangle (Project Euler 143)

Problem 143 of Project Euler is an interesting geometry problem, and one notorious for having one of the lowest solvers to time released ratios of all the problems on the site: under 600 solvers in over 3 years as time of writing.

So what is the problem?

The problem is quite simple. We have a triangle ABC, with all angles smaller than 120^\circ. Point T is inside the triangle such that p+q+r is at its minimum (T can always be determined uniquely).

We define a Torricelli triangle as one where a, b, c, p, q, r are all integers.

Generate all Torricelli triangles with p + q + r \leq 120000.

The Fermat point

In triangle ABC, with P being the point such that PA+PB+PC is minimized, P is called the Fermat point of triangle ABC.

We can use Ptolemy’s theorem to locate the Fermat point in the triangle.

Let D be a point such that BCD is equilateral, then construct the circumcircle of BCD:

We now show that P must be at the intersection of AD and the circumcircle of \triangle BCD.

Assume that P does not lie on the circumcircle of \triangle BCD. Then, according to Ptolemy’s theorem:

BP \times CD + PC \times BD > PD \times BC

Equality only occurs when PCDB is cyclic. Since BCD is equilateral, we have BC = CD = BD so we can divide it out to get

PB + PC > PD

Adding PA to both sides:

PA + PB + PC > PA + PD

Now if P did lie on the circumcircle of \triangle BCD, we would have PA + PB + PC = PA + PD, so P must lie on the circumcircle of \triangle BCD in order to be optimal.

So we know P is on the circle. But now we assume that P is not on AD. As we had before, PA + PB + PC = PA + PD.

Next if P is not on AD, then AP + PD > AD, or by substitution,

PA + PB + PC > AD

Of course if P were actually on AD, then PA + PB + PC = AD, and the sum PA+PB+PC would be optimal.

This proves the optimal place for P to be on the intersection of AD and the circumcircle of \triangle BCD.

If we repeat this for the other three sides, we notice that P is the intersection of the three circumcircles, and also the intersection of AD, BE, and CF:

Further, we see that quadrilaterals BPCD, APCE, and FAPB are all cyclic. As \angle BDC = 60^\circ, \angle BPC = 120^\circ, and similarly \angle APC = 120^\circ and \angle APB = 120^\circ.

Designing an algorithm

We have enough information now to start working on an algorithm. Let us come back to the previous diagram:

So knowing that the central angles are all 120^\circ, we can apply the cosine law (\cos 120^\circ = -\frac{1}{2}):

a^2 = q^2 + r^2 - 2qr \cos 120

a^2 = q^2 + r^2 +qr

A similar formula applies to sides b and c. We call (x,y) a pair if x^2 + xy + y^2 is a perfect square.

We have found a Torricelli triangle if for three integers p, q, r, all of (p,q), (p,r), (q,r) are all pairs.

This leaves us with an outline of an algorithm:

  1. Generate all pairs (x,y) under the limit (with x+y < 120000 and x^2 + xy + y^2 being a square)
  2. Sort the list of pairs and index them (to be explained in step 3)
  3. For each pair (a,b) in the list, search through the list to check if there exists some c where (a,c) is a pair and (b,c) is a pair. We index the list to drastically improve searching time.
  4. If an (a,b,c) triple is found and a+b+c \leq 120000, then mark a+b+c as found. This is easily implemented as a bit array of size 120000 with all bits initially 0 and flipped to 1 when a sum is encountered.
  5. Add up the sums and output

Step 1 takes up most of the time in this algorithm. To do this by ‘brute force’, we need an outer loop from 1 up to 120000, then another inner loop again from 1 to 120000 (actually a bit less), which is essentially 120000^2 operations, which is in the billions. An implementation of this algorithm would take about 3 minutes.

Parameterizing the a^2 + ab + b^2 = c^2 equation

It is well known that all primitive triples of the form a^2 + b^2 = c^2 can be generated by a = m^2 - n^2, b = 2mn, c = m^2 + n^2 where m and n are coprime and are of opposite parity.

Can a similar parameterization be found for the equation a^2 + ab + b^2 equation?

Letting x = \frac{a}{c}, and y = \frac{b}{c} and dividing by c^2, we get

x^2 + xy + y^2 = 1

which is the equation of an ellipse.

Originally we wanted to find integer solutions to the equation a^2 + ab + b^2 = c^2. This is equivalent to finding all rational points on the ellipse x^2 + xy + y^2 = 1:

It is easy to see why: if x and y are rational points such that x^2 + xy + y^2 = 1 then x = \frac{a}{c} and y = \frac{b}{c} where a, b, c are positive integers, and we arrive back at the original form of the equation.

Also in order for a, b, c to be positive, we will only consider points on the ellipse in the first quadrant, with both x and y being positive. We do this by choosing a point on the ellipse, and from there combine the equations of the ellipse and the line.

Let us choose the point (0,-1) to be the first point of the line. Then the equation of the line is y = tx-1, where t is valid only if it is positive and greater than 1.

Substituting into the equation of the ellipse:

x^2 +tx^2 - x + t^2 x^2 - 2tx + 1 = 1


x (t^2+t+1) = 2t + 1

x = \frac{2t+1}{t^2 + t + 1}

Now evaluating for y:

y = t (\frac{2t+1}{t^2 + t + 1})-1 = \frac{t^2-1}{t^2+t+1}

Notice now that for x and y to be rational, we just need the slope t to be rational. So we can write t as \frac{m}{n} where m and n are positive integers and m>n.

Simplifying the expressions for x and y:

x = \frac{2 \frac{m}{n} + 1}{(\frac{m}{n})^2 + \frac{m}{n} + 1} = \frac{2mn+n^2}{m^2 + mn + n^2}

y = \frac{(\frac{m}{n})^2 - 1}{(\frac{m}{n})^2 + \frac{m}{n} + 1} = \frac{m^2 - n^2}{m^2 + mn + n^2}

Since we defined x = \frac{a}{c} and y = \frac{b}{c}, we have a valid triple if a = 2mn+n^2, b = m^2 - n^2, and c = m^2 + mn + n^2.

If m and n are not coprime, then neither will a, b, and c be coprime as they will share a common divisor. So in order for (a,b,c) to be primitive we will need m and n to be coprime.

Still this isn’t quite sufficient. Consider what would happen if m \equiv n (\mod 3). We have a = 2mn+n^2 which can be written as n(n-m+3m) which is divisible by 3. Next b = (m+n) (m-n) which is divisible by 3. Finally, c = m^2 + mn + n^2 which can be written as (m-n)^2 + 3mn, again divisible by 3! So if m \equiv n \mod 3, then the resulting triple is not primitive!

But it turns out that we can ignore all cases where m \equiv n \mod 3. Given a triple (2mn+n^2, m^2-n^2, m^2+mn+n^2), with m \equiv n \mod 3, it is possible to find an identical parameterization giving (\frac{2mn+n^2}{3}, \frac{m^2-n^2}{3}, \frac{m^2 + mn + n^2}{3}).

As m \equiv n mod 3, we have m+2n \equiv 0 \mod 3 and also m-n \equiv 0 \mod 3. Thus we can have u and v where u and v are positive integers:

u = \frac{m+2n}{3}

v = \frac{m-n}{3}

Multiplying by 3, 3u = m+2n, and 3v = m-n. Combining the two and substituting we get

m = u+2v

n = u-v

If we substitute u and v for m and n in the triple (2mn+n^2, m^2-n^2, m^2+mn+n^2), we get:

2mn + n^2 = 3 (u^2-v^2)

m^2 - n^2 = 3(2uv + v^2)

m^2 + mn + n^2 = 3 (u^2 + uv + v^2)

So this proves that when m \equiv n \mod 3, the case is nonprimitive and has already been covered by a smaller (possibly primitive) case. We are done.

Implementation in C++

Now using the new parameterization scheme, we can generate all pairs in about n operations instead of n^2 operations. So we can loop up to \sqrt{120000}, which cuts down the time down to about 2 seconds:

#include <cstdio>
#include <vector>
#include <algorithm>
#include <utility>
using namespace std;

// Limit and square root of limit
#define L 120000
#define LSQ 347
typedef long long int int64;

int64 gcd(int64 a, int64 b)
  int64 c = a % b;
  while(c != 0)
    a = b;
    b = c;
    c = a % b;
  return b;

int main()
  // Store pairs in here
  vector< pair<int64,int64> > pairs;

  // Use the parameterization
  for(int64 u=1; u<LSQ; u++){
    for(int64 v=1; v<u; v++){
      if(gcd(u,v) != 1) continue;
      if((u-v) % 3 == 0) continue;
      int64 a = 2*u*v + v*v;
      int64 b = u*u - v*v;
      if(a+b > L) break;
      // From coprime pairs make composite pairs
      for(int k=1; k*(a+b)<L; k++){

  // Sort pairs list

  // Create index
  int index[L];
  for(int i=0; i<L; i++) index[i] = -1;
  for(int i=0; i<pairs.size(); i++){
    if(index[pairs[i].first] == -1)
      index[pairs[i].first] = i;

  // Which sums have been reached?
  bool sums[L];
  for(int i=0; i<L; i++) sums[i] = false;

  // Iterate through all pairs
  for(int i=0; i<pairs.size(); i++){
    int64 a = pairs[i].first;
    int64 b = pairs[i].second;

    // Construct vectors for indices
    vector<int64> va;
    vector<int64> vb;

    // Fetch indices
    int ia = index[a];
    int ib = index[b];

      pair<int64,int64> next = pairs[ia];
      if(next.first != a) break;

      pair<int64,int64> next = pairs[ib];
      if(next.first != b) break;

    // Find common objects between va and vb
    for(int ja=0; ja<va.size(); ja++){
      for(int jb=0; jb<vb.size(); jb++){
          // Potential c found
          int64 c = va[ja];
          if(a+b+c<L) sums[a+b+c] = true;

  // Tally up sums
  int64 s = 0;
  for(int i=0; i<L; i++)
    if(sums[i]) s+=i;

  return 0;

Point in a polygon

An important, yet basic problem in computational geometry is the following task: given a polygon (a sequence of points) and a point, determine if the point is inside, or outside the polygon.

To a human eye, this problem is trivial. After all, it seems obvious that point A is outside the polygon, whereas point B is inside it.

Computers, on the other hand, lack this simple intuition. It is not entirely obvious how to go about creating an algorithm that correctly determines this.

Triangles containing the origin

Let us first focus on an easier version of the problem. Instead of any arbitrary polygon, let our polygon be limited to three points, that is, a triangle. Instead of any point in space, let us determine if the triangle contains the origin.

We observe that we can draw a ray from our point outwards. Any direction will do. If the point is inside the triangle, then the ray will eventually hit a side of it.

On the other hand, if the point is outside the triangle, the ray may hit two sides of the triangle, or it may hit none. However, it will never hit exactly one side:

Aha; this is the approach. Now we can transpose our diagram to a Cartesian plane. For simplicity, let our ray be a line directly upwards from the origin, that is, the entire y-axis above the origin.

Our algorithm now takes the three line segments of the triangle, and counts how many of them intersect the y-axis above the origin.

There are three cases. In the first case, the triangle doesn’t intersect the y-intercept at all above the origin, thus the triangle doesn’t contain the origin:

In the second scenario, the triangle intersects the y-axis twice above the origin, in which case the triangle still does not contain the origin:

In the last case, the triangle intersects the y-axis once above the origin, in which case the origin is in the triangle:

So the algorithm is, for every line segment (pair of coordinates) in the triangle:

  1. Calculate the y-intercept, say, b
  2. If b<0 this line segment doesn’t cross the y-axis above the origin so throw it away. Otherwise, continue.
  3. Make sure that the line segment actually crosses the y-axis at all. Or, check that the two points are on opposite sides of the y-axis.

Here’s my implementation of the algorithm in C++:

// Slope of a line
double slope(double x1, double y1, double x2, double y2){
  return (y1-y2) / (x1-x2);

// Y intercept given slope and point
double yint(double px, double py, double m){
  return py - m*px;

// Determine if a line crosses the y axis
bool cyax(double ax, double bx){
  if(ax<0 && bx>0) return true;
  if(bx<0 && ax>0) return true;
  return false;

// Contains origin, or not?
// Input is (A_x, A_y, B_x, B_y, C_x, C_y).
bool cto(double ax, double ay, double bx, double by, double cx, double cy){

  // Find slopes
  double mAB = slope(ax,ay,bx,by);
  double mBC = slope(bx,by,cx,cy);
  double mAC = slope(ax,ay,cx,cy);

  // Find y intercepts
  double yAB = yint(ax,ay,mAB);
  double yBC = yint(bx,by,mBC);
  double yAC = yint(cx,cy,mAC);

  // How many times it crosses the y intercept above 0
  int c = 0;

  if(yAB>0 && cyax(ax,bx)) c++;
  if(yBC>0 && cyax(bx,cx)) c++;
  if(yAC>0 && cyax(ax,cx)) c++;

  return c==1;

Generalizing the algorithm

We can proceed to generalize the algorithm to not only triangles containing the origin. First, for the problem of determining if an arbitrary point is in a triangle, we can create a bijection:

Translate all the points so that the test point lies on the origin. The position of this point relative to the triangle is exactly the same, so we can proceed with the y-intercept algorithm.

For a polygon with more than 3 points, the same algorithm applies. Sort of. With more complicated polygons, we may encounter this edge case:

Here the ray crosses the polygon three times.

There’s an easy fix to this problem, however. It turns out that if the ray crosses the polygon an odd number of times, then the point is inside the polygon.

Some mathematical and algorithmic results on a simple puzzle game

I once came across this simple puzzle, or game.

You have a board of 7 squares in a single row, three black, and three red pieces. At the initialization of the game, the three black pieces are placed on the first (leftmost) three squares, and the three red pieces on the last (rightmost) three squares. This leaves one square in the middle.

The aim is to get all the red pieces onto the left side, and the black pieces on the right side, or in other words to reverse the position of all the pieces. This is done by executing a sequence of two possible moves:

One, a piece may move onto an empty square adjacent to it.

Two, a piece may hop over a piece of the opposite color adjacent to it, provided that the square on the other side of the second piece is empty.

Furthermore a move may only bring a piece closer to its destination; that is, it is not allowed to backtrack.

This particular instance of the problem (of three pieces each and seven squares) can be completed in 15 moves:

While this sequence completes the puzzle in 15 moves, it is not so obvious that this is the shortest sequence, that there is no shorter sequence that completes the puzzle in fewer than 15 moves. It is possible to prove a generalized result mathematically.

A formula for the number of moves

Let us generalize the game. We define an n-game to be an instance of the puzzle, with n pieces of each color separated by an empty square. For an n-game, the board contains 2n+1 squares.

Proposition: to complete an n-game, exactly n(n+2) moves are required.

In order for the n black pieces to completely trade places with the n red pieces, each black piece must hop over, or be hopped over by n red pieces.

Consider the pieces to be paired up as follows:

In the sequence of the game, each piece trades places with its partner. (Note that this is correct because the order of pieces of the same color cannot change).

It can be seen that the relative displacement between two members of a pair is 2(n+1). This is because each piece travels n+1 squares to get to its destination. In other words a piece travels 2(n+1) squares relative to its partner.

A move by either member of a pair adds a relative displacement of 1, while a hop adds a relative displacement of 2.

We have already shown that each pair must have, between them, n hops. This is equivalent to a relative displacement of 2n. Two additional moves are required to bring the relative displacement up to 2(n+1). Therefore, n+2 moves are required for a pair to exchange places.

As there are n pairs in the game, the total number of moves required to complete the game is n(n+2), as desired.

Implementation of a solver

Using the technique of a recursive depth-first search, it is a fairly easy programming exercise to write a solver for this puzzle. The solver produces the actual sequence of moves to complete an n-game.

The source code, in C:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// Number of tokens on each side
#define N 3

// Destination board
char endbd[2*N+1];

// Board representation
char bd[2*N+1];

// Moves found
char **mvs;

// Flag to indicate if recursion has finished
char cont = 1;

// Perform dfs given n byte array
void dfs(int nmvs){
    int i, l;
    char* tm = malloc(20);

    // Found, dump move list
        for(i=0;i<nmvs;i++) printf("%d. %s\n",i+1,mvs[i]);
        cont = 0;

    if(!cont) return;

    // Check for 1-moves
        if(bd[i]!=1) continue;

        // Try moving a square

            // Generate notation for piece move
            l=sprintf(tm,"%d -> %d",i+1,i+2);

            // Copy to move list

            // Continue search

            // Reset to previous state

        // Try hopping
        if(i<2*N-1 && bd[i+1]==2 && bd[i+2]==0){
            l=sprintf(tm,"%d -> %d",i+1,i+3);

    // Check for 2-moves
        if(bd[i]!=2) continue;
            l=sprintf(tm,"%d -> %d",i+1,i);
        if(i>1 && bd[i-1]==1 && bd[i-2]==0){
            l=sprintf(tm,"%d -> %d",i+1,i-1);

int main(){

    // Board representation. 1's move right, 2's move left.
    // Initialize board here.
    int i;

    mvs = malloc(1000*sizeof(char*));
    return 0;

For instance, the sequence of moves for n=3:

1. 3 -> 4
2. 5 -> 3
3. 6 -> 5
4. 4 -> 6
5. 2 -> 4
6. 1 -> 2
7. 3 -> 1
8. 5 -> 3
9. 7 -> 5
10. 6 -> 7
11. 4 -> 6
12. 2 -> 4
13. 3 -> 2
14. 5 -> 3
15. 4 -> 5

An alternative digital clock

Most of us have some sort of digital clock at home, like this one:

It’s a pretty nice invention. Yet when you think about it, it’s pretty inefficient.

Most digital clocks run on twelve hours, making no distinction between AM and PM. Since there are sixty minutes in an hour, and a digital clock ‘cycles’ every twelve hours, there are only 12*60 or 720 possible times that a digital clock can display.

Ignoring the two dots in the middle separating the hours and the minutes, there are four numbers shown in the display. Each of the four numbers use a seven-segment display:

In total there are 4*7 or 28 individual LED’s comprising the digital clock (again ignoring the two dots). But there are only 720 possible times to display.

This means there is a good deal of redundancy. With 28 LED’s or 28 bits of information, you can display one of 2^28, or 268435456 possible times. Except that only 720 of them are valid.

Theoretically, only ten LED’s are needed for a digital clock, and being able to display one of 1024 messages, they are adequate for our means. A hypothetical ten-LED clock might look like this:

This solution is not entirely satisfactory, however. Since 720 is not a power of two, 304 of the 1024 possible values are invalid. Then seemingly arbitrary combinations of dots are valid, while others are not. This is not an elegant solution.

A triangular clock

Notice, however, that 720 is a perfect factorial: it is the product of the first six numbers. We can take advantage of this fact and design a triangular digital clock.

Instead of using a base ten numbering system that humans use, or a base 60 system of seconds and minutes or the base two (binary) system used by computers, we are going to use the Factorial number system, which is a mixed radix system.

This enables us to encode times like this:

To decode it:

The dots in the rows are worth 360, 120, 30, 6, and 1 minutes respectively. Adding them up results in 671, which is to be interpreted as 11*60+11, or 11:11.

This works because a dot in a row is worth one more than the sum of all the dots in all the rows underneath it. It can be proven that each number between 0 and 720 has exactly one representation as a 5-factoradic.

Implementation in X11

A computer implementation of the binary clock is quite straightforward. I’ve done it using Xlib (runs under most linux systems). It takes the current time and displays it; it does not update in real time like a real clock:

Here it’s displaying the time 7:26.

The source code:

#include <X11/Xlib.h>
#include <time.h>

void draw(Display *d,Window w,GC g,int *dots){

    // Draw the dots in the clock. Dots are circles with 20px diameters
    // and 20px in between, arranged in a 60-60-60 lattice.
    int x = 100, y = 20, i, j;
    for(i=0; i<5; i++){
        for(j=0; j<=i; j++){

            // Use XDrawArc if it's off, or XFillArc if it's on.
            char on = j<dots[i];
            if(on) XFillArc(d,w,g,x+40*j,y,20,20,0,360*64);
            else   XDrawArc(d,w,g,x+40*j,y,20,20,0,360*64);

        // Next row
        x -= 20;
        y += 35;

int main(){

    // Set the time to display
    time_t current = time(NULL);
    struct tm * local = localtime(&current);

    // Convert time to binary clock format
    int binarytime = (local->tm_hour%12)*60 + local->tm_min;
    int dots[5];
    dots[0] = binarytime / 360;
    dots[1] = (binarytime % 360) / 120;
    dots[2] = (binarytime % 120) / 30;
    dots[3] = (binarytime % 30) / 6;
    dots[4] = binarytime % 6;

    // Create and initialize window
    Display *display = XOpenDisplay(NULL);
    int screen = DefaultScreen(display);
    Window window = XCreateSimpleWindow(

    // We don't need to accept any real events but we still need to handle
    // the expose event so we know if it's showing.
    XEvent event;
    GC gc = DefaultGC(display,screen);

    // Drawing loop
        if(event.type == Expose){

    return 0;

This is my first time using Xlib, and I think this would make a good programming exercise for learning any new GUI framework.


Although the idea is nice, I don’t think it would be very practical to use a clock like this. The problem that it was designed for, encoding time using less than 28 LED’s, did not really exist. There is no reason we would want to save a few LED’s: indeed using fifteen lights it saves thirteen when compared with the conventional clock. But a base two approach uses five less LED’s, rendering this idea worthless.

The idea of this digital clock is not my own; the idea is originally from the paper A New Idea for a Binary Clock by Jorg Pretz. This paper also covers the more mathematical details of the scheme.

The Sieve of Sundaram

The Sieve of Eratosthenes is probably the best known algorithm for generating primes. Together with wheel factorization and other optimization options, it can generate primes very quickly.

But a lesser well known algorithm for sieving primes is the Sieve of Sundaram. This algorithm was discovered in 1934 by Sundaram; like the sieve of Eratosthenes it finds all prime numbers up to a certain integer.

The algorithm

A simplified version of the algorithm, using N as the limit to which we want to find primes to:

m =: Floor of N/2
L =: List of numbers from 1 to m
For every solution (i,j) to i + j + 2ij < m:
    Remove i + j + 2ij from L

For each k remaining in L:
    2k + 1 is prime.

In practice we can find solutions to i + j + 2ij < m by using two nested for loops:

For i in 0 to m:
    For j in i to m:
        L[i + j + 2ij] =: False

Here i is always less than j, because the two are interchangeable and filtering it twice would be a waste.

We don’t actually need to loop j from 0 to m. From the inequality i + j + 2ij < m, we can solve for j: j < \frac{m-i}{2i+1}. The new algorithm:

m =: Floor of N/2
L =: Boolean array of length m
Fill L with true

For i in 0 to m:
    For j in i to (m-i)/(2i+1):
        L[i + j + 2ij] =: False

For each k remaining in L:
    2k + 1 is prime.

Why this algorithm works

In the algorithm, 2k+1 is prime where k can be written as i + j + 2ij where i and j are integers. We can rewrite this:

\begin{array}{l} 2(i+j+2ij) + 1 \\ = 2i + 2j + 4ij + 1 \\ = (2i+1) (2j+1) \end{array}

Both 2i+1 and 2j+1 are odd numbers, and any number that can be written as the product of two odd numbers are composite.

Of the odd numbers, those that cannot be written as the product of two odd numbers are prime. We’ve filtered everything that can be written as (2i+1)(2j+1) so we are left with the odd prime numbers.

This algorithm only gets the odd prime numbers, but fortunately there is only one even prime number, 2.

Benchmarks with the Sieve of Eratosthenes

Here’s an implementation of the Sieve of Sundaram:

#include <stdio.h>
#include <stdlib.h>

typedef unsigned long long ll;
int main() {
 ll n = 100000000LL;
 ll m = n/2;
 char *sieve = malloc(m);

 for(ll i=0; i<m; i++)
  sieve[i] = 1;

 for(ll i=1; i<m; i++)
  for(ll j=i; j<=(m-i)/(2*i+1); j++)
   sieve[i+j+2*i*j] = 0;

 ll s=1;
 for(ll i=1; i<m; i++)
  if(sieve[i]) s++;

 printf("%llu", s);
 return 0;

This code counts the number of primes below 100 million, which should be 5761455. The above code runs in 9.725 seconds.

Here’s an alternative, an implementation of the more standard Sieve of Eratosthenes:

#include <stdio.h>
#include <stdlib.h>

typedef unsigned long long ll;
int main(){
 ll lim = 100000000LL;
 char *sieve = malloc(lim);

 for(int i=0; i<lim; i++)
  sieve[i] = 1;

 int s=0;
 for(int i=2; i<lim; i++){
   for(int j=2; j<=lim/i; j++)
    sieve[i*j] = 0;

 printf("%d", s);
 return 0;

I was surprised to find that the Sieve of Eratosthenes actually ran faster. It completed in 7.289 seconds.

I expected the Sieve of Sundaram to be faster because according to Wikipedia this algorithm uses O(n \log(n)) operations, while the Sieve of Eratosthenes uses O(n \log(n) \log \log(n)).

Playing with the toggle key LED’s

It’s actually pretty easy to programmatically turn the toggle keys on and off. By the toggle keys I mean the Num Lock, Caps Lock, and Scroll Lock keys at the corner of the keyboard.

I’m sure this has been done before but the information about how to do this was kind of difficult to find.

Using C, here’s a few routines that may be helpful:

#include <Windows.h>

struct KEY {
    char virtcode;
    char hardcode;

struct KEY NumLock =        {0x90, 0x45};
struct KEY ScrollLock =     {0x91, 0x46};
struct KEY CapsLock =       {0x14, 0x3a};

// Simulates a hardware key press.
// Input: Key we want to press.
void flipKey(struct KEY* key){

    // Simulate a key press and release.
    keybd_event(key->virtcode, key->hardcode, 0, 0);
    keybd_event(key->virtcode, key->hardcode, 2, 0);

// Returns the state of a key (on or off).
// 0 if off, 1 if on.
char getKey(struct KEY* key){
    return GetKeyState((BYTE) key->virtcode);

The method keybd_event requires both a virtual and a hardware code for a key. So for convenience, the two codes are grouped together into a struct.

So the code to flip Num Lock would be this:


This code infinitely loops a pattern on the three LED’s:

int main() {

    // Make sure all the keys are off.
    if(getKey(&NumLock)) flipKey(&NumLock);
    if(getKey(&CapsLock)) flipKey(&CapsLock);
    if(getKey(&ScrollLock)) flipKey(&ScrollLock);

    // Display a nice pattern.




    return 0;

Maybe this will be of use to someone.