# What if math contests were scored using Principal Component Analysis?

In many math competitions, all problems are weighted equally, even though the problems have very different difficulties. Sometimes, the harder problems are weighted more. But how should we assign weights to each problem?

Usually, the organizers make up weights based on how difficult they believe the problems are. However, they sometimes misjudge the difficulty of problems. Wouldn’t it be better if the weightings were determined from data?

Let’s try Principal Component Analysis!

Principal Component Analysis (PCA) is a statistical procedure that finds a transformation of the data that maximizes the variance. In our case, the first principal component gives a relative weighting of the problems that maximizes the variance of the total scores. This makes sense because we want to separate the good and bad students in a math contest.

## IMO 2017 Data

The International Mathematics Olympiad (IMO) is an annual math competition for top high school students around the world. It consists of six problems, divided between two days: on each day, contestants are given 4.5 hours to solve three problems.

Here are the 2017 problems, if you want to try them.

Above: Score distribution for IMO 2017

This year, 615 students wrote the IMO. Problems 1 and 4 were the easiest, with the majority of contestants receiving full scores. Problems 3 and 6 were the hardest: only 2 students solved the third problem. Problems 2 and 5 were somewhere in between.

This is a good dataset to play with, because the individual results show what each student scored for every problem.

## Derivation of PCA for the 1-dimensional case

Let $X$ be a matrix containing all the data, where each column represents one problem. There are 615 contestants and 6 problems so $X$ has 615 rows and 6 columns.

We wish to find a weight vector $\vec u \in \mathbb{R}^{6 \times 1}$ such that the variance of $X \vec u$ is maximized. Of course, scaling up $\vec u$ by a constant factor also increases the variance, so we need the constraint that $| \vec u | = 1$.

First, PCA requires that we center $X$ so that the mean for each of the problems is 0, so we subtract each column by its mean. This transformation shifts the total score by a constant, and doesn’t affect the relative weights of the problems.

Now, $X \vec u$ is a vector containing the total scores of all the contestants; its variance is the sum of squares of its elements, or $| X \vec u |^2$.

To maximize $|X \vec u |^2$ subject to $|\vec u| = 1$, we take the singular value decomposition of $X = U \Sigma V^T$. Then, the leftmost column of $V$ (corresponding to the largest singular value) gives $\vec u$ that maximizes $| X \vec u|^2$. This gives the first principal axis, and we are done.

## Experiments

Running PCA on the IMO 2017 data produced interesting results. After re-scaling the weights so that the minimum possible score is 0 and the maximum possible score is 42 (to match IMO’s scoring), PCA recommends the following weights:

• Problem 1: 9.15 points
• Problem 2: 9.73 points
• Problem 3: 0.15 points
• Problem 4: 15.34 points
• Problem 5: 5.59 points
• Problem 6: 2.05 points

This is the weighting that produces the highest variance. That’s right, solving the hardest problem in the history of the IMO would get you a fraction of 1 point. P4 had the highest variance of the six problems, so PCA gave it the highest weight.

The scores and rankings produced by the PCA scheme are reasonably well-correlated with the original scores. Students that did well still did well, and students that did poorly still did poorly. The top students that solved the harder problems (2, 3, 5, 6) usually also solved the easier problems (1 and 2). The students that would be the unhappiest with this scheme are a small number of people who solved P3 or P6, but failed to solve P4.

Here’s a comparison of score distributions with the original and PCA scheme. There is a lot less separation between the best of the best students and the middle of the pack. It is easy to check that PCA does indeed produce higher variance than weighing all six problems equally.

Now, let me comment on the strange results.

It’s clearly absurd to give 0.15 points to the hardest problem on the IMO, and make P4, a much easier problem, be worth 100 times more. But it makes sense from PCA’s perspective. About 99% of the students scored zero on P3, so its variance is very low. Given that PCA has a limited amount of weight to “spend” to increase the total variance, it would be wasteful to use much of it on P3.

The PCA score distribution has less separation between the good students and the best students. However, by giving a lot of weight to P1 and P4, it clearly separates mediocre students that solve one problem from the ones who couldn’t solve anything at all.

In summary, scoring math contests using PCA doesn’t work very well. Although it maximizes overall variance, math contests are asymmetrical as we care about differentiating between the students on the top end of the spectrum.

## Source Code

If you want to play with the data, I uploaded it as a Kaggle dataset.

The code for this analysis is available here.

# IMO 2010: Problem 5

Problem 5 of the IMO was definitely the most interesting problem in the contest, although one of the harder ones. A mere 37 people out of 511 managed to solve it completely.

What made it so different was that it did not require any particular knowledge on olympiad math theorems, indeed such knowledge would be useless for this problem. It required only creative insight, and quite a bit of it. Technically it’s a combinatorics problem, but no specific combinatorics experience is needed to solve the problem.

Also this problem was the target for a mini-polymath project organized by Terry Tao, which took about two and a half hours to get from start to solving the problem. The solution I’ll give is mostly based on the polymath solution, although I came up with my own finishing steps which I think is less awkward.

### Problem

In each of six boxes $B_1,B_2,B_3,B_4,B_5,B_6$ there is initially one coin. There are two types of operation allowed:

Type 1: Choose a nonempty box $B_j$ with $1 \leq j \leq 5$. Remove one coin from $B_j$ and add two coins to $B_{j+1}$.

Type 2: Choose a nonempty box $B_k$ with $1 \leq k \leq 4$. Remove one coin from $B_k$ and exchange the contents of (possibly empty) boxes $B_{k+1}$ and $B_{k+2}$.

Determine whether there is a ﬁnite sequence of such operations that results in boxes $B_1,B_2,B_3,B_4,B_5$ being empty and box $B_6$ containing exactly $2010^{2010^{2010}}$ coins. (Note that $a^{b^c}= a^{(b^c)}$.)

### Initial observations

It is not entirely obvious how this problem should be approached; thus we will begin by playing with it and jotting down any interesting observations.

We begin with one coin in each box. For convenience, I will express this as

$(1,1,1,1,1,1)$.

We need to have

$(0,0,0,0,0,2010^{2010^{2010}})$

Again for convenience, let us substitute

$D = 2010^{2010^{2010}}$.

Now we can express the two types of moves in our new notation. First, Type 1 moves (we shall call it Move 1):

$(a,b) \to (a-1,b+2)$.

Next, Type 2 moves:

$(a,b,c) \to (a-1,c,b)$.

Notice how applying Move 1 adds one coin to the system as a whole, while Move 2 takes away one coin.

By applying Move 2 over and over, it is easy to remove arbitrarily large amounts of coins from the system (as long as the coins are in the first four boxes).

On the other hand, it isn’t so clear how do add large amounts of coins to the system.

Applying Move 1 adds a coin to the system, so logically adding coins should be done by repeated usage of Move 1.

Given $(a,b)$, applying Move 1 results in $(a-1,b+2)$; applying it again results in $(a-2,b+4)$. We can repeat this as many times as we like, or until the first box is depleted.

This creates a compound move for a pair, executed by applying Move 1 repeatedly:

$(a,b) \to (0,b+2a)$.

What if we apply this on all six boxes? We first get

$(0,3,1,1,1,1)$

Next we have

$(0,0,7,1,1,1)$;

A few more iterations and we will have:

$(0,0,0,0,0,63)$.

We are far from our target: we have 63 coins and we’re stuck. There doesn’t seem to be any other way to introduce any more coins into the system.

The problem seems hopeless.

### Building up a repository of compound moves

We first need a way of getting a large amount of coins into the system. From there is is easy to reduce the coins.

The compound move explained in the previous section is a first step. Let’s call it Compound Move 1:

$(a,b) \to (0,b+2a)$.

The next compound move is less obvious. Suppose that we start with

$(a,0,0)$.

Applying Move 1 gives us

$(a-1,2,0)$

which, using Compound Move 1, can be used to get

$(a-1,0,4)$.

Now we apply Move 2. Using up a coin to switch the two boxes, we have

$(a-2,4,0)$

Or equivalently,

$(a-2,0,8)$.

We can repeat this: switching, doubling, switching again, and so on, until the first box is depleted. This gives us

$(0,2^a,0)$.

So here we have Compound Move 2:

$(a,0,0) \to (0,2^a,0)$.

This is a lot better than what we had before. However, it’s still a while towards our goal.

With Compound Move 1, each additional coin in the left most box adds 2 coins to the next box.

But with Compound Move 2, each additional coin doubles the result in the next box.

Now this problem has a Towers of Hanoi feel to it. It turns out that we can take this one step further, which, when examined closely, looks remarkably similar to the previous case.

Suppose we have

$(a,0,0,0)$.

in four consecutive boxes. Applying Move 1:

$(a-1,2,0,0)$.

Applying Compound Move 2 using the last three boxes, we get

$(a-1,0,2^2,0)$.

Now we apply Move 2 and switch the boxes:

$(a-2,2^2,0,0)$.

If we do this again, we get

$(a-3,2^{2^2},0,0)$.

So, repeating this many times until the left box is empty gives us

$(0,2^{\cdot^{\cdot^2}},0,0)$.

No doubt the number we have is sufficiently big now. We will call this move Compound Move 3, which can only be expressed reasonably with Knuth’s up-arrow notation:

$(a,0,0,0) \to (0, 2 \uparrow \uparrow a,0,0)$.

### Putting it together

The configuration we want is

$(0,0,0,0,0,D)$

which is equivalent to

$(0,0,0,\frac{D}{4},0,0)$.

It’s relatively easy to get most of the boxes empty except for the third. From the starting position, Move 1 gives us

$(1,1,1,1,0,3)$

Applying Move 2 three times gives us

$(1,0,3,0,0,0)$;

Move 1 then gives

$(0,0,7,0,0,0)$.

We invoke Compound Move 3, getting

$(0,0,0,2 \uparrow \uparrow 7,0,0)$.

Finally we can waste coins by consecutively executing Move 2 until we have

$(0,0,0,\frac{D}{4},0,0)$

And we’re done.

It is easy to show, albeit informally, that $2 \uparrow \uparrow 7 > \frac{D}{4}$. If we take the binary log (I’ll write it as $\log x$) of both sides, we have

$2 \uparrow \uparrow 6 > 2010^{2010} \cdot \log 2010 - 2$.

It’s okay to ‘throw away’ the 2, considering the magnitude of the numbers. Taking the log again:

$2 \uparrow \uparrow 5 > 2010 \cdot \log (2010 \cdot \log 2010)$.

Considering that $2 \uparrow \uparrow 5 = 2^{65536}$, and the right hand side evaluates to just a few thousand, it’s fairly safe to say which is bigger.

# IMO 2010: Problem 2

In this year’s IMO, there were two geometry problems. Problem 4 is generally considered the easier one of the two. Problem 2 is the other problem, being a bit harder. Of the 517 contestants, 366 of them received a full mark on problem 4; only 162 students got a full mark for problem 2.

### The Problem

Let $I$ be the incentre of triangle $ABC$ and let $\Gamma$ be its circumcircle. Let the line $AI$ intersect $\Gamma$ again at $D$. Let $E$ be a point on the arc $\widehat{BDC}$ and $F$ a point on the side $BC$ such that

$\angle BAF = \angle CAE < \frac{1}{2} \angle BAC$.

Finally, let $G$ be the midpoint of the segment $IF$. Prove that the lines $DG$ and $EI$ intersect on $\Gamma$.

### Solution

It turns out that several solutions exist. Several of the solutions use Menelaus’s theorem, which I’ve never even heard of; others rely on properties of excircles. I’ll give the solution that I think is the most elegant (although it doesn’t seem to be the standard solution).

We start by working backwards.

Let us denote $K$ to be the point of intersection between $DG$ and $EI$; we wish to prove that $K$ lies on $\Gamma$. If $K$ lies on $\Gamma$, then quadrilateral $AKDE$ is cyclic, and $\triangle IKD \sim \triangle IAE$.

Obviously $\angle KID = \angle AIE$. If we can show that $\angle GDI = \angle AEI$, then triangles $IKD$ and $IAE$ are similar and our result follows.

Let $M$ be the midpoint of $AI$.

We will prove that $\angle GDI = \angle AEI$ by proving that $\triangle MGD$ and $\triangle AIE$ are similar (these are the triangles highlighted in red).

It is given that $\angle CAE = \angle BAF$, so angles $\angle DAE$ and $\angle DAF$ are also equal (since $AD$ passes through the incenter $I$ and thus bisects $\angle A$).

It is also given that $FG = GI$; now $IM = MA$ by definition so triangles $IFA$ and $IGM$ are similar. Then $\angle GMD = \angle IAE$.

All that remains to prove the similarity of $\triangle MGD$ and $\triangle AIE$ (and thus the result) is to prove that

$\frac{MG}{MD} = \frac{AI}{AE}$

or equivalently

$MG \cdot AE = AI \cdot MD$.

As $\triangle IFA \sim \triangle IGM$, and $IG = \frac{1}{2}IF$, it follows that $MG = \frac{1}{2}AF$. By substitution,

$\frac{1}{2}AF \cdot AE = AI \cdot MD$.

It can be shown that the product $AF \cdot AE$ is constant no matter where $E$ is on the circle:

Draw the line $EC$; the triangle formed, $\triangle ACE$, is similar to $\triangle AFB$ since $\angle CAE = \angle CAF$ and $\angle ABF$ and $\angle AEC$ subtend the same arc.

Then

$\frac{AF}{AB} = \frac{AC}{AE}$

or

$AE \cdot AF = AB \cdot AC$.

Since $AB \cdot AC$ is constant with respect to $E$ (and $F$), so is $AE \cdot AF$.

Consequentially we can prove that $\frac{1}{2}AE \cdot AF = AI \cdot MD$ for all values of $E$ and $F$ by proving it for one value of $E$ and $F$, since $\frac{1}{2} AE \cdot AF$ is constant and obviously $AI \cdot MD$ is constant.

We prove the case for $\angle CAE = 0$, or in other words when $E$ coincides with $C$ and $F$ coincides with $B$:

So here we need to prove that

$\frac{1}{2} AB \cdot AC = AI \cdot MD$.

This is equivalent to proving that $K$ lies on $\Gamma$ in this instance, as that would prove the above equation too for this instance.

Obviously $\angle ACK = \angle ADK$, also $\angle BCK = \angle BDK$. As $\angle ACK = \angle BCK$ since $CK$ passes through the incircle and is an angle bisector, it follows that $\angle ADK = \angle BDK$.

Given $BG=GI$, this is sufficient to prove $\triangle BDI$ to be isosceles.

So $\angle BDG = \angle GDI$, and $DG$ meets $\Gamma$ at the midpoint of minor ark $\widehat{AB}$. Obviously $CI$ meets $\Gamma$ at the same point, and we are done. QED.

### Checking the solution

Just to make sure, we can trace out steps back to get from the equation $\frac{1}{2} AE \cdot AF = AI \cdot MD$ to our result that $K$ lies on $\Gamma$. I’m going to go through it very briefly:

From the equation we get $\frac{MG}{MD} = \frac{AI}{AE}$, proving triangles $MGD$ and $AIE$ to be similar.

Then $\angle GDI = \angle AEI$ and triangles $KDI$ and $AEI$ are similar. Finally $AKDE$ is cyclic, leading to the result.

It is also valid to, starting with the result, arrive at the equation, which is what we implicitly did near the end of the solution.

# IMO 2010: Problem 4

The International Olympiad of Mathematics took place on July 8 and 9, and the six questions were made available the second day. This was a two day contest, three questions per day, and four and a half hours per day, giving a total of nine hours for the six questions. The international contest included questions on geometry, algebra, combinatorics, and number theory.

This year, it seems that problems 1 and 4 were the easiest (by comparison). I’ve only looked at problem 4 in detail (probably because it’s geometry).

### Problem

Let P be a point inside the triangle $ABC$. The lines $AP$, $BP$ and $CP$ intersect the circumcircle $\Gamma$ of triangle $ABC$ again at the points K, L and M respectively. The tangent to $\Gamma$ at C intersects the line $AB$ at S. Suppose that $SC = SP$. Prove that $MK = ML$.

### Solution

As often with geometry problems, a variety of different solutions are possible. Some are simple, while others rely on homothetic transformations and inversions. I’m going to go with the method I came up with (albeit with a few peeks at the Art of Problem Solving forums).

From the Power of a Point theorem, we have

$SC^2 = SA \cdot SB$

But it’s given that $SP = SC$, so

$SP^2 = SA \cdot SB$

Or,

$\frac{SP}{SA} = \frac{SB}{SP}$

This is enough to show that triangles $SAP$ and $SPB$ are similar, since they also share an angle.

Therefore,

$\frac{SA}{AP} = \frac{SP}{BP}$

Or,

$\frac{AP}{BP} = \frac{SA}{SP}$

Since $SP = SC$,

$\frac{AP}{BP} = \frac{SA}{SC} \; \; \; (1)$

Triangles $SCA$ and $SBC$ are similar, since they share an angle and $\angle SCA = \angle ABC$. Thus,

$\frac{SA}{AC} = \frac{SC}{BC}$

Or,

$\frac{SA}{SC} = \frac{AC}{BC}$

From (1),

$\frac{AP}{BP} = \frac{AC}{BC}$

Or equivalently,

$\frac{AC}{AP} = \frac{BC}{BP} \; \; \; (2)$

Notice that triangles $APC$ and $MPK$ are similar ($\angle CMK$ and $\angle CAK$ subtending the same arc); so,

$\frac{AC}{AP} = \frac{MK}{MP}$

Similarly, triangles $MPL$ and $BPC$ are also similar, giving

$\frac{BC}{BP} = \frac{ML}{MP}$

From (2),

$\frac{MK}{MP} = \frac{ML}{MP}$

This shows that $MK = ML$, as desired. QED.