SPOJ: Absurd Prices (6803)

Just writing on a SPOJ problem I worked on recently. It’s called Absurd Prices, code name ABSURD.

In this problem, when given an integer representing an item’s price in cents, our program has to determine whether the price is absurd or not.

There is an algorithm to calculate the absurdity of an integer: the higher the absurdity, the more absurd it is. An integer c is considered absurd when within the range [0.95c,1.05c] there is an integer with a lower absurdity.

To find the absurdity of a price:

  1. Remove all the trailing 0’s from it
  2. Count the length of the remaining number
  3. The absurdity is twice the length
  4. If the number (with trailing zeroes removed) ends with 5, subtract 1 from the absurdity

As a result of these rules, prices with a lot of trailing zeroes, like 3500000, would have a very low absurdity, while those without trailing zeroes, like 7999999, would have a much higher absurdity.

A brute force approach seemed at first to be the way to go. This is because the range, [0.95c,1.05c] isn’t too big and it isn’t a big deal to simply calculate the absurdity of all of them, and mark it as absurd when any of the integers in the range has a lower absurdity.

However, the time limit is very strict for this problem. Even when my program only checked 50 random numbers in the range, the system returned with TLE (Time Limit Exceeded)!

Fortunately there is a better way of solving the problem.

Given a range, it is easy to find the least absurd number in the range. For example, if our minimum and maximum bounds are 6655 and 7799 respectively, it’s easy to see that the least absurd number in that range is 7000, simply because it has the most trailing zeroes.

The algorithm that we can use to find the least absurd number in a range (defined by two integers) is follows:

  1. In their decimal expansions, compare the first characters, then the second characters, etc, until the n^{th} character is different.
  2. The minimum absurdity is 2n.
  3. If the n^{th} character of the smaller number is less than 5, and the n^{th} character of the larger number is five or greater, subtract 1 from the minimum absurdity.

Here’s my implementation in Haskell:

-- Integer with trailing zeroes removed
removeZeroes x = head . filter (\n -> n `mod` 10 > 0) $
  iterate (\n -> n `div` 10) x

-- Calculuate the absurdity of a price
absurdity x = let
  r = removeZeroes x
  d = fromIntegral . length $ show r
  l = r `mod` 10
  in if l == 5 then 2*d-1 else 2*d

-- The minimum possible absurdity in a given range
minabsd a b | a == b = absurdity a
minabsd a b = let
  c = show a; d = show b
  sm = fromIntegral . length . takeWhile (uncurry (==)) $ zip c d
  lst = (c!!sm,d!!sm)
  offset = if fst lst < '5' && snd lst >= '5' then 1 else 0
  in minimum [absurdity a, (2 * (sm + 1) - offset), absurdity b]

absurd x = let
  r1 = ceiling $ 0.95*(realToFrac x)
  r2 = floor $ 1.05*(realToFrac x)
  absx = absurdity x
  in absx > minabsd r1 r2

main = interact run where
  as s = let k = read s in if absurd k then "absurd" else "not absurd"
  run = unlines . map as . tail . lines

The Proggit Bacon Challenge: a probabilistic and functional approach

A few days ago I saw an interesting programming challenge on Proggit (more commonly known as /r/programming, or the programming subreddit). The problem is found here.

This is how the problem goes:

You are given a rectangular grid, with houses scattered across it:

The objective is to place bacon dispensers (I’ll call them bacons from now on) at various places so the people in the houses can get the bacon.

I have no clue why they chose bacon, out of all objects to choose from. Alas, that is not the point.

So given a limited number of bacons, you must distribute them effectively among the houses by putting them on empty squares. In the example, you have three bacons to place.

For each house, the score is the distance to the nearest bacon (using the Manhattan, not Euclidean metric). Your total score is the sum of the scores for each house. Thus, you are trying to minimize your total score.

Optimal solutions

Here is the optimal solution for the problem:

If you add them up, you can see that the total score for this configuration is 10.

Question is, how do you arrive at this configuration?

It turns out that this isn’t as easy as it looks. This problem is NP-Hard, meaning there is no algorithm that can solve it both quickly and optimally. By “quickly”, it’s understood to mean polynomial or non-exponential complexity; if this is impossible then the best algorithm is not significantly better than just brute force.

In order to solve the problem in a reasonable amount of time, we have to trade optimality for speed and rely on less than optimal, probabilistic approaches.

Introducing the k-means algorithm

We will now transform the problem into a data clustering problem.

If we have k bacons to place, then we must find k distinct clusters. After this, we can place the bacons in the centroid of each cluster to achieve the optimal score. In other words, we are finding clusters such that the distance from a point to the center of a cluster is minimized.

The best known algorithm for this problem is Lloyd’s algorithm, more commonly referred to as the k-means algorithm. Let’s try an example to demonstrate this algorithm.

Suppose we want to find two clusters in these points:

We start by choosing two centers randomly from the sample space, let’s make them green and red:

We assign each point to its nearest center:

Then, we move each center to the centroid of its cluster:

Notice now how some of the points are closer to a different center than the center they’re assigned now. Indeed, they belong to a different cluster.

So we reassign the clusters:

Again we calculate the centroids:

We repeat these steps as many times as we need to, usually until the clusters do not change anymore. Depending on the data it may take more or less iterations, but it normally converges fairly quickly.

This method, unfortunately, does not always achieve an optimal result. Technically it always converges on a local optimum, which is not always the global optimum. The local optimum can be arbitrarily worse than the global optimum.

Take note of how the result of the algorithm depends entirely on the results of the random starting positions of the clusters.

If you’re very very lucky, they might as well end up at exactly the optimal locations.

If you’re really unlucky, however, they may end up all in a corner of the map; and the result configuration would be far from optimal. We might even end up with most of the clusters completely empty. The thing is that they’re assigned completely randomly.

We can do better than that.

Improving the k-means: introducing the k-means++ algorithm

The k-means++ algorithm addresses some of the problems with the k-means algorithm, by seeking better starting clusters. Its results are almost always better than the standard algorithm.

Let’s try this.

The first thing we do is put a cluster right on top of a random point:

For each point that doesn’t already have a cluster on it, calculate the distance to the nearest cluster (which is not always the same cluster):

Next we assign a probability to each of the points, proportional to the squares of the distances:

The next cluster is chosen with this weighted probability. We repeat this until we have all k clusters distributed on top of k different points.

Then, we proceed with the regular k-means algorithm.

The result of this way of choosing is that the starting clusters tend to be spread out more evenly; moreover there’s no empty clusters. Notice how a point twice as far from the nearest cluster is four times more likely to be chosen for the next cluster.

Although this drastically improves the k-means algorithm, it still does not guarantee an optimal configuration.

Repeated simulation

There is one more thing we can do to increase our score. Being a probabilistic algorithm, the results depend heavily on the random numbers generated. Using different random numbers would achieve better or worse results.

To get the better results, we can run the algorithm multiple times, each time with a different set of random numbers. As the number of iterations increase, the score will get closer and closer to the optimum.

Implementation in Haskell

It took me about two days to write a program for this task; I’ve submitted the program to the challenge. There the source code is available, as well as various benchmarks.

Looking through and running some of the other entries, it seems that my program beats most of them. One exception is the entry (entries) by idevelop, which produces considerably better scores than mine for the extremely large input sets. On the other hand, my program does better on most other inputs (medium and large) by repeating the simulation a few hundred times, (usually) arriving at the optimum solution.

Project Euler 285

Problem 285 of Project Euler was released yesterday; it requires some geometry and probability sense.

The problem goes like this:

Albert chooses two real numbers a and b randomly between 0 and 1, and a positive integer k.

He plays a game with the following formula:

k' = \sqrt{(ka+1)^2 + (kb+1)^2}

The result k’ is rounded to the nearest integer. If k’ rounded is equal to k, then Albert gets k points. Otherwise, he gets nothing.

Find the expected value of the total score if Albert plays this game with k = 1, 2, \cdots , 10^5.

For example (from the problem statement), let k=6, a=0.2, and b=0.85.

Here the expression \sqrt{(ka+1)^2 + (kb+1)^2} gives the number 6.484. When rounded to the nearest integer, it becomes 6, which is equal to k. Therefore, Albert gets 6 points.

Yup – another expected value problem. But the concept of this problem is simple:

Hosted by imgur.com

For any value of k, there are certain values of a and b where Albert will get points. Otherwise, he does not get points. Here the gray blob is the area where he does get points, and the white area is where he does not.

The red square is the sample space of the random variables a and b, since 0<a<1 and 0<b<1.

The probability of Albert getting the points is the fraction of the square that’s gray.

The area of the red square is always 1. The area of the gray area varies with k. Thus the idea is to calculate the area of the gray area for each value of k from 1 to 10^5.

Now we can get to the details of the problem.

Notice that if round(k') = k, then k-0.5 < k' < k+0.5 (the difference between less-equal and less is irrelevant in this problem):

Hosted by imgur.com

In a similar way, the equation of the larger circle is (a+\frac{1}{k})^2 + (b+\frac{1}{k})^2 < (\frac{k+0.5}{k})^2.

These are equations of circles. A circle has the equation (x-a)^2 + (y-b)^2 = r^2 where (a,b) is the center and r is the radius.

Thus both the smaller and the larger circles have a center at (-\frac{1}{k},-\frac{1}{k}). The radii of the circles are different: the radius of the smaller circle is \frac{k-0.5}{k} and the radius of the larger circle \frac{k+0.5}{k}.

Plotting the inequalities:

Hosted by imgur.com

We can see now that this is a calculus problem! Indeed, the shaded area is the area of the larger circle inside the boxarea of the smaller circle inside the box.

Initially I tried integrating the circles directly, but I ran into problems while integrating and I was unable to correctly integrate the equation. Instead, I integrated an equivalent problem:

Hosted by imgur.com

It’s obvious that the areas do not change with this transformation.

So now for the smaller circle the equation is a^2 + (b+\frac{1}{k})^2 > (\frac{k-0.5}{k})^2. Solving for b, we get:

b > \sqrt{(\frac{k-0.5}{k})^2-a^2} - \frac{1}{k}

.. And this equation for the larger circle:

b < \sqrt{(\frac{k+0.5}{k})^2-a^2} - \frac{1}{k}

The variable of integration here is a, and k is only a constant. Therefore we can replace the radius with r and get this equation to integrate:

b = \sqrt{r^2-a^2} - \frac{1}{k}

With the help of an integration table, this can be integrated fairly easily.

Hosted by imgur.com

There’s one additional problem – computing the integrating ‘limits’ of the two circles. Let’s call them L_S and L_L for small and large respectively.

Both of them can be computed using the pythagorean theorem.

Hosted by imgur.com

The formula for L_S (it should be obvious by now what L_L should be):

L_S = \sqrt{(\frac{k-0.5}{k})^2-\frac{1}{k^2}}

Now we can get this really big formula for computing the area for any integer k:

\begin{array}{l} \int_{\frac{1}{k}}^{\sqrt{(\frac{k+0.5}{k})^2-\frac{1}{k^2}}} (\sqrt{(\frac{k+0.5}{k})^2-a^2}-\frac{1}{k}) da \\ - \int_{\frac{1}{k}}^{\sqrt{(\frac{k-0.5}{k})^2-\frac{1}{k^2}}} (\sqrt{(\frac{k-0.5}{k})^2-a^2}-\frac{1}{k}) da \end{array}

Also, fortunately for me, it is impossible for the upper bound to cross the circle at the top or right, I mean, we never have something like this –

Hosted by imgur.com

Anyways, here’s a Haskell implementation of the algorithm:

integral a r k = 0.5 * (a * sqrt (r^2 - a^2) + r^2 * atan (a/sqrt (r^2-a^2))) - a/k

area k = let area_1 1 = 0
             area_1 k = integral (sqrt (((k-0.5)/k)^2 - (1/k)^2)) ((k-0.5)/k) k - integral (1/k) ((k-0.5)/k) k
             area_2 k = integral (sqrt (((k+0.5)/k)^2 - (1/k)^2)) ((k+0.5)/k) k - integral (1/k) ((k+0.5)/k) k
         in  area_2 k - area_1 k

main = print . sum . zipWith (*) [1..] $ map area [1..10^5]

The code runs in about 3 seconds.

There seems to be an edge case where k=1 (probably an asymptote) so I have to specially make it be zero.

This problem was released 10 pm last night and I was tired, lol, but I still managed to solve it and get into the top 20.

Hosted by imgur.com

Yet another Quine tutorial

A quine is a program that prints its own source code. For example, this would be a quine (save as quine.hs):

main = putStr =<< readFile "quine.hs"

When the program is run, the following output is produced:

main = putStr =<< readFile "quine.hs"

Although this program technically prints its own source code, this wouldn’t be considered a real quine. A real quine cannot read its own source code from a file.

A first approach

Usually the most obvious way to approach this problem is to print a program, something like this:

main = putStr "main = putStr \"Hello\""

Of course, this isn’t really a quine, because the output of this program is main = putStr "Hello". Although the program prints another program, a quine must reproduce itself.

It’s clear that this approach isn’t going to work.

The structure of a Quine

One distinguishing feature of a quine is part of the program’s code encoded as a string. We’ll make this the structure of attention:

Usually there are code before and after this string. We’ll call them A and B:

Now the important thing is the code inside the quotes is a character for character representation of everything after the ending quotation:

Perhaps this would be a better representation of the structure:

Up to this point we haven’t mentioned what the code should contain. Here is a rough guideline:

You can see here that the code is printed twice. The first time, we print it the way it appears in the string, handling the quotations, and escapes (if any). The second time we print it normally.

Here’s a quine in Haskell:

s = "\nmain = putStr (\"s = \" ++ show s ++ s)"
main = putStr ("s = " ++ show s ++ s)

A major problem faced when writing quines is how to print the quoted code exactly as it appears in the source code.

Fortunately for us, we don’t have this problem in Haskell because the show function does exactly this.

Liu Hui’s algorithm for calculating Pi

Liu Hui [刘徽] was a Chinese mathematician who lived in the Wei period. One of the things he was famous for was his method of calculating \pi iteratively.

This approach to calculate \pi (also used by Archimedes), is to calculate the area of a polygon inscribed in a circle. As the number of sides of the polygon increases, its area becomes closer and closer to the area of the circle.

Liu Hui found a simple formula to find the side length of an inscribed polygon of 2n sides, if the side length of a polygon with n sides is known:

\begin{array}{l} k_6 = 1 \\ k_{2n} = \sqrt{2+k_n} \\ S_n = \sqrt{2-k_n} \end{array}

Here k is a temporary variable, and S_n is the side length of an inscribed polygon with n sides. Something like this:

We start with a hexagon. The radius of the circle is 1, the area \pi. The side length of the hexagon is 1.

To calculate the next k value, all we need to do is do an addition and a square root:

\begin{array}{ll} k_6 = 1 & S_6 = \sqrt{2-1} = 1 \\ k_{12} = \sqrt{2+1} & S_{12} = \sqrt{2-\sqrt{2+1}} \approx 0.518 \\ k_{24} = \sqrt{2+\sqrt{2+1}} & S_{24} = \sqrt{2-\sqrt{2+\sqrt{2+1}}} \approx 0.261 \end{array}

The area of a regular polygon is A = \frac{1}{2}nsa where n is the number of sides, s is the side length, and a is the apothem. As the number of sides increases, the apothem becomes closer and closer to the radius, so here we’ll just let a=1.

We then have the formula for the area of the polygon: P_n = \frac{1}{2} n S_n, where P_n is the area of a polygon with n sides.

Let’s do a few here:

\begin{array}{l} P_6 = 3 \sqrt{2-1} = 3 \\ P_{12} = 6 \sqrt{2-\sqrt{2+1}} \approx 3.106 \\ P_{24} = 12 \sqrt{2-\sqrt{2+\sqrt{2+1}}} \approx 3.133 \\ P_{48} = 24 \sqrt{2-\sqrt{2+\sqrt{2+\sqrt{2+1}}}} \approx 3.139 \end{array}

To save us some tedious work, here’s a Haskell program to do the calculations for us:

k 6 = 1
k n = sqrt $ 2 + k (n/2)
s n = sqrt $ 2 - k n
p n = (n/2) * s n
main = mapM_ (putStrLn . (\n -> show (round n) ++ " " ++ show (p n))) $ iterate (*2) 6

Here’s part of the output:

6 3.0
12 3.1058285412302498
24 3.132628613281237
48 3.139350203046872
96 3.14103195089053
192 3.1414524722853443
384 3.141557607911622
768 3.141583892148936
1536 3.1415904632367617
3072 3.1415921060430483
6144 3.1415925165881546
12288 3.1415926186407894
24576 3.1415926453212157
49152 3.1415926453212157
98304 3.1415926453212157

This is about the maximum you could get without arbitrary precision decimals: the area of a polygon with 98304 sides. This gives us 3.14159265 which is 8 digits of \pi.

With a big-decimal library, it would be possible to calculate \pi to any precision.

Happy pi day!

Solving systems of linear equations in Haskell

Haskell isn’t normally used for things like this, but it’s quite possible to solve systems of linear equations with Haskell.

There are already several libraries for doing this, and other more advanced matrix manipulating. But here, I’m going to start simple.

In mathematics, systems of linear equations are usually represented by an augmented matrix. A system of n linear equations would be represented by an augmented matrix with n rows and n+1 columns.

For example, we have this system of equations:

\begin{array}{rrrcl} x &+2y &-z &=& -4 \\ 2x &+3y &-z &=& -11 \\ -2x &&-3z &=& 22 \end{array}

This would be represented as an augmented matrix:

\left[ \begin{array}{ccc|c} 1 & 2 & -1 & -4 \\ 2 & 3 & -1 & -11 \\ -2 & 0 & -3 & 22 \end{array} \right]

In Haskell we represent this as a list of lists, like this:

[ [1,2,-1,-4], [2,3,-1,-11], [-2,0,-3,22] ]

Here I’ll store each entry not as an integer, but as a floating point. You could also use Rational in Data.Ratio but both are fine for now.

The advantage of using Rational over Float is that sometimes you will end up with fractions that don’t work very well with floating point numbers. However, I’ve found that printing a list of lists of Rational types makes it difficult to read, unless you implement a custom show function for it.

So this is how we define our matrix types in Haskell:

type Row = [Float]
type Matrix = [Row]

The approach to solving this problem is rather simple. First we reduce whatever matrix we have to REF, or Row Echelon Form and then get the actual roots with some back substitution.

The algorithm used to transform a matrix to its Row Echelon Form is known as the Gaussian Elimination. Here’s what a matrix should look like after Gaussian Elimination (a * represents any value):

\left[ \begin{array}{ccc|c} 1 & * & * & * \\ 0 & 1 & * & * \\ 0 & 0 & 1 & * \end{array} \right]

Our matrix should look like this after Gaussian Elimination:

\left[ \begin{array}{ccc|c} 1 & 2 & -1 & -4 \\ 0 & 1 & -1 & 3 \\ 0 & 0 & 1 & -2 \end{array} \right]

The REF form is not unique, so that is only one of the possible valid outputs for the Gaussian Elimination.

Why do we want to have the matrix in REF form? A matrix in this form can easily be solved using back substitution. Consider this matrix as a series of linear equations, as we did before:

\begin{array}{rrrcl} x &+2y &-z &=& -4 \\ &+y &-z &=& 3 \\ &&z &=& -2 \end{array}

Now it would be very clear how to solve for the three variables.

The Gaussian Elimination Algorithm

Here is a diagram of how Gaussian Elimination works. On each iteration, the element circled in green is considered the pivot element, while the elements enclosed in the red square are the ones we intend to remove (zero) in each iteration.

Removing the red elements in the matrix is actually quite simple. Consider how you would eliminate x in equation B here:

\begin{array}{lrrcl}(A) & x & +2y & = & 4 \\(B) & 2x & +y & = & 5 \end{array}

Probably you would multiply equation A by 2, giving 2x + 4y = 8, then subtract B from it, giving 3y=3, eliminating x.

We can also write that as B = 2A - B. Basically to eliminate a variable, just multiply a row so it matches up, and subtract. This is middle school algebra.

To make things easier for us, we divide the row we are on so that the pivot is always 1. We do this now because we need them to be 1 anyways, and this avoids an unnecessary division in the next step.

We could, of course, not have the pivot always be 1, but we would have to do the divisions later when substituting to get the solutions. More on this later.

So to eliminate the variable under the pivot, multiply the whole row by that number. I have a picture to clarify:

We simply repeat this for all elements under the pivot.

An edge case

This is where it gets a little bit tricky. What if the pivot is 0?

We have no way of making it 1 by any kind of multiplication. Further, we cannot eliminate any elements below the pivot. What do we do now?

Simple. We swap the current row with any other row so that the pivot is not zero. Any row will do, so we’ll just pick the first one that fits.

If there is not a single element below the pivot that is not zero, the matrix is either under-determined or singular; either case it is unsolvable.

Here is my Haskell code on what I just covered:

gaussianReduce :: Matrix -> Matrix
gaussianReduce matrix = fixlastrow $ foldl reduceRow matrix [0..length matrix-1] where

 --swaps element at position a with element at position b.
 swap xs a b
  | a > b = swap xs b a
  | a == b = xs
  | a < b = let
  (p1,p2) = splitAt a xs
  (p3,p4) = splitAt (b-a-1) (tail p2)
  in p1 ++ [xs!!b] ++ p3 ++ [xs!!a] ++ (tail p4)

 reduceRow matrix1 r = let
  --first non-zero element on or below (r,r).
  firstnonzero = head $ filter (\x -> matrix1 !! x !! r /= 0) [r..length matrix1-1]

  --matrix with row swapped (if needed)
  matrix2 = swap matrix1 r firstnonzero

  --row we're working with
  row = matrix2 !! r

  --make it have 1 as the leading coefficient
  row1 = map (\x -> x / (row !! r)) row

  --subtract nr from row1 while multiplying
  subrow nr = let k = nr!!r in zipWith (\a b -> k*a - b) row1 nr

  --apply subrow to all rows below
  nextrows = map subrow $ drop (r+1) matrix2

  --concat the lists and repeat
  in take r matrix2 ++ [row1] ++ nextrows

 fixlastrow matrix' = let
  a = init matrix'; row = last matrix'; z = last row; nz = last (init row)
  in a ++ [init (init row) ++ [1, z / nz]]

Edit: There was a bug in the above code, found by Alan Zimmerman. I think it’s been fixed.

This may be a bit difficult to read because there is no syntax highlighting and the code is cut off. I’ll provide a link to the full source code at the end.

Admittedly Haskell may not have been the best language to implement this algorithm this particular way because there is so much state changing. Any language that allows mutable state would probably perform better than this code.

Notice that at the end, the last row does not get divided. The fixlastrow function corrects this problem.

Let’s test this code:

*Main> gaussianReduce [ [1,2,-1,-4], [2,3,-1,-11], [-2,0,-3,22] ]
[[1.0,2.0,-1.0,-4.0],[0.0,1.0,-1.0,3.0],[-0.0,0.0,1.0,-2.0]]

Excellent.

Finishing up

The next step of the algorithm is to solve the variables by back substitution. This is pretty easy, I think. My code keeps a list of already-found solutions. Folding from the right, each step it substitutes in the corresponding solution and multiplies & subtracts to get the next solution, adding that to the solution list.

--Solve a matrix (must already be in REF form) by back substitution.
substitute :: Matrix -> Row
substitute matrix = foldr next [last (last matrix)] (init matrix) where

 next row found = let
  subpart = init $ drop (length matrix - length found) row
  solution = last row - sum (zipWith (*) found subpart)
  in solution : found

To get a list of solutions from a matrix, we chain the substitute and gaussianReduce functions:

solve :: Matrix -> Row
solve = substitute . gaussianReduce
*Main> solve [ [1,2,-1,-4], [2,3,-1,-11], [-2,0,-3,22] ]
[-8.0,1.0,-2.0]

This means the solutions are (x,y,z) = (-8,1,-2). That seems correct, so we’re done!

The code is far from practical, though. Although it works, I haven’t really tested its performance (probably not very good), and it doesn’t handle all edge cases.

2009 COMC: Problem 4B

This is the final (and probably hardest) problem of this year’s COMC (Canadian Open Mathematics Challenge). This is a 2.5 hour contest for high school students. I wrote this contest a few months ago in November, and I didn’t get this question. Looking at the results, the average mark for this question is a mere 0.2/10.

4. For each positive integer n, define f(n) to be the smallest positive integer s for which 1 + 2 + 3 + \cdots + (s-1) + s is divisible by n. For example, f(5) = 4 because 1 + 2 + 3 + 4 is divisible by 5 and none of 1, 1 + 2, or 1 + 2 + 3 are divisible by 5.

It seems that the first problem is understanding what the function f(n) actually means. I find it easier to define it in Haskell:

f n = head . filter (\x -> sum [1..x] `mod` n == 0) $ [1..]

Let’s use the term triangular number to define a number that can be written as 1 + 2 + 3 + \cdots + N. The first few triangular numbers are 1, 3, 6, and 10. The term triangular number comes from the fact that numbers like that can be arranged to form an equalateral triangle:

Lemma 0: T_n = \frac{n(n+1)}{2}

If we apply the arithmetic sum formula (\frac{(a+b)n}{2}) we can see that any triangular number can be written as \frac{n(n+1)}{2} and that any number of that form is a triangular number. Also, let’s call T_n the n^{th} triangular number. For example, T_5 = 15 because \frac{5(5+1)}{2} = 15.

Also, A | B, when A and B are positive integers, means that A is a factor of B, or conversely that B is a multiple of A.

I’ll rewrite the definition of f(n) for clarity:

f n = first (\x -> T_x divides n) $ [1..]

In plain English that means f(n) is the first triangular number that’s a multiple of n.

Let’s move on to the first problem:

(a) Determine all positive integers a such that f(a) = 8.

I think this problem is rather straightforward. What we need to do is find all of the numbers where f(a) = 8. If f(a) = 8, then the smallest triangular number that evenly divides a is T_8 or 36. What numbers does 36 evenly divide? Well, the divisors of 36 are [1,2,3,4,6,9,18,36].

But wait. The problem asks for the smallest triangular number that divides it. The triangular numbers smaller than 36 are [1,3,6,10,15,21,28], so if a is divisible by any of these numbers, then f(A) \neq 8.

So we filter out the numbers that are divisible by a smaller triangular number: 1 (divisible by 1), 2 (divisible by 6), 3 (divisible by 6), 4 (divisible by 28), and 6 (divisible by 6); we’re left with [9,12,18,36].

Therefore the solution for f(a) = 8 are a = 9,12,18,36.

The next problem is a bit harder:

(b) Prove that there are infinitely many odd positive integers b such that f(b+1) - f(b) > 2009.

Let’s start by defining a few lemmas.

Lemma 1: If w | T_z then f(w) \leq z.

Why is this?

If T_z is a multiple of w, then T_z evenly divides w. Therefore from our definition, the first triangular number that evenly divides w is at most T_z. From that, it’s obvious that f(w) is at most T_z.

This doesn’t necessarily mean that f(w) = z, as there might still be a smaller triangular number that also divides w.

Lemma 1a: If f(w) = z, then w | T_z.

This goes the other way as well. If f(w) = z, w must be a factor of T_z. Notice that this is pretty much just a redefinition of f(x), but it’s important to realize this.

Lemma 1b: For all odd integers y, f(y) = y-1.

Now we just have to prove that this is true for all values of a. Assume for now that f(y) = y-1. Using Lemma 1a, we know that y must be a factor of T_{y-1}.

There is another way of proving this:

We mentioned before that the formula for the n^{th} triangular number is \frac{n(n+1)}{2}. From this, the formula for T_{y-1} would be \frac{y(y-1)}{2} or y \frac{y-1}{2}.

We know that y is an odd number. Then y-1 must be even and \frac{y-1}{2} is an integer. From that we can conclude that y must be a factor of T_{y-1}.

Again, there might be a triangular number less than T_{y-1} which fits our statements but the smallest cannot be greater than T_{y-1}.

Lemma 2: f(2^a) = 2^{a+1} - 1 for all positive integers a.

Now let’s prove this.

Let m = f(2^a). From Lemma 1a, 2^a must be a factor of T_m. We can also write that as q 2^a = T_m for some integer q.

Substiting the triangular number formula, q 2^a = \frac{m(m+1)}{2}. Mutiplying by 2, q 2^{a+1} = m(m+1). Of (m,m+1), one of them is even and the other odd. The odd one could not possibly be divisible by any exponent of 2, so the even one of the two must be divisible by 2^{a+1}.

The smallest m for which this is possible is where m is the odd number and m+1 is the even number, and m = 2^{a+1}-1. Then m+1 = 2^{a+1}.

So now we have a lower bound for f(2^a). m can’t be any smaller than 2^{a+1}-1 so f(2^a) \geq 2^{a+1}-1. Now we just have to prove that this is true for all values of a.

Using Lemma 1b this time, if we can prove that 2^a is a factor of T_{2^{a+1}-1}, then f(2^a) \leq 2^{a+1}-1. This effectively produces an upper bound. We already have a lower bound, so this reduces f(2^a) to only one value.

It’s easy to show that 2^a is a factor of T_{2^{a+1}-1}. T_{2^{a+1}-1} = 2^{a+1} \frac{2^{a+1}-1}{2} = 2^a (2^{a+1}-1) which is a multiple of 2^a.

Now that we’ve proven our lemmas, we can apply them to solve the problem. Let b = 2^a-1 for some positive integer a in f(b+1) - f(b). Notice that b is odd, and that b+1 is a power of 2.

f(b+1)-f(b) = f(2^a)-f(2^a-1) = 2^{a+1}-1-f(2^a-1) \leq 2^{a+1}-1-(2^a-2) = 2^a+1

In short:

f(b+1)-f(b) \leq 2^a+1.

For any number a that is greater or equal to 11, 2^a+1 > 2009.

There are infinitely many integers a \geq 11, so there are infinitely many numbers b that satisfy f(b+1)-f(b)>2009, solving the problem.

This problem is tricky because although it’s easy enough to find an upper limit for f(y) where y is an odd integer, it’s not so easy when y is even. The ‘trick’ is not to find it for even integers, but to find the solution for the case where b = 2^a. Even then, this problem still requires a good grasp of number theory.

I’ll move on to the final problem.

(c) Determine, with proof, the smallest integer k for which the equation f(c) = f(c+k) has an odd positive integer solution for c.

Notice that there is a solution for k = 3 where f(9) = 8 and f(12) = 8.

After the contest I in fact ran a program to make sure that there is indeed no solution for k = 1 and k = 2. I didn’t know how to prove it at the time however. Also this contest did not permit the use of computers or calculators.

In this case the answer is in fact 3, but to prove that we’ll need to prove that there is no solution for f(c) = f(c+1) and f(c) = f(c+2), where c is an odd positive integer.

First case: Prove that f(c) = f(c+1) has no solution for c, where c is an odd integer.

If such a solution exists, there should be m where m = f(c) = f(c+1). By lemma 1, c is a factor of T_m and T_m = \frac{m(m+1)}{2} so qc = \frac{m(m+1)}{2} for some integer q.

By lemma 1b, because c is odd, m \leq c-1. Substituting, qc \leq \frac{c(c-1)}{2} so q \leq \frac{c-1}{2}.

But f(c+1) = m as well, so c+1 is also a factor of T_m. We can also write that as qc so both c+1 and c are factors of qc.

The numbers c and c+1 are consecutive integers, so lcm(c,c+1) = c(c+1). Then the minimum possible value for qc is c(c+1) and q \geq c+1.

At the same time, q \leq \frac{c-1}{2}. Obviously it is impossible for both to be true. This concludes our case by proof of contradiction.

Second case: Prove also that f(c) = f(c+2) has no solution for c.

The proof for this case is nearly identical for the previous case. Remember that c is an odd integer, so lcm(c,c+2) = c(c+2).

If we follow through on this the same way as the previous case, we should arrive with the contradicting inequalities q \geq c+2 and q \leq \frac{c-1}{2}.

We have an example for k=3 and we have shown that it is impossible for k=1 and k=2, leaving the answer as 3.