## How to Write your own Minesweeper AI

December 23, 2012

A while ago, I wrote a minesweeper AI. I intended to publish a writeup, but between university and life and exams, I never got around to it. But having just finished my Fall term, I have some time to write a decent overview of what I did.

A short 30-second video of the AI in action is here:

### How to Play Minesweeper

If you’re an experienced minesweeper player, you can probably skip this section. Otherwise, I’ll just give a quick overview of some basic strategies that we can use to solve an easy minesweeper game.

We start with a 10×10 Beginner’s grid, and click on a square in the middle:

We can quickly identify some of the mines. When a square showing the number 1 has exactly one unopened square adjacent to it, we know that square holds a mine.

Let’s go ahead and mark the mines:

Now the next strategy: once a 1 has a mine marked around it, we know that all the other squares around the 1 cannot be mines.

So let’s go ahead and click on the squares that we know are not mines:

Keep doing this. In this case, it turns out that these two simple strategies are enough to solve the Beginner’s grid:
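In code, the two rules might look something like this. This is just a sketch: the grid encoding (-1 for unopened, -2 for flagged, 0 to 8 for opened numbers, 9 for "known safe") is made up here for illustration, not necessarily what my program uses.

```java
public class SimpleRules {
    static final int UNOPENED = -1, FLAGGED = -2, SAFE = 9;

    // Count the neighbours of (r, c) equal to the given value.
    static int countAround(int[][] g, int r, int c, int value) {
        int n = 0;
        for (int dr = -1; dr <= 1; dr++)
            for (int dc = -1; dc <= 1; dc++) {
                if (dr == 0 && dc == 0) continue;
                int rr = r + dr, cc = c + dc;
                if (rr >= 0 && rr < g.length && cc >= 0 && cc < g[0].length
                        && g[rr][cc] == value) n++;
            }
        return n;
    }

    static void replaceAround(int[][] g, int r, int c, int from, int to) {
        for (int dr = -1; dr <= 1; dr++)
            for (int dc = -1; dc <= 1; dc++) {
                int rr = r + dr, cc = c + dc;
                if (rr >= 0 && rr < g.length && cc >= 0 && cc < g[0].length
                        && g[rr][cc] == from) g[rr][cc] = to;
            }
    }

    // Rule 1: if a number's unopened neighbours, plus any flags, exactly
    // account for it, every unopened neighbour must be a mine -- flag them.
    static boolean flagObviousMines(int[][] g, int r, int c) {
        int number = g[r][c];
        if (number <= 0) return false;
        int unopened = countAround(g, r, c, UNOPENED);
        int flagged = countAround(g, r, c, FLAGGED);
        if (unopened > 0 && unopened + flagged == number) {
            replaceAround(g, r, c, UNOPENED, FLAGGED);
            return true;
        }
        return false;
    }

    // Rule 2: if a number already touches that many flags, its remaining
    // unopened neighbours cannot be mines -- mark them safe to click.
    static boolean markObviousSafe(int[][] g, int r, int c) {
        int number = g[r][c];
        if (number <= 0) return false;
        if (countAround(g, r, c, FLAGGED) == number
                && countAround(g, r, c, UNOPENED) > 0) {
            replaceAround(g, r, c, UNOPENED, SAFE);
            return true;
        }
        return false;
    }
}
```

Sweeping the whole board with these two rules until nothing changes is exactly the "keep doing this" loop above.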

All this seems easy enough. Here’s what we’ll need to do:

1. Read the board. If we use a screenshot function, we can get a bitmap of all the pixels on the board. We just need to ‘read’ the numbers on the screen. Luckily for us, the numbers tend to have different colors: 1 is blue, 2 is green, 3 is red, and so on.
2. Compute.  Run the calculations, figure out where the mines are. Enough said.
3. Click the board. This step is easy. In Java, we can use the Robot class in the standard library to send mouse clicks to the screen.
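For step 3, here's a sketch of the clicking side, assuming the calibration routine has already produced a grid origin and a square size (the numbers below are placeholders, not the real ones):

```java
import java.awt.Robot;
import java.awt.event.InputEvent;

public class Clicker {
    // These would come from the calibration step; placeholder values here.
    static int originX = 15, originY = 100, cellSize = 16;

    // Map a (row, col) grid square to the pixel at its centre.
    static int[] cellToScreen(int row, int col) {
        return new int[] { originX + col * cellSize + cellSize / 2,
                           originY + row * cellSize + cellSize / 2 };
    }

    // Send a left click to a grid square via java.awt.Robot.
    static void clickCell(Robot robot, int row, int col) {
        int[] p = cellToScreen(row, col);
        robot.mouseMove(p[0], p[1]);
        robot.mousePress(InputEvent.BUTTON1_DOWN_MASK);
        robot.mouseRelease(InputEvent.BUTTON1_DOWN_MASK);
    }
}
```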

There’s not a whole lot to the board-reading step, so I’m going to skim over it quickly.

At the beginning of the run, while we have a completely empty grid, we invoke a calibration routine, which takes a screenshot and searches for anything that resembles a Minesweeper grid. Using heuristics, it determines the location of the grid, the size of a grid square, the dimensions of the board, and so on.

Now that we know where the squares are, if we want to read a square, we crop a small section of the screenshot and pass it to a detection routine, which looks at a few pixels and figures out what’s in the square.

A few complications came up in the detection routine:

• The color for the number 1 is very close to the color of an unopened square: both are a dark-blue color. To tell them apart, I computed a ‘variance’ for the patch: how far its pixels deviate from the patch’s average color.
• The color for 3 is identical to that for 7. Here, I used a simple edge-detection heuristic.
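The variance check from the first bullet can be sketched like this, with the patch given as a list of RGB triples (the exact threshold separating "flat square" from "digit on a background" would be tuned by hand):

```java
public class PatchStats {
    // Mean squared distance of each pixel from the patch's average colour.
    // A flat unopened square gives ~0; a patch containing the "1" digit on
    // its background gives a much larger value.
    static double colorVariance(int[][] rgb) {  // each rgb[i] = {r, g, b}
        double mr = 0, mg = 0, mb = 0;
        int n = rgb.length;
        for (int[] p : rgb) { mr += p[0]; mg += p[1]; mb += p[2]; }
        mr /= n; mg /= n; mb /= n;
        double var = 0;
        for (int[] p : rgb)
            var += (p[0] - mr) * (p[0] - mr)
                 + (p[1] - mg) * (p[1] - mg)
                 + (p[2] - mb) * (p[2] - mb);
        return var / n;
    }
}
```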

### Straightforward Algorithm

The trivially straightforward algorithm is actually good enough to solve the Beginner and Intermediate versions of the game a good percentage of the time. Occasionally, if we’re lucky, it even manages to solve an Advanced grid!

When humans play minesweeper competitively, they compete for the fastest possible time to solve a grid. So it doesn’t matter if we lose 20 games for every game we win: only the wins count.

This is clearly a silly metric when we’re a robot that can click as fast as we want to. Instead, we’ll challenge ourselves with a more interesting metric:

Win as many games as possible.

Consider the following scenario:

Using the straightforward method, we seem to be stuck.

Up until now, whenever we mark a square as having a mine or safe, we’ve only had to look at a single 3×3 chunk at a time. This strategy fails us here: the trick is to employ a multisquare algorithm – look at multiple different squares at once.

From the lower 2, we know that one of the two circled squares has a mine, while the other doesn’t. We just don’t know which one has the mine:

Although this doesn’t tell us anything right now, we can combine this information with the next 2: we can deduce that the two yellowed squares are empty:

Let’s click them to be sure.

And voilà. They’re empty. The rest of the puzzle can be solved easily, after we’ve made the deduction that those two squares were empty.

### The Tank Solver Algorithm

It’s difficult to make the computer think deductively like we just did. But there is a way to achieve the same results, without deductive thinking.

The idea for the Tank algorithm is to enumerate all possible configurations of mines for a position, and see what’s in common between these configurations.

In the example, there are two possible configurations:

You can check for yourself that no other configuration could work here. We’ve deduced that the one square with a cross must contain a mine, and the three squares shaded white below must not contain a mine:

This works even better than human deduction!

We always try to apply the simple algorithm first, and only bring in the Tank algorithm if that gets us stuck.

To implement the Tank algorithm, we first make a list of border tiles: all the tiles we aren’t sure about, but about which we have some partial information.

Now we have a list of $T$ border tiles. If we’re considering every possible configuration, there are $2^T$ of them. With backtracking, this number is cut down enough for this algorithm to be practical, but we can make one important optimization.
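Here's a minimal sketch of the enumeration at the heart of the Tank idea. For clarity it tries all $2^T$ masks rather than backtracking, and each revealed number contributes a constraint of the form "exactly k mines among these border tiles" (the representation as parallel arrays is my own simplification):

```java
import java.util.ArrayList;
import java.util.List;

public class TankSolver {
    // Enumerate every mine/no-mine assignment over T border tiles; keep the
    // assignments where, for each constraint i, exactly mines[i] of the
    // border tiles listed in tiles[i] carry a mine.
    static List<boolean[]> solutions(int T, int[][] tiles, int[] mines) {
        List<boolean[]> out = new ArrayList<>();
        for (int mask = 0; mask < (1 << T); mask++) {
            boolean consistent = true;
            for (int i = 0; i < tiles.length && consistent; i++) {
                int count = 0;
                for (int t : tiles[i]) if ((mask >> t & 1) == 1) count++;
                consistent = (count == mines[i]);
            }
            if (!consistent) continue;
            boolean[] sol = new boolean[T];
            for (int t = 0; t < T; t++) sol[t] = (mask >> t & 1) == 1;
            out.add(sol);
        }
        return out;
    }

    // What the configurations have in common: +1 = mine in every
    // configuration, -1 = empty in every configuration, 0 = undecided.
    static int[] verdicts(int T, int[][] tiles, int[] mines) {
        List<boolean[]> sols = solutions(T, tiles, mines);
        int[] v = new int[T];
        for (int t = 0; t < T; t++) {
            boolean always = true, never = true;
            for (boolean[] s : sols) {
                if (s[t]) never = false; else always = false;
            }
            v[t] = always ? 1 : never ? -1 : 0;
        }
        return v;
    }
}
```

For example, "one mine among tiles {0, 1}" combined with "two mines among tiles {0, 1, 2}" forces tile 2 to be a mine while leaving tiles 0 and 1 undecided.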

The optimization is segregating the border tiles into several disjoint regions:

If you look carefully, whatever happens in the green area has no effect on what happens in the pink area – we can effectively consider them separately.

How much of a speedup do we get? In this case, the green region has 10 tiles, the pink has 7. Taken together, we need to search through $2^{17}$ combinations. With segregation, we only have $2^{10} + 2^7$: about a 100x speedup.

In practice, the optimization took the algorithm from pausing for several seconds (sometimes minutes) to think, to giving the solution instantly.

### Probability: Making the Best Guess

Are we done now? Can our AI dutifully solve any minesweeper grid we throw at it, with 100% accuracy?

Unsurprisingly, no:

One of the two squares has a mine. It could be in either, with equal probability. No matter how cleverly we program our AI, we can’t do better than a 50-50 guess. Sorry.

The Tank solver fails here, no surprise. Under exactly what circumstances does the Tank algorithm fail?

If it failed, then for every border tile there exists some configuration in which that tile has a mine, and some configuration in which it is empty; otherwise the Tank solver would have ‘solved’ that particular tile.

In other words, if it failed, we are forced to guess. But before we put in a random guess, we can do some more analysis, just to make sure that we’re making the best guess we could make.

Try this. What do we do here:

From the 3 in the middle, we know that three of them are mines, as marked. But marking mines doesn’t give us any new information about the grid: in order to gain information, we have to uncover some square. Out of the 13 possible squares to uncover, it’s not at all clear which one is the best.

The Tank solver finds 11 possible configurations. Here they are:

Each of these 11 configurations should be equally likely to be the actual position, so we can assign each square a probability of containing a mine by counting in how many of the 11 configurations it holds one:

Our best guess would be to click on any of the squares marked ‘2’: each holds a mine in only 2 of the 11 configurations, so we stand a 9-in-11, or 82%, chance of being correct!
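The counting step can be sketched like this, taking the enumerated configurations as input (the array-of-booleans representation is an assumption made for the sketch):

```java
public class MineProbability {
    // configs[k][t] is true when border tile t holds a mine in the k-th
    // configuration found by the enumeration. All configurations are
    // treated as equally likely.
    static double[] mineProbabilities(boolean[][] configs) {
        double[] p = new double[configs[0].length];
        for (boolean[] c : configs)
            for (int t = 0; t < p.length; t++)
                if (c[t]) p[t] += 1.0;
        for (int t = 0; t < p.length; t++) p[t] /= configs.length;
        return p;
    }

    // The best guess is the tile least likely to be a mine.
    static int bestGuess(boolean[][] configs) {
        double[] p = mineProbabilities(configs);
        int best = 0;
        for (int t = 1; t < p.length; t++)
            if (p[t] < p[best]) best = t;
        return best;
    }
}
```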

### Two Endgame Tactics

Up until now, we haven’t utilized this guy:

The mine counter. Normally this information isn’t of much use, but in many endgame cases it saves us from guessing.

For example:

Here, we would be stuck with a 50-50 guess between two equally likely possibilities.

But what if the mine counter reads 1? The two-mine configuration is eliminated, leaving just one possibility. We can safely open the three tiles on the perimeter.

Now on to our final tactic.

So far we have assumed that we only have information on a tile if there’s a number next to it. For the most part, that’s true. If you pick a tile in some distant unexplored corner, who knows if there’s a mine there?

Exceptions can arise in the endgame:

The mine counter reads 2. Each of the two circled regions gives us a 50-50 chance – and the Tank algorithm stops here.

Of course, the middle square is safe!

To handle these cases, we modify the algorithm: when there aren’t many tiles left, we run the recursion on all the remaining tiles, not just the border tiles.

The two tricks here share the property that they rely on the mine counter. Reading the mine counter off the screen, however, is a non-trivial task that I won’t attempt; instead, the total number of mines in the grid is hard-coded into the program, which keeps track of the mines left internally.
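In terms of implementation, the mine counter amounts to one more filter over the enumerated configurations. A sketch, assuming we are in the endgame and enumerating all remaining tiles, so that the total must match the counter exactly:

```java
import java.util.ArrayList;
import java.util.List;

public class MineCounterFilter {
    // Drop every enumerated configuration whose total mine count disagrees
    // with the mine counter. When the recursion runs over all remaining
    // tiles (not just border tiles), the totals must match exactly.
    static List<boolean[]> filterByCount(List<boolean[]> configs, int minesLeft) {
        List<boolean[]> out = new ArrayList<>();
        for (boolean[] c : configs) {
            int total = 0;
            for (boolean b : c) if (b) total++;
            if (total == minesLeft) out.add(c);
        }
        return out;
    }
}
```

In the 50-50 example above, this is precisely the filter that throws away the two-mine configuration when the counter reads 1.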

### Conclusion, Results, and Source Code

At this point, I’m convinced that there isn’t much more we could do to improve the win rate. The algorithm uses every last piece of information available, and only fails when it’s provably certain that guessing is needed.

How well does it work? We’ll use the success rate for the advanced grid as a benchmark.

• The naïve algorithm could not solve it, unless we get very lucky.
• Tank Solver with probabilistic guessing solves it about 20% of the time.
• Adding the two endgame tricks bumps it up to a 50% success rate.

Here’s proof:

I’m done for now; the source code for the project is available on GitHub if anyone is inclined to look at it or tinker with it:

https://github.com/luckytoilet/MSolver

## A CMOQR Problem and why not to Trust Brute Force

March 6, 2012

Recently I was invited to compete in the CMOQR – a qualifier contest for the Canadian Math Olympiad. The contest consisted of eight problems, and contestants were allowed about a week’s time to submit written solutions via email.

After a few days, I was able to solve all of the problems except one — the second part of the seventh problem:

Seven people participate in a tournament, in which each pair of players play one game, and one player is declared the winner and the other the loser. A triplet ABC is considered cyclic if A beats B, B beats C, and C beats A.

Can you always separate the seven players into two rooms, so that neither room contains a cyclic triplet?

(Note: the first half of the problem asked the same question for six people — and it’s not too difficult to prove that no matter what, we can put them into two rooms so that neither the first nor the second room contains a cyclic triplet.)

But what happens when we add another person? Can we still put four people in one room and three people in the other, so that neither room contains a cyclic triplet?

There are two possibilities here:

• One, it’s always possible. No matter what combination of wins and losses has occurred, we can always separate the players into two rooms in such a way. To prove this, we’d need to systematically consider all possible cases and, one by one, verify that the separation is possible for each.
• Two, it’s not always possible. Then there is some counterexample — some combination of wins and losses so that no matter how we separate them, one of the rooms has a cyclic triplet. This is easier to prove: provided that we have the counterexample, we just have to verify that indeed, this case is a counterexample to the statement.

But there’s a problem. Which of the cases does the solution fall into? That is, should we look for a quick solution by counterexample, or for some mathematical invariant showing that no counterexample can exist?

### Brute Force?

It would be really helpful to know for certain whether a counterexample existed. What if we wrote a computer program to check all the cases? After all, there are only 7 people in the problem, and 7 choose 2 = 21 games played. Since each game is won by one player or the other, there are only 2^21 combinations overall (although that does count some duplicates). And 2^21 is slightly over two million cases to check — completely within the bounds of brute force.

So I coded up a possibility-checker. Generate all 2^21 possible arrangements, then for each one, check all possible ways to separate them into two rooms. If it turns out that no matter how we arrange them, a cyclic triplet persists, then display the counterexample. Simple.

I ran the program. It quickly cycled through every possible arrangement, exiting three seconds later without producing a counterexample.

Alright. So there’s no counterexample. I would have to find some nice mathematical invariant, showing that no matter what, there is always some way to group the players so that neither room has a cyclic triplet.

But no such invariant came. I tried several things, but in each attempt couldn’t quite show that the statement held for every case. I knew that there was no counterexample, but I couldn’t prove it. But why? There must be some tricky way to show that no counterexample existed; whatever it was, I couldn’t find it.

### Brute Force poorly implemented

Reluctantly, as the deadline came and passed, I submitted my set of solutions without solving the problem. When the solutions came out a week later, the solution to this problem did not contain any tricky way to disprove the counterexample. Instead, what I found was this:

Let $A_0 \ldots A_6$ be seven players. Let $A_i$ beat $A_j$ when the difference $i-j \equiv 1,2,4 \mod 7$.

Huh? A counterexample, really? Let’s look at it.

Everything is symmetric — we can ‘cycle’ the players around without changing anything. Also, among any four of the seven players, two must be consecutive in the cycle. Let them be $A_0$ and $A_1$.

At this point everything falls into place: in any subset of four players, three of them form a cyclic triplet.
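We can also verify the counterexample mechanically. The sketch below tries all $2^7$ ways to split the players into two rooms and confirms that every split leaves a cyclic triplet in one room or the other:

```java
public class CmoqrCheck {
    // In the solution's tournament, A_i beats A_j when (i - j) mod 7 is 1, 2 or 4.
    static boolean beats(int i, int j) {
        int d = ((i - j) % 7 + 7) % 7;
        return d == 1 || d == 2 || d == 4;
    }

    // Does this set of players (given as a 7-bit mask) contain a cyclic triplet?
    static boolean hasCyclicTriplet(int mask) {
        for (int a = 0; a < 7; a++)
            for (int b = 0; b < 7; b++)
                for (int c = 0; c < 7; c++)
                    if (a != b && b != c && a != c
                            && (mask >> a & 1) == 1 && (mask >> b & 1) == 1
                            && (mask >> c & 1) == 1
                            && beats(a, b) && beats(b, c) && beats(c, a))
                        return true;
        return false;
    }

    // Try all 2^7 splits; return true if some split leaves both rooms
    // free of cyclic triplets.
    static boolean someSplitWorks() {
        for (int mask = 0; mask < 128; mask++)
            if (!hasCyclicTriplet(mask) && !hasCyclicTriplet(127 ^ mask))
                return true;
        return false;
    }
}
```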

But wait … my program had not found any counterexamples! And right here is a counterexample! The culprit was obvious (the reader may have foreseen this by now) — of course, there had to be a problem with my program.

Running my code through a debugger, I found a logic error in the routine converting binary numbers to array configurations, meaning that not all possible configurations were tried. As a result, the counterexample slipped through the hole.

After fixing the code, the program found not one, but a total of 7520 (although not necessarily distinct) counterexamples. Most of them had no elegant structure, but the solution’s configuration was among them.

For the interested, here is the fixed code.

### When to Start Over?

It is true that the program could have been better written, better debugged. But how could you know whether a counterexample existed and your program failed to find it, or whether no counterexample existed at all?

In hindsight, it seems that writing the brute force program made me worse off than if I hadn’t written it at all. After the program ran without finding a single counterexample, I was confident that no counterexample existed, and set out about proving that, instead of looking for counterexamples or symmetry.

When you are stuck on such a math problem — that is, after making a bit of progress you get stuck — it might be profitable to start over. More often than I would like, I prove a series of neat things, without being able to prove the desired result. Then a look at the solutions manual reveals that a very short solution — one or two steps — lay in the opposite direction.

I’ll put an end to my philosophical musings of the day. Fortunately, the cutoff for the CMOQR was low enough that I met it even without solving every single problem.

## My attempt at a Conquest AI

December 9, 2011

Before I move on, let me introduce Conquest – a chesslike minigame in Runescape. This isn’t a detailed description of the rules, just an overview to give a feel for how the game works, if you haven’t seen it before.

You have a 20×20 board and some pieces (which you choose yourself and get to set up wherever you want, with some restrictions). You win by killing all of your opponent’s pieces.

Each piece has a certain move radius — so it can move to any square within that radius (diagonals only counting as a distance of one). Then, once it has moved, it is allowed to attack an enemy within its range — again varying from piece to piece.

That’s the gist of the game. Oh yes, at certain times you can also use some silly special attacks in addition to moving and attacking — for instance, freeze an enemy piece for one move or temporarily boost the attack of one of your own pieces. And these special moves can be used on any pieces during your move, and in any combination.

### A Conquest AI… or not

So some months ago, I was playing Conquest. I was a decent player, but after a string of losses, I thought: maybe I can write an AI to play this game for me — perhaps better than a human can? Perhaps the same way you would write a chess AI?

Well, I scribbled some back of the envelope calculations, but I quickly realized that the number of possible moves at each position was far too large.

How large? A player has maybe 5 pieces, and if a piece’s average move radius is 4, then we already have 5*9*9 possible moves (a radius of 4 gives a 9×9 neighbourhood), which is over 400. And that’s if we never attack or use special moves. Since special moves can be stacked on top of each other, assuming 5 valid targets per special move and 3 special moves, we would multiply the 400 again by 5*5*5. For one move, we would need to consider over 50,000 possibilities.

In comparison, from any position in chess there are typically 30 or 40 legal moves. So the standard minimax algorithm would probably be no good, at least not without an incredible amount of heuristics. I had to try something different for this to work.

### Random has a chance

It turns out that there exists a very different strategy that has worked quite well for Go-playing AIs. Instead of constructing and evaluating a huge search tree to find the best move, we simulate thousands of random games, with random moves from each side, and play the move that gives us the best win percentage. All else being equal, a position that gives you a 90% chance of winning under random play is usually better than one that gives only a 30% chance.
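Here's the idea on a toy game standing in for Conquest (a simple take-1-to-3-stones Nim, which I'm using purely for illustration): rate each legal move by random playouts and pick the best win rate.

```java
import java.util.Random;

public class MonteCarloNim {
    static final Random RAND = new Random(0);

    // Random playout of the toy game: players alternate removing 1-3 stones,
    // and whoever takes the last stone wins. `usToMove` says whose turn it
    // is at the current heap; returns true if we end up winning.
    static boolean playout(int heap, boolean usToMove) {
        while (true) {
            int take = 1 + RAND.nextInt(Math.min(3, heap));
            heap -= take;
            if (heap == 0) return usToMove;  // the player who just moved wins
            usToMove = !usToMove;
        }
    }

    // Rate each legal move by its win percentage over n random playouts,
    // and return the move with the best percentage.
    static int bestMove(int heap, int n) {
        int best = 1;
        double bestRate = -1;
        for (int take = 1; take <= Math.min(3, heap); take++) {
            int rest = heap - take;
            int wins = 0;
            for (int i = 0; i < n; i++)
                if (rest == 0 || playout(rest, false)) wins++;
            double rate = wins / (double) n;
            if (rate > bestRate) { bestRate = rate; best = take; }
        }
        return best;
    }
}
```

From a heap of 3 the playouts quickly single out taking all 3 stones, the immediately winning move, which is the whole point: no search tree, just statistics.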

Here’s a screenshot of a connect-4 AI using this technique:

The Monte Carlo approach, as that’s what it’s called, seems absurd — there’s no way playing random moves can beat a good minimax! Indeed, Monte Carlo is never used to play chess, and I could sometimes beat the above AI. But where Monte Carlo really shines is when standard minimax is impractical — my scenario.

After finding and reading a few papers describing the method in detail, I was ready to start coding. However, there is still one more catch: to make it practical to set up the AI against other players over Runescape (and where the time limit for each move is usually 30 seconds), I would have to find a way to integrate the back end — the part that does the computations — with a front end that relayed the moves back and forth to the Runescape game client.

All is not lost, however: there were several freely available hacked clients for botting purposes. Powerbot, for one, allowed users to write Java scripts that automated tedious ingame actions.

That was about as far as I got, though, unfortunately. Coding the game logic proved trickier than expected. A month or so after I started, my account was banned for botting; later, Jagex (the company behind the game) made an update that rendered Powerbot (as well as every other similar hacked client) unusable. With neither a game account nor a practical front end, I called it quits.

## Solving the AB Game by Brute Force

November 7, 2011

The AB game, a variant of Mastermind, is played with pencil and paper by two people. One person chooses a four-digit number in which no digit is repeated, say X, and the other tries to guess it in as few moves as possible. After the guesser guesses some number, say Y, the first person gives information on how closely Y matched X:

• If some digit in Y coincides with a digit in X (in the same position), then the guesser scores an A.
• If some digit in Y exists in X but is in the wrong place, then the guesser scores a B.

For instance, if X is 1234 and we guess 2674, we get an A and a B, because the 4 is in the right place, and the 2 is one of the right numbers but isn’t in the right place.

This proceeds until the guesser gets the exact number.
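Scoring a guess against the secret is straightforward when no digit repeats. A small sketch of the rule as stated:

```java
public class ABFeedback {
    // An A for each digit in the right place; a B for each guessed digit
    // that appears in the secret but in the wrong place. Digits are
    // assumed distinct, as in the basic game.
    static int[] score(String secret, String guess) {
        int a = 0, b = 0;
        for (int i = 0; i < secret.length(); i++) {
            if (secret.charAt(i) == guess.charAt(i)) a++;
            else if (secret.indexOf(guess.charAt(i)) >= 0) b++;
        }
        return new int[] { a, b };
    }
}
```

On the example above, guessing 2674 against the secret 1234 scores one A (the 4) and one B (the 2).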

When humans (or at least beginners) play the AB game, they usually do some form of uncoordinated trial and error, which gets the right answer after some large number of moves. This takes anywhere from about 8 to possibly 30 guesses. When I played this game with a friend, I didn’t have a very systematic strategy, but I wondered if a computer program could solve the game, always entering the optimal guess.

### A Bruteforce Solver

My first approach happened to work fairly well. Simply, the computer keeps track of a list of all possible numbers that the answer can be. At random, the computer guesses one of the numbers, and upon receiving feedback, eliminates every number in its list that doesn’t match that feedback. Quickly it eliminates whole swaths of combinations and arrives at the answer.

A typical session might go like this:

Guess 1: 6297. Score: 5040
b
Guess 2: 8512. Score: 1440
abb
Guess 3: 2315. Score: 83
bb
Guess 4: 5842. Score: 29
bb
Guess 5: 9581. Score: 13
ab
Guess 6: 8021. Score: 1


It took only 5 guesses for the computer to narrow down the choices to the only possible answer (8021).

A variant that is usually much harder for humans to solve is to allow the chosen number to have repeated digits. Although significantly trickier to guess by trial and error, the variant doesn’t seem to affect brute force too much:

Guess 1: 1796. Score: 10000
b
Guess 2: 5881. Score: 3048
a
Guess 3: 0131. Score: 531
abb
Guess 4: 3311. Score: 15
aa
Guess 5: 3201. Score: 9
abb
Guess 6: 2011. Score: 1


The computer’s average of five or six guesses is much better than a human normally does! (Although I haven’t researched possible human strategies.)

### Optimal or Not?

At this point you may begin to wonder if this strategy is the optimal one. Unfortunately, it is not — and I only need one counterexample to demonstrate that.

Suppose that instead of four digits, you were allowed to choose four letters from A to Z. You choose the combination ABCW. Now suppose that the computer ‘knows’ that the first three letters are ABC — that is, it has eliminated all other combinations except for ABCD, ABCE, …, ABCZ.

By the random guessing algorithm, the computer is forced to guess ABCJ, ABCP, etc, until it eventually hits on ABCW at random. This may take a very high number of guesses.

A smarter strategy would be to guess combinations of four unknown letters, say DEFG, then HIJK, etc. Instead of eliminating one letter per guess, you eliminate four at a time. Although some of these guesses have no chance of being correct, the number of guesses required is smaller in the long run.

### Source Code

I’ll put my somewhat unwieldy Java code here for all to see:



import java.util.*;

public class ABSolver{

    // Do cc1 and cc2 together match the feedback (A, B)?
    // If repeats are allowed, cc2 is matched against cc1.
    static boolean fits(char[] cc1, char[] cc2, int A, int B){

        int a = 0;
        int b = 0;
        for(int i = 0; i < 4; i++){
            if(cc1[i] == cc2[i]) a++;
            else{
                for(int j = 0; j < 4; j++){
                    if(i != j && cc1[j] == cc2[i] && cc1[j] != cc2[j]){
                        b++;
                        break;
                    }
                }
            }
        }

        return a == A && b == B;
    }

    public static void main(String[] args){

        Random rand = new Random();
        Scanner scan = new Scanner(System.in);

        // Combinations that haven't been eliminated yet
        List<char[]> ok = new ArrayList<char[]>(10000);

        // Generate all possible combinations
        for(int i = 0; i <= 9999; i++){

            String i_s = Integer.toString(i);
            while(i_s.length() != 4) i_s = "0" + i_s;

            // Check for repeated digits
            char a = i_s.charAt(0);
            char b = i_s.charAt(1);
            char c = i_s.charAt(2);
            char d = i_s.charAt(3);
            boolean same = a==b || a==c || a==d || b==c || b==d || c==d;

            // Comment this line out if we're allowing repeats
            if(same) continue;

            ok.add(i_s.toCharArray());
        }

        // Pick the first guess randomly
        char[] firstg = ok.get(rand.nextInt(ok.size()));
        char[] guess = null;

        int nguesses = 1;

        // Question-answer cycle until we get it right
        while(true){

            if(nguesses > 1){
                String ans = scan.nextLine();
                int A = 0;
                int B = 0;
                for(char cc : ans.toCharArray()){
                    if(cc == 'a') A++;
                    if(cc == 'b') B++;
                }

                if(A == 4) return;

                // Keep only the combinations that still fit the feedback
                List<char[]> ok_ = new ArrayList<char[]>(ok.size());
                for(char[] zhe : ok){
                    if(fits(zhe, guess, A, B))
                        ok_.add(zhe);
                }
                ok = ok_;
            }

            char[] nextguess;
            if(nguesses == 1)
                nextguess = firstg;
            else
                nextguess = ok.get(rand.nextInt(ok.size()));

            System.out.println("Guess " + nguesses + ": " + new String(nextguess)
                    + ". Score: " + ok.size());

            // We win!
            if(ok.size() == 1)
                return;

            guess = nextguess;
            nguesses++;
        }
    }
}


## Algorithmic Counterpoint: How I Wrote a Computer Program to Generate Music at Mathcamp 2011

August 10, 2011

Hello all; I have just returned from Mathcamp, a five week math program for high school students. I’m not going to write in detail about life at Mathcamp, other than that it was amazing in every way.

This year at Mathcamp, I chose to do a computer programming project to ‘teach’ a computer to write ‘music’. The word ‘music’ is in quotes because what it’s expected to generate is more or less random and nothing like actual music by Bach or Mozart; on the other hand, the ‘music’ generated resembles actual music closely enough that it can still be called music, not a random string of notes. Anyways, here’s a short sample of what it can generate:

The ‘music’ here is really just a set of rules backed by a random number generator. What the program does is the following:

1. Generate a main melody using a bunch of rules: i.e., what notes are allowed to go with each other, what rhythms are allowed to go with what notes, etc.
2. Using the main melody as a sort of a baseline, generate one or more counterpoint melodies: a counterpoint melody is constrained by the same rules that the main melody must follow, but also has to harmonize with the existing melody, thus being constrained by another set of rules.
3. Once all the notes are generated, the program simply uses some library — in our case JFugue — to play the notes through the speakers.

The bulk of our time was spent implementing, tweaking, and debugging the rules and the various other substeps in the above process.

The source code, mostly written by myself over about two weeks, is available as a Google Code project.

## Eyes!

June 14, 2011

I randomly made this thing out of boredom:

Cute, but it’s probably a sign that I need some better ideas for hobby programming projects xD

It uses the pyglet library, and is only a page or so of python code.

## Coding a Tetris AI using a Genetic Algorithm

May 27, 2011

About two years ago, when I was in grade 9, I decided to make a tetris clone in Java. One or two months later, I had a fully working and playable tetris, complete with background and sound effects and scoring and graphical effects when a line is cleared.

A year after I created it, I decided to go ahead and write an AI to play my game. Although it played much better than I could (I’m not a particularly good tetris player), it would still die after a few dozen lines on a standard 10-column board. That means it was pretty outclassed by existing AIs that can go on for thousands of lines!

Now, another year later, I coupled my previous AI algorithm with a genetic algorithm, with some pretty neat results:

### Rules of the Game

How does one make a computer program to play tetris? Or more generally, how does one play tetris in the first place? The rules of tetris seem to me to be better viewed than explained, but I’ll give a quick overview of tetris gameplay and tetris strategies.

As I’m sure most of you know, in tetris you have blocks of four squares (tetrominoes) falling from the top of the board. The player moves and rotates the blocks and stacks them up:

Here the black outline is one of the places you can put the funny shaped block. And when a row is filled entirely with blocks (the row with the red outline below), you get a clear; that entire row is removed and the rest of the board is shifted down (often with a series of beeping noises and a small increase to your score):

If the blocks don’t get cleared and they stack to the top of the board, you lose. So ideally you want to fill as many lines as possible and avoid stacking the blocks up. Very simple.

### Simple Strategies

So what are some things that we can try to avoid losing the game? Some things immediately come to mind. We lose when the blocks get stacked up so high that we can’t put in new blocks, right? So it makes sense to avoid piling up large numbers of blocks in high positions in the first place, or to penalize height:

So for each position the computer can calculate a height penalty — for each block, the computer adds a number depending on how high it is. Then when the AI tries to decide where to put the next block, it ‘knows’ that piling blocks up really high is a bad thing and tries to avoid it.

Another strategy that seems pretty obvious is to try to get clears! We assign a positive score for each line we clear, in other words we reward clears. Duh.

Anyone who has played a few games of tetris would probably subconsciously know a number of intuitive strategies — packing together blocks as tightly as possible, for instance. How do we translate that into code? Well, to start, blocks that are packed tightly have few or no holes, which we’ll define as any empty space with a block somewhere directly above it:

Why don’t we want holes? A row is only considered a clear if the entire row is filled — if there’s even a single hole in the row, it doesn’t get removed. Not good. So it makes sense to give a negative score to positions with holes in them — to penalize holes.

It’s best to try not to have any holes at all, but sometimes having a hole or two is inevitable. What can we do after we have holes in our formation? Good question, but we should not pile more blocks on top of our holes. If we define a blockade as any block that’s directly above a hole, we should penalize blockades:

Why are blockades bad again? Well, a hole only stops being a hole if there are no more blocks above it, so stacking more blocks above holes would only make it harder to remove the hole.

These are all the obvious strategies. I also put in less obvious scores rewarding or penalizing hugging the wall (an edge of the current block touching the edge of the wall), hugging the floor (an edge of the current block touching the floor), and flattening (rewarding when an edge of the current block touches an existing block). Again, these are harder to justify and are mostly for fine-tuning — it’s not even clear whether they should be positive or negative.

### A Hedonistic AI

These strategies are sufficient to make a passable tetris AI. This algorithm is very simple:

1. Look at the current block and the next block and simulate ALL possible combinations (positions and rotations) of the two blocks.
2. Calculate a score for each of the positions.
3. Move the block to the position with the highest score and repeat.

To give a score for a position, we would use an equation like this:

Score = A * Sum of Heights
+ B * Number of Clears
+ C * Number of Holes
+ D * Number of Blockades

Where A, B, C, and D are weights that we decide — how important each of the factors is. I initially came up with some pretty arbitrary values:

• -0.03 for the height multiplier
• -7.5 per hole
• +8.0 per clear
• +3.0 for each edge touching another block
• +2.5 for each edge touching the wall
• +5.0 for each edge touching the floor

The reason I gave the height such a low multiplier is that the height sum racks up so quickly: it counts every block on the field, so even a small weight produces a large penalty. The numbers I chose seem pretty reasonable — the AI puts blocks more or less where a human would put them.
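As a concrete sketch, the weighted sum can be written as a small function. This is my own reconstruction, not the original program's code; the class and method names are hypothetical, and the weights are the hand-picked values listed above:

```java
// Sketch of the position-scoring function described above.
// BoardScore and its method are my own names; the weights are the
// hand-picked values from the post.
public class BoardScore {
    static final double HEIGHT_W = -0.03; // per unit of total height
    static final double HOLE_W   = -7.5;  // per hole
    static final double CLEAR_W  =  8.0;  // per cleared line
    static final double BLOCK_W  =  3.0;  // per edge touching another block
    static final double WALL_W   =  2.5;  // per edge touching the wall
    static final double FLOOR_W  =  5.0;  // per edge touching the floor

    // Combine the statistics of a candidate position into one score.
    public static double score(int sumHeights, int clears, int holes,
                               int blockEdges, int wallEdges, int floorEdges) {
        return HEIGHT_W * sumHeights
             + CLEAR_W  * clears
             + HOLE_W   * holes
             + BLOCK_W  * blockEdges
             + WALL_W   * wallEdges
             + FLOOR_W  * floorEdges;
    }
}
```

With these weights, for example, a position that clears one line while touching two blocks and one wall on a field of total height 100 scores -3.0 + 8.0 + 6.0 + 2.5 = 13.5.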

### Playing God: Bringing in the Genetic Algorithm

The biggest problem with this method is that we chose the weights pretty much arbitrarily. They might work well or they might not, but we don’t really know whether there are better values for them.

What can we do about it? We could brute force it — but with solutions that range across a continuum, there is a better way — a genetic algorithm.

A genetic algorithm is just a search heuristic; it derives its ideas from evolution, where nature creates complex and sophisticated organisms by making random changes to the DNA.

Charles Darwin specified four criteria for the process of natural selection to occur:

1. Variation: Organisms in a population must be slightly different from one another.
2. Inheritance: Traits of parent organisms must be passed onto their offspring.
3. Limited space: Only some of the offspring in any generation are able to survive and pass on their genes.
4. Competition: Individuals that are more fit are more likely to pass on their genes to the next generation.

In order to turn this into an algorithm, we’ll need — let’s quote this article:

1. chromosome which expresses a possible solution to the problem as a string
2. fitness function which takes a chromosome as input and returns a higher value for better solutions
3. population which is just a set of many chromosomes
4. selection method which determines how parents are selected for breeding from the population
5. crossover operation which determines how parents combine to produce offspring
6. mutation operation which determines how random deviations manifest themselves

We begin by constructing a chromosome — a solution to the problem of making an AI to play tetris. This is pretty easy, since we can already run an AI with a set of seven weights. So the chromosome is simply an array of seven doubles.

Next, our fitness function is very easy too, since the AI already has a scoring system. Basically the program runs the tetris AI at max speed on an 8-column board until it dies, then uses the score it earned as the fitness. Why only 8 columns and not the normal 10? In later generations, AI’s are able to survive for hours on the 10-column version, but when we reduce the board to 8 columns, even the best AI’s survive for only a few seconds to a minute (we’re still talking about hundreds or thousands of lines here).

I used Nintendo’s original scoring system for tetris — 40 points for one clear, 120 points for two simultaneous clears, 300 for three simultaneous clears, and 1200 for four simultaneous clears. I also added 1 point for each block placed, to differentiate between AI’s that couldn’t score any lines.
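As a sketch of how the fitness function might tally points under that scheme (my own reconstruction; the class and method names are hypothetical):

```java
// Sketch of the scoring used by the fitness function, assuming
// Nintendo line values plus 1 point per block placed.
public class Fitness {
    // Points for clearing 0, 1, 2, 3, or 4 lines at once.
    static final int[] LINE_POINTS = {0, 40, 120, 300, 1200};

    // Score contribution of placing one block that clears `lines` rows.
    public static int scorePlacement(int lines) {
        return 1 + LINE_POINTS[lines];
    }
}
```

A single block can clear at most four rows at once, hence the five entries.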

Third, I chose a population of sixteen chromosomes. Initially the chromosomes are filled with randomly generated numbers (floating-point values drawn from a normal distribution). From each generation onwards, the population’s chromosomes are derived from the best candidates of the previous generation (more on this later) — but the population size stays the same.

Next, for the selection method I chose a simple tournament method. After we run each candidate from a generation and collect all of their scores, we randomly pair up the candidates. For each pair, we take the high scorer — the winner — and discard the low scorer. Then, we pair up the winners randomly again to generate new offspring for the next generation.

Lastly, I implemented the crossover as follows: for each of the seven attributes in the offspring’s chromosome, we randomly select the respective attribute from the two parents with equal probability.

Occasionally, we have a mutation – a trait in an offspring that does not come from either parent. Each time we give an offspring an attribute, we have a 10% chance of assigning a completely random value to that attribute instead.
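Putting the selection, crossover, and mutation steps together, here is a minimal sketch (all names are mine, not from the original program; a chromosome is just an array of seven weights):

```java
import java.util.*;

// Sketch of the tournament selection and breeding steps described above.
// Breeding and its methods are hypothetical names.
public class Breeding {
    static final int GENES = 7;
    static final double MUTATION_RATE = 0.10;

    // Tournament selection: shuffle the population, pair up neighbours,
    // and keep the higher scorer of each pair.
    public static List<double[]> tournament(List<double[]> pop,
                                            Map<double[], Double> score,
                                            Random rng) {
        List<double[]> shuffled = new ArrayList<>(pop);
        Collections.shuffle(shuffled, rng);
        List<double[]> winners = new ArrayList<>();
        for (int i = 0; i + 1 < shuffled.size(); i += 2) {
            double[] a = shuffled.get(i), b = shuffled.get(i + 1);
            winners.add(score.get(a) >= score.get(b) ? a : b);
        }
        return winners;
    }

    // Crossover with mutation: each gene is copied from a random parent,
    // except that 10% of the time it is replaced by a fresh random value.
    public static double[] breed(double[] mom, double[] dad, Random rng) {
        double[] child = new double[GENES];
        for (int i = 0; i < GENES; i++) {
            if (rng.nextDouble() < MUTATION_RATE)
                child[i] = rng.nextGaussian();      // mutation
            else
                child[i] = rng.nextBoolean() ? mom[i] : dad[i];
        }
        return child;
    }
}
```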

### Results of the Genetic Algorithm

In the first generation or two, most candidates performed horribly. Many candidates had completely wrong weights — rewarding height and rewarding holes! Needless to say, these programs did not survive very long. But the genetic algorithm quickly came up with some decent solutions, and pretty soon most candidates were scoring a few hundred lines (my original values gave about 20 lines on the 8-column version, by comparison).

After running the genetic algorithm for about ten generations, I picked a candidate that was scoring decently:

• -3.78 for the height multiplier
• -2.31 per hole
• +1.6 per clear
• +3.97 for each edge touching another block
• +6.52 for each edge touching the wall
• +0.65 for each edge touching the floor

Whoa — that’s a huge height multiplier. Perhaps the multiplier is so big that it just overwhelms everything else in the list — remember that the height multiplier applies to every block on the field. Also, holes and blockades might not have been as bad as I thought — and what’s with the huge bonus for touching the wall?

I ran the whole thing again from scratch for ten generations — using different randomly generated starting values. What it came up with made me think at first that my program had a bug somewhere:

• -3.71 for the height multiplier
• -4.79 per hole
• -1.87 per clear
• +4.8 for each edge touching another block
• +3.22 for each edge touching the wall
• +3.68 for each edge touching the floor

Yup, this one rewarded blockades and penalized clears. And it would outperform both my naive values and the first set of AI values — I used this set in the video. It seems to put blocks in really bad positions, creating holes and blockades unnecessarily — but it still does better than anything else I have.

### Conclusion

Was the exercise a success? I would say partially so. The good is that it came up with a much better AI than the one I originally had. The bad is that several things could have been done better:

• What I said about the height multiplier overwhelming everything else is a bit misleading. While the height multiplier does apply to every block on the field, in practice it only affects the current block. The reason: the rest of the field — everything but the current block — stays constant no matter where the current block goes. It’s like voting for everybody: you’ve effectively voted for nobody, since your votes have no effect on the outcome.
• The lines cleared factor also turned out to be a bit misleading. While the second AI had a negative weight for clearing a line, it still cleared lines whenever it could: again tying back to the height multiplier. Clearing a line does exactly what it says: removing an entire row of blocks — and removing that many blocks does a huge blow to the height multiplier.
• The fitness function was really kind of screwed up. By the time your AI’s can get a few thousand lines on a tiny 8-column board, the only thing that causes the AI to die is a bad sequence of hard-to-place S and Z blocks — and with any random number generator you’ll eventually get a series of unlucky blocks. So at later generations, simply simulating an 8-column tetris was fairly bad at separating the very good AI’s from the excellent AI’s.
• Selection was also a bit screwed up. After ten generations, the entire population had more or less the same values, with only some minor variations — a bit ironic since this situation was exactly the situation the tournament selection algorithm was supposed to prevent.
• Although a genetic algorithm can converge on a local optimum fairly quickly, producing a decent solution, it is very hard for it to achieve a global optimum — the best possible solution. You might have a situation where mutating any one value seriously harms the candidate, but mutating two or more values simultaneously in a certain way makes it better. This is a drawback for genetic algorithms in general.

So that’s all I have to say. I’m fairly new to genetic algorithms, so I may have botched one or more parts of the algorithm. I’d love to know what I did wrong and how I should’ve done better.

## Is 2011 a special number?

January 1, 2011

Today is the first day of the year 2011. 2011 is a prime number. Not only that, but according to numerous math bloggers and twitterers on the internet, 2011 is the sum of 11 consecutive primes — that is, 2011 = 157 + 163 + 167 + 173 + 179 + 181 + 191 + 193 + 197 + 199 + 211. Furthermore (as I already said), 2011 is prime.
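That identity is easy to verify mechanically (the class name is mine):

```java
// Quick check of the claim: 2011 is the sum of these 11 consecutive primes.
public class Check2011 {
    static final int[] RUN = {157, 163, 167, 173, 179, 181, 191, 193, 197, 199, 211};

    public static int sum() {
        int s = 0;
        for (int p : RUN) s += p;
        return s;
    }
}
```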

Naturally I wonder, how rare are these numbers — numbers that are prime but also a sum of a bunch of consecutive primes? This seems like a problem easily solved with some programming.

Let us write P(n,k) for the number of primes less than or equal to n that can be written as a sum of at least k consecutive primes. How big is P(2011,k) compared to $\pi(2011) = 305$, the number of primes less than or equal to 2011?

My method for computing P(n,k) is pretty simple. First we start with a list of prime numbers, then write the cumulative sums for the prime numbers (these images were taken with my laptop camera off a whiteboard):

Next take every pair from the bottom list, and find their differences:

Because of the way I arranged the numbers, the bottom diagonal contains all the numbers that can be written as a sum of 1 consecutive prime (obviously, the primes themselves), the next row all the numbers that can be written as a sum of 2 consecutive primes, and so on. If we want to compute P(n,k), we simply list enough prime numbers to complete this table, take everything above a diagonal, and filter out the duplicates.

Here’s my hopefully correct implementation:

import java.util.*;

class Main
{
    public static void main(String[] args)
    {
        int LIM = 2011;
        int NCONSEC = 3;

        // Sieve of Eratosthenes: sieve[i] is true iff i is prime
        boolean[] sieve = new boolean[LIM+1];
        Arrays.fill(sieve, true);
        sieve[0] = sieve[1] = false;
        for(int i=2; i<=LIM; i++){
            if(!sieve[i]) continue;
            for(int j=2; j*i<=LIM; j++)
                sieve[i*j] = false;
        }

        List<Integer> primes = new ArrayList<Integer>();
        for(int i=2; i<=LIM; i++)
            if(sieve[i]) primes.add(i);

        // Cumulative sums of the primes; the leading 0 lets sums
        // starting at the first prime appear as differences too
        List<Integer> cuml = new ArrayList<Integer>();
        int cum = 0;
        cuml.add(cum);
        for(int p : primes){
            cum += p;
            cuml.add(cum);
        }

        // cuml[i] - cuml[j] is the sum of the (i-j) consecutive primes
        // p_{j+1}, ..., p_i; keep those with at least NCONSEC terms
        Set<Integer> consums = new TreeSet<Integer>();
        for(int i=1; i<cuml.size(); i++){
            for(int j=0; j<=i-NCONSEC; j++)
                consums.add(cuml.get(i) - cuml.get(j));
        }

        // Count the sums that are themselves prime
        int count = 0;
        for(int i : consums){
            if(i > LIM) break;
            if(!sieve[i]) continue;
            count++;
        }

        System.out.println(count);
    }
}


It turns out that P(2011,3) = 147, and since there are 305 primes less than or equal to 2011, roughly half of all primes under 2011 can be written as a sum of at least 3 consecutive primes. Hardly a special number at all! Even if we insist on a sum of 11 consecutive primes, there are still 56 primes less than or equal to 2011 that can be written as such a sum — about 1 in every 5 primes.

Happy New Year!

## Blatantly abusing the Windows search function: a very lazy note-taking idea

August 31, 2010

Often, as I am minding my daily business, I would suddenly have an interesting thought, or some idea. Being likely to forget it a few minutes later, I would want to write the idea down. But being a lazy and slightly disorganized person, I would grab the nearest pencil and scrap paper, scribble down a few words, and throw the piece of paper aside somewhere.

Usually I would never retrieve that piece of paper again. But occasionally I do want to read the note I wrote. After a few minutes of searching I would usually find the piece of paper I needed. If I didn’t, oh well, it probably wasn’t that important anyway.

But, as you can imagine, this quickly gets out of hand.

Now, you would like to do the note-taking on the computer, instead of on small pieces of paper. After all, it’s harder to lose files on a computer, right?

But how would you do this? In the computer world it’s a bit harder to ‘throw aside’ a file. You have to give the file a name, and put it in some directory, essentially forcing you to be organized. Or if you don’t, files clutter up your computer desktop even faster than sheets of paper clutter up your actual desk.

So being the geek I am, I wrote up my experimental software solution. Writing a note then goes as follows. You open up the command prompt and enter the note command:

Type up whatever you want, and then hit save:

So your note is saved in some obscure directory of your choosing. Now, the Windows 7 search (Windows Vista works too) is good enough that, as long as your directory is indexed, pressing Windows-F and typing pretty much anything in the search bar will quickly bring you to the file (it searches file contents):

Searching for “phone ashley” takes me directly to the note. The program just takes what we’ve written and saves it in a uniquely named file with a datestamp. For example, the first note written on a day has _1 appended to its name, then _2, and so on.

I’m not yet sure how practical this idea would be, but here’s the C++ code for my program:


#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <cstdio>
#include <ctime>
using namespace std;

// Folder that we are going to write notes in
string NPATH = "C:/home/notes/";

// Convert integer to string
string istr(int i)
{
    stringstream ss;
    ss << i;
    return ss.str();
}

// Convert time to displayable form with year, month, day, for example
// 2010-01-01. Always 10 characters long.
string formTime(struct tm * cur)
{
    char s[11];
    sprintf(s, "%04d-%02d-%02d",
        cur->tm_year + 1900, // Number of years since 1900
        cur->tm_mon + 1,     // Month, starting at 0
        cur->tm_mday);       // Day of month
    return s;
}

// Does the file at this path exist, or not?
bool existsFile(string path)
{
    ifstream ifs(path.c_str());
    return ifs.good();
}

void write(string data)
{
    // Get current time
    time_t curr_t = time(NULL);
    tm * curr = localtime(&curr_t);

    // 1 might be already occupied, or 2, so use the first unoccupied one
    int ext = 1;
    string filebase = NPATH + formTime(curr);
    string flocation;

    do{
        flocation = filebase + "_" + istr(ext) + ".txt";
        ext++;
    }while(existsFile(flocation));

    // Write everything to the file
    ofstream fout;
    fout.open(flocation.c_str());
    fout << data;
    fout.close();
}

int main()
{
    stringstream ss;

    // Accept input until line is "save" or "exit"
    string s;
    while(true){
        getline(cin,s);
        if(s == "save") break;
        if(s == "exit") return 0;
        ss << s << '\n';
    }

    // Write everything and exit
    write(ss.str());
    return 0;
}


## Fermat points and parameterizing the 120 degree integer triangle (Project Euler 143)

August 25, 2010

Problem 143 of Project Euler is an interesting geometry problem, and one notorious for having one of the lowest ratios of solvers to time since release of all the problems on the site: under 600 solvers in over 3 years as of the time of writing.

So what is the problem?

The problem is quite simple. We have a triangle $ABC$, with all angles smaller than $120^\circ$. Point $T$ is the point inside the triangle minimizing $p+q+r$, where $p$, $q$, $r$ are the distances from $T$ to the three vertices ($T$ can always be determined uniquely).

We define a Torricelli triangle as one where the side lengths $a$, $b$, $c$ and the distances $p$, $q$, $r$ are all integers.

Generate all Torricelli triangles with $p + q + r \leq 120000$.

### The Fermat point

In triangle $ABC$, with $P$ being the point such that $PA+PB+PC$ is minimized, $P$ is called the Fermat point of triangle $ABC$.

We can use Ptolemy’s theorem to locate the Fermat point in the triangle.

Let $D$ be a point such that $BCD$ is equilateral, then construct the circumcircle of $BCD$:

We now show that $P$ must be at the intersection of $AD$ and the circumcircle of $\triangle BCD$.

Assume that $P$ does not lie on the circumcircle of $\triangle BCD$. Then, according to Ptolemy’s theorem:

$BP \times CD + PC \times BD > PD \times BC$

Equality only occurs when $PCDB$ is cyclic. Since $BCD$ is equilateral, we have $BC = CD = BD$ so we can divide it out to get

$PB + PC > PD$

Adding $PA$ to both sides:

$PA + PB + PC > PA + PD$

Now if $P$ did lie on the circumcircle of $\triangle BCD$, we would have $PA + PB + PC = PA + PD$, so $P$ must lie on the circumcircle of $\triangle BCD$ in order to be optimal.

So we know $P$ is on the circle. But now we assume that $P$ is not on $AD$. As we had before, $PA + PB + PC = PA + PD$.

Next if $P$ is not on $AD$, then $AP + PD > AD$, or by substitution,

$PA + PB + PC > AD$

Of course if $P$ were actually on $AD$, then $PA + PB + PC = AD$, and the sum $PA+PB+PC$ would be optimal.

This proves that the optimal place for $P$ is at the intersection of $AD$ and the circumcircle of $\triangle BCD$.

If we repeat this for the other two sides, we notice that $P$ is the intersection of the three circumcircles, and also the intersection of $AD$, $BE$, and $CF$:

Further, we see that quadrilaterals $BPCD$, $APCE$, and $FAPB$ are all cyclic. As $\angle BDC = 60^\circ$, the opposite angle $\angle BPC = 120^\circ$, and similarly $\angle APC = 120^\circ$ and $\angle APB = 120^\circ$.

### Designing an algorithm

We have enough information now to start working on an algorithm. Let us come back to the previous diagram:

So knowing that the angles at $P$ are all $120^\circ$, we can apply the cosine law ($\cos 120^\circ = -\frac{1}{2}$):

$a^2 = q^2 + r^2 - 2qr \cos 120^\circ$

$a^2 = q^2 + r^2 + qr$

A similar formula applies to sides $b$ and $c$. We call $(x,y)$ a pair if $x^2 + xy + y^2$ is a perfect square.

We have found a Torricelli triangle if, for three integers $p$, $q$, $r$, all of $(p,q)$, $(p,r)$, $(q,r)$ are pairs.

This leaves us with an outline of an algorithm:

1. Generate all pairs $(x,y)$ under the limit (with $x+y < 120000$ and $x^2 + xy + y^2$ being a square)
2. Sort the list of pairs and index them (to be explained in step 3)
3. For each pair $(a,b)$ in the list, search through the list to check if there exists some $c$ where $(a,c)$ is a pair and $(b,c)$ is a pair. We index the list to drastically improve searching time.
4. If an $(a,b,c)$ triple is found and $a+b+c \leq 120000$, then mark $a+b+c$ as found. This is easily implemented as a bit array of size 120000 with all bits initially 0 and flipped to 1 when a sum is encountered.
5. Add up the sums and output

Step 1 takes up most of the time in this algorithm. To do this by ‘brute force’, we need an outer loop from 1 up to 120000, then an inner loop again from 1 to 120000 (actually a bit less) — essentially $120000^2$ operations, which is in the billions. An implementation of this algorithm would take about 3 minutes.
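For illustration, here is a minimal brute-force version of step 1 (written in Java for brevity, while the post's actual implementation later on is in C++; the names are mine):

```java
import java.util.*;

// Brute-force sketch of step 1: a pair (x, y) qualifies when
// x^2 + xy + y^2 is a perfect square and x + y stays under the limit.
public class PairSearch {
    public static boolean isPair(long x, long y) {
        long v = x*x + x*y + y*y;
        long r = (long) Math.sqrt((double) v);
        // Guard against floating-point rounding of the square root.
        while (r*r > v) r--;
        while ((r+1)*(r+1) <= v) r++;
        return r*r == v;
    }

    public static List<long[]> bruteForcePairs(int limit) {
        List<long[]> pairs = new ArrayList<>();
        for (long x = 1; x < limit; x++)
            for (long y = 1; x + y <= limit; y++)
                if (isPair(x, y)) pairs.add(new long[]{x, y});
        return pairs;
    }
}
```

The double loop over x and y is exactly what makes this step take minutes at the full limit of 120000.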

### Parameterizing the a^2 + ab + b^2 = c^2 equation

It is well known that all primitive triples of the form $a^2 + b^2 = c^2$ can be generated by $a = m^2 - n^2$, $b = 2mn$, $c = m^2 + n^2$ where $m$ and $n$ are coprime and are of opposite parity.

Can a similar parameterization be found for the equation $a^2 + ab + b^2 = c^2$?

Letting $x = \frac{a}{c}$, and $y = \frac{b}{c}$ and dividing by $c^2$, we get

$x^2 + xy + y^2 = 1$

which is the equation of an ellipse.

Originally we wanted to find integer solutions to the equation $a^2 + ab + b^2 = c^2$. This is equivalent to finding all rational points on the ellipse $x^2 + xy + y^2 = 1$:

It is easy to see why: if $x$ and $y$ are rational numbers such that $x^2 + xy + y^2 = 1$, then $x = \frac{a}{c}$ and $y = \frac{b}{c}$ for some positive integers $a$, $b$, $c$, and we arrive back at the original form of the equation.

Also, in order for $a$, $b$, $c$ to be positive, we will only consider points on the ellipse in the first quadrant, with both $x$ and $y$ positive. We find these points by choosing a fixed point on the ellipse, drawing a line through it, and combining the equations of the ellipse and the line.

Let us choose the point $(0,-1)$ to be the fixed point of the line. Then the equation of the line is $y = tx-1$, where the slope $t$ must be greater than 1 for the second intersection to lie in the first quadrant.

Substituting into the equation of the ellipse:

$x^2 +tx^2 - x + t^2 x^2 - 2tx + 1 = 1$

Simplifying:

$x (t^2+t+1) = 2t + 1$

$x = \frac{2t+1}{t^2 + t + 1}$

Now evaluating for $y$:

$y = t (\frac{2t+1}{t^2 + t + 1})-1 = \frac{t^2-1}{t^2+t+1}$

Notice now that for $x$ and $y$ to be rational, we just need the slope $t$ to be rational. So we can write $t$ as $\frac{m}{n}$ where $m$ and $n$ are positive integers and $m>n$.

Simplifying the expressions for $x$ and $y$:

$x = \frac{2 \frac{m}{n} + 1}{(\frac{m}{n})^2 + \frac{m}{n} + 1} = \frac{2mn+n^2}{m^2 + mn + n^2}$

$y = \frac{(\frac{m}{n})^2 - 1}{(\frac{m}{n})^2 + \frac{m}{n} + 1} = \frac{m^2 - n^2}{m^2 + mn + n^2}$

Since we defined $x = \frac{a}{c}$ and $y = \frac{b}{c}$, we have a valid triple if $a = 2mn+n^2$, $b = m^2 - n^2$, and $c = m^2 + mn + n^2$.
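A quick mechanical check of this parameterization (a sketch in Java for brevity, although the solution below is in C++; the names are mine):

```java
// Check the parameterization: for m > n > 0, the triple
// a = 2mn + n^2, b = m^2 - n^2, c = m^2 + mn + n^2
// should satisfy a^2 + ab + b^2 = c^2.
public class Param {
    public static long[] triple(long m, long n) {
        long a = 2*m*n + n*n;
        long b = m*m - n*n;
        long c = m*m + m*n + n*n;
        return new long[]{a, b, c};
    }

    public static boolean isValid(long a, long b, long c) {
        return a*a + a*b + b*b == c*c;
    }
}
```

For example, $(m, n) = (2, 1)$ gives the triple $(5, 3, 7)$, and indeed $25 + 15 + 9 = 49 = 7^2$.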

If $m$ and $n$ are not coprime, then neither will $a$, $b$, and $c$ be coprime as they will share a common divisor. So in order for $(a,b,c)$ to be primitive we will need $m$ and $n$ to be coprime.

Still this isn’t quite sufficient. Consider what happens if $m \equiv n \pmod{3}$. We have $a = 2mn+n^2$, which can be written as $n((n-m)+3m)$, divisible by 3. Next, $b = (m+n)(m-n)$, divisible by 3. Finally, $c = m^2 + mn + n^2$, which can be written as $(m-n)^2 + 3mn$, again divisible by 3! So if $m \equiv n \pmod{3}$, the resulting triple is not primitive.

But it turns out that we can ignore all cases where $m \equiv n \pmod{3}$. Given a triple $(2mn+n^2, m^2-n^2, m^2+mn+n^2)$ with $m \equiv n \pmod{3}$, it is possible to find an identical parameterization giving $(\frac{2mn+n^2}{3}, \frac{m^2-n^2}{3}, \frac{m^2 + mn + n^2}{3})$.

As $m \equiv n \pmod{3}$, we have $m+2n \equiv 0 \pmod{3}$ and also $m-n \equiv 0 \pmod{3}$. Thus we can define positive integers $u$ and $v$:

$u = \frac{m+2n}{3}$

$v = \frac{m-n}{3}$

Multiplying by 3, $3u = m+2n$, and $3v = m-n$. Combining the two and substituting we get

$m = u+2v$

$n = u-v$

If we substitute $u$ and $v$ for $m$ and $n$ in the triple $(2mn+n^2, m^2-n^2, m^2+mn+n^2)$, we get:

$2mn + n^2 = 3 (u^2-v^2)$

$m^2 - n^2 = 3(2uv + v^2)$

$m^2 + mn + n^2 = 3 (u^2 + uv + v^2)$

So this proves that when $m \equiv n \pmod{3}$, the triple is not primitive and has already been covered by a smaller (possibly primitive) case. We are done.

### Implementation in C++

Now, using the new parameterization scheme, we can generate all pairs in about $n$ operations instead of $n^2$. So we can loop up to $\sqrt{120000} \approx 347$, which cuts the time down to about 2 seconds:


#include <cstdio>
#include <vector>
#include <algorithm>
#include <utility>
using namespace std;

// Limit and square root of limit
#define L 120000
#define LSQ 347
typedef long long int int64;

int64 gcd(int64 a, int64 b)
{
    int64 c = a % b;
    while(c != 0)
    {
        a = b;
        b = c;
        c = a % b;
    }
    return b;
}

int main()
{
    // Store pairs in here
    vector< pair<int64,int64> > pairs;

    // Use the parameterization
    for(int64 u=1; u<LSQ; u++){
        for(int64 v=1; v<u; v++){
            if(gcd(u,v) != 1) continue;
            if((u-v) % 3 == 0) continue;
            int64 a = 2*u*v + v*v;
            int64 b = u*u - v*v;
            if(a+b > L) break;
            // From coprime pairs make composite pairs
            for(int k=1; k*(a+b)<L; k++){
                pairs.push_back(make_pair(k*a,k*b));
                pairs.push_back(make_pair(k*b,k*a));
            }
        }
    }

    // Sort pairs list
    sort(pairs.begin(),pairs.end());

    // Create index of the first occurrence of each value
    // (static: these arrays are too big to sit comfortably on the stack)
    static int index[L];
    for(int i=0; i<L; i++) index[i] = -1;
    for(int i=0; i<(int)pairs.size(); i++){
        if(index[pairs[i].first] == -1)
            index[pairs[i].first] = i;
    }

    // Which sums have been reached?
    static bool sums[L];
    for(int i=0; i<L; i++) sums[i] = false;

    // Iterate through all pairs
    for(int i=0; i<(int)pairs.size(); i++){
        int64 a = pairs[i].first;
        int64 b = pairs[i].second;

        // Partners of a and partners of b
        vector<int64> va;
        vector<int64> vb;

        // Fetch indices
        int ia = index[a];
        int ib = index[b];

        while(ia<(int)pairs.size()){
            pair<int64,int64> next = pairs[ia];
            if(next.first != a) break;
            va.push_back(next.second);
            ia++;
        }

        while(ib<(int)pairs.size()){
            pair<int64,int64> next = pairs[ib];
            if(next.first != b) break;
            vb.push_back(next.second);
            ib++;
        }

        // Find common objects between va and vb
        for(int ja=0; ja<(int)va.size(); ja++){
            for(int jb=0; jb<(int)vb.size(); jb++){
                if(va[ja]==vb[jb]){
                    // Potential c found
                    int64 c = va[ja];
                    if(a+b+c<L) sums[a+b+c] = true;
                }
            }
        }
    }

    // Tally up sums
    int64 s = 0;
    for(int i=0; i<L; i++)
        if(sums[i]) s+=i;

    printf("%lld\n",s);
    return 0;
}