AI Project: Harmonizing Pop Melodies using Hidden Markov Models

It is often said that all pop music “sound the same”. If this is the case, then a computer program should be able to compose pop music, right?

This is the problem we studied for the AI final project (CS486) — kind of. We restrict to the simpler problem of harmonizing an existing melody: in other words, given a fixed melody, find a sequence of chord progressions that match it.

We formulated the task as a probabilistic optimization problem, and devised an algorithm to solve it using Hidden Markov Models (HMMs). Then, we used the Music21 library to generate MIDI files from the output so that we can listen to it.

In the experiment, we chose the melody from If Only by JJ Lin. Here’s a demo:

Music Theory Model

In our model, pop music consists of two parts: a melody line consisting of a sequence of individual notes, and a harmony line of a sequence of chords. Generally, the melody line is the notes sung by the lead vocalist, and the harmony line is played by accompanying instruments (piano, guitar, digital synthesizers). Conventionally, the chord progression is written as a sequence of chord names above the melody; the exact notes in a chord progression is up to the performer’s discretion.

Above: Example of two-part song with melody and chord progression

It is hard to quantify exactly what make a chord progression “good”, since music is inherently subjective and depends on an individual’s musical tastes. However, we capture the notion of “pleasant” using music theory, by assigning a penalty to musical dissonance between a note in the melody and a note in the chord.

According to music theory, the minor second, major second, and tritone (augmented fourth) intervals are dissonant (including the minor and major seventh inversions). Therefore, our program tries to avoid forming these intervals by assigning a penalty.

Above: List of dissonant intervals. All other intervals are consonant.

We assume that all notes in the melody lie on either a major or minor scale in some fixed key. The set of permissible chords for a scale are the major and minor chords where all of its constituent notes are in the scale. For example, in the scale of G major, the permissible chords are {G, Am, Bm, C, D, Em}.

This is a vastly simplified model that does not capture more nuanced chords that appear in pop music. However, even the four-chord progression G – D – Em – C is sufficient for many top-40 pop songs.

First attempt: Harmonize each bar independently

The first thing we tried was to looking at each bar by itself, and searching through all of the permissible chords to find the chord that harmonizes best with the bar.

In other words, we define the penalty of the chord for a melody bar is the number of times a melody note forms a dissonant interval with a chord note, weighted by the duration of the note. Then, we assign to each bar the chord with the lowest penalty.

Here’s what we get:

Above: Naive harmonization. Notice that the Am chord is repeated 3 times.

This assigns a reasonable-sounding chord for each bar on its own, but doesn’t account for the transition probabilities between chords. We end up with the A minor chord repeated 3 times consecutively, which we’d like to avoid.

How to account for chord transitions, while still harmonizing with the melody in each bar? This is where we bring in Hidden Markov Models.

Second attempt: HMMs and the Viterbi Algorithm

Hidden Markov Models are a type of probabilistic graphical model that assumes that there is an unobservable hidden state that influences the output variable. In each timestep, the hidden state transitions to a different state, with probabilities given by a transition matrix.

A standard problem with HMMs is: given a sequence of observations, find the most likely sequence of hidden states that produced it. In our problem, the hidden states are the unknown sequence of chords, and the observed sequence is the bars of our melody. Now, solving the most likely sequence problem yields the chord progression that best harmonizes with the melody.

To encode the rule that we don’t like having the same chord consecutively for multiple bars, we assign the transitions so that the probability of transition from any chord to itself is low. The transition probability to every other chord is equal.

Above: Transition probabilities for chords (only 3 shown here, but this generalizes easily).  A chord is equally likely to transition to any other chord, but unlikely to transition to itself.

The only thing left to do is solve the most likely sequence problem. We used a modified version of the Viterbi algorithm to do this — it solves the problem quickly using dynamic programming.

Running the Viterbi algorithm on our test melody:

Above: Using HMMs to eliminate repeated chords

And there we go — no more repeated chords! It doesn’t produce the same chord sequence as my manual harmonization, but sounds quite natural (and doesn’t violate any musical constraints).

Results and Future Directions

Using HMMs, we were able to harmonize a short melody segment. By looking at the entire melody as a whole, the HMM algorithm is able to produce a better chord progression than by optimizing every bar locally. Our implementation uses a greatly simplified musical model: it assumes the melody is in G major, and only considers 6 basic major and minor chords, but it successfully demonstrates a proof of concept.

When implementing this project, we thought this was new research (existing literature on harmonization focused mostly on classical music, not pop music). Alas, after some more literature review, it turns out that a Microsoft research team developed the same idea of using HMMs for melody harmonization, and published a series of papers in 2008. Oh well. Not quite publishable research but still a cool undergraduate class project.

The source code for this project is available on Github. Thanks to Michael Tu for collaborating with me!

If you enjoyed this post, check out my previous work on algorithmic counterpoint!

Probabilities of a slightly altered dice

In art class a few months ago, I made a dice out of clay. This dice isn’t quite your typical dice. Its structure is tampered with ever so slightly, so that just $\frac{1}{64}$ of the material is removed.

Let us model this crude, imperfect dice as a hypothetical, perfect dice. Suppose this dice is represented as a cube, composed of 64 smaller ‘cubelets’ (imagine the cubelets as 1x1x1 and the dice as 4x4x4). Furthermore, we imagine four ‘layers’ of cubelets, such that each layer is a 4×4 square.

In this dice, we remove one of the 16 cubelets from the second layer, like this:

(keep in mind that this is still a solid cube)

Assuming that the cube is made out of uniform density material (and that the missing cubelet has no mass), what is the probability of landing on each of the six sides when the cube is thrown?

A mathematical interpretation

I will solve a problem which may or may not be equivalent to throwing a dice. Instead of throwing the dice, we place the dice on a table with a random orientation, and let go. We then observe which side the dice lands on.

This appears to be a legit and identical interpretation, but there is a flaw. With an uneven dice, it is not guaranteed that each orientation would appear with uniform likelihood. Nevertheless, for our purposes we ignore this and we assume that this is an identical problem.

Which side will it land on? The center of gravity of the dice must be directly on one of the six sides; this side is the side that the dice will fall onto.

Consider a sphere of some radius, centered at the cube’s center of gravity and entirely contained in the cube:

Lines are drawn from the center of gravity to the eight vertices of the cube, essentially dividing the sphere into six pyramid-like parts.

The chance of landing on a side of the cube is proportional to the side’s surface area when ‘projected’ onto the surface of the sphere. This is because the side can be divided into infinitely many small ‘pyramids’, each one of which corresponds to the dice landing on that side when left to drop at that specific angle.

Such a portion of the surface area is called a solid angle, which is measured in steradians (sr). Whereas the entire circle is denoted as $2 \pi$ radians, the entire sphere is denoted by $4 \pi$ sr. Likewise, one hemisphere would be $2 \pi$ sr, and so on.

There is an efficient formula for calculating the solid angle of a tetrahedral projection onto a sphere, given by Oosterom and Strackee. If $\vec{a}$, $\vec{b}$, $\vec{c}$ are vectors of three vertices of a tetrahedron (in relationship to the fourth), then the subtended solid angle is given by:

$\tan(\frac{1}{2} \Omega) = \frac{|\vec{a} \vec{b} \vec{c}|}{abc + c(\vec{a} \cdot \vec{b}) + b(\vec{a} \cdot \vec{c}) + a(\vec{b} \cdot \vec{c})}$

Here $a$, $b$, $c$ are magnitudes of the vectors, the dot represents a dot product, and $|\vec{a} \vec{b} \vec{c}|$ is the determinant of the scalar product of the vectors.

The magnitude of the 3-D euclidean distance:

$a = \sqrt{a_x^2 + a_y^2 + a_z^2}$

The dot product of two vectors is:

$\vec{a} \cdot \vec{b} = a_x b_x + a_y b_y + a_z b_x$

Finally,

$|\vec{a} \vec{b} \vec{c}| = |a_x b_y c_z + b_x c_y a_z + c_x a_y b_z - a_x b_z c_y - b_x c_z a_y - c_x a_z b_y|$

Calculating the probability

Let us position the cube on a 3D coordinate, such that the corners are the origin and $(4,4,4)$. The empty cube is the cube with center $(2.5,2.5,2.5)$.

To find the center of gravity, we just take the averages of the centroids of all the smaller cubes (the missing cube contributing 0 to the sum). This turns out to be $(k,k,k)$ with $k= \frac{251}{126}$ or about 1.9921.

We have $k = \frac{251}{126}$; now we let $k' = 4-k$. To facilitate vector manipulations, we will now transpose the cube such that the center of gravity is at $(0,0,0)$:

Because of symmetry, it is obvious that three of the six sides should have one equal probability, and the other three sides to have a different, but again equal, probability. Thus it is not strictly necessary to go through the calculations for all six sides; we only need to do them for two of the sides.

A face forms a solid angle for a square pyramid, not a tetrahedron (for which we have a formula for). To solve this, we note that a square pyramid can be divided into two tetrahedrons, and the solid angle for it is simply the sum of solid angles for the two tetrahedrons.

Now by doing these calculations, we get a probability of 16.740% of landing on each of the three heavier sides, and 16.594% on each of the three lighter sides (sides closer to the missing cubelet).

By comparison, the probability of landing on any side on a fair dice is 16.667%. As only $\frac{1}{64}$ of the dice’s mass is removed, the resulting difference is slightly less than 1%.

Socks in a Drawer

I’ve began studying some probability and statistics for fun, and I’m using a book called Fifty Challenging Problems in Probability by Mosteller:

Printing math textbooks is a great way to use up the school’s printing credits at the end of the year.

The first problem is about socks in a drawer. At first glance it looks really, deceptively simple. But when I start doing the problem I find that it’s a bit trickier than I expected.

Solutions to the problems are provided at the back of the book. For this problem, I find that the book’s solution is really ugly and unintuitive. Thus I’m going to give an alternate solution here.

Problem (my version)

A drawer contains red and black socks. I randomly take a sock out of the drawer. I then randomly take another sock out of the drawer.

The probability that both are red is exactly $\frac{1}{2}$. Determine how many socks of each color there are in the drawer.

Solution

Let r be the number of red socks in the drawer, and b be the number of black socks. Then the total number of socks in the drawer is r+b.

The probability that a sock you randomly take is red would be

$\frac{r}{r+b}$

Now the probability that the next sock you take would also be red is

$\frac{r-1}{r+b-1}$

This is because there is one less red sock in the drawer, and at the same time there is one less sock (of any color) in the drawer.

The probability that both socks will be red is the two probabilities multiplied:

$P = \frac{1}{2} = \frac{r}{r+b} \cdot \frac{r-1}{r+b-1}$

Now we’re solving the equation to find possible values of r and b. There are two variables, and one equation. However, both variables have to be positive integers. This type of equation is called a diophantine equation.

We begin by simplifying the equation into a quadratic.

$\begin{array}{rcl} \frac{r(r-1)}{(r+b)(r+b-1)} &=& \frac{1}{2} \\(r+b)(r+b-1)&=& 2r(r-1) \\ r^2+2rb+b^2-r-b&=&2r^2-2r \\ r^2+(-2b-1)r+(b-b^2)&=&0 \end{array}$

Now using the quadratic formula to solve for r:

$r = \frac{(2b+1) + \sqrt{(-2b-1)^2-4(b-b^2)}}{2}$

It’s not necessary to consider the negative root of the quadratic. Simplifying the quadratic further, we get

$r = \frac{(2b+1)+ \sqrt{8b^2+1}}{2}$

Here we can make a substitution. Let $k = \sqrt{8b^2+1}$. Then,

$r = \frac{(2b+1)+ k}{2}$

In this equation r will be an integer as long as b is an integer and k is an odd integer.

We are now finding integer solutions to the equation

$k = \sqrt{8b^2+1}$

..where k is also odd.

Simplifying, we get:

$\begin{array}{rcl} k^2 &=& 8b^2+1 \\ k^2-8b^2&=&1 \end{array}$

At this point we can drop the condition that k is odd, because if k is even then left hand side is even, but the right hand side is 1. As a result, k cannot be even if it’s a solution to this equation.

We’ve basically converted this problem into a form of Pell’s equation, which generally has this form:

$x^2 - Dy^2 = 1$

..where D is a constant and x and y are unknown integers. Our instance of Pell’s equation is the case where D=8.

Fortunately for us, the Pell’s equation is well studied. We can use existing results in number theory to help us with this problem.

By trial and error, it can be found that the smallest nontrivial solution, or the fundamental solution to the equation is $(k,b) = (3,1)$. Once the fundamental solution to a Pell type equation has been found, all other solution pairs can be generated. And there are infinitely many such pairs.

Generating solution pairs can be done with this recurrence relation:

$\begin{array}{rcl} k_{i+1} &=& k_1 k_i + D b_1 b_i \\ b_{i+1} &=& k_1 b_i + b_1 k_i \end{array}$

It’s simple to write a computer program to generate these pairs. For instance, the first ten pairs of $(k,b)$ are:

 (3,1) (17,6) (99,35) (577,204) (3363,1189) (19601,6930) (114243,40391) (665857,235416) (3880899,1372105) (22619537,7997214)

All that is left is to substitute values of b to get integer values of r. The first ten pairs of $(r,b)$ are:

 (3,1) (15,6) (85,35) (493,204) (2871,1189) (16731,6930) (97513,40391) (568345,235416) (3312555,1372105) (19306983,7997214)

We are done; we can use the same recurrence relation to generate arbitrarily large sock combinations.

Project Euler 285

Problem 285 of Project Euler was released yesterday; it requires some geometry and probability sense.

The problem goes like this:

Albert chooses two real numbers a and b randomly between 0 and 1, and a positive integer k.

He plays a game with the following formula:

$k' = \sqrt{(ka+1)^2 + (kb+1)^2}$

The result k’ is rounded to the nearest integer. If k’ rounded is equal to k, then Albert gets k points. Otherwise, he gets nothing.

Find the expected value of the total score if Albert plays this game with $k = 1, 2, \cdots , 10^5$.

For example (from the problem statement), let k=6, a=0.2, and b=0.85.

Here the expression $\sqrt{(ka+1)^2 + (kb+1)^2}$ gives the number 6.484. When rounded to the nearest integer, it becomes 6, which is equal to k. Therefore, Albert gets 6 points.

Yup – another expected value problem. But the concept of this problem is simple:

For any value of k, there are certain values of a and b where Albert will get points. Otherwise, he does not get points. Here the gray blob is the area where he does get points, and the white area is where he does not.

The red square is the sample space of the random variables a and b, since $0 and $0.

The probability of Albert getting the points is the fraction of the square that’s gray.

The area of the red square is always 1. The area of the gray area varies with k. Thus the idea is to calculate the area of the gray area for each value of k from 1 to $10^5$.

Now we can get to the details of the problem.

Notice that if $round(k') = k$, then $k-0.5 < k' < k+0.5$ (the difference between less-equal and less is irrelevant in this problem):

In a similar way, the equation of the larger circle is $(a+\frac{1}{k})^2 + (b+\frac{1}{k})^2 < (\frac{k+0.5}{k})^2$.

These are equations of circles. A circle has the equation $(x-a)^2 + (y-b)^2 = r^2$ where $(a,b)$ is the center and r is the radius.

Thus both the smaller and the larger circles have a center at $(-\frac{1}{k},-\frac{1}{k})$. The radii of the circles are different: the radius of the smaller circle is $\frac{k-0.5}{k}$ and the radius of the larger circle $\frac{k+0.5}{k}$.

Plotting the inequalities:

We can see now that this is a calculus problem! Indeed, the shaded area is the area of the larger circle inside the boxarea of the smaller circle inside the box.

Initially I tried integrating the circles directly, but I ran into problems while integrating and I was unable to correctly integrate the equation. Instead, I integrated an equivalent problem:

It’s obvious that the areas do not change with this transformation.

So now for the smaller circle the equation is $a^2 + (b+\frac{1}{k})^2 > (\frac{k-0.5}{k})^2$. Solving for b, we get:

$b > \sqrt{(\frac{k-0.5}{k})^2-a^2} - \frac{1}{k}$

.. And this equation for the larger circle:

$b < \sqrt{(\frac{k+0.5}{k})^2-a^2} - \frac{1}{k}$

The variable of integration here is a, and k is only a constant. Therefore we can replace the radius with r and get this equation to integrate:

$b = \sqrt{r^2-a^2} - \frac{1}{k}$

With the help of an integration table, this can be integrated fairly easily.

There’s one additional problem – computing the integrating ‘limits’ of the two circles. Let’s call them $L_S$ and $L_L$ for small and large respectively.

Both of them can be computed using the pythagorean theorem.

The formula for $L_S$ (it should be obvious by now what $L_L$ should be):

$L_S = \sqrt{(\frac{k-0.5}{k})^2-\frac{1}{k^2}}$

Now we can get this really big formula for computing the area for any integer k:

$\begin{array}{l} \int_{\frac{1}{k}}^{\sqrt{(\frac{k+0.5}{k})^2-\frac{1}{k^2}}} (\sqrt{(\frac{k+0.5}{k})^2-a^2}-\frac{1}{k}) da \\ - \int_{\frac{1}{k}}^{\sqrt{(\frac{k-0.5}{k})^2-\frac{1}{k^2}}} (\sqrt{(\frac{k-0.5}{k})^2-a^2}-\frac{1}{k}) da \end{array}$

Also, fortunately for me, it is impossible for the upper bound to cross the circle at the top or right, I mean, we never have something like this –

Anyways, here’s a Haskell implementation of the algorithm:

integral a r k = 0.5 * (a * sqrt (r^2 - a^2) + r^2 * atan (a/sqrt (r^2-a^2))) - a/k

area k = let area_1 1 = 0
area_1 k = integral (sqrt (((k-0.5)/k)^2 - (1/k)^2)) ((k-0.5)/k) k - integral (1/k) ((k-0.5)/k) k
area_2 k = integral (sqrt (((k+0.5)/k)^2 - (1/k)^2)) ((k+0.5)/k) k - integral (1/k) ((k+0.5)/k) k
in  area_2 k - area_1 k

main = print . sum . zipWith (*) [1..] \$ map area [1..10^5]


The code runs in about 3 seconds.

There seems to be an edge case where k=1 (probably an asymptote) so I have to specially make it be zero.

This problem was released 10 pm last night and I was tired, lol, but I still managed to solve it and get into the top 20.

Project Euler 280 (Revisited)

A while ago I wrote this blog post on this problem. The previous code was able to produce the answer in about 20 minutes, using several gigabytes of RAM.

You may want to read the previous blog post first.

There is a simpler and more efficient solution (used by inamori, zwuupeape, and many others).

The new approach

We do not need to know the exact value of the solution, only an approximate. This approximation has to be accurate only to six digits.

Any any step, there is a certain probability that the ant will reach the final state in that step.

Let’s call this function $P(s)$: the probability that the ant will reach the final state on step s:

At very small values of s, the probability is zero because it takes a certain number of steps to get to the final position. 44 is the smallest number of steps to get from the initial to the final position, and the probability increases from there.

But it only increases until a certain point. After that point, the probability decreases. After all, if we run this for thousands of steps, how likely is the ant to still not have reached the final position?

However, the graph continues forever to the right. Even after thousands of steps, there is still a chance, however small, that the final position hasn’t been reached.

Let’s define $Q(s)$ as the weighted average of the probabilities of each value of $P(s)$ from 0 to s:

The expected value and the answer for this problem would simply be the weighted average for the entire graph.

Since s must be an integer greater than zero, we don’t have to do any integration with curves. A bit misleading because I’ve drawn $P(s)$ and $Q(s)$ as curves, when they are not.

But because $Q(s)$ cannot be expressed algebraically, we cannot evaluate it as it approaches infinity. Instead, we pick a point that gives us enough precision:

Either that, or we keep calculating the weighted average while increasing s until we have enough precision.

Calculating the probabilities

In order to calculate a weighted average $Q(s)$, we need to be able to calculate $P(s)$; while $P(s)$ is the probability function for the final state, we’ll need to know the probabilities for the in-between states as well.

The probabilities of the states on step s depend most upon the probabilities of the previous step, s-1.

So if on step s-1 there’s a 0.6 probability that we’re on state X, then on step s we should have a 0.3 probability of being on either Y or Z as there’s an equal chance of moving to either state.

And if multiple states can go to our state, we can get the probability by adding up the probabilities:

Coding a solution

At first we start on the initial state, with a probability of 1 (certain chance) of being on it.

On each step we generate a new set of probabilities using rules I described above, and so on.

A lot of code is stolen from my previous work on this problem. The State class is essentially unchanged in this version.

Here’s the code:

import java.util.*;

public class Main{

public static void main(String[] args){

final int ITERATIONS = 2501;

// Initialize probability map.
// The initial step is 0, containing the initial position with a probability of 1.
prevProbs.put(State.INIT_STATE, 1.0);

// Distributions for the final state.
double[] finalValues = new double[ITERATIONS+1];
Arrays.fill(finalValues, 0);

// Keep the time.
long startTime = System.currentTimeMillis();

// Loop counter is actually the step number of the next step.
for(int step=1; step<ITERATIONS; step++){

// Make new probabilities.

// Loop through old probabilities and add
Set<State> stateSet = prevProbs.keySet();

for(State state : stateSet){

// Probability of being on this state right now
double stateProb = prevProbs.get(state);

// List of next states
List<State> nextStates = nextStates(state);

// Probability factor for each of the next states
double multiplier = 1.0 / nextStates.size();

// Added value to each of the next states
double added = stateProb * multiplier;

// Now go and update all the state probabilities.
for(State nextState : nextStates){

// If we've got the final state, handle it separately.
if(nextState.equals(State.FINAL_STATE)){
continue;
}

// Could be the first time.
if(!nextProbs.containsKey(nextState))

// Might not be the first time.
else {
// Old value, we need to add to it instead of replacing it.
double previous = nextProbs.get(nextState);
}
}
}

// Our new probabilities are the old probabilities for the next loop.
prevProbs = nextProbs;

System.out.println(step + " " + weightedAverage(finalValues));
}

System.out.println((System.currentTimeMillis() - startTime) + "ms");
}

// Calculates a weighted average of a double array with the indices as
// values and the values in the array as weights.
static double weightedAverage(double[] array){

double sumValues = 0;
for(int i=0; i<array.length; i++)
sumValues += i * array[i];

return sumValues;
}

// Returns a list of possible continuation states from the current one.
static List<State> nextStates(State state){

// Current and changing position of the ant
int antX = state.ant % 5;
int antY = state.ant / 5;

// Whether it can go into each of the four directions (N,S,E,W respectively).
boolean[] possibleDirs = new boolean[4];
Arrays.fill(possibleDirs, true);

// Take out some directions if it's in the corner.
if(antY == 0) possibleDirs[0] = false; // Can't go north
if(antY == 4) possibleDirs[1] = false; // Can't go south
if(antX == 4) possibleDirs[2] = false; // Can't go east
if(antX == 0) possibleDirs[3] = false; // Can't go west

// Construct a list of continuations.
List<State> nextStates = new ArrayList<State>();

// Loop through the four directions.
for(int i=0; i<4; i++){

// Cannot go this direction.
if( !(possibleDirs[i])) continue;

int newAntX = antX;
int newAntY = antY;

// Modify direction.
switch(i){
case 0: newAntY--; break;
case 1: newAntY++; break;
case 2: newAntX++; break;
case 3: newAntX--; break;
}

// Start constructing new state object.
int oldAnt = state.ant; // old ant position
int newAnt = 5*newAntY + newAntX;
int[] board = state.board.clone();
boolean carrying = state.carrying;

// Carrying a seed. Notice that a square can contain
// two seeds at once (but not more); seeds are indistinguishable
// so we just need to keep track of the number of seeds
// on each square.
if(carrying){
board[oldAnt] --;
board[newAnt] ++;
}

// Drop off the seed.
if(newAntY == 0 && board[newAnt] == 1 && carrying)
carrying = false;

// Pick up a new seed.
if(newAntY == 4 && board[newAnt] == 1 && !carrying)
carrying = true;

// Treat the five done positions the same.
if(donePosition(board))

}

return nextStates;
}

// Is the board in the done position?
static boolean donePosition(int[] b){
return b[0]==1 && b[1]==1 && b[2]==1 && b[3]==1 && b[4]==1;
}
}

class State{

static final State INIT_STATE = new State(
new int[]{
0,0,0,0,0,
0,0,0,0,0,
0,0,0,0,0,
0,0,0,0,0,
1,1,1,1,1
}, 12, false);

// Consider all final states the same; there is
// no ant position.
static final State FINAL_STATE = new State(
new int[]{
1,1,1,1,1,
0,0,0,0,0,
0,0,0,0,0,
0,0,0,0,0,
0,0,0,0,0
}, -1, true);

// 25 board.
int[] board;

int ant;
boolean carrying;

State(int[] board, int ant, boolean carrying){
this.board = board;
this.ant = ant;
this.carrying = carrying;
}

State(State s){
this(s.board, s.ant, s.carrying);
}

public boolean equals(Object o){
State s = (State) o;
return Arrays.equals(s.board, board) && s.ant == ant && s.carrying == carrying;
}

public int hashCode(){
return Arrays.hashCode(board) + ant;
}

// For debugging mostly.
public String toString(){
StringBuilder ret = new StringBuilder("\n");
for(int i=0; i<5; i++){
for(int j=0; j<5; j++){
int pos = 5*i + j;
if(ant == pos) ret.append("#");
else ret.append(board[pos] >= 1? "+" : "-");
}
ret.append("\n");
}
return ret.toString();
}
}


As we run this program, the answer becomes more and more accurate. We only need 2300 or 2500 steps to get the six digits of accuracy we need.

Running the program:

This takes just over a minute to run — not bad at all.