Hall’s Marriage Theorem explained intuitively

Imagine that you have 4 students looking for a job, and 4 positions available to fill. Not all students are equal — some are smarter than others. So the companies want to hire only the smartest students.

(Students are happy with any job they can get)

In this diagram, a bipartite graph, the students are at the top and the companies are at the bottom. A student and a company is connected if the company wants to hire the student. For example, Costco will hire any student, so Costco is connected to Andrei, Bill, Corki, and Danny.

Hall’s Theorem, formally

Hall’s Theorem tells us when we can have the perfect matching:

Suppose G is a bipartite graph with bipartition (A,B). There is a matching that covers A if and only if for every subset X \subseteq A, N(X) \geq |X| where N(X) is the number of neighbors of X.

Huh what?

Hall’s Theorem, intuitively

If you look closely at the diagram, you’ll notice that it doesn’t quite work:

Both Blizzard and Google want to hire Corki and only Corki. But Corki can only work for one company! So the whole thing collapses; the matching fails.

Let’s rewrite Hall’s condition in the context of students and jobs:

For a set of n companies, denote m to mean the number of students that at least one of these companies want. If m \geq n for every set of companies, then a matching is possible. Otherwise, the matching fails.

Here, a set of {Blizzard, Google} consists of 2 companies, but only one student, Corki, is wanted by either company. Since 1 < 2, the matching fails.

Suppose we tell this to Blizzard’s hiring manager, who decides he’ll hire Andrei instead:

Then the matching is successful and every student gets a job. Yay!

Notice that in this example, there are 4 students and 4 jobs. In general, these numbers don’t need to be equal. If we have 10 students and 4 jobs, and we want to fill every job, we can still use Hall’s Theorem. (of course, not every student will get a job)

I like this theorem because it seems so simple. The matching can fail in an obvious way. But if it doesn’t fail in this obvious way, then there’s no way it can fail in a less obvious way — it can’t fail at all.

Application: Putnam 2012 Problem B3

Let’s apply our knowledge to a harder problem. Actually, this problem becomes quite easy if we know to use Hall’s Theorem:

Suppose 2m teams play in a round-robin tournament. Over a period of 2m-1 days, every team plays every other team exactly once. There are no ties.

Show that for each day we can select a winning team, without selecting the same team twice.

Hint: we can view the teams as one half of the bipartite graph, and the days as the other half. A team is connected to a day if it won its match that day.

Solution

That’s the hint. Here’s a more detailed solution.

We want to find a matching that covers all the days. Suppose, for contradiction, that this is impossible.

From Hall’s Theorem, there has to be a set of n days, in which there are fewer than n winners in these n days.

Let’s call a team a “loser” if it lost every single game in these n days:

So this poor loser team has lost to n different teams in these n days.

But wait! If it has lost to n teams, then these n teams are winners! Yet we just stated that there are less than n winners. Contradiction — QED.

A group of seventeen people

In a group of 17 people, every person has exactly 4 friends (friendship is mutual). Prove that there are two people who don’t know each other and having no mutual friends.

Solution

We use a proof by contradiction. We attempt to construct a graph so that the required proof is false, that is, every pair is either friends or has at least one mutual friend. Then we show that this cannot work, and that something goes wrong when we try to construct such a graph.

Let us represent the entire network of 17 people as a graph with 17 nodes, with a friendship being an edge of the graph. It is given that everyone has 4 friends, and friendship being mutual it follows that the total number of edges, or friendships is \frac{4 \cdot 17}{2} or 34.

Next we count the number of paths of length 2. If indeed every pair of people are friends or friend-of-friends, then every pair should be separated by a path of at most length 2.

In the entire graph there are 17 people, thus 17 \choose 2 or 136 pairs of people. We already know that 34 of these pairs are friendships, so counting that out, we have 102 pairs of people who are not friends, and thus in our case are separated by a path length of exactly 2.

This establishes a lower bound for the number of friend-of-friends, as a friend-of-friend might be linked intermediately in more than one way. Furthermore, a friend-of-friend might already be one of the friends.

Next we count friend-of-friends a different way. How would the graph be so that the number of friend-of-friends is a maximal? Intuitively, there cannot be any immediate diamond-shapes (4-cycles) or triangles (3-cycles). We get a maximal of 12 friend-of-friends like this:

Each of your friends have a friendship with yourself (the central node), plus three other friends. None of the friend-of-friends are coincident.

Suppose that everyone is arranged in this optimal way, such that everyone has 12 friend-of-friends. Then how many paths of length 2 are there in all? It turns out that there are \frac{17 \cdot 12}{2}, or 102 in all.

This establishes an upper bound for the number of total friend-of-friends: a high number cannot be achieved since each node has its optimal configuration to maximize total friend-of-friends.

But we have the lower bound, \binom{17}{2} - 34 = 102, because each pair must be connected; we also have the upper bound, \frac{12 \cdot 17}{2} = 102, because more connections per person is not logically possible. So the lower bound is the upper bound, and the total number of friend-of-friends must be exactly 102.

The number 102 is not as important as what the condition implies. In order for this to succeed, each node must be in the optimal configuration. This means that in the entire graph, there must be no 3- or 4-cycles!

Having restricted the graph considerably, let use start by drawing the graph:

We introduce some new notation here. Let the four branches of the graph so far be considered as clusters A, B, C, D, each with 3 nodes in it. Notice there are exactly 17 nodes in the graph.

Consider the node A1. It ‘needs’ to have 4 connections, so where to? The 5 unlabelled nodes are already full. If we connect it to another node in the A cluster (A2,A3), it forms a 3-cycle (to the unlabelled node), which is not allowed. The only option is to connect it to something from a different cluster, say, D3.

Then what should it connect to? If we choose D2 or D1, we get a 4-cycle, again not allowed. So we must choose something from the B or C cluster.

It then follows that A1 is connected to one node in each of the three clusters. But that is not all.

We temporarily put aside diagonal connections (from A to C, or B to D). Now suppose that A1 is connected to B1, which is connected to C1, then to D1, and finally back to A1. This is a 4-cycle!

The only other option is to, instead of A1, connect to A2 (or A3, but they are currently interchangeable). Following, we connect to B2, C2, D2, A3, B3, C3, D3, and back to A1.

So the other nodes have to be connected in a 12-cycle. If we arrange the nodes on a circle, the connections look like this:

Now we have the diagonal clusters. A1 has to be connected to something in C. Suppose we choose C1. Then we have A1 -> B1 -> C1, which creates a 3-cycle. Similarly, if we choose C3, we get A1 -> D3 -> C3, which again creates a 3-cycle. The only option that avoids this problem is C2.

By this logic, it is possible to ‘force’ all six of the diagonals (this time represented by dotted lines):

So our graph is complete. Now all we have to do is find a pair not being friends, or being friend-of-friends. Easier, we can also find a 3- or 4- cycle, which implies the existence of such a pair.

Now despite our best efforts not to introduce 3- or 4-cycles, we inevitably still have a few (although it’s not entirely obvious at a first glance where they are). Do you see it?

Take A2, for instance. It connects to C3, then to B3, to D1, and back to A2. A 4-cycle! We are finished.