About a month ago, the Facebook Hacker Cup included a nice programming problem involving a certain kind of matching. The *min-cost bipartite matching problem* can be stated like this:

- You have a number P of people, and a number Q ≤ P of jobs. Each person is willing to do any of the jobs, but each person will charge a different price for each job. You are only allowed to hire each person to do at most one job, and you need every job to get done. What is the cheapest way to hire employees for all the jobs?

Just as a quick concrete example, let’s say we have P=3 people (Alice, Bob, Carl) and Q=2 jobs (Vacuuming, Window-washing).

- Alice charges $50 to do vacuuming and $40 to do window-washing.
- Bob charges $30 to do vacuuming and $20 to do window-washing.
- Carl charges $60 to do vacuuming and $30 to do window-washing.

In this case, the minimum cost to get all the jobs done is $60: pay Bob $30 for vacuuming, and pay Carl $30 for window-washing.
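Since the example is tiny, we can confirm that $60 figure by brute force, simply trying every way to assign the two jobs to distinct people. A quick Python sketch (the names and prices are from the example above):

```python
from itertools import permutations

# Price matrix from the example: rows = people, columns = jobs.
people = ["Alice", "Bob", "Carl"]
jobs = ["Vacuuming", "Window-washing"]
cost = [[50, 40],   # Alice
        [30, 20],   # Bob
        [60, 30]]   # Carl

# Try every assignment of the Q jobs to distinct people and keep the cheapest.
best_cost, best_assignment = min(
    (sum(cost[p][j] for j, p in enumerate(perm)), perm)
    for perm in permutations(range(len(people)), len(jobs))
)

print(best_cost)  # 60
print([(people[p], jobs[j]) for j, p in enumerate(best_assignment)])
```

Brute force takes time exponential in Q, of course, which is exactly what the rest of this post is here to avoid.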

I had known about one way to solve this sort of problem, but I later learned another approach called *The Hungarian Algorithm*. It’s a very efficient approach, taking O(Q²P) time. In this post I’ll explain how it works, and in the next post, I will explain how it generalizes to an algorithm for the *min-cost max-flow* problem, specifically the *capacity-scaling successive shortest paths algorithm*.

## Basics

There are some preliminaries that I am going to build on here, which are generally covered in any introduction to flows:

- bipartite matching can be thought of as a flow problem, by attaching a source node **s** to all people, directing edges from people to jobs, and attaching all jobs to a sink node **t**
- given some flow in a graph, the *residual graph* represents the unused capacity of the edges, together with added *reverse edges* that represent the cost that can be saved by decreasing the flow on that edge
- in the min-*cost* flow setting, a reverse edge’s cost is the negative of the original edge’s cost (i.e., cost savings)
- an *augmenting path* is a path in the residual graph (using edges with nonzero capacity) from the source **s** to the sink **t**

The simpler *max-flow* problem, where we don’t care about costs, can be solved like this:

- As long as the residual graph has an augmenting path from **s** to **t**, increase the flow on any augmenting path.

There are several ways to prove this algorithm’s correctness. Here is one way, which will be relevant later on:

Let F be some non-maximum flow, and X be any maximum flow. View the difference between X and F as a flow in F’s residual graph. By the *flow decomposition theorem*, this difference can be written as a nonempty list of augmenting **s**–**t** paths plus a possibly empty list of cycles. Therefore, there is an augmenting **s**–**t** path, and we can keep improving F.

Let N be the number of nodes, and M be the number of edges. For unit capacities, this simple max-flow approach takes at most N iterations of a path-finding algorithm, which itself takes M steps, and so the total time complexity is O(MN).
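Here is roughly what that loop might look like in Python, on an adjacency-matrix residual graph with unit capacities; the small bipartite instance at the bottom is made up for illustration:

```python
from collections import deque

def max_flow(capacity, s, t):
    """Repeatedly find any augmenting s-t path (here by BFS) and push flow."""
    n = len(capacity)
    residual = [row[:] for row in capacity]  # residual capacities
    flow = 0
    while True:
        # BFS for an augmenting path in the residual graph.
        parent = [None] * n
        parent[s] = s
        queue = deque([s])
        while queue and parent[t] is None:
            u = queue.popleft()
            for v in range(n):
                if residual[u][v] > 0 and parent[v] is None:
                    parent[v] = u
                    queue.append(v)
        if parent[t] is None:
            return flow  # no augmenting path: the flow is maximum
        # Push one unit (all capacities are unit) along the path found.
        v = t
        while v != s:
            u = parent[v]
            residual[u][v] -= 1
            residual[v][u] += 1  # reverse edge gains capacity
            v = u
        flow += 1

# Tiny bipartite instance: s=0, people 1-3, jobs 4-5, t=6.
cap = [[0] * 7 for _ in range(7)]
for person in (1, 2, 3):
    cap[0][person] = 1
for person, job in [(1, 4), (1, 5), (2, 5), (3, 5)]:
    cap[person][job] = 1
for job in (4, 5):
    cap[job][6] = 1
print(max_flow(cap, 0, 6))  # 2
```

The answer is 2 here because only two of the three people can be matched: jobs 4 and 5 can each be done once.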

Moreover, this simple algorithm works quite nicely when we add capacities to the edges (but still ignoring costs):

- Taking the *shortest* path each time yields a maximum flow in O(MN) iterations;
- taking the *fattest* path (highest residual capacity) yields a maximum flow in O(M log K) iterations, where K is the maximum capacity.

## Greed and Costs

A *greedy* algorithm is usually one that just makes the cheapest decision at every point in time, without worrying about troubles that can come down the road from not planning ahead. Here’s the simplest greedy algorithm you could imagine for our matching problem now that costs are involved:

- Repeatedly do the following: out of all the people you haven’t hired and jobs you haven’t filled, pay the cheapest price you can to get one more job done.

However, this algorithm is not correct. In our example above, we’d first pay Bob $20 to do window-washing, since that’s the cheapest of all prices. After that, the cheapest price that doesn’t involve Bob or window-washing is to pay Alice $50 to vacuum. This total cost of $70 is not optimal.
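As a quick sanity check, a sketch of this greedy on the example above (with job names shortened) does indeed come out to $70:

```python
# The (incorrect) greedy: repeatedly take the cheapest remaining (person, job) price.
cost = {("Alice", "Vacuuming"): 50, ("Alice", "Windows"): 40,
        ("Bob", "Vacuuming"): 30, ("Bob", "Windows"): 20,
        ("Carl", "Vacuuming"): 60, ("Carl", "Windows"): 30}

hired, filled, total = set(), set(), 0
while len(filled) < 2:
    # Cheapest price among people not yet hired and jobs not yet filled.
    (person, job), price = min(
        ((k, v) for k, v in cost.items()
         if k[0] not in hired and k[1] not in filled),
        key=lambda kv: kv[1])
    hired.add(person)
    filled.add(job)
    total += price

print(total)  # 70, worse than the optimal 60
```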

Nonetheless, even once you add costs into the picture, there still remains a very elegant greedy algorithm that successfully solves min-cost flow:

- As long as the residual graph has an augmenting path from **s** to **t**, increase the flow on any **minimum-cost augmenting path**.

So in our example, the second iteration would use the augmenting path [pay $30 for vacuuming by Bob, refund $20 from Bob for not window-washing, pay $30 for window-washing by Carl] which has cost $30-$20+$30 = $40, less than the direct path [pay $50 for vacuuming by Alice].

I was surprised the first time I read that this algorithm is correct! But it has a very nice inductive proof. In the proof, the *value* of a flow is the amount of flow it sends from the source to the sink (so max-flow is really the problem of finding a flow with maximum value).

For simplicity, assume all capacities are unit. Let F be a minimum-cost value-**k** flow. Let X be any minimum-cost value-**(k+1)** flow. View the difference between X and F as a flow in F’s residual graph. By the flow decomposition theorem, this difference can be written as a single augmenting **s**–**t** path P plus a possibly empty list of cycles.

None of the cycles can have negative cost, because otherwise one of them could have been used to augment F into a cheaper value-**k** flow, contradicting F’s minimality. Therefore, augmenting F by P gives a value-**(k+1)** flow whose cost is at most X’s cost, and which is therefore a minimum-cost value-**(k+1)** flow.

Let’s assume unit capacities for now (which is true in the application to min-cost bipartite matching). This algorithm needs O(N) iterations. How can we find the minimum-cost augmenting paths? The residual graph has negative-cost reverse edges, but crucially, the above proof shows that it never has a negative-cost cycle. So the Floyd-Warshall algorithm in O(N³) time, or the Bellman-Ford algorithm in O(NM) time, can be used in each iteration. This gives a total time complexity of O(N²M).
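Concretely, this Bellman-Ford-based version might be sketched as follows; the residual-graph representation and helper names are my own:

```python
def min_cost_matching(cost):
    """Min-cost bipartite matching via successive cheapest augmenting paths,
    found with Bellman-Ford (which tolerates the negative reverse edges).
    Nodes are s, P people, Q jobs, t; every capacity is unit."""
    P, Q = len(cost), len(cost[0])
    n = P + Q + 2
    s, t = 0, n - 1
    graph = [[] for _ in range(n)]  # indices into `edges`, per node
    edges = []                      # each edge: [to, capacity, cost, reverse index]

    def add_edge(u, v, cap, c):
        graph[u].append(len(edges)); edges.append([v, cap, c, len(edges) + 1])
        graph[v].append(len(edges)); edges.append([u, 0, -c, len(edges) - 1])

    for p in range(P):
        add_edge(s, 1 + p, 1, 0)
    for j in range(Q):
        add_edge(1 + P + j, t, 1, 0)
    for p in range(P):
        for j in range(Q):
            add_edge(1 + p, 1 + P + j, 1, cost[p][j])

    total = 0
    while True:
        # Bellman-Ford for the cheapest augmenting s-t path.
        dist = [float("inf")] * n
        dist[s] = 0
        pred = [None] * n  # edge index used to reach each node
        for _ in range(n - 1):
            for u in range(n):
                if dist[u] == float("inf"):
                    continue
                for ei in graph[u]:
                    v, cap, c, _ = edges[ei]
                    if cap > 0 and dist[u] + c < dist[v]:
                        dist[v] = dist[u] + c
                        pred[v] = ei
        if dist[t] == float("inf"):
            return total  # no augmenting path left
        total += dist[t]
        v = t
        while v != s:  # push one unit along the path
            ei = pred[v]
            edges[ei][1] -= 1
            edges[edges[ei][3]][1] += 1
            v = edges[edges[ei][3]][0]

print(min_cost_matching([[50, 40], [30, 20], [60, 30]]))  # 60
```

On the running example, the second iteration finds exactly the $40 path described earlier (route Bob to vacuuming, undo his window-washing, route Carl to window-washing).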

We can do better!

## Using Your Full Potential

The Hungarian Algorithm, when viewed in the context of flows, lets us improve the above algorithm simply by switching to a better algorithm for shortest paths. Namely, Dijkstra’s algorithm. The only problem is those pesky negative edges; Dijkstra’s algorithm won’t work with those!

There is, however, a way to transform some graphs to avoid negative edges. The transformation is to use a *potential function*, which I’ll denote by **y**, since it’s like a dual solution for the associated linear program.

- A potential **y** is a vector that assigns a real value to every node.
- A *valid* potential is one that satisfies, for every edge uv, y(v) ≤ cost(uv) + y(u).
- The *reduced cost* of edge uv is defined as reduced_cost(uv) := cost(uv) + y(u) − y(v); so a potential is valid if all reduced costs are nonnegative.

Potentials let us transform a graph in a superficial way:

- Let the *reduced length* of a path from u to v be its length measured with the reduced costs.
- Then you can check that the reduced length of any u–v path equals its normal length, plus y(u), minus y(v): the costs of the intermediate nodes telescope away. Since this correction term doesn’t depend on the path, shortest paths under reduced costs are exactly the shortest paths under the original costs.

Conversely, if we compute the reduced distances, we can easily recover the real distances. So as long as the algorithm can maintain a valid potential, the reduced costs are nonnegative, and it can use Dijkstra’s algorithm to find the shortest **s**–**t** path.
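For instance, here is a sketch of Dijkstra’s algorithm run on reduced costs, recovering the true distances at the end. The small graph and its potential are made up for illustration; it has a negative edge, yet every reduced cost is nonnegative (here **y** happens to be the vector of true distances from node 0, foreshadowing the next section):

```python
import heapq

def dijkstra_with_potential(graph, y, s):
    """Dijkstra on reduced costs cost(uv) + y[u] - y[v]; assumes y is a valid
    potential, so all reduced costs are nonnegative.  Returns true distances."""
    n = len(graph)
    INF = float("inf")
    reduced = [INF] * n
    reduced[s] = 0
    heap = [(0, s)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > reduced[u]:
            continue  # stale heap entry
        for v, c in graph[u]:
            rc = c + y[u] - y[v]
            assert rc >= 0  # validity of the potential
            if d + rc < reduced[v]:
                reduced[v] = d + rc
                heapq.heappush(heap, (d + rc, v))
    # Recover real distances: d(s, v) = reduced_d(s, v) - y(s) + y(v).
    return [r - y[s] + y[v] if r < INF else INF for v, r in enumerate(reduced)]

# Hypothetical graph as adjacency lists of (neighbor, cost); edge 2 -> 1
# has cost -2, but there is no negative cycle.
graph = [[(1, 4), (2, 1)], [(3, 1)], [(1, -2)], []]
y = [0, -1, 1, 0]  # a valid potential: every reduced cost is >= 0
print(dijkstra_with_potential(graph, y, 0))  # [0, -1, 1, 0]
```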

How can we maintain a valid potential? It turns out that not only are potentials useful for finding shortest paths, but also shortest paths are useful for computing potentials.

- In a graph with no negative cycles, define **y**(v) for every node v to be the shortest-path distance from **s** to **v**. Then this **y** is a valid potential. (This is just the triangle inequality: d(s, v) ≤ d(s, u) + cost(uv) for every edge uv.)

This is useful algorithmically because it lets us leapfrog from a valid potential, to shortest paths, to the augmenting path needed to augment the min-cost flow, to the next potential. Expressed this way, the Hungarian algorithm looks like this:

1. Let **y** be the all-zero vector.
2. Using **y** as a valid potential, find a shortest **s**–**t** path **P**, stopping if none exists.
3. Redefine **y** to be the shortest-path distances from **s**.
4. Push a unit of flow along **P** and update the residual graph.
5. Go back to step 2.

To prove that this actually works, we need to show that when we update the residual graph, **y** remains a valid potential.

This is because the edges **uv** on any shortest path from **s** satisfy d(s, u) + cost(uv) = d(s, v), which is a form of *complementary slackness*. So adding the reverse edge **vu** with cost(vu) = −cost(uv) will not violate the validity constraint y(u) ≤ cost(vu) + y(v) for the potential, as it holds with equality.

That’s it!
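Putting the steps above together in the flow formulation, a Python sketch might look like this. The residual-graph representation and helper names are my own; the example at the end is the one from the introduction:

```python
import heapq

def hungarian_flow(cost):
    """Min-cost matching of all Q jobs: keep a valid potential y, find the
    cheapest augmenting path with Dijkstra on reduced costs, reset y to the
    new shortest-path distances, push one unit of flow.  A sketch for unit
    capacities; cost[p][j] is person p's price for job j, with Q <= P."""
    P, Q = len(cost), len(cost[0])
    n = P + Q + 2
    s, t = 0, n - 1
    graph = [[] for _ in range(n)]  # indices into `edges`, per node
    edges = []                      # each edge: [to, capacity, cost, reverse index]

    def add_edge(u, v, cap, c):
        graph[u].append(len(edges)); edges.append([v, cap, c, len(edges) + 1])
        graph[v].append(len(edges)); edges.append([u, 0, -c, len(edges) - 1])

    for p in range(P):
        add_edge(s, 1 + p, 1, 0)
    for j in range(Q):
        add_edge(1 + P + j, t, 1, 0)
    for p in range(P):
        for j in range(Q):
            add_edge(1 + p, 1 + P + j, 1, cost[p][j])

    INF = float("inf")
    y = [0] * n                     # step 1: the all-zero potential
    total = 0
    while True:
        # Step 2: Dijkstra with reduced costs cost(uv) + y[u] - y[v].
        dist = [INF] * n
        dist[s] = 0
        pred = [None] * n
        heap = [(0, s)]
        while heap:
            d, u = heapq.heappop(heap)
            if d > dist[u]:
                continue
            for ei in graph[u]:
                v, cap, c, _ = edges[ei]
                if cap > 0 and d + c + y[u] - y[v] < dist[v]:
                    dist[v] = d + c + y[u] - y[v]
                    pred[v] = ei
                    heapq.heappush(heap, (dist[v], v))
        if dist[t] == INF:
            return total            # stop: no augmenting path remains
        total += dist[t] + y[t]     # real path cost = reduced distance + y[t]
        # Step 3: new potential = true shortest-path distances from s.
        for v in range(n):
            if dist[v] < INF:       # unreachable nodes keep their old value
                y[v] += dist[v]
        # Step 4: push one unit of flow along the path found.
        v = t
        while v != s:
            ei = pred[v]
            edges[ei][1] -= 1
            edges[edges[ei][3]][1] += 1
            v = edges[edges[ei][3]][0]

print(hungarian_flow([[50, 40], [30, 20], [60, 30]]))  # 60
```

Each of the Q iterations is one Dijkstra run, which on the dense bipartite graph takes O(QP) time with a simple implementation, giving the O(Q²P) total mentioned at the start.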

## Remarks

Some explanations of the Hungarian algorithm are slightly different in step 3, rephrasing it as the equivalent “Increase **y**(v) by the reduced distance from s to v, for each node v.”

Historically, the algorithm is based on ideas of Kőnig and Egerváry, with Kuhn in 1955 being the first to consider a polynomial-time implementation. This algorithm is the oldest one that I know of where you get a substantial benefit from thinking in both the primal (the flow) and the dual (the potential) at the same time. Similar ideas were used in 1965 by Edmonds to derive an algorithm for min-cost non-bipartite matching; more on that to come in the following post.

The algorithm above still works for graphs with arbitrary integer edge capacities; but if the maximum capacity is K, the number of iterations might be something like K or larger. In the next post we’ll get that back down to polynomial time by giving an algorithm that takes O(M log K) iterations.