The Traveling Salesman Problem in the General Case: the Fastest Exact Solution

There is only one path to greatness, and that path is through suffering.

– Albert Einstein

This work is the conclusion of a five-year marathon in search of the fastest way to obtain an exact minimal solution to the traveling salesman problem in the general case.

Here I would like to summarize all the approaches I have tried and pick the one I consider best.


I know of four proven ways to find an exact solution to the asymmetric traveling salesman problem.

1. Brute force method;

2. Dynamic programming method with memoization;

3. Branch and bound method with complete enumeration;

4. Methods based on linear programming.

Here we will not consider algorithms that do not guarantee an exact solution. Many approaches based on heuristics give remarkably fast results, but there is no guarantee that the result obtained is actually the minimum. We will also not assume the distances between points are Euclidean: for Euclidean instances, there are algorithms that significantly shrink the search space. In addition, I suggest treating the problem as asymmetric, even when it is not. That is, we assume nothing at all about the input matrix, except that the graph it defines contains no self-loops, and even this can be ignored. Hardcore only!

Let's get started!

  1. Brute force method

It consists of cyclically enumerating all possible tours while keeping the shortest one found so far. The method is practically useless due to its enormous computational complexity. For the general traveling salesman problem, it is:

(n-1)! – the number of tours, a factorial in the number of cities.

Already at n = 20 we would need to check 19! = 121,645,100,408,832,000 tours. At a billion combinations per second, that would take almost four years.

I roughly estimate the method as capable of handling no more than fifteen to twenty cities.
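For illustration, here is a minimal brute-force sketch in Python (my own illustration, not code from the repository linked below); it assumes the input is a square distance matrix given as a list of lists:

```python
from itertools import permutations

def tsp_brute_force(dist):
    """Exact TSP: enumerate all (n-1)! tours that start and end at city 0."""
    n = len(dist)
    best_cost, best_tour = float("inf"), None
    for perm in permutations(range(1, n)):
        tour = (0,) + perm
        # Cost of the closed cycle, including the edge back to city 0.
        cost = sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))
        if cost < best_cost:
            best_cost, best_tour = cost, tour
    return best_cost, best_tour
```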

  2. Dynamic programming method with memoization

A very cool method for solving the traveling salesman problem. Its main advantage is that its running time is very stable with respect to the input data, and it also parallelizes well. The previous article provides an implementation suitable for the GPU.

But this method has an Achilles heel: it requires exponentially growing memory to store intermediate calculations, namely (n-1)*2^(n-2) values. Each increase of n by one more than doubles the required memory.

For a problem of size n = 30, you will need 29 * 268435456 * 4 bytes – about 29 GB of memory.

I estimate the method as capable of solving problems of up to three dozen cities.
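For reference, a compact CPU sketch of the Held–Karp recursion (again my own illustration; the GPU-oriented implementation is in the previous article). Here dp[mask][j] holds the cheapest cost of a path that starts at city 0, visits exactly the cities in mask, and ends at city j:

```python
def tsp_held_karp(dist):
    """Held–Karp dynamic programming over subsets; O(n^2 * 2^n) time."""
    n = len(dist)
    size = 1 << (n - 1)                 # bit masks over cities 1..n-1
    INF = float("inf")
    dp = [[INF] * n for _ in range(size)]
    for j in range(1, n):
        dp[1 << (j - 1)][j] = dist[0][j]
    for mask in range(size):
        for j in range(1, n):
            if dp[mask][j] == INF or not (mask >> (j - 1)) & 1:
                continue
            for k in range(1, n):       # extend the path by an unvisited city k
                if (mask >> (k - 1)) & 1:
                    continue
                new = mask | (1 << (k - 1))
                cand = dp[mask][j] + dist[j][k]
                if cand < dp[new][k]:
                    dp[new][k] = cand
    full = size - 1
    return min(dp[full][j] + dist[j][0] for j in range(1, n))
```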

  3. Branch and bound method with complete enumeration

The branch and bound method differs from the previous two approaches in that it can reject solutions that are worse than the best one already found while the search is still in progress. Thus we can prune huge regions of the decision tree, significantly reducing the running time. However, the method's speed is very sensitive to the input data: on matrices of the same size, running times can differ by factors of tens of thousands or more.

Depending on the implementation, the method can gain significant advantages over dynamic programming, but there is no guarantee we will live to see the result for a particular instance. In a previous article, I described my own version of the algorithm in C, with which I was able to find an exact solution in 90% of cases for random data sets up to n = 40.

I roughly estimate the method as capable of solving some problems of up to n ~ 40 cities.
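As a simplified illustration of the pruning idea (the C implementation described in that article is more elaborate), here is a sketch that uses each city's cheapest outgoing edge as a lower bound:

```python
def tsp_branch_and_bound(dist):
    """Exact TSP via depth-first branch and bound with a simple lower bound."""
    n = len(dist)
    # Cheapest outgoing edge of each city: every city still to be left
    # contributes at least this much to any completion of the tour.
    min_out = [min(dist[i][j] for j in range(n) if j != i) for i in range(n)]
    best = float("inf")

    def dfs(city, cost, remaining):
        nonlocal best
        # Prune: optimistic completion is already no better than the incumbent.
        if cost + min_out[city] + sum(min_out[c] for c in remaining) >= best:
            return
        if not remaining:
            best = min(best, cost + dist[city][0])  # close the tour at city 0
            return
        # Visit nearest cities first to find good incumbents early.
        for nxt in sorted(remaining, key=lambda c: dist[city][c]):
            dfs(nxt, cost + dist[city][nxt], remaining - {nxt})

    dfs(0, 0, frozenset(range(1, n)))
    return best
```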

  4. Methods based on linear programming

The traveling salesman problem can be formulated as an integer linear programming (ILP) problem. Several formulations are known; the two most famous are:

Miller-Tucker-Zemlin (MTZ)

Dantzig-Fulkerson-Johnson (DFJ)

In both formulations, each possible edge between cities is modeled by an integer variable x_ij ∈ {0, 1}, which takes the value 1 if the edge is part of the resulting solution and 0 otherwise.

a) Miller–Tucker–Zemlin (MTZ) formulation

It is described by the following diagram:

[Figure: Miller–Tucker–Zemlin (MTZ) formulation]
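In case the image does not display, here is the standard textbook statement of the MTZ model, with c_ij the distance matrix, x_ij the edge variables, and u_i auxiliary ordering variables (exact notation may differ slightly from the diagram):

```latex
\begin{aligned}
\min\ & \sum_{i \neq j} c_{ij}\, x_{ij} \\
\text{s.t.}\ & \sum_{i \neq j} x_{ij} = 1 \quad \forall j, \qquad
               \sum_{j \neq i} x_{ij} = 1 \quad \forall i, \\
             & u_i - u_j + n\, x_{ij} \le n - 1, \quad 2 \le i \neq j \le n, \\
             & x_{ij} \in \{0, 1\}, \qquad u_i \in \{2, \dots, n\}.
\end{aligned}
```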

Of all the ILP models for the traveling salesman problem that can be found on the Internet, this formulation is the most popular. The solution is found in a single solver run: we look for a minimal cycle that passes through all vertices exactly once, and the lower block of constraints forbids cycles of smaller size.

In tests, MTZ sometimes beats dynamic programming in solution time, but it is not at all stable with respect to the input data. On the other hand, the algorithm is far less demanding on RAM than DP.

I consider it not very practical, but still capable of solving problems with up to three dozen vertices, sometimes slightly more.

b) Dantzig–Fulkerson–Johnson (DFJ) formulation

[Figure: Dantzig–Fulkerson–Johnson (DFJ) formulation]
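Again for reference, the standard statement of the DFJ model; the third line is the subtour elimination constraint, one for every proper subset S of cities:

```latex
\begin{aligned}
\min\ & \sum_{i \neq j} c_{ij}\, x_{ij} \\
\text{s.t.}\ & \sum_{i \neq j} x_{ij} = 1 \quad \forall j, \qquad
               \sum_{j \neq i} x_{ij} = 1 \quad \forall i, \\
             & \sum_{i \in S} \sum_{j \in S,\, j \neq i} x_{ij} \le |S| - 1,
               \quad \forall S \subsetneq \{1, \dots, n\},\ 2 \le |S| \le n - 1, \\
             & x_{ij} \in \{0, 1\}.
\end{aligned}
```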

This formulation is even less suitable for solving the traveling salesman problem with ILP directly, since the subtour elimination condition creates an exponential number of constraints. In its pure form this is almost a complete enumeration, except that we enumerate constraints rather than tours, and there are very, very many of them: on the order of 2^n.

I evaluate the approach as capable of solving problems for up to two dozen cities with some stretch.

c) Author's approach

If you take the Dantzig–Fulkerson–Johnson model as a basis and then slightly change the approach to choosing the constraints, you can get a simply gigantic increase in solution speed.

I have written about this in my works once, twice, and three times. Perhaps my literary talent is somewhat far from perfect, and I do not always manage to convey the main idea to readers. Here I want to summarize everything described above and present the best approach known to me.

The basic idea is to break one large exponential problem into a series of simpler exponential problems, such that the total running time of the sequence of small problems is significantly less than the time to solve the single large one.

To give a sense of how much less: I managed to solve problems of up to five hundred vertices exactly. True, this required good input data, but even in the worst case the method beats all the other methods listed above by a huge margin.

Let me briefly repeat the essence of the method. The beginning is the same as in the MTZ and DFJ models.

Although I prefer index numbers starting from zero rather than one, that's not the point.

The first difference is that we will not specify an upper bound for x.

x ∈ {0, 1} becomes x ≥ 0, where x is an integer.

This removes a whole set of inequality constraints on x from the linear solver. The upper bound is obtained implicitly from the condition that the number of edges entering and leaving each vertex is exactly one. Why additionally constrain the variable x_ij from above if the sum is always at most one?
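Formally, the degree constraint together with non-negative integrality already forces each variable to be 0 or 1:

```latex
\sum_{j \neq i} x_{ij} = 1, \quad x_{ij} \ge 0, \quad x_{ij} \in \mathbb{Z}
\;\Longrightarrow\; x_{ij} \in \{0, 1\}.
```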

Next, we simply solve the resulting model and obtain a reference solution. Then we check whether the solution has split into several subtours; if there is more than one subset, we add a constraint to the model and repeat the step.

If only one set remains as a result, we take it as the answer: we have found the minimal solution.

Another optimization that was not considered in previous works: we add constraints only for those subsets whose size does not exceed n / 2. This relies on the fact that whenever a subset of size greater than n / 2 appears, there must also be another subset of size less than n / 2, and we include all such constraints in the model. In this way we slightly reduce the number of constraints by removing redundant conditions.

Since we only take subsets of size less than half of n, it is always better to state the constraint in the following form:

For each subset, the number of edges within it must be at least one fewer than the number of vertices in that subset (i.e., the sum of x_ij over i, j in S is at most |S| - 1).
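To make the loop concrete, here is a simplified sketch in Python using PuLP with its bundled CBC solver (the real implementation is in the repository linked below; whether it uses PuLP or another solver interface, the names and structure here are illustrative):

```python
import pulp

def extract_cycles(succ):
    """Decompose a successor map {i: next city} into disjoint cycles."""
    seen, cycles = set(), []
    for start in succ:
        if start in seen:
            continue
        cycle, cur = [], start
        while cur not in seen:
            seen.add(cur)
            cycle.append(cur)
            cur = succ[cur]
        cycles.append(cycle)
    return cycles

def solve_tsp(dist):
    n = len(dist)
    prob = pulp.LpProblem("tsp", pulp.LpMinimize)
    # Integer variables with no explicit upper bound: x_ij >= 0.
    x = {(i, j): pulp.LpVariable(f"x_{i}_{j}", lowBound=0, cat="Integer")
         for i in range(n) for j in range(n) if i != j}
    prob += pulp.lpSum(dist[i][j] * v for (i, j), v in x.items())
    for i in range(n):
        prob += pulp.lpSum(x[i, j] for j in range(n) if j != i) == 1  # leave i once
        prob += pulp.lpSum(x[j, i] for j in range(n) if j != i) == 1  # enter i once
    while True:
        prob.solve(pulp.PULP_CBC_CMD(msg=False))
        succ = {i: next(j for j in range(n) if j != i and x[i, j].value() > 0.5)
                for i in range(n)}
        cycles = extract_cycles(succ)
        if len(cycles) == 1:
            return cycles[0]            # a single tour: the exact optimum
        for S in cycles:
            if len(S) <= n // 2:        # constrain only the small subsets
                prob += pulp.lpSum(x[i, j] for i in S for j in S
                                   if i != j) <= len(S) - 1
```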

The method works well on random data, but worse on symmetric or homogeneous sets.

For objectivity, here is one of the worst inputs for the method: a set of points in a metric space at equal distances from each other, with the adjacency matrix calculated using the Euclidean norm.

[Figure: one of the worst inputs for the method]

Since this example has a great many optimal solutions, the solver has to try them all.

Implementation of the method in Python: github.

The ILP models can be compared using the following metaphor:

DFJ is like piling up a huge heap of constraints. There are so many of them that it is almost impossible to get a result: you would have to process every condition.

[Figure: slag heap – a metaphor for DFJ]

MTZ is like a fortress tower. From the outside, the stone masonry of conditions keeps the solution sets from spilling out. Building such a tower is easier than simply pouring a mound of earth of similar height, but it is still quite a lot of work.

[Figure: fortress – a metaphor for MTZ]

The author's approach can be compared to the reinforced concrete frame of a modern building. We build it up layer by layer, adding new constraints that correspond to the solutions already found. Thanks to this approach we can build very tall buildings, that is, find solutions to much larger problems in the same amount of time.

[Figure: reinforced concrete frame – a metaphor for the author's approach]


Summary:

We have gone over the methods for obtaining an exact solution to the traveling salesman problem in general.

All the approaches considered take time exponential in the size of the input! However, the required time varies greatly depending on the method.

Here is my rough estimate of the largest problem size n (the number of graph vertices) that each method can handle:

1. Brute force method: no more than 15 – 20

2. Dynamic programming with memoization: no more than 30

3. Branch and bound with complete enumeration: no more than 30 – 40

4. Methods based on linear programming:

a) DFJ: no more than 20

b) MTZ: no more than 30 – 35

c) The author's approach: from 50 in the worst case to more than 500 in the best.

I suggest you draw your own conclusions.

Thank you for your attention!

Github link
