In a number of practical applications, it becomes necessary to search for simple cycles on graphs, which raises the question of estimating the number of computational operations required for this, i.e. computational complexity of the problem.
By itself, the task of iterating cycles on a particular graph is not particularly difficult and can be easily implemented by means of almost any programming language that supports recursive calculations. However, the search for methods for a priori estimation of the computational costs of the enumeration algorithm, to my surprise, did not give sane results. Nevertheless, fairly simple arguments allow one to obtain the required estimate, which agrees well with the test data and allows, among other things, to obtain some additional information about the structure of the enumeration results.
Formalization of the task
So, for definiteness, we will consider the graph G – directed, without loops.
In this case, the order of the graph is obviously equal to N, and the number of edges is M.
A graph can be one-to-one associated with an adjacency matrix A.
Due to the orientation of the graph G, the corresponding adjacency matrix is generally asymmetric:
The condition (4) of the absence of loops in the graph leads to the fact that only zero elements are allowed on the main diagonal of the adjacency matrix
and the matrix thus has the form
By a simple cycle L of length K on a graph G we mean a sequence of K non-repeating graph vertices such that every two consecutive vertices in the sequence, as well as the last and first vertices in it, are adjacent. In what follows, only simple cycles will be discussed; we will, for brevity, call them simply cycles.
We will further denote
Estimating the number of cycles
To construct an evaluation formula, we make the following assumption: for a graph G, we know only the order of the graph and the number of edges in it; the exact form of the adjacency matrix, any patterns of its formation, etc. we are unknown. In other words, we will consider that our particular graph was constructed as a result of a stochastic process of this kind:
N vertices of the graph were fixed, i.e. set v.
An adjacency matrix of size N x N was filled with zero values.
In the adjacency matrix, exactly M elements that did not lie on the main diagonal were chosen at random, and they were assigned single values.
On the indicated vertices, in accordance with the adjacency matrix, the edges of the graph E were constructed.
Taking into account the above and (11), the probability that an arbitrary element of the adjacency matrix outside the main diagonal is equal to one is obviously equal to
It is easy to see that expressions (12), (13) define a cyclic placement without repetitions of length K from the general population V of size N elements; it is known from combinatorics that the number of different arrangements in this case is
In order for the cyclic placement from (12), (13) to correspond to a cycle on the graph G, it is necessary that conditions (14), (15) be satisfied, i.e., so that there are corresponding edges, or, in other words, the K corresponding elements of the adjacency matrix are equal to one. Taking into account (18), the probability of this is
Thus, based on the chosen model for the formation of graph G, an arbitrary cyclic placement without repetitions, consisting of K graph vertices, forms a cycle on this graph with probability (21).
Note that now the number of cycles of length K on the graph can be considered as a random variable whose mathematical expectation from (19) and (21) is
And, finally, the mathematical expectation of the number of cycles on the graph G, taking into account (17), is equal to
This is the required estimate.
Remarks and discussion
First of all, we note that this estimate is in fairly good agreement with the experimental data. Table 1 shows the data on the number of cycles in the columns obtained by the enumeration method. The experiment covered 4 groups of graphs, each group contained 10 graphs with 50 vertices each and the number of edges 90, 100, 110 and 120 for each of the groups, respectively. Despite the fact that for each particular graph the number of cycles can differ markedly from the estimated one, the group average value is quite close to it.
Further, it is worth noting that even for relatively small graphs, estimate (23) predicts the presence of a very significant number of cycles on them; a number of estimated values are given in Table 2. This should be taken into account when planning calculations related to iterating cycles on a graph: they can take a little more time and resources than it seems!
Another interesting point is that formula (23) makes it easy to construct a function that can be interpreted as an estimate of the cycle length distribution density:
On fig. 1 shows graphs of the estimated and experimental cycle length distribution density on a graph with 60 vertices and 140 edges. As you can see, they agree quite well with each other. In this case, the experimental density is understood as
where the number of cycles in the relation is obtained by direct counting.
Thus, estimate (24) makes it possible to make a well-founded prediction about the characteristic lengths of cycles on the graph.
It remains to be noted that although the reasoning in the article refers to the case of simple cycles on directed graphs without loops, similar results can be obtained for other types of graphs and cycles by similar reasoning.
I hope that the results obtained will be of interest to the reader, will be useful to someone in his practical activities, and, perhaps, will serve as an impetus for further research in this area.