Machine learning has run into an unsolvable math problem

Hello, Khabr! Ahead of the launch of new cohorts of our basic and advanced “Mathematics for Data Science” courses, we want to share a rather interesting translation with you. There is no hands-on practice in this article, but the material is worth reading for general knowledge and discussion.


A team of researchers has stumbled upon an open mathematical problem tied to the logical paradoxes discovered by the famous Austrian mathematician Kurt Gödel in the 1930s.

Mathematicians working on machine-learning problems have shown that “learnability”, that is, whether an algorithm can extract a pattern from limited data, is closely tied to the paradox known as the continuum hypothesis. Gödel showed that, using the standard tools of mathematical language, this hypothesis can be neither proved nor disproved. The findings were published in Nature Machine Intelligence on January 7.

“It was a surprise for us,” said Amir Yehudayoff of the Technion, the Israel Institute of Technology in Haifa, who co-authored the study. He said that although a number of technical questions are known to be similarly “undecidable”, he did not expect this phenomenon to appear in such a seemingly simple machine-learning problem.

John Tucker, a computer scientist at Swansea University, UK, says this work is “a tangible result at the edge of our knowledge,” with fundamental implications for both mathematics and machine learning.

Not all sets are equal

Researchers often define learnability in terms of whether an algorithm can generalize what it has learned. The algorithm answers “yes” or “no” (for example, to the question “Is there a cat in this image?”) for a limited number of objects, and must then make predictions for new objects it has never seen before.
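To make this concrete, here is a minimal sketch in Python (the data and the simple threshold rule are illustrative assumptions, not the construction from the paper) of a learner that fits a rule on a few labeled points and then answers for points outside its training sample:

```python
# A toy "learner": fit a threshold on a few labeled points,
# then generalize to points it has never seen.
# The data and the threshold rule are illustrative assumptions,
# not the setup used in the Nature Machine Intelligence paper.

def fit_threshold(samples):
    """Pick the largest feature value labeled 0 as the decision boundary."""
    negatives = [x for x, label in samples if label == 0]
    return max(negatives) if negatives else float("-inf")

def predict(threshold, x):
    """Answer 'yes' (1) if x lies above the learned boundary."""
    return 1 if x > threshold else 0

# A limited training sample: (feature, label) pairs.
train = [(0.1, 0), (0.3, 0), (0.7, 1), (0.9, 1)]
t = fit_threshold(train)

# Generalization: predictions for objects absent from the sample.
for x in [0.2, 0.5, 0.8]:
    print(x, "->", predict(t, x))
```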

Yehudayoff and his colleagues obtained their results by exploring the connection between learnability and “compression,” which involves finding a way to encode the salient features of a large data set in a smaller one. The authors found that whether the information can be compressed efficiently boils down to a question of set theory, the mathematics of sets of objects such as those in Venn diagrams. In particular, it concerns sets of different sizes that contain infinitely many objects.
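As a rough illustration of what compression means here (a simplified sketch; the compression notion analyzed in the paper is more involved), the toy threshold learner above can squeeze its entire labeled sample into a single stored number and rebuild the same classifier from it:

```python
# Sample compression for the toy threshold learner above:
# the whole labeled sample collapses to one point, and the same
# classifier is reconstructed from it. A simplified illustration,
# not the compression scheme analyzed in the paper.

def compress(samples):
    """Keep only the largest point labeled 0."""
    return max(x for x, label in samples if label == 0)

def reconstruct(kept):
    """Rebuild the threshold classifier from the compressed form."""
    return lambda x: 1 if x > kept else 0

train = [(0.1, 0), (0.3, 0), (0.7, 1), (0.9, 1)]
h = reconstruct(compress(train))  # the whole sample, stored as one number
assert all(h(x) == label for x, label in train)
```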

Georg Cantor, the founder of set theory, proved in the 1870s that not all infinite sets are equal: the set of integers, for example, is “smaller” than the set of all real numbers, also known as the continuum. (The real numbers include the irrational numbers as well as the rationals and the integers.) Cantor also conjectured that there is no set of intermediate size, that is, larger than the set of integers but smaller than the continuum. But he could not prove this continuum hypothesis, and neither could the many mathematicians and logicians who followed him.
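For reference, Cantor's diagonal argument, which shows that the reals outnumber the integers, fits in a few lines (a standard textbook sketch, not part of the article):

```latex
% Cantor's diagonal argument (standard textbook sketch; needs amsmath).
Suppose the reals in $[0,1]$ could be enumerated as $x_1, x_2, x_3, \dots$
with decimal expansions $x_n = 0.d_{n1}d_{n2}d_{n3}\ldots$.
Define a new number $y = 0.e_1e_2e_3\ldots$ by changing every diagonal digit:
\[
  e_n =
  \begin{cases}
    5 & \text{if } d_{nn} \neq 5,\\
    6 & \text{if } d_{nn} = 5.
  \end{cases}
\]
Then $y$ differs from each $x_n$ in its $n$-th digit, so $y$ is missing
from the list. No enumeration covers all of $[0,1]$, hence
$|\mathbb{N}| < |\mathbb{R}|$.
```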

Those efforts to settle the hypothesis were in vain. In 1940, Gödel showed, in work completed only in the 1960s by the American mathematician Paul Cohen, that from the standard axioms of set theory the continuum hypothesis can be neither proved nor disproved.

Gödel and Cohen’s work on the continuum hypothesis implies that parallel mathematical universes can exist, both consistent with standard mathematics: one in which the continuum hypothesis is added to the standard axioms, that is, declared true, and another in which it is declared false.

The limits of learning

In their latest work, Yehudayoff and his colleagues define learnability as the ability to make predictions about a large data set by sampling a small number of points from it. The link to Cantor’s problem is that there are infinitely many ways of choosing the smaller set, but the size of that infinity is unknown.

The authors go on to show that if the continuum hypothesis is true, then a small sample is sufficient for the extrapolation. But if it is false, no finite sample can ever be enough. In this way they show that the problem of learnability is equivalent to the continuum hypothesis. As a result, the learnability question, too, is left in a state of uncertainty that can be resolved only by choosing an axiomatic universe.
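Schematically, the dependence described above can be written as follows (a restatement of the article’s summary, not the paper’s formal theorem):

```latex
% Dependence of learnability on the continuum hypothesis (CH), schematically:
\[
  \text{CH true} \;\Longrightarrow\; \text{some finite sample suffices to learn},
\]
\[
  \text{CH false} \;\Longrightarrow\; \text{no finite sample suffices}.
\]
% CH is independent of the standard axioms of set theory, so learnability
% in this model inherits that independence: it is settled only by the
% choice of axiomatic universe.
```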

“The result of the study also helps build a broader understanding of learning,” says Yehudayoff. “This connection between compression and generalization is really fundamental to understanding the learning process.”

Researchers have discovered a number of similarly “undecidable” problems, says Peter O’Hearn, a computer scientist at University College London. In particular, following Gödel’s results, Alan Turing, one of the founders of the theory of algorithms, discovered a class of questions that no computer program can be guaranteed to answer in any finite number of steps.

However, the undecidability found in the recent study is very rare and much more surprising, O’Hearn adds: it points to the intrinsic incompleteness in any mathematical language that Gödel discovered. The results are likely to matter for the theory of machine learning, but they are unlikely to have much practical impact.

Tell us in the comments what you think of this material, and we invite you to a free webinar where we talk about regression analysis methods in Data Science.
