The practice of teaching and learning programming predominantly in imperative languages (including object-oriented imperative languages) leads to the fact that such a fundamental algorithmic mechanism as recursion remains poorly understood by many programmers and generates misconceptions broadcast in popular culture. Let’s try to bring some clarity to the question and state some of our thoughts on this matter.
What is recursion?
What is recursion in the everyday sense? This is the solution of the problem by reducing it to itself in simpler conditions.
Let’s say we are tourists, and we are faced with the task of trying to find a can of stew in our backpack with food. Let’s check if the backpack is empty. If empty, the problem is solved, there is no stew. Otherwise, we take out the first item that comes across from the backpack. If this is a can of stew, the problem is solved, the stew is found. Otherwise, we repeat our entire algorithm for the lighter backpack we now have with the remaining contents, and the result of solving the problem is postponed to the next step.
Even a small child will understand this example of recursion, and programmers will notice that it is essentially the task of finding an element in a singly linked list.
Recursion itself is just one possible way to describe repetitive actions. A cycle can always be described as a recursion. Recursion can be described as a finite number of loops, but this may sometimes require an additional data structure to store the sequence of recursive calls.
We formally introduce some definitions. The question is far from new, so let’s take definitions from fairly academic sources, such as [Friedman1975] And [Хювёнен1990]. In mathematics, recursion is considered in the theory of recursive functions, which is one of the formalisms of the theory of algorithms, so the definitions of many terms are based on very clearly defined mathematical concepts.
recursion – using oneself. Also, for simplicity, the word recursion refers to a recursive call.
Recursive call – a direct or indirect call by a function to itself.
simple recursion is a recursive call that occurs at most once in each branch of the function code. Chandra in [Chandra1972] showed that simple recursion can always be reduced by the compiler to an iterative loop, this result was further improved, as described below.
terminal branch is a branch of the recursive function code that completes it without further recursive calls.
Infinite recursion is a sequence of recursive calls that never goes to the terminal branch.
Parallel Recursion – recursion occurring several times in one branch of the function code.
Mutual recursioni is a call to two or more functions of each other.
Recursion by value is a recursive call that determines the result of the function.
Recursion on arguments is a recursive call involved in the calculation of the function arguments.
High order recursion – the case when in the definition of a function a recursive call is an argument to a call to the same function. Note that high order recursion has nothing to do with high order functions. With recursion of order N, computations in N+1 nested loops can be described, but the converse is not always true.
Recursion Order – the level at which the call is located in the high-order recursion. The usual cases of recursion used in practice contain zero-order recursive calls.
Styled recursive function – a special kind of zero-order recursive function, more general than simple recursion, and satisfying the seven requirements described in the article [Friedman1975] (those who wish can read them at the link below). The authors show that a styled function can always be reduced by the compiler to an iterative loop.
The following are strictly undefined concepts.
tail recursion – simple recursion, the recursive call of which is at the end of the function code. Many sources on the internet refer to tail recursion as only those calls that immediately precede the return from the function (let’s call it tail recursion I), while others interpret tail recursion more broadly, understanding it as a single recursive call somewhere in the last linear code branch (let’s call it tail recursion II), or as a simple recursion.
Tail recursion optimizationor tail call optimization – transformation by the translator of the tail call of the function (not necessarily recursive) into a linear (cyclic) code. Here again, the range of interpretations is wide, ranging from a simple replacement of a pair of commands call/ret on jmp (which, among other things, eliminates tail recursion I) to more complex optimizations of tail recursion II and simple recursion.
Application of definitions
Our everyday example with stew in a backpack was a simple recursion with two terminal branches and a tail recursion. Note that recursive code always starts with terminal branches, otherwise the program execution will plunge into endless recursion and never reach them.
Let’s write pseudo-code in Lisp for our tourists so that they understand exactly what to do when they wake up in the morning hungry and not remembering what they have in their backpacks:
(defun ищемТушёнку (рюкзак) (cond ((null рюкзак) nil) ((тушёнка? (car рюкзак)) (car рюкзак)) (t (ищемТушёнку (cdr рюкзак)))))
Here we have defined the function looking for stew from list parameter backpack. Its body consists of a conditional statement cond, which has two terminal and one recursive branch. The backpack is checked for emptiness, then the first element of the backpack (car backpack) is checked by a special predicate whether it is a stew, then by the third branch, which is preceded by the condition that is identically true at this moment t, we go to a recursive call with the rest of the list (cdr backpack).
If there is a desire to bring the matter to computer executable code, we also define our predicate:
(defun тушёнка? (x) (eq x 'тушёнка))
Right in this form, you can enter it into GNU Common Lisp or SBCL and look for stew.
(ищемТушёнку '(носки хлеб учебникЛиспа штопор тушёнка смартфон ноутбук шишка шишка кирпич носовойПлаток нож кредитка конфета бумажка зубочистка непонятныйМусор)) ТУШЁНКА
The code here is written in a purely functional style, does not contain variables and assignments, which is efficient in terms of memory allocation in the static section and on the heap.
Many programmers think that recursion is inefficient because it eats up stack space and has no place in production code. Is it so?
Of course, any tool must be used for its intended purpose, and to enumerate numbers from 1 to N, you probably do not need to use a recursive call. However, is recursion really that terrible compared to an iterated loop?
Modern 64-bit programming systems usually easily allow up to gigabytes of address space to be allocated on stack memory. This is enough for literally billions of nested recursive calls. It is unlikely that you will ever need recursion of this depth, and even if you do, then the main problem is likely to be processor time, not stack memory. A loop with meaningful meaning for a billion iterations is a serious matter. Nevertheless, there is still a problem with the stack, and it must be remembered.
However, as follows from the above definitions, any stylized recursion, and even more so any simple recursion, can be reduced to the form of an iterative cycle. Many translators are aware of this, to a greater or lesser extent.
A frequently used optimization technique in compilers is tail recursion optimization, or tail call (tail recursion) optimization. Many programmers know that compilers usually convert the tail recursion of I into an equivalent loop, so a task like our knapsack search will, after optimization, not take up space on the stack as we move deeper into the knapsack.
However, the Internet is full of opinions that the ability of the compiler to optimize tail recursion is exhausted by replacing the pair of call / ret instructions with the jmp instruction. Therefore, allegedly, even the usual factorial function in the form
(defun fact (n) (cond ((zerop n) 1) (t (* n (fact (- n 1))))))
is unsuitable for optimization and will cause a destructive stack growth, since after the recursive call there is still a multiplication to be performed, and the function code, therefore, is not a tail recursion of I. Further, people who share this opinion suggest “optimizing” this factorial, passing the multiplied values as the second parameter .
In reality, for example, [Penman2018] dismantled an example of compiling the corresponding code in C/C++ and showing that tail recursion II is optimized and replaced with a loop by a modern compiler. Attempts to perform such optimization manually lead nowhere at the level of machine code and only make it difficult to understand the source code.
In general, already at the level of common sense, it is clear that the calculations written in the code after calling tail recursion II can be moved from the epilogue of the resulting loop to the prologue of its next iteration, than to reduce the problem to tail recursion I.
In practice, it turns out that compilers of popular computing languages (C/C++, Fortran) usually provide deep tail recursion optimization with optimization enabled. Lisp translators optimize tail recursion to varying degrees. The Java and Python translators do not optimize tail recursion as a matter of principle, as the developers consider it important to preserve the original call trace.
Again, we must return to the circumstance that already the meaning (fact 1000) occupies the whole screen with the digits of the resulting number, and only 1000 elements are typed on the stack.
High order nightmares
Let us now consider a truly non-optimizable recursive function, for example, an extremely computationally intensive Ackermann function with first-order recursion that quickly goes to great depths. Quoted from [Хювяйнен1990]:
(defun аккерман (m n) (cond ((= m 0) (+ n 1)) ((= n 0) (аккерман (- m 1) 1)) (t (аккерман (- m 1) (аккерман m (- n 1))))))
Meaning (akkerman 4 1) it is calculated on my computer in SBCL in 16 seconds with a occupied stack of less than 4 megabytes, that is, the stack is consumed at a rate of 256 kilobytes per second. Thus, a 4 GB stack would be enough for 4.5 hours of computation of a recursive function that essentially does nothing but deepen its recursion. (For the sake of completeness, note that the value (ackerman 4 2) calculate by brute force is not yet in human power; and we think it is quite improbable that anyone would need a computational process described by recursion of the second or higher order for practical purposes).
Since this is an article about recursion, the conclusion is that the conclusions are presented in this article about recursion itself.
[Хювёнен1990] E. Hyvönen, J. Seppänen. World of Lisp. M.: Mir, 1990.
[Chandra1972] A. Chandra. Efficient compilation of linear recursive programs. Stanford AI project, STAN‑CS-72–282, 1972.