The 3n+1 Dilemma in Java: Caching Recursion

Hello everyone, today I want to tell you about one of the most interesting unsolved mysteries of mathematics. The Collatz conjecture, or the 3n+1 dilemma, became famous due to the simplicity of its formulation, while remaining unproven for over 90 years.

A brief formulation, that is, a slightly modified excerpt from Wikipedia (en And ru):

We take any natural number n:

  1. If it is even, then divide it by 2,

  2. If it is odd, then multiply by 3 and add 1.

We perform the same actions on the resulting number, and so on.

The Collatz conjecture is that no matter what initial number n we take, sooner or later we will get one.

Example:

  • 3 10 5 16 8 4 2 1 (came to unity in 7 steps)

  • 9 28 14 7 22 11 34 17 52 26 13 40 20 10 5 16 8 4 2 1 (here it took 19 steps)

I am far from being a mathematician, but intuitively I understood that the main problem here is that we do not know how long the number will grow, since the factor 3 is greater than the divisor 2, by which we divide the number if it is even. Therefore, we cannot use the induction typical for such problems, and claim that each k number will be less than the previous one, so sooner or later we will come to one.

Actually, I didn’t intend to prove the Collatz conjecture, but it would be a sin not to program such a simple problem.

I wrote the conditions “head-on”:

    public static long coll_func(BigDecimal n){
        BigDecimal copyNum = n;
        long stepsCounter = 0;
        while(!n.equals(BigDecimal.valueOf(1))){
            stepsCounter++;
            if(n.remainder(BigDecimal.valueOf(2)).equals(BigDecimal.valueOf(0))){
                n = n.divide(BigDecimal.valueOf(2));
            }
            else{
                n = n.multiply(BigDecimal.valueOf(3)).add(BigDecimal.valueOf(1));
            }
        }
        return stepsCounter;

And it worked! Indeed, by running this function in a loop from main, I could see the number of steps it would take to get to one for various numbers.

But what is very striking is the fact that the chains given in the example have a fairly large common part:

  • 3 10 → 5 → 16 → 8 → 4 → 2 → 1

  • 9 28 14 7 22 11 34 17 52 26 13 40 20 10 → 5 → 16 → 8 → 4 → 2 → 1

That is, the program recalculates these values ​​each time, and I would like to avoid this. I wanted to make a cache so that when entering 10, the program would understand: damn, I've seen this somewhere before, and would simply add 6 from the cache to the current 13 steps, getting 19.

I found information about it on the Internet Guava cache and decided to use it. Found out that you can configure auto-deletion by time, maximum size, level of multi-threaded isolation and much more.

This is done when creating a CacheBuilder object (before calling .build() you can configure different cache options):

Cache<BigDecimal, Long> collatzCache = CacheBuilder.newBuilder().build();

By the way, maven dependency used this one:

<dependency>     
  <groupId>com.google.guava</groupId>
  <artifactId>guava</artifactId>
  <version>33.2.1-jre</version> 
</dependency>

However, in order for the function to start caching normally, I had to rework the infinite loop into recursion and cache the result of its call in the main method.

It looked like this:

public static long collatzFunc(BigDecimal n, long stepsCounter) {
    if (n.equals(BigDecimal.ONE)) return stepsCounter;

    if (isEven(n))
        n = n.divide(BigDecimal.valueOf(2));
    else
        n = n.multiply(BigDecimal.valueOf(3)).add(BigDecimal.ONE);

    Long stepsInCache = collatzCache.getIfPresent(n);
    if (stepsInCache != null)//If the function has a value in the cache, we get it from there.
        return stepsInCache + stepsCounter + 1;
    else return collatzFunc(n, stepsCounter + 1);//otherwise, simple recursive call
}

The isEven method checks if a BigDecimal is even, I originally used

private static boolean isEven(BigDecimal n) {
    return n.remainder(BigDecimal.valueOf(2)).equals(BigDecimal.ZERO);
}

But then he replaced it with something more concise:

private static boolean isEven(BigDecimal n) {return !n.testBit(0);}

The main method (we test the hypothesis on numbers from 1 to diapason-1 and output the number with the longest path to one):

public static void main(String[] args) {
    long maxSteps = 0;
    long maxHardNumber = 0;
    for (long i = 1; i < diapason; i++) {
        long l = collatzFunc(BigDecimal.valueOf(i), 0);
        collatzCache.put(BigDecimal.valueOf(i), l);

        if (l > maxSteps) {
            maxSteps = l;
            maxHardNumber = i;
        }
    }
    System.out.println("The most hard humber is " + maxHardNumber + ". We need " + maxSteps + " steps to make 1");
}

After doing this kind of optimization, I started experimenting with the cache size and ways to clean it, as a result of which I managed to find the softValues ​​method, which sets the values ​​from the cache to Strength.SOFT and this has the best effect on performance! The garbage collector removes values ​​from such a cache only when it needs memory.

So my cache was created with the following line:

CacheBuilder.newBuilder().softValues().build();

(while the original solution with a loop checked the first 100 million natural numbers in 27 minutes 44 seconds, the solution with cache and softValues ​​did it in just 7 minutes 19 seconds. An increase of more than 3.7 times, not bad)

It is also interesting that none of the implementations of the algorithm, even on relatively weak hardware, goes into the fog or crashes from OutOfMemory, calmly checking millions of 19-digit numbers, and doing so in a reasonable amount of time.

In fact, the most disgusting on the segment [1,100 000 000) стало число 63 728 127: чтобы привести его в единицу, пришлось сделать 949 шагов. А если подумать, то 949 шагов — это не очень много, поэтому вычисления для компьютера не сложны и не вызывают переполнения памяти.

На отрезке [1, 50 000 000) я получил следующее время выполнения для различных конфигураций:

Конфигурация

Время выполнения, сек.

Кэш с фиксированным максимальным размером в 10 000 000, где по истечению лимита элементы вытесняются по принципу FIFO (first in, first out)

159

Кэш без указания максимального размера, с модом softValues

78

Без кэша, цикл

501

Без кэша, рекурсия

387

Интересный факт, в википедии указано, что «По состоянию на июль 2023 года проверены все натуральные числа до 10¹⁰⁰ (десять в сотой степени), и каждое из них продемонстрировало соответствие гипотезе Коллатца.» Гипотеза Коллатца — Википедия (wikipedia.org)

Я прогнал алгоритм на числе 10¹⁰⁰+1 и уверенно могу вам сказать: это число тоже соответствует гипотезе Коллатца, путь в единицу из него занимает 2308 шага.

Таким образом, вы можете самостоятельно запустить этот код, и, дописав в википедию, что проверили еще парочку миллиардов чисел на соответствие гипотезе, обеспечить математическому сообществу крепкий и здоровый сон)

Код проекта на GitHub: https://github.com/youngmyn/collatz‑theory

Источники:

Самая простая нерешённая задача — гипотеза Коллатца [Veritasium] (youtube.com) – a very cool video on the topic, which inspired me to write this article

Collatz conjecture – Wikipedia (wikipedia.org) — Russian Wikipedia hypotheses

Collatz conjecture – Wikipedia Collatz conjecture – Wikipedia (wikipedia.org) — English Wikipedia hypothesis

Cache (Guava: Google Core Libraries for Java 23.0 API)) – guava cache

https://habr.com/ru/articles/673 224/ — a cool article about a mathematical approach to creating an optimal implementation of lru cache

Similar Posts

Leave a Reply

Your email address will not be published. Required fields are marked *