Large Prime Numbers: Fourier Transform

In one of my previous articles, I talked about mathematical algorithms for checking the primality of very large numbers. But all of those algorithms rest on one basic operation – multiplying two large numbers. Long multiplications take 99.9% of the time of any primality test. So how is multiplication implemented in practice? The usual answer is: with the help of the Fast Fourier Transform. But a quick reading of Wikipedia is puzzling: what does the Fourier transform have to do with multiplying integers? Let's figure it out.

The Fourier transform is widely used in our daily lives. It can be found in technologies such as JPEG, MP3, Wi-Fi, and 4G communications, and it is one of the basic algorithms of digital signal processing. Any signal recorded as individual points, whether a two-dimensional picture or one-dimensional sound, can be decomposed into components using the discrete Fourier transform.

When we talk about components, we mean primarily sinusoids, the sum of which makes up the signal. This is logical when applied to physical processes, because sinusoids occur wherever there are vibrations. Air vibrations, which we call sound waves, consist of individual sinusoids, the frequency of which we perceive as different tones. We perceive the frequency of light vibrations as color. Our eyes and ears are natural detectors of sinusoids of different frequencies.

In the case of digital signal processing, identifying sinusoids is also one of the main tasks. The easiest way to do this is to take a sinusoid of known frequency f and multiply each of its points by the corresponding point of the signal. If the signal contains exactly the same sinusoid, we get the square of that sinusoid (case 1 in the illustration). Since a square is always positive, summing all the points of the result gives a large positive number – a clear indication that such a sinusoid is present in the original signal. If the frequency of the sinusoid in the signal differs from the probed one, the product contains both positive and negative numbers, and their sum tends to zero (case 2). Zero means that there is no sinusoid of frequency f in the signal, and we need to try other frequencies. This approach does not immediately tell us which frequency the original signal contains, but it lets us find it by enumeration.
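Here is a minimal numerical sketch of this test, assuming Python with NumPy; the signal, its frequency, and the length are arbitrary choices for illustration:

    import numpy as np

    N = 1000
    t = np.arange(N) / N                       # N sample points on [0, 1)
    signal = np.sin(2 * np.pi * 5 * t)         # a signal containing frequency 5

    # Case 1: probe with the matching frequency -> large positive sum (~N/2)
    print(np.sum(signal * np.sin(2 * np.pi * 5 * t)))   # ~500.0

    # Case 2: probe with a different frequency -> sum close to zero
    print(np.sum(signal * np.sin(2 * np.pi * 7 * t)))   # ~0.0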

However, in addition to frequency, a sinusoid has another characteristic – the phase, or shift relative to the zero point. In the case of a shift by half a period, we get a mirrored sinusoid (antiphase). The product of such sinusoids is still a square, only taken with a minus sign (case 3). Summing the result gives a large negative number, which indicates that the signal contains this frequency, but in antiphase. However, if the phase is shifted by only a quarter of a period, the sinusoid turns into a cosine, and the sum of the result again turns to zero (case 4), just as when the sought frequency is absent.

The solution to this problem is simple – you need to multiply the original signal by both the sine and the cosine of the given frequency. Both results will be zero if there is no such frequency in the signal, and they take non-zero values depending on the phase of the signal. Moreover, the ratio of the two results can be used to determine the phase exactly. The cosine part effectively provides an additional coordinate axis: instead of just a positive or negative sine result, we now have two numbers that define a vector in two-dimensional coordinates. The angle relative to the horizontal axis gives us the phase, and the amplitude of the original sine wave can be calculated from the length of the vector.

This turns out to be so convenient that simultaneous multiplication by sine and cosine has become a standard operation. And there is even a convenient mechanism for it – complex numbers a + ib. When multiplied by an ordinary number, each element of the pair (a, b) is multiplied independently, which is exactly what we need. Traditionally, the cosine is taken as the real part and the sine as the imaginary part, so such a complex number looks like \cos{x}+i\sin{x}. Accordingly, all the usual formulas for working with complex numbers apply: the phase and amplitude are calculated as the argument and modulus of the complex sum of the result.
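A sketch of the same test with a complex probe (again Python with NumPy; the amplitude and phase below are arbitrary test values):

    import numpy as np

    N = 1000
    f = 5
    t = np.arange(N) / N
    signal = 2.0 * np.sin(2 * np.pi * f * t + np.pi / 3)   # amplitude 2, phase pi/3

    # Multiply by cos + i*sin of the probed frequency and sum
    probe = np.cos(2 * np.pi * f * t) + 1j * np.sin(2 * np.pi * f * t)
    z = np.sum(signal * probe)

    print(2 * abs(z) / N)   # ~2.0: the amplitude, from the length of the vector
    print(np.angle(z))      # ~pi/6, i.e. pi/2 minus the phase, from the angle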

Moreover, this notation can be simplified even further. The fact is that when the great Russian mathematician of Swiss origin Leonhard Euler began to study the expansion of functions into infinite series, he noticed that the sum of the series for \cos{x} and i\sin{x} matches the series for e^{ix}. This equality is now called Euler's formula:
e^{ix} = \cos{x}+i\sin{x}

What is the meaning of the function e^{ix}? For a long time this remained unclear. For example, in the 19th century, Harvard University professor Benjamin Peirce, after proving this formula in a lecture, said: “This is probably true, but it is absolutely paradoxical; we cannot understand it, and we do not know what it means, but we have proved it, and therefore we know that it must be true.” Currently, the geometric interpretation of this formula is accepted:

e^{ix} is the rotation of the vector (1,0) by the angle x in the complex plane. For example, x = \pi corresponds to a rotation of 180 degrees and gives the vector (-1,0), which can also be expressed as the very well-known Euler's identity:
  e^{i\pi} + 1 = 0.
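A one-line numerical check of the identity, using Python's standard cmath module:

    import cmath
    print(cmath.exp(1j * cmath.pi) + 1)   # ~0 up to floating-point rounding: e^(i*pi) = -1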

This formula is considered by many to be the most beautiful formula in mathematics. It uses the basic mathematical constants: 0, 1, \pi, e, i; and the basic operations: =, +, *, power. If we fly to another planet with intelligent life and watch how aliens write this formula, we will begin to understand a significant part of their mathematics.

But let's return to the topic of digital signal processing. In digital systems, we work not with the signal directly, but with its discrete recording in the form of a sequence of values a_n of length N. Accordingly, both the sine and the cosine wave must also be discretized, which is easily achieved by taking t = n/N.

Thus, writing out formally the sum of the products of a discrete signal with the sine and cosine waves of frequency f, we get:

\overline{a} = \sum_{n=0}^{N-1}{a_n e^{i2\pi fn/N}}

This formula is a powerful tool for detecting a signal at a frequency f. But what if we want to scan a range of frequencies? The easiest way is to start with the frequency f_0 and go in increments, checking each one (the same thing happens when you turn the knob on a radio). But if we can choose which frequencies to use, then we have a very tempting option – frequencies from 0 to N-1. Substituting them into the previous formula, we get a sequence of N results:

  \overline{a}_k = \sum_{n=0}^{N-1}{a_n e^{i2\pi kn/N}}, \quad k = 0,...,N-1.
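This formula translates into code almost verbatim. Below is a naive O(N^2) sketch in Python with NumPy; note that NumPy's np.fft.fft uses the opposite sign in the exponent, so this version coincides with np.fft.ifft up to a factor of N:

    import numpy as np

    def dft(a):
        """Direct transcription of the sum above: O(N^2) multiplications."""
        N = len(a)
        n = np.arange(N)
        return np.array([np.sum(a * np.exp(2j * np.pi * k * n / N)) for k in range(N)])

    a = np.random.rand(8)
    print(np.allclose(dft(a), np.fft.ifft(a) * len(a)))   # True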

If you look at the Wikipedia article "Discrete Fourier Transform", you will see a practically identical formula. The main advantage of choosing these particular frequencies is computational efficiency. Because the function e^{i2\pi kn/N} takes only N distinct values, common factors can be pulled out, grouped, and combined with various programming tricks, yielding an algorithm called the "Fast Fourier Transform" (FFT). This algorithm computes N results from N points of the original signal in about N\log{N} multiplications. In practical terms, this means that a signal of a million points can be transformed, obtaining the phases and amplitudes of a million frequencies, in about 20 million multiplications (that is, in a fraction of a second on modern computing devices). It is precisely this speed of transformation that has led to the widespread use of the algorithm in our everyday life.
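A rough timing sketch of that claim (Python with NumPy; the exact time depends on your machine):

    import time
    import numpy as np

    x = np.random.rand(2**20)            # about a million sample points
    t0 = time.perf_counter()
    X = np.fft.fft(x)                    # N log N instead of N^2
    print(time.perf_counter() - t0)      # typically well under a second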

An interesting feature of this transform is that f = 0 is not, strictly speaking, a frequency. When k = 0 the exponential becomes one, which gives us simply the sum of all the original values; when k = N/2 we similarly obtain a sum with alternating signs. These two values have no imaginary part (which can be used to optimize storage), and they also provide a quick correctness check for an implementation.
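Both checks are easy to write down (Python with NumPy; the sign convention of np.fft.fft does not affect these two bins):

    import numpy as np

    a = np.random.rand(8)
    X = np.fft.fft(a)

    print(np.isclose(X[0], a.sum()))                        # k = 0: the plain sum
    signs = np.array([1, -1] * (len(a) // 2))
    print(np.isclose(X[len(a) // 2], (a * signs).sum()))    # k = N/2: alternating-sign sum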

We've covered what the Fast Fourier Transform is, but what about multiplying large numbers? First, we need to define the problem. Remember that we never operate with numbers themselves, only with their representations. When we talk about the number 3196, we understand that "3196" is just the representation of this number in the decimal number system. C7C_{16} and 110001111100_2 are representations of the same number in hexadecimal and binary notation. These representations are no better or worse than one another; they describe the same number and can be used wherever convenient. And they have one thing in common – they use positional notation, that is, a digit's value depends on its position. We implicitly assume that each digit is multiplied by the corresponding power of the base:
3196 = 3\cdot10^3 + 1\cdot10^2 + 9\cdot10^1 + 6\cdot10^0.
In other words, the digits are the coefficients of a polynomial
A(x) = 3x^3 + 1x^2 + 9x + 6,
and the number itself is equal to the value of this polynomial at a point equal to the base: 3196 = A(10).
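In code, evaluating this representation is just Horner's scheme (a small Python sketch; the function name is mine):

    def poly_value(coeffs, base):
        """Evaluate a polynomial given by its coefficients, highest power first."""
        value = 0
        for c in coeffs:          # Horner's scheme: ((3*10 + 1)*10 + 9)*10 + 6
            value = value * base + c
        return value

    print(poly_value([3, 1, 9, 6], 10))   # 3196 = A(10)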

Thus, the problem of multiplying two numbers is reduced to the problem of multiplying two polynomials:
P(x) = A(x) \times B(x)
The result of the multiplication will be the value P(10). The easiest way to find the polynomial P(x) is taught in elementary school and is called "column multiplication": each digit of one number is multiplied by each digit of the other, and the result is placed in the position equal to the sum of the positions of the two digits. For example, let's calculate the product of the polynomials corresponding to 3196 and 32:
P(x) = A(x) \times B(x) = (3x^3 + 1x^2 + 9x + 6)(3x + 2) = 9x^4 + 9x^3 + 29x^2 + 36x + 12
The result of the multiplication is P(10) = 102,272. Note that we can apply the carry operation to the resulting polynomial (write 2, carry 1 to the next digit, 36 + 1 = 37, write 7, carry 3 to the next digit, etc.). The result will be the polynomial
P'(x) = 1x^5 + 0x^4 + 2x^3 + 2x^2 + 7x + 2.
The polynomials P(x) and P'(x) are completely different (even their degrees differ), but they have the same value at the point 10: P(10) = P'(10) = 102,272. The carry operation is not required for a correct result; you can apply it or not depending on your goals, and it runs very quickly, so it barely affects the overall execution time of the program. Column multiplication, on the contrary, is a very slow operation: its execution time is proportional to the square of the size of the numbers.
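Both steps fit in a few lines. Below is a sketch in Python following the 3196 × 32 example (digit lists and helper names are my own):

    def multiply_columns(a, b):
        """Schoolbook multiplication of digit lists, most significant digit first."""
        p = [0] * (len(a) + len(b) - 1)
        for i, da in enumerate(a):
            for j, db in enumerate(b):
                p[i + j] += da * db        # position = sum of the digit positions
        return p                           # coefficients of P(x), carries not applied

    def apply_carries(p, base=10):
        """Turn polynomial coefficients into proper digits of P(base)."""
        digits, c = [], 0
        for coeff in reversed(p):          # from the lowest digit upward
            c, d = divmod(coeff + c, base)
            digits.append(d)
        while c:
            c, d = divmod(c, base)
            digits.append(d)
        return digits[::-1]

    p = multiply_columns([3, 1, 9, 6], [3, 2])
    print(p)                   # [9, 9, 29, 36, 12]  -- P(x) before carries
    print(apply_carries(p))    # [1, 0, 2, 2, 7, 2]  -- P'(x), i.e. 102272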

There is a story that at a seminar in 1960 Andrey Kolmogorov suggested that it is impossible to multiply two numbers faster than in time proportional to the square of their length. In the audience sat 23-year-old Anatoly Karatsuba, who on his way home came up with a faster algorithm, now known as the Karatsuba algorithm. This broke the "square wall" in people's minds, and mathematicians began to find ever faster algorithms. Eventually, in 2019, the Harvey–van der Hoeven algorithm appeared, with theoretical complexity N\log{N}, which is considered optimal and unimprovable. But this algorithm is extremely complex, and it becomes faster than all others only for numbers far larger than anything that would fit in the Universe. Therefore, in practice, other algorithms are used. Let's consider one of them.

This algorithm is based on the fact that the value of the product polynomial at an arbitrary point is equal to the product of the values of the original polynomials at that point. That is, for any r:
P(r) = A(r) \cdot B(r)
Thus, even without knowing the polynomial P(x), we can find its value at any point. But how do we then recover the coefficients of P(x)? Here Bezout's theorem helps us: it states that the value of a polynomial at a point r is the remainder of dividing that polynomial by x - r:
P(r) = P(x) \bmod (x - r)
In other words, if we have N values of a polynomial at different points, \{P(r_i)\}, we can view them as a set of polynomials of degree zero – the remainders of dividing P(x) by the different moduli x - r_i. Using the Chinese Remainder Theorem, which also works for polynomials, we can construct the remainder of division by the product of all the moduli:

P_N(x) = P(x) \bmod \prod_{i=0}^{N-1}{(x - r_i)}.

The degree of the polynomial constructed this way is at most N-1, so by taking enough points (more than the sum of the degrees of the original polynomials) we get P_N(x) = P(x).

For example, let's take the same polynomials A(x) = 3x^3 + 1x^2 + 9x + 6 and B(x) = 3x + 2. Let's calculate their values at the points 1, 2, 3, 4, 5 and multiply them to get the values of the polynomial P(x) at these points. The product of all the moduli is
(x - 1)(x - 2)(x - 3)(x - 4)(x - 5) = x^5 - 15x^4 + 85x^3 - 225x^2 + 274x - 120.
The Chinese Remainder Theorem allows us to find

P_5(x) = P(x) \bmod (x^5 - 15x^4 + 85x^3 - 225x^2 + 274x - 120) = 9x^4 + 9x^3 + 29x^2 + 36x + 12.

Note that this polynomial is the same as the one obtained by column multiplication.

Calculations:

    Points:   1     2      3      4      5
    A(x):    19    52    123    250    451
    B(x):     5     8     11     14     17
    P(x):    95   416   1353   3500   7667

Degree 2:

    416(-1)(x - 3) + 1353(x - 2) = 937x - 1458  mod (x^2 - 5x + 6)
    3500(-1)(x - 5) + 7667(x - 4) = 4167x - 13168  mod (x^2 - 9x + 20)

Degree 4:

    ((937x - 1458)(1/3x - 6/12) mod (x^2 - 5x + 6))(x^2 - 9x + 20) + ((4167x - 13168)(-1/3x + 22/12) mod (x^2 - 9x + 20))(x^2 - 5x + 6)
      = (7286/12x - 13740/12)(x^2 - 9x + 20) + (-5666/12x + 43664/12)(x^2 - 5x + 6)
      = 135x^3 - 610x^2 + 1422x - 1068  mod (x^4 - 14x^3 + 71x^2 - 154x + 120)

Degree 5:

    95(1/24)(x^4 - 14x^3 + 71x^2 - 154x + 120) + ((135x^3 - 610x^2 + 1422x - 1068)(-1/24x^3 + 13/24x^2 - 58/24x + 96/24) mod (x^4 - 14x^3 + 71x^2 - 154x + 120))(x - 1)
      = 95(1/24)(x^4 - 14x^3 + 71x^2 - 154x + 120) + (121/24x^3 + 1667/24x^2 - 4382/24x + 11112/24)(x - 1)
      = 9x^4 + 9x^3 + 29x^2 + 36x + 12  mod (x^5 - 15x^4 + 85x^3 - 225x^2 + 274x - 120)
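The whole evaluate, multiply, interpolate pipeline can also be checked numerically. A sketch in Python with NumPy, where np.polyfit stands in for the Chinese-Remainder reconstruction:

    import numpy as np

    A = [3, 1, 9, 6]                      # 3x^3 + x^2 + 9x + 6
    B = [3, 2]                            # 3x + 2
    points = np.array([1, 2, 3, 4, 5])    # deg P = 4, so five points suffice

    values = np.polyval(A, points) * np.polyval(B, points)   # [95, 416, 1353, 3500, 7667]
    P = np.polyfit(points, values, deg=4)                    # interpolate the coefficients
    print(np.round(P).astype(int))                           # [9, 9, 29, 36, 12]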

The computational complexity of this algorithm can be estimated as N\log^2{N}. The logarithm is squared because the algorithm is recursive – in the course of applying the Chinese remainder theorem we have to multiply polynomials of lower degree. But there is one interesting possibility here: we are free to choose the points at which we evaluate the polynomials. And if we could choose the points so that all the intermediate moduli had the form x^k - a, this would greatly speed up the algorithm, because multiplying by a polynomial of this form is fast and simple. How to achieve this? Very simply: recall the formula from school algebra, x^2 - a^2 = (x - a)(x + a). Let us take the simplest polynomial, x^N - 1, as the product of all the moduli, and expand it using this formula until we get polynomials of the first degree. At the first step it splits into x^{N/2} - 1 and x^{N/2} + 1. The first of these splits again into x^{N/4} - 1 and x^{N/4} + 1, and the second into x^{N/4} - i and x^{N/4} + i. Then we get x^{N/8} - 1, x^{N/8} + 1, x^{N/8} - i, x^{N/8} + i, x^{N/8} - \sqrt{i}, x^{N/8} + \sqrt{i}, x^{N/8} - i\sqrt{i}, x^{N/8} + i\sqrt{i}. And so on. As follows from the fundamental theorem of algebra, the process ends with the N roots of the polynomial x^N - 1. These roots are known as the roots of unity; they are uniformly distributed on the unit circle in the complex plane.

Now is the time to recall the function e^{ix}. With its help we can easily write down the formula for the roots of unity:
r_k = e^{i2\pi k/N}
Thus, we can now formally define what it means to calculate the value of a polynomial A(x) at points r_k:

A(r_k) = \sum_{n=0}^{N-1}{a_n r_k^n} = \sum_{n=0}^{N-1}{a_n e^{i2\pi kn/N}}, \quad k = 0,...,N-1.

Compare this with the discrete Fourier transform formula derived above. Yes, we have arrived at exactly the same thing – not a similar transform, but an absolutely identical one. If you looked at a software implementation without context, you could not tell what problem the programmer was solving – digital signal processing or multiplication of large numbers. The code is identical. Of course, this result is not accidental; if you dig deeper into the mathematics, you can find connections between multiplication and signals, such as the convolution theorem. But you shouldn't do this, and here's why.
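To make the "identical code" claim concrete, here is a minimal floating-point sketch of multiplication via the FFT (Python with NumPy assumed; production implementations use exact number-theoretic transforms or careful rounding-error analysis rather than plain floats):

    import numpy as np

    def multiply(a, b, base=10):
        """Multiply two numbers given as digit lists, least significant digit first."""
        N = 1
        while N < len(a) + len(b):
            N *= 2                                   # pad so the cyclic wrap-around is harmless
        product = np.fft.ifft(np.fft.fft(a, N) * np.fft.fft(b, N))
        coeffs = np.rint(product.real).astype(np.int64)
        digits, c = [], 0                            # the familiar carry pass
        for coeff in coeffs:
            c, d = divmod(int(coeff) + c, base)
            digits.append(d)
        while c:
            c, d = divmod(c, base)
            digits.append(d)
        while len(digits) > 1 and digits[-1] == 0:
            digits.pop()
        return digits

    print(multiply([6, 9, 1, 3], [2, 3]))   # 3196 * 32 -> [2, 7, 2, 2, 0, 1], i.e. 102272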

The point is that the discrete Fourier transform coincides with fast multiplication only for the modulus polynomial x^N - 1. But this is not the only polynomial used in practice. All numbers in the list of the largest known Proth primes were found using the polynomial x^N + 1. The roots of this polynomial are also uniformly distributed on the unit circle, but with a shift.

When using x^N + 1, the multiplication is still fast, and the algorithm is almost the same – the changes are minimal. But it is no longer a Fourier transform. In this transform there is no zero frequency and no points 0 and N/2 with vanishing imaginary part: all the values (for even N) are complex numbers. The closest relatives of this algorithm are the z-transform and the chirp algorithm. The algorithm is also called "negacyclic convolution," because reduction modulo x^N + 1 amounts to subtracting the coefficients of powers N and higher from the lower coefficients. But in general the algorithm has no specific name of its own; moreover, its practical implementations often continue to be called the fast Fourier transform, although this is questionable from a mathematical point of view.
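For completeness, a sketch of this negacyclic variant via a weighted FFT (Python with NumPy; weighting by 2N-th roots of unity is a standard trick and produces exactly the shifted roots described above):

    import numpy as np

    def negacyclic_convolution(a, b):
        """Multiply polynomials modulo x^N + 1 (a and b both of length N)."""
        N = len(a)
        w = np.exp(1j * np.pi * np.arange(N) / N)    # 2N-th roots of unity as weights
        c = np.fft.ifft(np.fft.fft(a * w) * np.fft.fft(b * w)) / w
        return np.rint(c.real).astype(np.int64)

    a = np.array([1.0, 2.0, 0.0, 0.0])               # 1 + 2x
    b = np.array([3.0, 4.0, 0.0, 0.0])               # 3 + 4x
    print(negacyclic_convolution(a, b))              # [3, 10, 8, 0]: (1+2x)(3+4x) mod x^4+1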

But fast multiplication is not limited even to this variety of options. The program Cyclo uses the roots of the cyclotomic polynomial x^2 - x + 1 to efficiently search for prime numbers of the form b^{2^n} - b^{2^{n-1}} + 1. At the moment, the 7th and 8th largest known prime numbers have been found using this program. It is already extremely difficult to call such algorithms a fast Fourier transform.

What do we have in the end? There are very efficient algorithms for multiplying large numbers that are either the same as or very similar to the fast Fourier transform. However, they have a completely different, algebraic nature. Understanding this allows us to create new algorithms and adapt old ones to solve other, more interesting problems.

If you want to know more about multiplication algorithms, you can read my treatise “Multiplication of large numbers. Exponentiation and factorization” (third chapter in progress).
