Algorithm for dividing 2W-bit numbers using operations with numbers of bit depth W
Using 32-bit integers as an example, we consider a scalable division algorithm that uses numbers with half the width (16 bits). To illustrate the performance of the algorithm, the code of a test application in C++ is given.
Description of the algorithm
Let's imagine some -bit number as:
Where , – senior and junior parts -bit components of the number, respectively.
Then division of two numbers, And can be written as a fraction:
Note that if -bit number. Let's limit ourselves to this case. For in the example below, an algorithm for dividing a “wide” number by a “narrow” one is implemented in C++, based on the representation of the dividend as a product of the quotient and divisor plus the remainder term :
We further assume that , otherwise, the result of division is zero. Let's imagine a number as:
Here – integer part from division, and – remainder of the division on .
Let's rewrite the fraction:
Let's neglect the term to get a lower bound for the division result plus further simplify the formula:
Let's select the term as the main component:
Let's make a change of variables The point is that “severe” cases correspond to the parameter large enough, so the entered delta parameter will be small and the formula will quickly lead to the result:
The numerator and denominator of the fraction contain “wide” numbers (with digits ). Since it is possible to use the algorithm for dividing a wide number by a narrow one, we will achieve the “narrowness” of the number in the denominator by removing the multiplier outside the brackets and performing the division sequentially:
Thus, the “broad” numerator sequentially divide by “narrow” denominators. The first denominator can sometimes be equal to the multiplier , which needs to be monitored in the algorithm: in the CPU registers the number will actually be reset to zero and an exception will occur. Alternatively, one can subtract one in advance and not worry about boundary conditions, since for this algorithm which will ultimately give the final version:
A numerical experiment showed that one iteration described above is sufficient. Physically, this is explained by the fact that the algorithm is designed specifically for
conclusions
An algorithm for dividing numbers consisting of high and low halves has been proposed and tested, scaling to an arbitrary bit depth in multiples . This algorithm, in a sense, is a variant of “smart” long division: first, the first approximation is calculated, equal to the division of the higher halves of the number, and then the second, equal to the corrected first. The corrector is equal to the sequential division of a certain wide number into two narrow ones.
The proposed algorithm easily scales to the 128-bit version using built-in 64-bit arithmetic. However, the option with scaling, for example, to 256-bits, requires the implementation of a full multiplication in the U128 structure, which can be done by scaling the implemented “half” multiplication operator: U32 by u16. You will also need to implement the bit shift operator. Ultimately, when all the necessary operators are implemented, it is possible to implement a template structure with an arbitrary bit depth recursively (by dividing in half) into the arithmetic of 64-bit built-in numbers.