Stribog hash function. Features of a hardware implementation in SystemVerilog
Summation modulo 2^512
In the figure it is denoted smod2(a,b). This function adds the 512-bit input a to the 512-bit input b and writes the result to the 512-bit output s. Any overflow out of the most significant bit is discarded.
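As a rough illustration, the adder described above can be sketched as a purely combinational module. The module name and port names are hypothetical and are not taken from the author's sources.

```systemverilog
// Hypothetical combinational 512-bit adder; all names are illustrative.
module smod2_sketch (
  input  logic [511:0] a,
  input  logic [511:0] b,
  output logic [511:0] s
);
  // The result is 512 bits wide, so the carry out of bit 511 is dropped.
  // This is exactly addition modulo 2^512.
  assign s = a + b;
endmodule
```

A real design would probably register the output or split the carry chain into stages, since a 512-bit ripple of carries is long for a single clock cycle.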
G function
In the figure it is denoted g(m,h,n).
m is the input data for which the hash is calculated, split into 512-bit blocks. If the last block is shorter than 512 bits, it is padded to the full 512 bits: a one is placed in the first bit after the data and the remaining bits are filled with zeros;
h is the output of the function from the previous iteration; on the first iteration the initialization value V0 = '{64{8'h00}} or V0 = '{64{8'h01}} is supplied;
n – the number of bits of the original message in the 512-bit block. For a full block the value is 512.
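The padding rule for the last block can be sketched as a small function. This is only a sketch under an assumption about bit ordering: the n valid message bits are taken to occupy data[n-1:0], so the marker bit lands at position n. The package and function names are hypothetical.

```systemverilog
package pad_sketch_pkg;
  // Hypothetical helper: pads a partial block to 512 bits.
  // Assumption: the n valid message bits occupy data[n-1:0].
  function automatic logic [511:0] pad_block(
    input logic [511:0] data,
    input logic [9:0]   n        // number of valid bits, 0..512
  );
    logic [511:0] mask;
    if (n >= 10'd512) return data;            // full block: nothing to pad
    mask = (512'd1 << n) - 512'd1;            // ones in bits [n-1:0]
    return (data & mask) | (512'd1 << n);     // zero the rest, set the marker bit
  endfunction
endpackage
```

With n = 0 (an empty last block) the function returns a block whose only set bit is bit 0.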
If the source array is larger than 512 bits, then the calculations consist of three stages:
If the source array is less than or equal to 512 bits, then the calculation consists of two stages:
The Stribog hash function has two variants, with a result 256 or 512 bits long. For the 512-bit variant the initialization stage uses V0 = '{64{8'h00}}. For the 256-bit variant it uses V0 = '{64{8'h01}}, and the most significant 256 bits of the output of the last g function are taken as the result.
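The choice between the two variants can be sketched as follows. The names hash256_i, g_last_i, v0_o and hash_o are hypothetical, V0 is written here as a packed 512-bit vector rather than an unpacked byte array, and zero-extending the 256-bit result to a 512-bit port is a choice made only for this illustration.

```systemverilog
// Hypothetical sketch of variant selection; all names are illustrative.
module iv_select_sketch (
  input  logic         hash256_i,  // 1 selects the 256-bit variant (assumed setting)
  input  logic [511:0] g_last_i,   // output of the last g function
  output logic [511:0] v0_o,       // initialization value
  output logic [511:0] hash_o      // final hash value
);
  // 64 bytes of 0x01 for the 256-bit variant, 64 bytes of 0x00 for the 512-bit one.
  assign v0_o = hash256_i ? {64{8'h01}} : {64{8'h00}};

  // The 256-bit variant keeps only the most significant half of the last g output;
  // it is zero-extended here so that a single 512-bit output port can be used.
  assign hash_o = hash256_i ? {256'd0, g_last_i[511:256]} : g_last_i;
endmodule
```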
Let's consider the internal structure of the g function. The block diagram is presented in Figure 2. The function consists of LPSX functions and exclusive OR operations. In addition, it uses an array of iteration constants, which supply the input argument b of the LPSX function at iterations 1 to 12. The values of the constants are given in GOST R 34.11-2012.
LPSX function
The block diagram is presented in Figure 3. The LPSX function transforms and rearranges bytes. It can be implemented as the sequence of functions X, S, P and L, which you can read about in detail here or here. The S and L transformations can be computed in advance to form eight arrays of 256 eight-byte numbers that contain all possible values of these two transformations. In addition, when the hash is calculated from the precalculated values, the permutation required by the P transformation can be applied at the same time, simply by choosing which byte indexes which array. Thus the successive LPSX transformations can be replaced by a single PRECALC transformation that uses the array of precalculated values.
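Two of the four steps need no tables and can be sketched directly: X is a bitwise exclusive OR of the two arguments, and P transposes the 64 bytes viewed as an 8x8 matrix. The sketch below assumes that byte i occupies bits [8*i +: 8] of the 512-bit vector; the package and function names are hypothetical. S and L are left out because they depend on the substitution table and the matrix given in the standard.

```systemverilog
package lpsx_sketch_pkg;
  // X: bitwise XOR of the two 512-bit arguments.
  function automatic logic [511:0] x_transform(
    input logic [511:0] a,
    input logic [511:0] b
  );
    return a ^ b;
  endfunction

  // P: the 64 bytes are treated as an 8x8 matrix and transposed.
  // Output byte i is input byte 8*(i mod 8) + (i div 8).
  // Assumption: byte i occupies bits [8*i +: 8].
  function automatic logic [511:0] p_transform(input logic [511:0] a);
    logic [511:0] r;
    for (int i = 0; i < 64; i++)
      r[8*i +: 8] = a[8*((i % 8) * 8 + i / 8) +: 8];
    return r;
  endfunction
endpackage
```

Because P is only a fixed rewiring of bytes, it costs no logic in hardware, which is why it can be folded into the table lookup of the PRECALC variant.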
Hardware implementation
Parallelization of calculations
Unlike a software implementation, a hardware implementation in an FPGA or ASIC allows the calculations to be parallelized. At the top level, in each iteration the two summation functions modulo 2^512 and the g function are computed in parallel and independently of each other. If you look at the block diagram of the g function, you will notice that the LPSX functions fed by input n and the array of constants C are computed independently of the LPSX functions fed by input m (see Figure 2). Therefore, when calculating the g function, you can create two instances of the LPSX function that run in parallel and perform 12 iterations until the output value of the g function is obtained. This implementation is shown schematically in Figure 4.
Description of interfaces
For each function I will provide an interface for implementation in the hardware description language Verilog or VHDL, as well as a signal diagram.
LPSX function
It is a registered computation pipeline. The input is data and a validity signal. The output is the transformed data and a validity signal.
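A single stage of such a pipeline can be sketched as follows. Only the X step is shown as the payload. The port names, the synchronous active-high reset and the single-stage depth are assumptions made for the illustration, not the author's actual interface.

```systemverilog
// Hypothetical one-stage registered pipeline with a validity signal.
module lpsx_if_sketch (
  input  logic         clk_i,
  input  logic         rst_i,    // synchronous, active high (assumption)
  input  logic [511:0] a_i,      // first argument
  input  logic [511:0] b_i,      // second argument
  input  logic         valid_i,  // input data is valid in this cycle
  output logic [511:0] data_o,   // transformed data
  output logic         valid_o   // output data is valid in this cycle
);
  always_ff @(posedge clk_i) begin
    // The validity flag travels alongside the data, one register per stage.
    if (rst_i) valid_o <= 1'b0;
    else       valid_o <= valid_i;

    // Only the X step is shown here; S, P and L would occupy further stages.
    if (valid_i) data_o <= a_i ^ b_i;
  end
endmodule
```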
G function
It is a registered computation pipeline. The input is data and a validity signal. The output is the transformed data and a validity signal.
Summation modulo 2^512
It is a registered computation pipeline. The input is data and a validity signal. Since the input number is added to the result of the previous calculation, it is more convenient to keep a single data input and add a clear_i signal that clears the module.
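The accumulator arrangement described here can be sketched as follows. Apart from clear_i, which the text names, the port names are hypothetical, and giving clear_i priority over valid_i is an assumption.

```systemverilog
// Hypothetical accumulating adder with a single data input and a clear signal.
module smod2_acc_sketch (
  input  logic         clk_i,
  input  logic         clear_i,  // clears the running sum
  input  logic [511:0] data_i,   // value to add
  input  logic         valid_i,  // data_i is valid in this cycle
  output logic [511:0] sum_o     // running sum
);
  always_ff @(posedge clk_i) begin
    if (clear_i)      sum_o <= '0;               // assumed to take priority over valid_i
    else if (valid_i) sum_o <= sum_o + data_i;   // wraps modulo 2^512
  end
endmodule
```

Asserting clear_i for one cycle before a new message plays the role of a second operand equal to zero, so no separate input for the previous result is needed.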
Top level module
It is a registered computation pipeline. The module has 5 groups of signals:
system – clock and reset
settings – hash length selection
input data – data with a valid/ready handshake, as well as the last-block flag and the number of significant bits in the last block
control – a request that starts the module's state machines and a confirmation issued after the hash has been received
output – data with valid/ready handshake
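The five groups could map onto a port list roughly as follows. Every name, width and direction here is hypothetical. In particular, the direction of the confirmation signal is an assumption: it is taken to be driven by the consumer once it has read the hash. The body is deliberately left empty, since only the interface is illustrated.

```systemverilog
// Hypothetical top-level port list; names, widths and directions are assumptions.
module stribog_top_sketch (
  // system
  input  logic         clk_i,
  input  logic         rst_i,
  // settings
  input  logic         hash256_i,     // 1 selects the 256-bit hash
  // input data
  input  logic [511:0] data_i,
  input  logic         data_valid_i,
  output logic         data_ready_o,
  input  logic         last_i,        // marks the last block of the message
  input  logic [9:0]   last_bits_i,   // significant bits in the last block
  // control
  input  logic         start_i,       // request to start the state machines
  input  logic         ack_i,         // confirmation after the hash is received
  // output
  output logic [511:0] hash_o,
  output logic         hash_valid_o,
  input  logic         hash_ready_i
);
  // Body omitted: this sketch only illustrates the grouping of the signals.
endmodule
```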
Conclusion. Description of sources
The source files of the Stribog hash function in SystemVerilog are in my git repository. There is also a testbench with the examples from the GOST standard. To start the simulation in QuestaSim, run one of the following:
in console mode for the implementation without precalculations,
make run_questa_console
in console mode for the implementation with precalculations,
make precalc=1 run_questa_console
in graphical mode for the implementation without precalculations,
make run_questa_gui
in graphical mode for the implementation with precalculations,
make precalc=1 run_questa_gui