Analysis of bit block information by the number of zeros and ones in the block

Among the methods of information analysis, this article presents an analysis of the distribution of information density in a bit data block. This method can be a guideline in developing information compression methods, since it provides estimates of how the information density is distributed depending on the composition of the block, which is determined by the number of zeros and ones that form the bit data block.

An input stream with a length of CT bits is given. Next, the number of zeros and ones in the block is determined, then the number of combinations that a given set of zeros and ones specifies for a given block length can be determined using the formula for calculating permutations:

N=CT!/(K0!·K1!) (1)

Where

N is the number of permutations that determine the number of combinations;

CT — block length in bits, CT=K0+K1;

K1 — the number of units in the block;

K0 is the number of zeros in the block.

This formula determines the number of permutations that produce a given number of zeros and ones in a bit block.

For example, for 16 bits of block length, the combinations are represented by the table:

Table 1 — 16-bit block information

Table 1 — 16-bit block information

Where

CT – number of bits in a block;

K0,K1 – the number of zeros and ones in the block;

N is the number of combinations that specify zeros and ones in a block;

Log2(N) – binary logarithm showing the number of permutations in bit form (data in the column are rounded);

I% is the percentage of the full 2^CT range that the permutations occupy (data in the column are rounded).

For 32 bits of block length, the combination estimate in Table 2 is:

Table 2 - 32-bit block information

Table 2 – 32-bit block information

Where

CT – number of bits in a block;

K0,K1 – the number of zeros and ones in the block;

N is the number of combinations that specify zeros and ones in a block;

Log2(N) – binary logarithm showing the number of permutations in bit form (data in the column are rounded);

I% is the percentage of the full 2^CT range that the permutations occupy (data in the column are rounded).

Figure 1 - Graph of the distribution of the number of combinations in a 32-bit block depending on the number of ones in the 32-bit block.

Figure 1 – Graph of the distribution of the number of combinations in a 32-bit block depending on the number of ones in the 32-bit block.

Bit block data analysis and permutations.

A block of length n bits contains 2^n combinations, this article describes how in a block of information of length n bits the information density is distributed depending on the number of zeros and ones that form this block. Formula (1) shows how to estimate the number of combinations that are formed with a given block length CT and the number of constituent zeros and ones K0, K1. It is shown that the maximum number of combinations and therefore the information density will be at K0=K1. And unlike Shannon's approach, where the amount of information is characterized by the probability of a symbol appearing, then in this method it is possible to accurately estimate the amount of information for a given input block of bits.

And we can conclude that for a bit block of length CT bits the most dense distribution of information is at K0=K1 for K0+K1=CT.

It can also be concluded that the number of combinations N for a given K0,K1 is less than 2^CT. This means that if there were no need to store K0,K1 as source data, it would be possible to obtain compression:

sz=(CT!/(K0!·K1!)) / (2^CT) (2)

And if in full form, then for compression you can save the information

K0,K1,NPR;

where K0 is the number of zeros in the block,

K1-number of units in a block,

NPR is the number of permutation of zeros and ones for given K0,K1.

Function to convert a bit string to a permutation number

NPR=PR(CT,BSTR);

Where

CT – length of the BSTR bit string,

BSTR – bit string,

NPR is the number of permutations.

Function to convert permutation number to bit string

BSTR=BS(K0,K1,NPR);

Where

K0 is the number of zeros in the block,

K1-number of units in a block,

NPR – number of permutations,

BSTR is a bit string.

NPR will range from 0 to CT!/(K0! K1!) which can provide compression in some cases.

And it is also possible to set a certain block length in the encoder and decoder L, then, for example, it is possible to transmit only K1, and obtain K0 using the formula K0=L-K1.

Conclusion

Using these examples and techniques, compression can be built in some situations for a bit block. And this approach can be attributed to statistical compression methods. But at the same time, dictionary methods are also statistical, but there, frequently repeated sections are stored in additional memory.

Similar Posts

Leave a Reply

Your email address will not be published. Required fields are marked *