explanation from ML and AI expert Petr Yemelyanov

Blockchain is credited with three properties: immutability, distribution and consensus. Let's take a closer look.

Immutability

You can think of a blockchain as a linked list that is very hard to modify. That alone is not very interesting, so in this article we will look at blockchain through its main application, cryptocurrency, using Bitcoin as the most popular and best-known example on the market.

Distribution

The Bitcoin blockchain is cryptographically protected, immutable, and also distributed: in the Bitcoin blockchain (and not only there) there is no single center of consolidation. The network is peer-to-peer, and all participants are in approximately the same position and have the same rights.

But distribution is not actually a mandatory property of a blockchain. There are blockchains built with a trust center, a single point of data consolidation, which is responsible for adding new blocks.

Consensus algorithm

Bitcoin has a clever consensus algorithm that ensures trust across this entire peer-to-peer network. It is thanks to this algorithm that each participant has more or less equal rights. If you have worked with other distributed systems, for example Kafka, Redis, or MongoDB clusters, then you know that consensus is a hard problem, the kind that gets marked with an asterisk. And this is true even when the nodes of a distributed system do not try to deceive each other and trust is assumed by default.

There are many consensus algorithms. In blockchain, consensus is even harder, because it is assumed that participants can lie to each other and deviate from the protocol in order to gain an economic advantage. After all, bitcoin is money, so in this system cheating really can pay off.

Now let's talk about what gives the Bitcoin blockchain its three key properties.

Hashing and what does the chain have to do with it

The first thing worth mentioning is hashing. It is what ensures the immutability of the blockchain. In general, hashing is used everywhere: in cryptography, in algorithms (as the basis of hash tables, which people like to ask about in interviews), and some rather naive people use it to anonymize personal data (in reality, this almost never works). In blockchain, hashing is the mechanism that tightly links blocks, and it also serves as the computational basis for the Proof of Work consensus mechanism, but more on that later. First, definitions.

A hash function is a function that takes data of any size, shuffles it, and “cooks” it into data of a fixed size, potentially reducing entropy.

Entropy is, in simple terms, a measure of variability. For example, a 16-bit number can take 65,536 (2^16) different values, while an 8-bit number can only take 256 (2^8). It turns out that any function that turns a 16-bit number into an 8-bit one compresses the space: on average, 256 different 16-bit arguments will correspond to each 8-bit value. These are collisions caused by a decrease in entropy.

As entropy decreases, collisions can occur. A collision in hashing is a situation in which different arguments of a function produce the same result. If hash(Petya) = 42 and hash(Masha) = 42, then it is impossible to distinguish Masha from Petya by hash. In the case of cryptocurrencies, such “indifference” can collapse the entire structure: people, as a rule, want to distinguish their money from other people’s.
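To make the 16-bit to 8-bit example concrete, here is a minimal sketch. The function `fold16to8` is a hypothetical toy (it XORs the high and low bytes), not a cryptographic hash and not anything Bitcoin uses; it only illustrates how shrinking the output space forces collisions.

```python
from collections import Counter


def fold16to8(x: int) -> int:
    """Toy hash (illustrative only): XOR the high byte with the low byte."""
    return (x ^ (x >> 8)) & 0xFF


# Hash every possible 16-bit input and count how often each 8-bit output appears.
counts = Counter(fold16to8(x) for x in range(2**16))

print(len(counts))           # number of distinct outputs
print(set(counts.values()))  # how many inputs share each output

# Two different inputs that land on the same output: a collision.
petya, masha = 0x1234, 0x3412
print(fold16to8(petya) == fold16to8(masha))
```

Every one of the 256 outputs is shared by exactly 256 inputs here, so knowing the output tells you very little about which input produced it.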

The better and more robust the cryptographic hash function, the lower the probability of collisions and the harder it is to reverse. Given only a hash, it is difficult (in fact, very difficult) to unwrap it and see what went in. Another important property of a good hash function is sensitivity: the slightest change in the argument changes the result beyond recognition. This is the “butterfly effect”.

Bitcoin uses a good, cryptographically strong hash function called SHA256. The 256 in its name hints that the function's output is a 256-bit number. These are very, very large numbers, and the SHA function itself is a rather tricky sequence of shuffling and arithmetic, so the probability of a collision with SHA256 is extremely low. Also, SHA256 values behave like pseudo-random numbers from a uniform distribution. That is, even though the output may be shorter than the input, the result is still quite high-entropy.
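The “butterfly effect” is easy to observe with Python's standard `hashlib` module. This is a small sketch; the two messages are made up for illustration, and the helper names `sha256_hex` and `differing_bits` are my own.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of a string as 64 hex characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def differing_bits(a_hex: str, b_hex: str) -> int:
    """Count the bit positions in which two hex digests differ."""
    return bin(int(a_hex, 16) ^ int(b_hex, 16)).count("1")


h1 = sha256_hex("Petya sends 1 BTC to Masha")
h2 = sha256_hex("Petya sends 2 BTC to Masha")

print(h1)
print(h2)
# For a good hash function, roughly half of the 256 bits should flip.
print(differing_bits(h1, h2), "of 256 bits differ")
```

A single changed character in the input produces a digest that looks unrelated to the first one.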

The SHA256 hash function hashes the block header. For now it is not so important what exactly this header consists of. What matters is that the header of each block includes the hash of the previous block's header, so the hash of a header depends both on the block itself and on everything that came before it. So the header of block #N contains the hash of the header of block #N-1, the header of block #N+1 contains the hash of the header of block #N, the header of block #N+2 contains the hash of the header of block #N+1, and so on.

This is how a linked list is implemented. This makes the task of changing information inside a block computationally difficult. If we change something inside one block, all subsequent blocks will fall apart because the hashes will not match. To change information in the middle of the chain, we will have to change all the other blocks, starting from the one we changed and ending with the end of this chain. All the magic is hidden in how the blocks are linked together by hashes.
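Here is a minimal sketch of such a hash-linked list. The header layout (a dict with `height`, `prev_hash` and `data_hash`, serialized as JSON) is an assumption made for illustration and is not Bitcoin's real header format; the point is only to show how a change in one block breaks every later link.

```python
import hashlib
import json

GENESIS_PREV = "0" * 64  # placeholder "previous hash" for the first block


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def header_hash(header: dict) -> str:
    """Hash a header deterministically (hypothetical JSON serialization)."""
    return sha256_hex(json.dumps(header, sort_keys=True).encode("utf-8"))


def build_chain(payloads):
    """Link blocks together: each header stores the previous header's hash."""
    chain, prev = [], GENESIS_PREV
    for height, payload in enumerate(payloads):
        header = {
            "height": height,
            "prev_hash": prev,
            "data_hash": sha256_hex(payload.encode("utf-8")),
        }
        chain.append({"header": header, "data": payload})
        prev = header_hash(header)
    return chain


def first_broken_link(chain):
    """Return the index of the first invalid block, or None if all are valid."""
    prev = GENESIS_PREV
    for i, block in enumerate(chain):
        header = block["header"]
        data_ok = header["data_hash"] == sha256_hex(block["data"].encode("utf-8"))
        if header["prev_hash"] != prev or not data_ok:
            return i
        prev = header_hash(header)
    return None


demo = build_chain(["a pays b", "b pays c", "c pays d"])
print(first_broken_link(demo))            # None: the chain is intact

demo[1]["data"] = "b pays mallory"        # tamper with block 1
print(first_broken_link(demo))            # 1: data no longer matches its header

# Even if the attacker also fixes block 1's header, block 2 still points
# at the old header hash, so the break simply moves one block forward.
demo[1]["header"]["data_hash"] = sha256_hex(b"b pays mallory")
print(first_broken_link(demo))            # 2
```

To get away with the change, the attacker would have to rebuild every header from the modified block to the tip of the chain.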

Of course, there is no point in hashing headers for their own sake: each block also carries information. Thanks to this, you can store anything inside the blocks and do so immutably, that is, it will not be possible to change it after it has been recorded in the blockchain. In the case of Bitcoin, which we are considering, transactions are stored in the block data. Transactions are grouped, put into a block and hashed, and from then on remain unchanged.

Digital signature

Now about authenticity. Time to bring out the digital signature.

A digital signature is a cryptographic primitive that is used not only in blockchain. For example, you can sign mortgage documents without going to the bank at all, through the Gosklyuch app.

A similar principle is used in blockchain. An electronic signature is based on asymmetric cryptography. Asymmetric cryptography is when there are two keys: one is private, the other is public.

The private key is called private because you don't give it to anyone or show it, but keep it as safe as possible. The public key, as the name suggests, can be shown to anyone, even distributed to everyone on the Internet. The magic is that there is a certain mathematical connection between these two keys.

Blockchain has an ace up its sleeve here: instead of classic encryption, it uses a hash calculation together with a private key. Now in more detail.

Classic encryption vs. electronic signature

In classical encryption algorithms it is assumed that information is encrypted with a public key and decrypted with a private one. For example, I gave you my public key, and if you want to send me a secret message, you encrypt it with the public key and, without fear for the contents of the message, send it over unprotected communication channels. After all, you know that only a person who has a private key can decrypt this message. The encrypted “abracadabra” passes through many devices on the Internet, eventually reaching the recipient, who, using his private key, decrypts it and reads the contents.

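An electronic signature works, conceptually, the other way around: we “encrypt” with a private key and “decrypt” with a public one. (This picture is literally true for RSA; other signature schemes use different math but keep the same roles for the two keys.) It is assumed that if something was decrypted with my public key, then I produced it, because only I have the corresponding private key.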

If there is some data that we want to sign, then hashing the data is a desirable but not mandatory step. After all, all calculations in cryptography, especially encryption in asymmetric schemes, are quite heavy and take a long time to calculate. The less data we encrypt, the better. Therefore, we can first calculate a hash from this data, the same SHA256, or use any other hash function.

We calculated the hash of the data, and then encrypted this hash with a private key. In the picture above, Bob wants to sign a document and send it to Alice so that Alice would be sure that it was he who signed the document. Therefore, Bob, using his private key, encrypted the hash of the message. And then attached the resulting cryptotext to the document itself. The result is a message consisting of the original document and the hash of this document, encrypted with Bob's private key. Then the message is sent over unprotected communication channels, bounces around switches, routers, and, ultimately, comes to Alice.

Having received the message, Alice proceeds to verify the electronic signature. To do this, she simultaneously calculates the SHA256 hash of the document and decrypts the cryptotext attached to it with Bob's public key, because she knows it. As a result, Alice receives the hash of the document and the decrypted information. If these two entities match, then Alice is sure that the document was signed by Bob, because no one else could have created such a cryptotext that, when decrypted, would give a result that matches the hash of the document itself attached to the message.
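The sign-then-verify flow described above can be sketched with textbook RSA using only Python's built-in `pow`. This is a toy under loud assumptions: the key is built from two small Mersenne primes, there is no padding, and the hash is simply reduced modulo the key, so it is completely insecure and is not what Bitcoin uses. It only mirrors the steps Bob and Alice perform.

```python
import hashlib

# Toy RSA key (illustrative only, far too small and structured to be secure).
P = 2**61 - 1                      # Mersenne prime
Q = 2**89 - 1                      # Mersenne prime
N = P * Q                          # public modulus
E = 65537                          # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (Python 3.8+)


def digest(document: bytes) -> int:
    """SHA-256 of the document, reduced modulo N to fit the toy key."""
    return int.from_bytes(hashlib.sha256(document).digest(), "big") % N


def sign(document: bytes) -> int:
    """Bob: 'encrypt' the hash of the document with the private key."""
    return pow(digest(document), D, N)


def verify(document: bytes, signature: int) -> bool:
    """Alice: 'decrypt' with the public key and compare with her own hash."""
    return pow(signature, E, N) == digest(document)


contract = b"Bob agrees to pay Alice 1 BTC"
signature = sign(contract)

print(verify(contract, signature))                           # genuine document
print(verify(b"Bob agrees to pay Alice 9 BTC", signature))   # altered document
```

Alice needs only the document, the signature, and Bob's public pair `(N, E)`; the private exponent `D` never leaves Bob.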

Bitcoin (and not only Bitcoin) uses a digital signature scheme on elliptic curves: ECDSA over the secp256k1 curve. This is a popular method. The algebra of points on elliptic curves is rather intricate, but it allows keys that are shorter than in RSA and faster to compute.

Key to the wallet

As explained above, in the Bitcoin blockchain we use keys – or rather two keys: private and public, and, of course, a hash – where would we be without it.

When you create a Bitcoin wallet, you create a private key. Then from the private key you derive the public key that corresponds to it. The hash of the public key is the address of your bitcoin wallet. When money is transferred from one wallet to another, the hash of the public key is used as the address. This is convenient because the public key travels: it is already in each transaction. The private key stays only with you and no one else. If you give it to someone or it is stolen, your bitcoins will disappear and you will not be able to get them back. If you forget or lose the key, the bitcoins are gone as well. There are enough such stories on the Internet, and they are very sad.

There are different ways to protect the key. For example, hardware wallets were in fashion at one time – flash drives with a screen or metal cylinders in which you had to put together nine or twelve words from special letters. Keys were also written down on paper and stored in special wallet applications.

Another option is to use a wallet on the exchange, but it is important to understand that in this case you do not actually own the private key. It is clear that there is a reputation of the exchange or online wallet. But there is a risk that at any moment their owner can disappear into thin air along with all the users' money. This happens rarely, but such a risk exists. And if we are talking about the institution of reputation, then banks and states also have it. And if we trust someone so much that we give our money for safekeeping, then why do we need bitcoin and blockchain at all? My opinion: if you play this game, then play by the rules, and keep the private key to yourself.

The hardest thing is consensus

All the wonderful bonuses listed above would not work without the consensus algorithm. Let's pull another ace from Satoshi's sleeve and talk about consensus.

Consensus is the most difficult thing in all distributed systems, and Bitcoin with its blockchain is no exception. Since Bitcoin is distributed, it does not have any single center of trust. Instead, there are quite a few nodes in the Bitcoin network.

There are different types of nodes: full nodes, light nodes, miners. A full node stores a full copy of all transactions that have occurred in the Bitcoin network throughout its history, starting with the very first one. There are many full nodes, and each of them stores a copy of the entire Bitcoin blockchain. This replication makes fraud extremely unlikely: there are several thousand full nodes, and it is hard to imagine that they would all somehow collude to change something in the blockchain.

The blockchain keeps developing: transactions keep flowing through the network, and we keep using bitcoins. Moreover, the governments of some countries are considering bitcoin or are already using it. In other words, the copies of the blockchain stored by full nodes are constantly updated. The update mechanism is implemented by special nodes, the miners (which, in fact, also store a copy of the blockchain and in this sense are full nodes).

Everyone knows that mining is a magical process that allows you to burn electricity and earn money. It works something like this: there is a queue of transactions that users generate. When I send a bitcoin, the transaction goes to some node closest to me, that node sends it to other nodes, and so the transaction multiplies and quickly appears in many places. But even though the transaction has appeared there, it is not yet confirmed: it is not included in any block, and even once it is, that block has not yet become part of the blockchain. First, the transaction sits in the queue of unconfirmed transactions and waits until someone confirms it. Miners may take transactions from this queue in any order, for example chronologically or by the size of the commission.
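As an illustration of picking transactions “by the size of the commission”, here is a small sketch of a greedy selection by fee per byte. The `PendingTx` fields, the numbers, and the `pick_for_block` helper are all hypothetical; real mining software uses more elaborate rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PendingTx:
    txid: str   # hypothetical identifier
    fee: int    # commission offered by the sender, in satoshis
    size: int   # transaction size in bytes


def pick_for_block(queue, max_block_size):
    """Greedily take the best-paying transactions until the block is full."""
    chosen, used = [], 0
    by_fee_rate = sorted(queue, key=lambda tx: tx.fee / tx.size, reverse=True)
    for tx in by_fee_rate:
        if used + tx.size <= max_block_size:
            chosen.append(tx)
            used += tx.size
    return chosen


queue = [
    PendingTx("a", fee=1000, size=250),  # 4 satoshis per byte
    PendingTx("b", fee=5000, size=250),  # 20 satoshis per byte
    PendingTx("c", fee=300, size=300),   # 1 satoshi per byte
    PendingTx("d", fee=2400, size=400),  # 6 satoshis per byte
]

block_txs = pick_for_block(queue, max_block_size=900)
print([tx.txid for tx in block_txs])     # ['b', 'd', 'a']
```

Transaction `c` stays in the queue: it pays the least per byte and no longer fits into the block.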

A miner collects a certain number of transactions and forms a block consisting of the transactions and a header. It might seem that all miners could simply send their blocks to the full nodes, and the full nodes would add them to their copies of the blockchain. But a problem arises: whose block should be included? For example, two miners send two blocks containing contradictory transactions. In a distributed system there is no trust center to decide. In our usual economic system, this trust center is the Central Bank or just a bank. A blockchain needs some mechanism that allows miners to prove that their block is valid, that it can be trusted, and that it should be included in the blockchain. This mechanism is called consensus, and here is how it works.

Proof of Work

The Bitcoin blockchain (and not only it) relies on a rather complex consensus algorithm. Bitcoin has used Proof of Work since its genesis block. There are other consensus algorithms, but Proof of Work remains one of the most widely used in blockchains.

The proof-of-work method itself was invented before blockchain. Back in the 2000s, I worked for a company that dealt with network hardware and networking software. A typical task was protection against DDoS, and one of the protection methods was Proof of Work. A client requests a service from you, and you, realizing that you are under load, give it a computationally expensive puzzle. The assumption is that a genuine client will not mind spending a little effort to solve it and come back with the answer. If it is not a client but a bot whose job is to take down your site with a denial of service, it will give up and leave (and that's where it belongs).

A similar technique protects sites from DDoS at levels above L3: the browser computes a hash and sends it back. If the result matches what you expect, you let the request through. This technique is called Hashcash.

In the blockchain, the queue of transactions is kept by everyone: it is distributed among the full nodes. Where are the full nodes on the globe? Everywhere: most of them are in China, America, Russia. Miners take transactions from the queue and put them into blocks. When a miner has collected enough transactions for a block, he prepares a header for it. The block header includes the root of the Merkle tree: a single hash built up pairwise from the hashes of all the transactions in the block. There is also a timestamp and the hash of the previous block's header, which is what makes each block a link in the chain. Finally, the header contains a protocol version, the difficulty, and a special “number” called the Nonce.
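To show how many transaction hashes collapse into one Merkle root, here is a short sketch. It follows the Bitcoin convention of hashing pairs with double SHA-256 and duplicating the last hash when a level has an odd number of entries; the transaction contents are placeholder byte strings, not real serialized transactions, and byte-order details of the real protocol are ignored.

```python
import hashlib


def double_sha256(data: bytes) -> bytes:
    """SHA-256 applied twice, as Bitcoin does for transaction and block hashes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(tx_hashes):
    """Fold a list of transaction hashes into a single 32-byte root."""
    if not tx_hashes:
        raise ValueError("a block needs at least one transaction")
    level = list(tx_hashes)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # odd count: pair the last hash with itself
        level = [
            double_sha256(level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]
    return level[0]


# Placeholder transactions (hypothetical content).
txs = [b"petya->masha:1", b"masha->vasya:2", b"vasya->petya:3"]
hashes = [double_sha256(tx) for tx in txs]

root = merkle_root(hashes)
print(root.hex())

# Changing any single transaction changes the root in the header.
hashes[1] = double_sha256(b"masha->vasya:200")
print(merkle_root(hashes).hex())
```

Because the root sits in the header, and the header is what gets hashed, altering any transaction in the block changes the block's hash as well.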

Initially the Nonce is chosen more or less arbitrarily: the miner can set it to any value. The miner then calculates the hash of the header. This hash must satisfy a special condition: in the Bitcoin network, it must be less than a certain target value that all nodes of the blockchain agree on.

When you calculate the hash of some data, you get a pseudo-random value from a more or less uniform distribution. If you impose a condition on it, for example that the hash must be less than a predetermined value, the task becomes hard. The only known way to solve it today is brute force. The miner must calculate hashes almost non-stop, changing the header each time. If the condition is not met, the Nonce is simply increased by one and the hash is calculated again. This is repeated many, many times until a hash value that satisfies the condition is obtained. Once that happens, the task is solved, and the block can be sent to the network nodes, which will include it in the blockchain.

The hash operation has an interesting property: it is one-way. It is easy to calculate the hash of some data, but computationally infeasible to go in the opposite direction and find data that produces a given hash. Therefore, nodes that receive a block can easily check it: they calculate the hash of the header that came with the block, including the Nonce, and compare it with the target. If the hash value is less than the well-known target, the miner has indeed solved the problem, and the block can be included in the general chain. All miners compete to find a Nonce whose hash satisfies the condition.
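The asymmetry between mining and checking can be sketched in a few lines. This is a simplified model, not Bitcoin's exact procedure: the header is an arbitrary byte string rather than the real 80-byte structure, the hash is compared as a big-endian integer, and the target is deliberately easy so the loop finishes quickly. The names `pow_hash`, `mine` and `check` are my own.

```python
import hashlib


def pow_hash(header_prefix: bytes, nonce: int) -> int:
    """Double SHA-256 of header + nonce, interpreted as a 256-bit integer."""
    payload = header_prefix + nonce.to_bytes(4, "little")
    digest = hashlib.sha256(hashlib.sha256(payload).digest()).digest()
    return int.from_bytes(digest, "big")


def mine(header_prefix: bytes, target: int, max_nonce: int = 2**32):
    """Brute force: try nonces one by one until the hash drops below the target."""
    for nonce in range(max_nonce):
        if pow_hash(header_prefix, nonce) < target:
            return nonce
    return None  # nonce space exhausted; a real miner would alter the header


def check(header_prefix: bytes, nonce: int, target: int) -> bool:
    """Verification costs a single hash calculation."""
    return pow_hash(header_prefix, nonce) < target


# Toy target: the top 12 bits of the hash must be zero,
# so about 1 in 4096 nonces succeeds.
TARGET = 1 << 244
header = b"version|prev_hash|merkle_root|timestamp|difficulty|"

nonce = mine(header, TARGET)
print("nonce found:", nonce)
print("valid:", check(header, nonce, TARGET))
```

The miner performs thousands of hash calculations on average even at this toy difficulty, while any node confirms the result with one.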

The target that miners aim at is not fixed; it changes according to set rules. A new block is supposed to appear in the Bitcoin network roughly every 10 minutes. The number of miners and the number of transactions in the network vary, so the target is chosen to maintain that rate: one block per 10 minutes. If full nodes notice that blocks are arriving more slowly than that, the difficulty decreases and the task becomes easier to solve. If the opposite happens, which is more common, and miners begin to produce blocks faster than once every 10 minutes, then blocks pile up, competing branches appear more often, and some of them have to be cut off, so the task is made harder. This is how the system balances itself.
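The balancing rule can be written down as a short formula. In Bitcoin the target is recalculated every 2016 blocks (about two weeks at 10 minutes per block), and a single adjustment is limited to a factor of 4 in either direction. The sketch below assumes those constants and leaves out details such as the upper limit on the target and its compact encoding in the header.

```python
BLOCK_INTERVAL_SECONDS = 10 * 60
RETARGET_BLOCKS = 2016
EXPECTED_TIMESPAN = BLOCK_INTERVAL_SECONDS * RETARGET_BLOCKS  # two weeks


def retarget(old_target: int, actual_timespan: int) -> int:
    """Scale the target by how long the last 2016 blocks actually took.

    A smaller target means a harder puzzle, so fast blocks shrink the target.
    """
    clamped = max(EXPECTED_TIMESPAN // 4,
                  min(actual_timespan, EXPECTED_TIMESPAN * 4))
    return old_target * clamped // EXPECTED_TIMESPAN


old = 1_000_000
print(retarget(old, EXPECTED_TIMESPAN))        # on schedule: unchanged
print(retarget(old, EXPECTED_TIMESPAN // 2))   # blocks twice as fast: target halves
print(retarget(old, EXPECTED_TIMESPAN * 2))    # blocks twice as slow: target doubles
print(retarget(old, EXPECTED_TIMESPAN * 10))   # change is capped at a factor of 4
```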

When forming a block, the miner includes the transactions he collected from the queue plus one special transaction that pays a reward to himself. So if a miner solves his block before the others, sends it to the network, and the block is included in the blockchain, this reward transaction is confirmed along with the rest, and the miner gets his penny. The penny does not depend on the target: it consists of a fixed block subsidy, which halves every 210,000 blocks, plus the fees of all the transactions in the block. Therefore, as a rule, miners rake out the transactions with the highest fees from the queue.

There are many users in the Bitcoin network, everyone wants transactions to go faster. For this purpose, Bitcoin has a mechanism where the sender himself sets a commission to the miner for including his transactions in the block, and the corresponding block in the blockchain. He does this to speed up confirmation.

If you are a generous person and you really need to send money quickly, you can raise the fee a little bit, and then the miner will be more likely to take it before everyone else.

Confirmations or ways to combat dangers

As we have already discussed above, it happens that the blockchain grows and from a straight line turns into a tree with different branches. This is an unpleasant situation that should be avoided. As a rule, when full nodes detect it, they begin to track the length of these branches, because the branches grow in parallel, and as soon as one of them becomes longer than the other, the short one is cut off. All the transactions that were included in the blocks of the cut-off branch, but were not included in the blocks of the remaining one, are put back in the queue and are again considered unconfirmed. As a result, it can go like this: you assembled a block, solved the problem, received a reward, and then it turned out that this block was not in the main branch, it was cut off, and the reward was taken back.
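The cut-off rule can be sketched as follows. Blocks are modeled simply as lists of transaction ids, and branches are compared by length as described above; real nodes compare the total work behind each branch, which for equal difficulty comes to the same thing. The function name `resolve_fork` and the data are hypothetical.

```python
def resolve_fork(branch_a, branch_b):
    """Pick the longer branch and work out which transactions lose confirmation.

    Each branch is a list of blocks; each block is a list of transaction ids.
    Returns (winning_branch, transactions_to_requeue). On a tie nothing is
    cut off yet and the function returns (None, set()).
    """
    if len(branch_a) == len(branch_b):
        return None, set()
    if len(branch_a) > len(branch_b):
        winner, loser = branch_a, branch_b
    else:
        winner, loser = branch_b, branch_a
    confirmed = {tx for block in winner for tx in block}
    orphaned = {tx for block in loser for tx in block}
    return winner, orphaned - confirmed


# Two branches growing from the same parent block.
branch_a = [["t1", "t2"], ["t3"]]   # two blocks
branch_b = [["t1", "t4"]]           # one block

winner, back_to_queue = resolve_fork(branch_a, branch_b)
print(winner is branch_a)   # the longer branch survives
print(back_to_queue)        # {'t4'} returns to the unconfirmed queue
```

Transaction `t1` appears in both branches, so it stays confirmed; only `t4` has to wait for a new block.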

But the main problem that can happen is different: when the transaction is cancelled and put back in the queue, the money has disappeared from the recipient and returned to the sender. And what if the sender has spent the money by that time? To prevent this from happening, there is another complication called the “number of confirmations”. As a rule, all Bitcoin wallets change the balance and allow you to use Bitcoins after a certain number of confirmations.

Each block counts as a confirmation of a transaction. Once your transaction has been included in a block, and the block in the blockchain, that is its first confirmation. When the next block is added after the one containing your transaction, that is the second confirmation. When a third block appears, that is the third confirmation for your transaction (and the second confirmation for the transactions in the block right after yours), and so on. In other words, each new block adds one to the number of confirmations of your transaction.
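Counting confirmations is simple arithmetic on block heights. A minimal sketch, assuming we know the height of the block that contains the transaction and the height of the latest block in the chain (both parameter names are my own):

```python
def confirmations(tx_block_height, tip_height):
    """Number of confirmations for a transaction.

    tx_block_height: height of the block containing the transaction,
                     or None if it is still in the unconfirmed queue.
    tip_height:      height of the most recent block in the chain.
    """
    if tx_block_height is None or tx_block_height > tip_height:
        return 0
    return tip_height - tx_block_height + 1


print(confirmations(None, 102))  # 0: still waiting in the queue
print(confirmations(100, 100))   # 1: just included in a block
print(confirmations(100, 102))   # 3: two more blocks were added on top
print(confirmations(101, 102))   # 2: a transaction from the following block
```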

Statistically, these growing parallel branches do not exceed six blocks before the cutoff. Almost always, when a branch is five blocks long, the second one has already overtaken it. For this reason, many Bitcoin wallets and exchanges that provide the ability to store and spend Bitcoins set a limit on the number of confirmations equal to five. Once you have received five confirmations, this means that your transaction will almost certainly not be canceled, and the received bitcoins are indeed yours – you can use them. You can view the number of confirmations in any wallet online.

This is how blockchain works, using the example of a star in the world of cryptocurrency, Bitcoin. Do you use cryptocurrencies? Do you trust the principles of blockchain more than state financial regulators? Write your position in the comments.
