Byzantine failure problem in blockchain

In the previous section, we saw how blocks are appended to form a blockchain and how the cryptographic hash function plays a vital role in ensuring its integrity. Integrity alone, however, does not ensure that a single version of the blockchain is maintained across the decentralized network. Because creating a block is not in itself a difficult task, every node in the network is capable of maintaining its own version of the blockchain. This is an instance of a well-known distributed systems problem called the Byzantine Generals' Problem, or Byzantine failure.
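To make this concrete, the following minimal Python sketch shows two nodes extending the same starting block with different data. The block layout and the names make_block, block_hash, and is_consistent are assumptions made for illustration only; they do not correspond to any real blockchain implementation.

```python
import hashlib
import json


def block_hash(block):
    # Hash a canonical JSON encoding of the block's fields.
    encoded = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()


def make_block(prev_hash, data):
    # Hypothetical, simplified block: a link to the parent plus a payload.
    return {"prev_hash": prev_hash, "data": data}


def is_consistent(chain):
    # Each block must reference the hash of the block before it.
    return all(
        chain[i]["prev_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )


genesis = make_block("0" * 64, "genesis")

# Two nodes extend the same genesis block with different data.
chain_a = [genesis, make_block(block_hash(genesis), "Alice pays Bob 5")]
chain_b = [genesis, make_block(block_hash(genesis), "Alice pays Carol 5")]

both_valid = is_consistent(chain_a) and is_consistent(chain_b)
tips_differ = block_hash(chain_a[-1]) != block_hash(chain_b[-1])
print(both_valid, tips_differ)
```

Both chains pass the hash-link check, yet they end in different blocks. Hash-based integrity tells a node that a chain has not been tampered with, but not which of two intact chains the network should accept.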

Byzantine failure is a fault that presents different symptoms to different observers. It occurs when there is a loss of a system service in a system that needs to achieve consensus. This kind of failure is witnessed in distributed systems, where it is difficult to gather information about the status of components, and the presence of bad actors makes it even more difficult to reach consensus.

The term Byzantine failure is derived from the Byzantine Generals' Problem, an agreement problem in which a group of generals of the Byzantine army are planning to attack a city. Some of the generals might decide to attack, whereas others might decide to retreat. They must come to an agreement about whether to attack or retreat for the mission to succeed. Communicating the generals' votes to each other is a difficult task because they are distant from each other and there is no convenient way of communicating. As a result, messages among the generals may be delayed or miscommunicated. The problem is further aggravated by the presence of unreliable generals, who might try to cheat while casting a vote so that the mission fails. If such a system fails to achieve an agreement with the majority of votes, the mission fails, because the armies that decided to attack might not have enough support from the rest of the generals. This is a classic agreement problem for which there is no single solution. In general terms, solving the Byzantine Generals' Problem means finding a majority vote among the honest generals despite the unreliable ones.
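The following Python sketch illustrates why a simple vote count is not enough when one general can lie. It is a deliberately naive one-round tally, not a Byzantine fault-tolerant algorithm. The general names (g0 to g3), the single traitor, and the decide function are assumptions made for this illustration.

```python
from collections import Counter

ATTACK, RETREAT = "attack", "retreat"

# Honest generals send the same vote to everyone; here they are evenly split.
honest_votes = {"g0": ATTACK, "g1": ATTACK, "g2": RETREAT, "g3": RETREAT}

# The traitor tells each honest general whatever matches that general's own
# vote, so each side believes it holds the majority.
traitor_messages = {"g0": ATTACK, "g1": ATTACK, "g2": RETREAT, "g3": RETREAT}


def decide(receiver):
    # Tally all honest votes (including the receiver's own) plus the
    # traitor's message as seen by this particular receiver.
    received = list(honest_votes.values()) + [traitor_messages[receiver]]
    return Counter(received).most_common(1)[0][0]


decisions = {name: decide(name) for name in honest_votes}
print(decisions)
# g0 and g1 count 3 attack votes to 2 and decide to attack;
# g2 and g3 count 3 retreat votes to 2 and decide to retreat.
```

Every honest general follows the same majority rule, yet the army splits, because the traitor presented a different vote to different observers. A Byzantine fault-tolerant protocol has to prevent exactly this outcome.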

A system that displays Byzantine fault tolerance (BFT) is one that can overcome the Byzantine failure problem. In a digital system, cryptographic primitives such as digital signatures can provide fault tolerance for security-critical systems by creating unforgeable message signatures. Achieving data integrity provides some resistance to the Byzantine failure problem, but it is not a complete solution.

Now that we understand the Byzantine problem, we can see that it applies to any distributed system. It also exists in a blockchain network, where participants are spread across a decentralized peer-to-peer network. Maintaining a single truth in a decentralized network is a difficult task, and the involvement of bad actors makes it even more difficult. The decentralized network of a blockchain must agree on a single state so that the blockchain is consistent across all of its nodes. The occurrence of the Byzantine problem in a blockchain network is inevitable because blockchain networks exist in a decentralized, trustless environment. The nodes of the network must therefore reach a consensus on how to attain a universal blockchain state. Miners, in particular, must reach a consensus because they are the ones that contribute to the growth of the blockchain.

A blockchain miner is a node that not only validates the data of the blockchain, but also contributes resources to create a new block in the blockchain ledger.

Bitcoin was the first decentralized application that solved the Byzantine problem. It achieved this by using a consensus algorithm called Proof of Work, which was inspired by the Hashcash system proposed in 1997 by Adam Back, a British cryptographer. Hashcash was developed to validate legitimate users and reduce email spam by requiring a stamp that takes some amount of computation to create. The Hashcash stamp was created using a hashing algorithm. Although creating the stamp was time-consuming, verifying it could be performed almost instantly. Similarly, Bitcoin's Proof of Work also uses a cryptographic hash function to achieve consensus in the network.
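The asymmetry between creating and verifying a stamp can be sketched in a few lines of Python. This is a simplified illustration, not the actual Hashcash or Bitcoin format: the message layout, the use of a single SHA-256 pass, the difficulty of 16 leading zero bits, and the names mint and verify are all assumptions made for this example.

```python
import hashlib


def stamp_digest(message, nonce):
    # Hypothetical stamp layout: the message and a nonce joined by a colon.
    return hashlib.sha256(f"{message}:{nonce}".encode()).digest()


def verify(message, nonce, difficulty_bits):
    # Verification costs a single hash: the digest, read as a 256-bit
    # integer, must have at least difficulty_bits leading zero bits.
    target = 1 << (256 - difficulty_bits)
    return int.from_bytes(stamp_digest(message, nonce), "big") < target


def mint(message, difficulty_bits):
    # Creation is a brute-force search: try nonces one by one until a
    # digest falls below the target. Expected work doubles per extra bit.
    nonce = 0
    while not verify(message, nonce, difficulty_bits):
        nonce += 1
    return nonce


nonce = mint("hello", 16)
print(nonce, verify("hello", nonce, 16))
```

With 16 bits of difficulty, mint needs on the order of tens of thousands of hash evaluations on average, while verify always needs exactly one. Raising difficulty_bits makes stamp creation more expensive without changing the cost of verification.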

There are several other consensus algorithms that achieve a common global view in a blockchain. Proof of Stake, Proof of Activity, Proof of Capacity, and Proof of Elapsed Time are just a few examples. At the time of writing, even the popular Ethereum blockchain platform uses Proof of Work consensus, although there are active development efforts to adopt Proof of Stake in a future release of Ethereum.