Comparing Finality and Liveness of Various Consensus Algorithms
BFTF技术社区联盟
2018-10-19 03:13
本文约4814字,阅读全文需要约19分钟
Don't waste time designing consensus algorithms for purely asynchronous networks.

Original Author: Soul Machine

Due to the principle of FLP Imposibility,

No completely asynchronous consensus protocol can tolerate even a single unannounced process death


So don't waste time designing a consensus algorithm for a purely asynchronous network. The solution is to either strengthen the requirements for the network and require the network to be synchronous or partially synchronous (from asynchronous to synchronous or semi-synchronous, the network status becomes better), or relax the requirements for finality, do not require Deterministic finality, only require Probabilistic Finality is fine. It is also possible to do both at the same time.


Table 1 Comparison of various consensus algorithms


The contents of the above table are explained in detail below.


Finality

Finality looks similar to Safety, Consistency, but it seems to be different, which is very confusing.

For a simple understanding, you can think that Finality, Safety, and Consistency are synonyms, that is, Finality=Safety=Consistency.

For a deeper understanding, I think Finality is a complex, Finality>Safety>Consistency. Consistency is applicable to all distributed systems, including trusted environments such as Hadoop and untrusted environments such as Bitcoin; Safety is applicable to Byzantine environments; Finality is a term in the blockchain scenario, and blockchain is a Byzantine scenario Subset (the Byzantine problem has non-blockchain solutions in addition to blockchain). Consistency is the C in CAP theory, which is more general and requires less requirements than Finality. Finality contains more and stronger meanings than Consistency. Finality includes all of the following meanings:

  • The data of all nodes should be consistent. The client sends a read operation to any node, and the result should be the same. This is the Consistency mentioned in CAP. This term is applicable to all distributed systems, including trusted environments such as Hadoop and untrusted environments such as Bitcoin.

  • All nodes must ensure that they do not enter a state of mutual conflict and split, that is, safety. This term is applicable to the field of Byzantine fault tolerance. In an open environment like Byzantine, a malicious node will broadcast two conflicting messages (such as a double-spending transaction), causing all nodes in the entire network to fall into a split state, unable to achieve consistent , which destroys the nature of Safety. All Byzantine consensus algorithms need to deal with the problem of malicious nodes to ensure the safety of the entire network.

  • Once a transaction enters a block, it should be irrevocable, that is, Immutability. In the blockchain scenario, to ensure Safety, it is not only necessary to deal with the problem of double-spending transactions issued by malicious nodes, but also to prevent malicious nodes from revoking transactions that have been packaged into blocks. Because the blockchain is a single-linked list and a linear structure, malicious nodes can theoretically start from an old block, fork a longer new chain, cancel all the transactions they have sent, and put their own Spend the money again (another form of double spending)


In summary, Finality=Consistency + Safety + Immutability.

Liveness

Liveness can be considered equivalent to Availability in CAP. When a partition occurs on the network, such as a submarine optical cable breaks and the global Internet is divided into two parts, can the entire blockchain system write new transactions normally? I like Finality's consensus algorithm. At this time, I will choose to wait indefinitely. New transactions cannot be written until the submarine optical cable is repaired and the Internet on both sides is interoperable. I like Availabilty's consensus algorithm. At this time, the two networks will operate independently, and the data will be separated. The data in the full nodes becomes inconsistent.

For example, Tendermint is this type, sacrificing Liveness in pursuit of Deterministic Finality. Assuming that the submarine cable is broken and the network is divided into two sides, each side has half of the validators, so in the vote and commit stages, all validators on each side vote for a certain proposal block 100%, and can only receive 50% of the proposal block at most. Voting cannot reach 2/3, so the entire blockchain network will wait indefinitely until 2/3 votes are collected. During this waiting period, the next block cannot be generated, and new transactions cannot be written, and the entire network is paralyzed.


When Bitcoin encounters this kind of network segmentation, the two parts of the Bitcoin system will continue to move forward, and new transactions can still be written to generate new blocks. When the submarine cable is repaired and the Internet on both sides is connected, then Choose Merge.

Before the submarine optical cable is broken, the status of all full nodes in the world is consistent, as shown in the following figure:

When the submarine optical cable breaks, the global network is divided into two parts, and the two parts will generate blocks independently. At this time, the two sides are inconsistent, but they cannot perceive each other, thinking that they are still a linear blockchain, as follows picture:

On the left and right, although a new block is still mined every 10 minutes, the block hashes of block07 on the left and block07 on the right are not equal. At this time, the Bitcoin network is still available, but Finality is destroyed.

When the submarine optical cable is repaired, at this time, the two sides synchronize the blocks with each other, and they will realize that there is a fork, as shown in the following figure:

Now the state of all full nodes in the world becomes the picture above, and there is a fork. Since the height of both sides is 8, it is impossible to decide which fork is correct. At this time, it depends on which side the miners support and which side counts. Ligao, which side has a new block first, then which side wins, and the shorter chain will be discarded. For example, if the right side has a new block first, then the right side wins, and the left fork is discarded. All full nodes The data in becomes a linear block chain again, reaching a consensus, as shown in the figure below:

In fact, even if the submarine optical cable is continuous and the network does not have partitions, it will often happen that two miners dig out a new block at the same time. At this time, it is a competition to see who is lucky, and whichever fork the next new block inherits will win. That is to say, Bitcoin will never have a deterministic consistent state in theory, and forks will appear at any time at any height, so Bitcoin sacrifices a little Finality in exchange for stronger Liveness.


Network Assumption

All distributed consensus algorithms have an implicit assumption about the network.
Let me talk about the classification of the network first:

Synchronization: The message must be delivered within a certain time T. The value T of the upper bound is a known constant, and all nodes know it. If the message is not delivered within T time, it will not count on the message, and the node will think that the message has been lost and will not continue to wait for it. All the nodes are in an orderly manner, and every time T passes, they will go one step forward, which is very neat.

Semi-synchronous: The message must be delivered within a certain time T, but the value of T is not a fixed value, but changes dynamically, for example, it is dynamically calculated according to the network conditions. all nodes

Asynchronous: The message will arrive at any time, it may be very fast, or it may be very slow. In short, there is no clear upper bound (upper bound) or even an indefinite delay. No matter how late it arrives, the node must accept and process the message, which cannot be simply Discard message due to timeout

The assumption of Bitcoin on the network is that the network is synchronous, and the time limit is about 10 minutes. After a Miner digs out a block, it broadcasts to the entire network. At this time, the entire Bitcoin system expects that this block will be all within 10 minutes. The online full node received it, which means that every 10 minutes, all full nodes will take a step forward neatly, that is, add a block to the end of their own blockchain. Even if the network speed is very fast, for example, in less than 3 minutes, this new block has been received by all full nodes, and Bitcoin will still move forward every 10 minutes (a new block is produced). Ethereum is similar, but the time limit is 15 seconds.

Tendermint assumes that the network is semi-synchronous in the propose phase, because there will be a timeout period in this step. If a new proposal block is not received after the time expires, other validators will think that the proposer node has hung up, so an empty block will be generated. Until round-robin to the next proposer. Tendermint needs to collect more than 2/3 of votes in both prevote and precommit, which is infinitely waiting, that is, in these two stages, it is assumed that the network is asynchronous. Ultimately, Tendermint's requirements for the network are semi-synchronous.

pBFT is asynchronous in the three stages of pre-prepare, prepare, and commit. Since it is asynchronous and there is no timeout mechanism, how can it move forward? If you collect more than 2/3, you can move on. However, when all nodes receive a request from a client, they will start a timer. If the request has not been completed within a certain period of time, View Change will be triggered. The View Change section is semi-synchronous.

Here you can appreciate the simplification of Tendermint compared to pBFT. Tendermint moved the timeout mechanism to the propose stage (equivalent to the pre-prepare stage of pBFT). If the proposer broadcasts a proposal block within the specified time, it will proceed to the next step. If it times out, it will also proceed to the next step. But this is that the proposal block is an empty block (the pre-vote and pre-commit process will be completely gone through when the block is empty, but the block height will not increase by 1, which is equivalent to the network idling until the round-robin to the next The new proposer node rebroadcasts the new proposal block). That is to say, in any case, the propose stage will move forward to the next step. But Tendermint's pre-prepare is asynchronous, and it may be stuck forever. pBFT moved the timeout mechanism to the View Change part, so pBFT has an extra View Change step, which is a little more complicated than Tendermint. Tendermint replaces the proposer node by submitting empty blocks and round-robin, while pBFT replaces the primary node through View Change. Tendermint eliminates the complicated View Change step.

In addition to eliminating View Change, Tendermint is also simplified in another place. All Tendermint information is stored in the blockchain. However, pBFT was proposed in 1999. At that time, there was no such thing as blockchain. Therefore, although all nodes of pBFT have consistent data, the data is stored in a decentralized manner. The data of each node of pBFT includes:

The state of each replica includes the state of the service, a message log containing messages the replica has accepted, and an integer denoting the replica’s current view.

Blockchain is a distributed database. For example, before DBMS databases such as MySQL appeared, people wrote data into files and stored them on hard disks, inventing various strange file formats and organization methods. With MySQL, managing data is much more convenient. Similarly, Tendermint stores all data in the blockchain, pBFT does not have a distributed database such as blockchain, and all nodes need to manage data on the hard disk by themselves. For example, in order to compress message logs, discard old messages, and save hard disk space, the concept of checkpoint is introduced .

The relationship between Tendermint and pBFT is similar to the relationship between Raft and Paxos. Tendermint is a simplified version of pBFT, which is a simplified version of pBFT for the blockchain scenario.

The following figure is the algorithm flow chart of Tendermint:

The following figure is the algorithm flow chart of pBFT:


References

References

  1. 1999.Castro.  Practical Byzantine Fault Tolerance

  2. Tendermint: Byzantine Fault Tolerance in the Age of Blockchains

  3. Consensus Compare: Casper vs. Tendermint

  4. A Proof of Stake overview

  5. Compared with traditional PBFT, what advantage does Tendermint algorithm has?

  6. Synchronous, partially synchronous and asynchronous consensus algorithms

  7. GRANDPA Block Finality in Polkadot: An Introduction (Part 1)

  8. Finality in Blockchain Consensus


BFTF技术社区联盟
作者文库