Ethereum Subverted Ethereum: Introducing Cryptography to Achieve 2.0 Performance Breakthrough

BTC0.0₂₀

ETH0.0₂₀

HTX0.0₂₀

SOL0.0₂₀

BNB0.0₂₀

BTC0.0₂₀

ETH0.0₂₀

HTX0.0₂₀

SOL0.0₂₀

BNB0.0₂₀

Ethereum Subverted Ethereum: Introducing Cryptography to Achieve 2.0 Performance Breakthrough

安比（SECBIT）实验室

2020-06-15 02:21

本文约5285字，阅读全文需要约21分钟

Block producers no longer need to store blocks to verify and package transactions.

Editor's Note: This article comes fromAmbi Labs (ID: secbitlabs), reprinted by Odaily with authorization.

Editor's Note: This article comes from

Ambi Labs (ID: secbitlabs)

, reprinted by Odaily with authorization.

Ethereum has also been trying various methods to improve performance. On the eve of the launch of 2.0, it "tried" cryptography. Ethereum 2.0 will be a public chain based on "distributed system + cryptography". This cryptography does not refer to the part used for signature and privacy, but refers to the core component of a high-performance system. that part.

From this point of view, perhaps we can say that it is not others who subvert Ethereum, but itself. It jumped out of the single idea of distributed system design and embarked on the road of combined design of distributed system + cryptography.

This article will try to introduce how the distributed system design is combined with the cryptography design in Ethereum 2.0 to achieve a breakthrough in the performance of the public chain.

secondary title

State sharding: from single ledger to multiple ledgers

The blockchain is a distributed ledger, and the block nodes are miners who keep accounts, and they are responsible for writing transactions into the ledger. In addition to competing for bookkeeping rights, the most important job of block producers, or their own job, is to check whether the transactions they pack are legal. It is not difficult to complete this work, because the block producing node holds the ledger in its hands, and it just needs to check whether the transaction sender has the money.

For the non-fragmented public chain, all nodes hold the same ledger; and in order to prevent bookkeeping conflicts, only one block-producing node is allowed to keep books at a time. Ethereum proposes state sharding, which actually divides a ledger into multiple ledgers. In this way, some nodes keep accounts in No. 1 ledger, and some nodes keep accounts in No. 2 ledger... (equivalent to 7-11 from a cash register The platform is increased to multiple cash registers), and multiple nodes keep accounts at the same time, and the performance of the entire public chain will be qualitatively improved.

But if we fix the relationship between block producing nodes and ledgers/shards, for example, it is determined that four nodes a, b, c, and d are in charge of the No. 1 ledger, then the bad guys only need to buy some of a, b, c, and d. It can destroy the ledger, and while the performance of the public chain is improved, the security will decrease in the same proportion.

The solution given by Ethereum is that the block producing node does not take any ledger, or in other words, the block producing node does not need a ledger to keep accounts.

This will bring two major benefits. First, no matter which shard the node is assigned to, it can start bookkeeping (block production) immediately, and it takes almost no time to obtain and synchronize the ledger of the shard. The node can therefore It is easy to jump between different shards; the second is that the block producing node does not need to store ledgers, and does not need high hardware configuration. Anyone who mortgages 32ETH can become a validator, which is very helpful for the decentralization of Ethereum PoS And the security of the entire public chain.

But a new problem emerges: If the block producer does not have a ledger, how does it know whether the transaction sender has enough money? This is where cryptography comes into play.

secondary title

Vector Commitment: From Query to Proof

It sounds incredible to be able to keep accounts without a ledger, but the idea is simple: in the past, a node had a ledger, and after a transaction came, it looked through the ledger to check whether the transaction was legal; in the future, the node has no ledger, and the transaction sender When submitting a transaction, you need to submit a cryptographic proof (in order to distinguish, the following text refers to the cryptographic proof as proof), and you can prove that your transaction is legal.

Why can the block producing node judge whether a certain transaction is legal through a proof? Two important concepts of cryptography are involved here, the first one is called "Member Proof". It refers to the process of proving that an individual is part of a group. If it can be proved that a certain account state is part of the entire ledger state, the block producing node can of course believe in this account state, and use this as a basis to judge the legality of the transaction.

The second one is called "Vector Commitment", which can compress a group, no matter how large the group is, into only one number, and then give a proof of membership, which shows that an individual belongs to the group behind the number Group, and can prove the position of the individual in the group, and update the certificate.

The figure below is a Merkle tree. The leaf nodes at the bottom layer store application data, and other non-leaf nodes store the hash values of their child nodes. If you know the value of the green node and all the yellow nodes, you can perform three hash operations from bottom to top to get the value of the root of the Merkle tree, which is 6c0a.

Then, if the verifier has the value of the tree root (6c0a), the proof provider takes the value of the green node and all the values of the yellow nodes as a proof to the verifier, will the verifier be able to calculate the hash value three times? Equal to 6c0a to judge whether the value of the green node is in this Merkle tree? The answer is yes. This is the proof of membership of the green node to the Merkle tree, which is done as a vector commitment, and this is pretty much how Bitcoin SPV nodes (Simple Payment Verification) work.

As shown in the figure below, the SPV node does not store the complete block/ledger, but stores the root of the Merkle tree in each block (the leaf nodes of the Merkle tree store all transactions in the block), when it needs to query When a transaction exists, the full node will be asked for a proof of the transaction, which is similar to a package (Merkle path) of the green node and yellow node values above, and then the SPV node will calculate the total hash of these values Whether the value is equal to the value of the root of the Merkle tree in your own hand. If they are equal, it means that the transaction is a member of the Merkle tree, that is, the transaction exists.

image description

The SPV node judges whether the transaction exists through membership proof. The proof system consists of three parts: the node has a short summary (tree root); the proof provider gives a proof; the node calculates the proof to see if it is consistent with the summary in its own hand match.

At this point, we have completed "checking accounts without ledgers", which is to change the query thinking to proof thinking; the next thing we want to achieve is "keeping accounts without ledgers".

For the block producing nodes on the Ethereum 2.0 shard, its proof system is also composed of three parts: abstract, proof, and verification, but it must use the proof given by the transaction sender (not the full node) To judge whether a new transaction is legal (rather than whether the old transaction exists), and use this judgment as the basis for accounting.

secondary title

Stateless: From Proof of Ledger to Proof of Behavior

Imagine a small village with only 3 transactions between villagers every day, and the village head is responsible for keeping accounts with the ledger. A now wants to transfer 5 yuan to B. The traditional idea is very simple: the village chief checks to see if there is 5 yuan in A's account, and if so, write down this new transaction.

Now change the way of thinking: Suppose A wants to transfer 5 yuan to B this morning, and the village chief knows that A’s account has 10 yuan yesterday morning, then if A can prove that the 3 transactions yesterday have nothing to do with him, right? Means his account still has 10 dollars this morning? In this way, can the village chief write down this new transaction without checking the ledger? The answer is yes.

What if A had a transaction yesterday? It's very simple. At this time, A does not prove that he has no transaction, but proves that he only had one transaction yesterday, and that transaction cost 3 yuan; the village chief knows that he still has 7 yuan, so he can record the new transaction.

This change of thinking is very important, and you must understand it. This is the mystery of the "stateless" thing. It is not difficult to find that even an SPV node that does not hold a ledger actually needs to use the ledger, or the state, when querying transactions, but it does not store the state by itself, but goes to the whole node to obtain the proof of this state ; But under this new idea, the role of state can be completely replaced by "behavior proof", then this chain can be designed in a stateless way. (Note: The word "behavior proof" has no source, it is described by the author for easy understanding)

How to achieve statelessness? How to complete bookkeeping with the help of proof of behavior? It is still the method of membership certification. Can a Merkle tree be used to complete this membership proof? In theory, it is possible, but for the "stateless" application scenario, the overhead of using it is too high. In this article, we introduce Proof of Membership via Aggregable Subvector Commitments for ledgerless bookkeeping.

Aggregatable Subvector Commitments (aSVC) is a recent research effort from the paper "Aggregated Subvector Commitments for Stateless Cryptocurrencies" by Alin Tomescu, Ittai Abraham, Vitalik Buterin (Ethereum), Justin Drake (Ethereum Square), Dankrad Feist (Ethereum), Dmitry Khovratovich (Ethereum). Its working process is as follows:

1. Initialize the sharding, that is, determine the initial situation of the account when the account book is established. Suppose a shard has 100 accounts when it is established, and these accounts have initial balances. We need to use v(i) to represent the i-th account, which is a pair of values such as (address i, balance i); use V Represents all accounts, it is a set of values such as (address 1, balance 1) (address 2, balance 2)...(address 100, balance 100).

At the same time, two values need to be generated. The first one is called c, which is a commitment to V, representing all accounts and the balance in the account of the shard at this time. The block producing nodes hold c in their hands (it can be compared with the root of the Merkle tree for easy understanding), which is the summary for future verification.

The second one is called π(i), which is a proof that v(i) is a member of V, representing the i-th account and the balance of the account is in the general ledger V. Each account holds and only holds its own π(i), which is the proof submitted to the block node when sending transactions in the future.

During the initialization phase, the generation of commitments and proofs requires an initial "state".

2. First transaction. Account i initiates the first transaction of the entire shard. At this time, it needs to submit π(i) and the transaction to the block node, and the block node calculates π(i) to see if the result is consistent with c in its own hand. If they match, you can believe how much balance the sender's account has, and use this to judge whether the transaction it submitted is legal.

3. Next is the key point: update c and π(i).

c (commitment to the entire ledger) is no longer generated according to the state, it is generated using c before the first transaction and the balance change caused by the first transaction; π(i) (the account’s proof of itself) It is not generated according to the state, it is generated with π(i) before the first transaction occurred, and the change of the account by the first transaction.

After completing the update of c and π(i), the block producing node has a new commitment (new c) that can promise the new balance of all users, and the account also has a new proof (new π(i) that can prove its new balance )).

By analogy, each transaction will change c once, and change all π(i) once, but this change no longer depends on the state data, it depends on the old c and π(i), as well as the previous transaction; when When a new transaction needs to be verified, the block producing node always has the latest c in its hands. It can judge whether a transaction is legal and whether it can be packaged into a block through c and the π(i) provided by the account.

So at this point, "accounts can be kept without a ledger" is finally realized. Whether for block producers or accounts, what they hold in their hands is some kind of cryptographic proof, not the state of the ledger. It should also be mentioned that statelessness and sharding seem to be a perfect match, but statelessness is not a design for sharding, it is a design for public chains.

The design goal of aSVC is to become an efficient proof of membership, reduce the communication overhead and calculation overhead in the above process, so that this scheme can be used for the implementation of stateless blockchain. From the perspective of the paper, using the aSVC scheme, the size of c and π(i) is only a few dozen bytes, the update time of π(i) is O(1), and the verification time is also O(1). It supports the aggregation of multiple proofs into one O(1) proof. This low-overhead implementation is exactly what aSVC is all about. However, just like Vitalik's discussion in the Ethereum Researcher Forum, aSVC still needs further optimization.

At the end of the article is a brief summary of the full text: the state sharding design of the distributed system is combined with the membership proof design of cryptography to achieve a breakthrough in the performance of Ethereum 2.0.

1. For safety, the state sharding of Ethereum 2.0 needs to randomly assign block producing nodes.

2. If the block producing nodes need ledgers, ledger synchronization will become a new performance bottleneck, and ledger storage will also affect the decentralization of PoS.

4. The first change in thinking: change the way of looking up the ledger to the way of proving the ledger. This needs to be done with the help of cryptography.

5. The second change of thinking: change the way of proving the state of the ledger to the way of proving the transaction behavior, and realize stateless and bookkeeping without ledger. This needs to be done with the help of cryptography.

6. There are many tools for cryptography. When you have a goal, you need to select and combine appropriate tools to form a solution according to the application requirements, and optimize the solution.

secondary title

Easter egg: aggregatable subvector promises

In the article, we describe the work of aSVC in natural language. If you are interested, you can understand it more clearly through the API definition of aSVC. As shown in the figure below: the first red box is to generate a commitment c during initialization, the second red box is to update c according to the transaction; the first green box is to generate a proof π(i) during initialization, and the second green box is to The transaction updates π(i); the blue box is the block node using c and π(i) for verification.

In the above process, the core work is to change the old c into a new c and change the old π(i) into a new π(i) according to the changes caused by the transaction. Not only must the update be able to be completed, but also the overhead of this update is acceptable, which is the key problem that aSVC needs to solve. Let's take the update of c as an example to introduce how aSVC does it.

As mentioned above, what c promises is V, and from c to new c is actually from promising V to promising a new V. For V, it is composed of a series of points, (address 1, balance 1) is one point, (address 2, balance 2) is another point... (address 100, balance 100) is the 100th point.

With the help of Lagrange interpolation, this series of points can be turned into a polynomial (the curve represented by the polynomial passes through all these points), which means that the commitment to a series of points can be turned into a commitment to a polynomial ; From c to new c is equivalent to going from committing one polynomial to committing another polynomial.https://eprint.iacr.org/2020/527.pdf。

From this point of view, perhaps we can say that it is not others who subvert Ethereum, but itself. It jumped out of the single idea of ​​distributed system design and embarked on the road of combined design of distributed system + cryptography.

At this point, we have completed "checking accounts without ledgers", which is to change the query thinking to proof thinking; the next thing we want to achieve is "keeping accounts without ledgers".

5. The second change of thinking: change the way of proving the state of the ledger to the way of proving the transaction behavior, and realize stateless and bookkeeping without ledger. This needs to be done with the help of cryptography.

From this point of view, perhaps we can say that it is not others who subvert Ethereum, but itself. It jumped out of the single idea of distributed system design and embarked on the road of combined design of distributed system + cryptography.