Why is zkEVM possible now? Understanding the design challenges and workflow of zkEVM
EthFans
2021-10-14 09:52
zkEVM can provide developers and users with the same experience they have on Ethereum today.

Author: Ye Zhang@Scroll

Thanks to Vitalik Buterin, Barry Whitehat, Chih-Cheng Liang, Kobi Gurkan, and Georgios Konstantopoulos for their review and insights.

TL;DR

We believe that zk-Rollup will become a "sweet spot": its cost advantages and extremely high security make many other Layer 2 scaling solutions pale in comparison. However, existing zk-Rollup implementations are all application-specific, so it is difficult to build composable general-purpose dApps inside any given zk-Rollup, and existing applications cannot be migrated at all. We introduce zkEVM to generate zero-knowledge proofs for general EVM verification. With it, we can build a zk-Rollup that is fully compatible with the EVM, so that existing Ethereum applications can be migrated to this zk-Rollup with little effort.

Background

zk-Rollup is widely recognized as the best Ethereum scaling solution: it matches Ethereum Layer 1 in security and leads all Layer 2 solutions in time to transaction finality (a detailed comparative analysis is linked from the original post).

In the medium to long term, as ZK-SNARK technology continues to improve, zk-Rollups will win out in all use cases. — Vitalik Buterin

The basic principle of zk-Rollup is to pack a large number of transactions into a Rollup block and generate a succinct proof for the block off-chain. The smart contract on Layer 1 then only needs to verify this proof to apply the new state directly, without re-executing the transactions. This yields an order-of-magnitude saving in gas, since verifying the proof is far cheaper than re-executing the computation. Another benefit is the storage saved through data compression (i.e., only a minimal amount of data is stored on-chain for validation).
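
To make the cost asymmetry concrete, here is a minimal sketch (in Python, with hypothetical names such as `RollupBlock`, `verify_snark`, and `apply_rollup_block`; it is not any real rollup's contract) of what the Layer 1 verification step does: it checks one succinct proof against the previous state root, the claimed new state root, and a commitment to the batched transaction data, instead of replaying every transaction.

```python
from dataclasses import dataclass

@dataclass
class RollupBlock:
    prev_state_root: bytes     # state root before the batch
    new_state_root: bytes      # claimed state root after the batch
    tx_data_commitment: bytes  # compressed calldata posted for data availability
    proof: bytes               # succinct validity proof produced off-chain

def verify_snark(public_inputs: list, proof: bytes) -> bool:
    """Placeholder for the proof check done on Layer 1. Its cost is roughly
    constant, independent of how many transactions are packed into the block."""
    raise NotImplementedError("stands in for the actual SNARK verifier")

def apply_rollup_block(current_root: bytes, block: RollupBlock) -> bytes:
    # The contract never re-executes the transactions; it only checks the proof.
    assert block.prev_state_root == current_root, "stale state root"
    ok = verify_snark(
        [block.prev_state_root, block.new_state_root, block.tx_data_commitment],
        block.proof,
    )
    assert ok, "invalid validity proof"
    return block.new_state_root  # new canonical Layer 2 state
```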

Although zk-Rollup is safe and efficient, its applications are still limited to payments and swaps. General-purpose dApps are difficult to build, for two main reasons:

  • First, if you want to develop a dApp inside a zk-Rollup, you have to write all of your smart-contract logic in a special language (i.e., R1CS). The syntax is unwieldy and requires deep familiarity with zero-knowledge proofs.

  • Second, existing zk-Rollup implementations do not support composability [1]. As a result, different zk-Rollup applications on Layer 2 cannot interact with one another, which severely undermines the composability of DeFi applications.

In short, zk-Rollup is currently not developer friendly and has limited functionality. This is the biggest problem we want to solve. We want to provide the best developer experience by directly supporting native EVM verification, and support composability on Layer 2, so that existing Ethereum applications can be migrated to zk-Rollup intact.

Building general-purpose dApps in zk-Rollup

We can build general-purpose dApps within zk-Rollup in two ways:

  • One is to build application-specific circuits (“ASICs”) for different dApps.

  • The other is to build general-purpose "EVM" circuits for executing smart contracts.

"Circuit" refers to the program representation used in zero-knowledge proofs. For example, if you want to prove that hash(x) = y, you need to rewrite the hash function in circuit form. The circuit form only supports very limited representations (ie, R1CS only supports addition and multiplication). Therefore, writing programs in circuit language is very difficult - you can only use addition and multiplication to build all program logic (including if else, loops, etc.).

The first method requires developers to design a dedicated "ASIC" circuit for each dApp. This is the most traditional way of using zero-knowledge proofs, and custom circuit design keeps the cost of each individual dApp down. However, it also raises a composability problem, since the circuits are "static", and the heavy reliance on circuit-design expertise makes for a poor developer experience [2].

The second method requires no special design and no deep expertise from the developer. The idea behind this kind of machine-based proof is that any program ultimately runs on a CPU, so we only need to build one general-purpose CPU circuit that verifies low-level CPU steps; we can then use that circuit to verify any program execution. In our setting, the program is the smart contract and the CPU is the EVM. However, this method was not widely adopted in the past because of its prohibitive cost: even if you only want to prove that a single add was computed correctly, you still pay the cost of the entire EVM circuit, and if your execution trace contains 1000 operations, the prover pays 1000 times the cost of the EVM circuit [3].

Recently, a lot of research has gone into optimizing ZKPs along both of these routes, including (i) new ZKP-friendly primitives such as the Poseidon hash (roughly 100 times cheaper in circuits than SHA256); (ii) continued efficiency improvements in general-purpose verifiable virtual machines such as TinyRAM; and (iii) more and more general optimization techniques, such as Plookup, together with faster cryptography libraries.

Design Challenges of zkEVM

zkEVM is hard to build. Although the intuition has been around for many years, no one has yet succeeded in building a native EVM circuit. Compared with TinyRAM, zkEVM is more challenging to design and implement, for the following reasons:

  • First, the EVM has limited support for elliptic curves. Currently, the EVM only supports the BN254 pairing. Because it does not directly support cycles of elliptic curves, proof recursion is hard to verify on the EVM, and it is also difficult to use other specialized protocols in this setting: the verification algorithm must be EVM-friendly.

  • Second, the EVM word size is 256 bits. The EVM operates on 256-bit integers (whereas most conventional virtual machines use 32- or 64-bit words), while zero-knowledge proofs most "naturally" work over prime fields. Doing this "mismatched-field arithmetic" inside a circuit requires range proofs, which add roughly 100 constraints per EVM step (see the word-decomposition sketch after this list). This inflates the EVM circuit size by almost two orders of magnitude.

  • Third, the EVM has many special opcodes. Unlike conventional virtual machines, the EVM has special opcodes such as CALL, as well as error types tied to the execution environment and gas. This poses new challenges for circuit design.

  • Fourth, the EVM is a stack-based virtual machine. The SyncVM (zkSync) and Cairo (StarkWare) architectures define their own IR/AIR in a register-based model, and build specialized compilers that compile smart-contract code into a new, ZKP-friendly IR. That approach is language-compatible rather than natively EVM-compatible. Proving a stack-based model directly, and supporting the native toolchain, is much harder.

  • Fifth, Ethereum's storage layout carries a high cost. It relies heavily on Keccak and a giant Merkle-Patricia trie (MPT) [4], neither of which is ZKP-friendly, and both incur very high proving costs. For example, the Keccak hash has roughly 1000 times the circuit size of the Poseidon hash. But replacing Keccak with another hash would cause compatibility problems with the existing Ethereum infrastructure.

  • Sixth, machine-based proofs come with a high cost. Even if you can handle all of the above, you still need to find an efficient way to combine them to get a complete EVM circuit. As I mentioned in the previous section, even an opcode as simple as add can cost you the entire EVM circuit.
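
To see why the 256-bit word size (the second point above) is costly, here is a minimal sketch, independent of any particular proof system: an EVM word has to be split into small limbs, and each limb must be range-checked, before field arithmetic in the circuit can soundly reproduce 256-bit wrap-around arithmetic. The limb width of 16 bits is an arbitrary choice for illustration.

```python
# Sketch: decompose a 256-bit EVM word into 16-bit limbs and range-check them.
# Inside a circuit, each range check below turns into extra constraints (or a
# lookup into a 16-bit table), which is where the overhead comes from.
LIMB_BITS = 16
NUM_LIMBS = 256 // LIMB_BITS  # 16 limbs per EVM word

def to_limbs(word: int) -> list:
    assert 0 <= word < 2**256
    return [(word >> (LIMB_BITS * i)) & (2**LIMB_BITS - 1) for i in range(NUM_LIMBS)]

def range_checked(limbs: list) -> bool:
    # One "range proof" per limb; ~16 of these per word, per opcode.
    return all(0 <= limb < 2**LIMB_BITS for limb in limbs)

def add_mod_2_256(a: int, b: int) -> int:
    # What the ADD opcode computes; the circuit must prove the same result
    # limb by limb, carrying between limbs, with every limb range-checked.
    return (a + b) % 2**256

a, b = 2**255 + 7, 2**255 + 11
assert range_checked(to_limbs(a)) and range_checked(to_limbs(b))
assert add_mod_2_256(a, b) == 18  # wraps around modulo 2^256
```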

Why zkEVM is possible today

Thanks to significant progress by researchers, more and more of these efficiency problems have been solved over the past two years, and the proving cost of zkEVM is finally no longer a blocker. The main technical advances are the following:

  • The use of polynomial commitments. In most succinct zero-knowledge proof protocols of the past few years, constraints were expressed in R1CS, with PCP queries encoded into an application-specific trusted setup. This tends to inflate circuit size and rules out many custom optimizations, because each constraint can only be of degree 2 (a bilinear pairing only allows one multiplication in the exponent). With polynomial commitment schemes, you can raise constraints to any degree using a universal or even a transparent setup, which greatly increases the flexibility of backend choices.

  • The emergence of lookup arguments and custom gates. Another important optimization is the use of lookup tables, first proposed in Arya and then optimized in Plookup. For primitives that are not ZKP-friendly (e.g., bitwise operations like AND and XOR), lookup tables can save a great deal of work (a toy illustration follows this list). Custom gates can implement higher-level constraints efficiently. TurboPlonk and UltraPlonk define elegant circuit syntax that makes lookup tables and custom gates much easier to use. This helps a lot in reducing the cost of the EVM circuit.

  • Recursive proofs are increasingly feasible. In the past, recursion was expensive because it relied on special pairing-friendly cycles of elliptic curves (e.g., MNT-curve constructions), which impose a heavy computational cost. A growing number of techniques now enable recursion without sacrificing efficiency. For example, Halo avoids pairing-friendly curves and amortizes the recursion cost with a special inner-product argument, and Aztec shows that proofs from existing protocols can be aggregated directly (lookup tables can reduce the cost of non-native field operations, shrinking the verification circuit). The same circuit size can now support far more functionality.

  • Hardware acceleration is making proving more efficient. To the best of our knowledge, we have built the fastest GPU and ASIC/FPGA provers to date. Our paper on an ASIC prover was accepted to ISCA, a top computer-architecture conference, this year, and our GPU prover is about 5 to 10 times faster than Filecoin's implementation. This can greatly improve the prover's efficiency.
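
To illustrate the lookup-table idea mentioned above (a toy sketch, not the Plookup protocol itself): instead of proving a bitwise XOR bit by bit with boolean constraints, the prover shows that each (a, b, a XOR b) triple is a member of a precomputed table.

```python
# Toy illustration of the lookup-table idea: precompute every valid
# (a, b, a ^ b) row for 8-bit values once, then "prove" an XOR by showing
# the triple appears in that table instead of constraining each bit.
XOR_TABLE = {(a, b, a ^ b) for a in range(256) for b in range(256)}

def xor_via_lookup(a: int, b: int, claimed: int) -> bool:
    # In Plookup-style arguments this membership check is done with a
    # permutation/accumulator argument; here it is a plain set lookup.
    return (a, b, claimed) in XOR_TABLE

assert xor_via_lookup(0xA5, 0x3C, 0xA5 ^ 0x3C)
assert not xor_via_lookup(0xA5, 0x3C, 0x00)
```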

How does zkEVM work, and how do we build it?

In addition to strong intuition and technical improvements, we have to figure out what we need to prove and conceive a more specific architecture. More technical details and comparative analysis will be reserved for future articles. In this article, we'll introduce the entire workflow and some core concepts.

Workflow for developers and users

(Figure: the zkEVM workflow)

Even though the external workflow remains the same, the underlying processing of Layer 1 and Layer 2 is completely different:

  • Layer 1 relies on re-executing smart contracts.
  • Layer 2 relies on validity proofs from the zkEVM circuit.

Let's explain in detail how the transactions on Layer 1 and Layer 2 are different.

On Layer 1, the bytecode of a deployed smart contract is stored in Ethereum storage. Transactions are propagated across a peer-to-peer network. For each transaction, every full node has to load the corresponding bytecode and execute it on the EVM, with the transaction as input data, to arrive at the same state.

On Layer 2, the bytecode is likewise stored in storage, and users act in exactly the same way. Transactions are sent off-chain to a centralized zkEVM node. Then, instead of merely executing the bytecode, the zkEVM also generates a succinct proof that the state was updated correctly by the transactions. Finally, the Layer 1 contract verifies this proof and updates the state without re-executing the transactions.

Let's look deeper into the execution to see what zkEVM ultimately needs to prove. In native execution, the EVM loads the bytecode and executes its opcodes one by one from the beginning. Each opcode can be seen as performing three steps: (i) read elements from the stack, memory, or storage; (ii) perform a computation on those elements; (iii) write the results back to the stack, memory, or storage [5]. For example, the add opcode reads two elements from the stack, adds them, and writes the result back to the stack.
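
As a minimal sketch of these three steps (read, compute, write), here is a toy interpreter step for the ADD opcode; the real EVM also tracks gas, the program counter, and the call context, which are omitted here.

```python
# Toy interpreter step for ADD: (i) read two operands from the stack,
# (ii) compute, (iii) write the result back. A zkEVM must prove all three.
WORD = 2**256

def step_add(stack: list) -> None:
    a = stack.pop()          # (i) read from the stack
    b = stack.pop()
    result = (a + b) % WORD  # (ii) compute modulo 2^256
    stack.append(result)     # (iii) write back to the stack

stack = [5, 7]
step_add(stack)
assert stack == [12]
```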

Therefore, it is obvious that the proof of zkEVM needs to include the following aspects (corresponding to the execution flow):

  • Bytecode is correctly loaded from persistent storage (you are running the correct opcodes, loaded from the correct address)
  • Opcodes in the bytecode are executed one by one, in order (the bytecode is executed sequentially, with no opcode missed or skipped)
  • Each opcode performs its reads, writes, and computation correctly (the values read from and written to the stack, memory, and storage, and the results computed from them, are all valid)

zkEVM Design Highlights

When designing the zkEVM architecture, we need mechanisms that address each of the three requirements above.

1. We need to design a circuit for a cryptographic accumulator.

Storage acts as a "verifiable memory": we need some technique to prove that reads from it are correct, and cryptographic accumulators can do this efficiently [6]. Take a Merkle tree as an example: deployed bytecodes are stored as leaves of the tree, and a verifier can use a succinct proof to check that a given bytecode was correctly loaded from an address (i.e., by verifying a Merkle path inside the circuit). For Ethereum storage, this circuit needs to be compatible with both the Merkle-Patricia trie and the Keccak hash function.
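
Here is a minimal sketch of this "verifiable memory" idea, using SHA-256 only to keep the example self-contained (Ethereum's trie actually uses Keccak and a 16-ary Merkle-Patricia structure): the verifier recomputes the root from a leaf and its Merkle path, which is exactly the check the circuit must encode.

```python
import hashlib

def h(data: bytes) -> bytes:
    # SHA-256 stands in for Keccak purely to keep the sketch self-contained.
    return hashlib.sha256(data).digest()

def verify_merkle_path(leaf: bytes, path: list, root: bytes) -> bool:
    """Recompute the root from `leaf` and its sibling hashes.
    Each path entry is (sibling_hash, leaf_is_right_child)."""
    node = h(leaf)
    for sibling, is_right in path:
        node = h(sibling + node) if is_right else h(node + sibling)
    return node == root

# Tiny 2-leaf tree: the root commits to the deployed bytecode of two contracts.
code_a, code_b = b"bytecode of contract A", b"bytecode of contract B"
root = h(h(code_a) + h(code_b))
assert verify_merkle_path(code_a, [(h(code_b), False)], root)  # A is the left child
assert verify_merkle_path(code_b, [(h(code_a), True)], root)   # B is the right child
```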

2. We need to design a circuit to associate the bytecode with the actual execution trace.

Moving the bytecode into a static circuit creates a problem: conditional opcodes like jump (corresponding to loops and if-else statements in smart contracts) can jump anywhere, and the jump destinations are unknown until someone runs the bytecode on a particular input. That is why we need to verify the actual execution trace. An execution trace can be thought of as the "unrolled bytecode": it contains the opcodes in the order they were actually executed (i.e., if execution jumps to another location, the target opcode and its location are included in the trace).

The prover supplies the execution trace directly as witness data for the circuit. We then need to prove that this trace really is the "unrolling" of a particular bytecode on a particular input. The idea is to enforce consistency of the program counter; for the unknown-destination problem, the prover provides all of the data, and lookup arguments check consistency efficiently (i.e., prove that the opcode with the correct global counter is contained in the "bus").
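
A toy sketch of the consistency check (highly simplified; the real design uses lookup arguments over a "bus" rather than a Python loop, and also checks that each jump destination matches the value on the stack): the prover supplies the executed trace, and the checker confirms that every step's opcode really is the one stored at that program counter in the committed bytecode, and that the program counter advances sequentially except where a jump explains the move.

```python
# Toy bytecode with a jump, and a check that the supplied execution trace is
# consistent with it: every (pc, opcode) pair in the trace must match the
# bytecode, and pc must advance by 1 except where a JUMP explains the move.
bytecode = {0: "PUSH 4", 1: "JUMP", 2: "STOP", 3: "INVALID", 4: "ADD", 5: "STOP"}

# Execution trace the prover claims: (program counter, opcode, pc after the step)
trace = [(0, "PUSH 4", 1), (1, "JUMP", 4), (4, "ADD", 5), (5, "STOP", 5)]

def trace_consistent(bytecode: dict, trace: list) -> bool:
    for pc, op, next_pc in trace:
        if bytecode.get(pc) != op:              # opcode must come from the bytecode
            return False
        if op.startswith("JUMP"):
            continue                            # destination is witness data
        if op != "STOP" and next_pc != pc + 1:  # otherwise pc advances sequentially
            return False
    return True

assert trace_consistent(bytecode, trace)
assert not trace_consistent(bytecode, [(0, "ADD", 1)])  # opcode not at that pc
```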

3. We need to design circuits for each opcode (to prove that the read, write, and computation in each opcode are correct).

This is the most important part: proving that every opcode in the execution trace is executed correctly and consistently. Naively putting everything together would be very costly. The key optimization ideas are:

  • We can split reads/writes and computation into two proofs. One proof puts every element used by an opcode onto a "bus"; the other proves that the computation over the elements on the bus was performed correctly (see the sketch after this list). This drastically reduces the cost of each part (e.g., the computation proof never has to reason about the whole EVM storage). In the more detailed specification, the former is called the "state proof" and the latter the "EVM proof". Another observation is that lookup arguments can handle this "bus mapping" efficiently.

  • We can design higher-degree custom constraints for each opcode (for example, splitting an EVM word into several chunks so it can be processed more efficiently), and use selector polynomials to "switch on" each constraint only when it is needed. This avoids paying the cost of the entire EVM circuit for every single operation.
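
Here is a toy sketch of the bus-mapping split (heavily simplified relative to the actual specification; the entry layout and checks are illustrative only): the state proof checks that the read/write log is internally consistent, while the EVM proof only checks that each opcode's outputs follow from the values it read off the bus.

```python
# Toy "bus": a log of stack accesses, each tagged with a global counter (gc).
# The state proof checks read/write consistency of the log; the EVM proof
# only checks the arithmetic of each opcode against values on the bus.
bus = [
    {"gc": 1, "rw": "write", "slot": 0, "value": 5},   # PUSH 5
    {"gc": 2, "rw": "write", "slot": 1, "value": 7},   # PUSH 7
    {"gc": 3, "rw": "read",  "slot": 1, "value": 7},   # ADD reads operand 1
    {"gc": 4, "rw": "read",  "slot": 0, "value": 5},   # ADD reads operand 2
    {"gc": 5, "rw": "write", "slot": 0, "value": 12},  # ADD writes the sum
]

def state_proof_ok(bus: list) -> bool:
    # Every read must return the most recent value written to that slot.
    latest = {}
    for entry in sorted(bus, key=lambda e: e["gc"]):
        if entry["rw"] == "write":
            latest[entry["slot"]] = entry["value"]
        elif latest.get(entry["slot"]) != entry["value"]:
            return False
    return True

def evm_proof_ok_for_add(bus: list) -> bool:
    # The EVM proof never touches the storage layout; it only checks the sum.
    reads = [e["value"] for e in bus if e["rw"] == "read"]
    write = [e for e in bus if e["gc"] == 5][0]["value"]
    return (reads[0] + reads[1]) % 2**256 == write

assert state_proof_ok(bus) and evm_proof_ok_for_add(bus)
```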

Originally proposed by the Ethereum Foundation, this architecture is still at an early stage and under active development. We are working closely with the Ethereum Foundation to find the best way to implement this EVM circuit. So far we have specified the most important pieces of the EVM circuit and implemented some opcodes using the UltraPlonk syntax from the Halo2 library. More details will follow in subsequent articles, and we encourage interested readers to read the specification. The development process is transparent, and this will be a fully open-source design built by the whole community; we hope more people will join in and contribute.

What else can zkEVM bring us?

zkEVM is about much more than Layer 2 scaling. It can also be seen as a direct way to scale Ethereum Layer 1 itself via validity proofs on Layer 1, meaning the existing Layer 1 could be scaled without any dedicated Layer 2.

Conclusion

About Us

Notes:

  1. Composability was achieved in StarkWare's announcement of September 1, 2021.

  2. Circuits are fixed and static. For example, when implementing a program as a circuit you cannot use loops with a variable upper bound; the bound must be fixed at its maximum value. Circuits cannot handle dynamic logic.

  3. For the reader's convenience, we spell out the cost of the EVM circuit here. As stated earlier, circuits are fixed and static, so the EVM circuit must contain all possible logic (making it roughly 10,000 times larger than a circuit containing only add). This means that even if you only want to prove an add, you still bear the cost of all the logic the EVM circuit may contain, i.e., the cost is magnified by a factor of 10,000. And in an execution trace you need to prove a whole chain of opcodes, each carrying that high cost.

  4. The EVM itself is not tightly bound to the Merkle-Patricia Tree (MPT). The MPT is currently only used to store the Ethereum state, and it is easy to replace (there are proposals to replace it with a Verkle tree).

  5. This is a highly simplified abstraction. Technically, the list of "EVM state" is longer: it includes the program counter, remaining gas, the call stack (all of the above, plus the address and static flag, for every call in the stack), the set of logs, and transaction-scoped variables (warm storage slots, the gas refund, self-destructed accounts). We could additionally introduce identifiers for different call contexts to support composability directly.

  6. Because storage is large, we use an accumulator for it. Memory and the stack can be handled with Plookup (which effectively gives us a "RAM").

Original link: https://hackmd.io/@yezhang/S1_KMMbGt

