ERC 4337: Account abstraction without changing the Ethereum protocol

DAOrayaki

2021-12-23 12:42

DAOrayaki DAO Research Bonus Pool:

Voting progress: DAO Committee 3/7 passed

Funding address: DAOrayaki.eth

Total bounty: 80 USDC

Types of research: DAO, ERC4337, account abstraction, smart wallet

Original Author: Vitalik Buterin

Contributors: Dewei, DAOctor @DAOrayaki

Original text: ERC 4337: account abstraction without Ethereum protocol changes

Account abstraction has long been a dream of the Ethereum developer community. Instead of EVM code being used only to implement application logic, it would also implement the verification logic (nonces, signatures...) of individual users' wallets. This would open the door to creativity in wallet design, allowing wallets to provide important features such as:

1. Multisig and social recovery

2. More efficient and simpler signature algorithms (such as Schnorr, BLS)

3. Post-quantum safe signature algorithms (e.g. Lamport, Winternitz)

All of these things can be done with smart contract wallets today, but the Ethereum protocol itself requires everything to be packaged in a transaction originating from an ECDSA-secured externally owned account (EOA), which makes this very difficult. Every user operation needs to be wrapped by a transaction from an EOA, adding 21000 gas of overhead. Users must either hold ETH in a separate EOA to pay for gas and manage balances in two accounts, or rely on a relay system, which is typically centralized.

EIP 2938 is one way to solve this problem: it introduces Ethereum protocol changes that allow top-level Ethereum transactions to originate from a contract instead of an EOA. The contract itself would contain the verification and fee payment logic, which miners would check. However, this requires significant protocol changes at a time when protocol developers are focused heavily on the merge and scalability. In our new proposal (ERC 4337), we provide a way to achieve the same benefits without changing the consensus layer protocol.

How does this proposal work?

Instead of modifying the logic of the consensus layer itself, we replicate the functionality of the transaction mempool in a higher-level system. Users send UserOperation objects, which package the user's intent together with a signature and other data for verification. Either miners or bundlers using services such as Flashbots can package a set of UserOperation objects into a single "bundle transaction", which is then included in an Ethereum block.

The bundler pays the fee for the bundle transaction in ETH and is compensated through the fees paid as part of each individual UserOperation execution. Bundlers choose which UserOperation objects to include using fee prioritization logic similar to how miners operate in the existing transaction mempool. A UserOperation looks like a transaction; it is an ABI-encoded structure that includes the following fields:

1. sender: the wallet making the operation

2. nonce and signature: parameters passed to the wallet's verification function so that the wallet can verify the operation

3. initCode: the init code used to create the wallet if it does not exist yet

4. callData: the data used to call the wallet in the actual execution step

The remaining fields relate to gas and fee management; a complete list of fields can be found in the ERC 4337 specification.
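As a rough illustration of the shape of such an object, here is a minimal Python sketch. It is not the ABI layout from the specification: the snake_case names, the default values, and the exact set of gas and fee fields (verification_gas, max_fee_per_gas, max_priority_fee_per_gas) are assumptions made for this example, and the authoritative field list is the one in the ERC 4337 specification.

```python
from dataclasses import dataclass

# The zero address is used here to mean "no paymaster set".
ZERO_ADDRESS = "0x" + "00" * 20


@dataclass(frozen=True)
class UserOperation:
    """Illustrative stand-in for a UserOperation, not the real ABI struct."""

    sender: str        # wallet making the operation
    nonce: int         # passed to the wallet's verification function
    init_code: bytes   # used to create the wallet if it does not exist
    call_data: bytes   # data the wallet is called with during execution
    signature: bytes   # passed to the wallet's verification function
    # Gas and fee fields: names and defaults are assumptions for this sketch.
    verification_gas: int = 100_000
    max_fee_per_gas: int = 0
    max_priority_fee_per_gas: int = 0
    paymaster: str = ZERO_ADDRESS

    def has_paymaster(self) -> bool:
        # A non-zero paymaster address means the operation is sponsored.
        return self.paymaster != ZERO_ADDRESS
```

Making the object immutable (frozen=True) reflects the idea that a signed operation should not change after the user has signed it.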

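A wallet is a smart contract that is required to have two functions:

1. validateUserOp, which takes a UserOperation as input. This function verifies the signature and nonce on the UserOperation, pays the fee and increments the nonce if verification succeeds, and throws an exception if verification fails.

2. An op execution function, which interprets calldata as an instruction for the wallet to take action. How this function interprets the calldata, and what it does as a result, is completely open-ended; but we expect the most common behavior to be parsing the calldata as an instruction for the wallet to make one or more calls.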

To simplify the wallet's logic, most of the complicated smart contract trickery needed to ensure safety is done not in the wallet itself, but in a global contract called the entry point. The validateUserOp and execution functions are expected to be gated with require(msg.sender == ENTRY_POINT), so only the trusted entry point can cause a wallet to perform any action or pay fees. The entry point only makes an arbitrary call to a wallet after validateUserOp, called with the UserOperation carrying that calldata, has already succeeded, so this is enough to protect wallets from attack. The entry point is also responsible for creating a wallet using the provided initCode if the wallet does not exist yet.
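The division of labor between wallet and entry point can be sketched in plain Python. This is a toy model under several loud assumptions: an HMAC tag stands in for a real signature scheme, fee payment is omitted, the wallet is "created" from initCode by a caller-supplied init_wallet callable, and all class and function names other than validateUserOp are hypothetical.

```python
import hashlib
import hmac


def sign(key: bytes, nonce: int, call_data: bytes) -> bytes:
    # Assumption: an HMAC tag stands in for a real signature.
    msg = nonce.to_bytes(32, "big") + call_data
    return hmac.new(key, msg, hashlib.sha256).digest()


class SimpleWallet:
    """Toy wallet with a validation function and an execution function."""

    def __init__(self, entry_point, key: bytes):
        self.entry_point = entry_point
        self.key = key
        self.nonce = 0
        self.executed = []

    def _require_entry_point(self, caller) -> None:
        # Mirrors require(msg.sender == ENTRY_POINT).
        if caller is not self.entry_point:
            raise PermissionError("caller is not the entry point")

    def validate_user_op(self, op: dict, caller) -> None:
        self._require_entry_point(caller)
        expected = sign(self.key, op["nonce"], op["call_data"])
        valid = hmac.compare_digest(expected, op["signature"])
        if op["nonce"] != self.nonce or not valid:
            raise ValueError("validation failed")
        self.nonce += 1  # fee payment is omitted in this sketch

    def execute(self, call_data: bytes, caller) -> None:
        self._require_entry_point(caller)
        self.executed.append(call_data)  # "interpret" calldata by recording it


class EntryPoint:
    """Toy entry point: create the wallet if needed, validate, then execute."""

    def __init__(self):
        self.wallets = {}

    def handle_op(self, op: dict, init_wallet):
        wallet = self.wallets.get(op["sender"])
        if wallet is None:
            # Hypothetical stand-in for deploying the wallet from initCode.
            wallet = init_wallet(self, op["init_code"])
            self.wallets[op["sender"]] = wallet
        wallet.validate_user_op(op, caller=self)
        # Execution is only reached if validation did not raise.
        wallet.execute(op["call_data"], caller=self)
        return wallet
```

In this model a direct call to execute from any object other than the entry point raises PermissionError, and replaying an already-used operation fails validation because the nonce has moved on.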

Runtime entry point control flow

If the simulated validation of a UserOperation succeeds, the UserOperation is guaranteed to be includable until the sender account undergoes some other internal state change (either because of another UserOperation with the same sender, or because another contract calls into the sender; in either case, triggering this condition for one account requires spending 7500+ gas on-chain).

Additionally, a UserOperation specifies a gas limit for the validateUserOp step, and mempool nodes and bundlers will reject it unless this gas limit is very small (e.g. below 200000). These restrictions replicate the key properties of existing Ethereum transactions that keep the mempool safe from DoS attacks. Bundlers and mempool nodes can use logic similar to today's Ethereum transaction processing to determine whether to include or forward a UserOperation.
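A mempool node's admission decision, as described above, might look roughly like the following Python sketch. The verification_gas key, the should_accept function, and the simulate_validation callback are hypothetical names for this example; the 200000 bound is the example figure quoted in the text, not a normative constant.

```python
# Example bound taken from the text above; real nodes may choose differently.
MAX_VERIFICATION_GAS = 200_000


def should_accept(op: dict, simulate_validation) -> bool:
    """Decide whether to include or forward an operation.

    `simulate_validation` is an assumed callback that runs the wallet's
    validation off-chain and raises an exception if it fails.
    """
    # Reject operations whose validation step is allowed to use too much gas.
    if op["verification_gas"] >= MAX_VERIFICATION_GAS:
        return False
    try:
        simulate_validation(op)
    except Exception:
        # Validation failed in simulation: neither include nor forward.
        return False
    return True
```

The cheap gas-limit check runs first so that a node never spends simulation effort on an operation it would reject anyway.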

What properties does this design add, maintain and sacrifice compared to the regular Ethereum transaction mempool?

Properties maintained:

1. No centralized actors; everything is done through a peer-to-peer mempool

2. DoS safety (a UserOperation that passes the simulation checks is guaranteed to be includable until the sender has another state change, which would require an attacker to pay 7500+ gas per sender)

3. No user-side wallet setup complexity: users do not have to care whether their wallet contract has already been "published"; the wallet exists at a deterministic CREATE2 address, and if the wallet does not exist yet, the first UserOperation creates it automatically

4. Full EIP 1559 support, including simple fee setting (users can set a fixed fee premium and a high maximum total fee, and expect to be included quickly and charged fairly)

5. Ability to replace by fee: sending a new UserOperation with a significantly higher premium than the old one, either to replace the operation or to get it included faster
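The fee-related properties in this list can be made concrete with a small Python sketch. The first function follows the standard EIP 1559 rule for the price actually charged, assuming the maximum fee is at least the base fee. The second is only an assumed replacement policy: the 10% minimum bump is a hypothetical threshold chosen for illustration, not a value from the proposal.

```python
def effective_gas_price(base_fee: int, max_fee_per_gas: int,
                        max_priority_fee_per_gas: int) -> int:
    """Price per gas under EIP 1559: base fee plus premium, capped by max fee."""
    return min(max_fee_per_gas, base_fee + max_priority_fee_per_gas)


def can_replace(old_premium: int, new_premium: int,
                min_bump_percent: int = 10) -> bool:
    """Assumed policy: a replacement must raise the premium by a minimum percentage."""
    return new_premium * 100 >= old_premium * (100 + min_bump_percent)


# With a base fee of 30 and a premium of 2, the user pays 32 per gas
# even though the maximum total fee was set much higher.
print(effective_gas_price(base_fee=30, max_fee_per_gas=100,
                          max_priority_fee_per_gas=2))
```

Setting a high maximum total fee with a fixed premium is what lets a user expect fast inclusion without overpaying when the base fee is low.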

New benefits:

1. Verification logic flexibility: the validateUserOp function can add arbitrary signature and nonce verification logic (new signature schemes, multisig...)

3. Wallet upgradeability: wallet verification logic can be stateful, so a wallet can change its public key or (if deployed using DELEGATECALL) completely upgrade its code.

4. Execution logic flexibility: wallets can add custom logic for the execution step, e.g. performing atomic multi-operations (a key goal of EIP 3074)

Drawbacks:

1. Slightly increased DoS vulnerability despite the protocol's best efforts, simply because the verification logic is allowed to be somewhat more complex than the status quo of a single ECDSA verification.

2. Gas overhead: Slightly more gas overhead than regular transactions (although in some use cases this is offset by multi-operation support).

3. One transaction at a time: accounts cannot queue up and send multiple transactions to the mempool. However, the ability to perform atomic multi-operations makes this functionality much less necessary.

Sponsorship with paymasters:

Sponsored transactions have a number of key use cases. The most commonly cited desired use cases are:

2. Allowing users to pay fees in ERC20 tokens, with a contract acting as an intermediary that collects the ERC20 tokens and pays in ETH

The proposal can support this functionality through a built-in paymaster mechanism. A UserOperation can set another address as its paymaster. If the paymaster is set (i.e. non-zero), then during the verification step the entry point also calls the paymaster to verify that the paymaster is willing to pay for the UserOperation. If it is, the fee is deducted from the paymaster's ETH staked inside the entry point (with a withdrawal delay for security) rather than from the wallet. During the execution step, the wallet is called with the calldata in the UserOperation as usual, but after that the paymaster is called with postOp.

Example workflows for the above two use cases are:

2. The paymaster checks that the sender's wallet has enough ERC20 balance to pay for the UserOperation. If so, the paymaster accepts and pays the ETH fee, then claims the ERC20 tokens as compensation in postOp (if postOp fails because the UserOperation drained the ERC20 balance, the execution is reverted and postOp is called again, so the paymaster always gets paid). Note that for now, this can only be done if the ERC20 is a wrapper token managed by the paymaster itself.

Note in particular that in the second case the paymaster can be purely passive, apart from occasional rebalancing and parameter resetting. This is a drastic improvement over existing sponsorship attempts, which required the paymaster to be always online to actively wrap individual transactions.
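The ERC20 workflow described above, including the "revert and call postOp again" rule, can be sketched in Python as follows. This is a simplified model rather than the entry point's real logic: the TokenPaymaster class, the run_sponsored_op function, and the in-memory token_balances dictionary (standing in for a wrapper token managed by the paymaster) are all assumptions made for this example.

```python
class TokenPaymaster:
    """Toy paymaster that also manages the wrapper token's balances."""

    def __init__(self, stake: int):
        self.stake = stake          # ETH staked inside the entry point
        self.token_balances = {}    # wrapper-token balances per sender
        self.collected = 0          # tokens claimed as compensation

    def validate(self, sender: str, token_fee: int) -> bool:
        # Willing to pay only if the sender can cover the fee in tokens.
        return self.token_balances.get(sender, 0) >= token_fee

    def post_op(self, sender: str, token_fee: int) -> None:
        if self.token_balances.get(sender, 0) < token_fee:
            raise RuntimeError("insufficient token balance")
        self.token_balances[sender] -= token_fee
        self.collected += token_fee


def run_sponsored_op(paymaster, sender, eth_fee, token_fee, execute) -> str:
    """`execute` is an assumed callback that may modify token balances."""
    if not paymaster.validate(sender, token_fee):
        return "rejected"
    paymaster.stake -= eth_fee                  # fee comes out of the stake
    snapshot = dict(paymaster.token_balances)   # state before execution
    execute(paymaster.token_balances)
    try:
        paymaster.post_op(sender, token_fee)
        return "executed"
    except RuntimeError:
        # The operation drained the balance: revert the execution and
        # call postOp again so the paymaster still gets paid.
        paymaster.token_balances = snapshot
        paymaster.post_op(sender, token_fee)
        return "reverted"
```

Because validation already confirmed a sufficient balance before execution, the second postOp call on the restored state succeeds, which is why the paymaster never has to be online to react.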

How far along is this proposal?
