Paradigm: Exploring the relationship between MEV-Boost and consensus mechanisms

DeFi之道

2023-05-01 06:00

This article explores the interplay between mev-boost and consensus, reveals subtleties in Ethereum's proof-of-stake mechanism, and outlines some possible ways forward.

Original title: "Time, slots, and the ordering of events in Ethereum Proof-of-Stake"

Author: Georgios Konstantopoulos, Mike Neuder, Paradigm

Compiled by: wesely

On April 2, a malicious actor on the Ethereum network exploited a vulnerability in mev-boost-relay to steal $20 million from an MEV searcher (see the Flashbots postmortem). Over the next few days, developers addressed the bug by releasing five patches. Combined with existing network latency and validator strategies, these patches contributed to a brief period of instability on the Ethereum network on April 6, during which an unusually large number of blocks were reorganized (reorged). Reorgs are bad for network health because they reduce the block production rate and weaken settlement guarantees.

This article explores the interplay between mev-boost and consensus, reveals subtleties in Ethereum's proof-of-stake mechanism, and outlines some possible ways forward. It was motivated by the attack on the searcher and the period of network instability that followed.

What is mev-boost? Why is it important?

mev-boost is a protocol designed by Flashbots and the community to mitigate the negative impact of maximal extractable value (MEV) on the Ethereum network.

There are three roles in mev-boost:
Relays - Mutually trusted auctioneers that connect proposers to block builders.
Builders - Entities that assemble blocks from the transactions of users, searchers, and other order flow, and bid for the right to have them proposed.
Proposers - Ethereum proof-of-stake validators.

The approximate sequence of events for each block is:
1. Builders create a block from transactions they receive from users, searchers, or other (private or public) order flow.
2. Builders submit the block to the relay.
3. The relay verifies that the block is valid and calculates how much it pays the proposer.
4. The relay sends the "blinded" header and the payment value to the proposer of the current slot.
5. The proposer evaluates all the bids it receives and signs the blinded header associated with the highest payment.
6. The proposer sends the signed header back to the relay.
7. The relay publishes the block through its local beacon node and returns it to the proposer. The builder and proposer are paid through transactions within that block and the block reward.
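
The proposer's side of this exchange (evaluating bids and picking the highest payment) can be sketched as follows. This is a minimal illustration, not the mev-boost API: the `Bid` type, its fields, and `select_best_bid` are hypothetical names, and a plain string stands in for the blinded header.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Bid:
    """A relay's offer to the proposer (hypothetical, simplified shape)."""
    relay: str         # which relay sent the bid
    header_hash: str   # stands in for the blinded header
    value_wei: int     # payment promised to the proposer


def select_best_bid(bids: Sequence[Bid]) -> Optional[Bid]:
    """Return the bid with the highest payment, or None if no bids arrived."""
    if not bids:
        return None
    return max(bids, key=lambda b: b.value_wei)


bids = [
    Bid("relay-a", "0xaaa", 50_000_000_000_000_000),
    Bid("relay-b", "0xbbb", 72_000_000_000_000_000),
]
best = select_best_bid(bids)
# The proposer signs only the header of the winning bid; the block body
# stays with the relay until the signed header comes back.
print(best.relay, best.value_wei)
```

Because the proposer commits to a header without seeing the block body, it cannot copy the builder's transactions before signing, which is the property the relay exists to protect.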

The relay is a mutually trusted third party that facilitates a fair exchange: block space from the proposer in return for MEV-extracting transaction ordering from the builder. The relay protects builders from MEV theft, in which a proposer copies the builder's transactions to capture the MEV itself instead of leaving it to the searcher or builder who discovered it. The relay protects proposers by validating builder blocks, processing hundreds of blocks per slot on their behalf, and ensuring that the proposer payment is accurate.

mev-boost is key protocol infrastructure because it gives all proposers democratic access to MEV without requiring trust relationships with builders or searchers, which contributes to the long-term decentralization of Ethereum.

Ethereum's fork choice rule and mev-boost

Before we dive into the attack and the response, let's take a look at Ethereum's proof-of-stake (PoS) mechanism and its associated fork choice rule. The fork choice rule allows the network to reach consensus on the head of the chain. According to "Ethereum Reorgs After The Merge":

A fork choice rule is a function, evaluated by a client, that takes as input the blocks and other messages it has seen, and outputs to the client what the "canonical chain" is. A fork choice rule is needed because there may be multiple valid chains to choose from (for example, when two competing blocks with the same parent are published at the same time).

One less well-known aspect of the fork choice rule is its relationship to time, which has a significant impact on block production.

Slots and sub-slot periods

In Ethereum PoS, time is divided into 12-second increments called slots. For each slot, the PoS algorithm randomly selects one validator to propose a block; this validator is called the proposer. In the same slot, other validators are assigned to attest: they apply the fork choice rule and vote for the block they see as the head of the chain in their local view. The 12-second slot is divided into three phases of 4 seconds each.

The events that occur in a slot are as follows, where t=0 indicates the start of the slot.

The most critical moment in the slot is the attestation deadline at t=4. If an attesting validator has not seen a block by the attestation deadline, it votes for the previously accepted head of the chain (according to the fork choice rule). The earlier a block is proposed, the more time it has to propagate, and so the more attestations it accumulates (because more validators see it before the attestation deadline).
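
The effect of the deadline at t=4 on how much support a block collects can be sketched as below. This is a simplified model under stated assumptions: `attester_vote` and `attestation_share` are hypothetical helpers, every attester is given equal weight, and the arrival times are invented for illustration.

```python
from typing import Optional, Sequence

SECONDS_PER_SLOT = 12
ATTESTATION_DEADLINE_S = 4.0  # t=4, one third of the way into the slot


def attester_vote(block_arrival_s: Optional[float],
                  new_block: str, previous_head: str) -> str:
    """Head an attester votes for, given when the block reached it.

    block_arrival_s is seconds since the slot started, or None if the
    block never arrived at this attester.
    """
    if block_arrival_s is not None and block_arrival_s <= ATTESTATION_DEADLINE_S:
        return new_block
    return previous_head


def attestation_share(arrivals_s: Sequence[Optional[float]]) -> float:
    """Fraction of (equally weighted) attesters that vote for the new block."""
    if not arrivals_s:
        return 0.0
    votes = [attester_vote(t, "new", "old") for t in arrivals_s]
    return votes.count("new") / len(votes)


# Invented arrival times at four attesters, in seconds into the slot.
on_time = attestation_share([0.5, 1.0, 2.0, 3.9])
late = attestation_share([3.8, 4.2, 4.5, None])
print(on_time, late)
```

In the second case only one of the four attesters sees the block in time, so the block ends the slot with a quarter of the attestation weight even though it is perfectly valid.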

From a network health perspective, the optimal time to publish a block is t=0 (as the specification prescribes). However, since block value increases monotonically over time, proposers have an incentive to delay publishing their blocks to allow more MEV to accumulate. See Timed Games in Proof of Stake and this discussion for further details.

Historically, a proposer could still publish a block after the attestation deadline, or even near the end of the slot, as long as the next proposer observed the block before building the block for the following slot. Because a parent block inherits the weight of its descendants and the fork choice rule terminates at a leaf node, delaying publication had no downside. To push rational behavior (delaying block publication) towards honest behavior (publishing on time), "honest reorgs" were implemented.

Proposer Boost and Honest Reorgs

Two new concepts were introduced into the consensus clients, with key implications for the attestation deadline.
Proposer Boost (PR) - Attempts to minimize balancing attacks by granting the proposer a fork choice "boost" equal to 40% of the total attestation weight. Importantly, this boost lasts for only one slot.

Honest Reorgs (PR) - Build on proposer boost by allowing honest proposers to use it to forcibly reorg blocks with less than 20% attestation weight. This is implemented in Lighthouse and Prysm (since the release of v4.0-Capella). The change is optional because it is a local decision made by the proposer and does not affect validator behavior. As such, there was no coordinated effort to roll it out to all clients simultaneously, nor was it tied to any particular hard fork.

Note that honest reorgs are avoided in some special cases:
1. During epoch boundary blocks
2. If the chain is not finalizing
3. If the head of the chain is not from the slot immediately before the reorging block

Condition 3 ensures that an honest reorg only ever removes a single block from the chain, which acts as a circuit breaker that lets the chain keep producing blocks during periods of extreme network latency. It also reflects the proposer's reduced confidence in its view of the network, since it can no longer be sure that its proposer-boosted block will be considered canonical.
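
The 20% threshold and the three exceptions above can be combined into a single predicate. The sketch below is an illustration, not client code: `HeadView` and `should_attempt_reorg` are hypothetical names, and it assumes an "epoch boundary block" means a proposal in the first slot of an epoch (with 32 slots per epoch).

```python
from dataclasses import dataclass

REORG_THRESHOLD = 0.20   # head blocks below 20% attestation weight are candidates
SLOTS_PER_EPOCH = 32


@dataclass(frozen=True)
class HeadView:
    """The proposer's local view of the current head (hypothetical shape)."""
    head_slot: int               # slot in which the head block was proposed
    head_weight_fraction: float  # share of its slot's attestation weight it received
    chain_is_finalizing: bool    # whether finality is progressing normally


def should_attempt_reorg(view: HeadView, proposal_slot: int) -> bool:
    """Decide whether the proposer of proposal_slot builds on the head's parent."""
    if proposal_slot % SLOTS_PER_EPOCH == 0:
        return False  # case 1: epoch boundary (assumed: first slot of an epoch)
    if not view.chain_is_finalizing:
        return False  # case 2: chain is not finalizing
    if view.head_slot != proposal_slot - 1:
        return False  # case 3: only a head from the previous slot may be removed
    return view.head_weight_fraction < REORG_THRESHOLD


late_head = HeadView(head_slot=100, head_weight_fraction=0.19, chain_is_finalizing=True)
print(should_attempt_reorg(late_head, proposal_slot=101))
```

Ordering the exceptions first means the threshold is only consulted when a reorg is permitted at all, which mirrors the circuit-breaker role of condition 3.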

The diagram below demonstrates how honest behavior changes to implement a reorganization strategy.

In this case, let b1 represent a late-arriving block. Because of the delay, b1 has only 19% of the attestation weight of slot n. The remaining 81% of the attestation weight goes to the parent block HEAD, because many validators did not see b1 before the attestation deadline.

Without honest reorgs, the proposer of slot n+1 considers b1 the head of the chain and builds the child block b2 on top of it. Even though b1 has only 19% of the attestation weight, the proposer makes no effort to reorg it. During slot n+1, b2 has the proposer boost and, assuming it is delivered on time, becomes canonical by accumulating a majority of the attestations for that slot.

With honest reorgs, the situation is quite different. The proposer of slot n+1 now sees that b1's 19% attestation weight is below the reorg threshold, so it builds b2 with HEAD as its parent, forcing a reorg of b1. When the attestation deadline of slot n+1 arrives, honest validators compare the relative weights of b2 (40% from the proposer boost) and b1 (19%). All clients implement proposer boost, so b2 is considered the head of the chain and accumulates the attestations for slot n+1.
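
The weight comparison that attesters make at the deadline of slot n+1 can be written out directly. The sketch below is a simplification that only compares two sibling blocks built on the same parent; `choose_head` is a hypothetical helper, and the real fork choice walks the whole block tree with additional tie-breaking rules.

```python
from typing import Dict, Optional

PROPOSER_BOOST = 0.40  # boost equal to 40% of the total attestation weight


def choose_head(weights: Dict[str, float], boosted_block: Optional[str] = None) -> str:
    """Pick the heavier of several sibling blocks.

    weights maps block name -> fraction of attestation weight it holds.
    boosted_block, if given, is the timely block of the current slot,
    which receives the one-slot proposer boost.
    """
    scores = dict(weights)
    if boosted_block is not None:
        scores[boosted_block] = scores.get(boosted_block, 0.0) + PROPOSER_BOOST
    return max(scores, key=lambda name: scores[name])


# b1 arrived late in slot n and holds 19%; b2 is the timely block of slot n+1.
print(choose_head({"b1": 0.19, "b2": 0.0}, boosted_block="b2"))
# If b2 is itself late, it gets no boost and b1 remains the head.
print(choose_head({"b1": 0.19, "b2": 0.0}))
```

The gap between the 20% reorg threshold and the 40% boost leaves a margin, so a proposer that attempts a reorg on a sub-threshold block can expect its boosted block to outweigh it.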

Relay and beacon node fixes against the unbundling attack

In the April 2 unbundling attack, the proposer exploited a relay vulnerability by sending the relay an invalid signed header. In the following days, the relay and core development teams released a number of software patches to mitigate the risk of a repeat attack. The five main changes were as follows:

1. Relay changes:
Check the database for known malicious proposers (only used in production by the ultra sound relay, and since removed).
Check whether a full block has already been delivered to the P2P network for that slot.
Introduce a uniformly random delay in the range 0-500 ms before publishing a block (since removed from all relays).

2. Beacon node changes (only applicable to the relays' beacon nodes):
Verify the validity of a beacon block before broadcasting it.
Check for equivocations on the network before publishing a block.

The combination of these changes led to consensus instability, which was further exacerbated by the fact that the majority of validators now use the honest reorg strategy described above.

Unintended consequences

Each of the five changes above adds latency to the hot path of block publication on the relay, increasing the probability that a relay block is broadcast after the attestation deadline. The diagram below shows the sequence of these five checks and how the added delays can push block publication past the attestation deadline.

Before these checks were implemented, signed headers arriving well after t=0 (e.g. t=3) were usually not a problem. Relay overhead was very low, so blocks were still published before t=4.

However, with the additional latency introduced by these five patches, relays may now be partially responsible for late broadcasts. Let's look at block publication in the following hypothetical scenario.

The relay receives the signed header from the proposer at t=3. At t=4, the relay is still performing checks, so the broadcast happens after the attestation deadline. In this case, the combination of the proposer sending the signed header late and the relay adding some delay causes the attestation deadline to be missed. Without honest reorgs, such blocks would likely still make it on-chain: as we saw in Figure 2, the honest proposer of the next slot does not deliberately reorg a block just because it arrived late. With honest reorgs, however, missing the attestation deadline means that the block will be reorged by the next proposer.
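
The timing in this scenario is simple addition, sketched below in milliseconds. All delay values are invented for illustration (only the 0-500 ms range of the random delay comes from the text); `publish_time_ms` and `misses_deadline` are hypothetical helpers, and network propagation after the relay publishes is ignored.

```python
from typing import Sequence

ATTESTATION_DEADLINE_MS = 4_000  # t=4 s into the slot


def publish_time_ms(header_arrival_ms: int, check_delays_ms: Sequence[int]) -> int:
    """Time (ms into the slot) at which the relay can broadcast the block."""
    return header_arrival_ms + sum(check_delays_ms)


def misses_deadline(header_arrival_ms: int, check_delays_ms: Sequence[int]) -> bool:
    """True if the block is broadcast only after the attestation deadline."""
    return publish_time_ms(header_arrival_ms, check_delays_ms) > ATTESTATION_DEADLINE_MS


# Hypothetical per-check latencies on the relay's hot path, in ms.
before_patches = [100]                      # low relay overhead
after_patches = [200, 300, 250, 300, 200]   # five added checks, incl. a 250 ms random delay

print(misses_deadline(3_000, before_patches))  # header signed at t=3 s
print(misses_deadline(3_000, after_patches))
```

With the same late header at t=3, the pre-patch relay publishes at 3.1 s, while the patched relay publishes at 4.25 s and misses the deadline, which is what exposes the block to an honest reorg.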

As a result, the number of forked blocks increased dramatically in the days following the attack.

Two weeks of data from Metrika show that in the worst hour, 13 blocks (4.3%) were reorged, about 5 times more than normal. The sharp increase in forked blocks became apparent as the relays rolled out the various changes. Thanks to a huge community effort by relay operators and core developers, many of the changes were rolled back once their impact was understood, and the network returned to a healthy state.

As of today, the most useful changes that remain are the beacon node's block validation and equivocation checks before broadcasting. A malicious proposer can no longer carry out the attack by sending an invalid header to the relay, and the relay's beacon node confirms that it has not seen an equivocating block before publishing. Nonetheless, the relay remains vulnerable to the more general equivocation attack described in "Equivocation attacks in mev-boost and ePBS".
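
The 4.3% figure follows from the slot length. A quick check, assuming the percentage is taken over all slots in one hour:

```python
SECONDS_PER_SLOT = 12

slots_per_hour = 3600 // SECONDS_PER_SLOT   # 300 slots in one hour
reorged_blocks = 13                         # worst hour in the Metrika data
reorg_rate = reorged_blocks / slots_per_hour

print(slots_per_hour)
print(f"{reorg_rate:.1%}")
```

At 12 seconds per slot there are 300 slots in an hour, so 13 reorged blocks correspond to roughly 4.3% of slots.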

So what should we do?

Given this, the research community should evaluate what number of reorgs is "acceptable" and weigh it against the risk posed by equivocation attacks in general, in order to determine whether mitigations need to be implemented.

Additionally, several future directions are being actively explored:
Implement a "headlock" to protect mev-boost from equivocation attacks. This requires changes to the consensus client software and possibly specification changes to extend the attestation deadline.
Increase the number and visibility of bug bounty programs for mev-boost software.
Extend simulation software to explore how sub-slot timing affects network stability. This can be used to assess how reorgs could be reduced by adjusting the attestation deadline.
Optimize the block publication path on the relay to reduce unnecessary latency. This is already being investigated.
Recognize mev-boost as a core protocol feature and enshrine it in the consensus clients as enshrined PBS (ePBS). Two-slot ePBS is vulnerable to equivocation attacks, so a "headlock" may still be needed.
Add more hive and/or specification tests based on latency and attestation deadline issues.
Encourage relay client diversity by building additional implementations of the relay specification.
Consider adjusting equivocation penalties, but keep in mind that even slashing the full 32 ETH may not deter malicious behavior when an extreme MEV opportunity exists.

Overall, we're excited about the renewed energy around MEV and the mev-boost ecosystem. Through the unbundling attack and its mitigations, we have learned how closely latency, mev-boost, and the consensus mechanism are linked; we hope the protocol will continue to harden as a result.