
Table of contents
Interchain messaging
A Simple Gossip Protocol for ICMP Queue Routing
On a new leaf B
On a new chain head announcement from peer k
On a Queue message m from peer k on topic t
Definition of allowed_k(m)
Periodically
Future improvements
XCMP cross-chain messaging
In the Polkadot network, information must be submitted to several kinds of entities. Let us briefly review what these entities are: (1) users, (2) collators, (3) validators
1. User - creates a transaction and submits it to a parachain or the relay chain
2. Collator - attached to a specific parachain. Collators collect and package that parachain's transactions into blocks, produce a validity proof, and submit it to the parachain's validators as part of the next block. The validity steps that follow are part of the basic rules of the Polkadot network, but how transactions are actually ordered is chosen independently by each parachain.
3. Validator - belongs to the relay chain and follows the rules of the Polkadot network. Different subsets of validators are assigned to different parachains to validate those chains, and we therefore call these subsets "parachain validators". Validators also order transaction information and submit it to the relay chain.
Next, we describe the sub-protocols between these entities; each sub-protocol serves the corresponding abstraction in the execution of transactions:
1. Transaction submission
A user needs to discover a relevant, reachable entity in order to submit a transaction.
2. Parachain collation
The parachain's collators collect transactions and package them into blocks. Where the data in those blocks comes from is chosen by each parachain and is outside the scope of the Polkadot network.
3. Parachain block attestation
The collator also produces attestation data and submits it to the parachain's validators. The ultimate goal is for the parachain validators to check efficiently that each parachain block satisfies its on-chain validation function. To produce this data, collators also need data posted back to the relay chain earlier by the parachain validators.
4. Relay-chain protocols
Parachain validators attest to the validity of the parachain blocks submitted to them and distribute these attestations to the other validators to reach consensus. All validators then package the attested blocks and transaction information into relay chain blocks, and finalize those blocks.
5. Inter-chain messaging
After relay chain blocks are finalized and this fact is committed back to a parachain, the parachain checks whether those blocks contain new messages from other parachains and, if so, processes and reacts to them.
6. Parachain synchronisation
When a collator connects for the first time, or reconnects after a long disconnection, it must re-retrieve and verify the latest state of the parachain. This includes provisionally submitted transaction information as well as the latest state of the relay chain.
7. Relay chain synchronisation
The descriptions above are deliberately abstract, and we have tried to keep them valid even as the details change. While writing the software we found that, among the services the Polkadot network layer must provide, sub-protocols (3-5) and 7 remain the most demanding and most complex components, and are expected to be retained.
XCMP
Inter-chain message passing: egress queue data acquisition
Every parachain block in Polkadot produces a possibly empty list of messages to every other parachain. These are called "egress queues". We write E_{x,y}^B for the egress queue from chain x to chain y at block B, and R(E_{x,y}^B) for the root hash of the Merkle Patricia trie built from the message data of E_{x,y}^B.
Pending, unprocessed messages delivered to a chain should be processed in that chain's next block. If the chain produces no new blocks for a period of time, messages may start to pile up.
Collator nodes and full nodes of parachain p have executed all blocks of that parachain, and should therefore know E_{p,y}^B for every block B and every chain y.
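As a rough sketch, the egress data of a single block can be modelled as a map from destination chain to its ordered message list, with a queue-root function standing in for R(E). All names here (ParaId, QueueRoot, queue_root, Egress) are illustrative, and the std hasher replaces the real Merkle Patricia trie root:

```rust
use std::collections::BTreeMap;
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash as StdHash, Hasher};

type ParaId = u32;
type QueueRoot = u64; // illustrative stand-in for a 256-bit hash

// Stand-in for R(E): the real protocol takes the root of a Merkle Patricia
// trie over the message data; here we hash the ordered messages with the
// std hasher, which keeps the property that equal queues have equal roots.
fn queue_root(messages: &[Vec<u8>]) -> QueueRoot {
    let mut h = DefaultHasher::new();
    for m in messages {
        m.hash(&mut h);
    }
    h.finish()
}

// E_{x,y}^B for a fixed sender x at block B: egress queues keyed by the
// destination chain y. Chains with empty queues are simply absent.
type Egress = BTreeMap<ParaId, Vec<Vec<u8>>>;

fn main() {
    let mut egress: Egress = BTreeMap::new();
    egress.insert(2, vec![b"transfer 5 DOT".to_vec()]);
    // Equal queues produce equal roots; different queues (almost surely) differ.
    assert_eq!(queue_root(&egress[&2]), queue_root(&[b"transfer 5 DOT".to_vec()]));
    assert_ne!(queue_root(&egress[&2]), queue_root(&[]));
}
```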
The egress of parachain p at block B is the set of queues
egress_{B,p} = { E_{p,y}^B : y ≠ p }
and the block's set of ingress roots is
{ R(E_{x,p}^B) : x ≠ p }
The total accumulated ingress of block B on parachain p can be defined recursively:
Ingress_{B,p} = Ingress_{parent(B),p} ∪ { (B, ingress_{B,p}) }, with Ingress_{genesis,p} = ∅
where ingress_{B,p} = { E_{x,p}^B : x ≠ p } is the per-block ingress. That is, Ingress_{B,p} collects the ingress to chain p from every block of the chain up to and including B.
A parachain must process ingress_{B,p} after ingress_{parent(B),p}, and every message in ingress_{B,p} must be processed.
Each parachain p has a watermark value, maintained by the relay chain, that reflects the most recently processed ingress. It is initially set to genesis. To define a structure containing all of a parachain's outstanding messages, we introduce the pending ingress, defined recursively:
P_Ingress_{B,p} = ∅ if num(B) ≤ watermark_p, and P_Ingress_{parent(B),p} ∪ { (B, ingress_{B,p}) } otherwise
The pending roots R(P_Ingress_{B,p}) can be computed in the same way as R(Ingress_{B,p}). A parachain candidate is allowed to build on any prefix of P_Ingress_{B,p}.
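The pending-ingress recursion above can be sketched as follows; the map from block number to per-block ingress stands in for walking parent(B) links, and all type names are illustrative rather than the actual runtime definitions:

```rust
use std::collections::BTreeMap;

type ParaId = u32;
type QueueRoot = u64;
type BlockNumber = u64;

// Per-block ingress of parachain p: the queue roots R(E_{x,p}^B) keyed by
// the sending chain x, with empty queues omitted.
type BlockIngress = BTreeMap<ParaId, QueueRoot>;

// Pending ingress P_Ingress_{B,p}: every non-empty per-block ingress above
// the watermark, up to and including block `at`, oldest first.
fn pending_ingress(
    chain: &BTreeMap<BlockNumber, BlockIngress>,
    at: BlockNumber,
    watermark: BlockNumber,
) -> Vec<(BlockNumber, BlockIngress)> {
    chain
        .range(..=at)
        .filter(|(n, ing)| **n > watermark && !ing.is_empty())
        .map(|(n, ing)| (*n, ing.clone()))
        .collect()
}

fn main() {
    let mut chain: BTreeMap<BlockNumber, BlockIngress> = BTreeMap::new();
    chain.insert(1, BTreeMap::from([(7, 0xaa)]));
    chain.insert(2, BTreeMap::new()); // empty: omitted from pending
    chain.insert(3, BTreeMap::from([(9, 0xbb)]));
    // With the watermark at block 1, only block 3's ingress is still pending.
    let pending = pending_ingress(&chain, 3, 1);
    assert_eq!(pending.len(), 1);
    assert_eq!(pending[0].0, 3);
}
```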
All the information the runtime has about a parachain comes from candidate receipts. These are produced by validating parachain candidate blocks and are included in relay chain blocks. Candidates have many fields; the relevant ones here are:
Egress roots: Vec<(ParaId, Hash)>
For a candidate of parachain p included in relay chain block B, the hash paired with each distinct parachain y is R(E_{p,y}^B).
Watermark: a new watermark_p value
The runtime treats the value received from parachain p's most recent candidate as the current one. The watermark in any block containing the candidate must be at least as high as the previous watermark_p value.
(rob: a non-empty pending egress may not be reported as an empty list)
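A minimal sketch of the two fields and the watermark rule, assuming illustrative field and type names rather than the exact runtime definitions:

```rust
type ParaId = u32;
type QueueRoot = u64;
type BlockNumber = u64;

// The two candidate-receipt fields discussed above; names are illustrative.
struct CandidateReceipt {
    // R(E_{p,y}^B) for each destination chain y with a non-empty queue.
    egress_roots: Vec<(ParaId, QueueRoot)>,
    // The new watermark_p proposed by this candidate.
    watermark: BlockNumber,
}

// The runtime rule on watermarks: a candidate may keep or raise the
// watermark, but never lower it.
fn watermark_ok(previous: BlockNumber, candidate: &CandidateReceipt) -> bool {
    candidate.watermark >= previous
}

fn main() {
    let receipt = CandidateReceipt { egress_roots: vec![(2, 0xaa)], watermark: 7 };
    assert!(watermark_ok(7, &receipt)); // unchanged: fine
    assert!(!watermark_ok(8, &receipt)); // moving backwards: rejected
    assert_eq!(receipt.egress_roots.len(), 1);
}
```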
Both validators and collators try to obtain the queue data for block B. A validator simply computes Ingress_{B,p} and watches the propagation pool for the messages that have not yet arrived. The goal of a collator on p building on relay chain parent B is to obtain as long a prefix of P_Ingress_{B,p} as possible.
R(P_Ingress_{B,p}) is available at runtime for every block B and parachain p.
The easiest mechanism is gossip: messages are gossiped from one parachain network to the other. If the two networks share a common node that relays the message, the receiving parachain will receive it. If, however, the validators of the target parachain notice that a message was not propagated on the receiving parachain's network, they re-request it from the validators of the sending parachain and then propagate it on the receiving parachain's network themselves.
For a parachain p and block B, the runtime makes available the pending ingress of that block: a list of ingress lists, each paired with the block number at which it was routed, and each pairing a sending ParaId with the corresponding queue root. R(∅) entries are omitted from the ingress lists, empty lists are omitted entirely, and the result is sorted ascending by block number, where all block numbers are less than num(B) and refer to blocks of the same chain.
fn ingress(B, p) -> Vec<(BlockNumber, Vec<(ParaId, Hash)>)>
Pseudo-code in Rust (TODO: transcribe to LaTeX)
fn egress(B, p) -> Vec<(BlockNumber, Vec<(ParaId, Hash)>)>
Data from the egress queues for a given B and p is likewise available at runtime, subject to the same constraints as the ingress lists with respect to ordering and omission of empty lists. Here ParaId is the receiving chain, whereas in the ingress function it is the sending chain.
In what follows, the "leaves" are the leaf blocks of the relay chain block tree, i.e. its most recent blocks.
We make the following assumptions about nodes:
1. The block state of the leaves is available, but the state of older blocks is not guaranteed to be
2. It can be expected that the collators and full nodes on the parachain will retain all blocks that have been executed on the parachain.
3. Validators are not required to keep any blocks. Note also that the information can be recovered from the erasure-coded pieces kept by validators
We assume we are building on a gossip system in which peers inform each other which leaves they consider best.
This chapter describes a limited gossip protocol for propagating queues of ICMP messages (see Overview for a definition).
fn ingress(B, p) -> Vec<(BlockNumber, Vec<(ParaId, Hash)>)>
and
fn egress(B, p) -> Vec<(BlockNumber, Vec<(ParaId, Hash)>)>
and
BlockNumber -> BlockHash
Given the BlockNumber -> BlockHash mapping for any block B on which ingress has been called, we can easily convert the block numbers in its result to block hashes.
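Assuming the three primitives above, the conversion might look like this; ingress_by_hash and its choice to drop entries with unknown block numbers are assumptions of the sketch, not part of the protocol:

```rust
use std::collections::BTreeMap;

type ParaId = u32;
type QueueRoot = u64;
type BlockHash = u64;
type BlockNumber = u64;

// Re-key the result of ingress(B, p) from block numbers to block hashes
// using the BlockNumber -> BlockHash mapping. Entries whose number the
// mapping does not know are dropped (a choice made for this sketch).
fn ingress_by_hash(
    ingress: Vec<(BlockNumber, Vec<(ParaId, QueueRoot)>)>,
    number_to_hash: &BTreeMap<BlockNumber, BlockHash>,
) -> Vec<(BlockHash, Vec<(ParaId, QueueRoot)>)> {
    ingress
        .into_iter()
        .filter_map(|(n, roots)| number_to_hash.get(&n).map(|h| (*h, roots)))
        .collect()
}

fn main() {
    let numbers = BTreeMap::from([(1u64, 0x1111u64), (2, 0x2222)]);
    let converted =
        ingress_by_hash(vec![(1, vec![(7, 0xaa)]), (2, vec![(9, 0xbb)])], &numbers);
    assert_eq!(converted[0].0, 0x1111);
    assert_eq!(converted[1].0, 0x2222);
}
```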
Messages start in an unrouted state and end in a routed state.
We propose a gossip system whose messages take the form:

struct Queue {
    root: Hash,
    messages: Vec<Message>,
}
We keep the following local state:
1. leaves: the list of up to MAX_CHAIN_HEADS best leaf hashes of our block DAG
2. leaves_k: for each peer k, the list of up to MAX_CHAIN_HEADS best leaf hashes of the block DAG from k's perspective (based on what they have sent us)
3. leafTopics(l) → { queueTopic(h) }: for each leaf l, the topics for the unrouted queue roots h of every parachain at l
4. expectedQueues(t) → H: for each topic t ∈ ∪_{l ∈ leaves} leafTopics(l), the queue root to which t corresponds
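The four items of local state could be laid out as below; names follow the text, and the MAX_CHAIN_HEADS value is purely illustrative:

```rust
use std::collections::{HashMap, HashSet};

type LeafHash = u64;
type Topic = u64;
type QueueRoot = u64;
type PeerId = u32;

// Illustrative bound; the real constant's value is not specified here.
const MAX_CHAIN_HEADS: usize = 5;

// The four pieces of local state listed above. The real implementation's
// layout may differ.
struct LocalState {
    // 1. our best leaves, at most MAX_CHAIN_HEADS of them
    leaves: Vec<LeafHash>,
    // 2. leaves_k: the best leaves each peer k has announced to us
    peer_leaves: HashMap<PeerId, Vec<LeafHash>>,
    // 3. leafTopics(l): topics of the unrouted queue roots at each leaf l
    leaf_topics: HashMap<LeafHash, HashSet<Topic>>,
    // 4. expectedQueues(t): the queue root each live topic should carry
    expected_queues: HashMap<Topic, QueueRoot>,
}

impl LocalState {
    fn new() -> Self {
        LocalState {
            leaves: Vec::with_capacity(MAX_CHAIN_HEADS),
            peer_leaves: HashMap::new(),
            leaf_topics: HashMap::new(),
            expected_queues: HashMap::new(),
        }
    }
}

fn main() {
    let mut state = LocalState::new();
    state.leaves.push(0xabc);
    state.leaf_topics.insert(0xabc, HashSet::from([42]));
    state.expected_queues.insert(42, 0xdead);
    assert_eq!(state.expected_queues.get(&42), Some(&0xdead));
    assert!(state.peer_leaves.is_empty());
}
```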
On a new leaf B
1. Update leaves, leafTopics, and expectedQueues (not benchmarked, but a conservative estimate puts the running time around 100ms)
2. Send our new leaves to peers
3. If we are a collator or full node of parachain p, invoke egress(B, p). For any known message queues whose roots are not yet routed, place the corresponding Queue messages into the propagation pool.
On a new chain head announcement from peer k
1. Update leaves_k
2. For every l ∈ leaves ∩ leaves_k and every t ∈ leafTopics(l), invoke broadcastTopic(k, t)
On a Queue message m from peer k on topic t
We define the function good(m) as the local acceptance criterion: let root be the hash root of m's messages; then good(m) holds exactly when expectedQueues(t) is defined and equals root.
If good(m), we place m in the propagation pool and note k as a good source of m; otherwise we note m as bad from k, which is useful for peer-set cultivation.
Definition of allowed_k(m), for peer k and a Queue message m on topic t
If k sent us m before we managed to send it to them, sending m to k is disallowed. Otherwise, m is allowed for k if
∃ l ∈ leaves ∩ leaves_k such that t ∈ leafTopics(l)
and disallowed in all other cases.
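A sketch of the two predicates, with one simplifying assumption stated up front: the real good(m) recomputes the root from m's messages, while this sketch trusts the root carried in the Queue.

```rust
use std::collections::{HashMap, HashSet};

type Topic = u64;
type QueueRoot = u64;
type LeafHash = u64;

struct Queue {
    root: QueueRoot,
    messages: Vec<Vec<u8>>,
}

// good(m): accept exactly when expectedQueues is defined on t and matches
// m's root. Recomputing the root from m.messages is elided (assumption).
fn good(m: &Queue, t: Topic, expected_queues: &HashMap<Topic, QueueRoot>) -> bool {
    expected_queues.get(&t) == Some(&m.root)
}

// allowed_k(m): a message on topic t may be sent to peer k exactly when
// some leaf in leaves ∩ leaves_k lists t among its topics.
fn allowed(
    t: Topic,
    leaves: &HashSet<LeafHash>,
    leaves_k: &HashSet<LeafHash>,
    leaf_topics: &HashMap<LeafHash, HashSet<Topic>>,
) -> bool {
    leaves
        .intersection(leaves_k)
        .any(|l| leaf_topics.get(l).map_or(false, |ts| ts.contains(&t)))
}

fn main() {
    let m = Queue { root: 0xaa, messages: vec![b"ping".to_vec()] };
    let expected = HashMap::from([(1u64, 0xaau64)]);
    assert!(good(&m, 1, &expected));
    assert!(!good(&m, 2, &expected)); // unknown topic: rejected

    let leaves = HashSet::from([10u64, 11]);
    let leaves_k = HashSet::from([11u64, 12]);
    let leaf_topics = HashMap::from([(11u64, HashSet::from([1u64]))]);
    assert!(allowed(1, &leaves, &leaves_k, &leaf_topics)); // shared leaf 11 lists topic 1
    assert!(!allowed(9, &leaves, &leaves_k, &leaf_topics));
}
```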
Periodically
We mark every topic not in expectedQueues as expired and purge its messages from the propagation pool. In practice this is done every few seconds, which prevents the propagation pool from growing without bound.
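The periodic cleanup amounts to a retain over the pool; representing the pool as a map from topic to pooled queue roots is a simplification made for this sketch:

```rust
use std::collections::{HashMap, HashSet};

type Topic = u64;
type QueueRoot = u64;

// Periodic cleanup: drop every pooled entry whose topic no longer appears
// in expectedQueues, so the propagation pool cannot grow without bound.
fn purge_expired(
    pool: &mut HashMap<Topic, Vec<QueueRoot>>,
    expected_topics: &HashSet<Topic>,
) {
    pool.retain(|t, _| expected_topics.contains(t));
}

fn main() {
    let mut pool = HashMap::from([(1u64, vec![0xaau64]), (2, vec![0xbb])]);
    let live = HashSet::from([1u64]);
    purge_expired(&mut pool, &live);
    assert!(pool.contains_key(&1)); // still expected: kept
    assert!(!pool.contains_key(&2)); // expired: purged
}
```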
The decision to propagate unrouted messages only to peers who share our view of the current leaves may seem controversial, but the premises we started from support it quite well.
First, we do not want a node to have to process an unbounded number of topics: queueTopic(H) for H unknown to the node cannot be tracked, because there are infinitely many such H. Second, nodes should not have to do much work to decide whether to propagate a message to a particular peer. Suppose leaves ∩ leaves_k = ∅, but some entries of leaves_k are ancestors of entries of leaves. We would have to perform an O(n) ancestry check for every l ∈ leaves_k. We would then have to work out whether a given message was still unrouted at that earlier block. Naively, a message that is still unrouted at a later block of the same chain was also unrouted at the earlier one; this is probably true, but the chain state may have changed under a reorganization.
Since we assume that the chain state of older blocks is unavailable, we have no good way to determine whether we should actually send an egress to a peer at such an older block. In the future improvements section we discuss relaxing this restriction by extending it to a fixed number of ancestors.
Nevertheless, propagating only to peers that are synchronized to the same chain heads is reasonable under the following assumptions (somewhat empirical but plausible, and likely overestimates):
1. New valid blocks are issued at an average interval of at least 5 seconds (our actual target is 10-15 seconds)
2. Blocks propagate through the "useful" part of the gossip network within 2 seconds
3. Adjacent peers in the gossip network have a latency of at most 500ms
4. Propagating messages meaningfully before being synced to the head of the DAG is probably not worthwhile
The actual values will almost certainly be better. The good news is that not every message has to propagate within a single block time: as time passes, it becomes increasingly likely that participants have obtained earlier messages. This is a scheme in which all messages are seen by all participants; it will almost certainly not scale beyond a handful of initial chains, but it will serve as a functional starter protocol.
Overview of XCMP Routing
To send a message from one parachain (the sending parachain) to another (the receiving parachain), the following steps are performed.
Gossiping the message is sufficient whenever some full node of the sending parachain is also part of the receiving parachain's network.
If the receiving parachain's parachain validators (PVs) do not observe the message being propagated, they request it directly from the sending parachain's PVs, who are responsible for keeping the message available. The sending parachain's PVs send the message directly to the receiving parachain's PVs, and finally the receiving PVs gossip the message on the receiving parachain's network.
Future improvements
1. The section above describes why propagating egress queues arbitrarily far back through the peer layer is a bad idea, but once we are synced and following normal block production, we can reasonably track the last α ancestors of all leaves. A reasonable initial value is α = 1, i.e. tracking only the parents. This might get us 90% of the gain we need, simply because there is a "stutter" whenever leaf-set intersection is required and two peers need to update each other about new leaves.
2. Extending the definition of E_{x,y}^B could allow chains to police one another. For example, parachain y could tell the relay chain to stop routing messages from x at block B (and then tell it to start routing again at block B'). For any block b between B and B', we would have E_{x,y}^b = ∅ by disregarding x's entry for b in the CandidateReceipt. In practice, since the runtime only handles the queue roots, it would simply ignore R(E_{x,y}^b) from the candidate receipt and treat it as R(∅)
3. Scale to support smarter topologies where not everyone sees everything. Perhaps having two kinds of topics, (B, chain_from)-based topics and (B, chain_to)-based topics, would improve the usability of the topic system.
4. Use some kind of smart set reconciliation (e.g. https://github.com/sipa/minisketch) to minimize gossip propagation bandwidth.
5. Incentivize distribution in a manner similar to probabilistic micropayments.
6. Parachain validator nodes hold the egress queue until the messages are confirmed to have been included. This is the fallback for when the two parachain networks have no full nodes in common and messages do not arrive via gossip: the receiving parachain's validator nodes notice the missing messages and request them from the sending chain's validator nodes. Once received, the messages are propagated through the receiver's parachain network.
All the information the relay chain's business logic has about parachains comes in the form of CandidateReceipts. Block authors can include at most one CandidateReceipt per parachain per block (in practice, only receipts attested by a number of validators, though that assumption is not relevant here).