
This paper analyzes in detail, at the source code level, the vulnerability caused by the incorrect handling of fixed-length uint and bytes32 arrays during ABI re-encoding in the Solidity compiler (0.5.8 <= version < 0.8.16), and proposes relevant solutions and avoidance measures.
Vulnerability details
The ABI encoding format is the standard encoding used when users or contracts call contract functions and pass parameters. For details, refer to the description of ABI encoding in the official Solidity documentation.
During contract development, the required data is read from the calldata sent by a user or another contract, and the data obtained may then be forwarded or emitted. Because EVM opcodes operate only on memory, stack, and storage, any Solidity operation that requires ABI encoding re-encodes the data from calldata in ABI format, in the new parameter order, and stores the result in memory. The process itself has no major logic problem, but when it is combined with Solidity's cleanup mechanism, an omission in the Solidity compiler code introduces a vulnerability.
According to the ABI encoding rules, after removing the function selector, the ABI-encoded data is divided into two parts: head and tail. When the data is a fixed-length uint or bytes32 array, its values are stored directly in the head. Solidity's cleanup mechanism in memory zeroes the next memory slot after the current slot has been written, so that the next slot is not affected by dirty data. In addition, when Solidity ABI-encodes a set of parameters, it encodes them in order from left to right.
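As a rough illustration of the head/tail layout, the sketch below encodes a (bytes[], uint[2]) tuple from memory values so that the result can be inspected word by word. The contract name LayoutDemo, the function referenceEncoding, and the sample values are hypothetical and chosen only for illustration; the comments describe the layout that the ABI rules above prescribe.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity >=0.8.0 <0.9.0;

// Hypothetical helper, for illustration only.
contract LayoutDemo {
    // Encodes (bytes[], uint[2]) from memory values and returns the raw bytes.
    function referenceEncoding() external pure returns (bytes memory) {
        bytes[] memory a = new bytes[](2);
        a[0] = hex"aaaaaa";
        a[1] = hex"bbbbbb";
        uint[2] memory b = [uint(0x11111), uint(0x22222)];

        // Head (3 words): offset of a, then b[0] and b[1] stored in place.
        // Tail: length of a, offsets of a[0] and a[1], then each element
        // as a length word followed by its right-padded bytes.
        return abi.encode(a, b);
    }
}
```

Because the values here are encoded from memory rather than re-encoded from calldata, the output can serve as a reference layout to compare against.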
To make the vulnerability principle easier to explore, consider contract code of the following form:
contract Eocene {
    event VerifyABI(bytes[], uint[2]);
    function verifyABI(bytes[] calldata a, uint[2] calldata b) public {
        emit VerifyABI(a, b); // Event data will be encoded in ABI format and stored on the chain
    }
}
The verifyABI function in contract Eocene only emits the variable-length bytes[] a and the fixed-length uint[2] b from its parameters.
Note that emitting an event also triggers ABI encoding: the parameters a and b are encoded in ABI format and then stored on the chain.
We compile the contract with Solidity v0.8.14, deploy it through Remix, and pass in verifyABI(['0xaaaaaa','0xbbbbbb'],[0x11111,0x22222]).
First, let's take a look at the correct encoding format for this call:
0x52cd1a9c // bytes4(sha3("verify(bytes[], uint[2])"))
0000000000000000000000000000000000000000000000000000000000000060 // index of a
0000000000000000000000000000000000000000000000000000000000011111 // b[0]
0000000000000000000000000000000000000000000000000000000000022222 // b[1]
0000000000000000000000000000000000000000000000000000000000000002 // length of a
0000000000000000000000000000000000000000000000000000000000000040 // index of a[0]
0000000000000000000000000000000000000000000000000000000000000080 // index of a[1]
0000000000000000000000000000000000000000000000000000000000000003 // length of a[0]
aaaaaa0000000000000000000000000000000000000000000000000000000000 // a[0]
0000000000000000000000000000000000000000000000000000000000000003 // length of a[1]
bbbbbb0000000000000000000000000000000000000000000000000000000000 // a[1]
If the Solidity compiler behaved correctly, then when the parameters a and b are recorded on the chain by the event, the data format should be the same as what we sent. Let's actually call the contract and check the log on the chain. If you want to compare for yourself, you can check the TX.
After a successful call, the contract event is recorded as follows:
0000000000000000000000000000000000000000000000000000000000000060 // index of a
0000000000000000000000000000000000000000000000000000000000011111 // b[0]
0000000000000000000000000000000000000000000000000000000000022222 // b[1]
0000000000000000000000000000000000000000000000000000000000000000 // length of a: why did it become 0?
0000000000000000000000000000000000000000000000000000000000000040 // index of a[0]
0000000000000000000000000000000000000000000000000000000000000080 // index of a[1]
0000000000000000000000000000000000000000000000000000000000000003 // length of a[0]
aaaaaa0000000000000000000000000000000000000000000000000000000000 // a[0]
0000000000000000000000000000000000000000000000000000000000000003 // length of a[1]
bbbbbb0000000000000000000000000000000000000000000000000000000000 // a[1]
Surprisingly, right after b[1], the word that stores the length of parameter a has been mistakenly cleared!
Why?
As mentioned earlier, when Solidity encounters a series of parameters that need to be ABI-encoded, it encodes them from left to right. The specific encoding logic for a and b is as follows:
Solidity first ABI-encodes a. According to the encoding rules, the offset of a is placed in the head, and the length and element values of a are stored in the tail.
Solidity then processes b. Because b is of type uint[2], its values are stored directly in the head. However, due to Solidity's cleanup mechanism, after b[1] is written to memory, the memory word that follows b[1] (the word that stores the length of a) is set to 0.
The ABI encoding operation ends, the incorrectly encoded data is stored on-chain, and the SOL-2022-6 vulnerability appears.
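One way to observe the effect without reading event logs is to compare the re-encoding from calldata against an encoding of the same values after they have been copied into memory. The following is a minimal sketch, not taken from the original analysis: the contract name ReencodeCheck and the function encodingsMatch are hypothetical, and it assumes, in line with the steps above, that only the calldata path applies the stray cleanup.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity >=0.8.0 <0.9.0;

// Hypothetical test contract, for illustration only.
contract ReencodeCheck {
    // Returns true when encoding straight from calldata yields the same
    // bytes as encoding memory copies of the same values.
    function encodingsMatch(bytes[] calldata a, uint[2] calldata b)
        external
        pure
        returns (bool)
    {
        bytes[] memory aMem = a; // calldata -> memory copy
        uint[2] memory bMem = b; // calldata -> memory copy

        bytes memory fromCalldata = abi.encode(a, b);
        bytes memory fromMemory = abi.encode(aMem, bMem);
        return keccak256(fromCalldata) == keccak256(fromMemory);
    }
}
```

If the analysis above holds, a build produced by an affected compiler version would be expected to return false for a non-empty a, while a build produced by 0.8.16 or later would be expected to return true.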
At the source code level, the faulty logic is also clear. When a fixed-length bytes32 or uint array is copied from calldata into memory, Solidity always sets the memory word following the copied data to 0. Because ABI encoding has both a head and a tail, and parameters are encoded from left to right, this cleanup can overwrite tail data that has already been written, which is what makes the vulnerability possible.
The Solidity compiler code responsible for the vulnerability is ABIFunctions::abiEncodingFunctionCalldataArrayWithoutCleanup().
This function is entered when the source data is located in calldata and the source type is a byte array, a string, or an array whose base type is uint or bytes32. After entering, fromArrayType.isDynamicallySized() is first used to determine whether the source data is a fixed-length array. Only fixed-length arrays meet the vulnerability triggering conditions.
The result of isByteArrayOrString() is passed to YulUtilFunctions::copyToMemoryFunction(), which uses it to decide whether to clean up the next index position after the calldatacopy operation completes.
Combining the constraints above, the vulnerability can only be triggered when a fixed-length uint or bytes32 array is copied from calldata to memory. Two further trigger conditions follow from the encoding order: the fixed-length array must be the last parameter to be encoded, and at least one parameter before it must store data in the tail.
The reason is straightforward. If the fixed-length array is not the last parameter to be encoded, setting the next memory word to 0 has no effect, because the next encoded parameter overwrites that word. If no parameter before the fixed-length array stores data in the tail, zeroing the following word does not matter either, because that word is not used by the ABI encoding.
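The trigger conditions can be made concrete with a few parameter shapes. The contract below is a sketch only; TriggerShapes and all event and function names in it are hypothetical, and the comments restate the reasoning above rather than results measured on-chain.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity >=0.8.0 <0.9.0;

// Hypothetical examples; all names are illustrative.
contract TriggerShapes {
    event DynThenFixed(bytes[], uint[2]);
    event FixedThenDyn(uint[2], bytes[]);
    event FixedOnly(uint, uint[2]);

    // Meets both conditions: a parameter that uses the tail (bytes[])
    // comes first, and the fixed-length uint array is encoded last.
    function risky(bytes[] calldata a, uint[2] calldata b) external {
        emit DynThenFixed(a, b);
    }

    // The fixed-length array is not last: the zeroed word is overwritten
    // when the next parameter is encoded.
    function reordered(uint[2] calldata b, bytes[] calldata a) external {
        emit FixedThenDyn(b, a);
    }

    // No parameter uses the tail: the zeroed word lies past the encoded data.
    function staticOnly(uint x, uint[2] calldata b) external {
        emit FixedOnly(x, b);
    }
}
```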
In addition, the specific operations involved are as follows:
event
error
abi.encode*
returns // function return values
struct // user-defined structs
all external calls
Solution
When contract code contains any of the affected operations listed above, ensure that the last parameter is not a fixed-length uint or bytes32 array, or compile with Solidity 0.8.16 or later, which is outside the affected version range.
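Applied to the Eocene contract discussed earlier, the avoidance measure could look like the sketch below. The contract name EoceneReordered is hypothetical; the function signature is left unchanged and only the order of the event parameters is swapped so that the fixed-length array is no longer encoded last.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity >=0.8.0 <0.9.0;

// Hypothetical variant of the Eocene contract, for illustration only.
contract EoceneReordered {
    // The fixed-length array now comes first, the dynamic array last.
    event VerifyABI(uint[2], bytes[]);

    function verifyABI(bytes[] calldata a, uint[2] calldata b) public {
        // b is encoded first; the word zeroed after b[1] is the head slot
        // for a's offset, which is written when a is encoded next.
        emit VerifyABI(b, a);
    }
}
```

Note that reordering the parameters changes the event signature, so any off-chain consumer that decodes this event would need to be updated accordingly.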
About Us
At Eocene Research, we provide the insights of intentions and security behind everything you know or don't know of blockchain, and empower every individual and organization to answer complex questions we hadn't even dreamed of back then.