
Part 1 Preface
Imagine that you are driving fast in a 1969 Mustang Mach 1 on the roads of the western United States. The sun glints off the gorgeous gold-plated rims, there is nothing on the road but you and the desert, and the endless horizon watches you chase the setting sun...
Just as you are feeling relaxed and happy, there is a loud bang. Your 335-horsepower steed is engulfed in billowing white smoke, as if it had turned into a steam locomotive, and you are forced to pull over.
You go to see what is wrong, but when you lift the hood you cannot make sense of anything. You have no idea how the damn machine works, so you pick up your phone to call for help, only to find that there is no signal nearby...
Does this sound like the DApp development you are doing? In this analogy, the luxury car is your smart contract, and the rims and custom modifications are the small details you thought through so carefully. But once something goes wrong, you have to look for the answer in the contract's EVM bytecode, and in most cases you have no idea what happened.
If you are a DApp developer and have found yourself in this awkward situation, you don't have to worry anymore!
The purpose of this series of articles is to deconstruct a simple Solidity contract, look at its bytecode, and break it down into recognizable structures, all the way to the lowest level. We are going to pop the hood of the Solidity sports car. By the end of this series, you should feel comfortable reading and debugging EVM bytecode. The series focuses on demystifying the EVM bytecode generated by the Solidity compiler, which is really much simpler than it looks.
The following is the smart contract code we will use when deconstructing:
pragma solidity ^0.4.24;

contract BasicToken {
    uint256 totalSupply_;
    mapping(address => uint256) balances;

    constructor(uint256 _initialSupply) public {
        totalSupply_ = _initialSupply;
        balances[msg.sender] = _initialSupply;
    }

    function totalSupply() public view returns (uint256) {
        return totalSupply_;
    }

    function transfer(address _to, uint256 _value) public returns (bool) {
        require(_to != address(0));
        require(_value <= balances[msg.sender]);
        balances[msg.sender] = balances[msg.sender] - _value;
        balances[_to] = balances[_to] + _value;
        return true;
    }

    function balanceOf(address _owner) public view returns (uint256) {
        return balances[_owner];
    }
}
Compile the contract
To compile the contract, we will use Remix (address: https://remix.ethereum.org).
When you open the Remix compiler, click the + button in the upper left corner above the file browser area to create a new smart contract. Set the filename to BasicToken.sol. Once created, paste the above code into the editor.
On the right, open the Settings tab and make sure "Enable Optimization" is checked. Also make sure that the selected Solidity compiler version is "version:0.4.24+commit.e67f0147.Emscripten.clang".
These two details are very important; otherwise the bytecode you see will not match the bytecode discussed in this article.
Next, go to the Compile tab and click the Details button. A popup shows everything generated by the Solidity compiler, including a JSON object called BYTECODE. Its "object" property is the compiled contract code, shown below:
608060405234801561001057600080fd5b5060405160208061021783398101604090815290516000818155338152600160205291909120556101d1806100466000396000f3006080604052600436106100565763ffffffff7c010000000000000000000000000000000000000000000000000000000060003504166318160ddd811461005b57806370a0823114610082578063a9059cbb146100b0575b600080fd5b34801561006757600080fd5b506100706100f5565b60408051918252519081900360200190f35b34801561008e57600080fd5b5061007073ffffffffffffffffffffffffffffffffffffffff600435166100fb565b3480156100bc57600080fd5b506100e173ffffffffffffffffffffffffffffffffffffffff60043516602435610123565b604080519115158252519081900360200190f35b60005490565b73ffffffffffffffffffffffffffffffffffffffff1660009081526001602052604090205490565b600073ffffffffffffffffffffffffffffffffffffffff8316151561014757600080fd5b3360009081526001602052604090205482111561016357600080fd5b503360009081526001602081905260408083208054859003905573ffffffffffffffffffffffffffffffffffffffff85168352909120805483019055929150505600a165627a7a72305820a5d999f4459642872a29be93a490575d345e40fc91a7cccb2cf29c88bcdaf3be0029
Deploy the contract
Next, go to the Run tab in Remix. First, make sure you are using the JavaScript VM environment. This is essentially an embedded JavaScript EVM plus network, an ideal Ethereum playground. Make sure BasicToken is selected in the contract drop-down and enter the number 10000 in the input box next to Deploy. Then click the "Deploy" button. This deploys an instance of the BasicToken smart contract with an initial supply of 10,000 tokens, all of which are held by the account currently selected in the account drop-down at the top.
Under "Deployed Contracts" in the Run tab, you can see the deployed smart contract with three fields for interacting with it: transfer, balanceOf, and totalSupply. Here we can interact with the contract instance we just deployed.
But before that, let's take a look at what deploying a contract actually means:
In the console area at the bottom of the page, you can see the log "creation of BasicToken pending..." followed by a transaction entry with various fields: from, to, value, data, logs and hash. Click this entry to expand the transaction details, and you should see that the transaction's input data contains the bytecode shown above. This transaction creates a smart contract instance, which has its own address and code.
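As a rough illustration of what that input data looks like, the sketch below assumes the usual Solidity convention that constructor arguments are ABI-encoded and appended to the creation bytecode. The helper name `encode_uint256` and the shortened `CREATION_CODE_PREFIX` are hypothetical and exist only for this example; the real input starts with the full bytecode shown earlier.

```python
# Sketch only: how the constructor argument 10000 would be appended to the
# creation bytecode, assuming standard ABI encoding of a single uint256.

def encode_uint256(value):
    """Encode an unsigned integer as a 32-byte big-endian hex string."""
    if not 0 <= value < 2**256:
        raise ValueError("value does not fit in a uint256")
    return value.to_bytes(32, "big").hex()

# Hypothetical stand-in: only the first five bytes of the real bytecode.
CREATION_CODE_PREFIX = "6080604052"

encoded_supply = encode_uint256(10000)
input_data = CREATION_CODE_PREFIX + encoded_supply

print(encoded_supply)
```

The encoded argument is 64 hex characters long and ends in 2710, which is 10000 in hexadecimal.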
Disassemble the bytecode
In the console, to the right of the transaction entry, there is a "Debug" button. Click it to activate the Debugger tab in the right-hand area of Remix. Let's look at the Instructions section together. Scrolling through it, you should see the following:
000 PUSH1 80
002 PUSH1 40
004 MSTORE
005 CALLVALUE
006 DUP1
007 ISZERO
008 PUSH2 0010
011 JUMPI
012 PUSH1 00
014 DUP1
015 REVERT
016 JUMPDEST
017 POP
018 PUSH1 40
020 MLOAD
021 PUSH1 20
023 DUP1
024 PUSH2 0217
027 DUP4
028 CODECOPY
029 DUP2
030 ADD
031 PUSH1 40
033 SWAP1
034 DUP2
035 MSTORE
036 SWAP1
037 MLOAD
038 PUSH1 00
040 DUP2
041 DUP2
042 SSTORE
043 CALLER
044 DUP2
045 MSTORE
046 PUSH1 01
048 PUSH1 20
050 MSTORE
051 SWAP2
... (truncated)
To make sure nothing has gone wrong, compare what you see in your own Remix debugger with the listing above.
This is the disassembled bytecode of the contract. If you scan the raw bytecode byte by byte (two hex characters at a time), each instruction byte identifies a specific opcode, which is associated with a specific operation. For example:
0x60 => PUSH1
0x01 => ADD
0x02 => MUL
0x00 => STOP
...
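To make the byte-to-opcode mapping concrete, here is a minimal disassembler sketch. It is not how Remix does it; it only knows the handful of opcodes that appear in the listing above, and the names `OPCODES`, `opcode_name`, `disassemble`, and `PREFIX` are made up for this example. `PREFIX` is the first 18 bytes of the bytecode shown earlier.

```python
# Minimal sketch of a disassembler covering only the opcodes in the listing.
OPCODES = {
    0x00: "STOP", 0x01: "ADD", 0x02: "MUL", 0x15: "ISZERO",
    0x33: "CALLER", 0x34: "CALLVALUE", 0x39: "CODECOPY",
    0x50: "POP", 0x51: "MLOAD", 0x52: "MSTORE", 0x55: "SSTORE",
    0x57: "JUMPI", 0x5B: "JUMPDEST", 0xFD: "REVERT",
}

def opcode_name(byte):
    """Return the mnemonic for one opcode byte."""
    if 0x60 <= byte <= 0x7F:   # PUSH1 .. PUSH32
        return f"PUSH{byte - 0x5F}"
    if 0x80 <= byte <= 0x8F:   # DUP1 .. DUP16
        return f"DUP{byte - 0x7F}"
    if 0x90 <= byte <= 0x9F:   # SWAP1 .. SWAP16
        return f"SWAP{byte - 0x8F}"
    return OPCODES.get(byte, f"UNKNOWN_{byte:02x}")

def disassemble(hex_code):
    """Turn a hex string into lines of the form 'offset OPCODE operand'."""
    code = bytes.fromhex(hex_code)
    lines = []
    pc = 0
    while pc < len(code):
        byte = code[pc]
        # PUSHn is followed by n bytes of data that are not opcodes.
        width = byte - 0x5F if 0x60 <= byte <= 0x7F else 0
        operand = code[pc + 1 : pc + 1 + width].hex()
        lines.append(f"{pc:03d} {opcode_name(byte)} {operand}".rstrip())
        pc += 1 + width
    return lines

PREFIX = "608060405234801561001057600080fd5b50"
for line in disassemble(PREFIX):
    print(line)
```

Note that the number at the start of each line is the byte offset of the instruction, which is why the listing jumps from 000 to 002 after a PUSH1: the pushed value occupies the byte in between.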
Opcode
Before starting to deconstruct the smart contract code, you will need a basic toolset for understanding individual opcodes, such as PUSH, ADD, SWAP, and DUP. In the end, all an opcode can do is push items to, or consume items from, the EVM's stack, its memory, or the contract's storage.
To see all the opcodes the EVM can handle, you can check Pyethereum, which contains a list of opcodes. To understand how each opcode works, Solidity's official assembly documentation is also a good reference. Even though it is not a one-to-one match with the raw opcodes, it is pretty close (it actually describes Yul, an intermediate language between Solidity and EVM bytecode). If you are comfortable with dense technical documents, you can also read the Ethereum Yellow Paper. Between them, these sources cover everything you need.
That is a lot of recommended reading, but there is no point in going through these resources from cover to cover right now. Just remember that they exist, and we will consult them when needed.
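The toy interpreter below sketches that idea for a few opcodes. It is a simplified model and not the real EVM: memory is represented as a dictionary keyed by offset instead of a byte array, gas is ignored, and the function name `run` is invented for this example.

```python
# Toy model of a few EVM opcodes acting on a stack and memory.
# Assumption: memory is a dict of offset -> word, which is a simplification.

def run(program):
    """Execute a list of (opcode, *args) tuples and return (stack, memory)."""
    stack, memory = [], {}
    for op, *args in program:
        if op == "PUSH1":
            stack.append(args[0])              # push a one-byte value
        elif op == "ADD":
            a, b = stack.pop(), stack.pop()    # consume two items
            stack.append((a + b) % 2**256)     # push the wrapped sum
        elif op == "DUP1":
            stack.append(stack[-1])            # duplicate the top item
        elif op == "SWAP1":
            stack[-1], stack[-2] = stack[-2], stack[-1]
        elif op == "MSTORE":
            offset, value = stack.pop(), stack.pop()
            memory[offset] = value             # write a word to memory
        else:
            raise ValueError(f"unsupported opcode: {op}")
    return stack, memory

# The first three instructions of the disassembly above.
stack, memory = run([("PUSH1", 0x80), ("PUSH1", 0x40), ("MSTORE",)])
print(stack, memory)
```

In this model, the first three instructions of the contract leave the stack empty and store the value 0x80 at memory offset 0x40.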
Instructions
Each line in the disassembly above is an instruction executed by the EVM, and each instruction contains an opcode. Take instruction 88, for example, which pushes the number 4 onto the stack. This particular disassembler formats it as follows:
88 PUSH1 0x04
|  |     |
|  |     Hex value for push.
|  Opcode
Instruction number
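The same three parts can be pulled out of a disassembly line programmatically. The sketch below assumes whitespace-separated fields exactly as in the diagram above; the `Instruction` type and the `parse_instruction` function are hypothetical names used only for illustration.

```python
from typing import NamedTuple, Optional

class Instruction(NamedTuple):
    number: int             # instruction number shown by the disassembler
    opcode: str             # mnemonic such as PUSH1 or MSTORE
    operand: Optional[int]  # pushed value, or None if there is none

def parse_instruction(line):
    """Split one disassembly line into its number, opcode and operand."""
    parts = line.split()
    number = int(parts[0])
    opcode = parts[1]
    # Operands are hexadecimal, with or without a leading 0x.
    operand = int(parts[2], 16) if len(parts) > 2 else None
    return Instruction(number, opcode, operand)

print(parse_instruction("88 PUSH1 0x04"))
```

Only the PUSH family carries an operand, so for a line such as `004 MSTORE` the operand field is left as `None`.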
Strategy
Any task that seems impossible at first can be broken down into solvable pieces, and the problem in front of us is no exception. The strategy I adopt here is "divide and conquer".
We will look for the branch points in the disassembled code and split it, step by step, into very small chunks, walking through each of them in Remix's debugger.
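One possible way to locate such branch points mechanically is to cut the listing wherever control flow can change. The sketch below is an assumption about how this could be done and not the procedure used in this series: it ends a chunk after JUMP, JUMPI, REVERT, RETURN, or STOP and starts a new one at every JUMPDEST. `LISTING` repeats the first lines of the disassembly above, and `split_at_branches` is a made-up name.

```python
# Sketch: split a disassembly listing into chunks at branch points.
LISTING = """\
000 PUSH1 80
002 PUSH1 40
004 MSTORE
005 CALLVALUE
006 DUP1
007 ISZERO
008 PUSH2 0010
011 JUMPI
012 PUSH1 00
014 DUP1
015 REVERT
016 JUMPDEST
017 POP
"""

ENDS_CHUNK = {"JUMP", "JUMPI", "REVERT", "RETURN", "STOP"}

def split_at_branches(listing):
    """Group listing lines into chunks separated by jumps and jump targets."""
    chunks, current = [], []
    for line in listing.splitlines():
        if not line.strip():
            continue
        opcode = line.split()[1]
        # A JUMPDEST is a possible jump target, so it starts a new chunk.
        if opcode == "JUMPDEST" and current:
            chunks.append(current)
            current = []
        current.append(line)
        # After these opcodes execution may not continue on the next line.
        if opcode in ENDS_CHUNK:
            chunks.append(current)
            current = []
    if current:
        chunks.append(current)
    return chunks

for index, chunk in enumerate(split_at_branches(LISTING)):
    print(f"chunk {index}: {chunk[0]} ... {chunk[-1]}")
```

On this excerpt the split yields three chunks: the check that ends in JUMPI, the short path that ends in REVERT, and the code that continues from the JUMPDEST at 016.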
In the image below, we can see our first split of the disassembled code (which I will fully analyze in the next post).
Cheetah blockchain security is based on the technology of Kingsoft Internet Security, combined with artificial intelligence, nlp and other technologies, to provide blockchain users with ecological security services such as contract audit and sentiment analysis.
*This article was first published on medium by Alejandro Santander, translated and organized by Cheetah Blockchain*