Deconstructing smart contracts
Cheetah Blockchain Security
2018-12-03 11:51
Uncover the mystery of smart contracts with the author

Part 1 Preface


Imagine you are speeding along a road in the western United States in a 1969 Mustang Mach 1. The sun glints off the gorgeous gold-plated rims. There is nothing but you, the road, and the desert, and the endless horizon watches you chase the setting sun.

You are relaxed and happy when suddenly there is a loud bang. Your 335-horsepower beast is engulfed in billowing white smoke, as if it had turned into a steam locomotive, and you are forced to stop on the side of the road.

You want to see what is wrong, but when you lift the hood, you cannot make sense of what you are looking at. You have no idea how the damn machine works, so you pick up your phone to call for help, only to find that there is no signal nearby.



Does this situation sound like the DApp development you are doing? In this analogy, the luxury car is your smart contract, and the rims and custom modifications are the small, carefully considered details. Once something goes wrong, the answer lies in the smart contract's EVM bytecode, and in most cases you have no idea what happened.

If you are a DApp developer and have found yourself in this embarrassing situation, you no longer have to worry.

The purpose of this series of articles is to deconstruct a simple Solidity contract, look at its bytecode, and break it down into recognizable structures, all the way to the lowest level. We are going to pop the hood of the Solidity sports car. By the end of this series, you should feel comfortable reading and debugging EVM bytecode. The series focuses on demystifying the EVM bytecode generated by the Solidity compiler, which is really much simpler than it looks.


The following is the smart contract code we will use when deconstructing:

pragma solidity ^0.4.24;

contract BasicToken {

    uint256 totalSupply_;
    mapping(address => uint256) balances;

    constructor(uint256 _initialSupply) public {
        totalSupply_ = _initialSupply;
        balances[msg.sender] = _initialSupply;
    }

    function totalSupply() public view returns (uint256) {
        return totalSupply_;
    }

    function transfer(address _to, uint256 _value) public returns (bool) {
        require(_to != address(0));
        require(_value <= balances[msg.sender]);
        balances[msg.sender] = balances[msg.sender] - _value;
        balances[_to] = balances[_to] + _value;
        return true;
    }

    function balanceOf(address _owner) public view returns (uint256) {
        return balances[_owner];
    }

}
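Before looking at any bytecode, it can help to pin down what the contract is supposed to do. The Python sketch below is only an informal model of the Solidity code above, not something the compiler produces. The class name `BasicTokenModel`, the string addresses, and the `ZERO_ADDRESS` constant are illustrative assumptions.

```python
UINT256_MOD = 2**256
ZERO_ADDRESS = "0x0"  # illustrative stand-in for address(0)


class BasicTokenModel:
    """Informal model of the BasicToken contract's state and functions."""

    def __init__(self, sender, initial_supply):
        # Mirrors the constructor: the deployer receives the whole supply.
        self.total_supply = initial_supply
        self.balances = {sender: initial_supply}

    def balance_of(self, owner):
        # Unset mapping entries read as zero, as in Solidity.
        return self.balances.get(owner, 0)

    def transfer(self, sender, to, value):
        # The two require() checks of the Solidity version.
        if to == ZERO_ADDRESS:
            raise ValueError("cannot transfer to the zero address")
        if value > self.balance_of(sender):
            raise ValueError("insufficient balance")
        self.balances[sender] = self.balance_of(sender) - value
        # Solidity 0.4.x does not check for overflow, so the sum wraps.
        self.balances[to] = (self.balance_of(to) + value) % UINT256_MOD
        return True


token = BasicTokenModel("alice", 10000)
token.transfer("alice", "bob", 2500)
print(token.balance_of("alice"), token.balance_of("bob"))
```

Everything the bytecode does later in this series should, in the end, reproduce this behavior.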

Compile the contract

To compile the contract, we will use Remix (address: https://remix.ethereum.org).

When Remix opens, click the + button in the upper left corner, above the file browser area, to create a new file. Set the filename to BasicToken.sol, then paste the code above into the editor.

On the right, go to the "Settings" tab and make sure "Enable Personal Mode" is checked. Also make sure the selected Solidity compiler version is "version:0.4.24+commit.e67f0147.Emscripten.clang".

These two details are very important; otherwise you will not see the same bytecode discussed in this article.

Next, go to the Compile tab and click the Details button. A popup shows everything the Solidity compiler generated, including a JSON object called BYTECODE. Its "object" property is the compiled contract code, shown below:

608060405234801561001057600080fd5b5060405160208061021783398101604090815290516000818155338152600160205291909120556101d1806100466000396000f3006080604052600436106100565763ffffffff7c010000000000000000000000000000000000000000000000000000000060003504166318160ddd811461005b57806370a0823114610082578063a9059cbb146100b0575b600080fd5b34801561006757600080fd5b506100706100f5565b60408051918252519081900360200190f35b34801561008e57600080fd5b5061007073ffffffffffffffffffffffffffffffffffffffff600435166100fb565b3480156100bc57600080fd5b506100e173ffffffffffffffffffffffffffffffffffffffff60043516602435610123565b604080519115158252519081900360200190f35b60005490565b73ffffffffffffffffffffffffffffffffffffffff1660009081526001602052604090205490565b600073ffffffffffffffffffffffffffffffffffffffff8316151561014757600080fd5b3360009081526001602052604090205482111561016357600080fd5b503360009081526001602081905260408083208054859003905573ffffffffffffffffffffffffffffffffffffffff85168352909120805483019055929150505600a165627a7a72305820a5d999f4459642872a29be93a490575d345e40fc91a7cccb2cf29c88bcdaf3be0029

Deploy the contract

Next, go to the Run tab in Remix. First, make sure the selected environment is the JavaScript VM. This is essentially an embedded JavaScript EVM plus network, an ideal Ethereum playground. Make sure BasicToken is selected in the combo box and enter 10000 in the input box next to the Deploy button. Then click "Deploy". This deploys an instance of the BasicToken contract with an initial supply of 10,000 tokens, all held by the account currently selected in the Account combo box at the top.

In the "Deployed Contracts" section of the "Run" tab, you can see the deployed contract with three fields for interacting with it: transfer, balanceOf, and totalSupply. This is where we can interact with the contract instance we just deployed.
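These three functions are also visible in the compiled code shown earlier, as 4-byte function selectors. The Python sketch below walks a fragment of that bytecode and reports every PUSH4 whose value matches one of the well-known selectors for these signatures. The helper name `find_selectors` is made up for this illustration, and the selector table is supplied by hand rather than computed.

```python
# Well-known 4-byte selectors for the three public functions of BasicToken.
KNOWN_SELECTORS = {
    "18160ddd": "totalSupply()",
    "70a08231": "balanceOf(address)",
    "a9059cbb": "transfer(address,uint256)",
}

# Fragment of the compiled code shown earlier (the part comparing selectors).
FRAGMENT = (
    "6080604052600436106100565763ffffffff"
    "7c01" + "00" * 28 +
    "6000350416"
    "6318160ddd811461005b57"
    "806370a082311461008257"
    "8063a9059cbb146100b057"
    "5b600080fd"
)


def find_selectors(hex_code):
    """Return (offset, signature) for each PUSH4 holding a known selector."""
    code = bytes.fromhex(hex_code)
    found = []
    pc = 0
    while pc < len(code):
        op = code[pc]
        # PUSH1..PUSH32 (0x60..0x7f) carry 1..32 bytes of inline data.
        size = op - 0x5F if 0x60 <= op <= 0x7F else 0
        if size == 4:
            value = code[pc + 1:pc + 5].hex()
            if value in KNOWN_SELECTORS:
                found.append((pc, KNOWN_SELECTORS[value]))
        pc += 1 + size  # skip the opcode and its inline data
    return found


for offset, signature in find_selectors(FRAGMENT):
    print(offset, signature)
```

Note that the fragment also contains a PUSH4 of ffffffff, which is a mask rather than a selector, so it is not reported.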

But before that, let's look at what "deploying" a contract actually means.

In the console area at the bottom of the page, you can see the log "creation of BasicToken pending ..." followed by a transaction entry with various fields: from, to, value, data, logs, and hash. Click this entry to expand the transaction details, and you should see that the transaction's input contains the bytecode we mentioned above. So deploying a contract is itself a transaction: it carries the bytecode and creates a smart contract instance with its own address and code.
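As a rough illustration of that input: for a contract whose constructor takes parameters, the data sent at deployment is the compiled bytecode followed by the ABI-encoded arguments. The Python sketch below assumes a single uint256 argument, as in BasicToken, and uses only the first bytes of the bytecode as a placeholder. `build_deploy_input` is a hypothetical helper name, not a Remix function.

```python
def encode_uint256(value):
    """ABI-encode an unsigned integer as a 32-byte, big-endian hex word."""
    if not 0 <= value < 2**256:
        raise ValueError("value does not fit in uint256")
    return format(value, "064x")


def build_deploy_input(creation_bytecode, initial_supply):
    """Append the encoded constructor argument to the creation bytecode."""
    return "0x" + creation_bytecode + encode_uint256(initial_supply)


# Placeholder: only the first bytes of the BasicToken bytecode shown earlier.
BYTECODE_PREFIX = "608060405234801561001057600080fd5b50"

deploy_input = build_deploy_input(BYTECODE_PREFIX, 10000)
# The last 32 bytes hold the initial supply (10000 is 0x2710).
print(deploy_input[-64:])
```

If you compare this with the expanded input in Remix, the trailing 32 bytes should be where the value 10000 entered in the Deploy box ends up.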

Disassemble the bytecode

In the console, to the right of the transaction entry, there is a "Debug" button. Click it to activate the Debugger tab in the right-hand area of Remix. Let's look at the Instructions section. If you scroll down, you should see the following:

000 PUSH1 80
002 PUSH1 40
004 MSTORE
005 CALLVALUE
006 DUP1
007 ISZERO
008 PUSH2 0010
011 JUMPI
012 PUSH1 00
014 DUP1
015 REVERT
016 JUMPDEST
017 POP
018 PUSH1 40
020 MLOAD
021 PUSH1 20
023 DUP1
024 PUSH2 0217
027 DUP4
028 CODECOPY
029 DUP2
030 ADD
031 PUSH1 40
033 SWAP1
034 DUP2
035 MSTORE
036 SWAP1
037 MLOAD
038 PUSH1 00
040 DUP2
041 DUP2
042 SSTORE
043 CALLER
044 DUP2
045 MSTORE
046 PUSH1 01
048 PUSH1 20
050 MSTORE
051 SWAP2
... (abbreviated)

To make sure you are on track, compare what you see in your Remix debugger with the listing above.

This is the disassembled bytecode of the contract. If you scan the raw bytecode byte by byte (two hex characters at a time), the EVM recognizes specific opcodes, each associated with a specific operation. For example:

0x60 => PUSH1

0x01 => ADD

0x02 => MUL

0x00 => STOP

...
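The scan described above can be sketched in a few lines of Python. This is a minimal illustration, not Remix's disassembler: the opcode table is deliberately partial (just enough for the listing above), the input is only the first 52 bytes of the BasicToken bytecode, and the function name `disassemble` is chosen for this example.

```python
# Partial opcode table: only what the listing above needs.
OPCODES = {
    0x00: "STOP", 0x01: "ADD", 0x02: "MUL", 0x15: "ISZERO",
    0x33: "CALLER", 0x34: "CALLVALUE", 0x39: "CODECOPY",
    0x50: "POP", 0x51: "MLOAD", 0x52: "MSTORE", 0x55: "SSTORE",
    0x57: "JUMPI", 0x5B: "JUMPDEST", 0xFD: "REVERT",
}

# First 52 bytes of the BasicToken bytecode shown earlier.
CODE_PREFIX = (
    "608060405234801561001057600080fd5b506040"
    "5160208061021783398101604090815290516000"
    "81815533815260016020" "5291"
)


def disassemble(hex_code):
    """Turn a hex string into lines such as '008 PUSH2 0010'."""
    code = bytes.fromhex(hex_code)
    lines = []
    pc = 0
    while pc < len(code):
        op = code[pc]
        if 0x60 <= op <= 0x7F:
            # PUSH1..PUSH32: the next 1..32 bytes are data, not opcodes.
            size = op - 0x5F
            data = code[pc + 1:pc + 1 + size]
            lines.append(f"{pc:03d} PUSH{size} {data.hex()}")
            pc += 1 + size
            continue
        if 0x80 <= op <= 0x8F:
            name = f"DUP{op - 0x7F}"
        elif 0x90 <= op <= 0x9F:
            name = f"SWAP{op - 0x8F}"
        else:
            name = OPCODES.get(op, f"UNKNOWN_{op:02x}")
        lines.append(f"{pc:03d} {name}")
        pc += 1
    return lines


print("\n".join(disassemble(CODE_PREFIX)))
```

The number at the start of each line is the byte offset of the instruction, which is why the numbering jumps by more than one after every PUSH.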

Opcodes

Before we start deconstructing the contract code, you will need a basic toolset for understanding individual opcodes such as PUSH, ADD, SWAP, and DUP. In the end, all an opcode does is push items to or consume items from the EVM's stack, its memory, or the storage belonging to the contract.
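To get a feel for how these opcodes push and consume stack items, here is a toy stack machine in Python. It is a simplified sketch that covers only a handful of opcodes and ignores gas, memory, and storage. The tuple-based program format and the `run` function are inventions for this example.

```python
UINT256_MOD = 2**256


def run(program):
    """Execute a list of (opcode, *args) tuples and return the final stack.

    The top of the stack is the last element of the returned list.
    """
    stack = []
    for op, *args in program:
        if op == "PUSH":
            stack.append(args[0] % UINT256_MOD)
        elif op == "ADD":
            # Consumes two items and pushes their sum (wrapping at 2**256).
            a = stack.pop()
            b = stack.pop()
            stack.append((a + b) % UINT256_MOD)
        elif op == "DUP1":
            # Copies the top item.
            stack.append(stack[-1])
        elif op == "SWAP1":
            # Exchanges the top two items.
            stack[-1], stack[-2] = stack[-2], stack[-1]
        elif op == "POP":
            stack.pop()
        else:
            raise ValueError(f"unsupported opcode: {op}")
    return stack


program = [
    ("PUSH", 2),
    ("PUSH", 3),
    ("ADD",),    # stack: [5]
    ("DUP1",),   # stack: [5, 5]
    ("PUSH", 1), # stack: [5, 5, 1]
    ("SWAP1",),  # stack: [5, 1, 5]
]
print(run(program))
```

Stepping through the Remix debugger shows the same kind of stack evolution, one instruction at a time.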

To see all the opcodes the EVM can handle, you can check Pyethereum, which includes a list of them. To understand how each opcode works, the assembly section of the official Solidity documentation is also a good reference. It does not map one-to-one to raw opcodes, but it is pretty close (it actually describes Yul, an intermediate language between Solidity and EVM bytecode). If you are comfortable with its technical notation, you can also read the Ethereum Yellow Paper, which is the ultimate reference for everything above.

There is no point in reading these resources from cover to cover right now. Just keep in mind that they exist; we will use them when needed.

Instructions

Each line in the disassembly above is an instruction for the EVM to execute, and each instruction contains an opcode. Take instruction 88, for example, which pushes the number 4 onto the stack. This particular disassembler presents it as follows:

88 PUSH1 0x04
|  |     |
|  |     Hex value to push
|  Opcode
Instruction number

Strategy

Any task that seems impossible at first can be broken down, step by step, into solvable pieces, and our problem is no exception. The strategy I adopt here is "divide and conquer".

We can try to find the bifurcation points of the disassembled code and gradually break it down until it breaks down into very small chunks, which we will do step by step in Remix's debugger.
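One way to look for such bifurcation points outside the debugger is to scan the bytecode for jump-related opcodes: JUMP (0x56), JUMPI (0x57), and JUMPDEST (0x5b). The Python sketch below does this for the first 52 bytes of the BasicToken bytecode. It is an assumption about how one might automate the search, not the method used in this series, and `find_split_points` is a hypothetical helper.

```python
JUMP_OPCODES = {0x56: "JUMP", 0x57: "JUMPI", 0x5B: "JUMPDEST"}

# First 52 bytes of the BasicToken bytecode shown earlier.
CODE_PREFIX = (
    "608060405234801561001057600080fd5b506040"
    "5160208061021783398101604090815290516000"
    "81815533815260016020" "5291"
)


def find_split_points(hex_code):
    """Return (offset, name) for every jump-related opcode in the code."""
    code = bytes.fromhex(hex_code)
    points = []
    pc = 0
    while pc < len(code):
        op = code[pc]
        if op in JUMP_OPCODES:
            points.append((pc, JUMP_OPCODES[op]))
        # Skip the inline data of PUSH1..PUSH32 so it is not read as opcodes.
        size = op - 0x5F if 0x60 <= op <= 0x7F else 0
        pc += 1 + size
    return points


print(find_split_points(CODE_PREFIX))
```

In this prefix the scan finds the JUMPI at instruction 11 and the JUMPDEST at instruction 16, which together delimit the first branch in the listing above.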

In the image below, we can see our first split of the disassembled code (which I will fully analyze in the next post).


Cheetah blockchain security is based on the technology of Kingsoft Internet Security, combined with artificial intelligence, nlp and other technologies, to provide blockchain users with ecological security services such as contract audit and sentiment analysis.

*This article was first published on medium by Alejandro Santander, translated and organized by Cheetah Blockchain*

You can visit the official website of RatingToken to learn more.

