
Editor's Note: This article comes from Wanxiang Blockchain (ID: gh_1b8639a25429), reprinted by Odaily with authorization.
Blockchain and the data element market are two areas attracting a great deal of attention right now. In April of this year, the “Opinions on Building a More Complete System and Mechanism for Market-Oriented Allocation of Elements”, issued by the Central Committee of the Communist Party of China and the State Council, listed data as one of the elements for the first time, and the National Development and Reform Commission, in defining “new infrastructure”, listed blockchain as new technical infrastructure. Many professionals and scholars have discussed the application of blockchain in the data element market and have strongly affirmed its significance for protecting and using personal data and for improving the data foundation of AI development. However, unlike the application of blockchain in central bank digital currency, stablecoins, supply chain finance, evidence storage, anti-counterfeiting traceability and other fields, the data element market itself is at an early stage of development, and many core issues remain unsettled, which makes it difficult for discussion of blockchain applications in the data element market to go deep.
This article builds on previous research to discuss the role blockchain can play in different links of the data value chain. According to a 2018 report by the GSMA (Global System for Mobile Communications Association) [1], the data value chain can be divided into four main links (Figure 1). The first is data generation, which refers to recording and acquiring data. The second is data collection, verification and storage. The third is data analysis, which refers to processing and analyzing data to generate new insights and knowledge. The fourth is exchange, which refers to the use of the results of data analysis, whether internally or through external transfer; this link is more appropriately called "data element allocation". This article has five parts. The first four follow the four links above in turn, with the focus on the fourth link, and the fifth part summarizes the full text.
Figure 1: The main links of data value
Application of Blockchain in Data Recording and Acquisition
Blockchain is a distributed ledger of Tokens. A Token is essentially a state variable defined within the blockchain (Part 4 discusses another meaning of Token, in the payment field). A blockchain holds both data related to Tokens and their transactions and data unrelated to them.
Data unrelated to Tokens and their transactions is written into the blockchain as an attachment to Token transactions. Once written, the data is visible to the entire network, cannot be tampered with, and is copied and disseminated without error, but the blockchain itself cannot guarantee the authenticity and accuracy of the data at the source or at the writing stage. Because on-chain storage capacity is limited, in many cases this data can only be written into the blockchain as a hash digest, and only a small amount of structured information can be put on-chain as raw data. Of the vast amount of data generated all the time in the real world, the proportion that can go on-chain as raw data is therefore almost negligible. This shows that blockchain is not a general-purpose ledger or database and should be used where its strengths lie. Only data of sufficiently high value is worth putting on-chain as raw data.
The main function of putting a hash digest on-chain is evidence storage [2], which adds credibility to original data stored on a local device or in the cloud. By revealing the original data afterwards (for example, by allowing an external organization to inspect the local device that stores it), the uploader can prove two points: first, that the original data did exist at the on-chain time recorded by the blockchain; second, that the uploader did know the original data. However, the role of blockchain in storing evidence and adding credibility to data should not be overestimated. In particular, for data that is not native to the blockchain, credibility depends on specialized data recording and acquisition technologies and related systems, as in the "blockchain + Internet of Things" data management discussed next.
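The evidence-storage idea above can be illustrated with a minimal sketch. It assumes SHA-256 as the hash function and uses a plain in-memory dictionary (`chain_records`, a hypothetical name) as a stand-in for the blockchain; a real chain would supply the timestamp and the tamper resistance that this toy does not.

```python
import hashlib

# Hypothetical stand-in for a blockchain: maps digest -> recorded block time.
chain_records: dict[str, int] = {}

def digest(data: bytes) -> str:
    """Return the SHA-256 hash digest that would be written on-chain."""
    return hashlib.sha256(data).hexdigest()

def anchor(data: bytes, block_time: int) -> str:
    """Record only the digest and a time; the raw data stays off-chain."""
    d = digest(data)
    chain_records.setdefault(d, block_time)  # first recording wins
    return d

def verify(revealed: bytes, claimed_digest: str) -> bool:
    """Check that later-revealed data matches a digest already recorded."""
    return claimed_digest in chain_records and digest(revealed) == claimed_digest

original = b"invention document v1"
d = anchor(original, block_time=1_600_000_000)
ok = verify(original, d)                          # revealed data matches
tampered_ok = verify(b"invention document v2", d)  # altered data does not
```

The check shows only that the revealed data existed when the digest was recorded. As the text notes, it says nothing about whether the data was authentic at the source.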
IoT devices constantly acquire data such as geographic location, temperature and humidity, speed, and altitude from their surroundings. With current device-side anti-attack technology, the authenticity and accuracy of IoT data at the source can be guaranteed to a considerable extent. IoT data is mainly stored in the cloud and locally on IoT devices. Most IoT devices are capable of running hash algorithms and public-private key signature operations. When IoT data goes on-chain, only a small amount of structured data can be written directly into the blockchain, and most of the data goes on-chain as hash digests. In "blockchain + Internet of Things" management of IoT data, the relevant operations are therefore performed automatically by IoT devices, which is efficient and reduces human intervention.
Application of Blockchain in Data Collection, Verification and Storage
Data collection, verification and storage mainly rely on database technology, and the role that blockchain can directly play is limited. For example, the management of personal data in the financial field now generally emphasizes the application of API technology to generate compound value through data aggregation.
Application of Blockchain in Data Analysis
If data analysis is also carried out through a market-based division-of-labor network composed of different institutions (for example, some institutions provide computing power and others provide algorithms), then in theory a blockchain-based distributed data economy can also be introduced. For example, the PlatON project is committed to building a high-performance computing network to facilitate the circulation of data and computing power. Its main market participants include computing coordinators, data providers, and computing power providers [4].
Application of Blockchain in Data Element Allocation
As an integrated technology that bears on relations of production, blockchain will find its main application in the data element market in the allocation of data elements. This issue is discussed next at two levels: the confirmation of rights to data elements and the organizational form of the data element market.
(1) Confirmation of Data Element Rights
Economic research shows that the prerequisite for any effective allocation of resources is to determine the property rights of resources, and data elements are no exception. Property rights are a complex economic concept that refers to an enforceable social structure that determines how resources are used or owned. Property rights have three core dimensions: first, the right to use resources; second, the right to obtain benefits from resources; third, the right to transfer resources to others, change resources, give up resources, and destroy resources. Property rights can be subdivided into "bundles of rights" such as ownership, possession, control, use, income, and disposal.
Data has characteristics of both goods and services. Much data is non-excludable and non-rivalrous. Data ownership is a complex issue in both law and practice, especially for personal data. In practice, a patent is a typical example of data whose ownership can be clearly defined, yet patents also show clearly how complex data ownership is.
The premise of obtaining a patent right is disclosing the technical content of the invention, so that the public can make further improvements and resources are not wasted on duplicate research and development. For example, the patent examination authority generally publishes the patent specification about 18 months after an invention patent application is filed. The patentee enjoys the exclusive right to the patented technology and its exclusive commercial benefits within the statutory period, which protects the rights of inventors and encourages the public to invent. When the statutory period expires, the patent right lapses, and the public can freely use the patented technology according to the content disclosed in the patent specification.
From the perspective of global practice, the confirmation of rights to data elements is the joint product of law and technology. Generally, the law first determines the institutional framework of data property rights, and technology then ensures that this framework is enforceable. For example, many newspapers and magazines are now paid: only paying accounts can read articles, technology is used to restrict copying and screenshots of articles, and if someone plagiarizes, rights and interests are protected through the law. In many cases, rights to data elements cannot be confirmed by technology alone. The first part discussed the role of blockchain in evidence storage, but storing evidence of data does not mean confirming rights to it. For example, an inventor can put the hash digest of an invention document on the blockchain to prove that he or she was the first to make the invention, which serves as self-proof in any future dispute. However, without approval by the patent examination authority, putting an invention document on-chain does not confer a patent right.
In a blockchain, an address can hide the identity of its actual controller, and a hash digest can hide the original data, but the blockchain itself is not a privacy management technology. In particular, data on a public chain is visible to the entire network, and technologies such as ring signatures, coin mixing, and coin combination are needed to hide the flow of funds on-chain. A consortium chain can open data differentially, giving different users different permissions to read data in the blockchain. But as discussed in the first part, the data stored in a blockchain is limited after all, so the direct role of blockchain in data control is also limited. For example, in "blockchain + government data sharing" projects, government data is stored on local devices (generally a confidential network within government departments), data calls across government departments are still made through traditional methods, and the original data does not circulate on the blockchain. The blockchain does, however, record data applications, authorizations, calls and accesses in a non-repudiable way, mainly to leave a trail for after-the-fact audit.
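The audit-trail role described above, in which only application, authorization, call and access records go on-chain while the data itself stays off-chain, can be sketched as a hash-chained log. This is a simplified illustration rather than the design of any actual government data sharing project. The field names, the department identifiers and the `GENESIS` value are all assumptions.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder for "no previous record"

def entry_hash(prev_hash: str, event: dict) -> str:
    """Hash an event together with the hash of the record before it."""
    payload = json.dumps(event, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("ascii") + payload).hexdigest()

def append_event(log: list, actor: str, action: str, dataset_digest: str) -> None:
    """Append an apply/authorize/call/access record; raw data is never stored."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    event = {"actor": actor, "action": action, "dataset": dataset_digest}
    log.append({"event": event, "prev": prev_hash,
                "hash": entry_hash(prev_hash, event)})

def is_intact(log: list) -> bool:
    """Recompute every link; any edited record breaks the chain."""
    prev_hash = GENESIS
    for record in log:
        if record["prev"] != prev_hash:
            return False
        if record["hash"] != entry_hash(prev_hash, record["event"]):
            return False
        prev_hash = record["hash"]
    return True

log: list = []
ds = hashlib.sha256(b"population registry extract").hexdigest()
append_event(log, "dept_b", "apply", ds)
append_event(log, "dept_a", "authorize", ds)
append_event(log, "dept_b", "call", ds)
intact_before = is_intact(log)
log[1]["event"]["action"] = "deny"  # simulate an after-the-fact edit
intact_after = is_intact(log)
```

In a real deployment the non-repudiation would also rely on each party signing its records, which this sketch leaves out.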
Among the various data control technologies, the one most closely related to blockchain is cryptography, including verifiable computing, homomorphic encryption, and secure multi-party computation. For a complex computational task, verifiable computing generates a short proof. Verifying this short proof is enough to judge whether the task was executed correctly, with no need to repeat the computation. Under homomorphic encryption and secure multi-party computation, data is provided externally as ciphertext rather than plaintext. These cryptographic techniques make data "available but invisible", but because they demand substantial computing resources, they can only be run outside the blockchain.
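As a rough illustration of "available but invisible", the sketch below uses additive secret sharing, one of the simplest building blocks of secure multi-party computation, to compute a sum without any party seeing another party's input. It is a toy that assumes honest participants, non-negative inputs smaller than the modulus, and a single process simulating all parties. The modulus and function names are illustrative choices, not part of any system discussed in this article.

```python
import secrets

MODULUS = 2**61 - 1  # assumed modulus; inputs must be in [0, MODULUS)

def share(value: int, n_parties: int) -> list[int]:
    """Split a value into n random-looking shares that sum to it mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]

def secure_sum(private_inputs: list[int]) -> int:
    """Each party shares its input; only sums of shares are ever combined."""
    n = len(private_inputs)
    # distributed[i][j] is the share that party i sends to party j
    distributed = [share(v, n) for v in private_inputs]
    # Party j adds up the shares it received; this reveals no single input.
    partial_sums = [sum(distributed[i][j] for i in range(n)) % MODULUS
                    for j in range(n)]
    # Publishing the partial sums reveals only the total.
    return sum(partial_sums) % MODULUS

inputs = [120, 75, 310]  # each party's private value
total = secure_sum(inputs)
```

Even this toy needs every party to exchange a share with every other party, which gives a sense of why such computation is kept off-chain.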
Among the various data control technologies, the one most easily confused with blockchain is payment tokenization, which is briefly explained here. Payment tokenization (Tokenization) [5] refers to using a specific payment token (Payment Token) in place of payment elements such as bank card numbers and the payment accounts of non-bank payment institutions, and restricting the token's scope of use. This reduces the risk of bank account and payment account information leaking on the merchant and acquirer side, reduces transaction fraud, and protects the security of user transactions. There is a mapping between payment tokens and bank accounts and payment accounts, which the token service provider manages through the two processes of tokenization and de-tokenization. Payment tokenization is a fundamental element of digital payments. For example, in mobile payment, the user stores the Token number as a device card number in a mobile device such as a phone, and can then use the device to make contactless near-field payments at terminals such as offline POS machines and ATMs, or to initiate remote payments directly in a mobile client app.
Currently, UnionPay's mobile QuickPass and online payment products fully apply payment tokenization technology. As the introduction above shows, a Token in payment tokenization represents sensitive information such as bank accounts and payment accounts, follows standardized encoding rules, and does not rely on complex cryptography. A Token in a blockchain, in applications such as stablecoins, represents fiat currency reserve assets, but the Token itself is a product of blockchain technology.
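The mapping managed by a token service provider, with its two processes of tokenization and de-tokenization, can be sketched as a simple lookup table. The class below is hypothetical and does not follow the token format of UnionPay or of any real payment scheme. It assumes a random all-digit token of the same length as the account number and a single permitted merchant per token, and the account number in the example is made up.

```python
import secrets

class TokenServiceProvider:
    """Hypothetical token vault holding the token -> account mapping."""

    def __init__(self) -> None:
        # token -> (real account number, merchant allowed to use the token)
        self._vault: dict[str, tuple[str, str]] = {}

    def tokenize(self, account_number: str, allowed_merchant: str) -> str:
        """Issue a random surrogate number restricted to one merchant."""
        while True:
            token = "".join(secrets.choice("0123456789")
                            for _ in range(len(account_number)))
            if token != account_number and token not in self._vault:
                break
        self._vault[token] = (account_number, allowed_merchant)
        return token

    def detokenize(self, token: str, merchant: str) -> str:
        """Map a token back to the account, enforcing its scope of use."""
        account_number, allowed_merchant = self._vault[token]
        if merchant != allowed_merchant:
            raise PermissionError("token is not valid for this merchant")
        return account_number

tsp = TokenServiceProvider()
card = "0000111122223333"  # made-up account number
token = tsp.tokenize(card, allowed_merchant="merchant_a")
recovered = tsp.detokenize(token, "merchant_a")
```

The token is just a random surrogate with no cryptographic link to the account number, which matches the point above that payment tokens do not rely on complex cryptography.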
(2) The organizational form of the data element market
Because data elements vary widely in type and characteristics, lack objective valuation standards, and in many cases will not be traded on a buyout basis, the data element market will not become a centralized, liquid trading market like the stock market. This is borne out by the trials of big data trading centers or big data exchanges in many provinces and cities over the past few years, none of which achieved the expected success. Insufficient policy support and immature supporting technology played a part, but the more important reason is that the economic attributes of data elements do not support a trading model with a high degree of standardization, bid matching, and active trading.
Broadly speaking, the data element market will be closer to OTC markets such as the bond market and the OTC derivatives market: a low degree of standardization, point-to-point transactions, negotiated pricing, and transactions that are infrequent but continue to occur. But this does not mean that the ultimate data providers (such as individuals and IoT devices) and the ultimate data demanders (such as AI algorithm companies) will enter the market directly. The data element market will evolve "data intermediaries" that allow data to flow better from the ultimate provider to the ultimate demander.
Therefore, the overall structure of the data element market will be distributed, but there will be some "data intermediaries" as core nodes. The application of the blockchain in the organizational form of the data element market must be analyzed within this framework.
First, the main functions of "data intermediaries" are data collection, verification, storage and analysis. How these "data intermediaries" can use the blockchain was analyzed in the second and third parts. It should be added that blockchain can be used to improve the data release process. For example, in 2018, in the central bank digital currency prototype system [6], Yao Qian proposed applying blockchain to the confirmation and registration of rights to central bank digital currency. His idea is that the central bank and commercial banks jointly build a distributed rights confirmation ledger for central bank digital currency and provide an Internet-facing website for rights confirmation queries, realizing an online "currency detector" function for central bank digital currency. This uses the tamper-proof and unforgeable characteristics of blockchain to improve the data and system security of rights confirmation queries.
Second, as discussed above, most real-world data will not be stored and transferred through the blockchain, but the blockchain can record activities such as data authorization, calls and access, which is similar to blockchain applications in scenarios such as supply chain management and product traceability. This application direction is valuable, but its innovative significance is limited. For one thing, data analysis and use generate new data, which makes tracing data circulation less meaningful. For another, if the aim is to track data circulation for confidentiality and leak prevention, analyzing TCP/IP data packets is more direct and effective than blockchain.
Third, the blockchain serves as an organizational tool for the data element market, which is the concept of a distributed data economy introduced earlier:
The basis of a distributed data economy is data right confirmation, which is reflected in the fact that data providers can effectively control the use of data by data demanders.
In a distributed data economy, the medium of exchange is central bank digital currency or stablecoins. The reason is that some participants in the distributed data economy may be non-human, such as IoT devices as data providers and AI algorithms as data demanders. Central bank digital currency and stablecoins are compatible with the openness of the distributed data economy and can guarantee the security and efficiency of payment.
The distributed data economy has many interesting application scenarios. For example, in "Blockchain + Internet of Things", the IoT device ID is bound to a digital currency wallet address, so that data storage, transmission, mining and value exchange in the Internet of Things can be carried out in a credible manner, with the related economic activity settled in central bank digital currency or stablecoins. It is conceivable that an IoT device that keeps providing high-quality data will receive more central bank digital currency or stablecoins as a "reward" (which actually belongs to the owner of the IoT device). This economic incentive will significantly boost the collection and use of IoT data.
This direction is conducive to the realization of the distributed cognitive industrial Internet proposed by Dr. Xiao Feng [7]. The distributed cognitive industrial Internet adopts a distributed governance structure, and all enterprises can join with confidence. It adopts cognitive intelligence technology based on knowledge graphs and data collaboration based on privacy computing, and integrates manufacturing and services based on full lifecycle management.
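The IoT incentive scenario described above, with device IDs bound to wallet addresses and rewards for high-quality data, can be sketched as a small ledger. Everything in it is an assumption made for illustration: the class name, the quality score expressed as an integer percentage, the flat reward rate, and the in-memory balances standing in for central bank digital currency or stablecoin accounts.

```python
from dataclasses import dataclass, field

@dataclass
class RewardLedger:
    """Hypothetical ledger binding IoT device IDs to wallet addresses."""
    device_wallets: dict = field(default_factory=dict)  # device ID -> wallet
    balances: dict = field(default_factory=dict)        # wallet -> balance

    def bind(self, device_id: str, wallet_address: str) -> None:
        """Bind a device to the wallet of its owner."""
        self.device_wallets[device_id] = wallet_address
        self.balances.setdefault(wallet_address, 0)

    def reward(self, device_id: str, quality_pct: int,
               rate_per_reading: int = 100) -> int:
        """Credit the owner's wallet in proportion to data quality (0-100)."""
        if not 0 <= quality_pct <= 100:
            raise ValueError("quality_pct must be between 0 and 100")
        wallet = self.device_wallets[device_id]
        amount = rate_per_reading * quality_pct // 100  # integer units only
        self.balances[wallet] += amount
        return amount

ledger = RewardLedger()
ledger.bind("sensor-001", "wallet-owner-a")
first = ledger.reward("sensor-001", quality_pct=90)
second = ledger.reward("sensor-001", quality_pct=50)
balance = ledger.balances["wallet-owner-a"]
```

How data quality would actually be scored is left open here, as it is in the article.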
Summary
Blockchain is of great significance to the construction of data element market. However, because the data element market itself is in the early stage of development, there are still no conclusions on many core issues, which makes it difficult to discuss the application of blockchain in the data element market. This article takes a “break down” approach to discuss the role that blockchain can play in different links of the data value chain.
Second, data collection, verification, storage and analysis. The role that blockchain can play directly in these links is limited. But if these links are carried out through a market-based division-of-labor network composed of different institutions, they can be built on the blockchain and become a distributed data economy.
Third, data right confirmation. Data right confirmation is the basis of data element allocation, and it is the joint product of law and technology. Storing evidence of data on the blockchain does not mean confirming rights to it. In practice, data right confirmation is mainly reflected in the data provider's ability to effectively control how the data demander uses the data. In this sense, blockchain (especially a public chain) is not a privacy management technology. A consortium chain can open data differentially, giving different users different permissions to read data in the blockchain. However, the data stored in a blockchain is limited, so the direct role of blockchain in data control is also limited. Cryptographic technologies such as verifiable computing, homomorphic encryption, and secure multi-party computation make data "available but invisible", but because they demand substantial computing resources, they can only be run outside the blockchain.
Fourth, data element allocation. The overall structure of the data element market will be distributed, but there will be some "data intermediaries" as core nodes. The tamper-proof and unforgeable features of blockchain help improve the data release process. The blockchain can record activities such as data authorization, calls and access, which has some value, but its innovative significance is limited. The innovative value of blockchain in this link lies mainly in the distributed data economy, which in essence carries out large-scale collaborative computing through market mechanisms and achieves effective allocation of data elements while protecting data property rights.
The distributed data economy helps realize the distributed cognitive industrial Internet.
Note:
[1] GSMA, 2018, "The Data Value Chain".
[2] Another major use of the hash digest is to work together with the preimage as a multi-party coordination tool in hashed timelock contracts (HTLC) and discreet log contracts (DLC). See "Hash Timelock Application" (Wanxiang Blockchain Research Report No. 12, 2020).
[3] For an analysis of the Filecoin economic model, see "Brief introduction of Filecoin economic model" (Wanxiang Blockchain Research Report, Issue 29, 2020).