In-depth analysis of the zkML track: the next step toward intelligent smart contracts
DAOrayaki
2023-06-23 04:00
This article is about 6,931 words; reading the full text takes about 28 minutes.
Expect to see more innovative ZKML use cases on-chain.

Proving machine learning (ML) model inference via zkSNARKs promises to be one of the most important advances in smart contracts this decade. This development opens up an excitingly large design space, allowing applications and infrastructure to evolve into more complex and intelligent systems.

By adding ML capabilities, smart contracts can become more autonomous and dynamic, allowing them to make decisions based on real-time on-chain data rather than static rules. Smart contracts will be flexible and can accommodate a variety of scenarios, including those that may not have been anticipated when the contract was originally created. In short, ML capabilities will amplify the automation, accuracy, efficiency, and flexibility of any smart contract we put on-chain.

ML is widely used in most applications outside of web3, yet its use in smart contracts is almost nil. This is mainly due to the high computational cost of running these models on-chain. For example, FastBERT, a computationally optimized language model, requires about 1,800 MFLOPs (million floating-point operations), far too many to run directly on the EVM.

In this article, we will:

  • See potential applications and use cases for on-chain ML

  • Explore emerging projects and infrastructure at the core of zkML

  • Discuss some of the challenges of existing implementations and what the future of zkML might look like

Introduction to Machine Learning (ML)

Machine learning (ML) is a subfield of artificial intelligence (AI) that focuses on developing algorithms and statistical models that enable computers to learn from data and make predictions or decisions. ML models typically have three main components:

  • Training data: A set of input data used to train a machine learning algorithm to make predictions or classify new data. Training data can take many forms such as images, text, audio, numerical data, or combinations thereof.

  • Model Architecture: The overall structure or design of a machine learning model. It defines the layers, activation functions, and the type and number of connections between nodes or neurons. The choice of architecture depends on the specific problem and the data used.

  • Model parameters: The values or weights that the model learns during training to make predictions. These values are iteratively adjusted by an optimization algorithm to minimize the error between predicted and actual results.

The generation and deployment of a model is divided into two phases (a minimal code sketch of both phases follows this list):

  • Training phase: During the training phase, the model is exposed to a labeled dataset and its parameters are tuned to minimize the error between the predicted and actual results. The training process usually involves multiple iterations or cycles, and the accuracy of the model is evaluated on a separate validation set.

  • Inference phase: The inference phase is when the trained machine learning model makes predictions on new, unseen data. The model takes input data and applies its learned parameters to generate an output, such as a classification or regression prediction.
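
To make the two phases concrete, here is a minimal sketch in plain NumPy (the model, data, and hyperparameters are illustrative and not drawn from any project discussed here): a tiny linear model is fit by gradient descent, and its frozen parameters are then applied to unseen input.

```python
import numpy as np

# --- Training phase: fit y = w*x + b on a labeled dataset ---
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=100)
y = 3.0 * x + 0.5 + rng.normal(0, 0.05, size=100)   # labeled data

w, b, lr = 0.0, 0.0, 0.1
for _ in range(500):                     # iterative optimization
    err = (w * x + b) - y
    w -= lr * (2 * err * x).mean()       # gradient step on w
    b -= lr * (2 * err).mean()           # gradient step on b

# --- Inference phase: frozen parameters applied to unseen input ---
def infer(x_new: float) -> float:
    return w * x_new + b                 # zkML proves this step, not training

print(infer(0.25))                       # close to 3.0 * 0.25 + 0.5 = 1.25
```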

Currently, zkML mainly focuses on the inference phase of machine learning models rather than the training phase, mainly due to the computational complexity of verifying training in-circuit. However, zkML's focus on inference is not a limitation: we anticipate some very interesting use cases and applications.

Verified Inference Scenarios

There are four possible scenarios for verified inference:

  • Private input, public model. Model Consumers (MC) may wish to keep their input confidential from Model Providers (MP). For example, an MC may wish to certify to lenders the results of a credit scoring model without disclosing their personal financial information. This can be done using a pre-commitment scheme and running the model locally.

  • Public input, private model. A common problem with ML-as-a-Service is that an MP may wish to hide its parameters or weights to protect its IP, while an MC wants to verify that the generated inferences really come from the specified model in an adversarial setting. Think of it this way: when providing an inference to the MC, the MP has an incentive to run a lighter model to save money. Using a commitment to the model weights posted on-chain, the MC can audit the private model at any time (a commitment sketch follows this list).

  • Private input, private model. This occurs when the data used for inference is highly sensitive or confidential, and the model itself is hidden to protect IP. An example of this might include auditing healthcare models using private patient information. Composition techniques in zk or variants using multi-party computation (MPC) or FHE can be used to serve this scenario.

  • Public input, public model. While all aspects of the model can be made public, zkML addresses a different use case: compressing and verifying off-chain computation in an on-chain environment. For larger models, it is more cost-effective to verify a succinct zk proof of the inference than to re-run the model yourself.
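
Both the pre-commitment scheme in the first scenario and the on-chain weight commitment in the second rely on the same primitive: binding to a value now so it can be checked later. Below is a minimal hash-commitment sketch (illustrative only; production zkML systems typically commit to weights inside the proof system itself, for example via polynomial commitments):

```python
import hashlib
import json

def commit(weights: list, salt: str) -> str:
    """Hash commitment to model weights, published in advance (e.g., on-chain)."""
    payload = json.dumps({"weights": weights, "salt": salt}).encode()
    return hashlib.sha256(payload).hexdigest()

def verify(weights: list, salt: str, commitment: str) -> bool:
    """Later, anyone can check that revealed weights match the commitment."""
    return commit(weights, salt) == commitment

# The MP commits to its weights without revealing them...
c = commit([0.12, -0.98, 0.33], salt="random-nonce")
# ...and an audit later checks the revealed weights against the commitment.
assert verify([0.12, -0.98, 0.33], "random-nonce", c)
```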

Verified ML inference opens up new design spaces for smart contracts. Some crypto-native applications include:

1. DeFi

Verifiable off-chain ML oracles. Continued adoption of generative AI may drive the industry to implement signing schemes for its content (for example, news publications signing articles or images). Signed data is ready for zero-knowledge proofs, making the data composable and trustworthy. ML models can process this signed data off-chain to make predictions and classifications (for example, classifying election results or weather events). These off-chain ML oracles can trustlessly settle real-world prediction markets, insurance protocol contracts, and more by verifying the inference and publishing the proof on-chain.
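
As a sketch of this signed-data flow (the names, signature scheme, and oracle logic here are illustrative rather than taken from any specific protocol), a publisher signs its content and an off-chain oracle checks provenance before running the model:

```python
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# The publisher signs its content at the source.
publisher_key = Ed25519PrivateKey.generate()
article = b"Candidate X wins the election"
signature = publisher_key.sign(article)

# The off-chain ML oracle only processes data whose provenance checks out.
publisher_key.public_key().verify(signature, article)  # raises if forged

def classify(text: bytes) -> str:
    """Stand-in for the ML model; in zkML its inference is proven in zk."""
    return "election_result" if b"election" in text else "other"

label = classify(article)
# The oracle would then post (label, proof of inference) on-chain so that
# prediction markets or insurance contracts can settle trustlessly.
print(label)
```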

DeFi applications based on ML parameters. Many aspects of DeFi can be more automated. For example, lending protocols can use ML models to update parameters in real time. Currently, lending protocols mainly rely on off-chain models run by organizations to determine collateral factors, loan-to-value ratios, liquidation thresholds, etc., but a better option might be a community-trained open-source model that anyone can run and verify.

Automated trading strategies. A common way to demonstrate the return profile of a financial model strategy is for the MP to provide investors with various backtest data. However, there is no way to verify that the strategist actually follows the model when executing trades; investors must trust that the strategist really does. zkML offers a solution: the MP can provide a proof of the financial model's inference when deploying into specific positions. This could be especially useful for DeFi-managed vaults.

2. Security

Fraud monitoring for smart contracts. Rather than letting slow human governance or centralized actors control the ability to suspend contracts, ML models can be used to detect possible malicious behavior and suspend contracts.

3. Traditional ML

A decentralized, trustless implementation of Kaggle. A protocol or market could be created that allows MCs or other interested parties to verify the accuracy of models without requiring MPs to disclose model weights. This is useful for selling models, running competitions around model accuracy, etc.

A decentralized prompt marketplace for generative AI. Prompt authoring for generative AI has grown into a complex craft: prompts that produce optimal output often carry multiple modifiers. External parties may be willing to purchase these sophisticated prompts from their creators. zkML can be used in two ways here: 1) to verify the output of a prompt, assuring potential buyers that it really does create the desired image; and 2) to let the prompt owner retain ownership after the sale: the prompt stays hidden from the buyer, yet still generates verified images for them.

4. Identity

Replace private keys with privacy-preserving biometric authentication. Private key management remains one of the biggest hurdles in the web3 user experience. Abstracting private keys via facial recognition or other unique factors is one possible solution for zkML.

Fair airdrops and contributor rewards. ML models can be used to build detailed personas of users and determine airdrop allocations or contributor rewards based on multiple factors. This can be especially useful when combined with identity solutions. In this case, one possibility is to have users run an open-source model that evaluates their activity in the application, as well as higher-level engagement such as governance forum posts, to infer their allocation. The proof is then provided to the contract to claim the corresponding token allocation.

5. Web3 Social

Filtering for web3 social media. The decentralized nature of web3 social applications will lead to an increase in spam and malicious content. Ideally, a social media platform could use a community-consensus open-source ML model and publish proofs of the model's reasoning when it chooses to filter posts. Case in point: zkML analysis on the Twitter algorithm.

Advertising/Recommendation. As a social media user, I may be willing to see personalized advertising but wish to keep my preferences and interests private from advertisers. I could choose to run a model over my interests locally and feed its output into the media application so that it serves me content. In this scenario, advertisers might even be willing to pay end users to do so. However, these models would likely be far less sophisticated than the targeted-advertising models in production today.

6. Creator economy/games

In-game economy rebalancing. Token issuance, supply, burns, voting thresholds, and more can be dynamically adjusted using ML models. One possible model is an incentive contract that rebalances the in-game economy when a certain rebalancing threshold is reached and a proof of the model's inference is verified.

New types of on-chain games. Cooperative human-versus-AI games and other innovative on-chain games can be created in which a trustless AI model acts as a non-playable character. Every action the NPC takes is posted on-chain along with a proof that anyone can verify to confirm the correct model is being run. In Modulus Labs' Leela vs. the World, the verifier wants assurance that the stated 1900-ELO AI is choosing the moves, not Magnus Carlsen. Another example is AI Arena, an AI fighting game similar to Super Smash Bros. In a high-stakes competitive environment, players want assurance that the models they trained are running without interference or cheating.

Emerging projects and infrastructure

The zkML ecosystem can be broadly divided into four main categories:

  • Model-to-proof compilers: Infrastructure for compiling models from existing formats (e.g., PyTorch, ONNX) into verifiable computational circuits.

  • Generalized proof systems: Proof systems capable of verifying arbitrary computation traces.

  • zkML-specific proof systems: Proof systems built specifically to verify the computation traces of ML models.

  • Applications: Projects working on unique zkML use cases.

01 Model-to-Proof Compilers

In the zkML ecosystem, most of the attention has gone to creating model-to-proof compilers. Typically, these compilers convert high-level ML models written in PyTorch, TensorFlow, and the like into zk circuits.

EZKL is a library and command-line tool for proving the inference of deep learning models in zk-SNARKs. With EZKL, you can define a computational graph in PyTorch or TensorFlow, export it as an ONNX file together with some example inputs in a JSON file, and then point EZKL at these files to generate a zkSNARK circuit. With the latest round of performance improvements, EZKL can now prove an MNIST-sized model in about 6 seconds using 1.1 GB of RAM. EZKL has already seen some notable early adoption, serving as infrastructure for various hackathon projects.
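
As a sketch of the first step in that pipeline (the toy model and file names are illustrative, and EZKL's own commands are omitted since their flags vary by version), exporting a PyTorch model and a sample input in the formats EZKL consumes might look like this:

```python
import json
import torch
import torch.nn as nn

# A toy network standing in for whatever model you want to prove.
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
model.eval()

# Export the computational graph to ONNX, the format EZKL ingests.
dummy = torch.randn(1, 4)
torch.onnx.export(model, dummy, "network.onnx", input_names=["input"])

# Write an example input as JSON, the companion file EZKL expects
# ("input_data" follows EZKL's examples; the schema may vary by version).
with open("input.json", "w") as f:
    json.dump({"input_data": [dummy.flatten().tolist()]}, f)

# From here, EZKL is pointed at network.onnx and input.json to generate
# the circuit, keys, and zkSNARK proof of this model's inference.
```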

Cathie So's circomlib-ml library contains various ML circuit templates for Circom. The circuits cover some of the most common ML functions. keras2circom, also developed by Cathie, is a Python tool that converts Keras models into Circom circuits using the underlying circomlib-ml library.

LinearA has developed two frameworks for zkML: Tachikoma and Uchikoma. Tachikoma converts neural networks to an integer-only form and generates a computation trace. Uchikoma converts TVM's intermediate representation into programming languages that do not support floating-point operations. LinearA plans to support Circom with field arithmetic and Solidity with signed and unsigned integer arithmetic.

Daniel Kang's zkml is a framework for proving the execution of ML models, built on his work in the paper "Scaling up Trustless DNN Inference with Zero-Knowledge Proofs". At the time of writing, it could prove an MNIST circuit using about 5 GB of memory in about 16 seconds of runtime.

In terms of more generalized model-to-proof compilers, there are the Nil Foundation and Risc Zero. The Nil Foundation's zkLLVM is an LLVM-based circuit compiler capable of verifying computational models written in popular programming languages such as C++, Rust, and JavaScript/TypeScript. Compared to the other model-to-proof compilers mentioned here, it is general-purpose infrastructure, but it is still suitable for complex computations such as zkML. This can be especially powerful when combined with their proof market.

Risc Zero builds a general-purpose zkVM targeting the open-source RISC-V instruction set, and thus supports existing mature languages such as C++ and Rust, as well as the LLVM toolchain. This allows seamless integration between host and guest zkVM code, similar to Nvidia's CUDA C++ toolchain, but with a ZKP engine in place of a GPU. As with Nil, Risc Zero can be used to verify the computation trace of an ML model.

02 Generalized Proof Systems

Improvements to proof systems have been the main driving force behind bringing zkML to fruition, in particular the introduction of custom gates and lookup tables. This is mainly due to ML's reliance on non-linearity. In short, non-linearity is introduced through activation functions (such as ReLU, sigmoid, and tanh) applied to the outputs of linear transformations in a neural network. These non-linearities are challenging to implement in zk circuits due to the constraints of mathematical operation gates. Bit decomposition and lookup tables can help by precomputing the possible outputs of the non-linearity into a lookup table, which, interestingly, is much more computationally efficient in zk.
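
To illustrate the idea (a plain-Python sketch, not a circuit implementation): with quantized 8-bit activations there are only 256 possible inputs to a non-linearity, so its outputs can be tabulated once, and each in-circuit evaluation becomes a table lookup rather than an arithmetic computation.

```python
import math

SCALE = 16  # fixed-point scale: real value = integer / SCALE

# Precompute sigmoid once for every possible signed 8-bit input.
SIGMOID_TABLE = {
    i: round(SCALE / (1 + math.exp(-i / SCALE)))
    for i in range(-128, 128)
}

def sigmoid_lookup(x_fixed: int) -> int:
    """In a zk circuit this becomes a lookup argument, not exp/div gates."""
    return SIGMOID_TABLE[x_fixed]

# Example: sigmoid(1.0) ~= 0.731, i.e., about 12/16 in fixed point.
print(sigmoid_lookup(16), "/", SCALE)
```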

For this reason, Plonkish proof systems tend to be the most popular backends for zkML. Halo 2 and Plonky 2, with their table-style arithmetization schemes, handle neural network non-linearities well via lookup arguments. In addition, the former has a vibrant ecosystem of developer tooling combined with flexibility, making it the de facto backend for many projects, including EZKL.

Other proof systems have their own advantages. R1CS-based proof systems include Groth16, known for its small proof size, and Gemini, known for its handling of extremely large circuits and its linear-time prover. STARK-based systems, such as the Winterfell prover/verifier library, are especially useful when a Cairo program's execution trace is taken as input via Giza's tooling and Winterfell is used to generate a STARK proof attesting the correctness of the output.

03 zkML-Specific Proof Systems

Some progress has been made in designing efficient proof systems that can handle the complex, circuit-unfriendly operations of advanced machine learning models. Systems like zkCNN, based on the GKR proof system, and Zator, based on composition techniques, tend to outperform general-purpose proof systems, as reflected in Modulus Labs' benchmarking report.

zkCNN is a method for proving the correctness of convolutional neural networks using zero-knowledge proofs. It uses the sumcheck protocol to prove fast Fourier transforms and convolutions with linear prover time, which is asymptotically faster than computing the result. Several improvements and generalizations of interactive proofs were also introduced, covering verification of convolutional layers, the ReLU activation function, and max pooling. According to Modulus Labs' benchmark report, zkCNN is particularly interesting in that it outperforms other general-purpose proof systems in proof generation speed and RAM consumption.

Zator is a project that aims to explore the use of recursive SNARKs to verify deep neural networks. A current limitation in verifying deeper models is fitting the entire computation trace into a single circuit. Zator proposes to verify one layer at a time using recursive SNARKs, incrementally verifying N steps of repeated computation. They use Nova to reduce N instances of computation into a single instance that can be verified in one step. Using this approach, Zator was able to SNARK a network with 512 layers, which is as deep as many production AI models today. Zator's proof generation and verification times are still too long for mainstream use cases, but their composition techniques are interesting nonetheless.

04 Applications

Given zkML's early days, its focus has largely been on the aforementioned infrastructure. However, there are currently several projects dedicated to application development.

Modulus Labs is one of the most diverse projects in the zkML space, working on both example applications and related research. On the application side, Modulus Labs demonstrated zkML use cases with RockyBot, an on-chain trading bot, and Leela vs. the World, a chess game pitting humanity against a verified, on-chain Leela chess engine. The team also conducted research, writing The Cost of Intelligence, which benchmarks the speed and efficiency of various proof systems across model sizes.

Worldcoin is trying to apply zkML to create a privacy-preserving proof-of-personhood protocol. Worldcoin processes high-resolution iris scans using custom hardware and inserts them into a Semaphore implementation. The system can then be used to perform useful operations such as proof of membership and voting. They currently use a trusted runtime environment with a secure enclave to verify the camera-signed iris scans, but their ultimate goal is to use zero-knowledge proofs to verify that the neural network's inference is correct, providing cryptographic-grade security assurance.

Giza is a protocol that takes a completely trustless approach to deploying AI models on-chain. Its technology stack uses the ONNX format to represent machine learning models, the Giza Transpiler to convert those models into the Cairo program format, the ONNX Cairo Runtime to execute models in a verifiable and deterministic way, and Giza Model smart contracts to deploy and execute models on-chain. Although Giza could also be categorized as a model-to-proof compiler, their positioning as a marketplace for ML models is one of the most interesting applications right now.

Gensyn is a distributed network of hardware providers for training ML models. Specifically, they are developing a probabilistic audit system based on gradient descent, and they use model checkpointing so that a distributed GPU network can serve the training of full-scale models. Although their application of zkML here is very specific to their use case (they want to ensure that when a node downloads and trains part of a model, it is honest about the model updates), it demonstrates the power of combining zk and ML.

ZKaptcha focuses on the bot problem in web3 and provides captcha services for smart contracts. Their current implementation has end users generate a proof of human work by completing a captcha, which is verified by their on-chain validator and accessed by the smart contract with a few lines of code. Today they rely mainly on zk alone, but they intend to implement zkML in the future, analyzing behavior such as mouse movements to determine whether a user is human, similar to existing web2 captcha services.

Given the early days of the zkML market, many applications have already been experimented with at the hackathon level. Projects include AI Coliseum, an on-chain AI competition that uses ZK proofs to verify machine learning output; Hunter z Hunter, a photo scavenger hunt that uses the EZKL library to verify the output of an image classification model with Halo 2 circuits; and zk Section 9, which converts AI image-generation models into circuits for minting and verifying AI art.

zkML challenges

Despite rapid progress in improvements and optimizations, the field of zkML still faces some core challenges. These challenges are both technical and practical, and include:

  • Quantization with minimal loss of precision

  • Circuit size, especially when the network consists of multiple layers

  • Efficient Proof of Matrix Multiplication

  • Adversarial attacks

Quantization is the process of representing floating-point numbers as fixed-point numbers. Most machine learning models use floating-point numbers to represent model parameters and activations, but the field arithmetic of zk circuits requires fixed-point numbers. The impact of quantization on the accuracy of a machine learning model depends on the level of precision used. In general, lower precision (i.e., fewer bits) tends to reduce accuracy, since it introduces rounding and approximation errors. However, several techniques can minimize the impact of quantization on accuracy, such as fine-tuning the model after quantization and using quantization-aware training. In addition, Zero Gravity, a hackathon project at zkSummit 9, showed that alternative neural network architectures developed for edge devices, such as weightless neural networks, can be used to avoid quantization problems in circuits entirely.
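
A minimal sketch of the idea (illustrative values; real pipelines use per-layer scales and quantization-aware training): floating-point weights are mapped to integers via a scale factor, and the resulting rounding error is exactly what quantization techniques try to minimize.

```python
import numpy as np

def quantize(x: np.ndarray, bits: int = 8):
    """Map floats to fixed-point integers using one global scale factor."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(x).max() / qmax
    q = np.clip(np.round(x / scale), -qmax - 1, qmax).astype(np.int32)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q * scale

weights = np.array([0.73, -1.52, 0.04, 2.11])
q, scale = quantize(weights, bits=8)
error = np.abs(weights - dequantize(q, scale)).max()
print(q, f"max rounding error = {error:.5f}")   # shrinks as bits grow
```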

Besides quantization, hardware is another key challenge. Once a machine learning model is correctly represented as a circuit, verifying proofs of its inference becomes cheap and fast thanks to zk's succinctness. The challenge here lies not with the verifier but with the prover, as RAM consumption and proof generation time grow rapidly with model size. Certain proof systems (such as GKR-based systems using the sumcheck protocol and layered arithmetic circuits) or composed techniques (such as combining Plonky 2, which is efficient in proving time but performs poorly for large models, with Groth 16, whose proof size does not grow with model complexity) are better suited to handling these issues, but managing the tradeoffs is a core challenge for zkML projects.

In terms of adversarial attacks, there is still work to be done. First, if a trustless protocol or DAO chooses to implement a model, there is still a risk of adversarial attacks during the training phase (e.g., training a model to exhibit a certain behavior when it sees a certain input, which could be used to manipulate subsequent inference). Federated learning techniques and zkML applied to the training phase may be ways to minimize this attack surface.

Another core challenge is the risk of model-theft attacks when a model is kept private. While it is possible to obfuscate a model's weights, given enough input-output pairs it is still theoretically possible to infer the weights in reverse. This is mostly a risk for small-scale models, but a risk nonetheless (a toy demonstration follows).
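
For intuition, here is a toy demonstration (extraction attacks on real non-linear models are far harder, but the principle is the same): a hidden linear model's weights can be recovered exactly from enough input-output queries.

```python
import numpy as np

rng = np.random.default_rng(42)
hidden_w = rng.normal(size=4)             # the MP's "private" weights

def query_model(x: np.ndarray) -> float:
    """Oracle access: the attacker sees outputs, never the weights."""
    return float(x @ hidden_w)

# With enough independent queries, least squares recovers the weights.
X = rng.normal(size=(16, 4))
y = np.array([query_model(x) for x in X])
recovered, *_ = np.linalg.lstsq(X, y, rcond=None)

print(np.allclose(recovered, hidden_w))   # True: the weights are stolen
```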

Scalability of smart contracts

While there are still challenges in optimizing these models to run within the constraints of zk, improvements are being made at an exponential rate, and some expect that with further hardware acceleration, zkML will soon reach parity with the broader field of machine learning. To underscore the pace of these improvements: zkML went from 0xPARC's 2021 demonstration of a small-scale MNIST image-classification model in a verifiable circuit, to Daniel Kang's paper doing the same for ImageNet-scale models less than a year later. In April 2022, the accuracy of this ImageNet-scale model improved from 79% to 92%, and large models like GPT-2 are expected to become feasible in the near future, though proving times remain long for now.

We see zkML as a rich and growing ecosystem designed to extend the capabilities of blockchains and smart contracts, making them more flexible, adaptable, and intelligent.

Although zkML is still in the early stages of development, it has already begun to show promising results. As the technology develops and matures, we can expect to see more innovative zkML use cases on-chain.
