Mind Reading: Extracting "Knowledge" from Zero-Knowledge Proofs
Ambi (SECBIT) Labs
2019-08-28 02:28
This article is about 9,828 words; reading the full text takes about 39 minutes.
In this article we analyze three properties of an interactive system (security protocol): "completeness", "soundness", and "zero-knowledge".

Editor's Note: This article comes from Ambi Labs (ID: secbitlabs), author: Guo Yu; reproduced by Odaily with authorization.

This article has also been updated on GitHub: https://github.com/

And what, Socrates, is the food of the soul? Surely, I said, knowledge is the food of the soul.

— Plato

Some theories are fascinating, and zero-knowledge proof is one of them. After mulling it over for a long time, I want to write something down and discuss it with everyone. This article is the third in the series "Exploring Zero-Knowledge Proofs". The full text is about 8,000 words, with a few mathematical formulas.


"Zero Knowledge" vs. "Soundness"

We can see these three properties in many articles introducing zero-knowledge proofs:

  • Completeness

  • Soundness (often translated as "reliability")

  • Zero-Knowledge

But few articles explain in depth the meaning and insight behind these properties.

Rather than giving a list of the events that are not allowed to occur, it (the definition of zero-knowledge proof) gives a maximalist simulation condition.

— Boaz Barak

In "Series (2): Understanding "Simulation"", we introduced the concept of the "simulator". Many introductory articles avoid talking about "simulation", but simulation can be said to be the core of security protocols, because it is the key weapon for defining "security".

Usually, we define security in such a way that we first list some security events and then state: If a system is secure, none of the listed security events will occur.

To borrow (and paraphrase) the words of cryptographer Boaz Barak: "zero-knowledge proof" is not defined by giving a list of events that are not allowed to happen; instead, it directly gives an extreme "simulation condition".

The so-called "simulation condition" means constructing an "ideal world" through "simulation" and making it indistinguishable from the "real world"; since there is no knowledge in the ideal world, we can conclude that the real world satisfies "zero-knowledge".

Next, we continue to analyze the three properties of an interactive system (security protocol): "completeness", "soundness", and "zero-knowledge".

  • Soundness: Alice cannot pass Bob's verification without knowledge.

  • Completeness: Alice can pass Bob's verification if she has knowledge.

  • Zero-knowledge: Alice does not disclose any information about the knowledge during the interaction.

We can see a kind of "symmetry" between "soundness" and "completeness": soundness guarantees that a malicious Alice must fail, while completeness guarantees that an honest Alice must succeed.

"Completeness" is easier to prove. As long as Alice is honest and Bob is honest, then everyone is happy. This is like writing a piece of code, feeding a test case, and calling it a day.

Now let's think about how "soundness" should be defined. The contrapositive of the soundness proposition is: (in the real world) if Alice can pass Bob's verification, then Alice must have knowledge. In other words: Alice knows that... "secret"!

The next question is: how do we prove that Alice knows a "secret"?

This seems difficult too, right? If we need to prove that a machine knows a "secret", the most direct way is to find the "secret" on the machine's hard disk or in its memory, but that exposes the secret. What if the machine is a black box? And what about Alice? We don't have mind-reading skills, so we can't see the secret in her head.

How do we define "to know"?

"Zero knowledge" ensures that the verifier Bob has no (computing) ability to "extract" information related to "knowledge". "Knowledge" that cannot be extracted does not mean that it does not exist. "Reliability" guarantees the "existence" of knowledge.

Guaranteeing "zero knowledge" is only meaningful if "knowledge" exists.This article will discuss "Reliability" and "To Know".Zero-knowledge data exchange protocol zkPoD

』[1] One of the core technologies.

Compact Schnorr protocol

Alice has a secret number a. We can think of this number as a "private key" and "map" it to a point a*G on an elliptic curve group, abbreviated as aG. We regard this point as the "public key".

sk = a
PK = a*G

Please pay attention to the word "mapping". Here we briefly introduce the concept of "homomorphism": there is a homomorphic mapping between a finite field and an elliptic curve group. We use the symbol Zq to denote a finite field, where the prime q is the size of the field, i.e. the set of integers 0, 1, 2, ..., q-1. On an elliptic curve, we use a base point G to generate a "cyclic group", written 0G, G, 2G, ..., (q-1)G, which is exactly a set of q curve points.

Any two curve points can be combined by a special binary operation, G + G = 2G, 2G + 3G = 5G; this operation behaves like "addition", satisfying the commutative and associative laws, so we denote it with the + symbol. The group is called a cyclic group because adding G to its last element (q-1)G wraps around to its first element 0G.

Given any integer r in the finite field, we can compute a corresponding point rG in the cyclic group, also written as the scalar multiplication r*G. But the reverse computation is very "hard"; this is a cryptographic hard problem, known as the discrete logarithm problem [2].

That is to say, given an arbitrary point R on the elliptic curve cyclic group, it is very hard to figure out which integer in the finite field corresponds to R. If the finite field is large enough, say 256 bits, we can consider this reverse computation practically impossible.
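To make this asymmetry concrete, here is a minimal Python sketch in a toy group. For readability it uses the order-11 subgroup of Z_23* written multiplicatively (pow(g, r, p)) as a stand-in for a real elliptic curve; the curve version simply replaces g^r with the scalar multiplication r*G. The parameters p, q and g below are purely illustrative.

```python
# Toy stand-in for the curve group: the order-11 subgroup of Z_23^*, generated
# by g = 4. "r*G" on a curve corresponds to pow(g, r, p) here.
p, q, g = 23, 11, 4   # illustrative only; real groups are ~256-bit

def forward(r):
    # the "easy" direction: r -> r*G
    return pow(g, r, p)

def discrete_log(R):
    # the "hard" direction: brute force only works because the group is tiny
    return next(r for r in range(q) if pow(g, r, p) == R)

assert discrete_log(forward(7)) == 7
```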

The Schnorr protocol makes full use of this one-way mapping between the finite field and the cyclic group to realize one of the simplest zero-knowledge proof protocols: Alice proves to Bob that she has the private key sk corresponding to the public key PK.

Step 1: To ensure zero-knowledge, Alice needs to generate a random number r, which is used to protect the private key from being extracted by Bob. This random number is also mapped onto the elliptic curve group, as R = r*G, and sent to Bob.

Step 2: Bob replies with a random number c as the challenge.

Step 3: Alice computes z = r + a * c according to the challenge and sends z to Bob. Bob checks it with the following formula: z*G ?= R + c*PK = r*G + c*(a*G)

In the third step you can see how Bob checks z "homomorphically". If this equation holds, then Alice is shown to hold the private key a.
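Below is a minimal sketch of one round of the protocol, again using the same toy multiplicative group (p, q, g = 23, 11, 4) as a stand-in for the curve, so that z*G ?= R + c*PK becomes g^z ?= R * PK^c. This is an illustration of the flow above, not production code.

```python
import random

p, q, g = 23, 11, 4                      # toy group; illustrative only

a = random.randrange(1, q)               # Alice's private key sk = a
PK = pow(g, a, p)                        # public key PK = a*G

# Step 1: Alice commits to a fresh random nonce r and sends R = r*G.
r = random.randrange(1, q)
R = pow(g, r, p)

# Step 2: Bob replies with a random challenge c.
c = random.randrange(1, q)

# Step 3: Alice answers z = r + a*c (mod q); Bob checks homomorphically:
#   z*G ?= R + c*PK   <=>   g^z ?= R * PK^c (mod p)
z = (r + a * c) % q
assert pow(g, z, p) == (R * pow(PK, c, p)) % p
print("verification passed")
```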

But why?

The calculation and verification of z is very interesting, and there are several key techniques in it:

First, Bob's verification is done on the elliptic curve group. Bob does not know r, but he knows the point R that r maps to on the curve; Bob does not know a, but he knows the point PK that a maps to on the curve group, namely a*G. Through the homomorphic mapping and the Schwartz-Zippel theorem, Bob can verify that z was computed correctly, and thus know that Alice computed z from r and a, without the values of r and a being exposed.

Second, the random number r generated in the first step of the protocol guarantees the confidentiality of a, because the sum of any secret and a uniformly distributed random number is still uniformly distributed.
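This masking argument can be checked mechanically: for a fixed secret a and challenge c, as r runs uniformly over Zq, the response z = r + a*c (mod q) also runs uniformly over Zq. A tiny sanity check with the illustrative q = 11:

```python
from collections import Counter

q = 11                                   # toy group order
a, c = 5, 3                              # any fixed secret and challenge
# For each possible nonce r, record the resulting response z = r + a*c (mod q).
counts = Counter((r + a * c) % q for r in range(q))
# Every value of z occurs exactly once: z acts as a uniform "one-time pad" over Zq.
assert all(v == 1 for v in counts.values())
```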


Prove zero knowledge

Let's take a look at how the Schnorr protocol proves a weaker "zero-knowledge" property - "SHVZK":

Note: What we prove here is only Special Honest Verifier Zero-Knowledge (SHVZK). SHVZK requires that Bob not misbehave in the protocol: he must follow the protocol, and in the second step he must take a fresh random number from the conveyor belt and use it immediately. In the usual sense, "zero-knowledge" places no requirements on Bob, so this is a weaker property. Although the Schnorr protocol as presented here cannot be proven fully "zero-knowledge", it can achieve full zero-knowledge by adding some protocol steps. We will not expand on the details here; interested readers can refer to [4]. We will come back to this issue when we discuss the Fiat-Shamir transform.

First, the "simulator" simulates an "ideal world". In the ideal world, a conversation between Zlice and Bob is simulated. Zlice has no knowledge of the Schnorr protocol, sk, and Bob has the public key PK. Please look at the picture below. Bob needs to produce a random number c in the second step of the Schnorr protocol. There is an additional requirement here, that is, Bob can only "honestly" get a random number from an external "random number conveyor belt" , each random number must be a one-time distributed random number within the range of 2^k generated by tossing the "coin" k times in advance. Bob cannot generate random numbers any other way, which is why we require Bob to be honest.

Here is how Zlice can fool Bob (a code sketch follows the steps):

Prologue: Note that Zlice has no sk. At this point, Bob's random number conveyor belt has already been pre-loaded with some random numbers.

Step 1: Zlice generates a uniformly distributed random number c, and uses a "superpower" to replace the first random number on Bob's conveyor belt with the c she just generated. Bob cannot detect this.

Step 2: Zlice then generates another random number z, computes R' = z*G - c*PK, and sends R' to Bob.

Step 3: Now Bob takes c from the random number conveyor belt and sends it to Zlice. Note that this c is exactly the c that Zlice generated in the first step.

Step 4: Zlice sends the random number z generated in the second step to Bob, and Bob checks it with the Schnorr verification formula. You can verify that the equation holds perfectly.
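In code, the trick is just solving the verification equation backwards: because Zlice fixes c and z first, she can compute the commitment as R' = z*G - c*PK. A minimal sketch in the same toy group as before (PK^(q-c) plays the role of -c*PK):

```python
import random

p, q, g = 23, 11, 4                      # toy group; illustrative only
a = random.randrange(1, q)               # the "real" key exists somewhere...
PK = pow(g, a, p)                        # ...but the simulator only ever sees PK

def simulate_transcript(PK):
    c = random.randrange(1, q)           # the challenge Zlice will plant on Bob's belt
    z = random.randrange(1, q)           # the response, chosen before the commitment
    # solve for the commitment:  R' = z*G - c*PK
    R_prime = (pow(g, z, p) * pow(PK, q - c, p)) % p
    return R_prime, c, z

R_prime, c, z = simulate_transcript(PK)
# Bob's usual check passes even though no private key was used:
assert pow(g, z, p) == (R_prime * pow(PK, c, p)) % p
```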

You can compare this with the Schnorr protocol in the "real world". In both worlds, Bob's verification passes. But the differences are:

  • In the "ideal world", Zlice does not have sk; in the "real world", Alice has sk

  • In the "ideal world", z is a random number that does not involve sk; in the "real world", z is computed from sk

  • In the "ideal world", Zlice uses a superpower to replace Bob's random number; in the "real world", Alice cannot see Bob's random number conveyor belt, let alone change the numbers on it

In the Schnorr protocol, could we swap the order of the first two steps? That is, could Bob send the challenge number first, and only then Alice send R = r*G?

The answer is no.

If Alice knows the random challenge in advance, then Alice (in the real world) can fool Bob in exactly the way the simulator Zlice does.


Encounter Simulator

In fact, "soundness" and "zero-knowledge" are also symmetric along another dimension: soundness guarantees that a malicious Alice must fail, and zero-knowledge guarantees that a malicious Bob can never succeed. Interestingly, this symmetry is again reflected in a simulated "ideal world".

Let's analyze the definition of soundness: if Alice lacks knowledge, then Bob's verification fails. Its contrapositive is: if Bob's verification succeeds, then Alice must have knowledge.

Once again we turn to the simulator, and test whether Alice really has the knowledge in an "ideal world" where superpowers are available.

Again, imagine that in a parallel universe there are two worlds, one called the "ideal world" and the other the "real world". The interesting thing about the ideal world is that it is simulated by a "simulator", and the simulator can place NPCs with superpowers into it. This time, two avatars of Alice are placed into the "ideal world" and the "real world" at the same time.

Suppose "you" play the role of Bob, and you want to know whether the Alice you are talking to is really "reliable". So put you into the "ideal world", with the help of an NPC with superpowers, you can "extract" the knowledge of Alice on the opposite side.

What... what? Didn't we just prove that the protocol is zero-knowledge? Zero-knowledge means that Bob cannot extract any fragment of the "knowledge". Pay attention here: "zero-knowledge" is a statement about the "real world"; what we are discussing now is the magical "ideal world".

To repeat, in the "ideal world", you can use an NPC with superpowers to extract Alice's knowledge, so as to ensure that Alice in the "real world" cannot cheat. Imagine a cheating Alice, she must have no knowledge, and without knowledge, it is impossible for NPC to extract anything in the "ideal world".

However, in the "real world", you cannot use NPCs, and of course you cannot see Alice's knowledge, and it will not conflict with the nature of "zero knowledge". Because events in the two worlds are "indistinguishable", we can conclude that in the "real world", Alice must have knowledge.Sort out the ideas: how to prove that Alice cannot cheat in an interactive session? We need to define a "simulation algorithm" for this interactive session, which can simulate an "ideal world", in which there is a special role called "Extractor", which is the NPC we mentioned earlier, it can pass "Super power" to "extract" Alice's knowledge, but make the other party "unaware".Note, superpowers are essential! This is in "

Series (2) Understanding "Simulation"

Last but not least, what are superpowers?

This depends on the proof of the specific interactive system. Let's start with the Schnorr protocol we just talked about.


Proof of Knowledge

Let's prove the "reliability" of the Schnorr protocol and see how this super NPC extracts Alice's private key in the "ideal world". And this "super power" is still "back in time".

Step 1: Alice chooses a random number r, computes R = r*G, and sends R to the "extractor"

Step 2: The extractor picks a random challenge c and sends it to Alice

Step 3: Alice computes the response z = r + a*c, and the extractor checks whether z is correct

Step 4: After the extractor finds that z checks out, it activates its superpower and rewinds time to just before the second step

Step 5: The extractor sends a different random challenge c' to Alice. Alice is back at the second step; she may have a feeling of deja vu, but she cannot perceive that time has been rewound

Step 6: Alice computes z' = r + a*c' and sends it to the extractor for checking

Step 7: Now the extractor has both z and z', and it can directly compute Alice's private key a = (z - z') / (c - c'), achieving "knowledge extraction"

To sum up: in the "ideal world", the "extractor" can completely "extract" Alice's "knowledge" through the superpower of going back in time, which ensures that an Alice without knowledge cannot make the extractor achieve its goal , thus proving "reliability".

Note: Not every soundness proof requires an extractor algorithm. A proof system whose soundness is proven by means of an extractor is called a "Proof of Knowledge".


Interpreting ECDSA Signature Attacks

The ECDSA signature scheme, seen everywhere in blockchain systems, is also essentially a simple zero-knowledge proof system. The elliptic curve digital signature scheme ECDSA is very close to the Schnorr protocol; the signature scheme based on the Schnorr protocol was published in the Journal of Cryptology in 1991 [5]. In 1991, when the National Institute of Standards and Technology (NIST) was selecting a digital signature algorithm, the elegant Schnorr signature scheme was unfortunately covered by a patent, so NIST proposed another scheme, DSA (Digital Signature Algorithm); the variant of DSA over elliptic curves is what we now call ECDSA. When Satoshi Nakamoto designed Bitcoin, he chose ECDSA as the signature algorithm, but for the curve he did not pick the one recommended by the NIST standard, secp256r1, choosing secp256k1 instead. Rumor has it that NIST may have tampered with the selection of the elliptic curve parameters, allowing certain agencies to solve discrete logarithms by unknown means and thus wield a "superpower" in the "real world". Many people suspect that Satoshi Nakamoto had this concern back then and deliberately chose secp256k1, a curve that on the surface appears slightly "less secure".

Let's take the ECDSA signature apart and define an ECDSA-like authentication scheme in an interactive form. The interaction proceeds as follows:

Step 1: Alice still chooses a random number k, and maps k to the elliptic curve to get point K, and then sends it to Bob

Step 2: Bob needs to generate two random numbers, c and e, and give them to Alice

Step 3: Alice computes s (as implied by the attack formulas below, s = (c + e*a) / k mod q) and sends it to Bob, who verifies that s was computed correctly

Note: For readers familiar with the ECDSA signature scheme, a brief explanation: the c generated by Bob corresponds to the hash Hash(m) of the message being signed, and e corresponds to the value produced by a conversion function F(K), where F(.) takes the x-coordinate of the curve point and reduces it (mod q) [6].
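Since the original figure is not reproduced here, the sketch below fills in the formulas implied by the text and by the attack equations further down: Alice's response is assumed to be s = (c + e*a) / k (mod q), and Bob's check s*K ?= c*G + e*PK becomes K^s ?= g^c * PK^e in the toy multiplicative group. Treat it as an assumption-laden illustration of the interactive variant, not a faithful ECDSA implementation.

```python
import random

p, q, g = 23, 11, 4                      # toy group; illustrative only

a = random.randrange(1, q); PK = pow(g, a, p)      # Alice's key pair

# Step 1: Alice commits to a random nonce k and sends K = k*G.
k = random.randrange(1, q); K = pow(g, k, p)

# Step 2: Bob sends two random challenges c and e.
c, e = random.randrange(1, q), random.randrange(1, q)

# Step 3: Alice answers s = (c + e*a) / k (mod q)  (assumed response formula).
s = ((c + e * a) * pow(k, -1, q)) % q

# Bob's check:  s*K ?= c*G + e*PK   <=>   K^s ?= g^c * PK^e (mod p)
assert pow(K, s, p) == (pow(g, c, p) * pow(PK, e, p)) % p
```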

It is well known that the ECDSA signature scheme has a serious pitfall: if the same random number (nonce) is used in two signatures, the signer's private key is exposed. In fact, the Schnorr signature scheme has exactly the same problem.

If Alice uses the same K in two interactions, then Bob can obtain s and s' by sending two different challenges c and c', and then compute the private key a with the following formulas:

k = (c - c') / (s - s')
a = (k*s - c) / e

This is exactly what happened to Sony. When Sony PlayStation 3 engineers called the ECDSA library function, they passed a constant in the parameter position where a fresh random number should have gone. Hackers familiar with cryptography discovered this serious backdoor, and in January 2011 the wunderkind Geohot publicly released the master private key of the Sony PS3, which means any user could easily obtain root privileges on the console. Sony was furious afterwards... (you can search for the follow-up story online)
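In code, this "backdoor" is simply the extractor run in the real world. A sketch, continuing the assumed response formula s = (c + e*a) / k from the previous block, with small illustrative numbers:

```python
q = 11                                    # toy group order; illustrative only

def sign(a, k, c, e):
    # the ECDSA-like response: s = (c + e*a) / k  (mod q)
    return ((c + e * a) * pow(k, -1, q)) % q

def recover_from_nonce_reuse(c1, s1, c2, s2, e):
    # same k and e, two different challenges:
    #   k = (c - c') / (s - s'),   a = (k*s - c) / e   (mod q)
    k = ((c1 - c2) * pow((s1 - s2) % q, -1, q)) % q
    a = ((k * s1 - c1) * pow(e, -1, q)) % q
    return a, k

a, k, e = 5, 8, 3                         # Alice's key, the reused nonce, F(K)
s1 = sign(a, k, c=2, e=e)
s2 = sign(a, k, c=9, e=e)
assert recover_from_nonce_reuse(2, s1, 9, s2, e) == (a, k)
```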

So how should we look at this "security backdoor"? Think about it: this backdoor is almost exactly the same as the soundness proof of the Schnorr protocol we gave earlier! The attack algorithm is precisely the "extractor" algorithm in the "soundness" proof of the ECDSA authentication protocol. The only difference is that in the soundness proof, in order to make Alice use the same random number k twice, the "extractor" needs the superpower of "rewinding time".

But in the Sony PS3 system, the random number was hard-coded to a fixed value by some unnamed engineer, which amounts to handing hackers the "superpower" directly, and in the "real world" at that. In other words, hackers can implement the "extractor" without "rewinding time".

As a reminder, it is not enough that the random number never repeats; the random number must be a cryptographically secure random number.

Imagine that the random number r is produced by a pseudo-random number generator based on "linear congruence". Although the value of r keeps changing, this still cannot prevent "knowledge extraction". Suppose the linear congruential generator is r2 = d*r1 + e (mod m), and go back to the third step of the Schnorr protocol:

z1 = r1 + c1*a
z2 = r2 + c2*a

If the attacker gets Alice to produce two consecutive signatures, then after substituting r2 = d*r1 + e, there are two linear equations in two unknowns (r1, a). The values z1, z2, c1, c2, d, e are all known to the attacker, and this system can be solved with nothing more than junior-high-school algebra.
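Here is a sketch of that algebra, assuming for simplicity that the LCG works modulo the group order q (so r2 = d*r1 + e mod q); substituting the second nonce gives a = (z2 - d*z1 - e) / (c2 - d*c1) mod q:

```python
q = 11                                    # toy group order; illustrative only

def recover_from_lcg_nonces(z1, c1, z2, c2, d, e):
    # z1 = r1 + c1*a,  z2 = (d*r1 + e) + c2*a
    # => z2 - d*z1 - e = (c2 - d*c1) * a
    a = ((z2 - d * z1 - e) * pow((c2 - d * c1) % q, -1, q)) % q
    r1 = (z1 - c1 * a) % q
    return a, r1

# Demo: Alice's key a and first nonce r1, with a known LCG r2 = d*r1 + e.
a, r1, d, e = 5, 4, 3, 2
r2 = (d * r1 + e) % q
c1, c2 = 6, 9                             # the attacker's two challenges
z1, z2 = (r1 + c1 * a) % q, (r2 + c2 * a) % q
assert recover_from_lcg_nonces(z1, c1, z2, c2, d, e) == (a, r1)
```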

Please note that this is not a "design flaw" of the Schnorr protocol (or of ECDSA). On the contrary, it is a rather delicate piece of design that guarantees the soundness of the protocol in principle, and similar techniques appear frequently in cryptographic protocols, for example to achieve "succinctness". But it has to be said that if the internal mechanism of a protocol is not well understood, in particular the distinction between the "ideal world" and the "real world", users can easily introduce all kinds of exotic "security holes".

As responsible programmers who want to write safe and reliable software, what do we need to know? It would of course be best to thoroughly understand the design mechanisms of the security protocols we use, but in most cases that is unrealistic. Generally we use the various cryptographic tools as "black boxes", yet that may not be enough. We had better also understand:

  • What exactly is the "security assumption"?

  • What exactly is the "superpower" in the "ideal world"?


A wild thought: Are we living in a simulated world?

When I read "simulator" for the first time, the first thing I thought of was the movie "The Matrix". The "real world" we live in may be an "ideal world" simulated by a certain simulator. Everything we see, hear and perceive is "simulated". In the "real world", we live in a matrix. Yet we are not aware of this.

As early as the Spring and Autumn Period and the Warring States Period, Zhuangzi was also thinking about similar issues:

Once, Zhuang Zhou dreamed he was a butterfly, a butterfly fluttering about, happy with itself and doing as it pleased, not knowing it was Zhou. Suddenly he awoke, and there he was, unmistakably Zhou. He did not know whether he was Zhou who had dreamed of being a butterfly, or a butterfly dreaming of being Zhou. Between Zhou and the butterfly there must be some distinction. This is called the transformation of things.

—— "Zhuangzi Equality of Things"

In layman's terms: Zhuangzi fell asleep one day and dreamed that he had become a butterfly dancing lightly. When he woke up, he found that he was still Zhuangzi. In the dream, the butterfly did not know that it was Zhuangzi. So Zhuangzi pondered: had he turned into a butterfly in his dream, or had the butterfly turned into Zhuangzi in its dream? If the dream is real enough,...

The "brain in a vat" is an idea proposed by American philosopher Gilbert Harman: a person's brain can be put into a container, then plugged in with wires, and by simulating various electrical signal inputs, the brain thinks that it is living in the real world. in the world.

This idea traces back to the philosopher Descartes' "Meditations on First Philosophy" [7], in which he argues that we should doubt everything and test all human knowledge, mathematics, geometry, and the perceived world one by one. However, he found that apart from "I think, therefore I am", all knowledge may be unreliable, because our minds could be deceived by an Evil Demon with "superpowers".

In 2003, Nick Bostrom, a professor of philosophy at the University of Oxford, published a paper in all seriousness, "Are You Living in a Computer Simulation?" [8], arguing that at least one of the following three propositions is true:

  • Human civilization goes completely extinct.

  • Human civilization reaches the technological level needed to fully simulate the real world, but for some reason no one is willing to create a new simulated world and play the role of God.

  • Our current human civilization is living in a simulated world.

Silicon Valley entrepreneur Elon Musk has said in a public interview that the probability that "we live in base reality" is only "one in billions". In other words, he believes we live in a computer game (a simulated world): outside the simulated world there is a programmer who develops and operates this world, and each of us is a game character (an NPC).

After getting tired of jailbreaking the iPhone and working on self-driving cars, the wunderkind Geohot gave a talk titled "Jailbreaking the Simulation" at the South by Southwest conference in March this year [9]. He believes that we live in a simulated world, and that the so-called gods are alive-and-kicking coders in the outer world who programmed and created our "real world". Of course, they may have launched more than one copy of the world. And they themselves may in turn live in an outer "simulated world".

If we do live in a simulated world, maybe we can find a backdoor somewhere on Earth, a "simulation trapdoor", and thereby gain the "simulator's" superpower and extract unimaginable "secret knowledge".

If our world is indeed simulated by a program, that program may well have bugs. And if there are bugs, maybe we can exploit them to escape from the prison, jump out of this "ideal world", and reach the outer world to have a chat with the coder gods.

Is this a joke? The following is an excerpt from a paragraph on Zhihu [10]:

If the world is virtual, what examples can prove it?

1. Why is the macroscopic world so rich and colorful, while the microscopic elementary particles are all exactly identical? Because pictures can differ endlessly, but pixels are all the same kind of thing

2. Why is there an upper limit to the speed of light? Because the machine's operating speed is limited

3. Why is there a Planck constant? Due to the limited precision of the machine data

4. Why are microscopic particles all probability clouds? This is a random perturbation added to avoid the system getting stuck in a loop

5. Why is there the Pauli Exclusion Principle? It seems that the data organization adopted by the system is a multidimensional array

6. Why does a quantum computer run so fast that it can try all possibilities in an instant? Because this essentially calls the interface of the host

8. Why is there an observer effect? This is obviously lazy updating


9. Why did time have a beginning? Because the system has a boot time

to be continued

If you would be a real seeker after truth, it is necessary that at least once in your life you doubt, as far as possible, all things.

— René Descartes

Designing a cryptographic protocol is like walking a tightrope. To achieve "zero-knowledge" and "soundness" at the same time, the contents of the protocol must be made sufficiently random, while still ensuring that the "knowledge" takes part in the protocol's interaction. If the protocol is not designed or implemented correctly, the security of the system collapses: zero-knowledge may be broken, so that the "knowledge" leaks inadvertently; or soundness may be broken, so that anyone can forge a proof. Moreover, this kind of security failure is far more serious than a vulnerability in ordinary low-level code, and much harder to discover. A rigorous mathematical argument therefore seems essential.

Is our world really being simulated by some "Three-Body civilization"? We cannot rule out the possibility; perhaps we need to seriously re-examine our various convictions. But so what? At least my "thoughts" are real.
