Understanding the status quo and future of distributed computing in one article
星球君的朋友们
2018-08-11 03:05
This article is about 6,052 words; reading it takes about 24 minutes.
How different projects are approaching two problems: getting ever more computers connected to their networks, and isolating tasks from the compute nodes on which they run.

This article is from Chain News ChainNews (ID: chainnewscom). Author: Dani Grant, analyst at Union Square Ventures; translated by Zhan Juan; republished with permission.

People have been trying to build distributed computing networks since the 1990s. In 1996, GIMPS, the Great Internet Mersenne Prime Search, used distributed computing to hunt for prime numbers, and in 1999, SETI@home used volunteers' computing power to search for signs of extraterrestrial life.

Now, 25 years later, the last few small pieces of the puzzle seem to be in place.

One application of cryptocurrencies that we have always been excited about is distributed computing. Before cryptocurrencies, there was no way for my laptop to send a little money to a stranger as thanks for running a machine learning job on their idle server. Cryptocurrencies finally allow machine-to-machine payments, so participating nodes can be compensated for the tasks they run.

We have been following distributed computing projects and wanted to share how different projects are approaching those two problems: getting ever more computers connected to their networks, and isolating tasks from the compute nodes on which they run.

The following are our preliminary findings; we share them here and welcome corrections.


Ways to grow the network

Metcalfe's law applies to computing networks: the more machines on the network, the more likely it is that a machine will be available to accept a new task when needed. Growing a computing network is difficult, especially in an increasingly crowded space. To be clear, the problem is not that people are unwilling to install the software; it is that a project trying to break through has to cut through a lot of noise.

Here are four interesting approaches we have seen.

Approach 1: Make it easy for anyone to join the network.

An example is the KingsDS pre-beta. To join, you simply visit a URL in your browser and let the tab run in the background.

Approach 2: Help other applications get compensated for sharing their users' resources.

One example is the FREEDcoin pre-beta. They provide an SDK for game developers; when players launch a game running the FREEDcoin SDK, they have the opportunity to contribute their CPU in exchange for in-game rewards. This creates a win-win situation: FREEDcoin attracts high-performance gaming PCs to its network, game developers can monetize games without displaying ads, and players get the chance to earn virtual rewards.

Approach 3: Let the same client both submit tasks and perform computations.

Golem beta, for example, lets its client both submit tasks and perform calculations, which means every end user can also become a compute node. This helps the network grow evenly on both the supply and demand sides.

Approach 4: Provide computing resources to other computing projects.

One example is SONM beta, a project that attempts to help other computing networks scale rapidly. With SONM's open marketplace, machines can display how much RAM, CPU, and GPU they have available in a standardized format. Any project using SONM can then search the entire SONM network for machines with available resources.
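To make the idea concrete, here is a minimal sketch of what a standardized resource advertisement and a matching query might look like. The field names and types are illustrative assumptions, not SONM's actual marketplace schema.

```typescript
// Illustrative sketch of a standardized resource listing and a matching query.
// Field names are assumptions for exposition, not SONM's real marketplace schema.
interface ResourceListing {
  machineId: string;
  cpuCores: number;     // available CPU cores
  ramGb: number;        // available RAM in GB
  gpus: number;         // available GPUs
  pricePerHour: number; // asking price, in the network's payment token
}

// A buyer searches the marketplace for machines that meet its requirements,
// cheapest offers first.
function findMachines(
  listings: ResourceListing[],
  need: { cpuCores: number; ramGb: number; gpus: number },
): ResourceListing[] {
  return listings
    .filter(l => l.cpuCores >= need.cpuCores && l.ramGb >= need.ramGb && l.gpus >= need.gpus)
    .sort((a, b) => a.pricePerHour - b.pricePerHour);
}
```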


Methods for isolating tasks from hosts

One challenge is to ensure that tasks cannot read or modify the host's memory, and vice versa. It is also important to isolate multiple tasks from each other if they are running concurrently on a single machine.

In this space, though, two projects are doing something unique and deserve to be singled out.

Enigma pre-beta is designing what they call "secret contracts": much like smart contracts, but because each piece of data is split up and distributed across multiple nodes working on the same computation, no individual node can read any of the data. They implement this with a cryptographic technique developed in the 1980s called secure multi-party computation (MPC). Enigma is building its own chain for storage and computation.

Keep pre-beta is another project taking a similar approach. It also uses multi-party computation to split encrypted data for computation, so that compute nodes cannot read any of the incoming data. With Keep, private data is stored and computed on within the cluster, and the output is published on the blockchain.
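As a rough intuition for how MPC keeps individual nodes blind to the data, here is a toy sketch of additive secret sharing, one of the basic building blocks such schemes use. It is purely illustrative and not Enigma's or Keep's actual protocol; among other simplifications, it uses a non-cryptographic random number generator.

```typescript
// Toy additive secret sharing: a value is split into n shares so that any
// n-1 shares reveal nothing, yet shares can be added locally and recombined.
// Illustrative only; not Enigma's or Keep's actual protocol, and Math.random
// is not a cryptographically secure source of randomness.
const P = 2n ** 61n - 1n; // public prime modulus

function share(secret: bigint, n: number): bigint[] {
  const shares: bigint[] = [];
  let sum = 0n;
  for (let i = 0; i < n - 1; i++) {
    const r = BigInt(Math.floor(Math.random() * Number.MAX_SAFE_INTEGER)) % P;
    shares.push(r);
    sum = (sum + r) % P;
  }
  shares.push((((secret - sum) % P) + P) % P); // final share fixes the total
  return shares;
}

function reconstruct(shares: bigint[]): bigint {
  return shares.reduce((acc, s) => (acc + s) % P, 0n);
}

// Three nodes each hold one share of a and one share of b. Each node adds its
// two shares locally; recombining the results yields a + b, yet no single node
// ever saw a or b.
const a = share(42n, 3);
const b = share(100n, 3);
const sumShares = a.map((s, i) => (s + b[i]) % P);
console.log(reconstruct(sumShares)); // 142n
```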


Final Thoughts: Narrow Use Cases Vs. Wide Use Cases

For distributed computing projects, there are two approaches: build a general-purpose computing tool that can accept any workload, or accept only a small range of tasks.

Most of the companies Union Square Ventures invests in start out doing one thing, grow in that one thing, and build a network and platform around it. Cloudflare, Stash, and Carta, all of which we invested in, followed this path.

I think the same pattern works well for computing networks: start with a narrow use case, such as training machine learning models, rendering 3D graphics, or protein folding, which helps a project get off the ground quickly, and expand into other computing fields over time.

One of our partners, Albert Wenger, uses the growth of WeChat to illustrate this theory: WeChat started with chat, and the success of the chat app let them expand their network so they could build payments, e-commerce, and gaming on top of it; today WeChat has grown into an all-in-one application.

There seem to be two paths: one starts with machine learning training tasks, since machine learning is one of the drivers of growing demand for computing resources. The other starts with use cases such as 3D rendering or academic and scientific computing, where there is no overhead of protecting private data.

Overall, it is early days for this field, but the prospects are exciting. Greater competition among computing suppliers will not only drive down prices and spur innovation, it may also enable a new class of applications, such as VR and self-driving cars, that can only exist if distributed compute on nearby end devices is a few hundred milliseconds faster than a round trip to us-west-2.

The above is a summary of what I wrote back in June, outlining the types of computing projects we are seeing. In just the past two months there have been many rapid developments in this field, and below are some further observations.

Isolated Networks vs. Open Protocols

There are two possible models for distributed computing.

In one model, a dominant distributed computing protocol creates a shared network of computers on which anyone can build interfaces and clients.

In the other model, a few dominant computing projects each run their own network of computers.

Both models allow multiple projects to co-exist and serve different audiences, but in one, projects are clients sitting on the same shared resource pool, and in the other, they each run their own separate network. The two models could in principle co-exist, but given network effects I do not think they will in practice. Given the opportunity, projects are likely to tap into an existing network of computers rather than build their own, because access to more CPUs from day one provides better quality of service to customers than starting from scratch.

We see both being tried. SONM is one project attempting to build a shared resource layer; another is the Distributed Compute Protocol (DCP) built by Distributed Compute Labs. Most other projects are currently building their own networks, though with an open protocol there is nothing stopping anyone from building alternative interfaces to these projects. We may also see projects that start out as their own systems and then grow organically into clients on top of a shared resource layer. I'm really excited about the possibility of a shared computing layer and about the teams and projects trying to build it.


The problem with tokens

One of the things we have been thinking about is which tokens developers will use and which tokens end users will use. That is: if a user interacts with a DApp that runs code on a distributed computing network, does the user pay the DApp in the same token that the DApp uses to pay for computing?

Hypernet and Truebit, on the other hand, are two computing projects with a dual-token model.

In Truebit, buyers can pay for services in ETH, while the Truebit TRU token is used only for protocol-specific staking and dispute-resolution functions. This matches a pattern we've seen this year with infrastructure projects like The Graph and Augur, which use mainstream consumer currencies for transactions while their own tokens are used only for governance, staking, and dispute resolution.

I expect more projects to move to the dual-token model in the future, as it allows the price of governance to rise as the network grows without raising the price of using its services.

EC2 Model vs. Lambda Model

In the existing web2 world, there are two main types of computing services: in the EC2 model, developers get an environment in which to run and host services; in the Lambda model, developers write functions that can be invoked on demand.

Distributed computing projects can also be divided into two categories. One is the Lambda (or Cloudflare Workers 😉) style, where users write scripts and the project runs them on participating machines. The other is the EC2, or "other people's computers", style: a user is matched with someone on the network and can run containers on that person's machine.

Note that the Lambda-style projects are not yet exactly like Lambda: machines in these distributed networks do not store all the functions that have been pushed to them and invoke them on demand. Instead, these networks are used to run offline, asynchronous scripts for use cases such as scientific computing or rendering graphics. I expect they will become more like serverless computing as latency issues improve.

Hosting a DApp frontend requires a persistent host, while running one-off computations is better suited to a serverless platform.
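To make the distinction concrete, here is a tiny sketch of how job specifications might differ between the two categories. The type names and fields are illustrative assumptions, not any project's actual API.

```typescript
// Illustrative job specifications for the two categories described above.
// Names and fields are assumptions for exposition, not any project's real API.

// Lambda-style: a self-contained script plus its input, run once on some
// participating machine and returning a result.
interface FunctionJob {
  script: string;         // e.g. the body of a pure function
  input: unknown;         // serialized arguments
  maxDurationSec: number;
}

// EC2-style: a container image leased onto a matched provider's machine for
// as long as the buyer keeps paying.
interface ContainerLease {
  image: string;          // e.g. "registry.example.com/my-dapp-frontend:1.0"
  cpuCores: number;
  ramGb: number;
  pricePerHourToken: number;
  providerId: string;     // the matched machine on the network
}

// Example values for each category.
const oneShotJob: FunctionJob = {
  script: "return data.reduce((a, b) => a + b, 0);",
  input: [1, 2, 3],
  maxDurationSec: 600,
};

const frontendLease: ContainerLease = {
  image: "registry.example.com/my-dapp-frontend:1.0",
  cpuCores: 2,
  ramGb: 4,
  pricePerHourToken: 0.05,
  providerId: "provider-123",
};
```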

Two projects building hosted platforms are Akash and DADI. From an end-user perspective, Akash looks a lot like a traditional computing service: developers manage containers on machines provisioned through Akash, in a Kubernetes cluster that can be federated across machines on the Akash network. It is no coincidence that Akash was founded by Greg Osuri, who is also a Federated Kubernetes contributor. If you want to try Akash, they recently launched a testnet.

Two projects building serverless platforms are Ankr and DCP.


How to utilize hardware devices

What makes distributed serverless computing projects unique among cryptocurrency-based distributed computing networks is that they can run code on strangers' phones and laptops: because they run only one small script at a time, they do not need to occupy a computing environment persistently.

The idea here is that these projects can pool all the unused end-user CPUs into one gigantic supercomputer, priced below what is currently available in the cloud computing market.

A couple of words about pricing: the prevailing opinion is that distributed networks will be cheaper because they don't have to pay for physical space and the hardware capital cost has already been paid by the device's owner. However, as Mario Laul, a researcher at the venture capital firm Placeholder, pointed out to me, cloud computing pricing is a race to the bottom; if distributed services emerge and undercut the major players, cloud providers may push prices down toward bare sustaining costs in order to remain competitive.

I am very interested in the projects now trying to provide high-performance computing environments by pooling the available CPUs of end-user devices.

There are three major challenges to running code on end users' devices.

The first is to convince enough people to participate; that was discussed above.

The second challenge is the relatively low performance of end-user devices.

To address this, we are seeing some projects built in a parallelized fashion, running code concurrently across multiple machines. Ankr lets users package their code into chunks and submit them to the network separately; a job scheduler then distributes the chunks to different machines. DCP automatically distributes an application's subtasks across machines as JavaScript objects executed in Web Workers. DCP has another clever trick: it uses WebGL to access the graphics processor of the end-user device, which improves efficiency even further.
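To illustrate the general fan-out pattern (and not DCP's or Ankr's actual API), here is a minimal browser-side sketch that splits a job into chunks and runs each chunk in its own Web Worker; all names are illustrative.

```typescript
// Minimal sketch of the fan-out pattern described above: split a job into
// chunks and run each chunk in its own Web Worker. Purely illustrative;
// not the actual Ankr or DCP API.
function runChunkInWorker(chunk: number[]): Promise<number> {
  // Inline worker that sums the numbers it receives and posts the result back.
  const src = `onmessage = (e) => postMessage(e.data.reduce((a, b) => a + b, 0));`;
  const worker = new Worker(URL.createObjectURL(new Blob([src], { type: "text/javascript" })));
  return new Promise((resolve) => {
    worker.onmessage = (e) => { resolve(e.data as number); worker.terminate(); };
    worker.postMessage(chunk);
  });
}

async function parallelSum(data: number[], numChunks: number): Promise<number> {
  const size = Math.ceil(data.length / numChunks);
  const chunks = Array.from({ length: numChunks }, (_, i) => data.slice(i * size, (i + 1) * size));
  const partials = await Promise.all(chunks.map(runChunkInWorker)); // run chunks concurrently
  return partials.reduce((a, b) => a + b, 0);
}

// Example: parallelSum(Array.from({ length: 1_000_000 }, (_, i) => i), 8).then(console.log);
```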

A third challenge is that the end user's device is not trusted hardware.

Since we published the first half of this article in June, great progress has been made in leveraging SGX, the trusted execution environment built into Intel chips.

Since then, Enigma has released a testnet that uses SGX for computation, Golem has released Graphene-ng to help developers write SGX-enabled code, and Oasis Labs has raised $45 million from institutions such as a16z to build an SGX-enabled distributed computing platform.

I've been a fan of SGX myself because it's fairly secure and easy to implement on consumer laptops.

Besides SGX, another way that distributed computing protocols can verify computation is through dispute resolution. Truebit is a computing project with a dispute resolution protocol, which they call a "verification game". Validators use TRU tokens to challenge calculation results.

In Truebit's dispute resolution mechanism, the "solver"'s state is hashed at each time step of running the program. Because any given instruction might not be executable within Ethereum's gas limit, Truebit breaks each instruction down into 16 sub-steps. Validators then query the hashed states to locate the faulty instruction, and the disputed step or sub-step is run on Ethereum to get the final result. Whichever side is wrong loses its stake, and the tokens are paid out to the winning side.
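As a rough illustration of the interactive search at the heart of such a verification game (and not Truebit's actual on-chain logic), the sketch below binary-searches over the solver's claimed state hashes to find the first step where the challenger disagrees; only that single step would then need to be re-executed on-chain.

```typescript
// Toy sketch of the bisection step of a verification game. The challenger and
// solver agree on the initial state and disagree on the final state; binary
// search over the intermediate state hashes pinpoints the single step whose
// execution must be replayed on-chain. Illustrative only, not Truebit's code.
function findDisputedStep(
  solverHashes: string[],   // solver's claimed state hash after each step (index 0 = initial state)
  verifierHashes: string[], // challenger's locally recomputed state hashes
): number {
  let agree = 0;                          // both sides agree on the initial state
  let disagree = solverHashes.length - 1; // and disagree on the final state
  while (disagree - agree > 1) {
    const mid = (agree + disagree) >> 1;
    if (solverHashes[mid] === verifierHashes[mid]) agree = mid;
    else disagree = mid;
  }
  return disagree; // the first step where the two executions diverge
}
```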

Where on the stack is the best place to do the computation?

An open question is whether computing services will end up being a layer-1 or layer-2 solution. That is to say:

Today, calculations are done off-chain, because the main blockchains available are Bitcoin, which has a limited scripting language, and Ethereum, on which computation is expensive and slow. In the future, a layer-1 blockchain will likely be able to perform computations in a way that does not require every node in the network to run the same computation, which would make computation cheaper and faster. Perlin is one project attempting to build this. But even in Perlin, computing services are implemented as sidechains of the main Perlin base chain.

Most projects are either building sidechains of existing blockchains, or off-chain networks that are completely independent of existing base chains. Render is an example of the first approach: a sidechain of an existing blockchain, where Ethereum smart contracts interact with the Render network. Akash is an example of the latter: a standalone off-chain network, entirely separate from any existing chain.

I prefer lightweight, horizontal protocols that can be layered on top of one another, rather than one omnipotent super-protocol blockchain. That is how the internet works today: small protocols stacked on top of each other (SMTP > STARTTLS > TCP > IP). It leads to reusable modules (QUIC and DNS can both use UDP without any changes to UDP) and the ability to easily replace or upgrade a layer; for example, HTTP can be swapped for SPDY or upgraded from HTTP 1.1 to HTTP 2.0 without changing the layers below.

Opening up regional markets

Finally, we have seen that some projects like to concentrate on a single regional market, which may be a very clever strategy.

