Sunday, October 9, 2016

Golem: Generic Trustless, Stateful P2P Application Backends



There is a saying often used to justify the madness of kings: “Absolute power corrupts absolutely.” The cost of maintaining control pushes one to reason that one has a duty to extract every last bit of profit or utility from it. Rather than growing the kingdom and profiting from the established means of reward, a king may choose the local minimum of exploiting those who give him power in order to secure his reward. Modern systems put absolute power in the hands of the person who owns a website, partly because the author of the website must also provide its hosting. This is both a blessing and a curse.

With power comes the ability to find novel ways to profit from the system outside of the site’s social contract. Even if the system only needs to save a few bits of information about each user to provide service, it may need to see much more about the user in order to create the inserted data or to render an interface. The user has no idea how much of their data is saved. If they consent to tracking, it tends to be a single opt-in that grants a blanket pardon for collecting all of this data.

On the negative side, power comes with the ability to harm without retribution. Through misrepresentation or denial of service, the server can decide which users should be ostracized. Major social networks stand accused of selectively displaying certain kinds of information in order to engineer a worldview in their users. Users point at the identity the application holds for them and say “this is me,” while having no final say over what that representation contains. This is much like a king’s authority to edit the society’s stories to be more flattering.

All of this power comes at a cost. On the internet you may be a king, but someone with a warrant can compel you to use this power in their interests. The more popular your service becomes, the more likely it is that attackers will find any vulnerability you have. The legacy of kings is one of anxiety and bloodshed for a reason: nature abhors a power vacuum.

Golem: Trustless, Stateful P2P Application Backends


The problem of power centralization is caused by placing the duty of data storage and manipulation in the hands of the maintainer-owned server. The Internet at large has been trying to fix this through peer-to-peer systems since the days of Napster. We’ve never quite gotten it right though.

The core issue is one of abuse and data integrity. If anybody can write to your database, it becomes impossible to ensure that a bad actor doesn’t remove all of your records and destroy your system. An easy answer is to use cryptography to ensure that an updater “owns” the data being inserted, and this works for messaging and file-sharing systems. It is not sufficient for most systems, though, because the data typically carries semantics that constrain its values; it is the application code’s sequencing of updates that constrains data transitions.

If the data-manipulating application code is allowed to run on client machines, it becomes very difficult to ensure that the requests a client makes are requests the application actually formed. Furthermore, clients then have a copy of the code to reverse-engineer and copy. This is the real motivation behind client-server web apps. But if the data-manipulating code runs on the maintainer’s servers, the maintainer is in a position to be forced to eavesdrop on it or manipulate it.

What if somebody else ran the code responsible for validating and altering the database? It’s not intuitive why this would improve anything. This person would see the client’s intermediate sensitive data, could use it to generate an invalid data alteration, can reverse-engineer the code, and has less investment in correctness than a good user and a good maintainer.

The Golem project arose from a desire to counteract each of these problems through knowledge limitation, intimate accounting of state changes, pervasive rate-limiting, and cryptographically-validated identities.  

Architecture:


Standard Backend Dataflow:

1. Users query DNS to find a request-responding server behind an endpoint (usually through a load-balancer) and send the API request with arguments to the work server.

2. This work server can contact other work servers in the cluster, as well as database servers. It can perform IO at any point during execution, though database manipulation typically happens within a transaction. These servers validate the input and use it to craft database updates.
3. These servers all manipulate a database, relying on indexes for performance.

4. The user has no access to the database.

5. User gets back a response from their endpoint request.
6. A system looks at the database periodically and creates structures from which the maintainer can query “business information.”

Golem turns the MVC app inside out by making the client the intermediary between the application servers and the database.

In Golem:

1. The user queries a cluster for the last version of a document from the maintainer describing the endpoint structure.

2. The user interacts with the database before contacting the work server. They already know the data the endpoint needs to view, and they pass the data to the endpoint after fetching it. Subsequent queries can replay this fetched data.

3. The user looks at their endpoint structure to see which primitive values (Strings, Integers, etc.) must be seen at the same time, whether for string comparison, concatenation, predicates, or similar. If the same data needs to be seen in multiple places, each place counts as a separate observation point. The user counts the observation points and chooses a server for each: no server appears twice, and the client maximizes the geographic distribution of requests. The user then uses the public key in the published document to encrypt a document containing a decryption key and an encryption key for each server; the key the server will decrypt with is the one the client uses to encrypt that observation point. This document saves the work server from tracking keys and removes the need for an unbounded number of negotiation requests. (A sketch of this step appears after this list.)

4. The user sends the request document and database information to a server not yet chosen. This server can see only data structures tracking encrypted mini-blobs of data.

5. Work servers validate that the fetched data was inserted before the last database synchronization point by validating the location of its hash in the global Merkle tree.

6. Work servers send fragments of the input to the servers chosen for each function to service, and get a response from them. From the responses, the return value is crafted. Database updates are threaded through in a single transaction.

7. The client examines the database updates to see whether the work servers stored information in them that the client is not comfortable with. If the client consents to the data storage, they push the updates to the keys they control, along with verification that the data followed a valid path through the program.
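As a concrete illustration of step 3, here is a minimal TypeScript sketch of sealing one observation point, assuming Node’s crypto module and AES-256-GCM for the symmetric layer. All type and function names here are illustrative, not part of any Golem codebase.

```typescript
// Hypothetical sketch: one pair of symmetric keys per observation point,
// wrapped with the chosen server's public key so no key-negotiation
// round-trips are needed mid-request.
import { randomBytes, publicEncrypt, createCipheriv } from "crypto";

interface ObservationPoint {
  data: Buffer;               // the primitives this server may see together
  serverPublicKeyPem: string; // from the Supervisor's published document
}

function sealObservationPoint(point: ObservationPoint) {
  // Two fresh symmetric keys: one for the request, one for the reply.
  const requestKey = randomBytes(32);
  const replyKey = randomBytes(32);

  // Encrypt the observation point under the request key.
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", requestKey, iv);
  const body = Buffer.concat([cipher.update(point.data), cipher.final()]);
  const tag = cipher.getAuthTag();

  // Wrap both symmetric keys with the server's public key.
  const wrappedKeys = publicEncrypt(
    point.serverPublicKeyPem,
    Buffer.concat([requestKey, replyKey])
  );

  return { wrappedKeys, iv, tag, body };
}
```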

The above summary sped through a lot of implementation details, but gives the reader an idea of how the data flows through the system and some of the logic behind the architecture. We will turn to the data structures of the system next, as they dictate the operations of the system.

Data Structures:


The database model that we present is based on the Kappa Architecture (http://milinda.pathirage.org/kappa-architecture.com/).

The Kappa Architecture is a view of databases that arose from the rise of the distributed log. In distributed systems such as databases, a sane cache-eviction policy and cluster-wide consistency are difficult to retrofit onto an existing database. As more and more clusters turn to a distributed log for synchronization, its application to databases becomes attractive. The idea behind the Kappa Architecture is to insert all modifications into a shared log and to create the database by traversing the log from start to finish, executing updates as you go. In this way, the database is a temporary cache of your real data store: the index can be dropped and rebuilt at any time, giving your database a lot of flexibility.
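A toy TypeScript sketch of the idea: the index below is purely a cache, recreated at any time by replaying the log (names illustrative).

```typescript
// The log is the source of truth; the index is a disposable view of it.
type Update = { key: string; value: string };

function rebuildIndex(log: Update[]): Map<string, string> {
  const index = new Map<string, string>();
  for (const update of log) {
    index.set(update.key, update.value); // last write wins
  }
  return index;
}

// Dropping the index loses nothing: replaying the log recreates it.
const log: Update[] = [
  { key: "user/1/name", value: "Ada" },
  { key: "user/1/name", value: "Ada Lovelace" },
];
console.log(rebuildIndex(log).get("user/1/name")); // "Ada Lovelace"
```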


Right now, let’s focus on the actual structure of the database. As we said above, the database is a log; queries are made against “index” structures created by traversing the update log. This means all we have to do is offer a distributed log, right? Not so fast. We don’t want everybody appending to the same log, as this allows clients to interfere with one another. Instead, each client has one log per piece of relevant data, and the program maintainer is responsible for providing snapshots and for creating the indices that the rest of the cluster uses.

Our append-only logs are quite simple. We chose to use the Bitcoin blockchain and the BitTorrent swarm as the database. A user can create a transaction including OP_RETURN, which allows them to embed information in the blockchain; this gives updates a global linear ordering. The embedded information is the hash of a torrent containing the insertion commands to enter into the log.

Finding the newest version of a value at a key then becomes finding the last transaction from a given Bitcoin address and fetching the torrent. This has latency, but clients should never have to query their stream directly. Instead, the Supervisor-made indices exist for primary database querying.
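A hedged sketch of that lookup, with the blockchain and torrent plumbing abstracted away: fetchTransactionsFrom, fetchTorrent, and the Tx shape are assumed helpers for illustration, not real library calls.

```typescript
// Hypothetical helpers; these names are assumptions, not a real API.
interface Tx { opReturnData: string | null }
declare function fetchTransactionsFrom(address: string): Promise<Tx[]>;
declare function fetchTorrent(infoHash: string): Promise<Buffer>;

// Find the newest value at a key: take the stream address's most recent
// transaction carrying an OP_RETURN payload, then fetch the torrent
// whose hash it embeds.
async function latestValue(streamAddress: string): Promise<Buffer | null> {
  const txs = await fetchTransactionsFrom(streamAddress); // newest first
  for (const tx of txs) {
    if (tx.opReturnData !== null) {
      return fetchTorrent(tx.opReturnData); // insertion commands for the log
    }
  }
  return null; // this key has never been written
}
```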

This still begs the question: how do we know that the data in our database is consistent? Can we afford to trust anyone? By using what I’ve termed a “prime stamp,” the Supervisor of the system can ensure that transactions only occur along control-flow paths that the codebase itself would produce.

Prime stamps work because modulo is cheap but prime factorization is hard. Every transaction-verification window, each database-mutation call site is tagged with a large prime. When a server adds an update to the set the client will insert into the database, it multiplies a “state” number by the prime it is responsible for, and signs the result with its private key.

This state is threaded through all requests. The client picks a random number and sends it to the endpoint; from there, the servers thread this number through the computation, multiplying it by their large primes. The client puts the accumulated stamp into a transaction block, in a torrent pushed to each Bitcoin “stream” whose key is modified by the transaction.

If a transaction stamp divides the accumulation, then with very high probability the request was crafted by the servers. Crucially, these stamps are only published after a time-period snapshot is taken. An attacker would therefore need to statistically profile many updates made by clients hitting the same endpoint and recover the primes via GCDs of the accumulations. And not all servers performing the same update will necessarily carry the same prime tag: duplication and versioning allow for a lot of code-path obfuscation. The attacker must then try permutations of the primes found.

How many times can an attacker get feedback on whether a prime stamp is right? They can only get it wrong once, and when they do, they lose the proof-of-work necessary for a server to hold an identity and be trusted with code. It’s also worth noting that this factoring and guessing must be completed within a single time cycle. Even at a very small scale, the problem becomes infeasible when the cycle time is a half hour or less.
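To make the arithmetic concrete, here is a toy TypeScript sketch of stamping and verification with BigInt. The primes are illustratively small; a real deployment would use far larger tags.

```typescript
// Verification is a single modulo; forging a stamp requires factoring
// the accumulated product.

// Each server multiplies the threaded state by the prime tag for the
// database-mutation call site it executed.
function stampStep(state: bigint, sitePrime: bigint): bigint {
  return state * sitePrime;
}

// The verifier checks that the expected stamp divides the accumulation.
function stampDivides(accumulated: bigint, expectedStamp: bigint): boolean {
  return accumulated % expectedStamp === 0n;
}

// Client picks a random blinding factor; servers multiply in their primes.
const clientRandom = 982451653n;
let state = clientRandom;
state = stampStep(state, 1000003n); // first call site, server A
state = stampStep(state, 1000033n); // second call site, server B

const expectedStamp = 1000003n * 1000033n;
console.log(stampDivides(state, expectedStamp)); // true: valid control flow
console.log(stampDivides(state, 1000037n));      // false: path never taken
```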

Protocol


Terminology

Blockchain – the Bitcoin blockchain
Supervisor – the agent running the service; foots the servers’ costs
Server – an agent hosting code for one or more applications
Client – the agent serviced by the Supervisor’s application code, which runs on numerous Servers
Prime stamp – a product of primes representing a database transaction

Supervisor Bootstraps Application

1. Supervisor generates a master public/private keypair and a secondary public/private keypair, and uses the master keypair to sign a certificate chain for the secondary one. This allows the Supervisor to place the secondary keypair on a quasi-untrusted server, and to use the master keypair to revoke or grant this and future secondary keys. (A sketch follows this list.)
2. Supervisor takes the program description (described below) and posts a job-offer document to the blockchain containing a type tag and a torrent hash of the offer. The offer contains specifics like the pay rate per request, the minimum number of servers needed to start serving, the BitTorrent tracker, and the expected client and server proof-of-work difficulties (to prevent denial of service).
3. Supervisor waits until a sufficient number of servers have agreed to participate in the same half-hour time window.
4. Supervisor begins the “Tick” action, repeating until the number of servers drops too low.
5. If the Supervisor wishes to update the code, they post a new job offer referencing the previous one and wait until the next “Tick” for things to propagate.
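Step 1 can be sketched directly with Node’s built-in crypto module; the Ed25519 choice here is an illustrative assumption.

```typescript
// A master keypair certifies a secondary keypair that can live on a
// quasi-untrusted server and be revoked later.
import { generateKeyPairSync, sign, verify } from "crypto";

const master = generateKeyPairSync("ed25519");
const secondary = generateKeyPairSync("ed25519");

// The "certificate" is just the secondary public key signed by the master.
const secondaryPub = secondary.publicKey.export({ type: "spki", format: "der" });
const certificate = sign(null, secondaryPub, master.privateKey);

// Anyone holding the master public key can check the chain.
const valid = verify(null, secondaryPub, master.publicKey, certificate);
console.log(valid); // true
```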

Server Joins Application

1. Server sees the job-offer document type on the blockchain and fetches the torrent.
2. Server decides to accept the job.
3. Server does proof-of-work to create a document accepting the work. This is its identity certificate; it includes a public key and a hostname and port where the server can be reached.
4. Server posts the document to a torrent and puts the hash in the blockchain. Server sends the Supervisor the smallest possible transaction.
5. Server and Supervisor begin a micro-transaction.

Client Reserves Database Key

1. Client picks a new Bitcoin address and pushes a commit with an initial value for the key held there.
2. The client then refers to this new address in a transaction from the address used to register the Client. This includes the type tag of the data stored here; datatypes must be types listed in the endpoint specification.
3. The client’s public key is the only key that can write to it.

Client Makes Endpoint Request

1. Client uses the Tick document and examines the endpoint specification to find out which arguments to pass and how many different servers are needed to service the request.
2. If the endpoint needs a database query to work on, Client uses the Supervisor’s index to find the necessary data.
3. Client uses the Tick document to find a server for each entry-point hash in the document. For each server, the client generates two symmetric keys and encrypts them both with that server’s public key, then encrypts the data that only this server may see. This avoids key negotiation in the middle of servicing a request.
4. Client picks a “top-level” server for the endpoint’s hash, one which won’t be able to see any primitives, and encrypts the entire message with this server’s public key.
5. Client sends the entire payload to the “top-level” server and waits for a response.
6. Client vets the server updates, checking for information leakage.
7. Client inserts updates into its keys by posting a torrent with the diff of updates and inserting its hash into the blockchain.

Supervisor Tick

The tick contains data for the last cycle as well as for the next three “tick” cycles. We keep the three-cycle delay because it gives servers time to load the next cycle, and the buffer is cheap to offer. It’s worth noting that since this is a torrent, servers and clients can download individual files out of it; for this reason, many of the fields below are separate files in the torrent.

Summary Portion:
1. Examine the blockchain for update hashes pushed by tracked client addresses.
2. Concatenate these blocks together and embed them in the document.
3. Rebuild the indexes from the data changes and embed them in the document. This is important because while the logs would be O(N) to traverse on every read, we can create many cached index structures to amortize the cost across updates.
4. Post the collection of transaction stamps that were valid during this time period. It’s vital to post these only while hashing the current values, or clients could retroactively use the stamps.
5. Receive server logs, and pay each server via micro-transaction for the work done.

Next Portion:
1. Check that the minimum number of servers posted in the contract has been offered. If not, sleep for a time cycle.
2. Check whether clients have sent any server-signed transactions that are not accepted transactions. Use the known primes for that cycle to check the factorization of each transaction, which is O(N) in the number of mutations and SIMD-friendly. Use the signature order to find the first server that didn’t add the prime it should have, and eject it from the cluster by placing a “tombstone” in the blockchain for its earlier proof-of-work.
3. Check the gossip documents sent, and see whether servers have been below the promised latency for a fraction of the last few cycles. If so, cancel the contract. Require a number of cycles before giving them another chance, with exponential back-off between chances.
4. Find and shuffle primes for each database-mutation call site.
5. For each transaction control flow that’s acceptable, compute the prime stamp with the new primes.
6. Shuffle and probabilistically duplicate and version the endpoint functions, and distribute them to nodes.
7. Encrypt each code bundle for its server and append the “tick” time at which it should run (three cycles in the future). Each endpoint’s code is a separate file in the torrent.
8. Append the schema, with documentation, for each endpoint message, including the primitive encryption that has to be done for each endpoint. In this way, the API is self-documenting.

Lastly, sign this document, push it to BitTorrent, and put the hash on the blockchain.
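For concreteness, the Tick document might look something like the interface below. Every field name is hypothetical, chosen only to mirror the steps above; each top-level field would be its own file in the torrent.

```typescript
// Hypothetical shape of a Tick document; all names are illustrative.
interface TickDocument {
  cycle: number;                        // which half-hour window this covers
  clientUpdateHashes: string[];         // update hashes from tracked addresses
  rebuiltIndexes: string;               // torrent-relative path to cached index files
  expiredPrimeStamps: bigint[];         // stamps valid last cycle, now safe to publish
  endpointCode: Record<string, string>; // per-server encrypted code, keyed by entry-point hash
  schema: string;                       // self-documenting endpoint specification
  runAtCycle: number;                   // cycle + 3: when this code goes live
  signature: string;                    // signed with the Supervisor's secondary key
}
```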

Server Response to Tick

1. Replace the last set of code and prime keys with the new set.
2. Fetch the code and keys for the cycle three ticks out and prepare them.
3. Encrypt the list of client-interaction receipts with the Supervisor’s public key and place it in a torrent. Send the torrent’s hash to the Supervisor in exchange for a micro-transaction updating the owed balance.

Client Response to Tick

1. Validate that no commits made were poisoned by servers; if any were, notify the Supervisor. This can be done lazily, on the next query.

Attacks:


There are not many attack options available to any actors here. All would amount to a punishable violation of trust with no lasting impact beyond loss of peers.

If the client tries to insert data into the database without using the application to create the transactions, the servers will not have signed the transactions and created the stamp. This kind of bad behavior is easy to spot, and the client’s identity within the system is destroyed. Proof-of-work makes this attack infeasible; the attacker will do far more work than anybody else.

If a server returns an incorrect set of transactions, we know. If a server wants to eavesdrop on users, it doesn’t have access to enough information to learn more than a small set of primitives. Furthermore, because the client chooses the servers, an attacker would have to control most of the network to see more than one chunk of primitives.

Denial of service is countered by the pervasive rate limiting that proof-of-work and the blockchain bring.

Lastly, what if a server knew the transaction stamps and forged a bad transaction? What would this entail? It can’t use the already-released transaction stamps, as they’re out of date. It would need to control enough servers to understand all of the control flow and see all of the tags, and the clients’ ability to enforce geographic distribution means that seizing enough computers quickly enough is infeasible.

Code Specification:


In order for our system to be usable, a specific interface has to be presented to the application developer. If we allowed for arbitrary looping, it would be possible for functions to route to each other such that the cluster would loop indefinitely. Rather than take on the halting problem through heuristics, a more principled approach is to ban unrestricted looping entirely.

Wait, you say: how can I do useful work without looping? While we are reducing the number of programs that can be written in theory, we are not reducing the number of useful applications in practice. The application servers shouldn’t be making IO requests to the outside world; that design would make subverting the Golem system trivial. Instead, the IO is carried out by the clients, and the database values and request data are sent to the servers.

The servers are a purely functional mapping of finite, pre-parsed data structures into finite data structures, so there is no need to introduce arbitrarily diverging computation into the system. Instead, we expose the primitives of map, filter, and fold while banning iteration keywords and both direct and mutual recursion. The functional-programming idea of a zipper extends these primitives to trees, and with trees we gain key-value stores. The primitives are therefore sufficiently flexible for reasonable programs, and allow us to guarantee that every endpoint terminates.
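For example, an endpoint that totals a user’s purchases needs nothing beyond filter, map, and fold (reduce), and so terminates by construction. This TypeScript sketch is illustrative, not Golem’s actual surface syntax.

```typescript
// Expressible with only map, filter, and fold: no loops, no recursion,
// so termination is guaranteed by the shape of the program.
interface Purchase { userId: string; cents: number }

function userTotal(purchases: Purchase[], userId: string): number {
  return purchases
    .filter((p) => p.userId === userId)       // predicate over fetched rows
    .map((p) => p.cents)                      // project the primitive we need
    .reduce((sum, cents) => sum + cents, 0);  // fold to a single value
}
```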

Rather than writing a new language myself, we will lay out a design for an ORM-like library which can build these programs up. By using opaque types for “Golem values,” functions build up programs by working with values as if they were database objects. The program written by the developer is actually run once, on the developer’s machine. Nothing hides in unexecuted conditionals: all Golem conditional statements are executed, because they are essentially code-generation functions. This program outputs a JSON document specifying the Golem program created. The Supervisor daemon accepts this program and carries out the Supervisor’s roles.

Consider a function that receives endpoint requests. The argument to this request is a JSON object containing primitive values (String, Integer, etc.) and nested JSON dictionaries. The developer’s program registers these endpoint functions; in this way, the request object is “injected” into the function. We exploit this inversion of control to pass a JSON object that has the same structure as the expected format but carries “opaque types” in the place of primitives.

In order to see the value inside an opaque type, you must pass it a callback; this is where the program jumps to another server. As JavaScript is well accustomed to callback-driven workflows, there are many libraries offering syntactic sugar here. These callbacks build a value that is itself of an opaque type. The return value will look like the value returned to the API consumer, but something far stranger is happening: each opaque type is really a reference to the end of a linked list of updates to its initial value. As opaque types combine, the lists simply interleave. The return value of the endpoint function is thus a dataflow log of how the output is derived from the input.
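A hypothetical sketch of that mechanic: a GolemValue never exposes its primitive, and every observation or combination just extends a dataflow log. All names are invented for illustration; this is not Golem’s actual API.

```typescript
// The log, not the primitive, is what the endpoint ultimately returns.
type Op = { kind: "observe" | "combine"; detail: string };

class GolemValue {
  constructor(private log: Op[] = []) {}

  // The only way to "see" the primitive: declare an observation point.
  // At "compile time" the callback runs against a placeholder and its
  // effect is recorded in the dataflow log, not run on real data.
  observe(server: string, callback: (plain: string) => string): GolemValue {
    callback("<placeholder>");
    return new GolemValue([...this.log, { kind: "observe", detail: server }]);
  }

  // Combining two opaque values interleaves their update logs.
  combine(other: GolemValue): GolemValue {
    return new GolemValue([...this.log, ...other.log, { kind: "combine", detail: "" }]);
  }

  // The dataflow log describing how the output is derived from the input.
  dataflow(): Op[] {
    return this.log;
  }
}
```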

At points, the functions will make database queries. These queries are executed at “compile time,” if you recall. We can therefore track messages sent to the database library and build up an ordering of queries to insert into every update log. This visibility allows us to “hoist” all queries to the top of an entry-point block. Updates append to the update log, which is returned to the client.

In cases where queries require generated values, we notify the developer at compile time that we had to split the function. This does mean more state visibility, but it is only a problem when the application wants to hide database access, a privilege that is an anti-pattern we’ve explicitly rejected in the design of Golem.

After arriving at a dataflow representation of the program, the wire format becomes a series of calls and statements to make to yield the arguments for the eventual data constructor returned to the person hitting the endpoint, along with the update log.

Previous Work


The core of the idea comes from onion routing. Nobody can see enough of the entire global state to understand enough to do harm. By the time information is obtainable, it is too late for it to be useful. The fact that obfuscation can truly work when you have safety in numbers led to the architecture of the application server cluster.

I was also inspired by https://github.com/elendirx/web2web . The method of using the blockchain as a carrier for an individual stream of updates, and using that to serve BitTorrent hashes, proved instrumental to the ability to host coordination documents in a verifiable way.

Implementation Difficulty


I hope to see an implementation of this architecture done by the end of this semester. Thankfully, not much really has to be written.

The infrastructure for Server execution is really limited to an interpreter for map/filter/fold structured recursion, plus pattern matching on data constructors and the ability to create them. This would be an undertaking for a new language, but since this is a high-level library we manage to elide most of the issues. Something that would be fun to play with one day is using TensorFlow as the execution framework: TensorFlow is a dataflow execution framework and offers both the needed primitives and the ability to dynamically optimize around actual control flow. A JIT that specializes on executed Golem functions would definitely help with the costs of interpretation, since endpoints have an enforced structure.

The Client work is really just a combination of off-the-shelf parts. Web2web shows that webtorrent and blockchain scraping are fairly trivial to implement with existing libraries. All that remains is a wrapper around database fetching and server selection for the query. This logic is fairly uniform, so there shouldn’t be much cyclomatic complexity here.

The Supervisor daemon will be the largest amount of labor, but is limited to the summary document creation, the prime-stamp logic, seeding of torrents necessary to the application, and code transformation. The code transformation should be fairly easy to work with since we are really just creating a logging library that looks like an ORM.

Conclusion


I believe that Golem represents a fairly significant contribution to the problem of trust in website hosting and administration. By treating an application as a conversation between the client and the code author, we are able to use the servers as an untrusted third party. The code author defines the legal application states, and the swarm of clients and servers work to ensure that the database only contains data generated by valid transactions. The history of database transactions is used to create a performant, indexed data structure for clients to read from.

It’s worthwhile to consider the multitude of recent crimes and coercions that could not have occurred in this system. Hackers have very little to hack. If they hack a server, all they can do is get the server ejected from the cluster. If they hack a client, all they can do is impersonate the client, an attack that is outside of our scope. If they hack a Supervisor, the Supervisor can revoke the secondary keypair and use the master key to re-assert a valid index structure and cluster code distribution. The only program state that really exists is the append-only log of commits from the clients; everything else is a constructed view of the data that can be re-created when necessary. State can’t be lost, and there is no state to really steal, since the non-persistent data is distributed throughout the entire cluster.


By trusting nothing beyond the difficulty of prime factorization, we have created a system in which trust cannot be stolen or abused to do lasting harm. That is an Internet that I’d like to be a citizen of.
