Saturday, September 24, 2016

Golem: Trustless Distributed P2P Hosting

Project Proposal Draft / RFC

For my independent study, I must design a distributed application which uses the design patterns from the systems I am studying. I have an inkling of where I want this to go. This is the initial abstract of my idea.

I2P and Tor work because, at each step, each node has limited visibility. A node can see only the information necessary to do its work, and the client expects responses to be encrypted with keys that the current node doesn't have.

This is much like the idea of parametricity in generic programming. When a function's arguments have unconstrained generic types, the function is very limited in what it can do. A function with the type signature A -> A, where A is an unconstrained generic parameter, can only be the identity function (assuming it is total and pure), because the function has no way to obtain an A other than the one passed to it.
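To make this concrete, here is a small sketch in Python's typing notation (the function name is mine; any statically checked language with unconstrained generics shows the same effect):

```python
from typing import TypeVar

A = TypeVar("A")

def mystery(x: A) -> A:
    # With no constraints on A, the body cannot construct, inspect,
    # or combine values of type A. Returning the argument unchanged
    # is the only total, pure implementation available.
    return x

# The same function works for any type, precisely because it can do
# nothing type-specific with its argument.
print(mystery(42))
print(mystery("hi"))
```

The type alone pins down the behavior, which is the property the routing scheme tries to exploit.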

Similarly, a Tor node whose job is to pass an encrypted message along can't inspect the message or alter it in any meaningful way (and when it can, that's an attack). What if Tor nodes didn't just pass a message along to the next node in the circuit, but were expected to do meaningful work?

Consider the program below:

int foo(int a, string b) {
    int len = strlen(b);
    return max(len, a);
}
Now let Node A have access only to the function foo, Node B only to strlen, and Node C only to max. The nodes don't know they're holding these specific functions, only that they're holding functions which respond to the hashes of function names and the hashes of data constructors. max and strlen require access to primitive datatypes while foo doesn't, so only max and strlen will be able to see primitive values.
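A minimal sketch of that storage scheme (the layout and names are my own assumptions): each node's store is keyed by a hash of the function name, so the node can serve a call without ever seeing the name in plaintext.

```python
import hashlib

def name_hash(name: str) -> str:
    # Functions are addressed by the hash of their name, never the name.
    return hashlib.sha256(name.encode()).hexdigest()

# Node B holds an entry keyed only by a hash; it never stores "strlen".
node_b_store = {name_hash("strlen"): len}
node_c_store = {name_hash("max"): max}

def remote_call(store, fn_hash, *args):
    # A node dispatches purely on the hash it was handed.
    return store[fn_hash](*args)

print(remote_call(node_b_store, name_hash("strlen"), "hello"))  # 5
print(remote_call(node_c_store, name_hash("max"), 3, 7))        # 7
```

The caller must already know the hash; the node learns only that *some* stored function was invoked with *some* arguments.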

When someone wants to call foo, Node A can send back a list of identifiers for nodes which have a copy of strlen and a list of nodes which have a copy of max. It doesn't reveal the function names, but it does say which subset of the input each node will see. As a function cannot see any state not sent by the user (clarified later), we know that the data sent along will be subsets of the input sent to each function node. Eventually, the call chain will bottom out, or the program will have a trivially-detectable loop in its call graph and we can reject it. The latter can be detected because we will require the use of map, filter, and fold (catamorphisms) to work with the data.
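The loop check amounts to cycle detection on the call graph. A sketch, treating the graph as a mapping from a function's hash to the hashes it invokes (the graph shape here is illustrative):

```python
def has_cycle(graph, node, path=frozenset()):
    # Reject programs whose call graph revisits a function that is
    # already on the current call path.
    if node in path:
        return True
    return any(has_cycle(graph, callee, path | {node})
               for callee in graph.get(node, ()))

# foo -> {strlen, max}: bottoms out, acceptable.
acyclic = {"foo": ["strlen", "max"], "strlen": [], "max": []}
# foo -> bar -> baz -> foo: trivially-detectable loop, rejected.
cyclic = {"foo": ["bar"], "bar": ["baz"], "baz": ["foo"]}

print(has_cycle(acyclic, "foo"))  # False
print(has_cycle(cyclic, "foo"))   # True
```
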

This server list will include a "score" for each node, computed by summing the measured latencies of every node servicing a request, finding each node's contribution to that sum, and taking the inverse. These values split a number line into slices, where faster nodes own larger slices. The client can thus pick a random number and use it to select a node. The client can force node B to be from a different country than node A to make legal attacks more difficult. These values are stored in a DHT much like the I2P routerinfo tables. The DHT is rate-limited by requiring a very cheap proof of work on each query, and a more expensive proof of work whenever a server wishes to join the network.
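The selection scheme might look like the following sketch (my reading of the text: a node's score is the inverse of its share of the summed latencies, which is proportional to 1/latency):

```python
import random

def pick_node(latencies, rng=random):
    # Score each node as the inverse of its contribution to the total
    # latency, so faster nodes get a proportionally larger slice of
    # the number line.
    total = sum(latencies.values())
    scores = {node: total / latency for node, latency in latencies.items()}
    line_length = sum(scores.values())
    point = rng.uniform(0, line_length)
    acc = 0.0
    for node, score in scores.items():
        acc += score
        if point <= acc:
            return node
    return node  # float-rounding fallback: last node

# A node with 10 ms latency should be chosen far more often than one
# with 1000 ms latency.
picks = [pick_node({"fast": 10, "slow": 1000}) for _ in range(1000)]
print(picks.count("fast"), picks.count("slow"))
```
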

Now if the strlen function needed to call another function, the chosen node would negotiate with the client, through messages relayed via the node holding foo, to return a list of candidate nodes to pick from. Eventually, the client would find the transitive closure of the functions which need to be called. As the client knows the shape of the data, data constructor tags can be sent to the functions to predict the flow and avoid needlessly picking nodes.

We will require that all control flow be done by recursing over finite data structures (more on that later) which makes it possible to prove that the program cannot enter an infinite loop. A "maximum number of loops" integer can suffice as a hacky data structure, but the data sent by the request or the application state is probably going to determine the flow.
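For example, the "maximum number of loops" escape hatch amounts to folding over a finite range, which terminates by construction (a sketch):

```python
from functools import reduce

def bounded_iterate(step, initial, max_loops):
    # Control flow as a fold over a finite structure: folding over
    # range(max_loops) cannot run forever, so termination is provable
    # without analyzing the step function at all.
    return reduce(lambda state, _: step(state), range(max_loops), initial)

# Repeated doubling, guaranteed to stop after exactly 10 steps.
print(bounded_iterate(lambda s: s * 2, 1, 10))  # 1024
```

In practice the fold would run over request data or application state rather than a bare counter, but the termination argument is the same.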


Now the client will have the nodes route messages like Tor does, hiding the user's network identity to some degree. The client will use asymmetric encryption to negotiate one-time symmetric keys with each node in the chain. The client knows which node needs to see which data, and so can encrypt the data separately for each node. This may require duplicating some of the data, but a well-designed application should be able to minimize the duplication. Encryption schemes which allow multiple keys to decode the same data can also be used here.
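To illustrate per-node keys, here is a toy sketch. The XOR "cipher" below (SHA-256 in counter mode as a keystream) is for illustration only, not a real encryption scheme; a deployment would use an authenticated cipher negotiated during the handshake.

```python
import hashlib
import secrets

def _keystream(key: bytes, length: int) -> bytes:
    # Toy keystream: hash the key with a counter until enough bytes exist.
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def xor_crypt(key: bytes, data: bytes) -> bytes:
    # XOR is its own inverse, so one function both encrypts and decrypts.
    return bytes(a ^ b for a, b in zip(data, _keystream(key, len(data))))

# One one-time symmetric key per node in the chain.
key_for_strlen_node = secrets.token_bytes(32)
key_for_max_node = secrets.token_bytes(32)

# Each node receives only the slice of the input it needs: the strlen
# node sees b, the max node sees a. Neither can read the other's slice.
packet = {
    "strlen_arg": xor_crypt(key_for_strlen_node, b"hello world"),
    "max_arg": xor_crypt(key_for_max_node, (7).to_bytes(4, "big")),
}
```
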

Now the node holding foo will see a data constructor that contains key/value pairs for a and b. These names are removed; the nodes see only hash identifiers for variable names. It then sends these encrypted values along to the other nodes to get back encrypted results. These results cannot be inspected by the calling function. All it can do is apply a data constructor (really a hash tag over an ordering of data) and return it to the user. There is a limit on the damage each node can do, as all it sees is data constructor tags, and it can only return references to the DHT. The idea is that *real* security breaches require that a node see multiple pieces of primitive data at once. You can't attack a user without resorting to brute force unless you see all fields of their login, for instance. By recursively splitting up the data, nobody sees enough to figure out how the codebase works or what the user is really doing.

To make an evil server's job even harder, we will obfuscate function definitions. A function really decides what to do by analyzing the constructor used for its datatypes and dispatching through a jump table. We can combine function bodies as long as their argument datatypes are entirely different, and can even add entries to the jump table that a user should never send. The computational complexity of extracting a dataflow graph from function bodies built this way is quite high, and doing so requires that an adversary see all of the functions, which the routing scheme is designed to prevent.
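One way to picture the combined jump table (all constructor names here are invented for illustration): two unrelated operations share one body, dispatch happens on constructor tags, and a decoy entry pads the table.

```python
import hashlib

def ctor_tag(name: str) -> str:
    # Data constructors, like functions, are addressed by hash.
    return hashlib.sha256(name.encode()).hexdigest()

# Two unrelated functions merged into a single body's jump table.
# In isolation, the table reveals little about either original function.
jump_table = {
    ctor_tag("StringLength"): lambda fields: len(fields[0]),
    ctor_tag("Maximum"): lambda fields: max(fields),
    # Decoy entry a legitimate client never sends:
    ctor_tag("Decoy0"): lambda fields: fields,
}

def combined_body(tag: str, fields):
    # Dispatch purely on the constructor tag carried by the request.
    return jump_table[tag](fields)

print(combined_body(ctor_tag("StringLength"), ["hello"]))  # 5
print(combined_body(ctor_tag("Maximum"), [3, 9, 4]))       # 9
```
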


The system is really a way for a code author and a system user to communicate, using the servers as an untrusted resource. We call the code author a Supervisor. The cluster will have many, many codebases. Each codebase is identified by the public key of the supervisor. The supervisor is responsible for writing their code using an ORM-like interface which exposes the constraints of the system.

This code is then versioned. A user will include the current version hash in their messages, and servers must match the hash. The hash is stored in a blockchain block signed by the supervisor. This blockchain holds two types of entries: DHT snapshots and server trust roles.

Global State

Users are responsible for directly querying the DHT to get the information the servers need to process their data. This DHT is really a pseudo-Merkle-tree: each DHT-holding server creates a Merkle tree on its local machine and sends a cryptographically signed hash of the root node to the blockchain after each time slice. These hashes act like transactions and are mined into blocks. When a user queries the DHT, they get the information required to verify the data is in the Merkle tree, the blockchain insertion offset, and the data itself.
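The per-server tree and the inclusion proof a querying user receives can be sketched as follows (the key/value layout is my own assumption):

```python
import hashlib

def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves):
    # Each DHT server builds this over its local key/value store and
    # signs the root for the blockchain.
    level = [H(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:  # duplicate the last node on odd-sized levels
            level.append(level[-1])
        level = [H(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def merkle_proof(leaves, index):
    # The sibling hashes (plus left/right position) a user needs to
    # rebuild the root from one leaf.
    level = [H(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        level = [H(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return proof

def verify_inclusion(leaf, proof, root):
    # What the querying user runs against the signed root.
    node = H(leaf)
    for sibling, sibling_is_left in proof:
        node = H(sibling + node) if sibling_is_left else H(node + sibling)
    return node == root

leaves = [b"k1=v1", b"k2=v2", b"k3=v3", b"k4=v4"]
root = merkle_root(leaves)
print(verify_inclusion(b"k3=v3", merkle_proof(leaves, 2), root))  # True
```
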

We see here that all state is global. This is true to some degree. The DHT requires a HashCash proof of work for queries and inserts, so visibility is limited; nobody is going to dump the entire database. Furthermore, old data will be purged from the DHT when the key is rewritten after a block is mined (it sticks around between inserts so that the state transition can be audited). Only the hash of the old data will be immutable. Lastly, I don't think there's anything wrong with this. A user should not trust the supervisor to use sensitive data correctly. That is why the servers do not insert user data into the DHT themselves: the user will see the input and the output in the global state and so can audit their privacy exposure.

The supervisor will be expected to review the blockchain and sign blocks every now and then. This allows an attentive supervisor to collect aggregates with some degree of accuracy, without being able to retroactively scrape the database or see the entire system state. The supervisor is as rate-limited as everyone else. We define the most valid application blockchain to be the one with the most supervisor signatures in it. The supervisor acts as the final data-integrity verifier, as well as the code creator.

Server Administration

Each server will be part of a cluster serving requests for many, many applications. I expect there to be a single pool of computers in Golem, much like a typical cloud hosting platform. These servers will register themselves in the DHT as "open for new applications" if their capacity is below a certain threshold. The DHT nodes will receive an "application creation insertion" request and will propagate the message to available servers in their local store. Eventually a server will notify the cluster that it is at capacity. Application creations will carry promises of reward with them. This can be done through smart contracts on another blockchain, such as Ethereum, where evidence of the work done is enough to get a payout.

Reaping this reward requires a careful handshake. Servers start by using HashCash to mine themselves a work contract. This proof of work makes the identity of the server expensive to make, and is the asset that the server protects and enriches.
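A HashCash-style work contract might look like this sketch (the difficulty is kept tiny so the example runs instantly; a real network would set it much higher):

```python
import hashlib

def mine(identity: bytes, difficulty_bits: int) -> int:
    # Find a nonce so that sha256(identity || nonce) has at least
    # difficulty_bits leading zero bits. Expected cost doubles with
    # each extra bit, which is what makes the identity expensive.
    target = 1 << (256 - difficulty_bits)
    nonce = 0
    while True:
        digest = hashlib.sha256(identity + nonce.to_bytes(8, "big")).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce
        nonce += 1

def verify_pow(identity: bytes, nonce: int, difficulty_bits: int) -> bool:
    # Verification is a single hash, however expensive mining was.
    digest = hashlib.sha256(identity + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big") < (1 << (256 - difficulty_bits))

nonce = mine(b"server-public-key", 12)  # cheap demo difficulty
print(verify_pow(b"server-public-key", nonce, 12))  # True
```

The asymmetry between mining and verification is the whole point: the server pays once to create the identity, and everyone else can check it for free.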

Next, the server requests the hash of the blockchain from the supervisor's DHT application-versioning entry, then joins the mini-cluster mining that blockchain. Every minute or so, a new block is mined; these blocks hold the DHT references. A server which detects a state transition for a key between two snapshots that is invalid per the finite state machine published by the supervisor will reject the transition by appending the rejection to the block being mined. All servers doing this work duplicate the verification. The first server to finish validating signs the block, sends it with its signature to the blockchain along with a message to the supervisor, and earns a tiny reward. Servers already mining can check that the uploaded block hashes to the one they built locally, sign it, and send their signatures to the supervisor. The supervisor collects all of the server signatures they see, verifies correctness themselves (not necessarily every time, since the threat of "firing" a server means a spot check every few updates suffices), and then signs the server signatures. The supervisor's signature is the only one that matters for an application blockchain: the valid chain is the one with the most signatures from the supervisor.

If the first server makes a mistake, another server can steal their reward by posting evidence of the correct version to the supervisor along with a copy of the posted (incorrect) block. Servers which make mistakes have their identities destroyed by the Supervisor "tombstoning" their public key, requiring that they redo a HashCash transaction. 

Notice that the first server to mine is the only one rewarded for mining. Where is the motivation for the other servers? Servers which do not mine for more than one timeslice will be marked offline by the application in the global Golem DHT, meaning they get no traffic. When a user wants to interact with the system, they send the supervisor a request for "tokens". These may be free, ad-supported, or paid tokens. These tokens are sent to every node involved in the transitive closure of an API endpoint request. At the end of a work slice, a server collects all of these tokens and places them in a message encrypted with the supervisor's public key and signed with the server's identity private key. The supervisor knows what the program flow should be, and so gets a view of the code paths hit by the user without knowing the data that was actually provided. The supervisor then signs the tokens in a microtransaction to pay the server for this work slice.

If a server chooses to "go rogue" and send the wrong control-flow routing information to a client, the client will negotiate with the wrong servers and the supervisor will see keys from the wrong nodes. Nodes are picked so that cooperative compromise is unlikely. Keys may go missing when servers leave the network, but it is possible to heuristically show that the data is merely missing and that the state transition was still correct. These tokens allow the servers to be audited for honesty. Like Foucault's panopticon, it is the nonzero probability of bad behavior being observed that makes bad behavior economically devastating for servers.

Lastly, a server gets rewards only after having an "identity chain" over a certain length, equating to roughly an hour of work. Making the identity requires mining the HashCash, and each supervisor-signed block containing the server's signature becomes another block in its individual transaction history, or "identity chain." When the server wants to be paid, it publishes the last microtransaction signature from the supervisor and its identity chain's last hash to a smart contract blockchain. The miners can verify the length of the chain and the microtransaction, and will pay out. Cashing out requires that the server and supervisor negotiate a new microtransaction.
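The identity chain could be modeled as a simple hash chain seeded by the HashCash proof (a sketch; real entries would be supervisor-signed block data rather than the placeholder bytes used here):

```python
import hashlib

def extend(prev: bytes, entry: bytes) -> bytes:
    # Each supervisor-signed block folds into the server's chain head.
    return hashlib.sha256(prev + entry).digest()

# The expensive HashCash identity seeds the chain.
genesis = hashlib.sha256(b"hashcash-proof-of-work").digest()
entries = [b"signed-block-%d" % i for i in range(5)]

head = genesis
for entry in entries:
    head = extend(head, entry)

def verified_length(genesis, entries, claimed_head):
    # A verifier replays the published entries; the chain length is how
    # many blocks it takes to reach the claimed head, or None if the
    # head is never reached (i.e. the claim is bogus).
    h = genesis
    for count, entry in enumerate(entries, 1):
        h = extend(h, entry)
        if h == claimed_head:
            return count
    return None

print(verified_length(genesis, entries, head))  # 5
```

A server cannot claim a longer chain than it has, because every link depends on the one before it.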

Finally, what about evil supervisors? What if they don't want to pay the server? The fine-grained nature of the system and the microtransactions means there is only a small, bounded amount of work for the server to do between guarantees of payment. Furthermore, if the supervisor incorrectly tombstones a server's identity, the server has a cryptographically-verified log of the work it did and the state it saw, and can prove it met the supervisor's function specification by showing that the application state changed correctly (assuming the user inserted the output into the DHT). There are possible attacks here, but the fact that a server takes time to "ramp up" its bandwidth in the global Golem network (due to the Tor-like bandwidth monitoring of nodes over time), combined with the small granularity of payments, means a dishonest supervisor has only a small window in which to behave badly before earning a bad reputation in the Golem network. A bad reputation (many server "complaint" messages, each with a HashCash requirement) means that servers will reject work requests and users will get an inkling that the service provider has bad business practices.
