Thursday, August 25, 2016

Consensus without Trust: Cryptographic Enforcement of Distributed Protocols


Most services on the internet work by having a lot of servers owned by the same group of people running software that receives input from untrusted clients and use it to craft output for them. In this classic client-server network model, the trust ends at the perimeter of the server farm. You trust your own systems, and validate everything from clients against your own state. This type of model is starting to reach growing pains as Moore's law breaks down. As the speed of CPU development slows we see a loss of the ability of the most powerful computers to outpace the request rate of hundreds of thousands of cheaper consumer devices had by clients.

In 1995 an Italian political group called the Strano Network began visiting the websites of the French government quite frequently. So frequently, in fact, that the servers and their network connections were completely monopolized. This was the first distributed denial of service. As early as 1995 we saw the ability for a small number of actors to overwhelm the center of the system. Even when the system stays up, hardware can be damaged and server farm contracts can be lost. Non-malicious DDOSes happen all of the time as well, where a small site is linked by a major news organization and suddenly has hundreds of dollars in hosting charges from their provider. In any client-server model, the server is a single point of failure. In response to this, alternate models of computation have been created. We will focus on decentralized and distributed systems.

Decentralized And Distributed Systems

A decentralized system removes the single point of failure, typically relying on a number of "supervisor" servers. Sometimes ordinary client servers can be promoted to these roles, but a decentralized system typically places a lot of trust in these supervisors. When a network problem occurs or a node goes down, the trusted nodes will elect a new leader. If you lose enough nodes, a decentralized system can be partitioned into smaller networks that do not synchronize, so it still has failure states. When leaders are lost, complex election algorithms such as Paxos and Raft must be used to safely choose a new leader.

A distributed system is the most egalitarian. Every node in the system is able to both request services and provide services to other nodes. Depending on the amount of bandwidth available to nodes, as well as port availability, some clients may only be service-takers. At it's core though, a distributed system is characterized by allowing untrusted nodes to carry out services for other nodes.


This brings us to the problem which my independent study will focus on. How do you create a system which will provide the promised services and properties even if a significantly large minority of the network is untrustworthy? While malicious attackers first come to mind, untrustworthy can refer to sensors which sometimes misbehave and network connections which corrupt or tamper with data during transmission.

We will focus on real-life systems such as Bittorrent, Bitcoin, Ethereum, Tor, I2P, IPFS, and Riffle.

Vector Clocks and Time

Computer time seems reliable until one is working with more than one computer. Clock skew, network issues, power loss, and data corruption can all work to counter the usefulness of real-world time. This is the big problem in modern distributed systems. What use is time though? Time is useful because it lets us tell each other which events happen before other events. This kind of reasoning can be handled in a principled way by making each event mark an explicit predecessor. This is called a vector clock.

If nodes Alice, Bob, and Ann have seen a message (#1) that Bob has $5, Alice has $0, and Ann has $0. Bob wants to take advantage of the really unreliable internet connection between Alice and Ann. He buys a $5 item from Alice (message #2.1) and then quickly buys a $5 from Ann (message #2.2) before Alice and Ann can talk. His goal is for each to see a $5 withdrawal from his account while it has $5, so he ends up with $0. This is a classic example of a race condition.

Bob has seen the double-spend, but he is alright with breaking the protocol. He likely tells everybody else that he only made one of the transactions. At some point, Alice and Ann will synchronize by getting copies of each other's messages. Alice and Ann can both see that there are two messages (2.2 and 2.1) with the same parent message (message #1), and that these two messages alter the values associated with the same key (Bob's account address).

Depending on the protocol, the nodes might decide to accept only the transaction with the smallest numerical hash, might drop both transactions, or might decide to put Bob in debt. By creating an explicit ordering between the events that each node sees, and by building conflict resolution into the protocol, we can ensure that steps taken out of order will eventually converge to the same global state.

It's worth noting that for a moment in time, Bob has gotten away with his crime. This is possible because Alice and Bob's transaction is "done" before Ann comes in. If you desire a system that makes double spending entirely impossible, you could require that all nodes synchronize. This offers it's own problems, making it possible for one slow node to halt traffic across the entire system. There is no perfect solution to conflict because each protocol is different. Intelligently balancing scalability with correctness is frequently in the realm of heuristics and time/difficulty constants which are chosen arbitrarily and frequently changed.

Cryptographic Verification

The systems that I am studying use cryptography to ensure their invariants. In most cryptographic algorithms, it is much less computationally costly to do things the "right" way than the "wrong" way. Doing things the wrong way involves guessing. Architecting a low probability of accidentally accepting bad data makes this guessing take a very long time. Practical cryptography comes in three disciplines nowadays: symmetric cryptography, asymmetric cryptography, and cryptographic hashing.

Symmetric cryptography is what probably comes to mind when you think of "encryption." Somebody has a key and uses it to turn plaintext into ciphertext. This key is then used to get the plaintext from the ciphertext. All of the secrecy lies in this key. This forms a problem when there's no trusted communication medium with the other party to send the key. In a distributed cluster, this is usually the case. Symmetric encryption is fast though, and is safe when the keys never need to be distributed.

Asymmetric cryptography came about as a solution to the key distribution problem. You have two keys, the public key and the private key. When one of them encrypts the information, the other one can decrypt it. By publishing the public key everywhere and holding onto the private key, a system for both encryption and signature exists. The private key can "sign" a message that the public key can be used to validate. This proves that the message came from the node it says it came from. The public key can be used to "encrypt" a message that the private key is then used to read. You can combine the two to verify the secret message. This signature is useful because it prevents someone from changing their mind; it is a receipt of the action. Asymmetric encryption is much slower usually. Therefore it is typically used only at first, in order to exchange keys for symmetric encryption. A new symmetric key for each session can thus be used.

Hashing is the last, and potentially the most important cryptographic tool for distributed systems. A hash is a function that takes a huge input domain and squashes it down into a much smaller output domain. These smaller values vary significantly as the input varies just a bit. A cryptographic hash function is made to be moderately fast to run forwards but incredibly slow to find the inverse for a given hash. If you know that there is a signed hash on a packet, you can say two things about the data in the packet. Either the first sender computed hash you see from the data you see, or somebody spent a very long time to find other data with the same hash. This is called a hash collision. If there are verifiable semantics for the data then it's worth noting that the chance of finding a hash collision that looks like it "makes sense" is astronomically low. Typically the data will look pseudorandom.

It should be worth noting that sometimes we *do* want to find a collision. Since this requires a brute force attempt that is probabilistically going to be computationally costly, finding a collision is a "proof of work." If you have data that hashes to an arbitrarily-chosen hash, then you've got evidence that you did the work to find this collision. Some systems chain proofs of work so that one has evidence of having done a *lot* of work. As these chains get bigger, it becomes harder and harder for a minority of the cluster to find a longer chain of collisions than the majority of the cluster has computed.

Merkel Everything

Large scale distributed systems are all about maintaining the structure of the state held by the cluster. What messages are in flight? What references in which data structures should point where? What order should things be in the queue? There are a lot of things to track. Should we expect every node to hold onto all of it? Is the cluster always right, or does it eventually stabilize? Can we tell when it has? It's tempting to say that the invariants of the data structure, a global property, should be handled by "those trusted people" but there are usually no trusted peers in a distributed system. In order to provide a reliable service, we need to design the way that untrusted nodes manipulate data so that if the crypto checks out we know that we can trust the data as much as we trust the node which signed it. 

The answer to this is the idea of the hash-associated-pointer. Data structures in the box-and-pointer model are constructed from allocation "blocks" which contain both pieces of data and references to other blocks. These blocks can be memory addresses, ip addresses, or keys into an associative structure. We can associated a hash of the pointed-to data with the reference. This is referred to as a Merkel tree when done with a tree, but there are "Merkel" versions of any data structure. It's worth noting that because the hash and the pointer must be updated together, updates must be atomic. 

The idea behind a Merkel tree is that it is a set of compressed "proofs" that pieces of data are trusted. If I want to check if a piece of data is located in a tree which is an order of magnitude larger than my local storage, I must ask the cluster of untrusted peers. Any peer might respond, and might give me data that the cluster didn't agree to incorporate. Merkel trees allow me to check that the data is indeed in the cluster's collection of trusted state.

Consider the tree below. After joining the cluster, I fetch a hash of the root of the tree from a source that's been signed with the key of one or more trusted peers. If enough untrusted peers sign it, we typically trust the cluster at large and so trust it. I can then ask the cluster for the interior of the tree. It's worth noting that I don't need the entire tree. If I have Top Hash and I need to verify that L2 is in the tree, I need Top Hash, Hash 0, Hash 0-1, Hash 0-0, and L2. I can then hash L2, and check that this is 0-1. I then check that Hash 0 is the hash of Hash 0-0 and Hash 0-1 concatenated. If this is the case, then I check that Top Hash is the hash of Hash 0 and Hash 1 concatenated. In this way, I need only get the single hash at the top of the tree from a trusted source. Any untrusted peer can give me the rest of the data structure. 

A more rudimentary example of hash-associated-pointers is the use of Distributed Hash Tables, where data is replicated throughout the cluster and the hash is used to search for the data and to find the peer responsible for the data. Trust is placed in the assertion that if the fetched data has the hash that the data was submitted to the network with, then the fetched data is most likely the same data that was first submitted. 

Merkel Structures as Proofs of History

An interesting interpretation of Merkle structures are as proofs of constructibility. Let's imagine that I have a system which says that any peer can report another peer trying to fill the system with duplicate requests. In this case, peers are told to drop traffic from this attacking peer. How do we prevent attacking peers from reporting innocent peers, while preventing peers from being overwhelmed by checking these reports? 

If a peer has inserted service requests into the global state, then they should exist in the Merkel structure which organizes this state. We assume that peers sign their service requests. We would make an interval tree which tracks times of requests. By providing the paths to the attacker's requests, a good peer can verify that the maximum number of requests have been inserted into the tree in this interval. After checking that we've put these requests in the interval tree, we can insert this attacker into the associative Merkel tree which uses peer hashes as keys. This requires a number of tree accesses which is logarithmic in the number of service requests and in the number of blacklisted peers. 

In the future, new peers which forward this attacker's message can be told that this attacker is banned. Good peers will send the path to the hash of the attacker in the blacklist, and other peers can verify that they're in the blacklist. In this way, we have added rate limiting to the global cluster without requiring that any one peer has more power than the others. If any peer misbehaves, all good peers will stop communicating with them. 

1 comment: