Saturday, November 12, 2016

OpenBazaar: Truly Free Trade Through Crypto

All of the privacy tools in the world won't help one if all of one's commerce is centralized through a few corporate websites that eagerly mine one's information to sell to advertisers and governments either directly or indirectly. With time, a person can be profiled and targeted. The classic example of the pressure-cooker-backpack police raid shows that United States online purchases have become watched as if we lived in a totalitarian nation. 

While most agree that customs and standardized import restrictions in the real world are useful, many dislike it online. One country may impose political views upon the entire rest of the world because of the location that a business incorporates. The ability to censor what is sold, to fix prices, and to impose localized trade restrictions degrades free trade. Ebay, Amazon, and Etsy refuse to allow for resale of many items. This pushes users into the murky waters of sites without a verified trust system. Trust is the bedrock of online commerce. 

OpenBazaar seeks to create a fully distributed marketplace protocol which optimizes for anonymity, freedom, trustworthiness, and convenience. The restriction of full distribution without any central arbitrators or global buyer enumeration forces a creative architecture that promises high scalability. 

Threat Model

OpenBazaar has two classes of enemies worth considering. The first wants to abuse the system much as users abuse existing commerce sites on a daily basis. The latter wants to perform an expensive, widespread attack to deanonymize users or degrade the entire system.

Bad vendors and buyers are addressed very well through the power of multisig Bitcoin transactions and through OpenBazaar's web of trust model. 

The web of trust (which we explain better later) allows a buyer who trusts a small number of peers to iteratively predict the trust they should place in another peer. This network is created through using external services, interaction, or personal history.

When a sale occurs, the buyer and vendor pick a peer that both trust create a Bitcoin transaction which requires that 2 of the 3 parties agree to the transaction for it to move forward. This allows for arbitration without relying on a centralized support team. If one fears that a bad node may be picked, this type of contract can scale indefinitely. One could construct a system where 8 of 15 nodes must agree to the transaction, or it does not go through. Shipping tracking information and terms of the sale are placed into a cryptographically-traceable ledger that acts as a log for arbitration. 

Globally malicious attackers require a different type of strategy. Transactions occur over the blockchain, meaning that an impersonation (Sybil) attack would require an attacker to spend an unfeasible amount of money to overload vendors. In essence, they would simply be buying out the market and aiding business! 

To impersonate a vendor, an attacker would need to gain trust by becoming a vendor. Once they begin acting badly and tying up arbitrator time, their reputation will suffer. In this way, a Sybil attack degrades into a failed attempt to game the system.

Deanonymization attacks are the only justified fear. OpenBazaar does a good job of tackling the problem better than previous solutions. By preventing one from seeing the entire web of trust, OpenBazaar prevents an adversary who can observe some mail and who know a few users' identities from inductively tracing down every purchaser and vendor. 

A malicious vendor will be able to see one's IP address if Tor isn't used. Currently, Tor and OpenBazaar do not interoperate together perfectly. This is coming quite soon though, and current usage seems to be good enough for certain network operations. Oddly enough, OpenBazaar suggested one day rolling out an onion routing mail protocol. By encrypting subsequent addresses and sending the package to intermediate peers, one can mimic Tor and avoid exposing the sender address to the purchaser. This would carry a stamp cost, but may be able to keep people safe from persecution. 

Technical Difficulties

Attackers are not the only challenge faced by OpenBazaar. OpenBazaar is prevented from making certain naive design choices due to their commitment to convenience and scalability. 

While using a blockchain technology like Ethereum to host traffic would have been easy, it would have added an unacceptable latency and mining fee to each transaction. Furthermore, it's an expensive model for needless consistency. The network view is quite localized; a buyer only needs to talk to the vendor most of the time. 

Reputation change is included in the Bitcoin blockchain, but this information is included in a Bitcoin transaction that would have to happen anyways. 

Lastly, the commitment to full distribution forced OpenBazaar to make choices about what types of products they can create. OpenBazaar is primarily a protocol, not an application. There can be multiple frontends. This forced OpenBazaar to make a "machine readable" trade format that sends the required information over JSON. At the same time, "human readable" names must be used to allow users to familiarize themselves with vendors. 

Sale Architecture

The secret behind OpenBazaar's indefinite scalability is the use of their Kademlia-style distributed hash table. This DHT associates a globally unique identifier with a peer's hostname and port. This identifier is a self-signed public key that has been hashed twice. This GUID cannot be reused by another peer without access to the private key in the keypair.

As such a GUID would be unacceptably difficult to remember for most people, OpenBazaar can use the Blockstack system to associate identities with GUIDs. OpenBazaar initially used Namecoin, but switched to the alternative Blockstack. Blockstack embeds identities into any suitable blockchain, rather than requiring the separate Namecoin blockchain. This has the advantage of not requiring explicit support from mining pools, which increases the number of nodes mining the block. This, argues many, makes Blockstack more secure. Other information in the Blockstack entry can be used for external validation. It's worth noting that this could compromise anonymity entirely for some vendors.

In order to allow for people to quickly find a peer's listed items, the DHT also contains the hashes and keywords for items that are listed for sale. After finding the hash of an item listing, a peer can then request this listing from the vendor in a direct P2P manner. If this architecture feels like BitTorrent, that is because it is remarkably similar.

These listings are known as Ricardian Contracts and are digitally signed documents with the necessary server information and public keys for a peer to resume the contract's back-and-forth. Contracts describe everything related to the listing, in a JSON-encoded document. The format is flexible enough that the merchant can describe the structure of payment and business that they expect from a buyer.

This means that OpenBazaar suits both digital and physical goods, and could potentially be used for labor and "sharing economy" tasks. Without a centralized authority taking a steep cut, OpenBazaar would be quite attractive. The web of trust that we will cover next can be used to establish a validated reputation to keep people safe.

Web of Trust and Ratings

There are two cases that we trust a person in our daily lives. We typically either have had extensive interaction with a person, or we have had it with someone who trusts them very well. In OpenBazaar, this is also the case. 

"Direct" trust can be established between people who validate each other's identity through other channels. If one doesn't have direct trust rating for a peer one wishes to query the evaluation of, one asks one's peers what their trust rating for the peer is. They perform a similar recursive check and query. Eventually, this bottoms out with a series of trust estimation chains. 

A trust value ranges between 0 and 1 for positive trust, and 0 and -1 for distrust. The peers one will query from must be trusted, and must therefore have a trust between 0 and 1. This can therefore be used as a scale factor. By taking the product of the trusts along the chain, and summing up all such products, the query finds an aggregate trust for the peer for a node's corner of the web. 

This system is entirely decentralized. The partial observability built into it also prevents deanonymization through passive observation by a nation-state adversary. To avoid enumeration, nodes must only allow queries from trusted nodes. 

There is one worrying attack that OpenBazaar is vulnerable to. A peer which copies the listings of another vendor could simply forward buyer requests to another peer. This would lead to an accumulation of trustworthy interactions with an untrustworthy individual. Furthermore, this attack is cheap. It could be done by almost anybody. The definitive way to check for one such abuse is for the vendor to include a copy of their GUID and key material inside the shipment. A buyer can therefore find out if something is wrong, and can spread news of distrust throughout the network. An adversary who is willing to receive and repackage shipments is essentially acting as a legitimate reseller. 

This system can be insufficient though. The web of trust should be a single web; this system is not difficult to partition. In order to provide a global trust score for some users, an external resource must be burned to prevent someone from creating many globally-trusted accounts and bootstrapping evil peers of these accounts to be recommended indirectly. The current best solution to that is simply to buy your good graces. By using Bitcoin's scripting language to make coins unspendable while recording the user's hash, a user can provide global evidence that their account cost them money to keep. The economic disincentive for that peer to behave badly has been proven to the network. 

The rating system is different. It's much more fine-grained for starters, enabling a review of shipment time and item quality as well as other attributes. Secondly, the goal is not to tell whether a peer is abusive but whether they offer a high-quality service. Ratings and reviews are documents in the DHT which have their hash embedded in the payment Bitcoin transaction. Ratings require transactions, which require a purchase with the vendor. Impersonation and sybil attacks are mitigated in this way.

Vendors do not review buyers, as all opportunities for buyer abuse are mitigated or arbitrated away by the protocol.

Success of Protocol

There are currently between 5,000 and 5,500 listings posted to OpenBazaar DHT. One can find everything from Alibaba purchases to expensive teas to physical and digital artwork. The network is slowly but steadily growing, and appears to have a lot of users who remain lightly active. 

The myriad of implementations are likely the reason. The official desktop application is a Javascript desktop application written using the Electron shell. For mobile, there is BazaarHound. Both of these allow for interactive exploration and make it easy to instantaneously purchase an order. 

Search on the applications could be better though. For this, there are two popular search engines: and BazaarBay. The former appears much more polished, competing with many small-time centralized interfaces for quality.

The official desktop application can be made to work through Tor, to add and another layer of anonymity. Information about one's host may be leaked by the information in the protocol itself though, and it's not clear right now how much the software will expose. In OpenBazaar 2.0, CoinDesk promised Tor integration from the ground up. The reliance on the IPFS (inter-planetary filesystem, a BitTorrent-like file network) means that IPFS must work perfectly with Tor for OpenBazaar to. 

This protocol has multiple implementations and is growing to carry many novelties and staples steadily. The flexibility and trustworthiness of the service means that OpenBazaar makes an amazing platform for new applications of the sharing economy which protect the users' privacy.


Saturday, November 5, 2016

Ethereum, or the Blockchain of Code

Software is incredibly powerful. Like most things in mathematics, the more powerful the thing being studied is, the harder it is to state useful things about it. In fact, the definition of programming as we know it came from Turing's seminal paper on the Halting Problem which told us what we couldn't do with a computer. It is from this that we draw the name "Turing-complete" to describe any system which is computationally powerful enough to run into the same limitations.

The Halting problem prevents us from knowing for sure how a computation will turn out without running. A slightly different problem that haunts many domains is that software is inherently disappearing. In calling a function, we trade the arguments for the end result. In the name of space efficiency, we later discard our arguments most of the time. At some point a variable is "useless" and the data it contained is erased.

Anybody who has gotten a bug report with an application state that seems borderline nonsense can understand the pain of this. Trying to reverse-engineer what one's own code did is what makes debugging so time consuming.

A slightly different problem caused by this is that it becomes important where code is run. When two systems cooperatively interact to compute something, something can go wrong at every single step of the way. Frequently, something does. It becomes the duty of an ultimately trusted entity to pick up the pieces and decide what the correct view of the world should be. In many situations, this is unacceptable.

The Solution

Ethereum tries to solve this problem by having everybody run the code, and by keeping around the arguments used to generate the result indefinitely. If this sounds expensive, it is. Someone using Ethereum to do numerical computation over a large amount of data is going to have to pay the network a lot for it.

Most applications aren't like this though. Most applications are thin, pretty wrappers around a data store that allow for simplistic manipulation. Messaging applications, commerce, social networks, and asset management applications are all examples of this.

The interesting thing is that when everybody is running the code, it is almost as if nobody is running the code. There's no need to keep servers online constantly to receive messages that might come with unknown frequency. The network as a whole provides uptime on behalf of the application writer. In this way, Ethereum applications are frequently called "serverless."

Ethereum promises a lot, and they do a pretty good job of delivering on it.


Ethereum is a blockchain technology, like many of the architectures that we've covered thus far. Human input into the network is done by sending transactions to the mining pool, which validates the transaction and adds it to a block to mine. After a miner manages to find the integer necessary to make the block's hash smaller than a determined difficulty, the block is considered part of the chain and the mining pool starts on the next block.

In financial blockchains, a transaction is always an exchange of value. Validating a transaction is thus validating that the person sending the money actually has the money. Bitcoin also has a scripting language which allows for very rudimentary computation. In Ethereum, use of the scripting language is the primary reason to use the network. Ethereum also contains a currency, which is used primarily to pay the transaction fee to the miners but can be sent between people as well. Validating an Ethereum transaction involves executing scripts held by the recipient of a transaction, which is paid for with the sender's transaction currency.


A contract is born by making a transaction which contains the compiled code of a contract in the data field. After mining, this contract becomes an entity that can receive transactions.

A contract is really thought of as a self-contained piece of code that responds to payment messages by executing the script. This script execution can essentially do anything that a person interacting with Ethereum can do. Scripts can create new contracts, send transactions to other contracts, and save the results of computation to a permanent storage. Anything that requires computation or storage has a computational fee that is measured in "gas" and is what determines the mining fee.

What's interesting is that these Turing complete virtual machines live on the blockchain to serve their defined goals without any further need for the code author to interact with the network. As long as the people who need to run the code can afford the execution, the network will take care of everything.

Managing Turing Completeness with Gas Money

Now anybody familiar with the Halting problem will flinch at the thought of putting an open server out there which will run any code submitted to it. An infinite loop can be crafted to lock up a system. A malicious agent would be able to derail every node that tries to validate that transaction.

Ethereum handles this through fine-grained accounting. A transaction contains two figures that determine the cost to the sender. The STARTGAS counts the computational load of operations that the sender expects the contract to run. Operations which store data or are more computationally expensive have a higher price in gas. The GASPRICE is the rate that the sender is willing to pay per "gas" unit in terms of Ethereum's underlying currency.

A transaction verifier will run a contract until the gas runs out, or until the script finishes. If the script finishes then the remaining gas is refunded to the recipient and the miner keeps the product of the consumed gas and GASPRICE. If the gas runs out, all state changes are reverted and the miner just keeps the money. 


Ethereum, like Bitcoin, faces the issue of storage. If Bitcoin were to handle the load of a modern credit card processing company, it would grow by about 1GB per hour. If Ethereum were to become mainstream, a naive implementation of it might face the same issues. As the blockchain grows, fewer and fewer nodes can afford to hold onto the entire chain. These "full nodes" have all of the power; if they were to form a malicious alliance then the history of the blockchain would be at risk.

Ethereum will likely scale a bit easier by the fact that the blockchain is not quite necessary to get a full view of the state of the network. Each Ethereum full node needs only to hold onto the global storage and the holdings of accounts, and can fold each incoming block into this.

Ethereum uses a modification of a Merkle tree optimized for addition and removal called a Patricia tree. This tree refers to subtrees of previous blocks by hash and tries to reduce hash invalidation. This is used to create a hash of the global application state that can be inserted into each block.

Ethereum has a formal method for finding the bad link in a poisoned chain. If a chain has a defect, it must occur between two blocks such that state N is correct and state N+1 is not. By walking through the blockchain from an expected good state, maybe even the initial block, a node can find the exact wrong state transition and can offer it to the network as proof that another miner had mined a bad block.

There is another kind of scalability worth considering, and that's the ability to do things cheaply. Ethereum's costs are relatively high currently, making it inappropriate for certain classes of high-throughput systems. Ethereum is moving to proof-of-stake in order to make it less computationally costly to mine, and therefore less expensive for the end user to run contracts.


One cannot talk about Ethereum without talking about the problem that was the DAO. The DAO, or distributed autonomous organization, was an entity that would hold the users' funds in a joint account that would be utilized as the holders saw fit. The DAO was widely successful, raising $150 Million. 

Unfortunately, the contract behind the DAO had a few bugs. Furthermore, the creators of the project didn't expect such a success, and put all of the money into a single account. This enabled an attacker to leave the DAO in a state that threatened to have the funds made useless or able to be stolen outright. 

The solution that the Ethereum project saw was to provide miners with an option to choose to continue mining the version of the blockchain that contained the result of the DAO and the attacker's actions, or to mine a new version that relocated the funds so that caretakers of the DAO could refund those who bought in to the best of their ability.

The miners favored the refund, and the attacker had their funds taken from them by the network. This "hard fork" was met with both relief and unease. This showed that unpopular actors in a blockchain network can have the network decide to do whatever it wants with their funds. While this is no surprise, given the nature of the 51% attack vulnerability in blockchains, it burst the bubble of some of the most crypto-anarchist-libertarian supporters of cryptocurrency. 

Cryptocurrency assets are able to be seized. The global consensus of ownership is a democracy due to the nature of the underlying blockchain.
Ethereum Ecosystem

A language or virtual machine succeeds or fails based on the flexibility of implementations and the availability of existing libraries. Ethereum is no different. There are many programmatic implementations of the clients currently, but the exciting stuff is still on the horizon.

At this point in time, much effort is being put into Ethereum's Mist browser. Mist offers a user-friendly interface for the intricacies of distributed serverless applications. It functions as a client for the Ethereum network and prioritizes privacy and the user experience.

I look forward to seeing the distributed applications that people will build for this exciting new platform.