MaidSafe and PARSEC - a new distributed consensus algorithm

I'll write something about my work for a change.

I've been employed at MaidSafe for over a year and a half now. It's a small, Scottish company working on creating a fully distributed Internet. Sounds a bit weird - the Internet is already distributed, isn't it? Well, it isn't completely - every website in the Internet exists on some servers belonging to some single company. All data in the Internet is controlled by the owners of the servers that host it, and not necessarily the actual owners of the data itself. This leads to situations in which our data is sometimes used in ways we don't like (GDPR, which came into force recently, is supposed to improve the state of affairs, but I wouldn't expect too much...).

MaidSafe aims to change all of this. It counters the centralised servers with the SAFE Network - a distributed network, in which everyone controls their data. When we upload a file to this network, we aren't putting it on a specific server. Instead, the file is sliced into multiple pieces, encrypted and distributed in multiple copies among the computers of the network's users. Every user shares a part of their hard drive, but only controls their own data - the rest is unreadable to them thanks to encryption. What's more, in order to prevent spam and incentivise the users to share their space, SAFE Network is going to have its own native cryptocurrency - Safecoin - but it won't be blockchain-based, unlike the other cryptocurrencies.

But enough advertising, let's get to the point.

The architecture of a distributed network gives rise to multiple challenges, not present in the traditional Internet. For example, let's assume that I have a file in the network, which is stored on some computers, and I want to modify it somehow. After I do that, I'm trying to read the file back from the network. How can I be sure that I'm seeing what I saved? There is no central authority - some computers storing my file might not have seen the update yet. Some might have disappeared from the network, there roles being taken over by other computers that weren't even the destination of the message carrying the update. Some computers might have been malicious and sending incorrect data on purpose - after all, the nodes of the network are regular users' computers, and we all know how popular trolling is. So, if the computers - the nodes of the network - are sending contradictory data, how do we know which version is the correct one?

The above is the so called distributed consensus problem, but not only that - the potential presence of malicious actors extends it to what is known as the "Byzantine generals problem" (being so named from the original formulation, in which the generals had to independently decide whether to attack a city). This problem is being widely analysed since the 80s, and there are multiple solutions for it - but it's still not the end! In our case, we want to be sure that the decision will be correct even if the messages being passed between computers are arbitrarily delayed. This introduces something called "asynchrony" - a lack of assumptions regarding timing. The problem defined this way is called ABFT - Asynchronous Byzantine Fault Tolerance.

There are solutions to ABFT as well - but they are either too slow, or too complex, or patented, or they have some other issues. This made us at MaidSafe decide to try and come up with our own algorithm, basing it on existing knowledge. PARSEC - a Protocol for Asynchronous, Reliable, Secure and Efficient Consensus - is the result of this effort.

PARSEC - contrary to its name - isn't fully asynchronous. In its current form, it contains some assumptions about delays in message delivery, which make it somewhat less sophisticated theoretically, but perhaps more practical. We are still looking for a way of getting full asynchrony, though.

I won't be getting into the technical details. For those interested, they are described here and here. In short, we combined the idea of a graph of "gossip" among the nodes with Asynchronous Binary Consensus (which is a type of consensus about a single value which can be either 0 or 1) using something called a "concrete coin" (roughly speaking, it is about computers being able to "toss a coin" independently and get 0 or 1 randomly, but so that they all get the same value with a large probability).

The whole field is very interesting and there is much left to be discovered there. For those interested in studying the issue a bit deeper - I encourage you to read the documents linked above, an article on Medium and to take a look at the SAFE Network community forums. I'll gladly answer any questions myself as well, so feel free to ask in the comments! :)