Consensus improvement draft

I've been doing research on documenting, isolating, and improving nano's consensus algorithm. The primary driver for this is to provide a formal specification that can be better analyzed and formally proven.

I've been making revisions, a reference implementation, and seeking out people with a background in this type of research with whom to collaborate. This is a highly specialized discipline and our search so far has not yet yielded anyone able to do an informed review despite targetted reach outs. Recent visibility around this area is why I'm putting this research out as-is, rather than in a completed state.

Attached is the latest copy of the draft and this is a link to the latest reference implementation.

I will continue to update the draft, specifically the parts around "slates" which have been expanded on in the reference implementation. In short, slates are replaced with a confidence metric that can be adjusted.

I've done initial testing on the confidence metric, which essentially amounts to the number of successive votes that need to be in agreement for confirmation to complete. Below is a table showing the number of successive votes and the associated failure rate in worst case byzantine behavior as tested so far:
1 vote - 0.3%
2 votes - 0.000002%
3 votes - (Unmeasurable)
The results so far are positive and show that with very few additional successive votes, the failure rate drops off exponentially.

There are a number of existing consensus algorithms on the market with varying degrees of research though many of them don't work in the way required for a platform that scales as nano does. The main categories existing algorithm fail for nano's use case are 1) algorithms that require leader selection i.e. Paxos. Paxos is probably the most robust algorithm on the market though it's designed for database clusters in organizations rather than an open scalable network. 2) algorithms that are not byzantine tolerant. Next to Paxos, Raft is in a similar market vertical and it is not designed to tolerate byzantine behavior. 3) algorithms that require interactive request/acknowledge from validators. As the number of non-validators in the network increases, the load these validators must handle increases to unsustainable levels and presents a denial of service attack simply by adding more non-validators to the network.

An ideal algorithm would provide consensus using network broadcasts which can be efficiently implemented using network topology techniques such as multicast or overlays that calculate efficient minimum spanning trees.

Since this thread is likely to have a lot of interaction, please keep comments on-topic and brief so the discussion doesn't get fragmented.

17 Likes

Very interesting. Could you please explain why is it needed to know the timeframe of the vote? I thought that non-byzantine validators never vote twice, is it incorrect somehow?

Multiple votes are needed in order to resolve conflicting proposals and obtain a quorum. The reason the vote duration is specified is so no one can interpret subsequent votes as overlapping. If a vote is at time 1000 and duration 50, they must wait until time 1050 in order to issue a new, conflicting vote.

2 Likes

I understand the proposition, just do not understand why would they need to publish a conflicting vote? Why not just wait until 65%-quorum is reached without reissuing any votes?

Are you considering any other alternatives that are leaderless Byzantine safe consensus algorithms?

Eg