Consensus algorithms are a key component of any successful distributed system. I have already briefly discussed the topic previously, but I want to give some extra emphasis since I think it is ultra-valuable.
An example that you might be familiar with is bitcoin’s distributed ledger system that ensures transparency by requiring all nodes to maintain decentralized records of transactions. The creation of these public ledgers, like blockchain which has been made so popular by bitcoin, requires the parties involved to agree on a single value. This is made possible with the use of consensus algorithm.
So, what really is a consensus algorithm? TechTarget defines it as:
“A consensus algorithm is a process in computer science used to achieve agreement on a single data value among distributed processes or systems. Consensus algorithms are designed to achieve reliability in a network involving multiple unreliable nodes.”
We take a look at some of the most popular consensus algorithms available and in use today.
The Paxos algorithm is one of these consensus algorithms that helps machines agree on a single value in distributed systems. It is also one of the oldest, and most versatile algorithms when it comes to distributed consensus.
The algorithm defines a peer to peer consensus protocol that works on a simple majority rule. This means that only a majority of devices will have to agree on a single value to achieve the result. This also makes Paxos algorithm fault tolerant as some of the peers can fail or fail to respond and the system would still function.
The basic steps in the working of the algorithm start with a node being selected as leader or proposer. This leader then selects and sends a value to all nodes which can be called acceptors. The acceptors then either reply with reject or accept to the leader. As soon as majority consensus is reached, a commit message is broadcasted to all the nodes. There is also the possibility of the leader node failing and that is why Paxos algorithm does not specify a single leader at a given point and allows that any node may make itself the leader and try to coordinate the transaction.
The Paxos algorithm was hard to understand and even harder to implement. A solution was presented in the form of Raft consensus algorithm. It is equivalent to Paxos in fault tolerance and performance. Each state in a distributed system working on Raft algorithm maintains a replica of information so that at any time no information is lost due to node failure. The algorithm works in two broad stages
The first stage is leader election. There can only be one leader at one time, although in the case of leader failure a new leader is elected – unlike Paxos where any node could become a leader on its own, in Raft the leader is elected by the other nodes. Each node in the system can either be a follower or a leader. The leader sends a heartbeat message to the followers periodically to indicate that it is alive. As soon as a node does not receive a heartbeat from the leader within a timeout it declares itself the leader and sends a vote request to all the other nodes. Each node that gets a vote request votes for the node that sent the very first vote request. In this way, a candidate is eventually able to get majority number of votes to become the leader. It is important to note that each node is allotted a different timeout so that several nodes may not declare themselves as candidates for leader and start the voting process.
The second stage is committed log replication. The leader handles any client request that comes in and logs it into its own commit log before sending it out to the other nodes. The other nodes replicate the log and send back a reply to the leader. Assuming there are 2n+1 nodes in the system then the leader only waits for a reply from n+1 nodes before sending a positive reply to the client.
Proof of Work
The Proof of Work algorithm was the first blockchain consensus algorithm devised by Satoshi Nakamoto. It ensures the accuracy and agreement of devices on a single value by requiring nodes to solve increasingly complex and difficult mathematical problems. In Bitcoin’s blockchain, recording a new transaction in the public ledger requires the client to solve the complex problem before the transaction can become a part of the public record. The algorithm has a slow throughput and consumes a lot of power as well as computational energy and time.
So why are these consensus algorithms and many more that we haven’t mentioned important? Or are they even important? Where, why, and how are they used? Three of the most common applications of consensus algorithms today include:
- Deciding whether to commit a distributed transaction to a database.
- Designating node as a leader for some distributed task.
- Synchronizing state machine replicas and ensuring consistency among them.
Proof of Work
The proof of work consensus algorithm is like the founding block of blockchains and cryptocurrencies and that is where its major implementations are. Bitcoin, Ethereum, Litecoin, Dogecoin, and most other cryptocurrencies make use of this algorithm. Its inefficiency in terms of energy consumption and being uneconomical is causing cryptocurrencies like Ethereum to move away from it.
Kubernetes and Docker Swarm are two very popular tools that enable management and handling of containers and containerized applications. Containerization is a lightweight alternative to full machine virtualization where applications are encapsulated within their own operating environment within a container. Both Docker Swarm and Kubernetes make use of and implement the raft consensus algorithm for this purpose.
Paxos is hard to implement kind of consensus algorithm which is why people shy away from it as it is hard to understand as well. It can be implemented in any distributed system to allow for complete fault tolerance and also ensuring a complete consensus of all connected state machines on a single value. It would ensure that even one or a few devices failing does not result in the loss of information or data. There are some issues with Paxos on top of it being hard to understand which make people prefer Raft over it.
Interested in API strategy, architecture, and design. Working to improve the quality and usability of information on the Web. CTO @OrfiumMusic