Académique Documents
Professionnel Documents
Culture Documents
What is consensus?
In short concensus is agreement on shared state (Single system image). Its typically used to
build replicated state machine.
Idea is that if the log on all different servers looks the same, the state machine looks the same.
All the state machines produce the same result for the same set of operations.
Failure Model
Raft consider only fail-stop (not Byzantine) model, delayed/lost messages.
Raft Overview
Basic concept is to Parition your data and provide each partition to a state machine. Raft heavily
relies on leader which is an after thought in Paxos.
Leader Election
Select one of the servers as leader, detect crashes and choose a new leader.
Log Replication
Leader takes commands from clients, append those to leader log and leader replicates its log to
other servers.
Safety
Only server with up-to-date log can become leader.
Approaches to Consensus:
1. Symmetric, leader-less
o At any given time, one server is in charge, others accept its decisions.
Raft uses leader, all the complexity comes in when leader fails
Leader Election
1. Safety: allow at most one winner per term.
Log
Each servers has its own copies of logs.
Each log in composed of entries and each entry has an index number.
Each entry contains a command and a term number (increases monotonically.)
Committed: If a particular entry is stored on majority of server, we say that entry is committed.
{Not Really}
Normal Operation:
Client sends command to leader
Log Consistency
Properties that are always true
1. Combination of index and term uniquely idenfies a long entry. i.e., they will have the
same comand. Also, the logs are identical in all preceding entries.
Above properties are enforced by check made during ApendEntries consistency check.
It includes two values: Index and term of entry just before the new one.
If the log doesnt have that entry, RPC will be rejected. Its kind of like an induction step. Which
proves that if an entry in a server matches a leader entry, then all preceding entries also match.
Leader Changes
Old leader may have left entries partially replicated.
No special step taken when new leader comes up. Continues normal operation. cleanup
has to happend during the normal operation.
Raft Approach:
Assume that leaders log is always correct. Will eventually make followers logs identical to
leaders.
Scenario slide 15: S4,S5 were leaders for term 2,3 and 4. Safety Requirement: Once a log entry
has been applied to a state machine, no other state machine must apply a different value for that
log entry.
Raft Guarantee: Once a leader has decided that a particular entry is committed, that entry will be
present in the logs of all future leaders.
How?
We cant know for sure if entry is committed. We elect the server that has the log that is most
likely all the enteries that are committed.
When candidate request a vote: it includes: index and term of last log entry. -> unqiuely entire
log.
2. If the terms match and the voting servers log is longer, deny the vote.
Whoever wins the election is guarantted to have the most complete log among the candidate
servers.
Slide 19:
At least one new entry from leaders term must also be stored on majority of servers.
Combination of election rules and commitment rules makes Raft safe.
Reparing Logs
Slide 21:
Get rid of all extra eneteries and fill them with enteries from leaders log.
Leader keeps nextIndex for each follower, nextIndex is 1 + leaders last index.
On receiving AppendEnteries, consistency check fails, decrement the nextIndex and try again. If
follower overwrites inconsistent entry, it deletes all subsequent entries. Old leader may not be
dead
Client Protocol
Sender commands to leader
If leader unknown, contact any server If contacted server not leader, it will redirect to
leader.
Leader doesnt response until logged, committed and executed by leaders state machine.
Client embeds unique id in each command. Before accepting command, leader checks its log for
entry with that id.
Change degree of replication. Cannot switch directly from one configuration to antoher:
conflictin majorities could arise. conflicting majorities
Two Phase:
1st phase: joint consensus (intermediate phase that need majority of both old and new
configurations for elections.) union of old and new configurations. separate
2nd phase: clients makes a request to leader, server adds an entry to log describing new
configuration.