To see how the checkpointing protocol works, consider the example shown in Figure 2.6. In this example, we assume that the distributed system consists of three processes, where the three processes are fully connected, i.e., P 0has a connection with P 1, P 1has a connection with P 2, and P 2has a connection with P 0. Therefore, each process has two incoming channels and two outgoing channels connected to its two neighbors.
Assume process P 0is the checkpointing coordinator. It initiates the global checkpointing by sending a CHECKPOINT message to P 1and P 2, respectively, along the two outgoing channels. In the mean time, P 1sends a regular message m 0to P 0, and P 2sends a regular message m 1to P 1.
Upon receiving the CHECKPOINT message from P 0, P 1stops normal execution and sends a CHECKPOINT message along each of its outgoing channel to P 0and P 2, respectively. Similarly, P 2sends the CHECKPOINT message to P 0and P 1, respectively, once it receives the first CHECKPOINT message.
Due to the FIFO property of the connections, P 0receives m 0before it collects all the CHECKPOINT messages from all its incoming channels, and P 1receives m 1before it receives the CHECKPOINT messages from P 2. According to the protocol rule, such regular messages are logged instead of delivered because normal execution must be stopped once the global checkpointing is initiated. These logged messages will be appended to the local checkpoint once it is taken. In fact, such messages reflect the channel states of the distributed system. These messages won’t be delivered for execution until a process resumes normal execution.
When P 0receives the CHECKPOINT messages from P 1and P 2, it takes a local checkpoint, C 0,0and append the message log to the checkpoint. Similarly, P 1takes a local checkpoint when it receives the CHECKPOINT messages from P 0and P 2, and P 2takes a local checkpoint when it receives the CHECKPOINT messages from P 0and P 1.
Subsequently, P 1and P 2send their SAVED messages to P 0, i.e., the global checkpointing coordinator. P 0then informs P 1and P 2to resume normal execution with a RESUME message to each of them.
A more complicated distributed system in which some processes do not have direct connection with the coordinator will require some of the coordinator’s neighbors to relay the SAVED notification to the coordinator.
2.2.2.2 Correctness of the Protocol.
It is easy to see why the protocol always produce a set of checkpoints that can be used to reconstruct a consistent global state in the absence of failures. As shown in Figure 2.2(a) and (b), a consistent global state consists of only two scenarios with respect to each pair of local states:
1 All messages sent by one process prior to its taking a local checkpoint have been received and executed before the other process takes its local checkpoint.
2 Some messages sent by one process prior to its taking a local checkpoint might arrive after the other process has checkpointed its state, however, these messages are logged at stable storage for replay.
In the Tamir and Sequin protocol, if neither the coordinator nor any of the participants receives any regular message once the global checkpointing is initiated, then the scenario 1 holds. On the other hand, if a process receives one or more regular messages, it logs them and append them to the local checkpoint, ensuring their replayability. Hence, the scenario 2 holds. Because the protocol prohibits any process from continuing normal execution (including the sending of a message) as soon as it initiates (if it is the coordinator) or receives the very first CHECKPOINT message (for a participant), no process would receive a message prior to its checkpointing that has been sent by another process after that process has taken its local checkpoint in the same round. That is, the inconsistent global state scenario shown in Figure 2.2(a)does not occur.
2.2.3 Chandy and Lamport Distributed Snapshot Protocol
The Tamir and Sequin global checkpointing protocol is very elegant. However, it is a blocking protocol in that normal execution is suspended during each round of global checkpointing. For applications that do not wish to suspend the normal execution for potentially extensive period of time, the Chandy and Lamport distributed snapshot protocol [5] might be more desirable.
The Chandy and Lamport distributed snapshot protocol [5] is a nonblocking protocol in that normal execution is not interrupted by the global checkpointing. However, unlike the Tamir and Sequin protocol, the Chandy and Lamport distributed snapshot protocol only concerns on how to produce a consistent global checkpoint, and it prescribes no mechanisms on how to determine the end of the checkpointing round, and how to atomically switch over to the new global checkpoint.
Figure 2.7 Finite state machine specification for the Chandy and Lamport distributed snapshot protocol.
2.2.3.1 Protocol Description.
The finite state machine diagram for the Chandy and Lamport distributed snapshot protocol is given in Figure 2.7. A process will be in the Normal state between two rounds of global checkpointing, and in the Checkpointing state during a global checkpointing round. A process may encounter a number of events:
◾ The global checkpointing can be initiated by any of the processes in the distributed system. Once a process decides to initiate a global checkpointing round, it takes a local checkpoint and sends a Marker message to each of its outgoing channels. The state of the process changes from Normal to Checkpointing as a result.
◾ A process undergoes the same state transition (from Normal to Checkpointing) and take the same actions upon receiving the Marker message for the first time, except that it logs the Maker in a data structure referred to as the Marker Certificate in the finite state machine diagram. The Marker Certificate data structure keeps track of which incoming channel has received a Marker and whether or not all incoming channels have received the Marker. The Marker Certificate is called complete when every incoming channel has received a Marker.
◾ When a process receives the Marker message from a channel when it is in the Checkpointing state, it adds the Marker message to the Marker Certificate and checks whether or not the Marker Certificate is complete. If the Marker Certificate is now complete, the process transits to the Normal state (and possibly reports the completion of the global checkpointing to some predefined server). Otherwise, the process will remain in the Checkpointing state.
◾ In either the Normal or Checkpointing state, the process may receive a regular message. The regular message is always executed immediately. This is drastically different from the Tamir and Sequin global checkpointing protocol. The regular message will be appended to the channel state from which it is received only when the process is in the Checkpointing state and it has not received the Marker message in this channel.
An example run of the distributed snapshot protocol in a three-process distributed system is shown in Figure 2.8. P 0is the initiator of the round of the global checkpointing. P 0takes a local checkpoint and sends a Marker message along each of its outing channels. Upon receiving the Marker message, P 1immediately takes a local checkpoint and in turn sends a Marker message to each of its outgoing channels. Similarly, P 2takes a local checkpoint when it receives the first Marker message (from P 1) and sends a Marker message to each of its outgoing channels connecting to P 0and P 1, respectively.
Читать дальше