(Aspnes & Herlihy)
This case study concerns modelling and verifying the shared coin protocol of the randomised consensus algorithm of Aspnes and Herlihy [AH90] using PRISM. The shared coin protocol returns a preference 1 or 2, with a certain probability, whenever requested by a process at some point in the execution of the consensus algorithm. The shared coin protocol implements a collective random walk parameterised by the number of processes N and the constant K > 1 (independent of N). The consensus algorithm proceeds through rounds and for each round r there is a new copy of the shared coin protocol denoted CFr.
The processes access a global shared counter, initially 0. On entering the protocol, a process flips a coin, and, depending on the outcome, increments or decrements the shared counter. Since we are working in a distributed scenario, several processes may simultaneously want to flip a coin, which is modelled as a nondeterministic choice between probability distributions, one for each coin flip. Having flipped the coin, the process then reads the counter, say observing c. If c >= KN it chooses 1 as its preferred value, and if c <= -KN it chooses 2. Otherwise, the process flips the coin again, and continues doing so until it observes that the counter has passed one of the barriers.
There are two properties of the shared coin protocol we need to verify to prove Probabilistic wait free termination.
In the case of C2, calculate the exact (minimum) probability of the processes drawing the same value, as opposed to a lower bound p=(K-1)/2K established analytically in [AH90] using random walk theory.
The PRISM source code in the case when there are 8 processes is given below.
// COIN FLIPPING PROTOCOL FOR POLYNOMIAL RANDOMISED CONSENSUS [AH90] // 8 processes // gxn/dxp 20/11/00 mdp // CONSTANTS const int N=8; //number of processes const int K; // parameter const int range = 2*(K+1)*N; // range of sharded coin const int counter_init = (K+1)*N; // initial value const int left =2; // choose tails const int right= 2*(K+1)*N -N; // choose heads // shared coin global counter : [0..range] init counter_init; module process1 // program counter pc1 : [0..3]; // 0 - flip // 1 - write // 2 - check // 3 - finished // local coin coin1 : [0..1]; // flip coin [] (pc1=0) -> 0.5 : (coin1'=0) & (pc1'=1) + 0.5 : (coin1'=1) & (pc1'=1); // write tails -1 (and reset local coin) [] (pc1=1) & (coin1=0) & (counter>0) -> (counter'=counter-1) & (pc1'=2) & (coin1'=0); // write heads +1 (and reset local coin) [] (pc1=1) & (coin1=1) & (counter<range) -> (counter'=counter+1) & (pc1'=2) & (coin1'=0); // check [] (pc1=2) & (counter<=left) -> (pc1'=3) & (coin1'=0); // decide tails [] (pc1=2) & (counter>=right) -> (pc1'=3) & (coin1'=1); // decide heads [] (pc1=2) & (counter>left) & (counter<right) -> (pc1'=0); // flip again // loop when finished [] (pc1=3) -> (pc1'=3); endmodule // construct remaining processes through renaming module process2 = process1[pc1=pc2,coin1=coin2] endmodule module process3 = process1[pc1=pc3,coin1=coin3] endmodule module process4 = process1[pc1=pc4,coin1=coin4] endmodule module process5 = process1[pc1=pc5,coin1=coin5] endmodule module process6 = process1[pc1=pc6,coin1=coin6] endmodule module process7 = process1[pc1=pc7,coin1=coin7] endmodule module process8 = process1[pc1=pc8,coin1=coin8] endmodule
All experiments were run on a 440 MHz SUN Ultra 10 workstation with 512 Mb memory under the Solaris 2.7 operating system.
The table below shows statistics for each model and the MTBDD which represents it. The first column gives the size of the model (i.e. the number of states). The second two columns refer to the MTBDD: the total number of nodes in the structure and the number of leaves (or terminal nodes).
| Model: | States: | Nodes: | Leaves: | ||
| N | K | ||||
| 2 | 2 | 272 | 290 | 3 | |
| 4 | 528 | 301 | 3 | ||
| 8 | 1,040 | 312 | 3 | ||
| 16 | 2,064 | 323 | 3 | ||
| 32 | 4,112 | 334 | 3 | ||
| 64 | 8,208 | 345 | 3 | ||
| 4 | 2 | 22,656 | 1,294 | 3 | |
| 4 | 43,136 | 1,305 | 3 | ||
| 8 | 84,096 | 1,316 | 3 | ||
| 16 | 166,016 | 1,327 | 3 | ||
| 32 | 329,856 | 1,338 | 3 | ||
| 6 | 2 | 1,258,240 | 3,134 | 3 | |
| 4 | 2,376,448 | 3,131 | 3 | ||
| 8 | 4,612,864 | 3,142 | 3 | ||
| 16 | 9,085,696 | 3,153 | 3 | ||
| 8 | 2 | 61,018,112 | 5,371 | 3 | |
| 4 | 114,757,632 | 5,382 | 3 | ||
| 8 | 222,236,672 | 5,587 | 3 | ||
| 16 | 437,194,752 | 5,404 | 3 | ||
| 10 | 2 | 2,761,248,768 | 8,691 | 3 | |
| 4 | 5,179,854,848 | 8,713 | 3 | ||
| 6 | 7,598,460,928 | 8,712 | 3 | ||
The table below gives the times taken to construct the models. There are two stages. Firstly, the system description (in our module based language) is parsed and converted to the MTBDD representing it. Secondly, reachability is performed to identify non-reachable states and the MTBDD is filtered accordingly. Reachability is performed using a BDD based fixpoint algorithm. The number of iterations for this process is also given.
| Model: | Construction: | ||||
| N | K | Time (s): | Iter.s: | ||
| 2 | 2 | 0.158 | 37 | ||
| 4 | 0.187 | 73 | |||
| 8 | 0.224 | 145 | |||
| 16 | 0.317 | 289 | |||
| 32 | 0.581 | 577 | |||
| 64 | 1.11 | 1,153 | |||
| 4 | 2 | 0.368 | 99 | ||
| 4 | 0.503 | 195 | |||
| 8 | 0.845 | 387 | |||
| 16 | 1.44 | 771 | |||
| 32 | 2.75 | 1,539 | |||
| 6 | 2 | 1.36 | 149 | ||
| 4 | 1.94 | 293 | |||
| 8 | 3.18 | 581 | |||
| 16 | 5.50 | 1,157 | |||
| 8 | 2 | 3.03 | 199 | ||
| 4 | 4.37 | 391 | |||
| 8 | 7.50 | 775 | |||
| 16 | 13.2 | 1,543 | |||
| 10 | 2 | 6.80 | 249 | ||
| 4 | 10.1 | 489 | |||
| 6 | 12.6 | 729 | |||
Requirements. when model checking, we have the following requirement:
This corresponds to the conditions on the adversary/scheduler which we impose by model checking against only fair adversaries.
Model Checking Algorithms.There are two precomputation steps in the model checking procedure involving BDD based fixpoint algorithms: Prob1E and Prob0A which finds those states such that the maximum probability of satisfying the formula is 1 and 0 respectively.
In addition, in the main algorithm we stop iterating when the relative error between subsequent iteration vectors is less than 1e-6, that is:
Properties. We have model checked the following properties:
This is expressed in PCTL for 8 processes by:
P>=1 [ true U pc1=3 & pc2=3 & pc3=3 & pc4=3 & pc5=3 & pc6=3 & pc7=3 & pc8=3 ]
This is a probability 1 property, and therefore the solution is found through a probability-1 precomputation step.
Model checking times:
| Model: | Precomputation: | |||
| N | K | Time (s): | Iterations: | |
| 2 | 2 | 0.02 | 17 | |
| 4 | 0.03 | 29 | ||
| 8 | 0.04 | 53 | ||
| 16 | 0.15 | 197 | ||
| 32 | 0.13 | 101 | ||
| 64 | 0.31 | 389 | ||
| 4 | 2 | 0.40 | 33 | |
| 4 | 0.60 | 57 | ||
| 8 | 0.57 | 105 | ||
| 16 | 0.82 | 201 | ||
| 32 | 1.64 | 393 | ||
| 6 | 2 | 1.68 | 49 | |
| 4 | 2.06 | 85 | ||
| 8 | 2.93 | 157 | ||
| 16 | 4.21 | 301 | ||
| 8 | 2 | 6.44 | 65 | |
| 4 | 7.93 | 113 | ||
| 8 | 10.26 | 209 | ||
| 16 | 15.05 | 401 | ||
| 10 | 2 | 18.12 | 81 | |
| 4 | 18.20 | 141 | ||
| 6 | 20.39 | 201 | ||
In this case we calculate the actual minimum probability, i.e. the minimum value of p for which C2 holds. This query, in the case of 8 processes, can be expressed in PCTL as follows:
Pmin=? [ true U pc1=3 & pc2=3 & pc3=3 & pc4=3 & pc5=3 & pc6=3 & pc7=3 & pc8=3 & coin1=1 & coin2=1 & coin3=1 & coin4=1 & coin5=1 & coin6=1 & coin7=1 & coin8=1 ]
The tables below give the model checking results for the MTBDD, Sparse and Hybrid engines. This shows the minimum value of p and the lower bound (K-1)/2K, established analytically in [AH90] using random walk theory.
Model checking times:
| Model: | Precomputation (Prob0A): | Main Algorithm: | p: | (K-1)/2K: | |||||||||
| N | K | Time (s): | Iterations: | Time per iter. (s): | Iterations: | ||||||||
| MTBDD | Sparse | Hybrid | |||||||||||
| 2 | 2 | 0.034 | 29 | 0.0040 | 0.00004 | 0.00019 | 480 | 0.382814 | 0.25 | ||||
| 4 | 0.133 | 53 | 0.0087 | 0.00012 | 0.00037 | 1,740 | 0.437752 | 0.3125 | |||||
| 8 | 0.093 | 101 | 0.0186 | 0.00025 | 0.00063 | 6,132 | 0.468783 | 0.4375 | |||||
| 16 | 0.427 | 197 | 0.0387 | 0.00052 | 0.0012 | 21,108 | 0.484507 | 0.46875 | |||||
| 32 | 0.178 | 389 | 0.1242 | 0.0011 | 0.0026 | 70,668 | 0.492715 | 0.484375 | |||||
| 64 | 0.385 | 773 | 0.0752 | 0.0021 | 0.0044 | 227,663 | 0.498193 | 0.492188 | |||||
| 4 | 2 | 0.32 | 63 | 0.084 | 0.008 | 0.016 | 1,741 | 0.317360 | 0.25 | ||||
| 4 | 0.42 | 111 | 0.173 | 0.014 | 0.029 | 6,133 | 0.406307 | 0.3125 | |||||
| 8 | 0.50 | 207 | 0.343 | 0.041 | 0.057 | 21,110 | 0.453255 | 0.4375 | |||||
| 16 | 0.89 | 399 | 0.670 | 0.102 | 0.127 | 70,669 | 0.477088 | 0.46875 | |||||
| 32 | 1.70 | 783 | 1.11 | 0.212 | 0.345 | 227,665 | 0.490379 | 0.484375 | |||||
| 6 | 2 | 1.19 | 97 | 0.535 | 0.942 | 1.60 | 3,649 | 0.294365 | 0.25 | ||||
| 4 | 1.59 | 169 | 0.988 | 1.74 | 3.01 | 12,679 | 0.395906 | 0.3125 | |||||
| 8 | 2.49 | 313 | 1.83 | 3.37 | 5.80 | 42,967 | 0.448209 | 0.4375 | |||||
| 16 | 4.92 | 601 | 2.90 | - | 11.72 | 140,887 | 0.475138 | 0.46875 | |||||
| 8 | 2 | 3.71 | 131 | 1.87 | - | - | 6,133 | 0.282790 | 0.25 | ||||
| 4 | 4.98 | 227 | 3.18 | - | - | 21,110 | 0.390749 | 0.3125 | |||||
| 8 | 7.93 | 419 | 4.56 | - | - | 70,669 | 0.445831 | 0.4375 | |||||
| 16 | 13.16 | 803 | 6.91 | - | - | 227,665 | 0.474747 | 0.46875 | |||||
| 10 | 2 | 9.162 | 165 | 8.38 | - | - | 9,157 | 0.275919 | 0.25 | ||||
| 4 | 11.0 | 285 | 8.60 | - | - | 31,255 | 0.387693 | 0.3125 | |||||
| 6 | 12.278 | 405 | 9.79 | - | - | 63,241 | 0.425450 | 0.416667 | |||||