www.prismmodelchecker.org

Randomised Mutual Exclusion

(Rabin)

Contents

Acknowledgements: We are grateful to Martijn Moraal, Jeroen Schutte, Fides Aarts and Johan Uijen (students on Frits Vaandrager's Analysis of Embedded Systems course) for contributions to this case study.

Introduction

This case study is based on Rabin's solution to the well known mutual exclusion problem [Rab82].

Let P1 ... Pn be n processes that from time to time need to execute a critical section in which exactly one of them is allowed to employ a shared resource. The processes can coordinate their activities by use of a common resource.

The Model

The model is represented as an MDP. We let N denotes the number of processes. The PRISM source code for the case where N = 3 is show below:

// N-processor mutual exclusion [Rab82]
// gxn/dxp 03/12/08

// to remove the need for fairness constraints for this model it is sufficent
// to remove the self loops from the model 

// the step corresponding to a process making a draw has been split into two steps
// to allow us to identify states where a process will draw without knowing the value
// randomly drawn
// to correctly model the protocol and prevent erroneous behaviour, the two steps are atomic
// (i.e. no other process can move one the first step has been made)
// as for example otherwise an adversary can prevent the process from actually drawing
// in the current round by not scheduling it after it has performed the first step

mdp

// size of shared counter
const int K = 6; // 4+ceil(log_2 N)

// global variables (all modules can read and write)
global c : [0..1]; // 0/1 critical section free/taken
global b : [0..K]; // current highest draw
global r : [1..2]; // current round

// formula for process 1 drawing
formula draw = p1=1 & (b<b1 | r!=r1);

// formula to keep drawing phase atomic
// (a process can only move if no other process is in the middle of drawing)
formula go = (draw2=0&draw3=0);

module process1

	p1 : [0..2]; // local state
	//  0 remainder
	//  1 trying
	//  2 critical section
	b1 : [0..K]; // current draw: bi
	r1 : [0..2]; // current round: ri
	draw1 : [0..1]; // performed first step of drawing phase

	// remain in remainder
	// [] go & p1=0 -> (p1'=0);
	// enter trying
	[] go & p1=0 -> (p1'=1);
	// make a draw
	[] go & draw & draw1=0 -> (draw1'=1);
	[] draw1=1 -> 0.5 : (b1'=1) & (r1'=r) & (b'=max(b,1)) & (draw1'=0)
	         + 0.25 : (b1'=2) & (r1'=r) & (b'=max(b,2)) & (draw1'=0)
	         + 0.125 : (b1'=3) & (r1'=r) & (b'=max(b,3)) & (draw1'=0)
	         + 0.0625 : (b1'=4) & (r1'=r) & (b'=max(b,4)) & (draw1'=0)
	         + 0.03125 : (b1'=5) & (r1'=r) & (b'=max(b,5)) & (draw1'=0)
	         + 0.03125 : (b1'=6) & (r1'=r) & (b'=max(b,6)) & (draw1'=0);
	// enter critical section and randomly set r to 1 or 2
	[] go & p1=1 & b=b1 & r=r1 & c=0 -> 0.5 : (r'=1) & (c'=1) & (b'=0) & (b1'=0) & (r1'=0) & (p1'=2)
	                                  + 0.5 : (r'=2) & (c'=1) & (b'=0) & (b1'=0) & (r1'=0) & (p1'=2);
	// loop when trying and cannot make a draw or enter critical section
	// [] go & p1=1 & r1=r & b1<=b & ((c=0 & b1!=b) | c=1) -> (p1'=p1);
	// leave crictical section
	[] go & p1=2 -> (p1'=0) & (c'=0);
	// stay in critical section
	// [] go & p1=2 -> (p1'=2);
	
endmodule

// construct further modules through renaming
module process2 = process1 [p1=p2, b1=b2, r1=r2, draw1=draw2, draw2=draw1 ] endmodule
module process3 = process1 [p1=p3, b1=b3, r1=r3, draw1=draw3, draw3=draw1 ] endmodule

// formulas/labels for use in properties:

// number of processes in critical section
formula num_procs_in_crit = (p1=2?1:0)+(p2=2?1:0)+(p3=2?1:0);

// one of the processes is trying
label "one_trying" = p1=1|p2=1|p3=1;

// one of the processes is in the critical section
label "one_critical" = p1=2|p2=2|p3=2;

// maximum current draw of the processes
formula maxb = max(b1,b2,b3);

View: printable version          Download: rabin3.nm

The table below shows statistics for each model we have built. The first section gives the number of states in the MDP representing the model. The second part refers to the MTBDD which represents this MDP: the total number of nodes and the number of leaves (terminal nodes).

N:   Model:   MTBDD:
States: Nodes: Leaves:
3 27,766 27,966 7
4 668,836 63,135 7
5 27,381,358 131,5918
6 624,265,622196,3768
7 1.3e+11 271,8258
8 2.8e+12 357,9388
9 1.6e+13 723,5149
103.5e+15 894,8979

The table below gives the times taken to construct the models. There are two stages. Firstly, the system description (in our module based language) is parsed and converted to the MTBDD representing it. Secondly, reachability is performed to identify non-reachable states and the MTBDD is filtered accordingly. Reachability is performed using a BDD based fixpoint algorithm. The number of iterations for this process is also given.

N:   Construction:   Reachability:
Time (s): Time (s): Iter.s:
3 0.5270.23 21
4 1.4150.81 24
5 4.7763.85 27
6 10.198.59 30
7 20.1617.6233
8 35.6330.5136
9 97.9286.7639
10188.4169.442

Model Checking

When model checking, we have the following requirements:

  • Scheduler-Liveness: for each time t and any process Pi not in its remainder phase at time t, there exists a time t' > t in which Pi makes a move.
  • Fault freeness: if a process Pi moves into its critical section phase then eventually Pi moves from this phase to its exit phase and from its exit phase to its remainder phase.

These correspond to fairness assumptions on the adversary/scheduler. However, by following the approach employed by [DFP04] for the dining philosophers case study, in fact we can bypass this restriction by removing the self-loops from the model.

We have verified the following PRISM properties in all states of the model:

// Mutual exclusion: at any time t there is at most one process in its critical section phase
num_procs_in_crit <= 1

// Liveness: if a process is trying, then eventually a process enters the critical section
"one_trying" => P>=1 [ F "one_critical" ]
View: printable version          Download: rabin_prop1.pctl

One requirement that we have been unable to verify is:

  • k-Bounded-Waiting: if M processes participate in a trying round, then each process has probability at least 1/kM of entering the critical section at the end of the trying round."

One problem with this requirement is that it cannot be expressed by a PCTL formula as property corresponds to the probability of a conjunction of path formulae (expressing the properties a process participates in a round, M processes participate in a round and a process enters the critical section) being greater than a certain bound. However, in PCTL we are restricted to comparing only the probability of single path formulae with bounds.

On the other hand, this property could be expressed in PCTL*, but we would still need to consider a different formula for each possible value of M (M ranges from 1 up to the number of processes).

We note that Saias [Sai92] proved that this property does not hold. Furthermore, Kushilevitz and Rabin [KR92] have presented a modified version of this algorithm which solves this problem.

Based on ideas from Martijn Moraal and Jeroen Schutte, we instead consider a weaker property. More precisely, we compute:

  • Prob-Win: "the minimum probability that a process enters the critical section, given that the process tries in the current round"

As shown in the PRISM description above, to allow for the analysis of this property, we modified the model by dividing the step corresponding to a process making a draw into two. This allowed us to identify states where a process will draw without knowing what value the process will randomly pick. To correctly model the protocol and prevent erroneous behaviour, we kept these two steps atomic (i.e. no other process can move once the first step has been made). For example, if we do not add this restriction, then the adversary can prevent the process from actually drawing in the current round by not scheduling it after it has performed the first step.

The actual property, concerning process 1, verified in PRISM is given below:

// minimum probability process 1 enters the criticial section next
Pmin=?[ !"one_critical" U (p1=2) {draw1=1 & !"one_critical"}{min} ]
View: printable version          Download: rabin_prop2.pctl

The use of the filter in the property expresses the fact that we are only interested in the probability in states for which:

  • the process is going to make a draw (draw1=1);
  • no process is in the critical section (otherwise probability is clearly 0);

and we take the minimum value over this set of states.

Verifying this property returns the value 0 which is due to the fact that the adversary can use the values previously drawn by other processes. This is similar to the proof Saias [Sai92] presents for proving that Rabin's bounded waiting property does not hold. To demonstrate this fact we present the results for the following property which restricts attention to those states where the values drawn by the other processes are below k.

const int k;

Pmin=?[ !"one_critical" U (p1=2) {draw1=1 & !"one_critical" & maxb<=k}{min} ]
View: printable version          Download: rabin_prop3.pctl

The tables below present results for this property as k varies (note that the case k=4+ceil(log2 N) corresponds to the Prob-Win property above).

  • k=0:
    N:   Model Checking:   Probability:
    Time (s): Iter.s:
    3 0.0538 0.237457
    4 0.164110.180014
    5 0.476140.144384
    6 4.570170.119936
    7 10.17200.102551
    8 21.11230.089578
    9 88.64260.079959
    10199.6290.071948

  • k=1:
    N:   Model Checking:   Probability:
    Time (s): Iter.s:
    3 0.0538 0.237457
    4 0.164110.180014
    5 0.476140.144384
    6 4.570170.119936
    7 10.17200.102551
    8 21.11230.089578
    9 88.64260.079959
    10199.6290.071948

  • k=2:
    N:   Model Checking:   Probability:
    Time (s): Iter.s:
    3 0.0538 0.208007
    4 0.164110.174957
    5 0.476140.144384
    6 4.570170.119936
    7 10.17200.102551
    8 21.11230.089578
    9 88.64260.079959
    10199.6290.071948

  • k=3:
    N:   Model Checking:   Probability:
    Time (s): Iter.s:
    3 0.0538 0.114257
    4 0.164110.104644
    5 0.476140.096728
    6 4.570170.089208
    7 10.17200.082461
    8 21.11230.076397
    9 88.64260.071337
    10199.6290.066468

  • k=4:
    N:   Model Checking:   Probability:
    Time (s): Iter.s:
    3 0.0538 0.059570
    4 0.164110.056793
    5 0.476140.054858
    6 4.570170.052572
    7 10.17200.050404
    8 21.11230.048347
    9 88.64260.046794
    10199.6290.044992

  • k=5:
    N:   Model Checking:   Probability:
    Time (s): Iter.s:
    3 0.0538 0.030273
    4 0.164110.029327
    5 0.476140.029109
    6 4.570170.028432
    7 10.17200.027773
    8 21.11230.027131
    9 88.64260.026903
    10199.6290.026345

  • k=6:
    N:   Model Checking:   Probability:
    Time (s): Iter.s:
    3 0.0538 0.0
    4 0.164110.0
    5 0.476140.014903
    6 4.570170.014671
    7 10.17200.014441
    8 21.11230.014216
    9 88.64260.014392
    10199.6290.014225

  • k=7:
    N:   Model Checking:   Probability:
    Time (s): Iter.s:
    3 - - -
    4 - - -
    5 0.476140.0
    6 4.570170.0
    7 10.17200.0
    8 21.11230.0
    9 88.64260.007395
    10199.6290.007337

  • k=8:
    N:   Model Checking:   Probability:
    Time (s): Iter.s:
    3 - - -
    4 - - -
    5 - - -
    6 - - -
    7 - - -
    8 - - -
    9 88.64260.0
    10199.6290.0

Case Studies