PRISM-games: Property Specification

Specification of properties to be checked in PRISM-games is based on the temporal logic rPATL [CFK+12, CFK+13b], which combines elements of the probabilistic temporal logic PCTL for MDPs, the logic ATL for games and agent-based systems, and extensions of PRISM's reward operators. The properties can be divided into:

zero-sum properties, where one coalition of players tries to optimise/satisfy an objective (maximising/minimising a probability/reward) while the remaining players form a second coalition with the opposite objective;
nonzero-sum properties, where the players are divided into two coalitions, each with a separate objective (maximising/minimising a probability/reward), and the coalitions try to find an equilibrium for these objectives.

For the formal syntax and semantics of zero-sum properties see [CFK+13b] for SMG/TSGs and [KNPS18] for CSGs. For the case of nonzero-sum properties, see [KNPS19, KNPS22].

Zero-Sum Properties

Some simple zero-sum property are listed below.

// player 1 can ensure that within 3 steps 
// a state in which the variable c equals 2 is reached
// no matter the behaviour of the other players
<<1>> P>=0.84 [ F<=3 c=2 ]

// what is the minimum probability player 1 can ensure that 
// within 3 steps a state in which the variable c equals 2 is reached
// no matter the behaviour of the other players
<<1>> Pmin=? [ F<=3 c=2 ]

// what is the maximum probability player 1 can ensure that
// eventually a state in which the variable h equals 2 and variable c equals 0 is reached
// no matter the behaviour of the other players
<<1>> Pmax=? [ F (h=2 & c=0) ]

In general, <<...>> contains a coalition of players for which strategies are required. Some further (generic) example properties of the P operator are listed below.

// Players 4 and 5 can ensure that the probability
// of reaching an "end"-state within 100 time-steps is >= 0.95
<<4,5>> P>=0.95 [ F<=100 "end" ]

// The maximum probability that players p1 and p2 can guarantee of
// of reaching a "b"-state within k steps, whilst not passing through any "c"-states,
// no matter the behaviour of the other players
<<p1,p2>> Pmin=? [ !"c" U<=k "b" ]

// The maximum probability with which players p1 and p2
// can guarantee that an "end"-state is reached
<<p1,p3>> Pmax=? [ F "end"]

// Players 1, 2 and 3 can guarantee that the probability
// of reaching a "b"-state, whilst passing only through "a"-states, is > 0.5
<<1,2,3>> P>0.5 [ "a" U "b" ]

Specifications of rPATL coalitions (within <<..>>) can use either the player names or their indices separated by commas. Player numbering is determined by the order or their specification in the model file, starting from 1. To specify the empty set of players, use <<>>.

PRISM-games also has a variety of reward-based properties (see [CFK+12, CFK+13b] for details):

// The maximum expected (state) reward "r" at the k step
// that can be guaranteed by player 1
<<1>> R{"r"}max=? [ C<=k ]

// The minimum expected reward "r" accumulated over the first k steps
// that can be guaranteed by player p2
<<p2>> R{"r"}min=? [ C<=k ]

// The expected total value for reward structure "r"
// that can be guaranteed by player 1 is at least 2.5
<<1>> R{"r"}>=2.5 [ C ]

// The maximum expected value of reward "r" accumulated before
// reaching "success" that can be guaranteed by player p1
<<p1>> R{"r"}max=? [ F "success" ]

// Under any possible strategy (for all players),
// the expected reward accumulated before reaching "finished" is > 100
<<>> R>100 [ F0 "finished" ]

// The minimum expected reward accumulated before reaching "fail"
// that can be guaranteed by players p2 and p5
<<p2,p5>> Rmin=? [ Fc "fail" ]

To use expected total rewards, the game must be a stopping game (i.e., terminal states with zero reward must be reached almost surely under all strategies).

When verifying SMGs/TSGs there are additional reward-based properties that can be verified including:

expected mean-payoff;
ratios of expected mean-payoffs;
almost-sure satisfaction of mean-payoff or ratio rewards.

For expected long-run objectives or ratio objectives, games must be controllable multichain (i.e., the sets of states that can occur in any maximal end component are almost surely reachable).

Examples of these properties are:

// The expected long-run average (mean payoff) value for reward structure "r"
// that can be guaranteed by player 1 is at least 2.5
<<1>> R{"r"}>=2.5 [ S ]

// The ratio of expected long-run average (mean payoff) values for reward structures "r" and "c"
// that can be guaranteed by player 1 is at least 0.5
<<1>> R{"r"}/{"c"}>=0.5 [ S ]

// Player 1 can guarantee that, with probability 1,
// the expected long-run average (mean payoff) value for reward structure "r" is at least 1.5
<<1>>  P>=1 [ R(path){"r"}>=1.5 [ S ] ]

// Player 1 can guarantee that, with probability 1,
// the ratio of expected long-run average (mean payoff) values
// for reward structures "r" and "c" is at least 0.5
<<1>>  P>=1 [ R(path){"r"}/{"c"}>=0.5 [ S ] ]

Finally, PRISM-games also supports multi-objective properties for SMGs/TSGs, in which multiple objectives can be combined using a Boolean expression comprising the operators & (conjunction), | (disjunction), ! (negation) and => (implication). Currently, the objectives combined together must of the same type (e.g. expected total reward). For the case of almost-sure satisfaction properties, objectives must be combined into a conjunction, not an arbitrary Boolean expression.

// Player 1 can guarantee that the expected total reward values
// for reward structures "r1" and "r2" are at least v1 and v2, respectively
<<1>> ( R{"r1"}>=v1 [ C ] & R{"r2"}>=v2 [ C ] )

// Player 1 can guarantee that, whenever the ratio of
// expected long-run average values for "r1" and "c" is at most v1,
// then the ratio for "r2" and "c" is at least v.
<<1>> ( R{"r1"}/{"c"}<=v1 [ C ] => R{"r2"}/{"c"}>=v2 [ C ] )

Nonzero-Sum (Equilibria) Properties

Some simple nonzero-sum properties, which describe Nash equilibria, are listed below. More precisely, these refer to social welfare Nash equilibria, i.e., they also maximise the sum of the two objectives.

// Players p1 and p2 can collaborate so that the sum of their probabilities
// of reach their goal states is at least 1.5
<<p1:p2>> >=1.5(P[ F "goal1" ] + P[ F "goal2" ])

// What is the sum of expected values when players 1 and 2 collaborate 
// and each wants to minimise their expected rewards, with respect to
// the reward measures r1 and r2, to reach their individual goals
<<p1:p2>>min=?(R{"r1"}[ F "goal1" ] + R{"r2"}[ F "goal2" ])

Unlike zero-sum properties, nonzero sum properties need to specify two coalitions of players. This is again achieved using the <<...>> construct, but now containing two comma-separated lists of players, split by a colon.

Some further example of probabilistic nonzero-sum properties are listed below.

// Players p1 and p2 can collaborate so that each player 
// reaches their goal states without crashing with probability 1
<<p1:p2>> >=2( P[ !"crash1" U "goal1" ] + P[ F !"crash2" U "goal2" ] )

// What is the sum of probabilities when the players collaborate 
// to reach there individual goals without with the condition 
// player 2 has to reach its goal within 10 steps without crashing
<<p1:p2>>max=?( P[ F "goal1" ] + P[ F !"crash2" U<=10 "goal2" ] )

// What is the sum of the probabilities for the players to send 
// their individual messages when the players collaborate and 
// the players p2 and p3 form a coalition to send their messages
<<p1:p2,p3>>max=?( P[ F "sent1" ] + P[ F "sent2" & "sent3" ] )

And some examples of reward nonzero-sum properties are:

// What is the sum of the individual expected costs of the players p1 and p2
// over the first k steps when the player collaborate to minimize these costs
<<p1:p2>>max=? ( R{"c1"}[ C<=k ] + R{"c2"}[ C<=k ] )

// Can the players guarantee the sum of the individual rewards at
// steps 8 for player 1 and step 10 for player 2 is at least 4.5
<<p1:p2>> >=4.5 ( R{"r1"}[ I=8 ] + R{"r2"}[ I=10 ] )

// What is the maximum sum of the expected time of player 1 eventually sending its packet
// and the expected time for the coalition of players two and three to send their packets
<<p1:p2,p3>>min=? ( R{"time"}[F "sent1" ] + R{"time"}[F "sent2" & "sent3" ] )

// What is the minimum sum of the expected cost of player 1 eventually reaching its goal
// and the expected costs for the coalition of players two and three over the first 100 steps
<<p1:p2,p3>>min=? ( R{"c1"}[ F "goal1" ] + R{"c23"}[ C<=100 ])

Different variants of equilibria are supported. The type of equilibrium can be either Nash (the default, as in above examples) or a correlated equilibrium. The optimisation criterion can be either social welfare (the default, as in above examples) or social fairness, which minimises the difference between coalitions' objectives, rather than maximising the sum. For details, see [KNPS22]. These are specified as in the following examples.

// Social welfare Nash equilibrium
<<p1:p2>>{nash,social}max=?(R{"r1"}[F s] + R{"r2"}[F s]);

// Social welfare Nash equilibrium
<<p1:p2>>{nash,fair}max=?(R{"r1"}[F s] + R{"r2"}[F s]);

// Social welfare correlated equilibrium
<<p1:p2>>{corr,social}max=?(R{"r1"}[F s] + R{"r2"}[F s]);

// Social fairness correlated equilibrium
<<p1:p2>>{corr,fair}max=?(R{"r1"}[F s] + R{"r2"}[F s]);