User guide#
Example: One-shot trust game with binary actions#
[Kre90] introduced a game commonly referred to as the trust game.
We will build a one-shot version of this game using pygambit's game transformation operations.
There are two players, a Buyer and a Seller. The Buyer moves first and has two actions, Trust or Not trust. If the Buyer chooses Not trust, then the game ends, and both players receive payoffs of 0. If the Buyer chooses Trust, then the Seller has a choice with two actions, Honor or Abuse. If the Seller chooses Honor, both players receive payoffs of 1; if the Seller chooses Abuse, the Buyer receives a payoff of -1 and the Seller receives a payoff of 2.
We create a game with an extensive representation using Game.new_tree():
In [1]: import pygambit as gbt
In [2]: g = gbt.Game.new_tree(players=["Buyer", "Seller"],
...: title="One-shot trust game, after Kreps (1990)")
...:
The tree of the game contains just a root node, with no children:
In [3]: g.root
Out[3]: Node(game=Game(title='One-shot trust game, after Kreps (1990)'), path=[])
In [4]: g.root.children
Out[4]: NodeChildren(parent=Node(game=Game(title='One-shot trust game, after Kreps (1990)'), path=[]))
To extend a game from an existing terminal node, use Game.append_move():
In [5]: g.append_move(g.root, "Buyer", ["Trust", "Not trust"])
In [6]: g.root.children
Out[6]: NodeChildren(parent=Node(game=Game(title='One-shot trust game, after Kreps (1990)'), path=[]))
We can then also add the Seller’s move in the situation after the Buyer chooses Trust:
In [7]: g.append_move(g.root.children[0], "Seller", ["Honor", "Abuse"])
Now that we have the moves of the game defined, we add payoffs. Payoffs are associated with an Outcome; each Outcome has a vector of payoffs, one for each player, and optionally an identifying text label. First we add the outcome associated with the Seller proving themselves trustworthy:
In [8]: g.set_outcome(g.root.children[0].children[0], g.add_outcome([1, 1], label="Trustworthy"))
Next, the outcome associated with the scenario where the Buyer trusts but the Seller does not return the trust:
In [9]: g.set_outcome(g.root.children[0].children[1], g.add_outcome([-1, 2], label="Untrustworthy"))
And, finally the outcome associated with the Buyer opting out of the interaction:
In [10]: g.set_outcome(g.root.children[1], g.add_outcome([0, 0], label="Opt-out"))
Nodes without an outcome attached are assumed to have payoffs of zero for all players. Therefore, adding the outcome to this latter terminal node is not strictly necessary in Gambit, but it is useful to be explicit for readability.
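As a quick check (a minimal sketch, assuming Node.outcome returns None for a node with no outcome attached), we can inspect the outcome we just set and its payoffs:
# The "Opt-out" outcome attached above; at a node with no outcome
# attached, node.outcome would be None and all payoffs default to zero.
node = g.root.children[1]
print(node.outcome)
print(node.outcome["Buyer"], node.outcome["Seller"])   # 0 0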
Kreps, D. (1990) “Corporate Culture and Economic Theory.” In J. Alt and K. Shepsle, eds., Perspectives on Positive Political Economy, Cambridge University Press.
Example: A one-card poker game with private information#
To illustrate games in extensive form, [Mye91] presents a one-card poker game. A version of this game also appears in [RUW08], as a classroom game under the name “stripped-down poker”. This is perhaps the simplest interesting game with imperfect information.
In our version of the game, there are two players, Alice and Bob. There is a deck of cards, with equal numbers of King and Queen cards. The game begins with each player putting $1 in the pot. One card is dealt at random to Alice; Alice observes her card but Bob does not. After Alice observes her card, she can choose either to Raise or to Fold. If she chooses to Fold, Bob wins the pot and the game ends. If she chooses to Raise, she adds another $1 to the pot. Bob then chooses either to Meet or Pass. If he chooses to Pass, Alice wins the pot and the game ends. If he chooses to Meet, he adds another $1 to the pot. There is then a showdown, in which Alice reveals her card. If she has a King, then she wins the pot; if she has a Queen, then Bob wins the pot.
We can build this game using the following script:
import pygambit as gbt

g = gbt.Game.new_tree(players=["Alice", "Bob"],
                      title="One card poker game, after Myerson (1991)")
g.append_move(g.root, g.players.chance, ["King", "Queen"])
for node in g.root.children:
    g.append_move(node, "Alice", ["Raise", "Fold"])
g.append_move(g.root.children[0].children[0], "Bob", ["Meet", "Pass"])
g.append_infoset(g.root.children[1].children[0],
                 g.root.children[0].children[0].infoset)
alice_winsbig = g.add_outcome([2, -2], label="Alice wins big")
alice_wins = g.add_outcome([1, -1], label="Alice wins")
bob_winsbig = g.add_outcome([-2, 2], label="Bob wins big")
bob_wins = g.add_outcome([-1, 1], label="Bob wins")
g.set_outcome(g.root.children[0].children[0].children[0], alice_winsbig)
g.set_outcome(g.root.children[0].children[0].children[1], alice_wins)
g.set_outcome(g.root.children[0].children[1], bob_wins)
g.set_outcome(g.root.children[1].children[0].children[0], bob_winsbig)
g.set_outcome(g.root.children[1].children[0].children[1], alice_wins)
g.set_outcome(g.root.children[1].children[1], bob_wins)
All extensive games have a chance (or nature) player, accessible as Game.players.chance. Moves belonging to the chance player can be added in the same way as for personal players. At any new move created for the chance player, the action probabilities default to uniform randomization over the actions at the move.
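For instance (a minimal sketch using the poker game just built), we can confirm that the deal of the card defaults to equal probabilities:
# The chance move at the root defaults to uniform probabilities.
[act.prob for act in g.root.infoset.actions]
# [Rational(1, 2), Rational(1, 2)]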
In this game, information structure is important. Alice knows her card, so the two nodes at which she has the move are part of different information sets. The loop:
for node in g.root.children:
    g.append_move(node, "Alice", ["Raise", "Fold"])
causes each of the newly-appended moves to be in new information sets. In contrast, Bob does not know Alice’s card, and therefore cannot distinguish between the two nodes at which he has the decision. This is implemented in the following lines:
g.append_move(g.root.children[0].children[0], "Bob", ["Meet", "Pass"])
g.append_infoset(g.root.children[1].children[0],
                 g.root.children[0].children[0].infoset)
The call Game.append_infoset() adds a move at a terminal node as part of an existing information set (represented in pygambit as an Infoset).
Building a strategic game#
Games in strategic form, also referred to as normal form, are represented solely by a collection of payoff tables, one per player. The most direct way to create a strategic game is via Game.from_arrays(). This function takes one n-dimensional array per player, where n is the number of players in the game. The arrays can be any object that can be indexed like an n-times-nested Python list; so, for example, numpy arrays can be used directly.
For example, to create a standard prisoner’s dilemma game in which the cooperative payoff is 8, the betrayal payoff is 10, the sucker payoff is 2, and the noncooperative payoff is 5:
In [11]: import numpy as np
In [12]: m = np.array([[8, 2], [10, 5]])
In [13]: g = gbt.Game.from_arrays(m, np.transpose(m))
In [14]: g
Out[14]: Game(title='Untitled strategic game')
The arrays passed to Game.from_arrays() are all indexed in the same sense; that is, the top-level index is the choice of the first player, the second-level index is the choice of the second player, and so on. Therefore, to create a two-player symmetric game, as in this example, the payoff matrix for the second player is transposed before passing it to Game.from_arrays().
There is a reverse function, Game.to_arrays(), which produces the players' payoff tables given a strategic game. The output is a list of numpy arrays, one per player.
In [15]: m, m_transposed = g.to_arrays()
In [16]: m
Out[16]:
array([[Rational(8, 1), Rational(2, 1)],
[Rational(10, 1), Rational(5, 1)]], dtype=object)
The optional parameter dtype controls the data type of the payoffs in the generated arrays:
In [17]: m, m_transposed = g.to_arrays(dtype=float)
In [18]: m
Out[18]:
array([[8.0, 2.0],
[10.0, 5.0]], dtype=object)
The function supports any type that can be constructed from Python's fractions.Fraction. For example, to convert the payoffs to their string representations via str:
In [19]: m, m_transposed = g.to_arrays(dtype=str)
In [20]: m
Out[20]:
array([['8', '2'],
['10', '5']], dtype=object)
Representation of numerical data of a game#
Payoffs to players and probabilities of actions at chance information sets are specified as numbers. Gambit represents the numerical values in a game in exact precision, using either decimal or rational representations.
To illustrate, we consider a trivial game which just has one move for the chance player:
In [21]: import pygambit as gbt
In [22]: g = gbt.Game.new_tree()
In [23]: g.append_move(g.root, g.players.chance, ["a", "b", "c"])
In [24]: [act.prob for act in g.root.infoset.actions]
Out[24]: [Rational(1, 3), Rational(1, 3), Rational(1, 3)]
The default when creating a new move for chance is that all actions are chosen with equal probability. These probabilities are represented as rational numbers, using pygambit's Rational class, which is derived from Python's fractions.Fraction. Numerical data can be set as rational numbers:
In [25]: g.set_chance_probs(g.root.infoset,
....: [gbt.Rational(1, 4), gbt.Rational(1, 2), gbt.Rational(1, 4)])
....:
In [26]: [act.prob for act in g.root.infoset.actions]
Out[26]: [Rational(1, 4), Rational(1, 2), Rational(1, 4)]
They can also be explicitly specified as decimal numbers:
In [27]: g.set_chance_probs(g.root.infoset,
....: [gbt.Decimal(".25"), gbt.Decimal(".50"), gbt.Decimal(".25")])
....:
In [28]: [act.prob for act in g.root.infoset.actions]
Out[28]: [Decimal('0.25'), Decimal('0.50'), Decimal('0.25')]
Although the two representations above are mathematically equivalent, pygambit remembers the format in which the values were specified.
Expressing rational or decimal numbers as above is verbose and tedious. pygambit offers a more concise way to express numerical data in games: when setting numerical game data, pygambit will attempt to convert text strings to their rational or decimal representation. The above can therefore be written more compactly using string representations:
In [29]: g.set_chance_probs(g.root.infoset, ["1/4", "1/2", "1/4"])
In [30]: [act.prob for act in g.root.infoset.actions]
Out[30]: [Rational(1, 4), Rational(1, 2), Rational(1, 4)]
In [31]: g.set_chance_probs(g.root.infoset, [".25", ".50", ".25"])
In [32]: [act.prob for act in g.root.infoset.actions]
Out[32]: [Decimal('0.25'), Decimal('0.50'), Decimal('0.25')]
As a further convenience, pygambit will accept Python int and float values. int values are always interpreted as Rational values. pygambit attempts to render float values in an appropriate Decimal equivalent. In the majority of cases, this creates no problems. For example:
In [33]: g.set_chance_probs(g.root.infoset, [.25, .50, .25])
In [34]: [act.prob for act in g.root.infoset.actions]
Out[34]: [Decimal('0.25'), Decimal('0.5'), Decimal('0.25')]
However, rounding can cause difficulties when attempting to use float values to represent values which do not have an exact decimal representation:
In [35]: g.set_chance_probs(g.root.infoset, [1/3, 1/3, 1/3])
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[35], line 1
----> 1 g.set_chance_probs(g.root.infoset, [1/3, 1/3, 1/3])
File src/pygambit/game.pxi:823, in pygambit.gambit.Game.set_chance_probs()
ValueError: set_chance_probs(): must specify non-negative probabilities that sum to one
This behavior can be slightly surprising, especially in light of the fact that in Python,
In [36]: 1/3 + 1/3 + 1/3
Out[36]: 1.0
In checking whether these probabilities sum to one, pygambit first converts each of the probabilities to a Decimal representation, via the following method:
In [37]: gbt.Decimal(str(1/3))
Out[37]: Decimal('0.3333333333333333')
and the sum-to-one check then fails because
In [38]: gbt.Decimal(str(1/3)) + gbt.Decimal(str(1/3)) + gbt.Decimal(str(1/3))
Out[38]: Decimal('0.9999999999999999')
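The remedy in such cases is to specify the probabilities as strings or Rational values, which are stored exactly; a minimal sketch:
# Fractional strings convert exactly to Rational values,
# so the sum-to-one check passes.
g.set_chance_probs(g.root.infoset, ["1/3", "1/3", "1/3"])
[act.prob for act in g.root.infoset.actions]
# [Rational(1, 3), Rational(1, 3), Rational(1, 3)]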
Setting payoffs for players also follows the same rules. Representing probabilities and payoffs exactly is essential, because pygambit offers (in particular for two-player games) the possibility of computing equilibria exactly: the Nash equilibria of any two-player game with rational payoffs and chance probabilities can themselves be expressed exactly in terms of rational numbers.
It is therefore advisable always to specify the numerical data of games either in terms of Decimal or Rational values, or their string equivalents. It is safe to use int values, but float values should be used with some care to ensure the values are recorded as intended.
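For example (a minimal sketch; the game h and its player names are hypothetical), payoffs given as strings are recorded exactly:
# Payoffs follow the same conversion rules as chance probabilities.
h = gbt.Game.new_tree(players=["Odd", "Even"])
outcome = h.add_outcome(["1/2", "-1/2"], label="Example")
outcome["Odd"]    # Rational(1, 2)
outcome["Even"]   # Rational(-1, 2)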
Reading a game from a file#
Games stored in existing Gambit savefiles can be loaded using read_efg() or read_nfg():
In [39]: g = gbt.read_nfg("e02.nfg")
In [40]: g
Out[40]: Game(title='Selten (IJGT, 75), Figure 2, normal form')
Computing Nash equilibria#
Interfaces to algorithms for computing Nash equilibria are provided in pygambit.nash
.
Each method is exposed as a Python function in pygambit.nash; those demonstrated in this guide include lcp_solve(), lp_solve(), enummixed_solve(), gnm_solve(), liap_solve(), simpdiv_solve(), logit_solve(), and enumpoly_solve().
We take as an example the one-card poker game. This is a two-player, constant sum game, and so all of the equilibrium-finding methods can be applied to it.
For two-player games, lcp_solve() can compute Nash equilibria directly using the extensive representation. Assuming that g refers to the game:
In [41]: result = gbt.nash.lcp_solve(g)
In [42]: result
Out[42]: NashComputationResult(method='lcp', rational=True, use_strategic=False, equilibria=[[[[Rational(1, 1), Rational(0, 1)], [Rational(1, 3), Rational(2, 3)]], [[Rational(2, 3), Rational(1, 3)]]]], parameters={'stop_after': 0, 'max_depth': 0})
In [43]: len(result.equilibria)
Out[43]: 1
The result of the calculation is returned as a NashComputationResult object. The set of equilibria found is reported in NashComputationResult.equilibria; in this case, this is a list of mixed behavior profiles. A mixed behavior profile specifies, for each information set, the probability distribution over actions at that information set.
Indexing a MixedBehaviorProfile by a player gives a MixedBehavior, which specifies probability distributions at each of the player's information sets:
In [44]: eqm = result.equilibria[0]
In [45]: eqm["Alice"]
Out[45]: [[Rational(1, 1), Rational(0, 1)], [Rational(1, 3), Rational(2, 3)]]
In this case, at Alice’s first information set, the one at which she has the King, she always raises.
At her second information set, where she has the Queen, she sometimes bluffs, raising with
probability one-third.
The probability distribution at an information set is represented by a MixedAction. MixedBehavior.mixed_actions() iterates over these for the player:
In [46]: for infoset, mixed_action in eqm["Alice"].mixed_actions():
....: print(infoset)
....: print(mixed_action)
....:
Infoset(player=Player(game=Game(title='One card poker game, after Myerson (1991)'), label='Alice'), number=0)
[Rational(1, 1), Rational(0, 1)]
Infoset(player=Player(game=Game(title='One card poker game, after Myerson (1991)'), label='Alice'), number=1)
[Rational(1, 3), Rational(2, 3)]
So we could extract Alice’s probabilities of raising at her respective information sets like this:
In [47]: {infoset: mixed_action["Raise"] for infoset, mixed_action in eqm["Alice"].mixed_actions()}
Out[47]:
{Infoset(player=Player(game=Game(title='One card poker game, after Myerson (1991)'), label='Alice'), number=0): Rational(1, 1),
Infoset(player=Player(game=Game(title='One card poker game, after Myerson (1991)'), label='Alice'), number=1): Rational(1, 3)}
In larger games, labels may not always be the most convenient way to refer to specific actions. We can also index profiles directly with Action objects. So an alternative way to extract the probabilities of playing "Raise" would be by iterating over Alice's list of actions:
In [48]: {action.infoset: eqm[action] for action in g.players["Alice"].actions if action.label == "Raise"}
Out[48]:
{Infoset(player=Player(game=Game(title='One card poker game, after Myerson (1991)'), label='Alice'), number=0): Rational(1, 1),
Infoset(player=Player(game=Game(title='One card poker game, after Myerson (1991)'), label='Alice'), number=1): Rational(1, 3)}
Looking at Bob’s strategy,
In [49]: eqm["Bob"]
Out[49]: [[Rational(2, 3), Rational(1, 3)]]
Bob meets Alice’s raise two-thirds of the time. The label “Raise” is used in more than one information set for Alice, so in the above we had to specify information sets when indexing. When there is no ambiguity, we can specify action labels directly. So for example, because Bob has only one action named “Meet” in the game, we can extract the probability that Bob plays “Meet” by:
In [50]: eqm["Bob"]["Meet"]
Out[50]: Rational(2, 3)
Moreover, this is the only action with that label in the game, so we can index the profile directly using the action label without any ambiguity:
In [51]: eqm["Meet"]
Out[51]: Rational(2, 3)
Because this is an equilibrium, the fact that Bob randomizes at his information set must mean he is indifferent between the two actions at that information set. MixedBehaviorProfile.action_value() returns the expected payoff of taking an action, conditional on reaching that action's information set:
In [52]: {action: eqm.action_value(action) for action in g.players["Bob"].infosets[0].actions}
Out[52]:
{Action(infoset=Infoset(player=Player(game=Game(title='One card poker game, after Myerson (1991)'), label='Bob'), number=0), label='Meet'): Rational(-1, 1),
Action(infoset=Infoset(player=Player(game=Game(title='One card poker game, after Myerson (1991)'), label='Bob'), number=0), label='Pass'): Rational(-1, 1)}
Bob's indifference between his actions arises because of his beliefs given Alice's strategy. MixedBehaviorProfile.belief() returns the probability of reaching a node, conditional on its information set being reached:
In [53]: {node: eqm.belief(node) for node in g.players["Bob"].infosets[0].members}
Out[53]:
{Node(game=Game(title='One card poker game, after Myerson (1991)'), path=[0, 0]): Rational(3, 4),
Node(game=Game(title='One card poker game, after Myerson (1991)'), path=[0, 1]): Rational(1, 4)}
Bob believes that, conditional on Alice raising, there is a 75% chance that she has the King; therefore, the expected payoff to meeting is (3/4)·(-2) + (1/4)·(+2) = -1, as computed.
MixedBehaviorProfile.infoset_prob() returns the probability that an information set is reached:
In [54]: eqm.infoset_prob(g.players["Bob"].infosets[0])
Out[54]: Rational(2, 3)
The corresponding probability that a node is reached in the play of the game is given by MixedBehaviorProfile.realiz_prob(), and the expected payoff to a player conditional on reaching a node is given by MixedBehaviorProfile.node_value():
In [55]: {node: eqm.node_value("Bob", node) for node in g.players["Bob"].infosets[0].members}
Out[55]:
{Node(game=Game(title='One card poker game, after Myerson (1991)'), path=[0, 0]): Rational(-5, 3),
Node(game=Game(title='One card poker game, after Myerson (1991)'), path=[0, 1]): Rational(1, 1)}
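We could likewise inspect the unconditional probabilities with which Bob's nodes are reached; a minimal sketch (in this equilibrium these work out to 1/2 for the node where Alice holds the King, and 1/6 for the node where she holds the Queen and bluffs):
# Realization probability: the chance of the deal times Alice's raise probability.
{node: eqm.realiz_prob(node) for node in g.players["Bob"].infosets[0].members}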
The overall expected payoff to a player given the behavior profile is returned by MixedBehaviorProfile.payoff():
In [56]: eqm.payoff("Alice")
Out[56]: Rational(1, 3)
In [57]: eqm.payoff("Bob")
Out[57]: Rational(-1, 3)
The equilibrium computed expresses probabilities in rational numbers. Because the numerical data of games in Gambit are represented exactly, methods which are specialized to two-player games, lp_solve(), lcp_solve(), and enummixed_solve(), can report exact probabilities for equilibrium strategy profiles. This is enabled by default for these methods.
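For instance, because the poker game is two-player and constant-sum, lp_solve() could also be applied to it; a minimal sketch:
# lp_solve() handles two-player constant-sum games and, by default,
# reports equilibrium probabilities exactly as Rational values.
result = gbt.nash.lp_solve(g)
result.equilibria[0]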
When a game has an extensive representation, equilibrium finding methods default to computing on that representation. It is also possible to compute using the strategic representation. pygambit transparently computes the reduced strategic form representation of an extensive game:
In [58]: [s.label for s in g.players["Alice"].strategies]
Out[58]: ['11', '12', '21', '22']
In the strategic form of this game, Alice has four strategies. The generated strategy labels list the action numbers taken at each information set. We can therefore apply a method which operates on a strategic game to any game with an extensive representation:
In [59]: result = gbt.nash.gnm_solve(g)
In [60]: result
Out[60]: NashComputationResult(method='gnm', rational=False, use_strategic=True, equilibria=[[[0.33333333333866677, 0.6666666666613335, 0.0, 0.0], [0.6666666666559997, 0.3333333333440004]]], parameters={'perturbation': [[1.0, 0.0, 0.0, 0.0], [1.0, 0.0]], 'end_lambda': -10.0, 'steps': 100, 'local_newton_interval': 3, 'local_newton_maxits': 10})
gnm_solve() can be applied to any game with any number of players, and uses a path-following process in floating-point arithmetic, so it returns profiles with probabilities expressed as floating-point numbers. This method operates on the strategic representation of the game, so the returned results are of type MixedStrategyProfile, and specify, for each player, a probability distribution over that player's strategies. Indexing a MixedStrategyProfile by a player gives the probability distribution over that player's strategies only.
In [61]: eqm = result.equilibria[0]
In [62]: eqm["Alice"]
Out[62]: [0.33333333333866677, 0.6666666666613335, 0.0, 0.0]
In [63]: eqm["Bob"]
Out[63]: [0.6666666666559997, 0.3333333333440004]
The expected payoff to a strategy is provided by MixedStrategyProfile.strategy_value():
In [64]: {strategy: eqm.strategy_value(strategy) for strategy in g.players["Alice"].strategies}
Out[64]:
{Strategy(player=Player(game=Game(title='One card poker game, after Myerson (1991)'), label='Alice'), label='11'): 0.33333333334400045,
Strategy(player=Player(game=Game(title='One card poker game, after Myerson (1991)'), label='Alice'), label='12'): 0.33333333332799997,
Strategy(player=Player(game=Game(title='One card poker game, after Myerson (1991)'), label='Alice'), label='21'): -0.9999999999839995,
Strategy(player=Player(game=Game(title='One card poker game, after Myerson (1991)'), label='Alice'), label='22'): -1.0}
In [65]: {strategy: eqm.strategy_value(strategy) for strategy in g.players["Bob"].strategies}
Out[65]:
{Strategy(player=Player(game=Game(title='One card poker game, after Myerson (1991)'), label='Bob'), label='1'): -0.33333333333066656,
Strategy(player=Player(game=Game(title='One card poker game, after Myerson (1991)'), label='Bob'), label='2'): -0.3333333333386667}
The overall expected payoff to a player is returned by MixedStrategyProfile.payoff():
In [66]: eqm.payoff("Alice")
Out[66]: 0.33333333333333354
In [67]: eqm.payoff("Bob")
Out[67]: -0.33333333333333354
When a game has an extensive representation, we can convert freely between MixedStrategyProfile and the corresponding MixedBehaviorProfile representation of the same strategies, using MixedStrategyProfile.as_behavior() and MixedBehaviorProfile.as_strategy():
In [68]: eqm.as_behavior()
Out[68]: [[[1.0, 0.0], [0.3333333333386667, 0.6666666666613333]], [[0.6666666666559997, 0.3333333333440004]]]
In [69]: eqm.as_behavior().as_strategy()
Out[69]: [[0.3333333333386667, 0.6666666666613333, 0.0, 0.0], [0.6666666666559997, 0.3333333333440004]]
Acceptance criteria for Nash equilibria#
Some methods for computing Nash equilibria operate using floating-point arithmetic and/or generate candidate equilibrium profiles using methods which involve some form of successive approximations. The outputs of these methods therefore are in general \(\varepsilon\)-equilibria, for some positive \(\varepsilon\).
To provide a uniform interface across methods, where relevant Gambit provides a parameter maxregret, which specifies the acceptance criterion for labeling the output of the algorithm as an equilibrium. This parameter is interpreted proportionally to the range of payoffs in the game. Any profile returned as an equilibrium is guaranteed to be an \(\varepsilon\)-equilibrium, for \(\varepsilon\) no more than maxregret times the difference of the game’s maximum and minimum payoffs.
As an example, consider solving the standard one-card poker game using logit_solve(). The range of the payoffs in this game is 4 (from +2 to -2).
In [70]: g = gbt.read_efg("poker.efg")
In [71]: g.max_payoff, g.min_payoff
Out[71]: (Rational(2, 1), Rational(-2, 1))
logit_solve() is a globally-convergent method, in that it computes a sequence of profiles which is guaranteed to have a subsequence that converges to a Nash equilibrium. The default value of maxregret for this method is set at \(10^{-8}\):
In [72]: result = gbt.nash.logit_solve(g, maxregret=1e-8)
In [73]: result.equilibria
Out[73]: [[[[1.0, 0.0], [0.33333338649882943, 0.6666666135011707]], [[0.6666667065407631, 0.3333332934592369]]]]
In [74]: result.equilibria[0].max_regret()
Out[74]: 3.987411578698641e-08
The value of MixedBehaviorProfile.max_regret() of the computed profile exceeds \(10^{-8}\) measured in payoffs of the game. However, when considered relative to the scale of the game's payoffs, we see it is less than \(10^{-8}\) of the payoff range, as requested:
In [75]: result.equilibria[0].max_regret() / (g.max_payoff - g.min_payoff)
Out[75]: 9.968528946746602e-09
In general, for globally-convergent methods especially, there is a tradeoff between precision and running time. Some methods may be slow to converge on some games, and it may be useful instead to get a more coarse approximation to an equilibrium. We could instead ask only for an \(\varepsilon\)-equilibrium with a (scaled) \(\varepsilon\) of no more than \(10^{-4}\):
In [76]: result = gbt.nash.logit_solve(g, maxregret=1e-4)
In [77]: result.equilibria[0]
Out[77]: [[[1.0, 0.0], [0.3338351656285656, 0.666164834417892]], [[0.6670407651644306, 0.3329592348608147]]]
In [78]: result.equilibria[0].max_regret()
Out[78]: 0.00037581039824041707
In [79]: result.equilibria[0].max_regret() / (g.max_payoff - g.min_payoff)
Out[79]: 9.395259956010427e-05
The convention of expressing maxregret scaled by the game's payoffs standardizes the behavior of methods across games. For example, consider solving the poker game instead using liap_solve():
In [80]: result = gbt.nash.liap_solve(g.mixed_behavior_profile(), maxregret=1.0e-4)
In [81]: result.equilibria[0]
Out[81]: [[[0.9999979852809777, 2.0147190221904606e-06], [0.3333567614695495, 0.6666432385304505]], [[0.6668870457043463, 0.33311295429565374]]]
In [82]: result.equilibria[0].max_regret()
Out[82]: 0.00022039452688993322
In [83]: result.equilibria[0].max_regret() / (g.max_payoff - g.min_payoff)
Out[83]: 5.5098631722483304e-05
If, instead, we double all payoffs, the output of the method is unchanged.
In [84]: for outcome in g.outcomes:
....: outcome["Alice"] = outcome["Alice"] * 2
....: outcome["Bob"] = outcome["Bob"] * 2
....:
In [85]: result = gbt.nash.liap_solve(g.mixed_behavior_profile(), maxregret=1.0e-4)
In [86]: result.equilibria[0]
Out[86]: [[[0.9999979852809777, 2.0147190221904606e-06], [0.3333567614695495, 0.6666432385304505]], [[0.6668870457043463, 0.33311295429565374]]]
In [87]: result.equilibria[0].max_regret()
Out[87]: 0.00044078905377986644
In [88]: result.equilibria[0].max_regret() / (g.max_payoff - g.min_payoff)
Out[88]: 5.5098631722483304e-05
Generating starting points for algorithms#
Some methods for computation of Nash equilibria take as an initial condition a
MixedStrategyProfile
or MixedBehaviorProfile
which is used
as a starting point. The equilibria found will depend on which starting point is
selected. To facilitate generating starting points, Game
provides
methods Game.random_strategy_profile()
and Game.random_behavior_profile()
,
to generate profiles which are drawn from the uniform distribution on the product
of simplices.
As an example, we consider a three-player game from McKelvey and McLennan (1997), in which each player has two strategies. This game has nine equilibria in total, and in particular has two totally mixed Nash equilibria, which is the maximum possible number of regular totally mixed equilibria in games of this size.
We first consider finding Nash equilibria in this game using liap_solve(). If we run this method starting from the centroid (uniform randomization across all strategies for each player), liap_solve() finds one of the totally-mixed equilibria.
In [89]: g = gbt.read_nfg("2x2x2.nfg")
In [90]: gbt.nash.liap_solve(g.mixed_strategy_profile())
Out[90]: NashComputationResult(method='liap', rational=False, use_strategic=True, equilibria=[[[0.4000000330601743, 0.5999999669398257], [0.499999791913678, 0.5000002080863221], [0.3333334438355959, 0.6666665561644041]]], parameters={'start': [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]], 'maxregret': 0.0001, 'maxiter': 1000})
Which equilibrium is found depends on the starting point. With a different starting point, we can find, for example, one of the pure-strategy equilibria.
In [91]: gbt.nash.liap_solve(g.mixed_strategy_profile([[.9, .1], [.9, .1], [.9, .1]]))
Out[91]: NashComputationResult(method='liap', rational=False, use_strategic=True, equilibria=[[[0.9999999952144591, 4.785540892853837e-09], [0.9999999998451463, 1.5485358776552024e-10], [0.9999999988047714, 1.1952285809728638e-09]]], parameters={'start': [[0.9, 0.1], [0.9, 0.1], [0.9, 0.1]], 'maxregret': 0.0001, 'maxiter': 1000})
To search for more equilibria, we can instead generate strategy profiles at random.
In [92]: gbt.nash.liap_solve(g.random_strategy_profile())
Out[92]: NashComputationResult(method='liap', rational=False, use_strategic=True, equilibria=[[[0.5000003054481229, 0.49999969455187704], [0.39999980076309893, 0.6000001992369011], [0.24999993730851963, 0.7500000626914802]]], parameters={'start': [[0.7368607614149606, 0.26313923858503935], [0.10205385843799224, 0.8979461415620078], [0.5545300809522465, 0.4454699190477535]], 'maxregret': 0.0001, 'maxiter': 1000})
Note that methods which take starting points do record the starting points used in the result object returned. However, the random profiles which are generated will differ in different runs of a program. To support making the generation of random strategy profiles reproducible, and for finer-grained control of the generation of these profiles if desired, Game.random_strategy_profile() and Game.random_behavior_profile() optionally take a numpy.random.Generator object, which is used as the source of randomness for creating the profile.
In [93]: import numpy as np
In [94]: gen = np.random.default_rng(seed=1234567890)
In [95]: p1 = g.random_strategy_profile(gen=gen)
In [96]: p1
Out[96]: [[0.8359110871883912, 0.16408891281160876], [0.7422931336795922, 0.25770686632040796], [0.977833473596193, 0.02216652640380706]]
In [97]: gen = np.random.default_rng(seed=1234567890)
In [98]: p2 = g.random_strategy_profile(gen=gen)
In [99]: p2
Out[99]: [[0.8359110871883912, 0.16408891281160876], [0.7422931336795922, 0.25770686632040796], [0.977833473596193, 0.02216652640380706]]
In [100]: p1 == p2
Out[100]: True
When creating profiles in which probabilities are represented as floating-point numbers, Game.random_strategy_profile() and Game.random_behavior_profile() internally use the Dirichlet distribution for each simplex to generate correctly uniform sampling over probabilities. However, in some applications generation of random profiles with probabilities as rational numbers is desired. For example, simpdiv_solve() takes such a starting point, because it operates by successively refining a triangulation over the space of mixed strategy profiles. Game.random_strategy_profile() and Game.random_behavior_profile() both take an optional parameter denom which, if specified, generates a profile in which probabilities are drawn uniformly from the grid in each simplex in which all probabilities have denominator denom.
In [101]: gen = np.random.default_rng(seed=1234567890)
In [102]: g.random_strategy_profile(denom=10, gen=gen)
Out[102]: [[Rational(1, 2), Rational(1, 2)], [Rational(7, 10), Rational(3, 10)], [Rational(0, 1), Rational(1, 1)]]
In [103]: g.random_strategy_profile(denom=10, gen=gen)
Out[103]: [[Rational(1, 10), Rational(9, 10)], [Rational(3, 5), Rational(2, 5)], [Rational(3, 5), Rational(2, 5)]]
These can then be used in conjunction with simpdiv_solve() to search for equilibria from different starting points.
In [104]: gbt.nash.simpdiv_solve(g.random_strategy_profile(denom=10, gen=gen))
Out[104]: NashComputationResult(method='simpdiv', rational=True, use_strategic=True, equilibria=[[[Rational(1, 1), Rational(0, 1)], [Rational(1, 1), Rational(0, 1)], [Rational(1, 1), Rational(0, 1)]]], parameters={'start': [[Rational(7, 10), Rational(3, 10)], [Rational(4, 5), Rational(1, 5)], [Rational(0, 1), Rational(1, 1)]], 'maxregret': Rational(1, 10000000), 'refine': 2, 'leash': None})
In [105]: gbt.nash.simpdiv_solve(g.random_strategy_profile(denom=10, gen=gen))
Out[105]: NashComputationResult(method='simpdiv', rational=True, use_strategic=True, equilibria=[[[Rational(1, 1), Rational(0, 1)], [Rational(0, 1), Rational(1, 1)], [Rational(0, 1), Rational(1, 1)]]], parameters={'start': [[Rational(4, 5), Rational(1, 5)], [Rational(1, 5), Rational(4, 5)], [Rational(0, 1), Rational(1, 1)]], 'maxregret': Rational(1, 10000000), 'refine': 2, 'leash': None})
In [106]: gbt.nash.simpdiv_solve(g.random_strategy_profile(denom=10, gen=gen))
Out[106]: NashComputationResult(method='simpdiv', rational=True, use_strategic=True, equilibria=[[[Rational(1, 1), Rational(0, 1)], [Rational(1, 1), Rational(0, 1)], [Rational(1, 1), Rational(0, 1)]]], parameters={'start': [[Rational(1, 2), Rational(1, 2)], [Rational(1, 1), Rational(0, 1)], [Rational(1, 2), Rational(1, 2)]], 'maxregret': Rational(1, 10000000), 'refine': 2, 'leash': None})
Quantal response equilibrium#
Gambit implements the idea of [McKPal95] and [McKPal98] to compute Nash equilibria via path-following along a branch of the logit quantal response equilibrium (LQRE) correspondence, using the function logit_solve(). As an example, we will consider an asymmetric matching pennies game from [Och95], as analyzed in [McKPal95].
In [107]: g = gbt.Game.from_arrays(
.....: [[1.1141, 0], [0, 0.2785]],
.....: [[0, 1.1141], [1.1141, 0]],
.....: title="Ochs (1995) asymmetric matching pennies as transformed in McKelvey-Palfrey (1995)"
.....: )
.....:
In [108]: gbt.nash.logit_solve(g)
Out[108]: NashComputationResult(method='logit', rational=False, use_strategic=True, equilibria=[[[0.5000000234106035, 0.49999997658939654], [0.19998563837426647, 0.8000143616257336]]], parameters={'first_step': 0.03, 'max_accel': 1.1})
logit_solve() returns only the limiting (approximate) Nash equilibrium found. Profiles along the QRE correspondence are frequently of interest in their own right. Gambit offers several functions for more detailed examination of branches of the QRE correspondence.
The function logit_solve_branch() uses the same procedure as logit_solve(), but returns a list of LQRE profiles computed along the branch instead of just the limiting approximate Nash equilibrium.
In [109]: qres = gbt.qre.logit_solve_branch(g)
In [110]: len(qres)
Out[110]: 193
In [111]: qres[0]
Out[111]: LogitQREMixedStrategyProfile(lam=0.000000,profile=[[0.5, 0.5], [0.5, 0.5]])
In [112]: qres[5]
Out[112]: LogitQREMixedStrategyProfile(lam=0.175632,profile=[[0.5182276540742868, 0.4817723459257562], [0.49821668880066783, 0.5017833111993909]])
logit_solve_branch() uses an adaptive step size heuristic to find points on the branch. The parameters first_step and max_accel are used to adjust the initial step size and the maximum rate at which the step size changes adaptively. The step size used is computed as the distance traveled along the path and, importantly, not the distance as measured by changes in the precision parameter lambda. As a result, the lambda values for which profiles are computed cannot be controlled in advance.
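For example (a sketch; the exact number of profiles returned depends on the game), smaller values of first_step and max_accel trade running time for more closely spaced points along the branch:
# A finer trace of the branch than with the defaults (0.03 and 1.1).
qres_fine = gbt.qre.logit_solve_branch(g, first_step=0.01, max_accel=1.05)
len(qres_fine)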
In some situations, the LQRE profiles at specified values of lambda are of interest. For this, Gambit provides logit_solve_lambda(). This function provides accurate values of strategy profiles at one or more specified values of lambda.
In [113]: qres = gbt.qre.logit_solve_lambda(g, lam=[1, 2, 3])
In [114]: qres[0]
Out[114]: LogitQREMixedStrategyProfile(lam=1.000000,profile=[[0.5867840364385154, 0.4132159635614846], [0.45180703169971026, 0.5481929683002897]])
In [115]: qres[1]
Out[115]: LogitQREMixedStrategyProfile(lam=2.000000,profile=[[0.6175219458400859, 0.3824780541599141], [0.3719816648492249, 0.6280183351507752]])
In [116]: qres[2]
Out[116]: LogitQREMixedStrategyProfile(lam=3.000000,profile=[[0.6168968501329284, 0.3831031498670716], [0.31401636202001226, 0.6859836379799878]])
LQRE are frequently taken to data by using maximum likelihood estimation to find the LQRE profile that best fits an observed profile of play. This is provided by the function logit_estimate(). We replicate the analysis of a block of the data from [Och95] for which [McKPal95] estimated an LQRE.
In [117]: data = g.mixed_strategy_profile([[128*0.527, 128*(1-0.527)], [128*0.366, 128*(1-0.366)]])
In [118]: fit = gbt.qre.logit_estimate(data)
The returned LogitQREMixedStrategyFitResult object contains the results of the estimation. The results replicate those reported in [McKPal95], including the estimated value of lambda, the QRE profile probabilities, and the log-likelihood. Because data contains the empirical counts of play, and not just frequencies, the resulting log-likelihood is correct for use in likelihood-ratio tests. [1]
In [119]: print(fit.lam)
1.8456097536855864
In [120]: print(fit.profile)
[[0.615651314427859, 0.3843486855721409], [0.3832909400456291, 0.616709059954371]]
In [121]: print(fit.log_like)
-174.76453191087444
All of the functions above also support working with the agent LQRE of [McKPal98]. Agent QRE are computed as the default behavior whenever the game has an extensive (tree) representation. For logit_solve(), logit_solve_branch(), and logit_solve_lambda(), this can be overridden by passing use_strategic=True; this will compute LQRE using the reduced strategy set of the game instead. Likewise, logit_estimate() will perform estimation using agent LQRE if the data passed are a MixedBehaviorProfile, and will return a LogitQREMixedBehaviorFitResult object.
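For instance (a minimal sketch; g_poker is assumed to be the one-card poker game loaded from its savefile):
g_poker = gbt.read_efg("poker.efg")
# Default for a game with a tree representation: agent LQRE.
gbt.nash.logit_solve(g_poker)
# Compute LQRE using the reduced strategic representation instead.
gbt.nash.logit_solve(g_poker, use_strategic=True)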
Using external programs to compute Nash equilibria#
Because the problem of finding Nash equilibria can be expressed in various mathematical formulations (see [McKMcL96]), it is helpful to make use of other software packages designed specifically for solving those problems.
There are currently two integrations offered for using external programs to solve for equilibria:
- enummixed_solve() supports enumeration of equilibria in two-player games via lrslib. [2]
- enumpoly_solve() supports computation of totally-mixed equilibria on supports in strategic games via PHCpack. [3]
For both calls, using the external program requires passing the path to the executable (via the lrsnash_path and phcpack_path arguments, respectively).
The user must download and compile or install these programs on their own; these are not packaged with Gambit. The solver calls do take care of producing the required input files, and reading the output to convert into Gambit objects for further processing.
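For example (a sketch; g is assumed to be a two-player game, and the executable path shown is hypothetical, depending on where lrsnash is installed on your system):
# Pass the location of the lrsnash executable explicitly.
result = gbt.nash.enummixed_solve(g, lrsnash_path="/usr/local/bin/lrsnash")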
McKelvey, Richard D. and McLennan, Andrew M. (1996) Computation of equilibria in finite games. In Handbook of Computational Economics, Volume 1, pages 87-142.