How to find Nash Equilibrium in a 2×2 payoff matrix

This video goes over the strategies and rules of thumb that help you work out where the Nash equilibrium occurs in a 2×2 payoff matrix. Generally, you first identify each player's dominant strategy, if one exists, and then check whether the two dominant strategies lead both players to the same final cell.

Game Theory 12: Evolutionarily Stable Strategies


As we saw in the previous chapter on evolutionary games, when everyone was playing a random strategy it was best to play a Tit for Tat strategy. When everyone was playing a Tit for Tat strategy it was best to play Generous Tit for Tat. When people were playing this, it was then best to play an unconditional cooperative strategy. Once the game was in this state it was then best to play a defecting strategy, thus creating a cycle. This clearly illustrates the dynamic nature of the success of strategies within games.

Because evolutionary games are dynamic, meaning that agents’ strategies change over time, what is best for one agent to do often depends on what others are doing.

It is legitimate for us to then ask, are there any strategies within a given game that are stable and resistant to invasion?

In studying evolutionary games, one thing that biologists and others have been particularly interested in is the idea of evolutionary stability: whether an evolutionary game leads to stable solutions, or points of stasis, among contending strategies.

Just as equilibrium is the central idea within static non-cooperative games, the central idea in dynamic games is that of evolutionarily stable strategies: those that will endure over time.

As an example, we can think about a population of seals that go out fishing every day. Hunting for fish consumes energy, so some seals may adopt a strategy of simply stealing fish from those who have done the fishing. If the whole population is fishing and an individual mutant is born that follows a defector strategy of stealing, it will do well for itself because there is plenty of fishing happening. This successful defector strategy could then reproduce, creating more defectors, at which point we might say that the defecting strategy is superior and will dominate. But of course, over time a tragedy of the commons situation will emerge as not enough seals go out fishing. Stealing fish will become a less viable strategy, to the point where the stealers die out, and those who go fishing may do well again.

Thus the defector strategy is unstable, and likewise, the fishing strategy may also be unstable. What may be stable in this evolutionary game is some combination of both.
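To make the idea of a stable mix concrete, here is a minimal Python sketch using discrete replicator dynamics, a standard model in which a strategy that earns more than the population average grows in share. The payoff numbers and the function name `stealer_share_over_time` are assumptions chosen purely for illustration; nothing in the seal story fixes these values.

```python
def stealer_share_over_time(initial_share, generations=200):
    """Track the fraction of stealers under discrete replicator dynamics."""
    # Hypothetical payoffs: payoff[my_strategy][opponent_strategy].
    # Stealing pays well against fishers and badly against other stealers.
    payoff = {
        "fish": {"fish": 2.0, "steal": 1.0},
        "steal": {"fish": 3.0, "steal": 0.0},
    }
    x = initial_share  # fraction of the population that steals
    history = [x]
    for _ in range(generations):
        # Expected payoff of each strategy against the current population mix.
        fit_fish = (1 - x) * payoff["fish"]["fish"] + x * payoff["fish"]["steal"]
        fit_steal = (1 - x) * payoff["steal"]["fish"] + x * payoff["steal"]["steal"]
        mean_fit = (1 - x) * fit_fish + x * fit_steal
        # Strategies with above-average payoff grow; the others shrink.
        x = x * fit_steal / mean_fit
        history.append(x)
    return history


# Start with stealing rare, then with stealing nearly universal.
print(round(stealer_share_over_time(0.01)[-1], 3))
print(round(stealer_share_over_time(0.99)[-1], 3))
```

With these assumed payoffs, both runs settle at the same mixed population, half fishing and half stealing, whether stealing starts out rare or common. Neither pure strategy survives on its own, but the mix does.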


The Evolutionarily Stable Strategy is very similar to the Nash Equilibrium of classical Game Theory, with a number of additions.

Nash Equilibrium is a game equilibrium where it is not rational for any player to deviate from their present strategy.

An evolutionarily stable strategy here is a state of game dynamics where, in a very large population of competitors, another mutant strategy cannot successfully enter the population to disturb the existing dynamic.

Indeed, in the modern view, equilibrium should be thought of as the limiting outcome of an unspecified learning, or evolutionary process, that unfolds over time. In this view, equilibrium is the end of the story of how strategic thinking, competition, optimization, and learning work, not the beginning or middle of a one-shot game.

Therefore, a successful stable strategy must have at least two characteristics.

  1. First, it must be effective against competitors when it is rare, so that it can enter the existing population and grow.
  2. Second, it must also be successful later, when it has grown to a high proportion of the population, so that it can defend itself.

This, in turn, means that the strategy must be successful when it contends with others exactly like itself. A stable strategy in an evolutionary game does not have to be unbeatable; it only has to be uninvadable and thus stable over time.

A stable strategy is a strategy that, when everyone is doing it, no new mutant could arise which would do better, and thus we can expect a degree of stability.
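One common way to formalize "no mutant can do better" is the pair of conditions due to Maynard Smith. A resident strategy S resists a mutant T if S earns strictly more against S than T does, or if they tie against S and S earns strictly more against T than T earns against itself. The sketch below applies that test to pure strategies only. The function name `is_pure_ess` and both payoff matrices are assumptions for illustration, and a full analysis would also have to consider mixed mutants.

```python
def is_pure_ess(payoff, s):
    """Check Maynard Smith's ESS conditions for pure strategy index s.

    payoff[i][j] is the payoff to an individual playing i against one playing j.
    Only pure-strategy mutants are tested in this sketch.
    """
    for t in range(len(payoff)):
        if t == s:
            continue
        if payoff[s][s] > payoff[t][s]:
            continue  # mutant t does strictly worse against residents
        if payoff[s][s] == payoff[t][s] and payoff[s][t] > payoff[t][t]:
            continue  # tie against residents, but s beats t among mutants
        return False  # mutant t can invade
    return True


# Hypothetical prisoner's dilemma payoffs: index 0 = cooperate, 1 = defect.
dilemma = [[3, 0],
           [5, 1]]
print(is_pure_ess(dilemma, 0))  # False: defectors invade cooperators
print(is_pure_ess(dilemma, 1))  # True: cooperators cannot invade defectors

# Hypothetical fishing/stealing payoffs: index 0 = fish, 1 = steal.
seals = [[2, 1],
         [3, 0]]
print(is_pure_ess(seals, 0), is_pure_ess(seals, 1))  # False False
```

The second matrix mirrors the seal story: neither pure strategy passes the test, which is consistent with the earlier remark that only some combination of both may be stable.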


Of course, stable strategies do not always emerge within evolutionary games. One of the simplest examples of this is the game Rock, Paper, Scissors.

The best strategy is to play a mixed random game, where one plays each of the three strategies one-third of the time.

However, in biology many creatures are incapable of mixed behavior: they exhibit only one pure strategy. If the game is played only with the pure Rock, Paper and Scissors strategies, the evolutionary game is dynamically unstable. Rock mutants can take over an all-Scissors population, but then Paper mutants can take over an all-Rock population, and then Scissors mutants can take over an all-Paper population, and so on.
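The takeover cycle can be traced with a short Python sketch. The scoring convention (win = 1, tie = 0, loss = -1) and the function names are assumptions for illustration. The code asks, for a population made up of a single pure strategy, which pure mutant earns strictly more against the residents than the residents earn against each other.

```python
STRATEGIES = ["Rock", "Paper", "Scissors"]

# PAYOFF[i][j]: payoff to strategy i against strategy j
# (assumed scoring: win = 1, tie = 0, loss = -1).
PAYOFF = [
    [0, -1, 1],   # Rock
    [1, 0, -1],   # Paper
    [-1, 1, 0],   # Scissors
]


def successful_invader(resident):
    """Return the pure mutant that beats an all-resident population, or None."""
    r = STRATEGIES.index(resident)
    best = max(range(len(STRATEGIES)), key=lambda m: PAYOFF[m][r])
    # The mutant invades only if it strictly outscores residents playing each other.
    return STRATEGIES[best] if PAYOFF[best][r] > PAYOFF[r][r] else None


def takeover_sequence(start, steps):
    """Follow successive takeovers, starting from an all-`start` population."""
    sequence = [start]
    for _ in range(steps):
        invader = successful_invader(sequence[-1])
        if invader is None:
            break
        sequence.append(invader)
    return sequence


print(takeover_sequence("Scissors", 6))
# ['Scissors', 'Rock', 'Paper', 'Scissors', 'Rock', 'Paper', 'Scissors']

# Against a uniform one-third mix, every pure strategy earns the same (zero),
# so no pure mutant can do better than the mix itself.
uniform_mix_payoffs = [sum(row) / 3 for row in PAYOFF]
print(uniform_mix_payoffs)  # [0.0, 0.0, 0.0]
```

Every pure population has a successful invader, so the sequence never stops, whereas the one-third mix leaves no pure strategy with an advantage.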

Using experimental economic methods, scientists have used the Rock, Paper, Scissors game to test the evolutionary dynamics of human social behavior in the laboratory. The cyclic social behaviors predicted by evolutionary game theory have been observed in various lab experiments.

Likewise, it has been recorded within ecosystems, most notably in a particular type of lizard that can have three different forms, creating three different strategies: one aggressive, another unaggressive and the third somewhat prudent. The overall situation corresponds to the Rock, Paper, Scissors game, creating a six-year population cycle as new mutants enter and become dominant before another strategy invades, and so on.

Game Theory 4: Non-Cooperative Games


In studying the dynamics of cooperation and competition between actors, understanding the structure of the game that is being played is central to understanding the system of interest.

In game theory, a primary distinction is made between those game structures that are cooperative and those that are non-cooperative.

As we will see, the fundamental dynamics surrounding the whole game are altered as we go from games whose structure is innately competitive to those games where cooperation is the default position.

A cooperative game is one wherein the agents are able to resort to some institution or third party in order to enable cooperation and optimal results for all.

A game is non-cooperative if players cannot form the structures required to enable cooperation.

For example, we might think about two people wishing to make a commercial transaction online. Given two anonymous people interacting without some institution to enable cooperation, there is no reason for either to think that the other will carry through with the transaction as promised.

The seller is incentivised to take the money and not send the item while the buyer is likewise incentivised to take the product without sending the money. In the absence of some cooperative structure that would enable each party to trust the other and thus cooperate, the game would naturally gravitate towards defection and the potentially valuable transaction would not take place.

Thus we can see how, in the absence of cooperative mechanisms, each player may follow the course that renders them the best payoff without regard for what the other does or what is optimal for the overall system, and this can result in suboptimal outcomes for all.

In non-cooperative games, each agent is assumed to act in their own self-interest, and this self-interested agent is the primary unit of analysis because there is no cooperative structure.

This is in contrast to cooperative game theory that treats groups or subgroups of agents as the unit of analysis and assumes they can achieve certain outcomes among themselves through binding cooperative agreements.

Game theory has historically been very much focused on non-cooperative games and on trying to find optimal strategies within such a context. This is likely because non-cooperative games are amenable to our standard mathematical framework and thus offer nice closed-form solutions.

But it is important to note that the real world is made up of situations that are sometimes cooperative, sometimes non-cooperative, and often involve elements of both.

As previously mentioned, non-cooperative games arise due to a number of factors. Firstly, the game may be inherently zero-sum, meaning that what one wins the other loses, and thus there is an inherent dynamic of competition.

Many sports games are specifically designed to be zero-sum in their structure, so as to create a dynamic of competition. In such a case there is only one prize, and if someone else gets it, you don’t. There is no incentive for cooperation and every incentive for competition and thus the best option is for the actor to focus on maximizing their payoff irrespective of all else.

This is called a strictly competitive game. A strictly competitive game is a game in which the interests of each player are diametrically opposed.

Likewise, a game may be non-cooperative due to the incapacity to create cooperative structures. Most people, when engaged in a game, will wish not only to optimize their own payoff but also to optimize the overall outcome.

In general, people do not like the idea of waste or of unfairness and we typically search for some optimal solution given both our own interests and some consideration for the overall organization.

The real world of social interaction is full of all sorts of informal social and cultural institutions designed to enable trust, cooperation and optimal outcomes for all.

Almost as soon as two people start to interact they will start to look for commonalities and shared interests that enable them to develop trust and cooperation.

Thus, non-cooperative games are typically those where the actors cannot interact and form the trust required for cooperation. Indeed, there will be certain games that we construct where we specifically want competition, and we achieve that by not allowing the players to cooperate, such as in a competitive market.

Lastly, non-cooperative games can be a product of an incapacity to enforce binding contracts. If there is a third party involved to ensure optimal outcomes for the overall organization through sanctions and incentives, this can form a solid basis for cooperation, in the way that a government does by enforcing laws.

This is famously captured in Thomas Hobbes’ conception of the state of nature, where he pondered “What was life like before civil society?” He went on to write “during the time men live without a common power to keep them all in awe, they are in that condition which is called war, and such a war as is of every man against every man.”

In this state, every person has a natural right or liberty to do anything one thinks necessary for preserving one’s own life.

Hobbes’ ideas illustrate vividly how in the absence of a third party to enforce cooperation, competition can prevail.


Non-cooperative games create a specific dynamic within a game, where we take the individual and their payoff as the basic unit of analysis. In such a circumstance we do not need to consider what would be best for all under some form of cooperation, because cooperation is not possible within the context.

We are solely interested in how the individuals will act.

The question is how they should act to optimize their own payoff and, given the assumption that both are performing this optimization, what a stable solution to the game will be.

Given these assumptions, both players should search for a strategy that optimizes their payoff, and where the players' strategies meet we should have a stable outcome that we can predict will occur.

This stable outcome is what we call an equilibrium.

Equilibrium, in the general sense, means a state in which opposing forces are balanced, thus creating a point of stability and stasis.

When we see a ball at the bottom of a bowl, it is in a state of equilibrium, because if we put it anywhere else in the bowl the force of gravity would pull it back to this static point. The same holds for the actors in a non-cooperative game: because they are both trying to optimize their payoff, they will both naturally gravitate towards the strategy that gives them the highest payoff.

But because their payoff is dependent on what strategy the other chooses, and because they cannot depend upon cooperation between them, they have to choose the best strategy assuming that the other will work to optimize their own payoff without cooperating.

This point of equilibrium in a game is called the Nash equilibrium after the famous mathematician John Nash.

In game theory, the Nash equilibrium is a solution concept of a non-cooperative game involving two or more players in which each player is assumed to know the equilibrium strategies of the other players, and no player has anything to gain by changing only his or her own strategy. If each player has chosen a strategy and no player can benefit by changing strategies while the other players keep theirs unchanged, then the current set of strategy choices and the corresponding payoffs constitute a Nash equilibrium.

The Nash equilibrium is one of the foundational concepts in game theory.

The basic intuition of the Nash equilibrium is in predicting what others will do given their self-interest only and then choosing your optimal strategy given that assumption.

Nash equilibrium is a point where all players are doing their best given the absence of cooperation. It is a state that no one would want to leave unilaterally in the absence of some effective overall structure for coordination.


Nash equilibrium is best illustrated through the prisoner’s dilemma game.

The prisoner’s dilemma game is a classic two player game that is often used to present the concept of Nash equilibrium in a payoff matrix form.

Conceive of two prisoners detained in separate cells, interrogated simultaneously and offered deals in the form of lighter jail sentences for betraying the other criminal. They have the option to “cooperate” with the other prisoner by not telling on them, or “defect” by betraying the other.

However, if both players defect, then they will both serve a longer sentence than if neither said anything. Lower jail sentences are here interpreted as higher payoffs.

The prisoner’s dilemma has a payoff matrix similar to that of a coordination game, but the maximum reward for each player is obtained only when the players’ decisions differ. Each player improves their own situation by switching from “cooperating” to “defecting”, given the knowledge that the other player’s best decision is to “defect”. The prisoner’s dilemma thus has a single Nash equilibrium: both players choose to defect.

What has long made this an interesting case to study is the fact that this scenario is globally inferior to “both cooperating”. That is, both players would be better off if they both chose to “cooperate” instead of both choosing to defect. However, each player could improve their own situation by breaking the mutual cooperation, no matter how the other player changes their decision.
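The argument can be checked against a concrete payoff matrix. The sentence lengths below are assumptions for illustration (the text gives no specific numbers); payoffs are written as negative years in jail so that a shorter sentence is a higher payoff. The sketch confirms three claims from the text: defecting is better whatever the other player does, mutual defection is the only Nash equilibrium, and both players would nevertheless prefer mutual cooperation.

```python
C, D = 0, 1  # cooperate (stay silent), defect (betray)

# payoff[a][b] = (payoff to player 1, payoff to player 2)
# when player 1 plays a and player 2 plays b.
# Hypothetical sentences, written as negative years in jail.
payoff = [
    [(-1, -1), (-3, 0)],   # player 1 cooperates
    [(0, -3), (-2, -2)],   # player 1 defects
]


def is_nash(a, b):
    """True if neither player can gain by changing only their own choice."""
    p1_ok = all(payoff[a][b][0] >= payoff[alt][b][0] for alt in (C, D))
    p2_ok = all(payoff[a][b][1] >= payoff[a][alt][1] for alt in (C, D))
    return p1_ok and p2_ok


nash_cells = [(a, b) for a in (C, D) for b in (C, D) if is_nash(a, b)]

# Defecting is strictly better for player 1 whatever player 2 does
# (the game is symmetric, so the same holds for player 2).
defect_dominates = all(payoff[D][b][0] > payoff[C][b][0] for b in (C, D))

# Yet both players are strictly better off under mutual cooperation.
mutual_cooperation_better = all(
    payoff[C][C][i] > payoff[D][D][i] for i in (0, 1)
)

print(nash_cells)                 # [(1, 1)], i.e. (defect, defect)
print(defect_dominates)           # True
print(mutual_cooperation_better)  # True
```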


The central aim of non-cooperative game theory, then, is to predict people’s actions within a game by finding the Nash equilibria and assuming that people will play them because that is their best option.

It is then legitimate for us to ask: does equilibrium analysis give us any predictive capacity over what happens in the real world? Often the outcome of experiments is not the equilibrium predicted by the theory. This is mainly because people do not reason through the game in a fully logically consistent fashion.

Equilibrium is a point where everyone has figured out what everyone else will do; behaviorally, therefore, it often does not predict what people will do the first time they play the game.

Equilibrium is better interpreted as what will happen over a number of iterations of a non-cooperative game, as players come to better understand the game and how to reason through it.

Like a ball placed in a bowl, play takes time to arrive at an equilibrium, and this is what is seen in game experiments: they tend over time towards the equilibrium.

For example, in one game people are asked to choose a number between 0 and 100, with the winner being the person whose guess is closest to 2/3 of the average number proposed.

So everyone is being asked to guess a bit below the average number proposed.

In this game, only a small percentage choose the equilibrium point, which is zero, and because the other players did not reason all the way to the equilibrium, those who chose zero were wrong.

In many ways, then, choosing this equilibrium as a prediction of what will happen is not a good option, since observed play clearly diverges dramatically from what the theory tells us.

However, over time, as the game is iterated, the numbers chosen by people do move towards the equilibrium. Thus equilibrium tells us something about the statistical averages of the system, but not very much about how it will behave in the real world in the first iteration of the game.
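The drift towards zero can be illustrated with a deliberately simple simulation. The learning rule here is an assumption, not a description of how laboratory subjects behave: in the first round players guess at random, and in every later round each player aims at 2/3 of the average observed in the previous round. The function name and parameters are likewise hypothetical.

```python
import random


def simulate_rounds(n_players=100, n_rounds=10, seed=0):
    """Return the average guess in each round of a repeated 2/3-average game."""
    rng = random.Random(seed)
    # Round 1: naive players guess uniformly between 0 and 100.
    guesses = [rng.uniform(0, 100) for _ in range(n_players)]
    averages = []
    for _ in range(n_rounds):
        average = sum(guesses) / len(guesses)
        averages.append(average)
        # Assumed learning rule: everyone now aims at 2/3 of the
        # average they have just observed.
        guesses = [(2 / 3) * average for _ in guesses]
    return averages


for round_number, average in enumerate(simulate_rounds(), start=1):
    print(round_number, round(average, 2))
```

Under this rule the average shrinks by a factor of 2/3 each round, so it approaches the equilibrium of zero without reaching it in any finite number of rounds. That matches the qualitative picture in the text: the first round is far from equilibrium, and repeated play moves towards it.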