The replicator equation is the first and most important game dynamic studied in evolutionary game theory. Over the past 40 years, it and other deterministic game dynamics have become essential tools for applying evolutionary game theory to behavioral models in the biological and social sciences.
These models describe how the proportion of agents using a given strategy grows or shrinks. As we will illustrate, the per-capita growth rate of a strategy is equal to the difference between the average payoff of that strategy and the average payoff of the population as a whole.
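In symbols, this relationship is commonly written as shown here. The notation is an assumption added for illustration and does not come from the original text: x_i is the share of the population using strategy i, f_i(x) is the average payoff of strategy i given the current mix x, and \bar{f}(x) is the average payoff across the whole population.

```latex
\dot{x}_i = x_i \left( f_i(x) - \bar{f}(x) \right),
\qquad
\bar{f}(x) = \sum_j x_j \, f_j(x)
```

Dividing both sides by x_i gives the per-capita growth rate of strategy i, which is the payoff difference described in words above.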
There are three primary elements to a replicator model:
First, a set of agent types, each of which represents a particular strategy.
Second, a payoff associated with each type, which measures how well that strategy is doing.
Third, a parameter for how many agents of each type there are in the overall population: each type makes up a certain percentage of the whole.
Now, in deciding what they might do, people may take one of two approaches.
They may simply copy what other people are doing. In that case, the likelihood of an agent adopting any given strategy is proportional to that strategy’s existing share of the population. So if lots of people are following some strategy, the agent is more likely to adopt it than some other strategy that few are following.
Alternatively, the agent might be more discerning and look to see which of other people’s strategies is doing well and adopt the one that is most successful, having the highest payoff.
The replicator dynamic model tries to balance these two potential approaches and, hopefully, gives us a more realistic model than one in which agents follow only one of them.
Given these rules, the replicator model is one way of trying to capture the dynamic of this evolutionary game, to see which strategies become more prevalent over time or how the percentage mix of strategies changes.
In a rational model, people simply adopt the strategy that they see doing best among those present. But equally, people may just copy what others are doing. If 10% are using strategy 1, 50% strategy 2, and 40% strategy 3, then a copying agent is most likely to adopt strategy 2 because of its prevalence.
So the weight that captures how likely an agent is to adopt a certain strategy in the next round of the game is the product of two things: the proportion of the population already using that strategy and its payoff.
To think about this more intuitively, imagine a bag of balls in which each ball represents a strategy that will be played in the game. If a strategy has a better payoff, its balls are bigger and you are more likely to pick one of them.
Equally, if there are more agents using that strategy in the population, there will be more balls in the bag representing that strategy, meaning again you will be more likely to choose it. The replicator model is simply computing which balls will get selected and thus what strategies will become more prevalent.
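The weighting rule described above (share times payoff, then renormalized so the shares sum to one) can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the function name `replicator_step` and the payoff values are hypothetical, and the sketch assumes all payoffs are positive so that the weights behave like ball sizes.

```python
def replicator_step(shares, payoffs):
    """Weight each strategy by share * payoff, then renormalize to sum to 1."""
    weights = [share * payoff for share, payoff in zip(shares, payoffs)]
    total = sum(weights)
    return [weight / total for weight in weights]


shares = [0.10, 0.50, 0.40]   # population mix from the example in the text
payoffs = [5.0, 2.0, 3.0]     # hypothetical payoffs, chosen only for illustration
new_shares = replicator_step(shares, payoffs)
print([round(s, 3) for s in new_shares])
```

With these assumed numbers, the rare but high-payoff strategy 1 grows from 10% to roughly 18.5% of the population, while the common but low-payoff strategy 2 shrinks, which is the balance between prevalence and payoff that the bag-of-balls picture describes.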
One thing to note though is that the theory typically assumes large homogeneous populations with random interactions. The replicator equation differs from other equations used to model replication in that it allows the fitness function to incorporate the distribution of the population types rather than setting the fitness of a particular type constant. This important property allows the replicator equation to capture the essence of selection. But unlike other models, the replicator equation does not incorporate mutation and so is not able to innovate new types or pure strategies.
FISHER’S FUNDAMENTAL THEOREM
A related result is Fisher’s Fundamental Theorem, which captures the role that variation plays in adaptation. The basic intuition is that greater variation in a population gives it greater capacity to evolve strategies suited to its environment.
Thus, given a population of agents trying to adapt to their environment, the rate of adaptation is proportional to the variation among types within that population, measured more precisely as the variance in their fitness. Fisher’s Fundamental Theorem incorporates this additional important parameter, the degree of variation within the population, so as to better model the overall process of strategy evolution.
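A simplified numerical check can make this concrete. The sketch below assumes discrete generations and constant fitness values (both simplifications that are not in the original text; Fisher’s actual theorem is stated in terms of genetic variance in fitness). Under those assumptions, the gain in mean fitness after one round of selection equals the variance in fitness divided by the mean fitness. All names and numbers are hypothetical.

```python
def weighted_mean(values, shares):
    return sum(share * value for share, value in zip(shares, values))


def fitness_gain_and_variance(shares, fitness):
    """Return (gain in mean fitness after one generation, variance / mean fitness)."""
    mean_fitness = weighted_mean(fitness, shares)
    variance = weighted_mean([(w - mean_fitness) ** 2 for w in fitness], shares)
    # One round of selection: each type's share grows in proportion to its fitness.
    next_shares = [s * w / mean_fitness for s, w in zip(shares, fitness)]
    gain = weighted_mean(fitness, next_shares) - mean_fitness
    return gain, variance / mean_fitness


fitness = [1.0, 2.0, 3.0]  # hypothetical constant fitness values
mixed = fitness_gain_and_variance([1 / 3, 1 / 3, 1 / 3], fitness)
nearly_uniform = fitness_gain_and_variance([0.98, 0.01, 0.01], fitness)
print(mixed, nearly_uniform)
```

In this toy setting the evenly mixed population improves its mean fitness much faster than the population dominated by a single type, in line with the intuition that more variation means faster adaptation.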
Static game-theoretic solution concepts, such as Nash equilibrium, play a central role in predicting the evolutionary outcomes of game dynamics.
Conversely, game dynamics that arise naturally in analyzing behavioral evolution lead to a more thorough understanding of issues connected to the static concept of equilibrium. That is, both the classical and evolutionary approaches to game theory benefit through this interplay between them.
Replicator dynamic models have become a primary method for studying evolutionary dynamics in social, economic and ecological games alike.
As we saw in the previous chapter on evolutionary games, when everyone was playing a random strategy it was best to play a Tit for Tat strategy. When everyone was playing a Tit for Tat strategy it was best to play Generous Tit for Tat. When people were playing this, it was then best to play an unconditional cooperative strategy. Once the game was in this state it was then best to play a defecting strategy, thus creating a cycle. This clearly illustrates the dynamic nature of the success of strategies within games.
Because evolutionary games are dynamic, meaning that agents’ strategies change over time, what is best for one agent to do often depends on what others are doing.
It is legitimate for us to then ask, are there any strategies within a given game that are stable and resistant to invasion?
In studying evolutionary games, one thing that biologists and others have been particularly interested in is the idea of evolutionary stability: whether an evolutionary game leads to stable solutions, or points of stasis, among contending strategies.
Just as equilibrium is the central idea within static noncooperative games, the central idea in dynamic games is that of evolutionarily stable strategies, meaning those that will endure over time.
As an example, we can think about a population of seals that go out fishing every day. Hunting for fish consumes energy, and thus some seals may adopt a strategy of simply stealing fish from those who have done the fishing. If the whole population is fishing and an individual mutant is born that follows a defector strategy of stealing, it will do well for itself because there is plenty of fishing happening. This successful defector strategy can then reproduce, creating more defectors, at which point we might say that the defecting strategy is superior and will dominate. But of course, over time a tragedy of the commons situation will emerge, as not enough seals are going out fishing. Stealing fish will become a less viable strategy, to the point where the stealers start to die out, and those who go fishing may do well again.
Thus the defector strategy is unstable, and likewise, the fishing strategy may also be unstable. What may be stable in this evolutionary game is some combination of both.
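The seal story can be turned into a small simulation. The sketch below is a hedged illustration only: the payoff numbers (3, 1, 4, 0) are hypothetical values chosen so that stealing pays when fishers are common and fails when they are rare, and the function name is invented for this example. It integrates the two-strategy replicator equation with simple Euler steps.

```python
def fisher_share_after(x, dt=0.1, steps=2000):
    """Euler-integrate the replicator equation; x is the share of seals that fish."""
    for _ in range(steps):
        payoff_fish = 3 * x + 1 * (1 - x)    # fisher meets fisher / fisher meets thief
        payoff_steal = 4 * x + 0 * (1 - x)   # thief meets fisher / thief meets thief
        average = x * payoff_fish + (1 - x) * payoff_steal
        x += dt * x * (payoff_fish - average)
    return x


print(fisher_share_after(0.99))  # start with almost everyone fishing
print(fisher_share_after(0.01))  # start with almost everyone stealing
```

Under these assumed payoffs, both starting points settle at the same mix of half fishers and half thieves: neither pure strategy is stable, but the combination is.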
EVOLUTIONARILY STABLE STRATEGY
The Evolutionarily Stable Strategy closely resembles the Nash Equilibrium of classical game theory, with a number of additions.
Nash Equilibrium is a game equilibrium where it is not rational for any player to deviate from their present strategy.
An evolutionarily stable strategy here is a state of game dynamics where, in a very large population of competitors, another mutant strategy cannot successfully enter the population to disturb the existing dynamic.
Indeed, in the modern view, equilibrium should be thought of as the limiting outcome of an unspecified learning, or evolutionary process, that unfolds over time. In this view, equilibrium is the end of the story of how strategic thinking, competition, optimization, and learning work, not the beginning or middle of a one-shot game.
Therefore, a successful stable strategy must have at least two characteristics.
First, it must be effective against competitors when it is rare, so that it can enter the existing competing population and grow.
Second, it must also be successful later, when it has grown to a high proportion of the population, so that it can defend itself.
This, in turn, means that the strategy must be successful when it contends with others exactly like itself. A stable strategy in an evolutionary game does not have to be unbeatable; it only has to be uninvadable and thus stable over time.
A stable strategy is a strategy that, when everyone is doing it, no new mutant could arise which would do better, and thus we can expect a degree of stability.
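This verbal definition is usually formalized by two conditions due to Maynard Smith: a resident strategy S resists a mutant T if S does strictly better against S than T does, or if they tie against S but S does better against T than T does against itself. The sketch below checks those conditions for pure strategies only. The function name and the Prisoner’s Dilemma payoffs are hypothetical, added here for illustration.

```python
def is_pure_ess(payoff, s):
    """payoff[a][b] is the payoff to a player using strategy a against strategy b."""
    for t in range(len(payoff)):
        if t == s:
            continue
        if payoff[s][s] > payoff[t][s]:
            continue  # the mutant does strictly worse against residents
        if payoff[s][s] == payoff[t][s] and payoff[s][t] > payoff[t][t]:
            continue  # tie against residents, but residents beat mutants head to head
        return False  # this mutant can invade
    return True


# Hypothetical Prisoner's Dilemma payoffs: index 0 = cooperate, 1 = defect
prisoners_dilemma = [[3, 0],
                     [5, 1]]
print(is_pure_ess(prisoners_dilemma, 0))  # can a population of cooperators resist defectors?
print(is_pure_ess(prisoners_dilemma, 1))  # can a population of defectors resist cooperators?
```

With these payoffs, universal defection passes the check and universal cooperation does not, which matches the point that a stable strategy need not be the best outcome, only an uninvadable one.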
Of course, stable strategies do not always emerge within evolutionary games. One of the simplest examples of this is the game Rock, Paper, Scissors.
The best strategy is to play a mixed random strategy, choosing each of the three moves one-third of the time.
However, in biology many creatures are incapable of mixed behavior and exhibit only one pure strategy. If the game is played only with the pure Rock, Paper and Scissors strategies, the evolutionary game is dynamically unstable. Rock mutants can enter an all-Scissors population, but then Paper mutants can take over the all-Rock population, and then Scissors mutants can take over the all-Paper population, and so on.
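The cycling described above can be reproduced with a short replicator simulation over the three pure strategies. This is a sketch under stated assumptions: wins score 1, losses -1 and ties 0, the starting mix is arbitrary, and the equation is integrated with plain Euler steps. The names `PAYOFF` and `leader_sequence` are hypothetical.

```python
NAMES = ["Rock", "Paper", "Scissors"]
# PAYOFF[a][b]: payoff to strategy a when it meets strategy b (win 1, loss -1, tie 0)
PAYOFF = [[0, -1, 1],
          [1, 0, -1],
          [-1, 1, 0]]


def leader_sequence(shares, dt=0.01, steps=5000):
    """Record which strategy is most common each time the leader changes."""
    shares = list(shares)
    leaders = [NAMES[shares.index(max(shares))]]
    for _ in range(steps):
        fitness = [sum(PAYOFF[i][j] * shares[j] for j in range(3)) for i in range(3)]
        average = sum(x * f for x, f in zip(shares, fitness))
        shares = [x + dt * x * (f - average) for x, f in zip(shares, fitness)]
        current = NAMES[shares.index(max(shares))]
        if current != leaders[-1]:
            leaders.append(current)
    return leaders


sequence = leader_sequence([0.5, 0.3, 0.2])  # start with Rock most common
print(sequence)
```

Starting from a Rock-heavy population, the most common strategy passes from Rock to Paper to Scissors and back to Rock, with no strategy ever settling in as the permanent winner.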
Using experimental economic methods, scientists have used the Rock, Paper, Scissors game to test human social evolutionary dynamical behaviors in the laboratory. The social cyclic behaviors, predicted by evolutionary game theory, have been observed in various lab experiments.
Likewise, such cycles have been recorded within ecosystems, most notably in a particular type of lizard that can take three different forms, each with its own strategy: one aggressive, one unaggressive and the third somewhat prudent. The overall situation corresponds to the Rock, Paper, Scissors game, creating a six-year population cycle as new mutants enter and become dominant before another strategy invades, and so on.
Classical game theory was developed during the mid-20th century primarily for application in economics and political science. But in the 1970s a number of biologists started to recognize how similar the games being studied were to the interactions between animals within ecosystems. Game theory then quickly became a hot topic in biology as researchers found it relevant to all sorts of animal and microbial interactions, from the feeding of bats to the territorial defense of stickleback fish.
Originally, evolutionary game theory was simply the application of game theory to evolving populations in biology, asking how cooperative systems could have evolved over time from the various strategies that biological creatures might have adopted. However, the development of evolutionary game theory has produced a theory that holds great promise as a general theory of games.
More recently, evolutionary game theory has become of increased interest to economists, sociologists, anthropologists and social scientists in general, as well as philosophers. In this video we will talk about this more general application of evolutionary game theory.
The game theory that we have been talking about so far has focused on static strategies, that is to say, strategies that do not change over time. Evolutionary game theory differs from classical game theory in focusing more on the dynamics of strategy change. Here we are asking how strategies evolve over time and which kinds of dynamic strategies are most successful in this evolutionary process.
One of the interesting differences between evolutionary game theory and standard game theory is that the evolutionary version does not require players to act rationally.
When we talk about biological cells or ants, we know that they do not sit in front of a payoff matrix and ask themselves which move has the best payoff. In evolutionary game theory, natural selection does this for us.
So if we have a group of cooperators and defectors who randomly meet each other, the average payoff for the defectors is higher than that of the cooperators, and therefore they reproduce better. Payoffs in evolutionary biology correspond to reproductive success. So after some time evolution will have favored defectors to the point where all of the cooperators are extinct.
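A small calculation shows why the defectors win under random matching. The sketch assumes a simple donation game that is not specified in the original text: a cooperator pays a cost to give its partner a benefit, and a defector gives nothing. The parameter values and function names are hypothetical.

```python
def average_payoffs(cooperator_share, benefit=3.0, cost=1.0):
    """Average payoff to a cooperator and to a defector under random matching."""
    x = cooperator_share
    cooperator = x * (benefit - cost) + (1 - x) * (-cost)
    defector = x * benefit
    return cooperator, defector


def cooperator_share_after(x, dt=0.1, steps=300):
    """Euler-integrate the replicator equation for the share of cooperators."""
    for _ in range(steps):
        cooperator, defector = average_payoffs(x)
        average = x * cooperator + (1 - x) * defector
        x += dt * x * (cooperator - average)
    return x


print(average_payoffs(0.9))
print(cooperator_share_after(0.9))
```

Whatever the current mix, the defector’s average payoff exceeds the cooperator’s by exactly the cost of helping, so even a population that starts at 90% cooperators drifts toward extinction of the cooperators in this toy model.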
The basic logic is that, for something to survive the course of time, it must be an optimal strategy or else any other strategy that is more effective will eventually come to dominate the population.
Traditionally, the story of evolution is told as one of competition, and there is certainly plenty of this. But there is also mutualism, where organisms and people manage to work together cooperatively and survive in the face of defectors. Many research papers have been written on this topic of how cooperation could evolve in the face of such an evolutionary dynamic.
The general question of interest in evolutionary game theory is how patterns of cooperation evolve and what the optimal strategies are in a game that evolves over time.
The basic mechanism that underlies the evolution of cooperation is the interdependency between acts over time.
In a single-shot game, it makes sense to always defect, but with repeated interaction, cooperation becomes much more viable. If the game is repeated, it is no longer the case that strict defection is the best option.
If the prisoner’s dilemma situation is repeated it allows non-cooperation to be punished more, and cooperation to be rewarded more, than the single-shot version of the problem would suggest. We can understand this better by looking at a number of experiments that were done to investigate this dynamic.
The political scientist Robert Axelrod in the late seventies did a number of highly influential computer experiments asking what a good strategy is for playing a repeated Prisoner’s Dilemma. Axelrod asked various researchers to submit computer algorithms to a competition to see which algorithms would fare best against each other. Computer models of the evolution of cooperation showed that indiscriminate cooperators almost always end up losing against defectors, who accept helpful acts from others but do not reciprocate. People who are cooperative and helpful indiscriminately all of the time will end up getting taken advantage of by others. However, a population of pure defectors will also lose out on the possible rewards of cooperation that would give everyone higher payoffs.
Many strategies have been tested; the most competitive ones cooperate by default but hold a retaliatory response in reserve.
The most famous and one of the most successful of these is Tit for Tat, a very simple algorithm of just three rules: I start with cooperation; if you cooperate, then I will cooperate; if you defect, then I will defect. Computer tournaments in which different strategies were pitted against each other showed Tit for Tat to be the most successful strategy in social dilemmas.
Tit for Tat is a common strategy in real-world social dilemmas because it is nice but firm: it makes cooperation a possibility but is also quick to reprimand. It is a strategy that can be found naturally in everything from international trade policies to people borrowing and lending money. And in repeated interactions cooperation can emerge when people adopt a Tit for Tat strategy.
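The three rules of Tit for Tat translate almost directly into code. The following is a minimal sketch of a repeated Prisoner’s Dilemma, not a reconstruction of Axelrod’s tournament: the payoff values, the number of rounds and all function names are assumptions made for illustration.

```python
C, D = "C", "D"
# Hypothetical payoffs as (payoff to first player, payoff to second player)
PAYOFFS = {(C, C): (3, 3), (C, D): (0, 5), (D, C): (5, 0), (D, D): (1, 1)}


def tit_for_tat(my_history, their_history):
    # Start with cooperation, then repeat whatever the other player did last.
    return C if not their_history else their_history[-1]


def always_defect(my_history, their_history):
    return D


def always_cooperate(my_history, their_history):
    return C


def play(strategy_a, strategy_b, rounds=10):
    """Play a repeated game and return the total scores of both players."""
    history_a, history_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(history_a, history_b)
        move_b = strategy_b(history_b, history_a)
        payoff_a, payoff_b = PAYOFFS[(move_a, move_b)]
        score_a += payoff_a
        score_b += payoff_b
        history_a.append(move_a)
        history_b.append(move_b)
    return score_a, score_b


print(play(tit_for_tat, tit_for_tat))
print(play(tit_for_tat, always_defect))
print(play(always_cooperate, always_defect))
```

With these assumed payoffs, two Tit for Tat players earn the full reward of mutual cooperation, and Tit for Tat loses only the first round to a pure defector, whereas an unconditional cooperator is exploited in every round.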
To go beyond Tit for Tat, researchers started to use computers to simulate the process of evolution. Instead of people submitting solutions the computer itself generated mutations and selected from them with the researchers recording and analyzing the results.
From these experiments, they found that if the players play randomly, the winners are those that always defect. But then, when everyone has come to play defect strategies, if a few people play Tit for Tat strategies, a small cluster can form in which they get a good payoff among themselves.
Evolutionary selection can then start to favor them and they do not get exploited by all the defectors because they immediately switch to defect in retaliation.
But the Tit for Tat strategy did not last long in this setting as a new solution came to emerge given this context. This strategy was a mutant of Tit for Tat that was more forgiving called Generous Tit for Tat.
Generous Tit for Tat is an algorithm that starts with cooperation and then will reciprocate cooperation from others, but if the other defects it will defect with some probability. Thus it uses probability to enable the quality of forgiveness. It cooperates when others do but when they defect there is still some probability that it will continue to cooperate. This is a random decision so it is not possible for others to predict when it will continue to cooperate.
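The rule above can be sketched as a function with a forgiveness probability, together with a toy noisy game in which each intended move is occasionally flipped by mistake. This is an illustration under assumptions: the noise level, forgiveness value, round count and all names are hypothetical, and setting the forgiveness to zero recovers plain Tit for Tat.

```python
import random


def generous_tit_for_tat(their_last_move, forgiveness, rng):
    """Cooperate first and after cooperation; after a defection, forgive with probability `forgiveness`."""
    if their_last_move in (None, "C"):
        return "C"
    return "C" if rng.random() < forgiveness else "D"


def flip(move):
    return "D" if move == "C" else "C"


def cooperation_rate(forgiveness, noise=0.05, rounds=10_000, seed=0):
    """Share of cooperative moves when two identical players meet in a noisy game."""
    rng = random.Random(seed)
    last_a = last_b = None
    cooperative_moves = 0
    for _ in range(rounds):
        move_a = generous_tit_for_tat(last_b, forgiveness, rng)
        move_b = generous_tit_for_tat(last_a, forgiveness, rng)
        # Noise: with a small probability a move comes out as its opposite.
        if rng.random() < noise:
            move_a = flip(move_a)
        if rng.random() < noise:
            move_b = flip(move_b)
        cooperative_moves += (move_a == "C") + (move_b == "C")
        last_a, last_b = move_a, move_b
    return cooperative_moves / (2 * rounds)


strict = cooperation_rate(forgiveness=0.0)    # plain Tit for Tat
generous = cooperation_rate(forgiveness=0.3)  # Generous Tit for Tat
print(strict, generous)
```

In this sketch a single mistaken defection sends two strict Tit for Tat players into long runs of retaliation, while a moderate amount of forgiveness lets the pair return to cooperation, so the generous pair cooperates far more often.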
It turns out that this forgiving strategy is optimal in environments where there is some degree of noise in communications, as is characteristic of real-world environments.
In the real world, we often do not know for certain if our partner cheated or if someone really meant to say what they said, and these errors have to be compensated for by some degree of forgiveness. In a world of errors in action and perception, such a strategy can be a Nash equilibrium and evolutionarily stable. The more beneficial cooperation is, the more forgiving Generous Tit for Tat can be, while still resisting invasion by defectors.
The extraordinary thing that now happens is that once everyone has moved towards playing Generous Tit for Tat, cooperation becomes a much stronger attractor and at this stage, players can now play an unconditional cooperative strategy without having any disadvantage.
In a world of Generous Tit for Tat, there is no longer a need for any other actions and thus unconditional cooperators survive. In order for a strategy to be evolutionarily stable, it must have the property that if almost every member of the population follows it, no mutants can successfully invade – where a mutant is an individual who adopts a novel strategy.
In many situations, cooperation is favored and it even benefits an individual to forgive an occasional defection, but cooperative societies are always unstable because mutants inclined to defect can upset any balance. And this is the downfall of the cooperative strategy. What happens next is somewhat predictable. In a world where everyone is cooperating, unconditional defection is an optimal strategy once it takes hold.
Thus we can see a dynamic cyclical process, as higher forms of cooperation arise and then collapse. In many ways then this reflects what we see in the real world of economies and empires rising and falling as institutional structures for cooperation are formed, mature and eventually decline.
These experiments describe the evolution of systems of cooperation through direct interaction, and many of our interactions are repeated with people we have interacted with before and whose capacity for reciprocity we have come to understand. However, in large societies we have to interact with many people that we have not interacted with before, and it may be only a one-off interaction.
Experiments have shown that people help those who have helped others and have shown reciprocity in the past and that this form of indirect reciprocity has a higher payoff in the end. Reputation systems are what allow for the evolution of cooperation by indirect reciprocity. Natural selection favors strategies that base the decision to help on the reputation of the recipient. The idea is that you interact with others and that interaction is seen and people note whether you acted cooperatively or non-cooperatively. That information is then circulated so that others learn about your behavior. Direct reciprocity is where I help you and you help me, indirect reciprocity is where I help you and then somebody helps me because I now have a reputation for cooperating.
The result is the formation of reputation: when you cooperate, your reputation improves; when you defect, it declines. That reputation then follows you around and is used as the basis for your interactions with others.
Thus reputation forms a system for the evolution of cooperation in larger societies where people may interact frequently with people that they may not know personally. But because of various reputation systems, they are able to identify those who are cooperative and enter into mutually beneficial reciprocal relations.
The more sophisticated and secure these reputation systems, the greater the capacity for cooperative organizations. We can create large systems wherein we know who to cooperate with and thus can be cooperative ourselves, potentially creating a successful community.
But of course, as a society gets bigger, we have to form more complex institutions to enable functional reputation systems. In this way we have gone from small communities, where local gossip sufficed to know everyone’s capacity for cooperation, to large modern industrial societies, where centralized organizations vouch for people’s reputations, and on to today’s burgeoning global reputation systems based on information technology and mediated through the internet.
Research shows that cooperators create better opportunities for themselves than non-cooperators: they are selectively preferred as collaborative partners, romantic partners, and group leaders. This occurs, however, only when people’s social dilemma choices are seen and recorded by others in some way.
However, this kind of indirect reciprocity is cognitively complex, and no other creature has mastered it to even a fraction of the extent that humans have. Games of indirect reciprocity lead to the evolution of social intelligence and of the ever more sophisticated means of communication and social and cultural institutions that are characteristic of human civilization.
The basic problem of the evolution of cooperation is thus that nice guys get taken advantage of, and thus there must be some form of supporting structure to enable cooperation.
More than any other primate species, humans have overcome this problem through a variety of mechanisms, such as reciprocating cooperative acts, forming reputations of others and the self as cooperators, and caring about these reputations.
We create prosocial norms about good behavior that everyone in the group will enforce on others through disapproval, if not punishment, and will enforce on themselves through feelings of guilt and shame. All of which form the fabric of our sociocultural institutions that enable advanced forms of cooperation.
Throughout this section of the course, we have been talking about cooperation and different aspects of the social dilemma. In this video, we will look at various approaches that have been identified for fostering the cooperation required to overcome this core constraint.
Our capacity to solve the social dilemma in various ways is a defining factor in the strength of individual relationships, social organizations, economies, and society at large and is thus a topic that is of great interest to many.
Depletion of natural resources, pollution, and intercultural conflict can all be characterized as examples of social dilemmas.
Social dilemmas are challenging because acting in one’s immediate self-interest is tempting to everyone involved, even though everybody benefits from acting in the longer-term collective interest. Thus some form of cooperative institutional infrastructure is required to enable the cooperation required for sustained success.
The empirical fact that subjects in most societies contribute anything at all in the simple public goods game that we looked at previously is a challenge for game theory to explain via motives of total self-interest. But as we have noted, one of the defining features of human beings is their extraordinarily high level of cooperative behavior. Cooperation is a massive resource for advancing individual and group capabilities, and over the course of thousands of years we have evolved complex networks for collaboration and cooperation, which we can call institutions of various forms.
These institutional structures help us to solve the many different forms of the tragedy of the commons that we encounter within large societies.
As we have touched upon previously, the central issue of the tragedy of the commons is externalities. That is to say, the actions that the individual takes have costs that the person does not fully bear, as they are externalized to the overall organization. If there are too many negative externalities and not enough positive externalities, the organization will degrade over time. The central issue in solving the tragedy of the commons is then one of reconnecting the costs that the individual’s actions impose on the whole with the costs that they pay. When the individual always pays the full costs of their actions, there is no social dilemma and we have a self-sustaining organization.
This may sound simple in the abstract, but in practice, it is not simple at all, and this is one reason why we have such a complex array of economic and social institutions. How we approach doing this though, depends on the degree of interconnectivity and interdependence between the players in the game.
When there is low interconnectivity, then there will likely be low interdependence, which means a high probability for negative correlations between actors.
When actors are independent then they can do things that affect the other without that effect returning to themselves.
For example, if I live in Germany and pollute the atmosphere so that there is acid rain in Sweden, as long as I never go to Sweden then what happens there does not affect me too much and this negative correlation can exist.
Now if we turn up the interconnectivity and interdependence, this changes the dynamic. Say I have business partners in Sweden and happen to go on holiday there also. Due to this interconnectivity and interdependence there is a much greater possibility for a positive correlation between my experience and what happens in Sweden. This interconnectivity and interdependence means that I increasingly have to factor my negative externalities into my cost-benefit equation.
The central importance of interdependence as a parameter in cooperation can be simply seen in the way that people cooperate more with those that they are closely connected to; more than with those that are of a different group, culture or society that they are not connected with.
Thus, how we go about solving the social dilemma depends on the degree of interconnectivity and interdependence within the dynamic. At a low level, cooperative structures have to be imposed through regulation, while at a high level this is no longer necessary, as the interconnectivity and interdependence can be used to create self-sustaining cooperative organizations. This is illustrated by how different cooperative structures have evolved within society: those within small, closely interdependent groups like the family, and those that have formed for larger society, which is composed of many groups that are more independent.
This is to a large extent part of what has happened as we have gone from small, pre-modern societies to large modern societies. As the scale of the social systems that we are engaged in has increased the interconnectivity and interdependence between any two random members has decreased – because they are farther apart in the network. Thus this has disintegrated traditional cooperative institutions that are based on local interactions and interdependencies.
In the absence of tools for interconnecting everyone within a large national society, we have had to create the formal centralized regulatory institutions of the nation state.
And of course, with the rise of information technology and globalization, this is once again changing as we create social interdependencies that span the entire planet.
The most manifest and obvious form for enabling cooperation is regulation and rules that are imposed on the social system by a third party to ensure behavior that is of benefit to the group. The aspect of cooperation examined in many experimental games is cooperation that occurs when people follow rules limiting the exercise of their self-interested motivations.
People might want to take from a shop without paying, but are required to abide by the law; they may want to fish in a lake, but limit what they catch to the quantity specified in a permit.
They buy a fuel efficient car because of regulation taxing the sale of inefficient cars.
In all of these situations, people are refraining from engaging in behavior that would give them immediate benefit but is against the welfare of the group.
This method for enabling cooperation through regulation and rule adherence is deeply intuitive to us and often the default assumption as to how we might achieve cooperation.
The central aim of regulation is to connect the individual’s externalities with the costs and benefits they experience, by imposing extra costs on them for certain negative externalities while providing them with subsidies and payments for activities that generate positive externalities.
This form of solution, ensuring cooperation through an external third party that imposes sanctions or rewards, can be very effective in situations of independence between members.
For example, this would be a good solution to the prisoner’s dilemma, where the members cannot communicate with each other and are otherwise independent. By forming a third party that could impose sanctions on them, we could change the payoffs in the game to enable cooperative outcomes.
Although the regulatory approach is simple and straightforward, the development and maintenance of this external organization has overhead costs. It is also prone to corruption and has other limitations.
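How a sanction changes the game can be checked directly. The sketch below uses hypothetical prisoner’s dilemma payoffs and a hypothetical fine of 3 imposed by a third party on any player who defects; it then lists the pure-strategy Nash equilibria before and after the sanction. All names and numbers are assumptions for illustration.

```python
from itertools import product

MOVES = ("cooperate", "defect")
# Hypothetical payoffs: BASE[(row move, column move)] = (row payoff, column payoff)
BASE = {
    ("cooperate", "cooperate"): (3, 3),
    ("cooperate", "defect"): (0, 5),
    ("defect", "cooperate"): (5, 0),
    ("defect", "defect"): (1, 1),
}


def with_sanction(game, fine):
    """A third party subtracts `fine` from the payoff of any player who defects."""
    return {
        (a, b): (pa - fine * (a == "defect"), pb - fine * (b == "defect"))
        for (a, b), (pa, pb) in game.items()
    }


def pure_nash_equilibria(game):
    """Move pairs where neither player can gain by changing only their own move."""
    equilibria = []
    for a, b in product(MOVES, repeat=2):
        row_is_best = all(game[(a, b)][0] >= game[(alt, b)][0] for alt in MOVES)
        column_is_best = all(game[(a, b)][1] >= game[(a, alt)][1] for alt in MOVES)
        if row_is_best and column_is_best:
            equilibria.append((a, b))
    return equilibria


print(pure_nash_equilibria(BASE))
print(pure_nash_equilibria(with_sanction(BASE, fine=3)))
```

Without the sanction the only equilibrium is mutual defection; once the fine exceeds what a player gains by defecting, mutual cooperation becomes the only equilibrium. The sketch ignores the cost of running the third party, which is one of the limitations of the regulatory approach.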
Studies have been conducted into the success of the establishment of a leader or authority to manage a social dilemma. Experimental studies on commons dilemmas show that over-harvesting groups are more willing to appoint a leader to look after the common resource.
There is a preference for a democratically elected prototypical leader with limited power especially when people’s group ties are strong.
When ties are weak, groups prefer a stronger leader with a coercive power base.
The question remains whether authorities can be trusted in governing social dilemmas, and field research shows that legitimacy and fair procedures are extremely important in citizens’ willingness to accept the authority.
Furthermore, the formal governance structures of a police force, army, and judicial system will fail to operate unless people are willing to pay taxes to support them. This raises the question of whether many people want to contribute to these institutions.
Experimental research suggests that particularly low-trust individuals are willing to invest money in punishment systems.
The political economist Elinor Ostrom won the Nobel Prize in economics for her studies of various communities around the world and how they managed to develop diverse institutional arrangements for managing natural resources, thus avoiding ecosystem collapse. She illustrated how common resources can be managed successfully by the people who use them rather than by governments or private companies. In an interview talking about this centralized regulatory approach, she had this to say about it: “for some simple situations that theory works and we should keep it for the right situation, but there are so many other rich solutions.”
When interconnectivity between members within a game increases, so typically does interdependence and this changes the nature of the game.
Externalities are things that we can put external to our domain of value and interest, but interconnectivity reduces the capacity to do this.
One good example of this is the warning signs on the side of cigarette packets that make you aware of the negative externalities of smoking on your body. They are trying to connect you with the negative externality that you are creating so that you recognize your interdependence and factor it into the equation under which you are making your decision to smoke.
Thus we can see that an externality is not necessarily something far away; it is simply whatever you exclude from your value system, so that a loss to it does not register as a loss to your payoff. But connectivity takes this barrier down, requiring us to recognize the value of the other entity and factor it into our decision. This connectivity can be of many different kinds.
Communication is a form of connection that can enable positive interdependence and there is a robust finding in the social dilemma literature that cooperation increases when people are given a chance to talk to each other.
Cooperation generally declines when group size increases. In larger groups, people often feel less responsible for the common good, as they are more removed from it and the other people with whom they share it.
Thus we can see what is really at the core of the social dilemma is the question of what people value and how far that value system extends.
Wherever we stop seeing something as part of us or our group, that is where negative externalities accumulate and start to give us the social dilemma. However, by building further connections so that people recognize their interdependence with what they previously saw as external, they will start to factor it into the value system under which they are making their choices and reduce their negative externalities. From this perspective, the issue is really one of value and externalities.
Connectivity can change that equation, working to internalize the externalities. Connectivity though is just an enabling infrastructure, one still has to build the channels of communication and structures that enable positive interdependence.
Building systems of cooperation in such a context means enabling ongoing interaction, with identifiable others, with some knowledge of previous behavior, lists of reputations that are durable and searchable and accessible, feedback mechanisms, transparency etc. These are all means of fostering positive interdependence once interconnectivity is present and through them, self-regulating and sustainable systems of cooperation can be formed.
If we think back to the public goods game, if the amount contributed is not hidden, then players tend to contribute significantly more. This is simply creating a transparent system where there is feedback.
As another example, we could think of eBay. EBay is really a huge social dilemma game. You would not send money before receiving the item nor would the other party send the item before receiving the money, so why has eBay succeeded? Not because eBay is going to throw you into jail if you don’t play nice, it is because of communication, transparency and feedback mechanisms that build positive interdependence.
This interconnectivity that builds positive interdependence exists not just in space but also in time. Probably the single biggest difference in the prisoner’s dilemma game is whether it is played as a one-off or as a recurring game. It is to this topic of games that play out over time that we will turn in the next section.
The game theoretical version of the social dilemma is called the public goods game. Public goods games are usually employed to model the behavior of groups of individuals achieving a common goal. The public goods game has the same properties as the prisoner’s dilemma game, but describes a public good or a resource from which all may benefit regardless of whether or not they contributed to the good.
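The mechanics of a public goods game can be made concrete with a small sketch. It assumes the standard form of the game, in which contributions go into a common pot that is multiplied and then shared equally; the function name, the four players, the endowment of 10 and the multiplier of 2 are all illustrative assumptions rather than values from the text.

```python
def public_goods_payoffs(contributions, endowment=10, multiplier=2):
    """Each player keeps whatever they did not contribute and receives
    an equal share of the multiplied common pot.
    (endowment and multiplier are assumed, illustrative values)"""
    pot = sum(contributions) * multiplier
    share = pot / len(contributions)
    return [endowment - c + share for c in contributions]


# Everyone contributes their full endowment.
everyone = public_goods_payoffs([10, 10, 10, 10])

# Nobody contributes anything.
no_one = public_goods_payoffs([0, 0, 0, 0])

# Player 0 free-rides while the other three contribute.
one_free_rider = public_goods_payoffs([0, 10, 10, 10])

print(everyone)        # [20.0, 20.0, 20.0, 20.0]
print(no_one)          # [10.0, 10.0, 10.0, 10.0]
print(one_free_rider)  # [25.0, 15.0, 15.0, 15.0]
```

Under these assumed numbers the free rider does better than the contributors, yet everyone contributing leaves each player better off than nobody contributing, which is the same structure as the prisoner’s dilemma.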
In the past ten to twenty years interest in social dilemmas has grown dramatically within many domains – particularly those resulting from overpopulation, resource depletion, and pollution.
The study of social dilemmas is one of the most interdisciplinary research fields, with the participation of researchers from anthropology, biology, economics, mathematics, neuroscience, political science, and psychology among others.
The social dilemma captures the core dynamics within groups requiring collective action, where there is a conflict between an individual’s immediate personal or selfish interests and the actions that maximize the interests of the group.
At the heart of social dilemmas lies a disjunction between the costs to the individual and the cost to the whole, or benefits to the individual and benefits to the whole.
We call this value that is not factored into the cost-benefit equation of the individual an externality. And it is externalities that create the disjunction between the parts and whole and result in the social dilemma.
As an example of the social dilemma, we can think about the voting process within democratic political systems. In such a system citizens are called upon to make informed decisions about who should manage their country’s government. Many people choose not to vote, but of those who do, in order to make an informed decision, they have to gather and process information about the candidates.
Since each person’s vote is unlikely to affect the outcome of an election, and everyone knows this, there is little incentive to the individual to increase their knowledge about the relevant issue and overcome misinformation and preconceptions. The benefits of collecting and processing the information are diffused to the whole population. But the cost of doing such action is carried by the individual. Informed voting is then what we would call a public good.
As another example, we can think of a situation where during winter people in a village are asked to keep their thermostats low to conserve the limited amount of energy available. This would though require them to suffer from the cold without appreciably conserving the fuel supply by their individual sacrifice; yet if all keep their thermostats high, all may run out of fuel and freeze.
What is happening in this game is that there is a positive externality. Some of the value that is being generated by the individual is being externalized to the whole organization. This external value can not be immediately factored into the cost-benefit analysis of the individual. If the system is simply operated according to this immediate cost benefit analysis of the individuals then there will likely be an undersupply of it.
Equally, we can have negative externalities, the classical examples being air pollution and traffic jams. An important part of the social dilemma is that at any given decision point, individuals receive higher payoffs for making selfish choices than they do for making cooperative choices, regardless of the choices made by those with whom they interact. And everyone involved receives lower payoffs if everyone makes selfish choices than if everyone makes cooperative choices.
What is interesting about the social dilemma is that it is not a feature of the agents, but of the structure of the game. It has nothing to do with the personalities and motivation of the individuals. It is a tragedy because you know what the outcome will be, but you can not individually do anything to avoid it, given only the self-interests of the individual.
No one has an individual motive to change their behavior even though everyone will be worse off.
Our traditional tools of non-cooperative game theory, which focus on the immediate costs and benefits to the individual, only really work when the externalities are limited. As soon as the externalities, whether positive or negative, go above a certain level, we have to think and operate collectively.
Many private goods are rivalrous, meaning they can be only consumed by one agent, and excludable meaning it is possible to exclude others from their use. This reduces the externalities from the item and thus makes it possible to associate the value of that item with an agent and factor it into their cost-benefit analysis.
Goods are public when they are considered both nonrivalrous and nonexcludable, meaning that there will likely be many externalities and thus they can not be effectively managed by the immediate cost benefit analysis of the individual agents. In such a case the social dilemma can arise. When people can benefit from the positive externalities, and produce more of the negative externalities without paying, the result is a macro level imbalance that leads to the system being rendered unsustainable. As we will discuss further in a coming video, there are essentially two different approaches to solving this imbalance and managing collective action.
The organization can try to create some top-down structure that regulates the system to ensure that those who create negative externalities pay for them and those who create positive externalities get paid for them. This works to reintegrate the externalities into the cost-benefit analysis of the agents and maintain a macro-level balance. This is the traditional approach, the classical example being governments, which use force and incentives to regulate the system towards these ends.
A second approach is to increase the degree of connectivity between the agents and thus the potential for a higher degree of interdependence between them. Given that interdependence means that what happens to one agent also affects another, as interdependence between agents and between the individual and the whole increases, externalities decrease, because there is nowhere for them to go and agents increasingly have to factor them into their own cost-benefit analysis.
This is another approach to managing the system and potentially solving the problem.
Externalities are always a function of independence. If two things are one, then there is no possibility for externalities, the more independent they are the greater the possibility for externalities. Thus all solutions to externalities and the social dilemma will involve creating positive interdependence between the elements in the system, but different approaches will do this in different ways.
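One way to see how interdependence internalizes an externality is a small worked calculation. The sketch below is an assumption-laden illustration, not a model from the text: it uses a public goods setting with an assumed group size and multiplier, and it introduces a hypothetical parameter `weight` that measures how much an agent counts the payoffs of others as part of their own (0 means fully independent, 1 means others count as much as oneself).

```python
def contribution_gain(weight, n=4, multiplier=2):
    """Change in an agent's valuation from contributing one unit to a
    common pot that is multiplied and shared equally among n players.

    weight, n and multiplier are illustrative assumptions:
      -1                      the unit the agent gives up
      multiplier / n          the part of the benefit the agent gets back
      (n - 1) * multiplier/n  the benefit that spills over to the others
    """
    own_return = multiplier / n
    spillover = (n - 1) * multiplier / n
    return -1 + own_return + weight * spillover


independent = contribution_gain(0.0)    # spillover is ignored entirely
partly_linked = contribution_gain(0.5)  # half of the spillover is counted
fully_linked = contribution_gain(1.0)   # all of the spillover is counted

print(independent, partly_linked, fully_linked)  # -0.5 0.25 1.0
```

With these assumed numbers, a fully independent agent loses by contributing, while an agent who values others’ payoffs enough (here, more than one third as much as their own) gains by contributing. The externality shrinks exactly as the weight placed on others grows.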
We will pick this theme up again in a future video when we go deeper into ways of solving social dilemmas. In the next video, we will look at what game theory can tell us about the social dilemma as we talk about public goods games.
Game theory is the study of interdependent interactions between adaptive agents, and when we look at these interactions between agents in the world around us we see elements of both competition and cooperation.
We see the organelles within a biological cell work together to enable its overall functioning. We see organisms within ecosystems forming symbiotic cooperative relations. We see people form families, tribes, cities, and nations all of which involve high levels of cooperation.
Game theory should then be a tool that helps us to understand what actions an agent should take within non-cooperative situations and what outcomes are most likely in such games.
But it should also be a tool that helps us understand how cooperation works. The first thing for us to note is that the dynamics of cooperation are very different from those that we have been studying in games of noncooperation. Cooperative dynamics rewrite the rules of the games people play.
In non-cooperative games like the prisoner’s dilemma, we noted how individual rationality, in fact, led to collective irrationality. We called this a dilemma and lamented the fact that within the non-cooperative framework of self-interested individuals there was nothing we could do about it. However, we as humans do not just look out for our own interests but also those of others and in so doing we have evolved highly sophisticated means for cooperation in the process.
When we add this new element to the game, that of cooperation, we now have the possibility for the agents to solve this dilemma. The question then turns to how and when do coalitions for cooperation form, where we can achieve both stable and optimal outcomes for the individual and the whole organization.
With non-cooperative games, we are solely focused on the payoffs to the individuals and searching for stable situations. Cooperative game theory, however, adds an extra dimension to this in that we now have to think about the payoff to the whole organization. In such a case we cannot simply look at the actions of the individuals and their payoffs; we also have to look at the positive or negative externalities that these actions may have on the whole system.
Cooperation is a process by which the components of a system work together to achieve the global properties. In other words, individual components that appear independent work together to create a complex whole, greater-than-the-sum-of-its-parts system.
Virtually all of human civilization is a product of our capacity to work cooperatively. Indeed the complex systems that surround us, like our global economy and technologies like a jumbo jet are a testament to our extraordinary capacity for cooperation.
In most animal groups and even our closest relatives in the primate group, competition is the norm, and cooperation occurs largely only among kin, who have common genes and so have a biological incentive to do so, or else among a few individuals who cooperate reciprocally. But humans cooperate with each other in very large groups in a multiplicity of ways.
People risk their lives in war for their countrymen and we make sure that our less fortunate compatriots have enough food and medical care to survive. On a daily basis, we obey all kinds of prosocial norms. And when we do breach some prosocial norm, like not doing our part in a collective enterprise, we feel guilty or ashamed. In general, we are highly sensitive to cooperative behavior.
The evolved human capacity for cooperation is a cultural one, and it distinguishes us from other creatures with whom we may share up to 98% of our genes. A large-scale study has recently provided some foundation for this hypothesis.
Researchers compared two-year-old children to their nearest primate relatives, chimpanzees and orangutans. They were all given 16 different tasks grouped into two overall categories, one concerning an understanding of the physical world, the other an understanding of the social world. The physical tests were related to space, quantity, and causality, and the social tests concerned capacities for social imitation, communication, and intention reading. The experiment revealed that the children were not more intelligent than the apes across the board; only with respect to social cognition were they more advanced. At just two years of age, the children already scored about twice as high on this measure as the other two species. This social capacity enables us to take advantage of the skills and knowledge of others within a social group through cooperation.
The researchers noted that apes did have social cognitive skills, but they were mainly using their social understanding of others within contexts of competition. From this, the researchers proposed that on top of great apes’ skills for social cognition, humans had evolved additional social cognitive capabilities for dynamics of cooperation, which involve greater complexity, but which can ultimately be seen as the foundations of advanced forms of civilization.
Thus they came to understand others as not just intentional goal seeking agents, but also as potential cooperative agents with whom they could work together to produce outcomes that neither could produce alone. This cognitive capacity along with communications enabled us to create the ever more complex social and cultural institutions for cooperation, that today form the foundation of our advanced systems of socio-economic coordination.
The researchers claimed that this distinction between apes and humans can be identified even in the earliest human economies. They noted that apes are individual foragers: they will travel in small groups until they find a food source like a fig tree and then run up and grab the food separately, without collaboratively producing it or sharing it. Humans, however, are collaborative foragers, meaning that most traditional forager groups derive most of their daily nutrition from collaborative activities in different forms, such as hunting.
This is not to say that advanced forms of cooperation do not happen within other creatures. We just have to look at ant or bee colonies to see sophisticated coordination. However, these creatures have nowhere near the kind of individual cognitive capacity that apes and humans do, and thus we do not get the same kind of complex dynamic between the individual and the group that is at the heart of human social systems and the study of cooperation.
Dynamics of cooperation require more cognitive capabilities on the behalf of the individual because they are greatly more complex in nature than noncooperative situations.
Whereas non-cooperative dynamics are governed solely by the self-interest of the individuals, cooperation involves a new level of organization, that of the group, and a complex dynamic between the individual and the overall group.
This central dynamic within cooperation is captured in the idea of the social dilemma. Social dilemmas are characterized by two properties: The social payoff to each individual for defecting behavior is higher than the payoff for cooperative behavior, regardless of what the other society members do, yet all individuals in the society receive a lower payoff if all defect than if all cooperate. It is a situation where individual rational behavior leads to a situation where everyone is worse off.
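The two defining properties can be written as a simple check on a symmetric two-player game. This is a minimal sketch under stated assumptions: the payoff numbers are illustrative, and the second game (labelled `HARMONY` here) is a hypothetical contrast case in which cooperation pays no matter what, so it should fail the test.

```python
ACTIONS = ("cooperate", "defect")


def is_social_dilemma(payoff):
    """payoff[(mine, other)] is my payoff in a symmetric two-player game.

    Property 1: defecting pays more than cooperating, whatever the other does.
    Property 2: mutual defection pays less than mutual cooperation.
    """
    defect_always_pays = all(
        payoff[("defect", other)] > payoff[("cooperate", other)]
        for other in ACTIONS
    )
    all_defect_is_worse = (
        payoff[("defect", "defect")] < payoff[("cooperate", "cooperate")]
    )
    return defect_always_pays and all_defect_is_worse


# Assumed prisoner's dilemma style payoffs.
PRISONERS_DILEMMA = {
    ("cooperate", "cooperate"): 3,
    ("cooperate", "defect"): 0,
    ("defect", "cooperate"): 5,
    ("defect", "defect"): 1,
}

# Hypothetical game in which cooperating is always the better choice.
HARMONY = {
    ("cooperate", "cooperate"): 3,
    ("cooperate", "defect"): 2,
    ("defect", "cooperate"): 1,
    ("defect", "defect"): 0,
}

print(is_social_dilemma(PRISONERS_DILEMMA))  # True
print(is_social_dilemma(HARMONY))            # False
```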
Social dilemmas are of interest to many because they reveal the core tension between the individual and the group, that is engendered in situations of cooperation. At their core, social dilemmas are situations in which self-interest is at odds with collective interests and they can be found in many situations of interdependence; from resource management to relationship development, to international politics, public goods provision and business management. In the next section, we will zoom in to look more closely at the workings of this social dilemma.
In non-cooperative game theory, the focus is on the agents in the game and strategies that optimize their payoffs, which results in some form of equilibrium.
As we can see in the prisoner’s dilemma game the issue arises in that what turns out to be the equilibrium is suboptimal for all the agents when taken as a whole. One way of defining what we mean by suboptimal for all is the idea of Pareto Optimality.
Named after Vilfredo Pareto, Pareto optimality is a measure of efficiency.
Whereas Nash equilibrium is a solution concept of non-cooperative games, Pareto optimality in game theory answers a very specific question: can one outcome be considered better than another?
Pareto optimality is a notion of efficiency or optimality for all the members involved.
An outcome of a game is Pareto optimal if there is no other outcome that makes every player at least as well off and at least one player strictly better off. That is to say, a Pareto optimal outcome cannot be improved upon without hurting at least one player.
To illustrate this let’s take the game called the stag hunt, wherein two individuals go out on a hunt. Each can individually choose to hunt a stag or hunt a hare. Each player must choose an action without knowing the choice of the other. If an individual hunts a stag, they must have the cooperation of their partner in order to succeed. An individual can get a hare by themselves, but a hare is worth less than a stag.
In the stag hunt, there is a single outcome that is Pareto efficient, which is that they both hunt stags. With this outcome, both players receive a payoff of three, which is each player’s largest possible payoff for the game. In this case, we cannot switch to any other outcome and make at least one party better off without making anyone worse off. The stag option is here the only Pareto optimal outcome.
One of the features of a Nash equilibrium is that in general, it does not correspond to a socially optimal outcome. That is, for a given game it is possible for all the players to improve their payoffs by collectively agreeing to choose a strategy different from the Nash equilibrium. The reason for this is that some players may choose to deviate from the agreed-upon cooperative strategy after it is made in order to improve their payoffs further at the expense of the group. A Pareto optimal equilibrium describes a social optimum in the sense that no individual player can improve their payoff without making at least one other player worse off. Pareto optimality is not a solution concept, but it can be an important attribute in determining what solution the players should play, or learn to play over time.
This is the interesting thing about the prisoner’s dilemma, that all options are Pareto optimal except for the unique equilibrium, which is for both to defect. This strong contrast between Pareto optimality and Nash equilibrium is what makes the prisoner’s dilemma a central object of study in game theory. The fact that all of the overall efficient outcomes are the ones that do not occur in equilibrium, makes it a classical illustration of the core dynamic between cooperation and competition.
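The Pareto claims made above for the stag hunt and the prisoner’s dilemma can be checked mechanically. In this sketch only the payoff of three for jointly hunting stag comes from the text; every other number is an assumption chosen to respect the description of each game, and the function names are illustrative.

```python
def pareto_dominates(a, b):
    """True if payoff vector a makes everyone at least as well off as b
    and at least one player strictly better off."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_optimal(game):
    """Return the outcomes that no other outcome Pareto-dominates."""
    return [
        outcome
        for outcome, payoff in game.items()
        if not any(pareto_dominates(other, payoff) for other in game.values())
    ]


# Payoffs are (row player, column player); values other than (3, 3) are assumed.
STAG_HUNT = {
    ("stag", "stag"): (3, 3),
    ("stag", "hare"): (0, 1),
    ("hare", "stag"): (1, 0),
    ("hare", "hare"): (1, 1),
}

# Assumed prisoner's dilemma payoffs.
PRISONERS_DILEMMA = {
    ("cooperate", "cooperate"): (3, 3),
    ("cooperate", "defect"): (0, 5),
    ("defect", "cooperate"): (5, 0),
    ("defect", "defect"): (1, 1),
}

stag_hunt_optimal = pareto_optimal(STAG_HUNT)
dilemma_optimal = pareto_optimal(PRISONERS_DILEMMA)

print(stag_hunt_optimal)  # [('stag', 'stag')]
print(dilemma_optimal)    # every outcome except ('defect', 'defect')
```

With these numbers, mutual defection is the only outcome in the prisoner’s dilemma that fails the test, because mutual cooperation makes both players better off.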
This is a good segue into the next section of the book, where we will be talking about the dynamics of cooperation and looking specifically at the overall outcomes, trying to optimize them instead of just individual payoffs.
As we talked about in the last video, the central aim in non-cooperative game theory is in trying to find the optimal strategy for agents to play within a game and trying to predict the outcomes of the game by finding points of equilibrium.
This equilibrium is called the Nash Equilibrium and is considered the best option given the absence of frameworks to support cooperation.
This is what we call a solution concept.
In game theory, a solution concept is a model or rule for predicting how a game will be played. These predictions are called “solutions”, and describe which strategies will be adopted by players and, therefore, the results of the game.
The most commonly used solution concepts are equilibrium concepts.
Here we look for a set of choices, one for each player, such that each person’s strategy is best for them when all the others are playing their stipulated strategies.
In other words, each picks their best response to what the others do.
In game theory the term best response refers to the strategy (or strategies) which produce the most favorable outcome for a player, taking other players’ strategies as given. Best response is when you know what others are going to do and you choose your best response.
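A best response is easy to compute once the other player’s choice is fixed. The sketch below uses the stag hunt with assumed payoffs for one player; the dictionary layout and the function name are illustrative choices.

```python
ACTIONS = ("stag", "hare")

# My payoff for (my action, other player's action); values are assumed.
MY_PAYOFF = {
    ("stag", "stag"): 3,
    ("stag", "hare"): 0,
    ("hare", "stag"): 1,
    ("hare", "hare"): 1,
}


def best_responses(other_action):
    """Return every action that maximizes my payoff, taking the other
    player's action as given."""
    best = max(MY_PAYOFF[(mine, other_action)] for mine in ACTIONS)
    return [mine for mine in ACTIONS if MY_PAYOFF[(mine, other_action)] == best]


print(best_responses("stag"))  # ['stag']
print(best_responses("hare"))  # ['hare']
```

Here the best response changes with what the other player does, which is exactly the situation in which no single strategy is best in every case.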
Sometimes one person’s best choice is the same no matter what the others do. This is called a “dominant strategy” for that player. Hence, a strategy is dominant if it is always better than any other strategy, for any profile of other players’ actions.
A strategy is termed strictly dominant if, regardless of what any other players do, the strategy earns a player a strictly higher payoff than any other. If a player has a strictly dominant strategy then they will always play it in equilibrium.
A strategy is weakly dominant if, regardless of what any other players do, the strategy earns a player a payoff at least as high as any other strategy.
If there are better strategies to take within a game then there must also be worse strategies, and we call these worse strategies dominated. A strategy is dominated if there is some other choice available to the agent that yields a better payoff no matter what the other players do.
When the game is non-cooperative and players are assumed to be rational, strictly dominated strategies are eliminated from the set of strategies that might feasibly be played. Thus the search for an equilibrium typically begins by looking for dominant strategies and eliminating dominated ones.
For example, in a single iteration of the prisoner’s dilemma game, cooperate is strictly dominated by defect for both players, because either player is always better off playing defect, regardless of what their opponent does. In searching for the equilibrium of this game, we would simply look at each cell and ask: is there a better option for the player? If so, then the cell is dominated and we should not choose it. Once we have done this for both players, we can identify a corresponding cell or number of cells that is optimal for each, giving us the equilibrium or possibly a number of different equilibria.
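The dominance checks described above can be sketched in a few lines. The payoffs are assumed prisoner’s dilemma values for one player of a symmetric game, and the helper names are illustrative. The weak version follows the definition given in the text; many textbooks additionally require the strategy to be strictly better in at least one case.

```python
ACTIONS = ("cooperate", "defect")

# My payoff for (my action, opponent's action); values are assumed.
PAYOFF = {
    ("cooperate", "cooperate"): 3,
    ("cooperate", "defect"): 0,
    ("defect", "cooperate"): 5,
    ("defect", "defect"): 1,
}


def strictly_dominates(a, b):
    """a earns strictly more than b against every opponent action."""
    return all(PAYOFF[(a, opp)] > PAYOFF[(b, opp)] for opp in ACTIONS)


def weakly_dominates(a, b):
    """a earns at least as much as b against every opponent action."""
    return all(PAYOFF[(a, opp)] >= PAYOFF[(b, opp)] for opp in ACTIONS)


def surviving_strategies():
    """Eliminate every strategy that some other strategy strictly dominates."""
    return [
        b for b in ACTIONS
        if not any(strictly_dominates(a, b) for a in ACTIONS if a != b)
    ]


print(strictly_dominates("defect", "cooperate"))  # True
print(surviving_strategies())                     # ['defect']
```

Because the game is symmetric, the same elimination applies to the other player, leaving mutual defection as the only cell that survives.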
In games of conflict and competition, we are often interested in knowing what is the strategy that one can play that will reduce one’s exposure to some negative event.
For example, this might be a scenario of war, where we have a number of different options as to the route along which we will send food supplies to our troops. Along any of these routes, there is the possibility that the convoy will get bombed. We would then try to choose the option that will minimize the amount of damage that might possibly be caused to the convoy. This is captured in the term minimax. Minimax is a decision rule for minimizing the possible loss in a worst-case scenario. The minimax value of a player is the smallest value that the other players can force the player to receive, without knowing the player’s actions.
A minimax strategy is commonly chosen when a player cannot rely on the other party to keep any agreement or they have in their interest that you gain the minimum payoff, such as in a zero-sum game.
Calculating this value is done with a worst-case approach: for each possible action of the player, we check all possible actions of the other players and determine the worst possible combination of actions, the one that gives the player the smallest value. Then we determine which action the player can take to make sure that this smallest value is the largest possible. Stated in terms of payoffs rather than losses, this rule is usually called maximin, since it maximizes the minimum payoff.
A maximax strategy, by contrast, is one where the player attempts to earn the maximum possible benefit available. This means they will prefer the option that offers the chance of achieving the best possible outcome, even if a highly unfavorable outcome is also possible under that strategy. The maximax strategy, often referred to as the best of the best, is seen as a ‘naive’ and overly optimistic strategy, in that it assumes a highly favorable environment for decision making. In contrast, the minimax strategy is more realistic in that it takes account of the worst-case scenario and prepares for that eventuality.
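The two decision rules contrasted above, guarding against the worst case versus chasing the best case, can be compared on the convoy example. The route names, the enemy options and all payoff numbers (negative values standing for damage to the convoy) are hypothetical, chosen only to illustrate the calculation.

```python
# PAYOFF[route][enemy_action] is our payoff; more negative means more damage.
# All names and numbers are assumed for illustration.
PAYOFF = {
    "coast":  {"bomb_coast": -8, "bomb_valley": -1, "bomb_pass": -2},
    "valley": {"bomb_coast": -3, "bomb_valley": -5, "bomb_pass": -3},
    "pass":   {"bomb_coast": -2, "bomb_valley": -2, "bomb_pass": -9},
}


def worst_case(route):
    """Smallest payoff the opponent can force on us if we take this route."""
    return min(PAYOFF[route].values())


def best_case(route):
    """Largest payoff we could hope for on this route."""
    return max(PAYOFF[route].values())


# Cautious rule: make the worst case as good as possible.
cautious_choice = max(PAYOFF, key=worst_case)

# Optimistic rule: go for the best possible outcome and ignore the risk.
optimistic_choice = max(PAYOFF, key=best_case)

print(cautious_choice, worst_case(cautious_choice))     # valley -5
print(optimistic_choice, best_case(optimistic_choice))  # coast -1
```

With these numbers the cautious rule accepts a guaranteed moderate loss on the valley route, while the optimistic rule picks the coast route, which has the best possible outcome but also exposes the convoy to a much heavier loss.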
In studying the dynamics of cooperation and competition between actors, understanding the structure of the game that is being played is central to understanding the system of interest.
In game theory, a primary distinction is made between those game structures that are cooperative and those that are non-cooperative.
As we will see the fundamental dynamics surrounding the whole game are altered as we go from games whose structure is innately competitive to those games where cooperation is the default position.
A cooperative game is one wherein the agents are able to resort to some institution or third party in order to enable cooperation and optimal results for all.
A game is noncooperative if players cannot form the structures required to enable cooperation.
For example, we might think about two people wishing to make a commercial transaction online. Given two anonymous people interacting without some institution to enable cooperation, there is no reason for either to think that the other will carry through with the transaction as promised.
The seller is incentivised to take the money and not send the item while the buyer is likewise incentivised to take the product without sending the money. In the absence of some cooperative structure that would enable each party to trust the other and thus cooperate, the game would naturally gravitate towards defection and the potentially valuable transaction would not take place.
Thus we can see how in the absence of cooperative mechanisms each player may follow the course that renders them the best payoff without regard for what the other does, or what is optimal for the overall system and this can result in suboptimal outcomes for all.
In non-cooperative games, each agent in the game is assumed to act in their self-interest, and this self-interested agent is the primary unit of analysis within noncooperative games because there is no cooperative structure.
This is in contrast to cooperative game theory that treats groups or subgroups of agents as the unit of analysis and assumes they can achieve certain outcomes among themselves through binding cooperative agreements.
Game theory historically has been very much focused on non-cooperative games and trying to find optimal strategies within such a context. This is likely because non-cooperative games are very much amenable to our standard mathematical framework and thus offer nice closed form solutions.
But it is important to note that the real world is made up of situations that are sometimes cooperative, sometimes non-cooperative, and often involve elements of both.
As previously mentioned, non-cooperative games arise due to a number of factors. Firstly the game may be inherently zero-sum, meaning what one wins the other loses and thus there is an inherent dynamic of competition.
Many sports games are specifically designed to be zero-sum in their structure, so as to create a dynamic of competition. In such a case there is only one prize, and if someone else gets it, you don’t. There is no incentive for cooperation and every incentive for competition and thus the best option is for the actor to focus on maximizing their payoff irrespective of all else.
This is called a strictly competitive game. A strictly competitive game is a game in which the interests of each player are diametrically opposed.
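Whether a two-player game is zero-sum can be read directly off its payoff table: in every cell, one player’s gain must equal the other’s loss. A minimal sketch, using matching pennies as an assumed example of a strictly competitive game and a hypothetical mixed-motive game for contrast:

```python
def is_zero_sum(game):
    """True if the two players' payoffs cancel out in every outcome."""
    return all(row + col == 0 for row, col in game.values())


# Matching pennies: the row player wins on a match, the column player otherwise.
MATCHING_PENNIES = {
    ("heads", "heads"): (1, -1),
    ("heads", "tails"): (-1, 1),
    ("tails", "heads"): (-1, 1),
    ("tails", "tails"): (1, -1),
}

# Hypothetical game in which both players can gain at the same time.
MIXED_MOTIVE = {
    ("cooperate", "cooperate"): (3, 3),
    ("cooperate", "defect"): (0, 5),
    ("defect", "cooperate"): (5, 0),
    ("defect", "defect"): (1, 1),
}

print(is_zero_sum(MATCHING_PENNIES))  # True
print(is_zero_sum(MIXED_MOTIVE))      # False
```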
Likewise, a game may be non-cooperative due to the incapacity to create cooperative structures. Most people, when engaged in a game, will wish to not only optimize their own payoff but will wish to optimize the overall outcome as well.
In general, people do not like the idea of waste or of unfairness and we typically search for some optimal solution given both our own interests and some consideration for the overall organization.
The real world of social interaction is full of all sorts of informal social and cultural institutions designed to enable trust, cooperation and optimal outcomes for all.
Almost as soon as two people start to interact they will start to look for commonalities and shared interests that enable them to develop trust and cooperation.
Thus, non-cooperative games are typically those where the actors can not interact and form the trust required for cooperation. Indeed, there will be certain games that we construct where we specifically want competition and we do that by not allowing the players to cooperate, such as in a competitive market.
Lastly non-cooperative games can be a product of an incapacity to enforce binding contracts. If there is a third party involved to ensure optimal outcomes for the overall organization through sanctions and incentives, this can form a solid basis for cooperation – in the way that a government does by enforcing laws.
This is famously captured in Thomas Hobbes’ conception of the state of nature, where he pondered “What was life like before civil society?” He went on to write “during the time men live without a common power to keep them all in awe, they are in that condition which is called war, and such a war as is of every man against every man.”
In this state, every person has a natural right or liberty to do anything one thinks necessary for preserving one’s own life.
Hobbes’ ideas illustrate vividly how in the absence of a third party to enforce cooperation, competition can prevail.
Non-cooperative games create a specific dynamic within a game, where we are taking the individual and their payoff as the basic unit of analysis. In such a circumstance we do not need to consider what would be best for all under some form of cooperation, because cooperation is not possible within the context.
We are solely interested in how the individuals will act.
The question is how they should act to optimize their own payoff and, given the assumption that both are performing this optimization, what will be a stable solution to the game.
Given these assumptions, both players should search for a strategy that optimizes their payoff, and where the players’ strategies meet we should have a stable outcome, one that we should be able to predict will occur.
This stable outcome is what we call an equilibrium.
Equilibrium, in the general sense, means a state in which opposing forces are balanced, thus creating a point of stability and stasis.
When we see a ball at the bottom of a bowl it is in a state of equilibrium, because if we put it anywhere else in the bowl the force of gravity would act on it to pull it back to this static point. This is the same for the actors in a non-cooperative game: because they are both trying to optimize their payoff, they will both naturally gravitate towards the strategy that gives them the highest payoff.
But because their payoff is dependent on what strategy the other chooses, and because they cannot depend upon cooperation between them, they have to choose the best strategy assuming that the other will work to optimize their own payoff without cooperating.
This point of equilibrium in a game is called the Nash equilibrium after the famous mathematician John Nash.
In game theory, the Nash equilibrium is a solution concept of a non-cooperative game involving two or more players in which each player is assumed to know the equilibrium strategies of the other players, and no player has anything to gain by changing only his or her own strategy. If each player has chosen a strategy and no player can benefit by changing strategies while the other players keep theirs unchanged, then the current set of strategy choices and the corresponding payoffs constitute a Nash equilibrium.
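For readers who prefer notation, the definition above can be written compactly. The symbols here are notational assumptions introduced for illustration and do not appear in the original text: $S_i$ is player $i$'s set of strategies, $u_i$ is that player's payoff function, and $s_{-i}$ denotes the strategies chosen by everyone other than $i$. A strategy profile $s^*$ is then a Nash equilibrium when

```latex
u_i(s_i^*, s_{-i}^*) \;\ge\; u_i(s_i, s_{-i}^*)
\quad \text{for every player } i \text{ and every } s_i \in S_i .
```

In words: holding everyone else's choice fixed, no player can raise their own payoff by switching strategies.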
The Nash equilibrium is one of the foundational concepts in game theory.
The basic intuition of the Nash equilibrium is in predicting what others will do given their self-interest only and then choosing your optimal strategy given that assumption.
Nash equilibrium is a point where all players are doing their best given the absence of cooperation. It is like a law that no one would want to break on their own, even in the absence of some effective overall structure for coordination.
Nash equilibrium is best illustrated through the prisoner’s dilemma game.
The prisoner’s dilemma game is a classic two player game that is often used to present the concept of Nash equilibrium in a payoff matrix form.
Conceive of two prisoners detained in separate cells, interrogated simultaneously and offered deals in the form of lighter jail sentences for betraying the other criminal. They have the option to “cooperate” with the other prisoner by not telling on them, or “defect” by betraying the other.
However, if both players defect, then they will both serve a longer sentence than if neither said anything. Lower jail sentences are here interpreted as higher payoffs.
The prisoner’s dilemma has a matrix similar to the one depicted for the coordination game, but the maximum reward for each player is obtained only when the players’ decisions are different. Each player improves their own situation by switching from “cooperating” to “defecting”, given the knowledge that the other player’s best decision is to “defect”. The prisoner’s dilemma thus has a single Nash equilibrium, in which both players choose to defect.
What has long made this an interesting case to study is the fact that this scenario is globally inferior to “both cooperating”. That is, both players would be better off if they both chose to “cooperate” instead of both choosing to defect. However, each player could improve their own situation by breaking the mutual cooperation, no matter how the other player changes their decision.
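The reasoning above can be checked mechanically. The following is a minimal sketch, assuming an illustrative payoff table: the jail terms (written as negative numbers so that shorter sentences are higher payoffs) and the names `PAYOFFS`, `is_nash` and `pure_nash_equilibria` are hypothetical and not taken from the text. It tests every pair of choices and keeps those where neither prisoner gains by changing only their own choice.

```python
from itertools import product

COOPERATE, DEFECT = "cooperate", "defect"
STRATEGIES = (COOPERATE, DEFECT)

# Illustrative payoffs only (assumed, not from the text): negative years in
# jail. Each entry is (first prisoner's payoff, second prisoner's payoff).
PAYOFFS = {
    (COOPERATE, COOPERATE): (-1, -1),
    (COOPERATE, DEFECT): (-3, 0),
    (DEFECT, COOPERATE): (0, -3),
    (DEFECT, DEFECT): (-2, -2),
}


def is_nash(first, second):
    """True if neither player gains by changing only their own strategy."""
    first_payoff, second_payoff = PAYOFFS[(first, second)]
    first_ok = all(PAYOFFS[(alt, second)][0] <= first_payoff for alt in STRATEGIES)
    second_ok = all(PAYOFFS[(first, alt)][1] <= second_payoff for alt in STRATEGIES)
    return first_ok and second_ok


def pure_nash_equilibria():
    """List every pure-strategy pair that passes the Nash check."""
    return [pair for pair in product(STRATEGIES, repeat=2) if is_nash(*pair)]


print(pure_nash_equilibria())  # [('defect', 'defect')]
```

With these assumed numbers, mutual defection is the only pair that survives the check, even though mutual cooperation gives both prisoners a shorter sentence.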
The central aim of non-cooperative game theory, then, is to predict people’s actions within a game by finding the Nash equilibria and assuming that people will play them because that is their best option.
It is then legitimate for us to ask: does equilibrium analysis give us any predictive capacity over what happens in the real world? Often the outcome of experiments is not the equilibrium predicted by the theory. This is mainly because people do not reason through the game in a fully logically consistent fashion.
Equilibrium is a point where everyone has figured out what everyone else will do; thus, behaviorally, it often does not predict what people will do the first time they play the game.
Equilibrium is better interpreted as what will happen over a number of iterations within a non-cooperative game, as players come to better understand the game and how to reason through it.
Similar to putting a ball in a bowl, it takes time before play arrives at an equilibrium, and this is what is seen in game experiments: they tend over time towards the equilibrium.
For example, in one game, people are asked to choose a number between 0 and 100, with the winner being the person whose guess is closest to 2/3 of the average figure proposed by others.
So everyone is being asked to guess a bit below the average number proposed.
In this game, only a small percentage of people choose the equilibrium point, which is zero, and because the other players did not act rationally, those who chose zero turned out to be wrong.
In many ways, then, choosing this equilibrium as a prediction of what would happen is not a good option, and observed behavior clearly diverges dramatically from what the theory tells us.
However, over time, as the game is iterated, the numbers chosen by people do move towards the equilibrium. Thus the equilibrium tells us something about the statistical averages of the system, but not very much about how it will behave in the real world in the first iteration of the game.
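The drift towards zero can be illustrated with a small simulation. This is only a sketch under a strong simplifying assumption that is not from the text: after each round, every player naively guesses 2/3 of the previous round's average, and the average includes the player's own guess. The function name `simulate_two_thirds_game` and the starting guesses are hypothetical.

```python
def simulate_two_thirds_game(initial_guesses, rounds):
    """Return the average guess in each round under a naive learning rule.

    Assumption (not from the text): after every round, each player simply
    guesses 2/3 of the previous round's average.
    """
    guesses = list(initial_guesses)
    averages = []
    for _ in range(rounds):
        average = sum(guesses) / len(guesses)
        averages.append(average)
        target = (2 / 3) * average
        # Every player adopts the previous round's winning target.
        guesses = [target] * len(guesses)
    return averages


averages = simulate_two_thirds_game([10, 30, 50, 70, 90], rounds=10)
for round_number, average in enumerate(averages, start=1):
    print(f"round {round_number}: average guess = {average:.2f}")
```

Under this rule the average shrinks by a factor of 2/3 each round, so it approaches the equilibrium of zero without reaching it in any finite number of rounds. Real players learn less uniformly, so the sketch shows only the direction of the drift, not its speed.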