Game Theory 11: Evolutionary Game Theory

EVOLUTIONARY GAME THEORY

Classical game theory was developed during the mid-20th century primarily for application in economics and political science. But in the 1970s a number of biologists started to recognize how similar the games being studied were to the interactions between animals within ecosystems. Game theory then quickly became a hot topic in biology as researchers found it relevant to all sorts of animal and microbial interactions, from the feeding of bats to the territorial defense of stickleback fish.

Originally, evolutionary game theory was simply the application of game theory to evolving populations in biology, asking how cooperative systems could have evolved over time from the various strategies that biological creatures might have adopted. However, the development of evolutionary game theory has produced a theory that holds great promise as a general theory of games.

More recently, evolutionary game theory has become of increased interest to economists, sociologists, anthropologists and social scientists in general, as well as philosophers. In this video we will talk about this more general application of evolutionary game theory.

Whereas the game theory that we have been talking about so far has been focused on static strategies, that is to say, strategies that do not change over time, evolutionary game theory differs from classical game theory in focusing more on the dynamics of strategy change. Here we are asking how strategies evolve over time and which kind of dynamic strategies are most successful in this evolutionary process.

EVOLUTION

One of the interesting differences between evolutionary game theory and standard game theory is that the evolutionary version does not require players to act rationally.

When we talk about biological cells or ants, we know that they do not sit in front of a payoff matrix and ask themselves which move gives the best payoff. In evolutionary game theory, natural selection does this for us.

So if we have a group of cooperators and defectors who randomly meet each other, the average payoff for the defectors is higher than that for the cooperators, and therefore they reproduce better. Payoffs in evolutionary biology correspond to reproductive success. So after some time evolution will have favored defectors to the point where all of the cooperators are extinct.
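The takeover by defectors described above can be sketched with a simple replicator-style update, in which a strategy's share of the population grows when its average payoff is above the population mean. This is only a minimal sketch: the payoff values, the step size, and the function names `step` and `simulate` are assumptions chosen for illustration, not something specified in this course.

```python
# Illustrative one-shot Prisoner's Dilemma payoffs (assumed values):
# T (temptation) > R (reward) > P (punishment) > S (sucker's payoff).
T, R, P, S = 5.0, 3.0, 1.0, 0.0


def step(x, dt=0.1):
    """Advance the cooperator share x by one discrete replicator step."""
    fit_c = x * R + (1 - x) * S          # average payoff of a cooperator
    fit_d = x * T + (1 - x) * P          # average payoff of a defector
    mean = x * fit_c + (1 - x) * fit_d   # average payoff in the population
    # Payoff is treated as reproductive success: above-average strategies grow.
    return x + dt * x * (fit_c - mean)


def simulate(x0=0.9, steps=500):
    """Return the trajectory of the cooperator share, starting from x0."""
    xs = [x0]
    for _ in range(steps):
        xs.append(step(xs[-1]))
    return xs


if __name__ == "__main__":
    xs = simulate()
    print(f"cooperator share: start {xs[0]:.2f}, end {xs[-1]:.4f}")
```

Even when cooperators start as 90% of the population, their share shrinks at every step under these assumed payoffs, because a defector earns more than a cooperator against any mix of opponents.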

The basic logic is that, for something to survive the course of time, it must be an optimal strategy or else any other strategy that is more effective will eventually come to dominate the population.

Traditionally, the story of evolution is told as one of competition, and there is certainly plenty of this. But there is also mutualism, where organisms and people manage to work together cooperatively and survive in the face of defectors. Many research papers have been written on this topic of how cooperation could evolve in the face of such an evolutionary dynamic.

The general question of interest in evolutionary game theory is how patterns of cooperation evolve, and what the optimal strategies are in a game that evolves over time.

The basic mechanism that underlies the evolution of cooperation is the interdependency between acts over time.

In a single-shot game, it makes sense to always defect, but with repeated interaction, cooperation becomes much more viable. If the game is repeated, it is no longer the case that strict defection is the best option.

If the prisoner’s dilemma situation is repeated it allows non-cooperation to be punished more, and cooperation to be rewarded more, than the single-shot version of the problem would suggest. We can understand this better by looking at a number of experiments that were done to investigate this dynamic.

EXPERIMENTS

In the late seventies, the political scientist Robert Axelrod ran a number of highly influential computer experiments asking what makes a good strategy for playing a repeated Prisoner’s Dilemma. Axelrod asked various researchers to submit computer algorithms to a competition to see which algorithms would fare best against each other. Computer models of the evolution of cooperation showed that indiscriminate cooperators almost always end up losing against defectors, who accept helpful acts from others but do not reciprocate. People who are cooperative and helpful indiscriminately all of the time will end up getting taken advantage of by others. However, a population of pure defectors will also lose out on the possible rewards of cooperation that would give everyone higher payoffs.

Many strategies have been tested; the most competitive strategies are generally cooperative, with a retaliatory response held in reserve for when it is necessary.

The most famous and one of the most successful of these is Tit for Tat, a very simple algorithm of just three rules: I start with cooperation; if you cooperate, then I will cooperate in the next round; if you defect, then I will defect in the next round. Computer tournaments in which different strategies were pitted against each other showed Tit for Tat to be the most successful strategy in social dilemmas.

Tit for Tat is a common strategy in real-world social dilemmas because it is nice but firm: it makes cooperation a possibility but is also quick to reprimand. It is a strategy that can be found naturally in everything from international trade policies to people borrowing and lending money. And in repeated interactions cooperation can emerge when people adopt a Tit for Tat strategy.
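The three rules of Tit for Tat can be written down in a few lines. The sketch below is not Axelrod's tournament code; the payoff table, the round count, and the helper names (`play`, `always_defect`, `always_cooperate`) are assumptions made for illustration.

```python
# Assumed illustrative payoffs: (my move, their move) -> my payoff.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}


def tit_for_tat(opponent_history):
    """Cooperate first, then copy the opponent's previous move."""
    return "C" if not opponent_history else opponent_history[-1]


def always_defect(opponent_history):
    """Defect no matter what the opponent has done."""
    return "D"


def always_cooperate(opponent_history):
    """Cooperate no matter what the opponent has done."""
    return "C"


def play(strategy_a, strategy_b, rounds=10):
    """Play a repeated game and return the two total scores."""
    moves_a, moves_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        a = strategy_a(moves_b)   # each strategy sees only the other's past moves
        b = strategy_b(moves_a)
        moves_a.append(a)
        moves_b.append(b)
        score_a += PAYOFF[(a, b)]
        score_b += PAYOFF[(b, a)]
    return score_a, score_b


if __name__ == "__main__":
    print(play(tit_for_tat, tit_for_tat))         # (30, 30)
    print(play(tit_for_tat, always_defect))       # (9, 14)
    print(play(always_cooperate, always_defect))  # (0, 50)
```

Against a defector, Tit for Tat is exploited only once and then retaliates, whereas an unconditional cooperator is exploited in every round. Against itself it earns the full reward of mutual cooperation.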

To go beyond Tit for Tat, researchers started to use computers to simulate the process of evolution. Instead of people submitting solutions the computer itself generated mutations and selected from them with the researchers recording and analyzing the results.

From these experiments, they found that if the players play randomly, the winners are those that always defect. But once everyone has come to play defect strategies, if a few people play Tit for Tat strategies, a small cluster can form in which they get a good payoff among themselves.

Evolutionary selection can then start to favor them and they do not get exploited by all the defectors because they immediately switch to defect in retaliation.
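To see why a small cluster can be enough, we can compare the payoff of a Tit for Tat player who meets other Tit for Tat players in a share p of its matches with the payoff of a defector who meets almost only defectors. This is a back-of-the-envelope sketch under assumed payoffs and an assumed fixed number of rounds per match; the function names are hypothetical.

```python
from fractions import Fraction

T, R, P, S = 5, 3, 1, 0   # assumed illustrative payoffs


def tft_cluster_payoff(p, rounds):
    """Expected match payoff of Tit for Tat when a share p of its matches are within the cluster."""
    tft_vs_tft = rounds * R              # mutual cooperation in every round
    tft_vs_alld = S + (rounds - 1) * P   # exploited once, then mutual defection
    return p * tft_vs_tft + (1 - p) * tft_vs_alld


def defector_payoff(rounds):
    """Match payoff of a defector who meets (almost) only other defectors."""
    return rounds * P


def min_cluster_share(rounds):
    """Share p of within-cluster matches at which Tit for Tat exactly breaks even."""
    tft_vs_tft = rounds * R
    tft_vs_alld = S + (rounds - 1) * P
    return Fraction(defector_payoff(rounds) - tft_vs_alld, tft_vs_tft - tft_vs_alld)


if __name__ == "__main__":
    print(min_cluster_share(10))   # 1/21
```

With ten rounds per match and these numbers, Tit for Tat players do better than the surrounding defectors as soon as more than about 5% of their matches are with each other.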

But the Tit for Tat strategy did not last long in this setting, as a new solution emerged in this context. This strategy was a more forgiving mutant of Tit for Tat called Generous Tit for Tat.

Generous Tit for Tat is an algorithm that starts with cooperation and then will reciprocate cooperation from others, but if the other defects it will defect with some probability. Thus it uses probability to enable the quality of forgiveness. It cooperates when others do but when they defect there is still some probability that it will continue to cooperate. This is a random decision so it is not possible for others to predict when it will continue to cooperate.

It turns out that this forgiving strategy is optimal in environments where there is some degree of noise in communications, as is characteristic of real-world environments.

In the real world, we often do not know for certain if our partner cheated or if someone really meant to say what they said, and these errors have to be compensated for by some degree of forgiveness. In a world of errors in action and perception, such a strategy can be a Nash equilibrium and evolutionarily stable. The more beneficial cooperation is, the more forgiving Generous Tit for Tat can be, while still resisting invasion by defectors.
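The effect of noise can be illustrated with a small simulation in which each intended move is occasionally flipped by mistake. The sketch below pairs a strategy with a copy of itself and compares plain Tit for Tat (forgiveness 0) with a generous variant. The noise level, the forgiveness probability of one third, the payoffs, and all function names are assumptions for illustration only.

```python
import random

# Assumed illustrative payoffs: (my move, their move) -> my payoff.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}


def make_generous_tft(forgiveness):
    """Build a strategy that forgives a defection with the given probability."""
    def strategy(last_seen, rng):
        if last_seen is None or last_seen == "C":
            return "C"                  # start nicely and reciprocate cooperation
        return "C" if rng.random() < forgiveness else "D"
    return strategy


def average_payoff(strategy, noise=0.05, rounds=20000, seed=0):
    """Average per-round payoff when two copies of a strategy play with execution errors."""
    rng = random.Random(seed)
    last_a = last_b = None
    total = 0
    for _ in range(rounds):
        a = strategy(last_b, rng)
        b = strategy(last_a, rng)
        # Noise: each intended move is flipped with a small probability.
        if rng.random() < noise:
            a = "D" if a == "C" else "C"
        if rng.random() < noise:
            b = "D" if b == "C" else "C"
        total += PAYOFF[(a, b)] + PAYOFF[(b, a)]
        last_a, last_b = a, b
    return total / (2 * rounds)


tit_for_tat = make_generous_tft(0.0)   # never forgives
generous = make_generous_tft(1 / 3)    # forgives one defection in three (assumed value)

if __name__ == "__main__":
    print("Tit for Tat:         ", round(average_payoff(tit_for_tat), 2))
    print("Generous Tit for Tat:", round(average_payoff(generous), 2))
```

Under these assumptions a single mistake sends two plain Tit for Tat players into a long echo of retaliation, while the generous variant breaks the cycle and returns to mutual cooperation, so it ends up with the higher average payoff.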

The extraordinary thing that now happens is that once everyone has moved towards playing Generous Tit for Tat, cooperation becomes a much stronger attractor and at this stage, players can now play an unconditional cooperative strategy without having any disadvantage.

In a world of Generous Tit for Tat, there is no longer a need for any other actions and thus unconditional cooperators survive. In order for a strategy to be evolutionarily stable, it must have the property that if almost every member of the population follows it, no mutants can successfully invade – where a mutant is an individual who adopts a novel strategy.

In many situations, cooperation is favored and it even benefits an individual to forgive an occasional defection, but cooperative societies are always unstable because mutants inclined to defect can upset any balance. And this is the downfall of the cooperative strategy. What happens next is somewhat predictable. In a world where everyone is cooperating, unconditional defection is an optimal strategy once it takes hold.

Thus we can see a dynamic cyclical process, as higher forms of cooperation arise and then collapse. In many ways then this reflects what we see in the real world of economies and empires rising and falling as institutional structures for cooperation are formed, mature and eventually decline.

INDIRECT RECIPROCITY

These experiments describe the evolution of systems of cooperation through direct interaction. Many of our interactions are repeated with people we have dealt with before, so we have built up an understanding of their capacity for reciprocity. However, in large societies we have to interact with many people whom we have not met before, and it may only be a one-off interaction.

Experiments have shown that people help those who have helped others and have shown reciprocity in the past, and that this form of indirect reciprocity has a higher payoff in the end. Reputation systems are what allow for the evolution of cooperation by indirect reciprocity. Natural selection favors strategies that base the decision to help on the reputation of the recipient. The idea is that you interact with others, that interaction is seen, and people note whether you acted cooperatively or non-cooperatively. That information is then circulated so that others learn about your behavior. Direct reciprocity is where I help you and you help me; indirect reciprocity is where I help you and then somebody helps me because I now have a reputation for cooperating.

The result is the formation of reputation: when you cooperate, that helps your reputation; when you defect, it reduces it. That reputation then follows you around and is used as the basis for your interactions with others.
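A toy model can make this mechanism concrete. In the sketch below, random pairs of strangers meet once as donor and recipient. "Discriminators" help anyone with a good reputation, and "defectors" never help. The population sizes, the benefit and cost values, and the rule that refusing a recipient with a bad reputation does not hurt the donor are all simplifying assumptions for illustration, not a model taken from the research mentioned above.

```python
import random

BENEFIT, COST = 3, 1   # assumed: helping costs the donor 1 and gives the recipient 3


def simulate(n_discriminators=8, n_defectors=2, rounds=2000, seed=1):
    """Return the average total payoff of each kind of player."""
    rng = random.Random(seed)
    kinds = ["discriminator"] * n_discriminators + ["defector"] * n_defectors
    good = [True] * len(kinds)    # everyone starts with a good reputation
    payoff = [0] * len(kinds)
    for _ in range(rounds):
        donor, recipient = rng.sample(range(len(kinds)), 2)
        helps = kinds[donor] == "discriminator" and good[recipient]
        if helps:
            payoff[donor] -= COST
            payoff[recipient] += BENEFIT
            good[donor] = True
        elif good[recipient]:
            # Refusing someone in good standing is observed and damages reputation.
            good[donor] = False
        # Refusing a recipient with a bad reputation leaves the donor's reputation unchanged.
    averages = {}
    for kind in set(kinds):
        scores = [p for p, k in zip(payoff, kinds) if k == kind]
        averages[kind] = sum(scores) / len(scores)
    return averages


if __name__ == "__main__":
    print(simulate())
```

In this setup a defector can collect help only until its first refusal is observed. After that its reputation is bad and nobody helps it again, while discriminators keep exchanging help with each other.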

Thus reputation forms a system for the evolution of cooperation in larger societies where people may interact frequently with people that they may not know personally. But because of various reputation systems, they are able to identify those who are cooperative and enter into mutually beneficial reciprocal relations.

The more sophisticated and secure these reputation systems, the greater the capacity for cooperative organizations. We can create large systems wherein we know who to cooperate with and thus can be cooperative ourselves, potentially creating a successful community.

But of course, as a society gets bigger we have to form more complex institutions to enable functional reputation systems. In this way we have gone from small communities where local gossip sufficed to know everyone’s capacity for cooperation, to large modern industrial societies where centralized organizations vouch for people’s reputation, to today’s burgeoning global reputation systems based on information technology and mediated through the internet.

Research shows that cooperators create better opportunities for themselves than non-cooperators: They are selectively preferred as collaborative partners, romantic partners, and group leaders. This only occurs however when people’s social dilemma choices are seen and recorded by others in some way.

However, this kind of indirect reciprocity is cognitively complex; no other creature has mastered it to even a fraction of the degree that humans have. Games of indirect reciprocity lead to the evolution of social intelligence and ever more sophisticated means of communication, and of the social and cultural institutions that are characteristic of human civilization.

The basic problem of the evolution of cooperation is that nice guys get taken advantage of, and so there must be some form of supporting structure to enable cooperation.

More than any other primate species, humans have overcome this problem through a variety of mechanisms, such as reciprocating cooperative acts, forming reputations of others and the self as cooperators, and caring about these reputations.

We create prosocial norms about good behavior that everyone in the group will enforce on others through disapproval, if not punishment, and will enforce on themselves through feelings of guilt and shame. All of which form the fabric of our sociocultural institutions that enable advanced forms of cooperation.

Game Theory 10: Cooperative Structures

COOPERATIVE STRUCTURES

Throughout this section of the course, we have been talking about cooperation and different aspects of the social dilemma. In this video, we will look at various approaches that have been identified for fostering the cooperation required to overcome this core constraint.

Our capacity to solve the social dilemma in various ways is a defining factor in the strength of individual relationships, social organizations, economies, and society at large and is thus a topic that is of great interest to many.

Depletion of natural resources, pollution, and intercultural conflict can all be characterized as examples of social dilemmas.

Social dilemmas are challenging because acting in one’s immediate self-interest is tempting to everyone involved, even though everybody benefits from acting in the longer-term collective interest. Thus some form of cooperative institutional infrastructure is required to enable the cooperation required for sustained success.

The empirical fact that subjects in most societies contribute anything at all in the simple public goods game that we looked at previously is a challenge for game theory to explain via motives of total self-interest. But as we have noted, one of the defining features of human beings is their extraordinarily high level of cooperative behavior. Cooperation is a massive resource for advancing individual and group capabilities, and over the course of thousands of years, we have evolved complex networks for collaboration and cooperation, which we can call institutions of various forms.

These institutional structures help us to solve the many different forms of the tragedy of the commons that we encounter within large societies.

As we have touched upon previously, the central issue of the tragedy of the commons is externalities. That is to say, the actions that the individual takes have costs that the person does not fully bear, as they are externalized to the overall organization. If there are too many negative externalities and not enough positive externalities, the organization will degrade over time. The central issue in solving the tragedy of the commons is then in reconnecting the costs of the individual’s actions on the whole with the costs that they pay. When the individual always pays the full costs of their actions, there is no social dilemma and we have a self-sustaining organization.

This may sound simple in the abstract, but in practice, it is not simple at all, and this is one reason why we have such a complex array of economic and social institutions. How we approach doing this though, depends on the degree of interconnectivity and interdependence between the players in the game.

INTERDEPENDENCE

When there is low interconnectivity, then there will likely be low interdependence, which means a high probability for negative correlations between actors.

When actors are independent then they can do things that affect the other without that effect returning to themselves.

For example, if I live in Germany and pollute the atmosphere so that there is acid rain in Sweden, as long as I never go to Sweden then what happens there does not affect me too much and this negative correlation can exist.

Now if we turn up the interconnectivity and interdependence, this changes the dynamic. Say I have business partners in Sweden and happen to go on holiday there also. Due to this interconnectivity and interdependence there is a much greater possibility for a positive correlation between my experience and what happens in Sweden. This interconnectivity and interdependence means that I increasingly have to factor my negative externalities into my cost-benefit equation.

The central importance of interdependence as a parameter in cooperation can be simply seen in the way that people cooperate more with those that they are closely connected to; more than with those that are of a different group, culture or society that they are not connected with.

Thus, how we go about solving the social dilemma depends on the degree of interconnectivity and interdependence within the dynamic. At a low level, cooperative structures have to be imposed through regulation, while at a high level this is no longer necessary, as the interconnectivity and interdependence can be used to create self-sustaining cooperative organizations. This is illustrated by how different cooperative structures have evolved within society: those within small, closely interdependent groups like the family, and those that have formed for larger society, which is composed of many groups that are more independent.

This is to a large extent part of what has happened as we have gone from small, pre-modern societies to large modern societies. As the scale of the social systems that we are engaged in has increased, the interconnectivity and interdependence between any two random members has decreased, because they are farther apart in the network. This has disintegrated traditional cooperative institutions that are based on local interactions and interdependencies.

In the absence of tools for interconnecting everyone within a large national society, we have had to create the formal centralized regulatory institutions of the nation state.

And of course, with the rise of information technology and globalization, this is once again changing as we create social interdependencies that span the entire planet.

REGULATION

The most manifest and obvious form for enabling cooperation is regulation and rules that are imposed on the social system by a third party to ensure behavior that is of benefit to the group. The aspect of cooperation examined in many experimental games is cooperation that occurs when people follow rules limiting the exercise of their self-interested motivations.

People might want to take from a shop without paying, but they are required to abide by the law; they may want to fish in a lake, but they limit what they catch to the quantity specified in a permit.

They buy a fuel-efficient car because of regulation taxing the sale of inefficient cars.

In all of these situations, people are refraining from engaging in behavior that would give them immediate benefit but is against the welfare of the group.

Regulation involves limiting undesirable behavior.

This method for enabling cooperation through regulation and rule adherence is deeply intuitive to us and often the default assumption as to how we might achieve cooperation.

The central aim of regulation is to connect the individual’s externalities with the costs and benefits they pay, by imposing extra costs on them for certain negative externalities while providing them with subsidies and payments for activities that generate positive externalities.

This form of solution for enduring cooperation through an external third party that imposes sanctions or rewards can be very effective in situations of independence between members.

For example, this would be a good solution to the prisoner’s dilemma, where the members cannot communicate with each other and are otherwise independent. By forming a third party that could impose sanctions on them, we could change the payoffs in the game to enable cooperative outcomes.
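How a sanction changes the game can be shown by subtracting a fine from every defection and checking which move is then the best reply to each of the other player's moves. The payoff numbers, the size of the fine, and the helper names are assumptions for illustration.

```python
# Assumed illustrative payoffs: (my move, their move) -> my payoff.
BASE = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}


def with_fine(payoffs, fine):
    """Return the payoffs after a third party charges `fine` for every defection."""
    return {
        (mine, theirs): value - (fine if mine == "D" else 0)
        for (mine, theirs), value in payoffs.items()
    }


def best_reply(payoffs, their_move):
    """Return my payoff-maximizing move against a given move of the other player."""
    return max("CD", key=lambda mine: payoffs[(mine, their_move)])


def dominant_move(payoffs):
    """Return the move that is a best reply to both C and D, or None if there is none."""
    replies = {best_reply(payoffs, theirs) for theirs in "CD"}
    return replies.pop() if len(replies) == 1 else None


if __name__ == "__main__":
    print(dominant_move(BASE))                # D
    print(dominant_move(with_fine(BASE, 3)))  # C
```

With these numbers the fine has to exceed the gain from defecting against a cooperator (5 - 3 = 2) before cooperation becomes the better choice regardless of what the other player does.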

Although the regulatory approach is simple and straightforward, the development and maintenance of this external organization have overhead costs. It is also prone to corruption and has other limitations to it.

Studies have been conducted into the success of the establishment of a leader or authority to manage a social dilemma. Experimental studies on commons dilemmas show that over-harvesting groups are more willing to appoint a leader to look after the common resource.

There is a preference for a democratically elected prototypical leader with limited power especially when people’s group ties are strong.

When ties are weak, groups prefer a stronger leader with a coercive power base.

The question remains whether authorities can be trusted in governing social dilemmas, and field research shows that legitimacy and fair procedures are extremely important in citizens’ willingness to accept the authority.

Furthermore, the formal governance structures of a police force, army, and judicial system will fail to operate unless people are willing to pay taxes to support them. This raises the question of whether many people want to contribute to these institutions.

Experimental research suggests that particularly low-trust individuals are willing to invest money in punishment systems.

The political economist Elinor Ostrom was awarded the Nobel Memorial Prize in Economic Sciences for her studies of various communities around the world and how they managed to develop diverse institutional arrangements for managing natural resources, thus avoiding ecosystem collapse. She illustrated how common resources can be managed successfully by the people who use them rather than by governments or private companies. In an interview talking about this centralized regulatory approach, she had this to say about it: “for some simple situations that theory works and we should keep it for the right situation, but there are so many other rich solutions.”

INTERDEPENDENCE

When interconnectivity between members within a game increases, so typically does interdependence and this changes the nature of the game.

Externalities are things that we can put external to our domain of value and interest, but interconnectivity reduces the capacity to do this.

One good example of this is the warning signs on the side of cigarette packets that make you aware of the negative externalities of smoking on your body. They are trying to connect you with the negative externality that you are creating, so that you recognize your interdependence and factor it into the equation under which you are making your decision to smoke.

Thus we can see that an externality is not necessarily something that is far away; it is simply whatever you exclude from your value system, so that a reduction in its value causes no reduction in your payoff. But connectivity takes this barrier down, requiring us to recognize the value of the other entity and factor it into our decision. This connectivity can be of many different kinds.

Communication is a form of connection that can enable positive interdependence and there is a robust finding in the social dilemma literature that cooperation increases when people are given a chance to talk to each other.

Cooperation generally declines when group size increases. In larger groups, people often feel less responsible for the common good, as they are more removed from it and the other people with whom they share it.

Thus we can see what is really at the core of the social dilemma is the question of what people value and how far that value system extends.

Wherever we stop seeing something as part of us or our group, that is where negative externalities accumulate and start to give us the social dilemma. However, by building further connections so that people recognize their interdependence with what they previously saw as external, they will start to factor it into the value system under which they are making their choices and reduce their negative externalities. From this perspective, the issue is really one of value and externalities.

Connectivity can change that equation, working to internalize the externalities. Connectivity though is just an enabling infrastructure, one still has to build the channels of communication and structures that enable positive interdependence.

Building systems of cooperation in such a context means enabling ongoing interaction, with identifiable others, with some knowledge of previous behavior, lists of reputations that are durable and searchable and accessible, feedback mechanisms, transparency etc. These are all means of fostering positive interdependence once interconnectivity is present and through them, self-regulating and sustainable systems of cooperation can be formed.

If we think back to the public goods game, if the amount contributed is not hidden, then players tend to contribute significantly more. This is simply creating a transparent system where there is feedback.

As another example, we could think of eBay. eBay is really a huge social dilemma game. You would not send money before receiving the item, nor would the other party send the item before receiving the money, so why has eBay succeeded? It is not because eBay is going to throw you into jail if you don’t play nice; it is because of communication, transparency and feedback mechanisms that build positive interdependence.

This interconnectivity that builds positive interdependence is not just in space but also in time. Probably the single biggest difference in the prisoner’s dilemma is whether it is a one-off or a recurring game that is being played. And it is to this topic of games that play out over time that we will turn in the next section.

Game Theory 9: Public Goods Games

The game theoretical version of the social dilemma is called the public goods game. Public goods games are usually employed to model the behavior of groups of individuals achieving a common goal. The public goods game has the same properties as the prisoner’s dilemma game, but describes a public good or a resource from which all may benefit regardless of whether or not they contributed to the good.
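In the form usually used in experiments, each player receives an endowment, chooses how much of it to put into a common pot, and the pot is multiplied and shared equally among all players, contributors or not. The sketch below follows that setup; the endowment of 10, the multiplier of 2, and the function name are assumptions for illustration.

```python
def public_goods_payoffs(contributions, endowment=10, multiplier=2.0):
    """Each player keeps what they did not contribute plus an equal share of the multiplied pot."""
    pot = sum(contributions) * multiplier
    share = pot / len(contributions)
    return [endowment - c + share for c in contributions]


if __name__ == "__main__":
    print(public_goods_payoffs([10, 10, 10, 10]))  # [20.0, 20.0, 20.0, 20.0]
    print(public_goods_payoffs([0, 10, 10, 10]))   # [25.0, 15.0, 15.0, 15.0]
    print(public_goods_payoffs([0, 0, 0, 0]))      # [10.0, 10.0, 10.0, 10.0]
```

The free rider in the second line earns more than the contributors around them, yet if everyone reasons that way the group ends up in the third line, where everybody is worse off than under full contribution. This is the same structure as the prisoner's dilemma, extended to many players.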

Game Theory 8: Social Dilemma

SOCIAL DILEMMA

In the past ten to twenty years interest in social dilemmas has grown dramatically within many domains – particularly those resulting from overpopulation, resource depletion, and pollution.

The study of social dilemmas is one of the most interdisciplinary research fields, with the participation of researchers from anthropology, biology, economics, mathematics, neuroscience, political science, and psychology among others.

The social dilemma captures the core dynamics within groups requiring collective action, where there is a conflict between an individual’s immediate personal or selfish interests and the actions that maximize the interests of the group.

At the heart of social dilemmas lies a disjunction between the costs to the individual and the cost to the whole, or benefits to the individual and benefits to the whole.

We call this value that is not factored into the cost-benefit equation of the individual an externality. And it is externalities that create the disjunction between the parts and whole and result in the social dilemma.

As an example of the social dilemma, we can think about the voting process within democratic political systems. In such a system citizens are called upon to make informed decisions about who should manage their country’s government. Many people choose not to vote, but of those who do, in order to make an informed decision, they have to gather and process information about the candidates.

Since each person’s vote is unlikely to affect the outcome of an election, and everyone knows this, there is little incentive for the individual to increase their knowledge about the relevant issues and overcome misinformation and preconceptions. The benefits of collecting and processing the information are diffused to the whole population, but the cost of doing so is carried by the individual. Informed voting is then what we would call a public good.

As another example, we can think of a situation where during winter people in a village are asked to keep their thermostats low to conserve the limited amount of energy available. This would though require them to suffer from the cold without appreciably conserving the fuel supply by their individual sacrifice; yet if all keep their thermostats high, all may run out of fuel and freeze.

What is happening in this game is that there is a positive externality. Some of the value that is being generated by the individual is being externalized to the whole organization. This external value cannot be immediately factored into the cost-benefit analysis of the individual. If the system is simply operated according to this immediate cost-benefit analysis of the individuals, then there will likely be an undersupply of the good.

Equally, we can have negative externalities. The classical examples are air pollution and traffic jams. An important part of the social dilemma is that at any given decision point, individuals receive higher payoffs for making selfish choices than they do for making cooperative choices, regardless of the choices made by those with whom they interact. And everyone involved receives lower payoffs if everyone makes selfish choices than if everyone makes cooperative choices.

What is interesting about the social dilemma is that it is not a feature of the agents, but of the structure of the game. It has nothing to do with the personalities and motivation of the individuals. It is a tragedy because you know what the outcome will be, but you cannot individually do anything to avoid it, given only the self-interests of the individual.
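The two conditions just stated can be checked mechanically for any two-move game. The sketch below assumes a symmetric game described by a table of one player's payoffs; the function name and the example payoff values are illustrative assumptions.

```python
def is_social_dilemma(payoffs):
    """Check the two conditions above; payoffs maps (my move, their move) to my payoff."""
    # Condition 1: the selfish choice pays more whatever the other player does.
    selfish_always_pays = all(
        payoffs[("D", theirs)] > payoffs[("C", theirs)] for theirs in "CD"
    )
    # Condition 2: everyone cooperating beats everyone being selfish.
    all_cooperate_beats_all_defect = payoffs[("C", "C")] > payoffs[("D", "D")]
    return selfish_always_pays and all_cooperate_beats_all_defect


# Assumed example payoff tables.
prisoners_dilemma = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}
stag_hunt = {("C", "C"): 4, ("C", "D"): 0, ("D", "C"): 3, ("D", "D"): 2}

if __name__ == "__main__":
    print(is_social_dilemma(prisoners_dilemma))  # True
    print(is_social_dilemma(stag_hunt))          # False
```

The second game fails the first condition: when the other player cooperates, cooperating pays more than being selfish, so it is a coordination problem rather than a social dilemma in the strict sense used here.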

No one has an individual motive to change their behavior even though everyone will be worse off.

Our traditional tools of non-cooperative game theory, which are focused on the immediate costs and benefits to the individual, only really work when the externalities are limited. As soon as the externalities, whether positive or negative, go above a certain level, we have to think and operate collectively.

Many private goods are rivalrous, meaning they can only be consumed by one agent, and excludable, meaning it is possible to exclude others from their use. This reduces the externalities from the item and thus makes it possible to associate the value of that item with an agent and factor it into their cost-benefit analysis.

Goods are public when they are considered both nonrivalrous and nonexcludable, meaning that there will likely be many externalities and thus they cannot be effectively managed by the immediate cost-benefit analysis of the individual agents. In such a case the social dilemma can arise. When people can benefit from the positive externalities, and produce more of the negative externalities without paying, the result is a macro-level imbalance that leads to the system being rendered unsustainable. As we will discuss further in a coming video, there are essentially two different approaches to solving this imbalance and managing collective action.

The organization can try to create some top-down structure that regulates the system to ensure that those who create negative externalities pay for them and those who create positive externalities get paid for them. This works to reintegrate the externalities into the cost-benefit analysis of the agents and maintain a macro-level balance. This is the traditional approach, the classical example being governments, which use force and incentives to regulate the system towards these ends.

Equally a second approach is to increase the degree of connectivity between the agents and thus the potential for a higher degree of interdependence between them. Given that interdependence means that what happens to one agent also happens to another, as interdependence between agents and between the individual and the whole increases, externalities decrease – because there is nowhere for them to go and agents have to increasingly factor them into their own cost-benefit analysis.

This is a second approach to managing the system and potentially solving the problem.

Externalities are always a function of independence. If two things are one, then there is no possibility for externalities; the more independent they are, the greater the possibility for externalities. Thus all solutions to externalities and the social dilemma will involve creating positive interdependence between the elements in the system, but different approaches will do this in different ways.

We will pick this theme up again in a future video when we go deeper into ways of solving social dilemmas. In the next video, we will look at what game theory can tell us about the social dilemma as we talk about public goods games.

Game Theory 7: Cooperative Games

COOPERATIVE GAMES

Game theory is the study of interdependent interactions between adaptive agents, and when we look at these interactions between agents in the world around us we see elements of both competition and cooperation.

We see the organelles within a biological cell work together to enable its overall functioning. We see organisms within ecosystems forming symbiotic cooperative relations. We see people form families, tribes, cities, and nations all of which involve high levels of cooperation.

Game theory should then be a tool that helps us to understand what actions an agent should take within non-cooperative situations and what outcomes are most likely in such games.

But it should also be a tool that helps us understand how cooperation works. The first thing for us to note is that the dynamics of cooperation are very different from those that we have been studying in games of noncooperation. Cooperative dynamics rewrite the rules of the games people play.

In non-cooperative games like the prisoner’s dilemma, we noted how individual rationality, in fact, led to collective irrationality. We called this a dilemma and lamented the fact that within the non-cooperative framework of self-interested individuals there was nothing we could do about it. However, we as humans do not just look out for our own interests but also those of others, and in so doing we have evolved highly sophisticated means for cooperation.

When we add this new element to the game, that of cooperation, we now have the possibility for the agents to solve this dilemma. The question then turns to how and when do coalitions for cooperation form, where we can achieve both stable and optimal outcomes for the individual and the whole organization.

With non-cooperative games, we are solely focused on the payoffs to the individuals and searching for stable situations. Cooperative game theory, however, adds an extra dimension to this in that we now have to think about the payoff to the whole organization. In such a case we cannot simply look at the actions of the individuals and their payoffs but we have to also look at the positive or negative externalities that these actions may have on the whole system.

COOPERATION

Cooperation is a process by which the components of a system work together to achieve global properties. In other words, individual components that appear independent work together to create a complex whole that is greater than the sum of its parts.

Virtually all of human civilization is a product of our capacity to work cooperatively. Indeed the complex systems that surround us, like our global economy and technologies like a jumbo jet are a testament to our extraordinary capacity for cooperation.

In most animal groups and even our closest relatives in the primate group, competition is the norm, and cooperation occurs largely only among kin, who have common genes and so have a biological incentive to do so, or else among a few individuals who cooperate reciprocally. But humans cooperate with each other in very large groups in a multiplicity of ways.

People risk their lives in war for their countrymen and we make sure that our less fortunate compatriots have enough food and medical care to survive. On a daily basis, we obey all kinds of prosocial norms. And when we do breach some prosocial norm, like not doing our part in a collective enterprise, we feel guilty or ashamed. In general, we are highly sensitive to cooperative behavior.

The evolved human capacity for cooperation is a cultural one that distinguishes us from other creatures with whom we may share up to 98% of our genes. A large-scale study has recently provided some foundation to this hypothesis.

Researchers compared two-year-old children to their nearest primate relatives, chimpanzees and orangutans. They were all given 16 different tasks grouped into two overall categories, one concerning an understanding of the physical world, the other an understanding of the social world. The physical tests were related to space, quantity, and causality, and the social tests concerned capacities for social imitation, communication, and intention reading. The experiment revealed that the children were not more intelligent than the other primates across the board; only with respect to social cognition were they more advanced. At just two years of age, the children already scored about twice as high on this indicator as the other two species. This social capacity enables us to take advantage of the skills and knowledge of others within a social group through cooperation.

The researchers noted that apes did have social cognitive skills, but they were mainly using their social understanding of others within contexts of competition. From this, the researchers proposed that on top of great apes’ skills for social cognition, humans had evolved additional social cognitive capabilities for dynamics of cooperation, which involve greater complexity, but which can ultimately be seen as the foundations of advanced forms of civilization.

Thus they came to understand others as not just intentional goal seeking agents, but also as potential cooperative agents with whom they could work together to produce outcomes that neither could produce alone. This cognitive capacity along with communications enabled us to create the ever more complex social and cultural institutions for cooperation, that today form the foundation of our advanced systems of socio-economic coordination.

The researchers claimed that this distinction between apes and humans can be identified even in the earliest human economies. They noted that apes are individual foragers: they will travel in small groups until they find a food source like a fig tree and then run up and grab the food separately, without collaboratively producing it or sharing it. Humans, however, are collaborative foragers, meaning that traditional forager groups derive most of their daily nutrition from collaborative activities in different forms, such as hunting.

This is not to say that advanced forms of cooperation do not happen within other creatures. We just have to look at ant or bee colonies to see sophisticated coordination. However, these creatures have nowhere near the kind of individual cognitive capacity that apes and humans do, and thus we do not get the same kind of complex dynamic between the individual and the group that is at the heart of human social systems and the study of cooperation.

Dynamics of cooperation require more cognitive capabilities on the behalf of the individual because they are greatly more complex in nature than noncooperative situations.

Whereas non-cooperative dynamics are governed solely by the self-interest of the individual, cooperation involves a new level of organization, that of the group, and a complex dynamic between the individual and the overall group.

This central dynamic within cooperation is captured in the idea of the social dilemma. Social dilemmas are characterized by two properties: The social payoff to each individual for defecting behavior is higher than the payoff for cooperative behavior, regardless of what the other society members do, yet all individuals in the society receive a lower payoff if all defect than if all cooperate. It is a situation where individual rational behavior leads to a situation where everyone is worse off.
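The two properties above can be checked mechanically. The sketch below assumes a simple linear public goods game, which is one common model and not something specified in the text: cooperators put their whole endowment into a pool, the pool is multiplied, and it is shared equally among all players. The function names and default numbers are hypothetical.

```python
def payoff(my_choice, others_cooperating, n_players, endowment=10, multiplier=2.0):
    """Payoff to one player in an assumed linear public goods game.

    my_choice is "C" (contribute the endowment) or "D" (keep it).
    """
    contributors = others_cooperating + (1 if my_choice == "C" else 0)
    share = multiplier * endowment * contributors / n_players
    kept = 0 if my_choice == "C" else endowment
    return kept + share


def is_social_dilemma(n_players, **kw):
    # Property 1: defecting pays more, whatever the others do.
    defect_dominates = all(
        payoff("D", k, n_players, **kw) > payoff("C", k, n_players, **kw)
        for k in range(n_players)
    )
    # Property 2: everyone gets less if all defect than if all cooperate.
    all_cooperate = payoff("C", n_players - 1, n_players, **kw)
    all_defect = payoff("D", 0, n_players, **kw)
    return defect_dominates and all_defect < all_cooperate
```

Under these assumptions a four-player game with a multiplier of 2 is a social dilemma, whereas a multiplier larger than the number of players (cooperation pays individually) or smaller than one (cooperation is wasteful) is not.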

Social dilemmas are of interest to many because they reveal the core tension between the individual and the group, that is engendered in situations of cooperation. At their core, social dilemmas are situations in which self-interest is at odds with collective interests and they can be found in many situations of interdependence; from resource management to relationship development, to international politics, public goods provision and business management. In the next section, we will zoom in to look more closely at the workings of this social dilemma.

Game Theory 6: Pareto Optimality

PARETO OPTIMALITY

In non-cooperative game theory, the focus is on the agents in the game and strategies that optimize their payoffs, which results in some form of equilibrium.

As we can see in the prisoner’s dilemma game the issue arises in that what turns out to be the equilibrium is suboptimal for all the agents when taken as a whole. One way of defining what we mean by suboptimal for all is the idea of Pareto Optimality.

Named after Vilfredo Pareto, Pareto optimality is a measure of efficiency.

Whereas Nash equilibrium is a solution concept for non-cooperative games, Pareto optimality answers a very specific question: can one outcome be improved upon by another without making anyone worse off?

Pareto optimality is a notion of efficiency or optimality for all the members involved.

An outcome of a game is Pareto optimal if there is no other outcome that makes every player at least as well off and at least one player strictly better off. That is to say, a Pareto optimal outcome cannot be improved upon without hurting at least one player.

To illustrate this let’s take the game called the stag hunt, wherein two individuals go out on a hunt. Each can individually choose to hunt a stag or hunt a hare. Each player must choose an action without knowing the choice of the other. If an individual hunts a stag, they must have the cooperation of their partner in order to succeed. An individual can get a hare by themselves, but a hare is worth less than a stag.

In the stag hunt, there is a single outcome that is Pareto efficient, which is that they both hunt stags. With this outcome, both players receive a payoff of three, which is each player’s largest possible payoff for the game. In this case, we cannot switch to any other outcome and make at least one party better off without making anyone worse off. The stag option is here the only Pareto optimal outcome.
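To make the definition concrete, here is a small sketch that tests every outcome of a stag hunt for Pareto optimality. Only the payoff of three for hunting stag together comes from the text; the other payoffs (a hare is worth one, a failed stag hunt is worth zero) are assumed for illustration, as are the function names.

```python
def pareto_dominates(a, b):
    """True if payoffs a leave no one worse off and someone better off than b."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_optimal(outcomes):
    """Return the strategy profiles that no other outcome Pareto-dominates."""
    return [
        profile
        for profile, pay in outcomes.items()
        if not any(pareto_dominates(other, pay) for other in outcomes.values())
    ]


# (player 1 choice, player 2 choice) -> (payoff 1, payoff 2)
stag_hunt = {
    ("stag", "stag"): (3, 3),
    ("stag", "hare"): (0, 1),
    ("hare", "stag"): (1, 0),
    ("hare", "hare"): (1, 1),
}

efficient = pareto_optimal(stag_hunt)
```

With these assumed payoffs, `efficient` contains only the outcome where both hunt the stag, since every other outcome is dominated by it.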

One of the features of a Nash equilibrium is that, in general, it need not correspond to a socially optimal outcome. That is, for a given game it may be possible for all the players to improve their payoffs by collectively agreeing to choose strategies different from the Nash equilibrium. The reason such an agreement does not hold is that some players may choose to deviate from the agreed-upon cooperative strategy after it is made in order to improve their payoffs further at the expense of the group. A Pareto optimal equilibrium describes a social optimum in the sense that no individual player can improve their payoff without making at least one other player worse off. Pareto optimality is not a solution concept, but it can be an important attribute in determining what solution the players should play, or learn to play over time.

This is the interesting thing about the prisoner’s dilemma, that all options are Pareto optimal except for the unique equilibrium, which is for both to defect. This strong contrast between Pareto optimality and Nash equilibrium is what makes the prisoner’s dilemma a central object of study in game theory. The fact that all of the overall efficient outcomes are the ones that do not occur in equilibrium, makes it a classical illustration of the core dynamic between cooperation and competition.
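This contrast can be verified directly. The sketch below uses a conventional set of prisoner’s dilemma payoffs (5, 3, 1 and 0), which are assumed here and not taken from the text, and checks each of the four outcomes for Pareto optimality and for being a Nash equilibrium.

```python
from itertools import product

ACTIONS = ("cooperate", "defect")

# (row choice, column choice) -> (row payoff, column payoff); higher is better.
PD = {
    ("cooperate", "cooperate"): (3, 3),
    ("cooperate", "defect"): (0, 5),
    ("defect", "cooperate"): (5, 0),
    ("defect", "defect"): (1, 1),
}


def is_pareto_optimal(profile, game):
    pay = game[profile]
    for other in game.values():
        no_one_worse = all(o >= p for o, p in zip(other, pay))
        someone_better = any(o > p for o, p in zip(other, pay))
        if no_one_worse and someone_better:
            return False
    return True


def is_nash(profile, game):
    row, col = profile
    # Neither player can gain by changing only their own choice.
    row_ok = all(game[(row, col)][0] >= game[(r, col)][0] for r in ACTIONS)
    col_ok = all(game[(row, col)][1] >= game[(row, c)][1] for c in ACTIONS)
    return row_ok and col_ok


profiles = list(product(ACTIONS, repeat=2))
pareto = {p for p in profiles if is_pareto_optimal(p, PD)}
nash = {p for p in profiles if is_nash(p, PD)}
```

For these payoffs the two sets do not overlap: mutual defection is the only equilibrium and the only outcome that is not Pareto optimal.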

This is a good segue into the next section of the book, where we will be talking about the dynamics of cooperation and looking specifically at the overall outcomes, trying to optimize them instead of just individual payoffs.

Game Theory 5: Solution Concept

SOLUTION CONCEPT

As we talked about in the last video, the central aim in non-cooperative game theory is in trying to find the optimal strategy for agents to play within a game and trying to predict the outcomes of the game by finding points of equilibrium.

This equilibrium is called the Nash Equilibrium and is considered the best option given the absence of frameworks to support cooperation.

This is what we call a solution concept.

In game theory, a solution concept is a model or rule for predicting how a game will be played. These predictions are called “solutions”, and describe which strategies will be adopted by players and, therefore, the results of the game.

The most commonly used solution concepts are equilibrium concepts.

Here we look for a set of choices, one for each player, such that each person’s strategy is best for them when all others are playing their stipulated strategies.

In other words, each picks their best response to what the others do.

In game theory the term best response refers to the strategy (or strategies) which produce the most favorable outcome for a player, taking other players’ strategies as given. Best response is when you know what others are going to do and you choose your best response.
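As a sketch of this idea, the function below takes the other player’s action as given and returns whichever of our own actions earns the most. The game, its payoffs and the function name are hypothetical, chosen only to illustrate the definition.

```python
def best_responses(payoffs, opponent_action):
    """Return our best responses to a fixed opponent action.

    payoffs maps (my_action, opponent_action) -> my payoff.
    More than one action is returned if there is a tie.
    """
    options = {
        mine: value
        for (mine, theirs), value in payoffs.items()
        if theirs == opponent_action
    }
    best = max(options.values())
    return sorted(action for action, value in options.items() if value == best)


# Hypothetical coordination game: we do well only if we match the other player.
row_payoffs = {
    ("left", "left"): 2,
    ("left", "right"): 0,
    ("right", "left"): 0,
    ("right", "right"): 1,
}
```

Here the best response depends on what the other player does: `best_responses(row_payoffs, "left")` gives `["left"]`, while `best_responses(row_payoffs, "right")` gives `["right"]`.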

DOMINANT STRATEGY

Sometimes one person’s best choice is the same no matter what the others do. This is called a “dominant strategy” for that player. Hence, a strategy is dominant if it is always better than any other strategy, for any profile of other players’ actions.

A strategy is termed strictly dominant if, regardless of what any other players do, the strategy earns a player a strictly higher payoff than any other. If a player has a strictly dominant strategy then they will always play it in equilibrium.

A strategy is weakly dominant if, regardless of what any other players do, the strategy earns a player a payoff at least as high as any other strategy.
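The two definitions differ only in whether the comparison is strict. The sketch below classifies a candidate strategy using the definitions exactly as stated above; the function name and the example payoffs are assumptions for illustration.

```python
def classify_dominance(payoffs, candidate, my_actions, opp_actions):
    """Return "strict", "weak" or None for the candidate strategy.

    payoffs maps (my_action, opponent_action) -> my payoff.
    """
    rivals = [a for a in my_actions if a != candidate]
    pairs = [
        (payoffs[(candidate, o)], payoffs[(a, o)])
        for a in rivals
        for o in opp_actions
    ]
    if all(mine > other for mine, other in pairs):
        return "strict"
    if all(mine >= other for mine, other in pairs):
        return "weak"
    return None


# Assumed prisoner's dilemma payoffs for the row player.
pd_row = {
    ("cooperate", "cooperate"): 3,
    ("cooperate", "defect"): 0,
    ("defect", "cooperate"): 5,
    ("defect", "defect"): 1,
}
# Hypothetical game with a tie, so "a" is only weakly dominant.
tie_game = {("a", "x"): 1, ("a", "y"): 2, ("b", "x"): 1, ("b", "y"): 1}
```

With these payoffs, defect is strictly dominant in the prisoner’s dilemma, while in the second game strategy "a" is weakly dominant because it ties with "b" against "x".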

If there are better strategies to take within a game then there must also be worse strategies to take and we call these worse strategies dominated. A dominated strategy in a game means that there is some other choice for the agent to make that will have a better payoff than that one.

When the game is non-cooperative and players are assumed to be rational, strictly dominated strategies are eliminated from the set of strategies that might feasibly be played. Thus the search for an equilibrium typically begins by looking for dominant strategies and eliminating dominated ones.

For example, in a single iteration of the prisoner’s dilemma game, cooperating is strictly dominated by defecting for both players, because either player is always better off playing defect, regardless of what their opponent does. In searching for the equilibrium of this game we would simply look at each cell and ask: is there a better option for the player? If so, then the cell is dominated and we should not choose it. Once we have done this for both players we can identify a corresponding cell, or number of cells, that is optimal for each, giving us the equilibrium or possibly a number of different equilibria.
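The elimination procedure can be sketched as a loop that keeps removing strictly dominated strategies for each player until nothing more can be removed. The payoff numbers are the conventional prisoner’s dilemma values, assumed here for illustration, and the function name is hypothetical.

```python
def eliminate_dominated(game, rows, cols):
    """Iteratively remove strictly dominated pure strategies.

    game maps (row_action, col_action) -> (row_payoff, col_payoff).
    Returns the surviving row and column actions.
    """
    rows, cols = list(rows), list(cols)
    changed = True
    while changed:
        changed = False
        for r in list(rows):
            # r is dominated if some alternative beats it against every column.
            if any(
                all(game[(alt, c)][0] > game[(r, c)][0] for c in cols)
                for alt in rows
                if alt != r
            ):
                rows.remove(r)
                changed = True
        for c in list(cols):
            if any(
                all(game[(r, alt)][1] > game[(r, c)][1] for r in rows)
                for alt in cols
                if alt != c
            ):
                cols.remove(c)
                changed = True
    return rows, cols


ACTIONS = ("cooperate", "defect")
PD = {
    ("cooperate", "cooperate"): (3, 3),
    ("cooperate", "defect"): (0, 5),
    ("defect", "cooperate"): (5, 0),
    ("defect", "defect"): (1, 1),
}
survivors = eliminate_dominated(PD, ACTIONS, ACTIONS)
```

For the prisoner’s dilemma only defect survives for each player, which leaves the single cell where both defect.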

MINIMAX/MAXIMIN

In games of conflict and competition, we are often interested in knowing what is the strategy that one can play that will reduce one’s exposure to some negative event.

For example, this might be a scenario of war, where we have a number of different options as to the route along which we will send our food supply to our troops. Along any of these routes, there is the possibility that they will get bombed. We would then try to choose the option that will minimize the amount of damage that might possibly be caused to the convoy. This is captured in the term minimax. Minimax is a decision rule for minimizing the possible loss for a worst case scenario. The minimax value of a player is the smallest value that the other players can force the player to receive, without knowing the agent’s actions.

A minimax strategy is commonly chosen when a player cannot rely on the other party to keep any agreement, or when it is in the other party’s interest that you gain the minimum payoff, such as in a zero-sum game.

Calculating the minimax value of a player is done in a worst-case approach: for each possible action of the player, we check all possible actions of the other players and determine the worst possible combination of actions – the one that gives the player the smallest value. Then, we determine which action the player can take in order to make sure that this smallest value is the largest possible.
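The worst-case procedure just described can be sketched in a few lines: find the worst payoff of each action, then pick the action whose worst payoff is largest. The convoy routes and the tonnes of supplies delivered are entirely hypothetical, as is the function name.

```python
def worst_case_choice(payoffs):
    """Pick the action whose worst possible payoff is largest.

    payoffs maps action -> {opponent_action: payoff}.
    Returns (chosen action, guaranteed payoff).
    """
    worst = {action: min(row.values()) for action, row in payoffs.items()}
    best_action = max(worst, key=worst.get)
    return best_action, worst[best_action]


# Hypothetical tonnes of supplies delivered, by route and by where the enemy bombs.
convoy = {
    "coast road": {"bomb coast": 20, "bomb mountain": 100, "bomb river": 100},
    "mountain pass": {"bomb coast": 70, "bomb mountain": 50, "bomb river": 70},
    "river route": {"bomb coast": 90, "bomb mountain": 90, "bomb river": 10},
}

route, guaranteed = worst_case_choice(convoy)
```

With these numbers the mountain pass is chosen: it never delivers the most, but its worst case of 50 tonnes is better than the worst case of either other route. Maximizing the worst payoff in this way is the same as minimizing the worst loss.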

A maximax strategy is one where the player attempts to earn the maximum possible benefit available. This means they will prefer the option which offers the chance of achieving the best possible outcome, even if a highly unfavorable outcome is possible when taking that strategy. This strategy, often referred to as the best of the best, is seen as naive and overly optimistic, in that it assumes a highly favorable environment for decision making. In contrast, the minimax strategy, like its payoff-side counterpart maximin (maximizing the worst-case payoff), is more realistic in that it takes account of the worst-case scenario and prepares for that eventuality.

Game Theory 4: Non-Cooperative Games

NON-COOPERATIVE GAMES

In studying the dynamics of cooperation and competition between actors, understanding the structure of the game that is being played is central to understanding the system of interest.

In game theory, a primary distinction is made between those game structures that are cooperative and those that are non-cooperative.

As we will see the fundamental dynamics surrounding the whole game are altered as we go from games whose structure is innately competitive to those games where cooperation is the default position.

A cooperative game is one wherein the agents are able to resort to some institution or third party in order to enable cooperation and optimal results for all.

A game is noncooperative if players cannot form the structures required to enable cooperation.

For example, we might think about two people wishing to make a commercial transaction online. Given two anonymous people interacting without some institution to enable cooperation, there is no reason for either to think that the other will carry through with the transaction as promised.

The seller is incentivised to take the money and not send the item while the buyer is likewise incentivised to take the product without sending the money. In the absence of some cooperative structure that would enable each party to trust the other and thus cooperate, the game would naturally gravitate towards defection and the potentially valuable transaction would not take place.
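We can put hypothetical numbers on this transaction to see the incentive structure. In the sketch below the item’s value to the buyer, the price and the seller’s cost are all assumed, and both parties are assumed to move without knowing what the other has done.

```python
# Hypothetical numbers for the transaction.
VALUE_TO_BUYER, PRICE, COST_TO_SELLER = 10, 7, 4


def payoffs(buyer_pays, seller_sends):
    """Return (buyer payoff, seller payoff) for one anonymous transaction."""
    buyer = (VALUE_TO_BUYER if seller_sends else 0) - (PRICE if buyer_pays else 0)
    seller = (PRICE if buyer_pays else 0) - (COST_TO_SELLER if seller_sends else 0)
    return buyer, seller


# Is each side better off defecting whatever the other side does?
buyer_prefers_not_paying = all(
    payoffs(False, sends)[0] > payoffs(True, sends)[0] for sends in (True, False)
)
seller_prefers_keeping = all(
    payoffs(pays, False)[1] > payoffs(pays, True)[1] for pays in (True, False)
)
```

Under these assumptions both checks are true, so the transaction does not happen and each side gets zero, even though an honest exchange would have left each of them with a gain of three.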

Thus we can see how in the absence of cooperative mechanisms each player may follow the course that renders them the best payoff without regard for what the other does, or what is optimal for the overall system and this can result in suboptimal outcomes for all.

In non-cooperative games, each agent in the game is assumed to act in their self-interest, and this self-interested agent is the primary unit of analysis within noncooperative games because there is no cooperative structure.

This is in contrast to cooperative game theory that treats groups or subgroups of agents as the unit of analysis and assumes they can achieve certain outcomes among themselves through binding cooperative agreements.

Game theory historically has been very much focused on non-cooperative games and trying to find optimal strategies within such a context. This is likely because non-cooperative games are very much amenable to our standard mathematical framework and thus offer nice closed form solutions.

But it is important to note that the real world is made up of situations that are sometimes cooperative, sometimes non-cooperative, and often involve elements of both.

As previously mentioned, non-cooperative games arise due to a number of factors. Firstly the game may be inherently zero-sum, meaning what one wins the other loses and thus there is an inherent dynamic of competition.

Many sports games are specifically designed to be zero-sum in their structure, so as to create a dynamic of competition. In such a case there is only one prize, and if someone else gets it, you don’t. There is no incentive for cooperation and every incentive for competition and thus the best option is for the actor to focus on maximizing their payoff irrespective of all else.

This is called a strictly competitive game. A strictly competitive game is a game in which the interests of each player are diametrically opposed.
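One simple way to test for this structure is to check whether the payoffs in every cell sum to zero. The sketch below does this for matching pennies, a standard zero-sum example that is not discussed in the text, and for an assumed prisoner’s dilemma for contrast.

```python
def is_zero_sum(game):
    """True if what one player wins, the other loses, in every outcome."""
    return all(a + b == 0 for a, b in game.values())


matching_pennies = {
    ("heads", "heads"): (1, -1),
    ("heads", "tails"): (-1, 1),
    ("tails", "heads"): (-1, 1),
    ("tails", "tails"): (1, -1),
}
prisoners_dilemma = {
    ("cooperate", "cooperate"): (3, 3),
    ("cooperate", "defect"): (0, 5),
    ("defect", "cooperate"): (5, 0),
    ("defect", "defect"): (1, 1),
}
```

Matching pennies passes the check, while the prisoner’s dilemma does not, because both players can gain or lose together.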

Likewise, a game may be non-cooperative due to the incapacity to create cooperative structures. Most people, when engaged in a game, will wish to not only optimize their own payoff but will wish to optimize the overall outcome as well.

In general, people do not like the idea of waste or of unfairness and we typically search for some optimal solution given both our own interests and some consideration for the overall organization.

The real world of social interaction is full of all sorts of informal social and cultural institutions designed to enable trust, cooperation and optimal outcomes for all.

Almost as soon as two people start to interact they will start to look for commonalities and shared interests that enable them to develop trust and cooperation.

Thus, non-cooperative games are typically those where the actors cannot interact and form the trust required for cooperation. Indeed, there will be certain games that we construct where we specifically want competition and we do that by not allowing the players to cooperate, such as in a competitive market.

Lastly non-cooperative games can be a product of an incapacity to enforce binding contracts. If there is a third party involved to ensure optimal outcomes for the overall organization through sanctions and incentives, this can form a solid basis for cooperation – in the way that a government does by enforcing laws.

This is famously captured in Thomas Hobbes’ conception of the state of nature, where he pondered “What was life like before civil society?” He went on to write “during the time men live without a common power to keep them all in awe, they are in that condition which is called war, and such a war as is of every man against every man.”

In this state, every person has a natural right or liberty to do anything one thinks necessary for preserving one’s own life.

Hobbes’ ideas illustrate vividly how in the absence of a third party to enforce cooperation, competition can prevail.

EQUILIBRIUM ANALYSIS

Non-cooperative games create a specific dynamic within a game, where we are taking the individual and their payoff as the basic unit of analysis. In such a circumstance we do not need to consider what would be best for all under some form of cooperation, because cooperation is not possible within this context.

We are solely interested in how the individuals will act.

The question is how they should act to optimize their own payoff and, given the assumption that both are performing this optimization, what will be a stable solution to the game.

Given these assumptions, both players should search for a strategy that optimizes their payoff, and where the strategies of the players meet we should have a stable outcome that we can predict will occur.

This stable outcome is what we call an equilibrium.

Equilibrium, in the general sense, means a state in which opposing forces are balanced, thus creating a point of stability and stasis.

When we see a ball at the bottom of a bowl it is in a state of equilibrium, because if we put it anywhere else in the bowl the force of gravity would act on it to pull it back to this static point. The same holds for the actors in a non-cooperative game: because they are both trying to optimize their payoff, they will both naturally gravitate towards the strategy that gives them the highest payoff.

But because their payoff is dependent on what strategy the other chooses, and because they cannot depend upon cooperation between them, they have to choose the best strategy assuming that the other will work to optimize their payoff without cooperating.

This point of equilibrium in a game is called the Nash equilibrium after the famous mathematician John Nash.

In game theory, the Nash equilibrium is a solution concept of a non-cooperative game involving two or more players in which each player is assumed to know the equilibrium strategies of the other players, and no player has anything to gain by changing only his or her own strategy. If each player has chosen a strategy and no player can benefit by changing strategies while the other players keep theirs unchanged, then the current set of strategy choices and the corresponding payoffs constitute a Nash equilibrium.
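This definition translates directly into a check over every cell of a two-player payoff matrix: a cell is a pure-strategy Nash equilibrium if neither player can gain by changing only their own choice. The sketch below is limited to pure strategies and two players, and the example game (which side of the road to drive on) is hypothetical.

```python
from itertools import product


def pure_nash_equilibria(game, rows, cols):
    """Return all pure-strategy Nash equilibria of a two-player game.

    game maps (row_action, col_action) -> (row_payoff, col_payoff).
    """
    equilibria = []
    for r, c in product(rows, cols):
        row_pay, col_pay = game[(r, c)]
        # Hold the other player's choice fixed and look for a profitable deviation.
        row_cannot_gain = all(game[(alt, c)][0] <= row_pay for alt in rows)
        col_cannot_gain = all(game[(r, alt)][1] <= col_pay for alt in cols)
        if row_cannot_gain and col_cannot_gain:
            equilibria.append((r, c))
    return equilibria


SIDES = ("left", "right")
driving = {
    ("left", "left"): (1, 1),
    ("left", "right"): (0, 0),
    ("right", "left"): (0, 0),
    ("right", "right"): (1, 1),
}
found = pure_nash_equilibria(driving, SIDES, SIDES)
```

This game has two equilibria, both driving on the left or both on the right, which shows that a game need not have a single Nash equilibrium.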

The Nash equilibrium is one of the foundational concepts in game theory.

The basic intuition of the Nash equilibrium is in predicting what others will do given their self-interest only and then choosing your optimal strategy given that assumption.

Nash equilibrium is a point where all players are doing their best given the absence of cooperation. It is an outcome that no one would want to change on their own in the absence of some effective overall structure for coordination.

PRISONERS GAME

Nash equilibrium is best illustrated through the prisoner’s dilemma game.

The prisoner’s dilemma game is a classic two player game that is often used to present the concept of Nash equilibrium in a payoff matrix form.

Conceive of two prisoners detained in separate cells, interrogated simultaneously and offered deals in the form of lighter jail sentences for betraying the other criminal. They have the option to “cooperate” with the other prisoner by not telling on them, or “defect” by betraying the other.

However, if both players defect, then they will both serve a longer sentence than if neither said anything. Lower jail sentences are here interpreted as higher payoffs.

The prisoner’s dilemma has a similar matrix as depicted for the coordination game, but the maximum reward for each player is obtained only when the players’ decisions are different. Each player improves their own situation by switching from “cooperating” to “defecting”, given the knowledge that the other player’s best decision is to “defect”. The prisoner’s dilemma thus has a single Nash equilibrium: where both players choose to defect.

What has long made this an interesting case to study is the fact that this scenario is globally inferior to “both cooperating”. That is, both players would be better off if they both chose to “cooperate” instead of both choosing to defect. However, each player could improve their own situation by breaking the mutual cooperation, no matter how the other player changes their decision.
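The story can be written down with jail sentences and then converted to payoffs, since shorter sentences mean higher payoffs. The sentence lengths below are hypothetical; the sketch checks that betraying is prisoner A’s best reply to either choice by B, and that both prisoners would still do better under mutual cooperation.

```python
CHOICES = ("cooperate", "defect")

# (A's choice, B's choice) -> (A's years, B's years); hypothetical sentences.
YEARS = {
    ("cooperate", "cooperate"): (1, 1),
    ("cooperate", "defect"): (3, 0),
    ("defect", "cooperate"): (0, 3),
    ("defect", "defect"): (2, 2),
}
# Lower sentences are interpreted as higher payoffs.
PAYOFF = {profile: (-a, -b) for profile, (a, b) in YEARS.items()}


def best_reply_for_a(b_choice):
    """Prisoner A's best choice, taking B's choice as given."""
    return max(CHOICES, key=lambda a: PAYOFF[(a, b_choice)][0])


replies = {b: best_reply_for_a(b) for b in CHOICES}
both_better_cooperating = all(
    PAYOFF[("cooperate", "cooperate")][i] > PAYOFF[("defect", "defect")][i]
    for i in range(2)
)
```

With these numbers defecting is the best reply in both cases, yet mutual cooperation gives each prisoner one year instead of two.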

PREDICTION

The central aim of non-cooperative game theory then is in trying to predict people’s actions within a game by finding the Nash equilibria and assuming they will play that because it is their best option.

It is then legitimate for us to ask: does equilibrium analysis give us any predictive capacity over what happens in the real world? Often the outcome of experiments is not an equilibrium as predicted by the theory. This is mainly because people do not reason through the game in a fully logically consistent fashion.

Equilibrium is a point where everyone has figured out what everyone else will do, thus behaviorally it often does not predict what people will do the first time they play the game.

Equilibrium should more be interpreted as what will happen over a number of iterations within a non-cooperative game, as players come to better understand the game and how to reason through it.

Similar to putting a ball in a bowl, it takes time before the system arrives at an equilibrium, and this is what is seen in game experiments: they tend over time towards the equilibrium.

For example, in one game, people are asked to choose a number between 0 and 100, with the winner being the person whose guess is closest to 2/3 of the average figure proposed by the others.

So everyone is being asked to guess a bit below the average number proposed.

In this game, only a small percentage choose the equilibrium point, which is zero, and because the other players did not reason all the way to the equilibrium, those who chose zero did not win.

In many ways, then, taking this equilibrium as a prediction of what will happen in a first round is not a good option: actual behavior diverges dramatically from what the theory tells us.

However, over time, as the game is iterated, the numbers chosen by people do move towards the equilibrium. Thus it tells us something about statistical averages of the system but not very much about how it will behave in the real world in the first iteration of the game.
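A rough sketch of this drift towards zero is given below. It assumes a deliberately crude learning rule that is not claimed by the text: in each round every player simply guesses 2/3 of the previous round’s average. The starting guesses and the function name are hypothetical.

```python
def iterate_guessing_game(guesses, rounds, fraction=2 / 3):
    """Return the average guess in each round of the repeated game.

    Assumed learning rule: after each round, every player moves to the
    guess that would have won, i.e. fraction * (that round's average).
    """
    averages = []
    for _ in range(rounds):
        average = sum(guesses) / len(guesses)
        averages.append(average)
        target = fraction * average
        guesses = [target] * len(guesses)
    return averages


averages = iterate_guessing_game([20, 50, 80], rounds=20)
```

Starting from an average of 50, each round’s average is two thirds of the one before, so after twenty rounds it is very close to the equilibrium of zero. Real players learn less tidily than this, but the direction of movement is the same.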

Game Theory 3: Elements of Games

ELEMENTS OF GAMES

Games in game theory involve a number of central elements which we can identify as players, strategies, and payoffs. In this chapter we are going to zoom in to better understand each of these different elements to a game, talking first about the players and rationality, then strategies and payoffs.

PLAYERS

As we touched upon in a previous video, agents are abstract models of individuals or organizations which have agency. Agency means the capacity of actors to make choices and to act independently on those choices to affect the state of their environment, and they do this in order to improve their state within that environment.

In order to act and make choices, agents need a value system and need some set of rules under which to make their choices so as to improve their state with respect to their value system.

A big idea here is that of rationality, and we have to be careful how we define this idea. A dictionary definition of rationality would read something like this: “based on or in accordance with reason or logic”. Rationality simply means acting according to a consistent set of rules that are based upon some value system that provides the reason for acting.

To act rationally is to have some value system and to act in accordance with that value system.

When a for-profit business tries to sell more products, it is acting in a rational fashion, because it is acting under a set of rules to generate more of what it values.

When a person who values their community does community work, they are acting rationally. Because their actions are in accordance with their value system and thus they have a reason for acting in that fashion.

Standard game theory makes a number of quite strong assumptions about the agents involved in games. A central assumption of classical game theory is that players act according to a narrow and very demanding form of rationality, sometimes called hyperrationality.

A player is rational in this sense if it consistently acts to improve its payoff without the possibility of making mistakes, has full knowledge of other players’ intentions and the actions available to them, and has an infinite capacity to calculate a priori all possible refinements in an attempt to find the “best one.” If a game involves only rational agents, each of whom believes all other agents to be rational, then theoretical results offer accurate predictions of the game’s outcomes.

Agents have a single conception of value, i.e. all value is reduced to a single homogeneous form called utility. Preferences and value are well defined.

Rational agents have unlimited rationality, the idea of omniscience, i.e. they know all relevant information when making a choice and they can compute this information and all of its consequences. Within this model, agents have perfect information, and any uncertainty can be reduced to some probability distribution. The agent’s behavior is then seen to be an optimization algorithm over their set of possibilities.
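Read literally, this model of the agent is a short program: reduce each option to a probability distribution over utilities, compute the expected utility, and pick the maximum. The sketch below illustrates that reading with two hypothetical options; the names and numbers are assumptions.

```python
def expected_utility(action, outcomes):
    """Probability-weighted utility of an action.

    outcomes maps action -> list of (probability, utility) pairs.
    """
    return sum(p * u for p, u in outcomes[action])


def rational_choice(outcomes):
    """The action an idealized expected-utility maximizer would choose."""
    return max(outcomes, key=lambda action: expected_utility(action, outcomes))


# Hypothetical options: a sure 50, or a coin flip between 120 and nothing.
lotteries = {
    "safe": [(1.0, 50)],
    "risky": [(0.5, 120), (0.5, 0)],
}
choice = rational_choice(lotteries)
```

With everything collapsed into a single utility number, the agent picks the risky option because its expected utility of 60 exceeds the safe 50.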

Game theory is a young field of study—less than a century old. In that time, it has made remarkable advances, but it remains far from complete.

Traditional game theory assumes that the players of games are hyperrational, that is, that they act in best accordance with their own desires given their knowledge and beliefs. This assumption does not always appear to be a reasonable one. In certain situations, the predictions of game theory and the observed behavior of real people differ dramatically.

People in the real world operate according to a multiplicity of motives. Some of the time people are in a situation where they are simply trying to optimize a single metric, but more often they are not: they are embedded within a context where they are trying to optimize according to a number of different metrics.

The fact that people aren’t always optimizing according to a single metric is illustrated in the many games where people don’t choose actions that give them the greatest payoff within that single value system.

The best empirical examples of this are taken from the dictator game. The dictator game is a very simple game in which one person, who plays the role of “the dictator,” is given a sum of money, say 100 dollars. The dictator is then told that they must offer some amount of that money to the second participant, even if that amount is zero. The second participant must accept whatever amount the dictator offers and, should they find the amount unsatisfactory, cannot punish the dictator in any way.

Standard economic theory assumes that all individuals act solely out of self-interest. Under this assumption, the predicted result of the dictator game is that the “dictator should keep 100% of the cake, and give nothing to the other player.” This effectively assumes that the dictator places zero value on whatever the second player receives.

The actual results of this game, however, differ sharply from the predicted results. With a “standard” dictator game setup, “only 40% of the experimental subjects playing the role of dictator keep the whole sum.” Research by Robert Forsythe et al. found the average amount given, under these standard conditions, to be around 20% of the allocated money.

In any case, in the majority of these game trials, the dictator assigns the second player a non-zero amount.

A plausible explanation for this is that the dictator is not simply trying to optimize a single monetary value, as a strict conception of rationality would posit, but is acting rationally to optimize according to a number of different value systems.

They want the money, yes, but they are also optimizing for the cultural and social capital that motivates them to act in accordance with some conception of fairness. It is out of the interaction of these different value systems that we get the empirical results.
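One way to make this idea concrete is a toy utility function that combines the money kept with a penalty for an unequal split. The sketch below is only an illustration: the quadratic penalty and the parameter `fairness_weight` are assumptions made for this example, not a model taken from the experiments described above.

```python
def dictator_utility(give, total=100, fairness_weight=0.0):
    """Toy utility: money kept minus a penalty for an unequal split.

    fairness_weight is a hypothetical parameter. Setting it to 0
    recovers the purely self-interested dictator of standard theory.
    """
    kept = total - give
    unfairness = (kept - give) ** 2 / total  # zero at a 50/50 split
    return kept - fairness_weight * unfairness


def best_offer(total=100, fairness_weight=0.0):
    """Return the whole-dollar offer that maximizes the toy utility."""
    return max(
        range(total + 1),
        key=lambda give: dictator_utility(give, total, fairness_weight),
    )


print(best_offer(fairness_weight=0.0))  # -> 0, keeps everything
print(best_offer(fairness_weight=0.5))  # -> 25, gives a non-zero amount
```

In both cases the dictator is maximizing a payoff. What changes is only how many kinds of value that payoff includes.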

What agents value can be simple or it can be complex.

A financial algorithm is a form of agent that acts according to some set of rules designed to create a financial profit; this is an example of a very simple value system.

In contrast, what a human being values is typically many things. People value social capital, that is to say, their relationships with other people and their roles within social groups. They care about cultural capital: how they perceive themselves and how others perceive them. They care about financial capital and, to a greater or lesser extent, natural capital, that is, their natural environment.

Likewise, the set of instructions or rules may be based on a simple linear cause-and-effect model, what may be called an algorithm, or on a much more complex model, what may be called a schema.

Thus when we say that someone is acting rationally and maximizing their value payoff, this can be in many different contexts. A person helps an old lady onto the bus not because they are going to get paid for it, but because they get some sense of being a decent person, and they gain some payoff in that sense.

Thus it is not the concept of rationality or that people try to optimize their payoff that needs to be revised. It is the narrow definition of rationality as optimizing according to a single metric that needs to be expanded within many contexts that involve social interaction.

The classical conception of strict rationality based upon a single metric will apply in certain circumstances. It will be relevant to many games in ecology, where creatures have a simple conception of value maximization.

Likewise, it will often be relevant to computer algorithms and software systems and sometimes relevant for socioeconomic interactions, or at least partially relevant.

As the influential biologist John Maynard Smith wrote in the preface to the book Evolution and the Theory of Games, “paradoxically, it has turned out that game theory is more readily applied to biology than to the field of economic behavior for which it was originally designed.”

If we want an empirically accurate theory of games between more complex agents it will need to be expansive in its conception of value and rationality to include the more complex set of value systems and reasoning processes that are engendered in such games. We have spent quite a bit of time talking about this idea of rationality as it is a major unresolved flaw within standard game theory, one that is important to be aware of.

GAME STRATEGIES

Strategy is the choice of one’s actions.

In game theory, a player’s strategy is any of the options they can choose in a setting where the outcome depends on the actions of others. A strategy, in the practical sense, is then a complete algorithm for playing the game, telling a player what to do in every possible situation throughout the game.

For example, the game might be a business entering a new market and trying to gain market share against other players. This will not happen overnight; the business will have to take a series of actions that are all coordinated towards its desired end result. It might first have to organize production processes and logistics, then advertising, then pricing, etc. Each of these actions we would call a move in the game, and the overall strategy consists of a set of moves.

A player’s strategy set defines what strategies are available for them to play. For instance, in a single game of rock-paper-scissors, each player has the finite strategy set of rock, paper, scissors.
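As a small illustration, the rock-paper-scissors strategy set and the result of one round can be written down directly. This is a minimal sketch: the names `STRATEGY_SET`, `BEATS` and `payoff`, and the +1 / 0 / -1 scale, are assumptions chosen for the example.

```python
# The finite strategy set available to each player in a single round.
STRATEGY_SET = ("rock", "paper", "scissors")

# Each key beats the move it maps to.
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}


def payoff(mine, theirs):
    """Payoff to the first player: +1 for a win, 0 for a draw, -1 for a loss."""
    if mine not in STRATEGY_SET or theirs not in STRATEGY_SET:
        raise ValueError("move is not in the strategy set")
    if mine == theirs:
        return 0
    return 1 if BEATS[mine] == theirs else -1


print(payoff("rock", "scissors"))  # -> 1
print(payoff("rock", "paper"))     # -> -1
```

Note that the payoff depends on both players’ choices, which is what makes this a game rather than a simple decision problem.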

A player’s strategy set can also be infinite. For example, when making an offer to purchase an item in a process of bartering, the amount offered could in principle be any increment, so the set of possible offers is potentially infinite.

PURE / MIXED STRATEGY

In some games, there will not be one primary strategy that an agent always chooses. Instead, the agent may have a number of options and choose between them with some given probability. This will often be the case when they don’t want the other player to know in advance which move they will take.

For example, in smuggling goods across the Vietnam-China border, the smugglers have many different points of entry available to them and the police have many different points that they could secure. In such a case neither side wants to always choose the same location; they want some degree of randomness in the strategy that they choose.

This gives us a distinction in games between those with strategies that one will always play and those that one will play only with a given probability. This distinction is captured in the terms mixed and pure strategy.

Pure strategies are ones which do not involve randomness and tell us what to do in every situation. A pure strategy provides a complete definition of how a player will play a game. In particular, it determines the move a player will make for any situation they may face.

Strategies that are not pure, that is, those that depend on an element of chance, are called “mixed strategies.” In a mixed strategy, you have a number of different options and you assign a probability to playing each of them. As such we can think about a mixed strategy as a probability distribution over the actions a player has available to them.
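The idea of a mixed strategy as a probability distribution can be sketched in a few lines. In this illustration a strategy is represented as a dictionary from moves to probabilities; that representation and the helper names `is_valid_mixed_strategy` and `sample_move` are assumptions made for the example.

```python
import random


def is_valid_mixed_strategy(mixed_strategy, tol=1e-9):
    """Check that the probabilities are non-negative and sum to 1."""
    probs = list(mixed_strategy.values())
    return all(p >= 0 for p in probs) and abs(sum(probs) - 1.0) <= tol


def sample_move(mixed_strategy, rng=random):
    """Draw one move according to the probabilities in mixed_strategy."""
    moves = list(mixed_strategy)
    weights = [mixed_strategy[move] for move in moves]
    return rng.choices(moves, weights=weights, k=1)[0]


# A pure strategy is the special case that puts probability 1 on one move.
always_rock = {"rock": 1.0, "paper": 0.0, "scissors": 0.0}

# A mixed strategy spreads probability over several moves.
uniform_mix = {"rock": 1 / 3, "paper": 1 / 3, "scissors": 1 / 3}

print(sample_move(always_rock))  # always "rock"
print(sample_move(uniform_mix))  # varies from run to run
```

Seen this way, a pure strategy is just a mixed strategy with all of the probability placed on a single option.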

PAYOFFS

For every combination of strategies played within a game, there is a payoff for each player associated with the resulting outcome.

A player’s payoff defines how much they like the outcome of the game.

The payoffs for a particular player reflect what that player cares about, not what another player thinks they should care about. Payoffs must reflect the actual preferences of the players, not preferences anyone else ascribes to them.

Game theorists often describe payoffs in terms of utility, the general happiness a player gets from a given outcome. Payoffs can represent any type of value, but they capture only the factors that are incorporated into the model. Thus we have to be careful in asking what the agents really value.

Payoffs are then essentially numbers which represent the motivations of players. In general, the payoffs for different players cannot be directly compared, because they are to a certain extent subjective.

Payoffs may have numerical values associated with them or they may simply be a set of ranking preferences. If the payoff scale is only a ranking, the payoffs are called “ordinal payoffs.” For example, we might say that Kate likes apples more than oranges and oranges more than grapes.

However, if the scale measures how much a player prefers one option to another, the payoffs are called “cardinal payoffs.” So if the game were simply one for money, then we could ascribe a value to each payoff, which would be the quantity of money gained.

In many games all that matters is the ordinal payoffs: all we need to know is which options the players prefer, without actually knowing how much they prefer them. This is useful because in reality people don’t really go around ascribing specific values to how much they like things, but they do think about whether they prefer one thing or another. Kate may know that she likes apples more than oranges, but she would probably laugh if you asked her to put values on how much more she likes them.
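The point that only the ranking matters can be checked with a small sketch. The numbers below are arbitrary labels invented for the example; any other numbers that keep Kate’s fruits in the same order would lead to the same choices.

```python
# Ordinal payoffs for Kate: a higher number only means "more preferred".
kate_ranking = {"apples": 3, "oranges": 2, "grapes": 1}

# Different numbers that preserve the same order of preference.
kate_relabeled = {"apples": 100, "oranges": 7, "grapes": -5}


def preferred(options, payoffs):
    """Return the most preferred option among those on offer."""
    return max(options, key=payoffs.__getitem__)


def same_ranking(a, b):
    """True if two payoff tables order the options identically."""
    return sorted(a, key=a.get) == sorted(b, key=b.get)


print(same_ranking(kate_ranking, kate_relabeled))        # -> True
print(preferred(["oranges", "grapes"], kate_ranking))    # -> oranges
print(preferred(["oranges", "grapes"], kate_relabeled))  # -> oranges
```

This holds when Kate is simply choosing between definite options. Once probabilities are involved, as with mixed strategies, comparing expected payoffs depends on the size of the numbers, and cardinal payoffs are needed.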

In the next section, we start to play some games, looking at how to solve them and how to find the best strategies, and we talk about the important idea of equilibrium.