Playing Games with Genetic Algorithms

Robert E. Marks
Australian Graduate School of Management,
The Universities of Sydney and New South Wales,

1. Introduction

Over the past 25 years, non-coöperative game theory has moved from the periphery of mainstream economics to the centre of micro-economics and macro-economics. Issues of information, signalling, reputation, and strategic interaction can best be analysed in a game-theoretic framework. But solution of the behaviour and equilibrium of a dynamic or repeated game is not as simple as that for a static or one-shot game. The multiplicity of Nash equilibria of repeated games has led to attempts to eliminate many of these through refinements of the equilibrium concept. At the same time, the use of the rational expectations assumption to cut the Gordian Knot of the intractability of dynamic problems has led to a reaction against the super-rational Homo calculans model towards boundedly rational models of human behaviour.

In the late 'eighties the Genetic Algorithm (GA) was first used to solve a dynamic game, the repeated Prisoner's Dilemma (RPD). Mimicking Darwinian natural selection, as a simulation it could only elucidate sufficient conditions for its Markov Perfect equilibria (MPE), rather than the necessary conditions of closed-formed solution, but over the past twenty-odd years, its use in economics has grown to facilitate much greater understanding of evolution, learning, and adaptation of economic agents. This brief survey attempts to highlight emergence of the marriage of GAs and game theory.

2. Deductive versus evolutionary approaches to game theory

Ken Binmore and Partha Dasgupta (1986) argued that the use of repeated games in economics can be viewed from two perspectives; first, deductively determining what the equilibrium outcome of a strategic interaction will be, using the Nash equilibrium concept as a solution method, and, second, evolutively, asking how the strategic players will learn as they play, which will also result in equilibrium. In general, the possible equilibria which follow from the second approach are a subset of those that follow from the first. Indeed, "learned" equilibria are one attempt to "refine" the possible Nash equilibria of the deductive approach. A good summary of the rationale and techniques of the "learning" approach to solving game equilibria is given in Fudenberg and Levine (1998), who mention the GA as a means of exploring the space of strategies to play repeated games.

3. The repeated Prisoner's Dilemma

Several years earlier, Robert Axelrod (1984) had set up a tournament among computer routines submitted to play the RPD. Pairs of routines were matched in the repeated game, and their scores used to increase or reduce the proportion of each routine in the total population of potential players. Although Axelrod was a political scientist, he was influenced by William Hamilton, a biologist, and this simulated "ecology" mimicked in some way the natural selection undergone by species competing for survival in an ecological niche. It is well known that Anatol Rapoport's Tit for Tat emerged as a robust survivor among the many routines submitted. Foreshadowing the analytical tool of replicator dynamics (see below), however, the number of strategies in the game was fixed at the start: no new strategies could emerge.

4. Boundedly rational players

Axelrod's 1984 tournament pitted routines that were perforce boundedly rational. That one of the simplist, Tit for Tat, emerged as the best, as measured by its high score (although not always: see Marks 1989a), might only have been because of the bounded nature of its strategy (start off "nice", and then mimic your opponent's last move).

Deductive game theory has not assumed any limits to agents' abilities to solve complex optimisation problems, and yet Herbert Simon has been arguing for many years what applied psychologists and experimental economists are increasingly demonstrating in the laboratory: human cognitive abilities are bounded, and so descriptive models should reflect this. But one virtue of the assumption of perfect rationality is that global optima are reached, so that there is no need to ask in what manner our rationality is bounded. It may be no coincidence that two recent monographs on bounded rationality in economics (Thomas Sargent 1993, and Ariel Rubinstein 1998) are written by authors who have previously been concerned with repeated games played by machines (Rubinstein 1986) and one of the first published uses of GAs in economics (Marimon et al. 1990). What is the link between these two areas? Some background will clarify.

5. Game-playing automata

The GA was developed by John Holland (1975, 1992) and his students, including David Goldberg (1988). The original applications were in engineering and were predominantly optimisation problems: find x = arg max f ( x ). The comparative advantage of GAs was that f (.) was not required to be continuous or convex or differentiable or even explicitly closed-form -- for a recent overview, see Mitchell (1996). So the original focus was on optimisation of closed-form functions.

But the GA can search for more general optima. Axelrod (1997) recounts how his colleague at Michigan, John Holland, mentioned that there was this new technique in Artificial Intelligence (or what has become known as Machine Learning) which, by simulating natural selection, was able to search for optima in extremely non-linear spaces. Axelrod (1987), the first published research to use the GA with repeated games, was the result: against the same niche of submitted routines that he had used in his 1984 study, Axelrod used the GA to see whether it could find a more robust strategy in the RPD than Tit for Tat. I came across a reference to this then-unpublished work in Holland et al. (1986) and obtained both the draft and the code in early 1988 from Axelrod. I had been interested in solutions to the RPD since a routine of mine had won the second MIT competitive strategy tournament (Fader and Hauser 1988), an attempt to search for a generalisation for Tit for Tat in a three-person game with price competition among sellers of imperfect substitutes. What could this new technique tell me about my serendipitous routine?

Axelrod (with the programming assistance of Stephanie Forrest 1991) modelled players in his discrete RPD game as stimulus-response automata, where the stimulus was the state of the game, defined as both player's actions over the previous several moves, and the response was the next period's action (or actions). That is, he modelled the game as a state-space game (Fudenberg and Tirole 1991, Slade 1995), in which past play influences current and future actions, not because it has a direct effect on the game environment (the payoff function) but because all (or both) players believe that past play matters. Axelrod's model focused attention on a smaller class of "Markov" or "state-space" strategies, in which past actions influence current play only through their effect on a state variable that summarises the direct effect of the past on the current environment (the payoffs). With state-space games, the state summarises all history that is payoff-relevant, and players' strategies are restricted to depend only on the state and the time.

Specifically, Axelrod's stimulus-response players were modelled as strings, each point of which corresponds to a possible state (one possible history) and decodes to the player's action in the next period. The longer the memory, the longer the string, because the greater the possible number of states. Moving to a game with more than the two possible moves of the RPD will lengthen the string, holding the number of players constant. Increasing the number of players will also increase the number of states. Formally, the number of states is given by the formula a ^ ( m p }, where there are a actions, p players, each remembering m periods (Midgley et al. 1997).

Although the implicit function of Axelrod's 1987 paper is non-linear and open-form, in one way this pioneering work was limited: the niche against which his players interact is static, so the GA is searching a niche in which the other player's behaviour is determined at time zero: these routines do not learn. This type of problem is characterised as open-loop, and is essentially non-strategic (Slade 1995).

6. Co-evolution of automata

The interest of game theory is in strategic interaction, in which all players respond to each other, and can learn how better to respond. I thought that GAs playing each other would be more interesting than Axelrod's static niche (which I replicated in Marks 1989a) and extended Axelrod's work to co-evolution of stimulus-response players in Marks (1989b), later published as Marks (1992b), although I termed the simultaneous adaptation of each player in response to all others' actions as "boot strapping", being unaware of the biologist's term.

Another approach to modelling players as co-evolving stimulus-response machines was taken by John Miller, in his 1988 Michigan PhD thesis (later published as Miller 1996). He modelled the finite automaton explicitly, and let the GA derive new automata. This is explained further in Marks (1992a). Miller's approach has the advantage that offspring of the genetic recombination of the GA are functioning finite automata, whereas offspring in the Axelrod approach may include many irrelevant states which -- via the curse of dimensionality (Marks 1998) -- dramatically increases the computation costs. Holland and Miller (1991) argued for the merits of simulation in general (anticipating some of the arguments made by Judd 1998), and bottom-up artificial adaptive agents (such as the population of strategies playing the RPD to derive a fitness score for the GA) in particular.

As discussed at length in the two monographs, Sargent (1993) and Rubinstein (1998), modelling players as boundedly rational poses important questions for the modeller: which constraints are realistic? which ad hoc? Deterministic finite automata playing each other result in Markov perfect equilibria. But as soon as the researcher uses a (necessarily finite) automaton to model the player -- or its strategy -- the question of boundedness must be addressed.

Eight years ago, the literature on finite automata in repeated games could be categorised into two distinct branches: the analysis of the theoretical equilibrium properties of machine games, and the effect of finite computational abilities on the support of coöperative outcomes (Marks 1992a). Since then, as discussed in the section. Empirical Games, below, there is a growing literature which uses such techniques as the GA and neural nets to explore historical market behaviour, especially in oligopolies.

Rubinstein (1986) was using automata as players to explore some theoretical issues associated with the RPD. Others have followed, without necessarily using the GA to solve the strategy problem. Megiddo and Wigderson (1986) used Turing machines (potentially of infinite size) to play the RPD; Chess (1988) used simulations to generate best-response strategies in the RPD, and generated simple routines, but he did not use the GA; Cho (1995, 1996a, 1996b) and Cho and Li (1999) used "perceptrons" -- neural nets -- to demonstrate support of the Folk Theorem in RPD games, with simple strategies, with imperfect monitoring, and with nets that are of low complexity. Fogel and Harrald (1994) and Marks and Schnabl (1999) compare neural nets and the GA for simulating the RPD, and Herbrich et al. (1999) survey the use of neural networks in economics.

7. Learning

The earliest published paper to use the GA to solve the RPD is Fujiki and Dickinson (1987), but instead of explicitly using automata, they modelled the players as LISP routines, and allowed the GA to search the space of LISP routines for higher scoring code. The earliest publication in a peer-reviewed economics journal of a GA used in economics modelling was Marimon et al. (1990), but it was not explicitly about using the GA to solve games. Instead, in a macro-economic model it used the GA to model learning, the first of many papers to do so. I must admit that in an early 1989 conversation with Tom Sargent, I discounted the mechanisms of the GA -- selection, crossover, and mutation -- as models of the learning process, but Jasmina Arifovic, a student of Sargent's, in a continuing of series of papers over the past decade which model GA learning, has shown that this conclusion was wrong. Arifovic (1990) uses the GA to simulate the learning of rational expectations in an overlapping-generations (OLG) model; her first published paper (Arifovic 1994) was on learning in that canonical model of equilibrium determination: the Cobweb Model, in which she introduces the election operator, the first contribution by an economist to the practice of GAs, previously dominated by, first, engineers, and, second, mathematicians; Arifovic (1995) shows that using the GA in an OLG model results in convergence to rational expectations and reveals behaviour on the part of the artificially adaptive agents which mimics that observed by subjects in experimental studies; Arifovic (1998) finds a two-period stable equilibrium in an OLG model; Arifovic and Eaton (1995) uses the GA in a coördination game. The Cobweb Model has proved a popular subject of investigation using the GA: Dawid and Kopel (1998) explore two models -- one with only quantity choices and one with the prior decision of whether to exit or stay in the market; Franke (1998) explores the stability of the GA's approximation of the moving Walrasian equilibrium in a Cobweb economy, and provides an interpretation and extension of Arifovic's (1994) election operator.

One of the traits of Tit for Tat as characterised by Axelrod (1984) is that it is easily recognised. That is, other players can soon learn the type of player they are facing. A more complex strategy would not necessarily be as readily recognised. Dawid (1996a, 1996b, 1999) argues that GAs may be an economically meaningful model of adaptive learning if we consider a learning process incorporating imitation (crossover), communication (selection), and innovation (mutation).

There are several other relevant papers on this topic. Bullard and Duffy (1998) explore how a population of myopic artificially adaptive agents might learn forecasting rules using a GA in a general equilibrium environment, and find that coördination on steady state and low-order cycles occurs, but not on higher-order periodic equilibria. Bullard and Duffy (1999) use a GA to update beliefs in a heterogeneous population of artificial agents, and find that coördination of beliefs can occur and so convergence to efficient, rational expectations equilibria. Riechmann (1999a) uses insights from the mathematical analysis of the behaviour of GAs (especially their convergence and stability properties) to argue that the GA is a compound of three different learning schemes, with a stability between asymptotic convergence and explosion. Riechmann (1999b) demonstrates that economic GA learning models can be equipped with the whole box of evolutionary game theory, that GA learning results in a series of near-Nash equilibria which then approach a neighbourhood of an evolutionary stable state. Dawid and Mehlmann (1996), and Chen et al. (1996) also consider genetic learning in repeated games. The recent volume edited by Brenner (1999), on computational techniques of modelling learning in economics, includes several papers of interest, in particular Beckenbach (1999) on how the GA can be re-shaped as a tool for economic modelling, specifically to model learning

8. Replicator dynamics

But the GA is not the only technique to come from biology into economics. Convergence of a kind was occurring as economics in general, and game theory in particular, was borrowing insights from a second source in biology. For the past fifty years there has been some interest in the Darwinian idea of natural selection as applied to firm survival in the (more or less) competitive marketplace, as summarised in Nelson and Winter (1982). Borrowing game theoretical ideas from economics, John Maynard Smith (1982) introduced the concept of the evolutionarily stable strategy (ESS) to biology, and this in turn was used by economists eager to reduce the number of possible equilibria in repeated games. ESS is concerned with the invasion of a new species (or strategy) into an ecology (market) in which there is an existing profile of species (strategies). Axelrod (1984) had been mimicking these changes as he allowed the proportions of his RPD-playing routines to alter in response to their relative success. The most widely discussed paper, and one of the earliest to use GA's in examining the changing profiles of strategies in the RPD, is Lindgren (1991). Two recent investigations of the RPD are Ho (1996), with information processing costs, and Wu and Axelrod (1995), with noise.

Axelrod (1984) had argued that simple strategies such as Tit for Tat were collectively evolutionarily stable, and so could sustain coöperation in the finitely RPD. In an unpublished paper, Slade and Eaton (1990) argue that Axelrod's conclusion is not robust to small deviations from the Axelrod setup, in particular, allowing agents to alter their strategies without announcement. Similarly, Nachbar (1988, 1992) questions the robustness of Axelrod's findings, arguing that prior restrictions on the strategy set result in a convergence to coöperation with Tit for Tat, and that without these restrictions coöperation would eventually be exhausted.

Replicator dynamics (Hofbauer and Sigmund 1988, 1998) exhibit similar behaviour, as the profile of initial strategies changes, but unlike the GA as applied to stimulus-response machines (strategies) replicator dynamics cannot derive completely new strategies. Replicator dynamics have provided the possibility of closed-form investigation of evolutionary game theory (Friedman 1991, Selten 1991, Mailath 1998), but evolutionary game theory in general, and replicator dynamics in particular, despite the word "evolutionary", are not directly related to GAs (see Dawid 1996a, 1996b for comparisons).

9. Other refinements

The literature on the RPD cited above restricts the player's decisions to its next action, but does not model an earlier decision: whether or not to play. This constraint is relaxed -- players get to decide with whom to interact -- in a series of papers by Leigh Tesfatsion and co-authors (Stanley et al. 1994, Ashlock et al. 1996), culminating in her simulation of trading (Tesfatsion 1997).

The GA has been used to search for the emergence of risk-averse players, in a simple model (Szpiro 1997), and more recently in a robust model of the farmer's decision faced with high- and low-risk options (Cacho and Simmons 1999). Huck et al. (1999) use replication and mutation (but not crossover or the GA) to search for the emergence of risk aversion in artificial players of repeated games.

Tomas Basar has been prominent in the mathematics of game theory: the book (Basar and Oldser 1982) solved many issues in dynamic, continuous, differential games before many economists had become aware of the problems. Recently, he has used GAs to solve for Stackelberg solutions, in which one player moves first (Vallee and Basar 1999).
Following in his tradition, Özyildirim (1997) applied the GA to search for approximations of the non-quadratic, non-linear, open-loop, Nash Cournot equilibria in a macro-economic policy game. She goes beyond many earlier studies in using historical data to explore the interactions between the Roosevelt administration and organised labour during the New Deal of the 'thirties. In a second paper (Alemdar and Özyildirim 1998), the GA is used to search for optimal policies in a trading game with knowledge spillovers and pollution externalities.

10. Empirical games

The two papers by Özyildirim, discussed above, show the future, I believe: moving from stylised games, such as the RPD, to actual market interactions. Slade (1995) provides a clear structure for this project, as well as reporting preliminary results. She finds that the Nash equilibrium of the one-shot game is emphatically not a reasonably approximation to real-world repeated interactive outcomes, and that most games in price or quantity appear to yield outcomes that are more collusive than the one-shot outcome, which is consistent with the Folk Theorem.

Marks and his coauthors have been involved in a project to use GAs to solve for MPE using historical data on the interactions among ground-coffee brands at the retail level, where these players are modelled as Axelrod/Forrest finite automata in an oligopoly. This work is a generalisation of Axelrod (1987) and Marks (1992b), and uses the ability of the GA to search the highly disjoint space of strategies, as Fudenberg and Levine (1998) suggest.

The first two papers (Marks et al. 1995, Midgley at al. 1997) build on a "market model" that was estimated from historical supermarket-scanner data on the sales, prices, and other marketing instruments of up to nine distinct brands of canned ground coffee, as well as using brand-specific cost data. The market model in effect gives the one-shot payoffs (weekly profits) for each brand, given the simultaneous actions of all nine (asymmetric) brands. There was natural synchrony in brands' actions: the supermarket chains required their prices and other actions to change together, once a week, at midnight on Saturdays.

Modelling the three most rivalrous historical brands as artificially adaptive agents (or Axelrod strings), the authors use the GA to simulate their co-evolution, with the actions of the other six brands taken as unchanging, in a repeated interaction to model the oligopoly. After convergence to an apparent MPE in their actions, the best of each brand is played in open-loop competition with the historical data of the other six in order to benchmark its performance. The authors conclude that the historical brand managers could have improved their performance given insights from the response of the stimulus-response artificial managers, although this must be qualified with the observation that closed-loop experiments (Slade 1995), allowing the other managers to respond, would be more conclusive.

Marks et al. (1998) refines the earlier work by increasing the number of strategic players to four, by using four distinct populations in the simulation (against the one-string-fits-all single population of the earlier work), and by increasing the number of possible actions per player from four to eight in order to allow the artificial agents to learn which actions (prices) are almost always inferior and to be avoided. Marks (1998) explores some issues raised by the discrete simulations and the curse of dimensionality, and uses entropy as a measure of the loss of information in partitioning the continuous historical data (in prices etc.). Slade (1992, 1995) considers two markets with similar rivalrous behaviour -- gasoline and saltine crackers -- but does not use a GA.

Curzon Price (1997) uses a GA to model several standard industrial-organization models in order to demonstrate that the GA performs well as a tool in applied economic work requiring market simulation. He considers simple models of Bertrand and Cournot competition, a vertical chain of monopolies, and an electricity pool.

11. Conclusion

When John Holland (1975) invented the GA, his original term for it was an "adaptive plan" (Beckenbach 1999), which looked for "improvement" in complex systems, or "structures which perform well." Despite that, most research effort, particularly outside economics, has been on its use as a function optimiser. But, starting with Axelrod (1987), the GA has increasingly been used as an adaptive search procedure, and latterly as a model of human learning in repeated situations. In the 1992 second edition of his 1975 monograph, Holland expressed the wish that the GA be seen more as a means of improvement and less on its use as an optimiser.

This survey, although brief, has attempted to show that use of the GA in economics in general, and in game theory in particular, has been more and more focusing on its use as an adaptive search procedure, searching in the space of strategies of repeated games, and providing insights into historical oligopolistic behaviour, as well as into human learning processes.


The work in this study was partly supported by an Australian Research Council grant and the Australian Graduate School of Management. I wish to thank the editor, Shu-Heng Chen, for his encouragement in the face of my recent bereavement, and my colleague, Robin Stonecash, for her support.


Alemdar N.M., Özyildirim S. (1998) A genetic game of trade, growth and externalities, Journal of Economic Dynamics and Control 22(6): 811-32.

Arifovic J. (1990) Learning by Genetic Algorithms in economic environments, SFI Econ Res Prog WP 90-001.

Arifovic J. (1994) Genetic Algorithm learning and the Cobweb Model, Journal of Economic Dynamics and Control 18(1): 3-28.

Arifovic J. (1995) Genetic Algorithms and inflationary economies, Journal of Monetary Economics 36(1): 219-243.

Arifovic J. (1998) Stability of equilibria under Genetic Algorithm adaptation: an analysis, Macroeconomic Dynamics 2(1): 1-21.

Arifovic J., Eaton B.C. (1995) Coördination via genetic learning, Computational Economics 8(3): 181-203.

Arifovic J., Eaton B.C. (1998) The evolution of type communication in a sender/receiver game of common interest with cheap talk, Journal of Economic Dynamics and Control 22(8-9): 1187-1207.

Ashlock D., Smucker M.D., Stanley E.A., Tesfatsion L. (1996) Preferential partner selection in an evolutionary study of Prisoner's Dilemma, BioSystems 37 (1-2): 99-125.

Axelrod R. (1984) The Evolution of Coöperation, N.Y.: Basic Books.

Axelrod R. (1987) The evolution of strategies in the iterated Prisoner's Dilemma, in Genetic Algorithms and Simulated Annealing, L. Davis (ed.), London: Pittman. pp.32-41.

Axelrod R. (1997) The Complexity of Coöperation, Princeton: P.U.P.

Basar T., Oldser G.J. (1982) Dynamic Non-Coöperative Game Theory, New York: Academic Press.

Beckenbach F. (1999) Learning by Genetic Algorithms in economics? in Brenner (1999) op. cit., pp. 73-100.

Binmore, K., Dasgupta, P. (1986) Game theory: a survey. In Economic Organizations as Games (edited by K. Binmore and P. Dasgupta), pp.1-45. Oxford: B. Blackwell.

Brenner T. (ed.) (1999) Computational Techniques for Modelling Learning in Economics, in the series Advances in Computational Economics 11, Dordrecht: Kluwer Academic Publishers.

Bullard, J., Duffy, J. (1998) Learning and the stability of cycles, Macroeconomic Dynamics 2(1): 22-48.

Bullard J., Duffy J. (1999) Using Genetic Algorithms to model the evolution of heterogeneous beliefs, Computational Economics 13(1): 41-60.

Cacho O., Simmons P. (1999) A genetic algorithm approach to farm investment, Australian Journal of Agricultural and Resource Economics, 43(3): 305-322.

Chen S.H., Duffy J., Yeh C.H. (1996) Equilibrium selection via adaptation: using genetic programming to model learning in a co-ordination game mimeo.

Chess D.M. (1988) Simulating the evolution of behavior: the Iterated Prisoners' Dilemma, Complex Systems, 2: 663-670.

Cho, I.-K. (1995) Perceptrons play the repeated Prisoner's Dilemma, Journal of Economic Theory, 67: 266-284.

Cho, I.-K. (1996a) On the complexity of repeated principal agent games, Economic Theory, 7(1): 1-17.

Cho, I.-K. (1996b) Perceptrons play repeated games with imperfect monitoring, Games and Economic Behavior 16(1): 22-53.

Cho I.-K., Li H. (1999) How complex are networks playing repeated games? Economic Theory 13(1): 93-123.

Curzon Price T. (1997) Using co-evolutionary programming to simulate strategic behaviour in markets, Journal of Evolutionary Economics 7(3): 219-254.

Dawid H. (1996a) Genetic Algorithms as a model of adaptive learning in economic systems, Central European Journal for Operations Research and Economics 4(1): 7-23.

Dawid H. (1996b) Learning of cycles and sunspot equilibria by Genetic Algorithms, Journal of Evolutionary Economics 6(4): 361-373.

Dawid H. (1999) Adaptive Learning by Genetic Algorithms: Analytical Results and Applications to Economic Models, Lecture Notes in Economics and Mathematical Systems, vol 441. Heidelberg: Springer-Verlag, 2nd ed.

Dawid H., Kopel M. (1998) On economic applications of the Genetic Algorithm: a model of the cobweb type, Journal of Evolutionary Economics 8(3): 297-315.

Dawid H., Mehlmann A. (1996) Genetic learning in strategic form games, Complexity 1(5): 51-59.

Fader P.S., Hauser J.R. (1988) Implicit coalitions in a generalised Prisoner's Dilemma, Journal of Conflict Resolution 32: 553-582.

Fogel, D.B., Harrald, P.G. (1994) Evolving continuous behaviors in the iterated Prisoner's Dilemma in: Sebald, A., Fogel, L. (eds.) The Third Annual Conference on Evolutionary Programming, Singapore: World Scientific, pp.119-130.

Forrest S. (ed.) (1991), Emergent Computation: Self-Organizing, Collective, and Coöperative Phenomena in Natural and Artificial Computing Networks, Cambridge: MIT Press.

Franke R. (1998) Coevolution and stable adjustments in the cobweb model, Journal of Evolutionary Economics 8(4): 383-406.

Friedman D. (1991) Evolutionary games in economics, Econometrica, 59(3): 637-666.

Fudenberg D., Levine D.K. (1998) The Theory of Learning in Games, Cambridge: M.I.T. Press.

Fudenberg D., Tirole J. (1991) Game Theory, Cambridge: M.I.T. Press.

Fujiki, C., Dickinson, J. (1987) Using the genetic algorithm to generate Lisp source code to solve the Prisoner's Dilemma. In: Grefenstette J.J. (ed.) Genetic Algorithms and their Applications, Proceedings of the 2nd. International Conference on Genetic Algorithms, Hillsdale, N.J.: Lawrence Erlbaum.

Goldberg D.E. (1988) Genetic Algorithms in Search, Optimization, and Machine Learning, Reading, Mass.: Addison-Wesley.

Herbrich R., Keilbach M., Graepel T., Bollmann-Sdorra P., Obermayer K., (1999) Neural networks in economics, in Brenner (1999), op. cit., pp. 169-196.

Ho T.H. (1996) Finite automata play repeated Prisoner's Dilemma with information processing costs, Journal of Economic Dynamics and Control 20(1-3): 173-207.

Hofbauer J., Sigmund C. (1988) The Theory of Evolution and Dynamic Systems, Cambridge: C.U.P.

Hofbauer J., Sigmund C. (1998) Evolutionary Games and Population Dynamics, Cambridge: C.U.P.

Holland, J.H. (1975) Adaptation in Natural and Artificial Systems, Ann Arbor: Univ. of Michigan Press. (A second edition was published in 1992: Cambridge: MIT Press.)

Holland J.H. (1992) Genetic algorithms, Scientific American, pp. 66-72, July.

Holland J.H., Holyoak K.J., Nisbett R.E., Thagard P.R. (1986) Induction: Processes of Inference, Learning, and Discovery, Cambridge: MIT Press.

Holland J.H., Miller J.H. (1991), Artificial adaptive agents in economic theory, American Economic Review: Papers and Proceedings, 81(2): 365-370.

Huck S., Müller W., Strobel M. (1999) On the emergence of attitudes towards risk, in Brenner (1999), op. cit., pp. 123-144.

Judd K.L. (1998) Numerical Methods in Economics, Cambridge: MIT Press.

Lindgren K. (1991) Evolutionary phenomena in simple dynamics, pp. 295-324 in C. Langton, C. Taylor, J. Farmer, and S. Rasmussen, (ed.), Artificial Life II, Vol. 10, Reading: Addison-Wesley.

Mailath G.J. (1998) Do people play Nash equilibrium? Lessons from evolutionary game theory, Journal of Economic Literature, 36(3): 1347-74.

Marimon R., McGrattan E., and Sargent T.J. (1990) Money as a medium of exchange in an economy with artificially intelligent agents, Journal of Economic Dynamics and Control 14: 329-373.

Marks, R.E. (1989a) Niche strategies: the Prisoner's Dilemma computer tournaments revisited. AGSM Working Paper 89-009.

Marks R.E. (1989b). Breeding optimal strategies: optimal behavior for oligopolists. In J. David Schaffer (ed.), Proceedings of the Third International Conference on Genetic Algorithms, 198-207. San Mateo, CA.: Morgan Kaufmann.

Marks R.E. (1992a) Repeated games and finite automata, in Recent Developments in Game Theory, ed. by J Creedy, J. Borland, and J. Eichberger, Aldershot: Edward Elgar, 43-64.

Marks R.E. (1992b) Breeding hybrid strategies: optimal behaviour for oligopolists, Journal of Evolutionary Economics 2: 17-38.

Marks, R.E. (1998) Evolved perception and behaviour in oligopolies, Journal of Economic Dynamics and Control 22(8-9): 1209-1233.

Marks R.E., Midgley D.F., and Cooper L. (1995) Adaptive behavior in an oligopoly, in Evolutionary Algorithms in Management Applications, ed. by J. Biethahn and V. Nissen, Berlin: Springer-Verlag, pp.225-239.

Marks R.E., Midgley D.F., Cooper L.G. (1998). Refining the Breeding of Hybrid Strategies, Australian Graduate School of Management Working Paper 98-017, Sydney.

Marks R.E., Schnabl H. (1999) Genetic Algorithms and Neural Networks: a comparison based on the Repeated Prisoner's Dilemma, in Brenner (1999), op. cit., pp. 197-219.

Maynard Smith J. (1982) Evolution and the Theory of Games, Cambridge: C.U.P.

Megiddo N., Wigderson, A. (1986) On play by means of computing machines. In Reasoning About Knowledge (edited by J.Y. Halpern) pp.259-274. Los Altos: Kaufmann.

Midgley D.F., Marks R.E., Cooper L.G. (1997) Breeding competitive strategies, Management Science, 43(3): 257-275.

Miller, J.H. (1996) The coevolution of automata in the repeated Prisoner's Dilemma, Journal of Economic Behavior and Organization 29: 87-112.

Mitchell M. (1996) An Introduction to Genetic Algorithms, Cambridge: MIT Press.

Nachbar J.H. (1988) The evolution of coöperation revisited, mimeo., Santa Monica: RAND Corp., June.

Nachbar J.H. (1992) Evolution in the finitely repeated Prisoner's Dilemma, Journal of Economic Behavior and Organization 19(3): 307-326.

Nelson R.R., Winter S.G. (1982) An Evolutionary Theory of Economic Change, Cambridge: Belknap Press of Harvard University Press.

Özyildirim S. (1997) Computing open-loop noncoöperative solution in discrete dynamic games, Journal of Evolutionary Economics 7(1): 23-40.

Riechmann T. (1999a) Learning and behavioural stability: an economic interpretation of Genetic Algorithms, Journal of Evolutionary Economics 9(2): 225-242.

Riechmann T. (1999b) Genetic Algorithm learning and evolutionary games, mimeo.

Rubinstein A. (1986) Finite automata play the repeated Prisoner's Dilemma, Journal of Economic Theory 39: 83-96.

Rubinstein A. (1998) Modeling Bounded Rationality, Cambridge: MIT Press.

Sargent T.J. (1993) Bounded Rationality in Macroeconomics, Oxford: O.U.P.

Selten R. (1991) Evolution, learning, and economic behavior, Games and Economic Behavior 3: 3-24.

Slade M.E. (1992) Vancouver's gasoline-price wars: an empirical exercise in uncovering supergame strategies, Review of Economic Studies, 59: 257-274.

Slade M.E. (1995) Empirical games: the oligopoly case, Canadian Journal of Economics, 28(2): 368-402.

Slade M.E., Eaton C.B. (1990) Evolutionary equilibrium in market supergames, University of British Columbia Department of Economics Discussion Paper: 90-30.

Stanley E.A., Ashlock D., Tesfatsion L. (1994) Iterated Prisoner's Dilemma with choice and refusal of partners, in C. Langton, Artificial Life III, Vol. 17, Santa Fe Institute Studies in the Sciences of Complexity, Redwood City: Addison-Wesley, pp. 131-175.

Szpiro G. (1997). The emergence of risk aversion, Complexity 2: 31-39.

Tesfatsion L. (1997) A trade network game with endogenous partner selection, in H.M. Amman, B. Rustem, and A.B. Whinston (eds.), Computational Approaches to Economic Problems, Dordrecht: Kluwer Academic Publishers, pp. 249-269.

Vallee T., Basar T. (1999) Off-line computation of Stackelberg solutions with the Genetic Algorithm, Computational Economics, 13(3): 201-209.

Wu J., Axelrod R. (1995) How to cope with noise in the Iterated Prisoner's Dilemma, Journal of Conflict Resolution 39(1): 183-189.

© Robert E. Marks, February 2000.