Metagame: AI Challenge in Game Learning

In H.J. van den Herik and L.V. Allis, editors, Heuristic Programming in Artificial Intelligence 3 – The Third Computer Olympiad. Ellis Horwood, 1992.
                             Barney Pell
                       University of Cambridge
                           Cambridge, UK
                       E-mail: bdp@cl.cam.ac.uk
                                      Abstract
           In most current approaches to Computer Game-Playing, including those
       employing some form of machine learning, the game analysis is performed
       mainly by humans. Thus, we largely sidestep the interesting (and
       difficult) questions. Human analysis also makes it difficult to
       evaluate the generality and applicability of different approaches.
           To address these problems, we introduce a new challenge: Metagame.
       The idea is to write programs which take as input the rules of a set of
       new games within a pre-specified class, generated by a program which
       is publicly available. The programs compete against each other in many
       matches on each new game, and they can then be evaluated based on their
       overall performance and improvement through experience.
           This paper discusses the goals, research areas, and general concerns
       for the idea of Metagame.
    Parts of this work have been supported by RIACS, NASA Ames Research Center [FIA].
1 Introduction
One reason why game-playing is an exciting activity for humans is that it
couples intellectual activity with direct competition: better thinking and
learning generally imply winning more games. Thus we can test out and refine
our intellectual skills by playing games against opponents, and evaluate our
progress based on the results of the competition.
    The same motivation accounts for much of the original interest in Com-
puter Game-Playing (CGP) as a problem for Artificial Intelligence: programs
which think better may play better, and so win more games. Thus we can
test out and refine different theories of intelligence by writing game-playing
programs which embody these different theories, and then play the programs
against each other, and consider the more intelligent program to be the one
which wins the most games. In the discussion which follows, we shall call this
link between winning games and presumed intelligent behaviour the compet-
itive performance metric for intelligence.
    Unfortunately, most current approaches to CGP, including those employ-
ing some form of machine learning, rely on the existence of a great degree
of previous human analysis of particular games. This means that the human
researchers usually wind up doing most of the interesting game analysis, and
makes it difficult to evaluate the generality and applicability of different ap-
proaches. In short, the fact that a program wins most of its games is not
actually evidence that the program (and not its programmer) is doing much
interesting from an AI perspective.
    To encourage researchers to address the more general issues in game-playing,
while maintaining the competitive performance metric which makes CGP so
attractive as a research problem, we introduce a new challenge: Metagame.
The idea is to write programs which take as input the rules of a set of new
games within a pre-specified class. The programs compete against each other
in many matches on each new game, and the programs can then be evaluated
based on their overall performance and improvement through experience.
    Besides being a more general challenge, Metagame presents opportunities
for addressing many interesting research issues in games and learning, in-
cluding opponent modelling, incomplete and imperfect information, utility of
computation, change of representation, strategic analysis, learning from ex-
perience, and discovery.
    A companion paper ([Pel92]) presents a definition and generator for a spe-
cific class of games, called symmetric chess-like games, and discusses an ex-
ample game produced by the generator.
    The rest of this paper elaborates on the issues presented above. Section 2
substantiates the claim that CGP is too specialised, and discusses why this is
undesirable. Section 3 introduces the Metagame idea, and Section 4 presents
some interesting research issues possible within the context of Metagame.
2 The Gamer’s Dilemma
Before discussing the assumptions behind particular methods in CGP, I illus-
trate what I mean by the terms specialisation and game-specific, and explain
why these qualities may be undesirable. I begin with a thought experiment
which I shall call The Gamer’s Dilemma:
If the game is Chess, then use the latest version of Deep Thought.
    Ideally, we would like to hand over a general game-playing program, which could study
any game for a while on its own (to use its time wisely), and which would become an expert
shortly after having encountered this group of players. Although it is unlikely we will ever
have a program which is totally general, it is useful to bear this ideal goal in mind.
    If the game given to the researcher happens to be on this list, this advice
would be extremely helpful. Unfortunately, if the game falls outside the list,
this advice would be of little use. It is becoming a common perception in Ar-
tificial Intelligence that such a list of advice is all that many researchers in
CGP have to offer.
Game-Tree Search An answer which more accurately reflects the lessons
learned by current approaches advocates the use of a brute-force search method
(e.g., minimax with α–β pruning and singular extensions), combined with ex-
tremely fast routines for updating board positions. Although this technique
has proven effective on several games, it presupposes that the researcher has
a good evaluation function, which requires specific knowledge of the game.
    The burden on game analysis thus shifts to the researcher, who must choose
an appropriate set of features and weights for this function. Although there
are some general approaches to learning weights (discussed in Section 2.3),
this approach has offered very little explicit advice about the construction of
appropriate features. However, we are now beginning to understand the im-
portance of some features which may be essential in a variety of games, such
as mobility ([Don92, Har87, LM88]).
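The search-plus-evaluation recipe just described can be sketched in a few lines. The sketch below is purely illustrative (the toy game tree, feature functions, and weights are invented for the example, not drawn from any program discussed here): a plain minimax search with α–β pruning, driven by an evaluation function that is a weighted sum of hand-chosen features.

```python
import math

def evaluate(state, features, weights):
    # Weighted sum of hand-chosen features -- the part that, as noted
    # above, still demands game-specific human analysis.
    return sum(w * f(state) for f, w in zip(features, weights))

def alphabeta(state, depth, alpha, beta, maximizing,
              successors, is_terminal, value_of):
    # Plain minimax with alpha-beta pruning over an abstract game.
    if depth == 0 or is_terminal(state):
        return value_of(state)
    if maximizing:
        best = -math.inf
        for child in successors(state):
            best = max(best, alphabeta(child, depth - 1, alpha, beta,
                                       False, successors, is_terminal, value_of))
            alpha = max(alpha, best)
            if alpha >= beta:      # beta cut-off: opponent avoids this line
                break
        return best
    best = math.inf
    for child in successors(state):
        best = min(best, alphabeta(child, depth - 1, alpha, beta,
                                   True, successors, is_terminal, value_of))
        beta = min(beta, best)
        if beta <= alpha:          # alpha cut-off
            break
    return best

# Invented two-ply toy tree to exercise the search:
TREE = {'root': ['a', 'b'], 'a': ['a1', 'a2'], 'b': ['b1', 'b2']}
LEAVES = {'a1': 3, 'a2': 5, 'b1': 2, 'b2': 9}

root_value = alphabeta('root', 2, -math.inf, math.inf, True,
                       lambda s: TREE[s], lambda s: s in LEAVES,
                       lambda s: LEAVES[s])
```

In a real player, `evaluate` would serve as `value_of` at the depth cut-off; the quality of the whole player then rests on the human-supplied `features` and `weights`.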
    Given the above, are there enough resources to solve the problem?
   By the time the human has answered these questions, it could well be ar-
gued again that much of the real work has already been done.
    The theme of chess-engineering in AI is discussed further in a panel at IJCAI–91
([LHM+91]).
2.3 Machine Learning
Recently, several researchers have investigated the problem of developing com-
puter programs which could learn how to play games ([Len83, Tad89, Eps91,
CBKF91, LM88, LS91, CFR91]). However, this research is difficult to eval-
uate, and even more difficult to use for learning in different games, because
the games, representations, learning methods, and amount of knowledge en-
gineering vary with each researcher.
    In addition, most learning methods are designed to improve by
watching or playing against knowledgeable opponents (e.g., [Tad89, Eps91,
CBKF91, LM88, LS91, CFR91]). Although it is certainly important to under-
stand how a program (or person) could learn from good players, it is equally
important to know how those good players became good in the first place. Until
we have an understanding of discovery ([Len83]), progress in machine learn-
ing will not save us from having to preface advice to other researchers with
the statement: “first, find a human expert.”
    An example from chess may illustrate this problem. It is well known that
obtaining a passed pawn may increase one’s winning chances in some posi-
tions. While the knowledge engineering approach would build this knowledge
in, a machine learning approach would have a program discover this, pre-
sumably by one player in the game creating a passed pawn, promoting it to a
queen, and going on to win the game. But since it requires many careful moves
to achieve the promotion of a pawn, this will generally not happen unless one
player is actively trying to do so. Now, two lazy learning programs competing
against each other will each try to gain their knowledge from their opponent.
But since neither one has this knowledge yet, they are unlikely to observe it in
practice, so neither is likely to learn this concept. Thus neither program will
become even reasonably good at chess. This only serves to show that learning
approaches which depend on better players to show the way do not address
the important issue of knowledge origins, which must be addressed in order to
develop good game-players without relying on humans to do the real analysis.
    Flann ([FD89]) discusses the “fixed representation trick”, in which many developers of
learning systems spend much of their time finding a representation of the game which will
allow their systems to learn how to play.
   Thus, despite our being a field full of experts on getting computers to play
games, and having developed world-champion-level game-specific programs,
we are forced to leave most of the real game analysis to be done by the human
researcher, and not by the computer program.
3.1.1 Desiderata for a Class of Games
Many different variants of Metagame can be played, depending on what class
of games it is based upon. Here we state a few desiderata for classes of games:
    One type of game which has been studied extensively is positional games,
in which pieces are never moved or removed once they are played. This class of
games has been the domain for several general game-learning systems ([Eps91,
Kof68]), and could easily serve as a Metagame class definition.
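Concretely, the defining restriction of this class can be captured in a few lines. The sketch below is an illustrative toy (not the representation used by [Eps91] or [Kof68]): the only kind of move is placing a piece on an empty square, and nothing is ever moved or removed afterwards.

```python
# Toy sketch of a positional game: the only move is placement on an
# empty square; pieces are never moved or removed once played.

def legal_moves(board):
    # Every empty square is a legal placement for the player to move.
    return [sq for sq, piece in board.items() if piece is None]

def apply_move(board, square, player):
    # Placement is the only state change; the board only ever fills up.
    assert board[square] is None, "positional games never displace pieces"
    new_board = dict(board)
    new_board[square] = player
    return new_board

# A 3x3 board, as in tic-tac-toe (itself a positional game):
empty_board = {(r, c): None for r in range(3) for c in range(3)}
after_one = apply_move(empty_board, (1, 1), 'X')
```

The monotone structure (legal moves only ever shrink) is exactly the simplicity and regularity that makes this class attractive for learning systems, and that the next paragraph argues is also its limitation.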
    However, this class is both simple and regular (thus falling short on several
desiderata), and there are well-researched games which fall outside this re-
stricted class. In particular, it would be interesting to play Metagame based on
a class complex enough to include the chess-like games, which have received
much of the attention thus far in CGP. A companion paper ([Pel92]) devel-
ops the necessary components to play Metagame in the class of symmetric
chess-like games.
3.1.2 Representation
In order to write programs which can accept a set of different games, we must
specify how these games will be represented. Although fully-general represen-
tation languages are possible (like first-order logic or Turing Machines), it is
likely that classes of games will be much more specific, especially those which
can actually be produced by a generator. So, any representation language can
be used, so long as the games produced are guaranteed to be unambiguous
in the chosen representation, and so long as the semantics corresponding to
the representation is clear. A natural method of representation, pursued in
([Pel92]), is by means of a game grammar.
    The class of mathematical games ([BCG82]) is also fully general, which suggests that this
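For illustration only (this is not the grammar of [Pel92], and every field name below is invented), a rule set in such a representation might be a structured term with a fixed interpretation that any competing program can parse:

```python
# Hypothetical rule representation for a generated game; all field
# names here are invented for illustration, not taken from [Pel92].
GAME_RULES = {
    "board": {"rows": 7, "cols": 7},
    "pieces": {
        "slider": {"leap": (1, 0), "ride": True},   # may repeat its leap
        "hopper": {"leap": (1, 2), "ride": False},  # single knight-like leap
    },
    "goal": {"type": "eradicate", "target": "slider"},
}

def movement_of(rules, piece):
    # The 'semantics' of the representation: a fixed interpretation
    # procedure shared by all players, so the rules are unambiguous.
    return rules["pieces"][piece]
```

The essential point is not the particular encoding but that generator and players agree on one unambiguous interpretation procedure for it.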
    If we do take humans out of the loop, what do they do in the meantime? One suggestion
is to have them play Metagame on the new games as well, and the winning human can then
play the winning computer for a real test of human vs. machine!
3.3 Game Generator
Given a class of games, there are two ways to produce new instances within
this class: we can either use human-generated games, or program-generated
games.
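A program-generated instance is produced by sampling from the class definition. The sketch below is a toy version (the real generator of [Pel92] is far richer, and all parameters here are invented); seeding the random stream makes the generator transparent and its output reproducible by every competitor.

```python
import random

def generate_game(seed):
    # Draw one game from a toy class of chess-like games. Publishing
    # the seed (or the generated rules themselves) gives all
    # competitors the same fixed target.
    rng = random.Random(seed)
    size = rng.choice([5, 6, 7, 8])
    pieces = {}
    for i in range(rng.randint(2, 4)):
        pieces[f"piece_{i}"] = {
            "leap": (rng.randint(0, 2), rng.randint(1, 2)),
            "ride": rng.random() < 0.5,     # rider vs. single leaper
        }
    goal = rng.choice(["eradicate", "arrive", "stalemate"])
    return {"board": {"rows": size, "cols": size},
            "pieces": pieces, "goal": goal}
```

Because the same seed always yields the same game, the generated game, and not the taste of any human designer, becomes the object of study.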
    Another way of stating this is that writing programs to play any games generated by
humans may result in researchers attempting to hit a moving target. A transparent generator
is needed to make the problem well-defined.
thus far.
    The link between game-evolution and game-playing may be even tighter than we imagine.
Games often change when strategic analysis has rendered them either too difficult or too
boring, and each change to a game introduces new opportunities for strategic analysis, so
that games and their strategies evolve together.
    It would be interesting to allow programs to negotiate draws, to avoid their playing
out games that neither player can win. However, this ability complicates the rules of
competition, and so will be left as an idea for the future.
    A concern which arises if we decide on fixed limits in real time is that power-
ful computers would have an advantage. It could be argued that a Metagame
tournament is a chance to require all programs to have equivalent resources,
as it would again be unfortunate to have clever programs defeated by brute-
force programs running on vastly more powerful machines. However, limits in
real time are probably unavoidable, as it would be very difficult to make this
comparison across different architectures. Further, the presence of powerful
machines in a tournament presents the possibility for interesting confronta-
tions between sophisticated and brute-force game-playing methods.
4 Research Areas
4.1 The Full Issues
Metagame thus encourages researchers to consider the full issues behind game-
playing. Because the games are possibly new, we cannot provide our programs
with many ready-made features other than the rule-system defining the games
and the goals. Because the programs have resource limits, and the games have
grossly varying search spaces, we cannot assume that a fixed search strategy
(like minimax) will necessarily be applicable. So good programs will modify
their search strategies to meet the new games. Because they are not playing
against experts, they cannot assume that the moves played by the winner were
necessarily better, and learning systems must begin to address the problems
of discovery. Further prominent issues to be addressed in this context are the
tradeoff between exploration and exploitation ([Pel91]), opponent modelling,
transfer of learned knowledge across games, and limited rationality. Perhaps
most importantly, in this general context it would be very interesting to eval-
uate the tradeoff between knowledge and search: can a brute-force program
still defeat knowledge-based reasoners, even in Metagame?
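The exploration-exploitation tradeoff, for example, already arises in move selection. A minimal ε-greedy sketch (illustrative only, not the proposal of [Pel91]): with probability ε the program plays a move it is unsure about, trading present winning chances for information usable in later games of the match.

```python
import random

def choose_move(moves, estimated_value, epsilon, rng=random):
    # Epsilon-greedy selection: usually exploit the move currently
    # believed best; occasionally explore to gain information that may
    # pay off over the remaining games of the match.
    if rng.random() < epsilon:
        return rng.choice(moves)               # explore
    return max(moves, key=estimated_value)     # exploit
```

Setting ε high early in a match and low near its end is one simple way a program could trade chances of gaining useful information against chances of winning the present game.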
Some programs might pre-compile search knowledge, while others might refine
after playing. An entire spectrum of game-learning strategies could be
compared and tested in a
performance context of competition.
5 Conclusion
In sum, Metagame presents a framework and a test bed for addressing many in-
teresting and important questions about game analysis, which until now has
largely been performed by humans and built directly into programs. This pa-
per has presented the general problems and concerns with this new challenge,
and a companion paper ([Pel92]) addresses the specific issues involved in con-
structing the necessary components to play it in a particular class of games.
   Leaving instruction and communication to future research, then, the next
step in Metagame is to have a game-learning tournament, where programs
compete against different opponents in widely-varying games, analyse new
games without the help of human game-specific knowledge, learn by play-
ing and exploring, and trade off chances of gaining useful information with
chances of winning a present game.
   In conclusion, meeting the challenge of Metagame may shift the field of
computer game-playing back from an engineering to a research discipline,
wherein winning a game would again be a sign that the program, and not
simply its programmer, is doing something intelligent.
6 Acknowledgements
Thanks to Victor Allis, Nick Flann, Mike Frank, Innes Ferguson, Robert Levin-
son, Karl MacDorman, Dan Pehoushek, Mark Torrance, Prasad Tadepalli, and
David Wilkins for interesting discussions. Thanks also to Jaap van den Herik
for careful proofreading. Thanks especially to my supervisors, Steve Pulman
and Manny Rayner, and to my mentor-in-absentia, Peter Cheeseman.
References
[AvdHH91]     L.V. Allis, H.J. van den Herik, and I.S. Herschberg. Which
              games will survive? In D.N.L. Levy and D.F. Beal, editors,
              Heuristic Programming in Artificial Intelligence 2 – The Sec-
              ond Computer Olympiad. Ellis Horwood, 1991.
[AvdMvdH91] L.V. Allis, M. van der Meulen, and H.J. van den Herik.
            Databases in awari. In D.N.L. Levy and D.F. Beal, editors,
            Heuristic Programming in Artificial Intelligence 2 – The Sec-
            ond Computer Olympiad. Ellis Horwood, 1991.
[BCG82]       E.R. Berlekamp, J.H. Conway, and R.K. Guy. Winning Ways
              for Your Mathematical Plays. Academic Press, 1982.
[Don92]       Ch. Donninger. The relation of mobility, strategy and the mean
              dead rabbit in chess. In H.J. van den Herik and L.V. Allis, ed-
              itors, Heuristic Programming in Artificial Intelligence 3 – The
              Third Computer Olympiad. Ellis Horwood, 1992.
[FD89]     Nicholas S. Flann and Thomas G. Dietterich. A study of
           explanation-based methods for inductive learning. Machine
           Learning, 4:187–226, 1989.
          International Joint Conference on Artificial Intelligence, pages
          694–700, 1989.