abstract = "Monte-Carlo Tree Search algorithms (MCTS [4, 6]),
including upper confidence trees (UCT [9]), are known
for their impressive ability in high dimensional
control problems. Whilst the main test bed is the game
of Go, there are increasingly many applications [13,
12, 7]; these algorithms are now widely accepted as
strong candidates for high-dimensional control
applications. Unfortunately, it is known that for
optimal performance on a given problem, MCTS requires
some tuning; this tuning is often handcrafted or
automated, with in some cases a loss of consistency,
i.e. a bad behavior asymptotically in the computational
power. This highly undesirable property led to a stupid
behavior of our main MCTS program MoGo in a real-world
situation described in section 3. This is a big trouble
for our several works on automatic parameter tuning [3]
and the genetic programming of new features in MoGo. We
will see in this paper:
-- A theoretical analysis of MCTS consistency;
-- Detailed examples of consistent and inconsistent
known algorithms;
-- How to modify a MCTS implementation in order to
ensure consistency, independently of the modifications
to the scoring module (the module which is
automatically tuned and genetically programmed in
MoGo);
-- As a by product of this work, we'll see the
interesting property that some heavily tuned MCTS
implementations are better than UCT in the sense that
they do not visit the complete tree (whereas UCT
asymptotically does), whilst preserving the consistency
at least if consistency modifications above have been
made.",