Chapter 5: Adversarial Search

ProbCut

A forward pruning version of alpha-beta search that uses statistics gained from prior experience to lessen the chance that the best move will be pruned.

Cutoff-Test(s, d)

A function of the state (s) and depth (d) that decides whether to cut off search and apply Eval. It also returns true for terminal states. The simplest approach uses a fixed depth limit d, chosen so that a move is selected within the allotted time. A more robust approach is to apply iterative deepening: when time runs out, the program returns the move selected by the deepest completed search.
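A minimal sketch of a depth-limited search built on a cutoff test, assuming a game tree written as nested Python lists with numeric leaves. The names `cutoff_test`, `evaluate`, and `h_minimax`, and the "mean of reachable leaves" heuristic, are illustrative assumptions and not part of the definition above.

```python
def is_terminal(node):
    # In this toy representation, any non-list node is a terminal state.
    return not isinstance(node, list)

def leaves(node):
    # Collect all terminal values below a node.
    if is_terminal(node):
        return [node]
    return [value for child in node for value in leaves(child)]

def evaluate(node):
    # Hypothetical heuristic: the mean of the reachable terminal values.
    # For a terminal node this is simply its utility.
    values = leaves(node)
    return sum(values) / len(values)

def cutoff_test(node, depth, limit):
    # Cut off at terminal states or once the depth limit is reached.
    return is_terminal(node) or depth >= limit

def h_minimax(node, depth, limit, maximizing):
    # Minimax that applies the heuristic wherever the cutoff test fires.
    if cutoff_test(node, depth, limit):
        return evaluate(node)
    values = [h_minimax(child, depth + 1, limit, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

TREE = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
shallow = h_minimax(TREE, 0, 1, True)  # heuristic estimate after one ply
deep = h_minimax(TREE, 0, 2, True)     # exact value, search reaches the leaves
```

Calling `h_minimax` with limits 1, 2, 3, ... in turn and keeping the latest completed result is one way to approximate iterative deepening.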

Terminal Test

A function that is true when a game is over and false otherwise.

Zero-Sum Game

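A game in which the total payoff to all players is the same for every instance of the game, so one player's gain is the other's loss. In chess the payoffs are 0 + 1, 1 + 0, or 1/2 + 1/2; strictly speaking, this makes chess a constant-sum game.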

Transposition Table

A hash table of previously seen positions and their values. Used to avoid re-searching transpositions: different permutations of the same move sequence that end up in the same position.
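A small sketch of how a transposition table can cut repeated work, assuming a hypothetical take-away game: one pile of stones, each player removes 1, 2, or 3, and whoever takes the last stone wins. The table is a plain `dict` keyed by (stones left, player to move); the function and variable names are illustrative.

```python
def minimax(stones, max_to_move, table, counter):
    # table: dict used as the transposition table, or None to disable it.
    # counter: one-element list counting how many positions are expanded.
    key = (stones, max_to_move)
    if table is not None and key in table:
        return table[key]  # position already searched via another move order
    counter[0] += 1
    if stones == 0:
        # The previous player took the last stone and therefore won.
        value = -1 if max_to_move else 1
    else:
        values = [minimax(stones - k, not max_to_move, table, counter)
                  for k in (1, 2, 3) if k <= stones]
        value = max(values) if max_to_move else min(values)
    if table is not None:
        table[key] = value
    return value

with_table, without_table = [0], [0]
value_cached = minimax(10, True, {}, with_table)
value_plain = minimax(10, True, None, without_table)
```

Taking 1 then 2 stones reaches the same position as taking 2 then 1 when the same player is to move, which is exactly the kind of transposition the table catches.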

Eval(s)

A heuristic that estimates the expected utility of the game from a given state, s. Replaces the Utility function when search is cut off before a terminal state. Most evaluation functions calculate features of the state (e.g., the number of each type of chess piece) and then either: a. sort states with the same feature values into categories and use the expected value of each category, or b. combine the feature values with a weighted linear or non-linear function.
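As an illustration of option b, here is a minimal weighted linear evaluation based on chess material. The weights are the conventional textbook piece values (pawn 1, knight and bishop 3, rook 5, queen 9); the function name `material_eval` and the dictionary-based board summary are assumptions made for this sketch.

```python
# Conventional material values; each weight multiplies one feature.
WEIGHTS = {"pawn": 1, "knight": 3, "bishop": 3, "rook": 5, "queen": 9}

def material_eval(own_pieces, opponent_pieces):
    # Each feature is the difference in piece counts for one piece type.
    # The evaluation is the weighted sum of those features.
    return sum(
        weight * (own_pieces.get(piece, 0) - opponent_pieces.get(piece, 0))
        for piece, weight in WEIGHTS.items()
    )

own = {"pawn": 8, "rook": 2, "queen": 1}
opponent = {"pawn": 8, "rook": 2}
score = material_eval(own, opponent)  # positive means the position favors us
```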

Policy

A mapping from every possible state to the best move in that state. Used in endgame when the search space is reduced.

Chance Node

A node in a game tree that represents a random event such as a dice roll. Each branch leaving it is labeled with a possible outcome and the probability of that outcome.

Horizon Effect

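A scenario where a program is facing an opponent's move that causes serious damage and is ultimately unavoidable, but can be temporarily avoided with delaying moves. The delay pushes the damage beyond the search depth limit (the horizon), so the program wrongly believes it has been averted.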

Strategy

Specifies MAX's move in the initial state and in every state that can result from each possible response by MIN. An optimal strategy leads to outcomes at least as good as any other strategy when one is playing an infallible opponent.

Forward Pruning

A strategy where some moves at a given node are pruned immediately without further consideration.

Search Tree

A tree that is superimposed on the full game tree, and examines enough nodes to allow a player to determine what move to make.

Game Tree

A tree where nodes are game states and edges are moves. Can be used to determine the optimal strategy from the minmax value of each node, Minmax(n). Can be defined by the initial state, Actions(s), and Result(s, a).

Beam Search

An approach to forward pruning. On each ply, consider only a "beam" of the n best moves rather than considering all possible moves. Dangerous, as it may prune the best move.
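A toy sketch of the beam idea and its risk, assuming each move has a static heuristic score (used to choose the beam) and a true value that only deeper search would reveal. The score tables and the name `beam` are made up for illustration.

```python
def beam(moves, static_score, n):
    # Keep only the n moves that look best according to the static score.
    return sorted(moves, key=static_score, reverse=True)[:n]

# Hypothetical numbers: move "c" looks poor statically but is actually best.
STATIC = {"a": 5, "b": 4, "c": 1}
TRUE_VALUE = {"a": 2, "b": 3, "c": 9}

kept = beam(list(STATIC), STATIC.get, 2)
best_kept = max(kept, key=TRUE_VALUE.get)
best_overall = max(TRUE_VALUE, key=TRUE_VALUE.get)
```

Here the beam of width 2 never examines "c", so the search settles on "b" even though "c" is the best move.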

Game

Another term for adversarial search problems.

Alpha-Beta Pruning (Stochastic Games)

Can be done by placing upper and lower bounds on the utility function values. With bounded leaf values, the average at a chance node can be bounded without searching all of its children, so the node can be pruned before its exact ExpectiMinMax value is known.

Minmax Algorithm

Computes the minmax decision from the current state using a recursive computation of the minmax values of each successor state. The recursion proceeds down the tree to the leaf nodes, and then the minmax values are backed up through the tree as recursion unwinds. Performs complete depth-first search of the game tree. If max depth of the tree is m and there are b moves at each point, then time complexity is O(b^m) and space complexity is O(bm) (or O(m) if the algorithm generates actions one at a time). Impractical for many problems as it must generate the entire game search space.
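The recursion described above can be sketched on a small explicit tree. This assumes a nested-list representation in which numeric leaves are terminal utilities for Max and levels alternate between Max and Min; the function names are illustrative.

```python
def minmax_value(node, maximizing):
    # Leaves carry the utility for Max; inner nodes back up max or min.
    if not isinstance(node, list):
        return node
    values = [minmax_value(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

def minmax_decision(root):
    # Index of the root move whose successor has the highest minmax value.
    # After Max moves at the root, it is Min's turn in each successor.
    return max(range(len(root)), key=lambda i: minmax_value(root[i], False))

# Two-ply tree: Max chooses a branch, then Min chooses a leaf.
TREE = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
root_value = minmax_value(TREE, True)
best_move = minmax_decision(TREE)
```

Min would answer the three root moves with 3, 2, and 2 respectively, so Max picks the first branch.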

Quiescent

Describes positions whose values are unlikely to change drastically in the near future.

Multiagent Environments

Environments in which an agent needs to consider the actions of other agents and how they affect its own welfare.

Quiescence Search

Extra search that expands non-quiescent positions until quiescent positions are reached.

Minmax

Function to find the minmax value.

Stochastic Game

A game that relies on chance as well as skill. Its game tree must include chance nodes. Only the expected value of a position (the average over all possible outcomes of the chance nodes) can be calculated.

Evaluation Function

Heuristics for approximating the true utility of a state without doing a complete search.

Pruning

Ignoring portions of the search tree that make no difference to the final choice.

Ply

One move by one player, which corresponds to one level of a game tree. A tree in which each player makes a single move is two plies deep.

ExpectiMinMax

Operates like MinMax, but computes ExpectiMinMax values to account for chance in stochastic games. Must also use ExpectiEval. Time complexity is O(b^m n^m), where b is the branching factor, m is the maximum depth, and n is the number of distinct dice rolls.

Alpha-Beta Pruning

Prunes leaves and entire sub-trees that cannot influence the final decision. Uses two parameters to describe bounds on the backed-up values that appear anywhere along the path:
α = the value of the best (highest-value) choice found so far along the path for Max
β = the value of the best (lowest-value) choice found so far along the path for Min
On its own it is still impractical for deep trees, as it must search all the way to terminal states for at least a portion of the search space.
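A minimal sketch of alpha-beta on an explicit tree, assuming the same nested-list representation as a plain minimax example (numeric leaves, alternating Max and Min levels). The `visited` list is an illustrative addition that records which leaves are actually examined.

```python
import math

def alphabeta(node, alpha, beta, maximizing, visited):
    if not isinstance(node, list):
        visited.append(node)  # record every leaf that gets evaluated
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False, visited))
            if value >= beta:
                return value  # Min above would never allow this branch
            alpha = max(alpha, value)
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, alpha, beta, True, visited))
        if value <= alpha:
            return value  # Max above already has a better alternative
        beta = min(beta, value)
    return value

TREE = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
visited = []
root_value = alphabeta(TREE, -math.inf, math.inf, True, visited)
```

On this tree the leaves 4 and 6 are never examined: once the second Min node sees 2, it cannot beat the 3 that Max already has.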

Game as Search Problem

S0: The initial state.
Player(s): Defines which player has the move in a state.
Actions(s): Returns the set of legal moves in a state.
Result(s, a): The transition model, which defines the result of a move.
Terminal-Test(s): Function that is true when state s is a terminal state.
Utility(s, p): A utility function that defines the final numeric value for a game that ends in terminal state s for a player p.
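The six elements can be written down for a concrete toy game. This sketch assumes a hypothetical take-away game (one pile, remove 1 or 2 stones, the player who takes the last stone wins) and uses lowercase Python names for the functions listed above; a state is a tuple of stones left and the player to move.

```python
S0 = (5, "MAX")  # initial state: 5 stones, MAX moves first

def player(s):
    # Which player has the move in state s.
    return s[1]

def actions(s):
    # Legal moves: remove 1 or 2 stones, but never more than remain.
    return [k for k in (1, 2) if k <= s[0]]

def result(s, a):
    # Transition model: remove a stones and pass the turn.
    stones, mover = s
    return (stones - a, "MIN" if mover == "MAX" else "MAX")

def terminal_test(s):
    # The game is over when no stones remain.
    return s[0] == 0

def utility(s, p):
    # In a terminal state, the winner is the player who just moved,
    # i.e. the one who is NOT to move in s. Payoffs are 1 (win) or 0 (loss).
    winner = "MIN" if s[1] == "MAX" else "MAX"
    return 1 if winner == p else 0

after_two = result(S0, 2)
```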

ExpectiMinMax Value

Same as the MinMax value, except at chance nodes, where the expected value is calculated as the sum of the values of all outcomes, each weighted by the probability of that chance action.
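A small sketch of this computation, assuming a tree encoded as tagged tuples: `("max", children)`, `("min", children)`, and `("chance", [(probability, child), ...])`, with plain numbers as leaves. The encoding and the example probabilities are illustrative assumptions.

```python
def expectiminimax(node):
    if isinstance(node, (int, float)):
        return node  # terminal utility
    kind, children = node
    if kind == "max":
        return max(expectiminimax(child) for child in children)
    if kind == "min":
        return min(expectiminimax(child) for child in children)
    if kind == "chance":
        # Probability-weighted average over the chance outcomes.
        return sum(p * expectiminimax(child) for p, child in children)
    raise ValueError(f"unknown node kind: {kind}")

# Max picks a move, a chance event (0.9 / 0.1) follows, then Min replies.
TREE = ("max", [
    ("chance", [(0.9, ("min", [2, 2])), (0.1, ("min", [3, 3]))]),
    ("chance", [(0.9, ("min", [1, 1])), (0.1, ("min", [4, 4]))]),
])
root_value = expectiminimax(TREE)
```

The first move is worth 0.9 × 2 + 0.1 × 3 = 2.1 and the second 0.9 × 1 + 0.1 × 4 = 1.3, so the root value is 2.1.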

Belief State

Set of all logically possible board states given the complete percept history.

Monte Carlo Simulation (Stochastic Game)

Start with an alpha-beta (or other) search algorithm. From a start position, have the algorithm play many games (e.g., thousands) against itself, using random dice rolls. The fraction of games won estimates the value of the position, and the move leading to the highest estimated value is chosen. For dice games this is called a rollout.
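A minimal rollout sketch, assuming a hypothetical dice race: two players alternately roll a six-sided die and advance, and the first to reach square 10 wins. Because this toy game has no decisions after the first move, the self-play inside each rollout is just dice rolling; a real program would call its search algorithm at every turn. All names and the seeded generator are illustrative choices.

```python
import random

GOAL = 10

def rollout(my_pos, opp_pos, my_turn, rng):
    # Play one game to the end with random dice; return 1 if we win, else 0.
    while True:
        roll = rng.randint(1, 6)
        if my_turn:
            my_pos += roll
            if my_pos >= GOAL:
                return 1
        else:
            opp_pos += roll
            if opp_pos >= GOAL:
                return 0
        my_turn = not my_turn

def rollout_value(state, n_games, rng):
    # Estimated value of a position: the fraction of rollouts we win.
    return sum(rollout(*state, rng) for _ in range(n_games)) / n_games

def choose_move(candidates, n_games=1000, seed=0):
    # candidates maps a move name to the position it leads to.
    rng = random.Random(seed)  # seeded only to make the sketch repeatable
    values = {name: rollout_value(state, n_games, rng)
              for name, state in candidates.items()}
    return max(values, key=values.get), values

# Positions are (our square, opponent's square, is it our turn?).
best, estimates = choose_move({"ahead": (9, 0, False), "behind": (0, 9, False)})
```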

Terminal State

State where a game has ended.

Singular Extension

Strategy to mitigate the horizon effect. A singular extension is a move that is clearly better than all other moves in a given position. Once discovered anywhere in the tree, the move is remembered; when the search reaches the depth limit, the algorithm checks whether the singular extension is a legal move, and if it is, the move is considered beyond the limit.

Retrograde

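Strategy where the rules of the game are reversed (making unmoves rather than moves) to work backward from known won positions and find the positions from which Max wins no matter how Min replies. Combined with a policy, this creates an infallible lookup table for an endgame.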

ExpectiEval(s)

The heuristic must be a positive linear transformation of the probability of winning from a position (or expected utility of the position).
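A small numeric sketch of why the transformation has to be positive and linear, assuming two moves that each lead to a chance node with a 0.9 / 0.1 split. The values, the `RESCALE` table, and the function names are illustrative.

```python
def best_move(moves):
    # moves maps a move name to a list of (probability, value) outcomes.
    # Returns the move with the highest expected value.
    return max(moves, key=lambda m: sum(p * v for p, v in moves[m]))

def transform(moves, f):
    # Apply f to every leaf value, leaving the probabilities unchanged.
    return {m: [(p, f(v)) for p, v in outcomes] for m, outcomes in moves.items()}

MOVES = {"a1": [(0.9, 2), (0.1, 3)], "a2": [(0.9, 1), (0.1, 4)]}

# Order-preserving but non-linear rescaling of the same leaves.
RESCALE = {1: 1, 2: 20, 3: 30, 4: 400}

original = best_move(MOVES)
linear = best_move(transform(MOVES, lambda v: 10 * v + 5))
nonlinear = best_move(transform(MOVES, lambda v: RESCALE[v]))
```

The original and linearly rescaled values both favor a1 (2.1 vs 1.3, then 26 vs 18), while the non-linear rescaling flips the choice to a2 (21 vs 40.9) even though the ordering of the leaves is unchanged.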

Minmax Decision

The optimal choice for Max (that leads to the state with the highest minmax value).

Minmax Value

The utility (for Max) of being in the corresponding state, assuming both players play optimally from there to the end of the game. Max prefers to move to a state of maximum value and Min to a state of minimum value. If Min plays sub-optimally, Max does at least as well. Max thus maximizes its worst-case outcome.

State Estimation

Used to track the belief state as a deterministic partially-observable game progresses.

Game Theory

Views multiagent environments as games, provided that the impact of each agent on the others is "significant," regardless of whether the agents are cooperative or competitive.

