5. Alpha-beta pruning


Benefits

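Reduces the running time of minimax by exploring fewer nodes while still returning the same minimax value. The savings can be large in games with large state spaces: with ideal move ordering, roughly O(b^(d/2)) nodes are examined instead of O(b^d), where b is the branching factor and d is the search depth.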

Criteria for pruning & rationale

Alpha (α) is the best (highest) value that the maximizing player (Max) is assured of so far along the path from the root to the current node. It is the value Max can already guarantee based on the moves explored so far. Beta (β) is the best (lowest) value that the minimizing player (Min) is assured of so far along the same path. It is the value Min can already guarantee based on the moves explored so far.

At a Min node: if a child's value is less than or equal to the current α, the path through this Min node cannot give Max a better outcome than one Max has already found elsewhere in the tree. Min will hold Max to at most that child's value, and Max always goes for the maximum, so Max will never choose the move leading to this Min node. This gives the pruning criterion: once α ≥ β at a Min node, stop evaluating the rest of its children.

At a Max node: if a child's value is greater than or equal to the current β, this path is too good for Max, and Min will avoid it because Min already has a better (lower) option elsewhere. The remaining children of the Max node can be pruned, since evaluating them cannot change Min's decision.

Propagation

At a MAX node: The α (alpha) value represents the best (i.e., highest) value that the maximizing player can guarantee based on paths explored so far. The β (beta) value represents a kind of "threshold" that comes from a higher-level MIN node (if there is one). If the value at the MAX node ever exceeds or equals this β value, we know that the minimizing player has a better option elsewhere, and we can prune the remaining branches under this MAX node. The β value at a MAX node is not "natively" determined by that MAX node itself; it's inherited from a higher-level (parent or ancestor) MIN node. If no such MIN node has set a β value yet (e.g., if the MAX node is the root of the tree), then for practical purposes, the β value can be considered as positive infinity (i.e., no immediate threshold for pruning). Conversely, at a MIN node, the α value is similarly inherited from a higher-level MAX node. This propagation of α and β values through the tree is what facilitates Alpha-Beta pruning and helps to avoid unnecessary exploration of branches that won't affect the final decision.
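The inheritance described above can be made visible by logging the (α, β) window each node receives from its ancestors. This is only a minimal sketch under assumptions not in the original card: the tree is a nested Python list whose numbers are terminal values, and the function name trace_windows and the path labels are hypothetical.

```python
import math

def trace_windows(node, alpha=-math.inf, beta=math.inf,
                  maximizing=True, path="root", log=None):
    """Alpha-beta search that also records the window each inner node inherits."""
    if log is None:
        log = []
    if not isinstance(node, list):          # terminal node: just return its value
        return node, log
    # Record the window exactly as it was passed down from the ancestors.
    log.append((path, "MAX" if maximizing else "MIN", alpha, beta))
    value = -math.inf if maximizing else math.inf
    for i, child in enumerate(node):
        child_value, _ = trace_windows(child, alpha, beta,
                                       not maximizing, f"{path}.{i}", log)
        if maximizing:
            value = max(value, child_value)
            alpha = max(alpha, value)       # Max tightens the lower bound
        else:
            value = min(value, child_value)
            beta = min(beta, value)         # Min tightens the upper bound
        if alpha >= beta:                   # window closed: skip remaining children
            break
    return value, log

# Illustrative three-level tree: MAX root, MIN children, MAX grandchildren.
tree = [[[3, 5], [6, 9]], [[1, 2], [0, 7]]]
value, log = trace_windows(tree)
for path, kind, a, b in log:
    print(path, kind, a, b)
print("root value:", value)                 # root value: 5
```

In this run the MAX node root.0.1 receives β = 5 from its parent MIN node, and the MIN node root.1 receives α = 5 from the root, while the root itself starts with the full window (-inf, inf). The node root.1.1 never appears in the log because it is pruned.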

Definition

Definition: An optimization technique used with the minimax algorithm for decision-making in two-player, zero-sum games (e.g., chess and checkers). Purpose: To reduce the number of nodes (game states) evaluated in the search tree, increasing search efficiency.

Algorithm

Eliminates branches in the search tree that we know the players will not choose because they are not optimal. Uses two threshold values, alpha (α) and beta (β):
- α: Highest value found along the path so far (the best score the maximizing player can guarantee).
- β: Lowest value found along the path so far (the best score the minimizing player can guarantee).
Initially:
- α = negative infinity (worst possible score for the maximizing player).
- β = positive infinity (worst possible score for the minimizing player).
Each player tracks their best value. If α ≥ β at any step, the current branch is pruned. The algorithm otherwise functions like minimax, but it prunes unwanted branches using the α and β values.
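A compact sketch of the algorithm may help here. It assumes, purely for illustration, that the game tree is a nested Python list in which a number is a terminal utility and a list is an inner node; the function name alphabeta and the example tree are not from the original card.

```python
import math

def alphabeta(node, alpha=-math.inf, beta=math.inf, maximizing=True):
    """Return the minimax value of node, skipping branches that cannot matter."""
    if not isinstance(node, list):          # terminal node
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False))
            alpha = max(alpha, value)       # best score Max can guarantee so far
            if alpha >= beta:               # Min already has a better option elsewhere
                break
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, alpha, beta, True))
        beta = min(beta, value)             # best score Min can guarantee so far
        if alpha >= beta:                   # Max already has a better option elsewhere
            break
    return value

# MAX root with three MIN children, each with three leaves.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(alphabeta(tree))                      # 3
```

In this example the second MIN node is abandoned right after its first leaf (2), because Max already holds a guaranteed 3 from the first branch.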

Pruning

Maintain alpha for each Max node and beta for each Min node. If the beta of a Min node is less than or equal to the alpha of its parent Max node, prune the remaining children of that Min node; the algorithm will make another choice. If the alpha of a Max node is greater than or equal to the beta of its parent Min node, prune the remaining children of that Max node; the algorithm will make another choice.

Min and Max Nodes

Starting at a Max node: Begin with alpha = -infinity at the root (you are looking for the maximum value and have not seen any values yet); a Max node deeper in the tree starts from the alpha passed down by its ancestors. Explore the first child (typically a Min node). Once this child is fully evaluated, if its value is greater than the current alpha, update alpha with this value. Move to the next child. If at any point the current alpha is greater than or equal to the beta passed down from the parent, stop and prune all remaining children.

Starting at a Min node: The process mirrors the Max case. Begin with beta = +infinity or the beta passed down by the node's ancestors, lower beta whenever a fully evaluated child has a smaller value, and prune the remaining children as soon as beta is less than or equal to the alpha passed down from the parent.
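The Max-node steps above can be isolated in a small sketch. As a simplifying assumption, the children's values are given up front as a list; in a real search each one would come from recursively evaluating a Min node. The function name scan_max_node is hypothetical.

```python
import math

def scan_max_node(child_values, alpha=-math.inf, beta=math.inf):
    """Process one Max node; return (value, alpha, number of children examined)."""
    value = -math.inf
    examined = 0
    for v in child_values:
        examined += 1
        value = max(value, v)
        alpha = max(alpha, value)   # raise alpha when a child beats it
        if alpha >= beta:           # parent Min node will never allow this outcome
            break                   # prune all remaining children
    return value, alpha, examined

print(scan_max_node([2, 7, 9, 4]))          # (9, 9, 4): no cutoff without a beta bound
print(scan_max_node([2, 7, 9, 4], beta=6))  # (7, 7, 2): cutoff after the second child
```

With beta = 6 inherited from the parent, the scan stops as soon as the child worth 7 is seen, so the children worth 9 and 4 are never examined.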

DFS Traversal

Starting from the root (typically a Max node): Traverse down the leftmost path, passing through alternating Max and Min nodes, until you reach a terminal node (leaf). Evaluate the utility of the terminal node. If the leaf is not a game-ending state, this is typically done with an evaluation function.

Propagate the value upwards: If the leaf is a child of a Min node, set the initial value of that Min node to the value of the leaf. If the Min node has more children (other leaves or subtrees), evaluate them and keep the minimum. This continues until all children of the Min node have been evaluated. Then propagate the value of the Min node up to its parent (a Max node). If this is the first child of the Max node to be evaluated, set the initial value of the Max node to that value. If the Max node has more children, traverse them in the same depth-first manner and keep the maximum value found among them.

Alpha-Beta Pruning (if applied): While evaluating the children of a Min node, if the value found is less than or equal to alpha (the best value for the Max player so far), you can prune the remaining children because the Max player will never choose this path. Similarly, while evaluating the children of a Max node, if the value found is greater than or equal to beta (the best value for the Min player so far), you can prune the remaining children because the Min player will avoid this path.

Continue the traversal: After finishing with the leftmost child (subtree) of the root, move on to the next child and repeat the process until all children have been evaluated.

Determine the best move: Once all children of the root have been evaluated, the value of the root represents the best possible outcome for the maximizing player, given optimal play by both players. The move associated with that value is the best move.
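The traversal order and the final move choice can be sketched as follows. The nested-list tree, the function names search and best_move, and the visited list used to record leaf order are all illustrative assumptions, not part of the original card.

```python
import math

def search(node, alpha, beta, maximizing, visited):
    """Depth-first alpha-beta search that records each leaf it evaluates."""
    if not isinstance(node, list):
        visited.append(node)                # leaf reached: evaluate it
        return node
    value = -math.inf if maximizing else math.inf
    for child in node:                      # left to right, depth first
        child_value = search(child, alpha, beta, not maximizing, visited)
        if maximizing:
            value = max(value, child_value)
            alpha = max(alpha, value)
        else:
            value = min(value, child_value)
            beta = min(beta, value)
        if alpha >= beta:
            break                           # prune the remaining children
    return value

def best_move(tree):
    """Return (index of best root move, its value, leaves in visit order)."""
    visited = []
    best_index, best_value = None, -math.inf
    alpha = -math.inf
    for i, child in enumerate(tree):        # root is a Max node; children are Min nodes
        child_value = search(child, alpha, math.inf, False, visited)
        if child_value > best_value:
            best_index, best_value = i, child_value
        alpha = max(alpha, best_value)
    return best_index, best_value, visited

print(best_move([[4, 6], [3, 8], [7, 5, 9]]))
# (2, 5, [4, 6, 3, 7, 5, 9])
```

The leaf 8 is missing from the visit order: once the second Min node saw 3, which is below the 4 already secured by the first move, its remaining child was pruned. The third move wins with value 5.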

Strategy

Strategy and Parameters: Performs a depth-first search (DFS) like minimax, propagating the values from the terminal nodes to the root, starting with the leftmost branch. Maintains α and β values throughout the search:
- α: Largest value for Max across seen children (current lower bound on MAX's outcome).
- β: Lowest value for Min across seen children (current upper bound on MIN's outcome).
- Sends α and β values downwards during the search for pruning and updates them based on terminal node values.
- For a MAX node: If α ≥ β of its parent MIN node, prune the rest of the branches.
- For a MIN node: If β ≤ α of its parent MAX node, prune the rest of the branches.
- Explores the other paths otherwise.

Efficiency

Pruning cannot happen without some exploration. The power of alpha-beta pruning comes not from avoiding exploration entirely, but from avoiding the full exploration of certain branches. The idea is to recognize as early as possible that a particular branch cannot yield a better result than one already found elsewhere.

1. Initial values of α (alpha) and β (beta):
- α is the best (highest) value the maximizing player can achieve so far; it is initialized to negative infinity.
- β is the best (lowest) value the minimizing player can achieve so far; it is initialized to positive infinity.

2. Updating α and β:
- As the tree is explored and nodes are evaluated, these values are continually updated. α is updated whenever the maximizing player finds a better (higher) value, and β is updated whenever the minimizing player finds a better (lower) value.

3. Pruning logic:
- When the algorithm is exploring a Min node (the minimizing player's move) and finds a child whose value is less than or equal to α, it can stop exploring the other children of that Min node, because the maximizing player will never let the game reach this state (a better option is already available).
- Similarly, when the algorithm is at a Max node and finds a child whose value is greater than or equal to β, it can prune the remaining children, because the minimizing player would avoid this path.

A basic example: Imagine you are the maximizing player. You have already explored one branch of the game tree and found a move that guarantees you a score of 5. This becomes your alpha value. Now you start exploring another branch. You reach a minimizing node (the opponent's move) deep down that branch, and the first move you explore for the opponent leads to a score of 4, which is less than your alpha (5). At this point you do not need to explore the opponent's other moves in this branch, because you already know they have a move that limits you to 4. Since you have already found a move elsewhere that gets you 5, you will never choose this branch. Thus, you can prune the opponent's remaining moves in this branch.
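The 5-versus-4 example can be checked by counting how many leaves each method evaluates. This is a sketch under stated assumptions: the tree below is a flattened, hypothetical version of the example (one Min node worth 5, then a Min node whose first leaf is 4), and the one-element lists plain and pruned are just simple counters.

```python
import math

def minimax(node, maximizing, counter):
    """Plain minimax; counter[0] counts evaluated leaves."""
    if not isinstance(node, list):
        counter[0] += 1
        return node
    values = [minimax(child, not maximizing, counter) for child in node]
    return max(values) if maximizing else min(values)

def alphabeta(node, alpha, beta, maximizing, counter):
    """Alpha-beta search; counter[0] counts evaluated leaves."""
    if not isinstance(node, list):
        counter[0] += 1
        return node
    value = -math.inf if maximizing else math.inf
    for child in node:
        child_value = alphabeta(child, alpha, beta, not maximizing, counter)
        if maximizing:
            value = max(value, child_value)
            alpha = max(alpha, value)
        else:
            value = min(value, child_value)
            beta = min(beta, value)
        if alpha >= beta:
            break                           # the rest of this branch cannot matter
    return value

# First branch guarantees Max a 5; the second branch opens with a 4.
tree = [[5, 8], [4, 9, 7]]
plain, pruned = [0], [0]
plain_value = minimax(tree, True, plain)
pruned_value = alphabeta(tree, -math.inf, math.inf, True, pruned)
print(plain_value, plain[0])                # 5 5
print(pruned_value, pruned[0])              # 5 3
```

Both searches return 5, but alpha-beta evaluates 3 leaves instead of 5: after seeing the 4, it skips the leaves 9 and 7.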

Summary

Apply the pruning logic on the assumption that there is a parent node (the player who moved before you) trying to do the opposite of what the current node is doing, so at any node you are looking for either the smaller or the larger value. To decide whether to prune, stand at the parent and look down at the next levels: "Which option would I be prevented from picking?" Always fill in the tree upwards toward the parent, starting from the leftmost branch, and update the pair of alpha and beta values as you go. You prune not the branch you just visited, but the still-unseen branches under the parent of the node you just visited.

The Max player always looks to maximize the value. If, while traversing the tree, it finds a potential outcome greater than or equal to the best known value for the Min player (β), it knows the Min player will never let the game reach this state. It can therefore safely prune the rest of the possibilities under this branch, because the Min player has a more favorable option elsewhere.

The Min player always aims to minimize the value. If it finds a potential outcome less than or equal to the best known value for the Max player (α), it knows the Max player has a better option elsewhere and will never choose this path. Hence it can prune the subsequent branches.

This push and pull, or tug of war, between the two players is central to the logic of the algorithm. Each player assumes the other is playing optimally, so when evaluating moves they consider not only what is best for themselves but also the opponent's best response. This anticipation is what allows pruning: certain branches of the game tree become irrelevant given the optimal strategies of both players.

