CSCE 420 TAMU Exam 1
Maintaining Arc Consistency
MAC is a wrapper algorithm around AC-3 that iteratively makes another choice and calls AC-3 to propagate the consequences by reducing domains
a problem is solved when every node has just 1 value remaining
if any domain is empty, MAC must back-track to the previous choice point and try another value
stochastic HC
choose any successor that is better than the current state; must still bias the search upward; leads to simulated annealing
macro operators
create new operators from combinations of 2 or 3 actions, expanding number of successors
DeepBlue
custom ASICs for very fast minimax search; end-game database
arc-consistency
a graph is arc-consistent if for every variable X, for every value a in dom(X), for every variable Y it is connected to (by a constraint), there is a value b for Y that is consistent with X=a
Genetic algorithms
- maintain a population of multiple candidate states (parallel search, not just curr)
- mix and match states by recombination
- use fitness to select winners for each round, akin to 'natural selection' (fitness(state) is synonymous with value(s) or quality(s))
Minimax Search
- recall that ui(s)=0 for non-terminal states
- label alternating levels in the search tree as max nodes and min nodes
- define the minimax value for each state s
problems with board evaluation functions
- non-quiescence: use a dynamic IS-CUTOFF(s) test
- horizon effect: delaying the inevitable
Problems with Hill climbing
1. local maxima 2. plateau effect 3. ridge effect
Possible solutions to HC problems
1. random restart HC 2. stochastic HC 3. provide memory of previous states (leads to beam search) 4. macro operators ("macrops")
Graph search
BFS+checking for visited states (reached data structure)
CSP heuristics
MRV: select the variable with the minimum remaining values
LCV: select the value for a variable that is least constraining
degree heuristic: if all domains are equal-sized, choose the variable that is involved in the most constraints (connected to the most other vars)
BFS: time and space complexity complete & optimal?
FIFO
time: O(b^(d+1)), space: O(b^(d+1))
complete; optimal assuming all operators have equal cost
DFS: time and space complexity complete & optimal?
LIFO
time: O(b^m), space: O(bm), where m is the maximum depth
not complete, not optimal
Complexity of AC-3
O(cd³), where c is the number of edges (constraints) and d is the max domain size: d = max|dom(Vi)|; since c can be O(n²) for n variables, this is O(n²d³)
Computational Complexity of CSPs
Solving CSPs is NP-hard Determining whether CSPs have a solution is NP-complete
Simulated annealing acceptance
accept with prob = e^(-∆E/T), where ∆E = value(curr) - value(child)
- if child is only a little worse, ∆E is small, so accept with high prob
- if child is much worse, ∆E is large, and acceptance is less likely
T ("temperature") controls how loose or stringent we are
- in the limit T→∞: all backward steps allowed
- in the limit T=0: no backward steps are allowed
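A minimal sketch of this acceptance rule in Python; the function name and the injectable rng parameter are illustrative assumptions, not from the notes:

```python
import math
import random

def sa_accept(value_curr, value_child, T, rng=random.random):
    """Simulated-annealing acceptance rule (sketch).

    Better (or equal) children are always accepted; worse children are
    accepted with probability e^(-dE/T), dE = value(curr) - value(child).
    """
    if value_child >= value_curr:
        return True                      # always take uphill moves
    dE = value_curr - value_child        # > 0 for a worse child
    return rng() < math.exp(-dE / T)
```

A slightly worse child at high temperature is accepted often; a much worse child at low temperature almost never is.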
Beam search
adding "memory" to HC: keep track of the K best previous nodes (based on q(n)); allows some back-tracking, even though it is still not complete enough to explore the whole space
ridge effect
all neighbors have the same or lower score, even though there might be other nearby states that are better; often related to limitations of the successor function
heuristic function
an estimate of the distance (estimated path cost) remaining from n to the closest goal generally h(n) >= 0, and h(n) == 0 for goals
common strategy of heuristics
approximate how many steps it would take to solve if we relaxed the constraints
decision at root node
argmax{minimax(s') for s'∈succ(s)} i.e. choose the action that leads to the successor with the highest score, which has the highest expected payoff
α/β-pruning
at each node, keep track of 2 additional values, α and β (along with the minimax value); these represent the lower and upper bound on what minimax(s) could eventually be
initially, set [α,β] = [-∞,+∞]
as we process children, update these:
- at max nodes, update α: α = max{α, minimax(ch)} for each ch∈succ(s)
- at min nodes, update β: β = min{β, minimax(ch)} for each ch∈succ(s)
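A sketch of α/β-pruning over an abstract game tree; the successors/utility callbacks are hypothetical stand-ins for a real game:

```python
import math

def alphabeta(state, is_max, alpha=-math.inf, beta=math.inf,
              successors=None, utility=None):
    """Minimax with alpha-beta pruning (sketch).

    successors(state) -> list of child states ([] at terminals)
    utility(state)    -> payoff at terminal states
    """
    children = successors(state)
    if not children:
        return utility(state)
    if is_max:
        v = -math.inf
        for ch in children:
            v = max(v, alphabeta(ch, False, alpha, beta, successors, utility))
            alpha = max(alpha, v)
            if alpha >= beta:       # intervals no longer overlap
                break               # prune the remaining children
        return v
    else:
        v = math.inf
        for ch in children:
            v = min(v, alphabeta(ch, True, alpha, beta, successors, utility))
            beta = min(beta, v)
            if alpha >= beta:
                break
        return v
```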
Simultaneous Games
both agents act at same time, choosing from discrete action space usually characterized by a payoff matrix
b
branching factor average number of successors for each state
depth-limit while searching a game tree
need a board-evaluation function to assign scores to internal nodes: estimates the probability of winning or the expected payoff from each state (heuristically)
choose the depth limit based on time available (and CPU speed), expressed as a number of ply (moves or levels)
A* time complexity
depends on the accuracy of the heuristic
boundary case 1: h(n) = 0, behaves like uniform cost
boundary case 2: h(n) perfectly predicts the true remaining distance (finds the goal in time linear in the path length)
if the inaccuracy of the heuristic is bounded, search will be sub-exponential
Iterative deepening
depth-limited search
- do DFS down to depth=1
- if goal not found, do DFS down to depth=2
- ...
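The loop above can be sketched as follows (function names and the max_depth cap are hypothetical):

```python
def depth_limited_dfs(state, goal, successors, limit):
    """DFS that gives up below the depth limit (sketch)."""
    if state == goal:
        return True
    if limit == 0:
        return False
    return any(depth_limited_dfs(ch, goal, successors, limit - 1)
               for ch in successors(state))

def iterative_deepening(start, goal, successors, max_depth=50):
    """Run depth-limited DFS with limit 0, 1, 2, ... until the goal is found."""
    for limit in range(max_depth + 1):
        if depth_limited_dfs(start, goal, successors, limit):
            return limit          # depth at which the goal was found
    return None
```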
optimality
does ALGO guarantee to find the goal node with the minimum path cost?
Greedy search
extends the tree-search algorithm to use a heuristic: use a priority queue for the frontier; sort nodes based on h(n)
Constraint satisfaction
finding a configuration of the world that satisfies some requirements (constraints) which restrict the possible solutions
AC-3
formalization of constraint propagation as a graph algorithm
let (V,E) be the constraint graph; define arc consistency
ensure the initial graph is arc consistent
after making a choice for an initial var, it might rule out some choices in domains of neighbors, so must check that its neighbors are arc consistent
put edges to be checked in a queue
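A sketch of AC-3 in Python; representing constraints as one predicate per directed arc is an assumption for illustration, not the course's exact formulation:

```python
from collections import deque

def ac3(domains, constraints):
    """AC-3 constraint propagation (sketch).

    domains:     {var: set(values)}
    constraints: {(X, Y): predicate(a, b)} for each directed arc X -> Y
    Returns False if some domain is wiped out, else True.
    """
    queue = deque(constraints)              # all arcs to be checked
    while queue:
        X, Y = queue.popleft()
        pred = constraints[(X, Y)]
        # values of X with no consistent partner b in dom(Y)
        removed = {a for a in domains[X]
                   if not any(pred(a, b) for b in domains[Y])}
        if removed:
            domains[X] -= removed
            if not domains[X]:
                return False                # empty domain: inconsistent
            for (V, W) in constraints:      # re-check arcs pointing at X
                if W == X:
                    queue.append((V, W))
    return True
```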
Uniform cost algorithm
frontier is a priority queue; finds the least-cost path when operators have different costs
completeness
if a goal exists, does ALGO guarantee to find it?
Lamarckian evolution
improvements/ adaptation acquired during lifetime of individual can be passed on to offspring
Recombination or 'cross-over'
instead of an operator to generate successors from states, use recombination to combine parts of existing members of the population
by selecting parents at random and recombining them, you sometimes get the best of both and produce an improved state
for chromosomes, splice their strings at a random location
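A minimal sketch of single-point cross-over on bit-string chromosomes (the fixed point parameter is an assumption added for testability; by default the splice location is random, as in the notes):

```python
import random

def crossover(parent1, parent2, point=None):
    """Single-point recombination of two equal-length bit strings (sketch)."""
    assert len(parent1) == len(parent2)
    if point is None:
        point = random.randrange(1, len(parent1))  # random splice location
    child1 = parent1[:point] + parent2[point:]
    child2 = parent2[:point] + parent1[point:]
    return child1, child2
```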
Monte Carlo Tree Search
instead of exploring search tree, sample random paths (rollouts) all the way to terminal states the value of a state is taken as the statistical average outcome of trajectories passing through it (back-propagate outcomes) also keep track of n (# trial trajectories passing through each node) and variance (σ²) at each state to assess certainty - selection policy (which state to start simulation from) - playout policy (approximation strategy to simulate reasonable moves)
Iterative improvement search
local search maximize "quality" of states, q(s) or value(s) different from path cost
Min-Conflicts Algorithm
local search for CSPs
start by choosing a random variable assignment (which probably violates lots of constraints)
pick a variable at random and change its value to something that causes fewer conflicts
repeat until it "plateaus" (the number of conflicts stops decreasing)
note: this is NOT guaranteed to find a complete and consistent solution! but it works surprisingly well in practice
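A sketch of min-conflicts; representing constraints as predicates over a complete assignment dict, and the max_steps/seed parameters, are assumptions for illustration:

```python
import random

def conflicts(assignment, var, value, constraints):
    """Count constraints violated if var were set to value (sketch)."""
    trial = dict(assignment, **{var: value})
    return sum(not ok(trial) for ok in constraints)

def min_conflicts(variables, domains, constraints, max_steps=1000, seed=0):
    """Min-conflicts local search for CSPs (sketch).
    Not guaranteed to find a consistent solution, per the notes."""
    rng = random.Random(seed)
    # random (probably inconsistent) complete assignment
    assignment = {v: rng.choice(domains[v]) for v in variables}
    for _ in range(max_steps):
        conflicted = [v for v in variables
                      if conflicts(assignment, v, assignment[v], constraints) > 0]
        if not conflicted:
            return assignment                # consistent solution found
        var = rng.choice(conflicted)
        # move var to the value with the fewest conflicts
        assignment[var] = min(domains[var],
                              key=lambda val: conflicts(assignment, var, val,
                                                        constraints))
    return assignment                        # may still contain conflicts
```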
Hill climbing
maintain only a single current state; generate successors using an operator, pick the best
8 queens operator: move any queen to another row in the same column
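A sketch of the basic loop, with hypothetical successors/q callbacks standing in for a real problem:

```python
def hill_climb(start, successors, q):
    """Basic hill climbing (sketch): keep only the current state,
    move to the best successor, stop when no successor improves q."""
    current = start
    while True:
        children = successors(current)
        if not children:
            return current
        best = max(children, key=q)
        if q(best) <= q(current):
            return current        # local maximum (or plateau): stop
        current = best
```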
Why use iterative deepening
maintains linear frontier size like DFS while searching level-by-level like BFS
GAs mutation
make random changes to state (like operator) at low frequency
space-complexity
maximum size to which the frontier grows
Sequential games
multiple steps - players take turns each player has a utility function +1 for win; -1 for loss; 0 for draw (tic-tac-toe)
Board evaluation functions
must guess the value of each state typically based on features
time-complexity
number of nodes goal-tested (# of loop iterations)
GAs optimization
power comes from competition survival of the fittest
8 queens problem: iterative improvement search
q(s) = -(number of pairs of queens that can attack each other) higher the better
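This quality function can be sketched directly, assuming the usual one-queen-per-column state encoding (state[i] = row of the queen in column i):

```python
from itertools import combinations

def q(state):
    """q(s) for n-queens: -(number of pairs of queens that can attack).
    state[i] = row of the queen in column i; higher (closer to 0) is better."""
    attacks = 0
    for (c1, r1), (c2, r2) in combinations(enumerate(state), 2):
        # same row, or same diagonal (equal row and column offsets)
        if r1 == r2 or abs(r1 - r2) == abs(c1 - c2):
            attacks += 1
    return -attacks
```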
domain knowledge
refers to anything we know about solving these types of problems
GAs and chromosomes
represent state as a bit string
Problem with applying Minimax to most games
search space is too large
AlphaGo
self-play deep neural network
what limits AI search
size of the frontier
A* algorithm
sort nodes in the frontier based on f(n) = g(n) + h(n), where
g(n) = path cost so far
h(n) = heuristic estimate of remaining cost
f(n) = estimate of total path cost
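A sketch of A* using a heap-based priority queue; the (child, step_cost) successor representation is an assumption for illustration:

```python
import heapq

def astar(start, goal, successors, h):
    """A* search (sketch): frontier sorted by f(n) = g(n) + h(n).
    successors(state) -> [(child, step_cost), ...]
    Returns the cheapest path cost to goal, or None."""
    frontier = [(h(start), 0, start)]        # entries are (f, g, state)
    best_g = {start: 0}
    while frontier:
        f, g, state = heapq.heappop(frontier)
        if state == goal:
            return g
        if g > best_g.get(state, float('inf')):
            continue                         # stale frontier entry
        for child, cost in successors(state):
            g2 = g + cost
            if g2 < best_g.get(child, float('inf')):
                best_g[child] = g2
                heapq.heappush(frontier, (g2 + h(child), g2, child))
    return None
```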
Iterative deepening: time and space complexity complete & optimal?
space: O(bd) time: O(b^(d+1)) complete & optimal
Expectiminimax
stochastic games: games with an element of chance
interleave min and max nodes with a level of chance nodes
at chance nodes, the score is the weighted sum over the children, weighted by probability, i.e. the expected outcome
Simulated Annealing
stochastic search choose next child randomly, but "bias it upward" always accept better states, and accept worse states probabilistically, proportional to how much lower the quality is
k-consistency
the concept of arc-consistency can be generalized to path-consistency (mutually consistent choice for 3 variables related by constraints), and to k-consistency (sequences of k nodes) in the limit: n-consistency (for n vars) means every node has at least 1 choice consistent with every other node
Why are games useful to AI
they represent adversarial environments (e.g., DeepBlue, AlphaGo)
minimax(s)
ui(s), if s is a terminal state
max{minimax(s') for s'∈succ(s)}, if s is a max node
min{minimax(s') for s'∈succ(s)}, if s is a min node
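The three-case definition above translates almost directly into code; the successors/utility callbacks are hypothetical stand-ins for a real game:

```python
def minimax(state, is_max, successors, utility):
    """Minimax value of a state (sketch).
    successors(state) -> children ([] at terminals); utility(state) -> u_i(s)."""
    children = successors(state)
    if not children:
        return utility(state)            # terminal state: u_i(s)
    values = [minimax(ch, not is_max, successors, utility) for ch in children]
    return max(values) if is_max else min(values)
```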
Constraint satisfaction formal framework
variables: {Vi}
domains: dom(Vi) = {a1...an}, a finite set of possible values for each variable
constraints: different for each problem
solution: a complete variable assignment that satisfies all constraints
Forward-checking
very similar to MRV
MRV is passive; FC is active: every time you choose a value for a var, you remove inconsistent values from the domains of other vars (like propagation)
how can CSPs be NP-complete if AC-3 runs in polynomial time, O(cd³)?
we might have to call it an exponential number of times from MAC before we find a complete and consistent solution
plateau effect
when all neighbors have same score and you "lose the gradient", even if not at top of hill
pruning condition
when the [α,β] interval of a node and that of its parent no longer overlap
Constraint propagation
whenever we make a choice at one node in the constraint graph, propagate the consequences to neighboring nodes
Uniform cost algorithm: time and space complexity optimal & complete?
time: O(b^(1+C*/ε)), space: O(b^(1+C*/ε)), where C* is the total path cost of the cheapest solution and ε is the minimum cost of each step
optimal and complete
Temperature schedules
• a critical part of SA is to start with a high temperature and gradually lower it
• this allows the search to sample many local maxima initially, but over time it becomes more selective and climbs up the best hill it can find
