Artificial Intelligence Norvig Ch 4

AND nodes

AND nodes - in a nondeterministic environment, branching is determined by how the environment reacts to the agent's action. If the agent chooses action #1, it must also have a plan for reacting to new states #2 and #3, either of which could occur in response to #1. The nondeterministic environment leads to an AND-OR tree. A solution for an AND-OR search problem is a subtree that (1) has a goal node at every leaf, (2) specifies one action at each of its OR nodes, and (3) includes every outcome branch at each of its AND nodes. Note - you can use IF-THEN-ELSE, but if there are more than two branches at a node it might be better to use a CASE construct.
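
A minimal Python sketch of AND-OR search, loosely following the AIMA pseudocode; the problem interface (initial, actions, results, goal_test) is an assumed one, not taken from the text above:

# results(state, action) is assumed to return the SET of possible outcome states.
def and_or_search(problem):
    return or_search(problem.initial, problem, [])

def or_search(state, problem, path):
    # OR node: the agent picks one action; succeed if any action has a plan.
    if problem.goal_test(state):
        return []                       # empty plan: already at a goal
    if state in path:
        return None                     # cycle on this path: fail
    for action in problem.actions(state):
        plan = and_search(problem.results(state, action), problem, [state] + path)
        if plan is not None:
            return [action, plan]
    return None

def and_search(states, problem, path):
    # AND node: the environment picks the outcome; every outcome needs a subplan.
    plans = {}
    for s in states:
        plan = or_search(s, problem, path)
        if plan is None:
            return None
        plans[s] = plan                 # conditional plan: "if in s, follow plans[s]"
    return plans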

In the belief state search problem, what are actions?

Actions - if the agent is in belief state b = {s1, s2} but Actions_P(s1) ≠ Actions_P(s2), then the agent is unsure which actions are legal. If we assume that illegal actions have no effect on the environment, it is safe to take the union of all the actions in any of the physical states in the current belief state b. If, on the other hand, an illegal action might be the "end of the world," it is safer to allow only the intersection, that is, the set of actions legal in all the states.
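
A small sketch of the two policies; actions_p(s) is a hypothetical helper returning the set of legal actions in physical state s:

def belief_actions(belief_state, actions_p, safe=False):
    sets = [set(actions_p(s)) for s in belief_state]
    if safe:
        # Intersection: only actions legal in every physical state.
        return set.intersection(*sets)
    # Union: safe only if illegal actions have no effect on the environment.
    return set.union(*sets)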

In the belief state search problem, what is the belief state?

Belief states - the entire belief state space contains every possible set of physical states. If the physical problem P has N states, then the sensorless problem has up to 2^N belief states, although many may be unreachable from the initial state.

Belief states

Belief state - represents the agent's current belief about the possible physical states it might be in, given the sequence of actions and percepts up to that point.

Competitive ratio

Competitive ratio - the ratio of the total path cost found by the online agent to the path cost the agent would have followed if it had known the search space in advance.

Complete search function

Complete search function - will always find a goal if one exists.

Contingency plan (aka strategy)

Contingency plan (aka strategy) - in partially observable or nondeterministic (or both) environments, the agent must rely on percepts to narrow down the set of states it might be in and to tell it which of the possible results of its actions has actually occurred. In both cases, future percepts cannot be determined in advance, and the agent's future actions will depend on those future percepts. So the solution to the problem is not a sequence but a contingency plan, aka a strategy.

Cyclic solution

Cyclic solution - one in which the agent keeps trying the same method until it succeeds. The agent will eventually reach its goal provided that each outcome of a nondeterministic action eventually occurs.

Dead end

Dead end - a state from which no goal state is reachable.

Discretize

Discretize - one way to avoid the issues created by searching through continuous variables is to discretize the continuous space, i.e. turn it into a "checkerboard" of tiny squares.
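
A minimal sketch of discretization: snapping a continuous 2-D point onto a grid of cells of size delta (the function name and cell size are illustrative assumptions):

def discretize(point, delta=0.1):
    # Map continuous coordinates to integer cell coordinates.
    x, y = point
    return (round(x / delta), round(y / delta))

# e.g. discretize((3.14, 2.71)) -> (31, 27)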

Empirical Gradient

Empirical gradient - in continuous search spaces, when the gradient of the objective function cannot be computed analytically, the agent can estimate it empirically by evaluating the response to small increments and decrements in each coordinate. Empirical gradient search amounts to steepest-ascent hill climbing in a discretized version of the state space.
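
A sketch of estimating the gradient by central finite differences; f is assumed to be an objective function over a list of coordinates (illustrative only):

def empirical_gradient(f, x, eps=1e-4):
    grad = []
    for i in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[i] += eps
        x_minus[i] -= eps
        # Central-difference approximation to the i-th partial derivative.
        grad.append((f(x_plus) - f(x_minus)) / (2 * eps))
    return grad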

Exploration problem

Exploration problem - where the agent must use its actions as experiments in order to learn enough to make deliberation worthwhile, e.g. in unknown environments. Maze problems are often of this type.

Genetic algorithm

Genetic algorithm (GA) - a variant of stochastic beam search in which successor states are generated by combining two parent states rather than by modifying a single state. Like beam search, a GA begins with a set of k randomly generated states, called the population. Each state, or individual, is represented as a string over a finite alphabet. The initial population is ranked by the fitness function, and the most fit individuals are mated, producing offspring that are subject to mutation.
• Fitness function - rates each individual; the best individuals are selected for mating
• Mating (crossover) - a crossover point in the string is chosen at random, and the first x characters from one parent are combined with the remaining characters from the other parent to create the offspring
• Mutation - each location in the string is changed at random with a small independent probability
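
A compact sketch of a genetic algorithm over bit strings; the fitness function is user-supplied, and the parameter values and helper names are illustrative assumptions:

import random

def genetic_algorithm(fitness, k=20, length=16, generations=100, p_mut=0.05):
    population = [[random.randint(0, 1) for _ in range(length)] for _ in range(k)]
    for _ in range(generations):
        weights = [fitness(ind) for ind in population]
        new_population = []
        for _ in range(k):
            # Select two parents with probability proportional to fitness.
            p1, p2 = random.choices(population, weights=weights, k=2)
            c = random.randrange(1, length)          # crossover point
            child = p1[:c] + p2[c:]
            # Mutate each position with a small independent probability.
            child = [b ^ 1 if random.random() < p_mut else b for b in child]
            new_population.append(child)
        population = new_population
    return max(population, key=fitness)

# e.g. genetic_algorithm(sum) evolves bit strings that are mostly 1s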

In the belief state search problem, what is the goal test?

Goal test - the agent wants a plan that is sure to work, which means that a belief state satisfies the goal only if all the physical states in it satisfy Goal-Test_P. The agent may accidentally achieve the goal earlier, but won't know that it has done so.

What is a hill climbing algorithm? When does it fail?

Hill climbing (steepest-ascent version) - a loop that continually moves in the direction of increasing value (i.e. uphill). It does not maintain a search tree, and it does not look beyond the immediate neighbors of the current state. Hill climbing is often called "greedy local search" because it grabs a good neighbor state without thinking ahead about where to go next. It often performs quite well, because it is usually easy to improve a bad position. Hill climbing often gets stuck because of:
• Local maxima - a peak that is higher than each of its neighboring states but lower than the global maximum
• Ridges - a sequence of local maxima that are not directly connected to each other
• Plateaus - a flat area of the state space landscape; it can be a flat local maximum, from which no uphill exit exists, or a shoulder, from which progress is possible
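
A minimal sketch of steepest-ascent hill climbing; neighbors and value are assumed problem-specific functions:

def hill_climbing(state, neighbors, value):
    while True:
        best = max(neighbors(state), key=value, default=None)
        if best is None or value(best) <= value(state):
            return state      # no uphill neighbor: local maximum (or plateau)
        state = best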

What do we know about the relation between a belief state and its subset belief states?

If an action sequence is a solution for a belief state b, it is also a solution for any subset of b. This is very useful for pruning in sensorless problem solving.

What are the various names for maintaining one's belief state in a partially observable environment?

In partially observable environments maintaining one's belief state is a core function of any intelligent system. This function goes under various names, including monitoring, filtering, and state estimation.

In the belief state search problem, what is the initial state?

Initial state - typically the set of all states in P, although in some cases the agent will have more knowledge than this.

Interleaving

Interleaving - rather than having a guaranteed plan, the agent can deal with contingencies as they arise during execution. This is called "interleaving" and is useful for exploration problems and game playing.

Irreversible actions

Irreversible actions - lead to a state from which no action leads back to the previous state.

Learning Real Time A* (LRTA*)

Learning Real-Time A* (LRTA*) - augments hill climbing with memory, which is more effective than a random walk. The basic idea is to store the "current best estimate" H(s) of the cost to reach the goal from each state that has been visited. H(s) starts out as just the heuristic estimate h(s) and is updated as the agent gains experience. The estimated cost to reach the goal through a neighbor s' is the cost to get to s' plus the estimated cost to get to a goal from there: c(s, a, s') + H(s'). LRTA* builds a map of the environment in a Result table. It updates the cost estimate of the state it has just left and then chooses the "apparently best" move according to its current cost estimates. Actions that have not yet been tried in a state s are always assumed to lead immediately to the goal with the least possible cost, h(s). This "optimism under uncertainty" encourages the agent to explore new, possibly promising, paths.
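
A sketch of the LRTA* agent, loosely following the AIMA pseudocode; actions, h (heuristic), c (step cost), and goal_test are assumed to be supplied by the problem, while result and H persist between calls:

result, H = {}, {}                     # map of the environment; cost estimates
prev_state, prev_action = None, None

def lrta_cost(s, a, s2, h, c):
    # Untried actions are optimistically assumed to reach the goal for h(s).
    if s2 is None:
        return h(s)
    return c(s, a, s2) + H[s2]

def lrta_agent(s, actions, h, c, goal_test):
    global prev_state, prev_action
    if goal_test(s):
        return None                    # stop
    if s not in H:
        H[s] = h(s)
    if prev_state is not None:
        result[(prev_state, prev_action)] = s
        # Update the cost estimate for the state we just left.
        H[prev_state] = min(lrta_cost(prev_state, b, result.get((prev_state, b)), h, c)
                            for b in actions(prev_state))
    # Choose the "apparently best" action from the current state.
    a = min(actions(s), key=lambda b: lrta_cost(s, b, result.get((s, b)), h, c))
    prev_state, prev_action = s, a
    return a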

Local beam search

Local beam search - keeps track of k states rather than the single state of local search. It begins with k randomly generated states. At each step, all the successors of all k states are generated. If any one is a goal, the algorithm halts; otherwise it selects the k best successors from the complete list and repeats. A problem can occur when the search lacks diversity, with all k searches becoming concentrated in a small region. A variant called stochastic beam search can alleviate this.
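
A minimal sketch of local beam search with beam width k; successors, value, and is_goal are assumed problem-specific functions:

def local_beam_search(initial_states, successors, value, is_goal, k=10):
    states = list(initial_states)
    while True:
        candidates = [s2 for s in states for s2 in successors(s)]
        for s in candidates:
            if is_goal(s):
                return s
        if not candidates:
            return max(states, key=value)   # nowhere left to go
        # Keep the k best successors from the pooled list.
        states = sorted(candidates, key=value, reverse=True)[:k]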

What is a local search algorithm, and when would you use it instead of a systematic search algorithm?

Local search algorithms operate using a single current node (rather than multiple paths) and generally move only to neighbors of that node. Typically the paths followed are not retained. Though not systematic, local search algorithms use little memory and can often find reasonable solutions in large or infinite (continuous) state spaces for which systematic algorithms are unsuitable. They are also useful for solving pure optimization problems, in which the aim is to find the best state according to an objective function.

Local search methods suffer from local maxima, ridges, and plateaus in continuous spaces just as they do in discrete spaces. What can be done?

Local search methods suffer from local maxima, ridges, and plateaus in continuous spaces just as they do in discrete spaces. Random restarts and simulated annealing can be used to help.

Nondeterministic sensing uses a percept function that returns what?

Nondeterministic sensing - if sensing is nondeterministic, then we use a percept function that returns a set of possible percepts. Fully observable problems have Percept(s) = s for every state s, while sensorless problems have Percept(s) = null.

What is the "And Or" search model? When is it used?

Now that we know how to derive the Results function for a nondeterministic belief state problem, we can apply the AND-OR search algorithm to derive a solution. The solution is a conditional plan rather than a sequence. If the first step is an if-then-else expression, the agent tests the condition in the if-part and executes the then-part or the else-part accordingly.

OR nodes

OR nodes - in a deterministic environment there is only one kind of node: the OR node. You can do X OR you can do Y.

Offline search agent

Offline search agents - compute a complete solution before going into the real world to execute it.

What is the difference between online and offline agents in regard to successor states? What difficulties does this create?

Online agents can discover successors only for the node they physically occupy, whereas offline agents can compute successors for any node. A difficulty for an online agent arises when it has explored all the actions in a state and must physically backtrack. After each action, an online agent receives a percept telling it what state it has reached; from this information it augments its map of the environment. Online agents save the results of their actions in a result table: Result[s, a].

Online search agents - what kind of environments are they good for? What is the trade-off between online and offline search agents?

Online search agent - interleaves computation and action: first it takes an action, then it observes the environment and computes the next action. Online search is good for dynamic environments where there is a penalty for sitting around and computing too long. It is also useful in nondeterministic environments, because it allows the agent to focus on the contingencies that actually arise rather than those that might happen but probably won't. There is a trade-off: the more the agent plans ahead, the less likely it is to run into problems far in the future. Online search is necessary for unknown environments, where the agent does not know what states exist or what its actions do. Note - the term "online" is often used in computer science for algorithms that must process input data as they are received, rather than waiting for the entire data set to become available.

Optimal algorithm

Optimal algorithm - always finds a global minimum (or maximum).

In the belief state search problem, what is the path cost?

Path cost - this can be tricky: if the same action can have different costs in different states, then the cost of taking an action in a given belief state could be one of several values.

Random walk search

Random walk search - the agent selects a random available action from the current state. It will eventually find a goal, or complete its exploration, provided that the space is finite. Random walks are complete on 1-D and 2-D grids. On a 3-D grid, the probability that a random walk ever returns to its starting point is only about 0.3405. There are also real-world spaces whose topology causes "traps" for random walks.
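
A minimal sketch of a random-walk explorer; actions, result, and is_goal are assumed problem-specific functions, and the step bound is an illustrative cutoff:

import random

def random_walk(state, actions, result, is_goal, max_steps=100000):
    for _ in range(max_steps):
        if is_goal(state):
            return state
        # Select a random available action from the current state.
        state = result(state, random.choice(actions(state)))
    return None   # gave up; completeness holds only in the limit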

Recursive state estimator

Recursive state estimator - computes the new belief state from the previous one rather than by examining the entire percept sequence. The agent's belief state is updated by computing the conditional probability distribution over all the states given the sequence of observations and actions so far.

Safely explorable

Safely explorable - some goal state is reachable from every reachable state.

In genetic algorithms, what is a schema?

Schema - a substring in which some of the positions can be left unspecified, e.g. 159***. Strings that match the schema (e.g. 159123) are called instances of the schema. It can be shown that if the average fitness of the instances of the schema is above the mean, then the number of instances of the schema within the population will grow over time. GA works best when schemata correspond to meaningful components of a solution (e.g. parts of an antenna such as reflectors and deflectors). A good component is likely to be good in a variety of designs. A successful genetic algorithm requires careful engineering of the representation.

Sensorless (aka conformant) problem

Sensorless (aka conformant) problem - when the agent's percepts provide no information at all.

Sensorless problem solving can be difficult because the size of each belief state is often enormous. What are two potential solutions?

Sensorless problem solving is often difficult because the size of each belief state can be enormous. One solution is to represent the belief state by some more compact description. Another approach is to avoid standard search algorithms, which treat belief states as black boxes just like any other problem state. Instead, we can look inside the belief states and develop incremental belief state algorithms that build up the solution one physical state at a time. For example, find a solution that works for state 1, check whether it works for state 2; if not, go back and find a different solution for state 1.

Simulated annealing

Simulated annealing attempts to keep the hill-climbing algorithm from getting stuck in a local maximum. The idea is to "knock" the algorithm out of the local maximum. Instead of picking the best move, as hill climbing does, simulated annealing chooses a random move. If the move improves the situation, it is always accepted; otherwise the algorithm accepts the move with some probability less than one. The probability decreases exponentially with the badness of the move (the amount by which the evaluation is worsened), and it also decreases as the "temperature" T goes down. Bad moves are more likely to be allowed at the start, when T is high, and become less likely as T decreases.
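
A sketch of simulated annealing, loosely following the AIMA pseudocode; neighbors, value, and the cooling schedule are illustrative assumptions:

import math, random

def simulated_annealing(state, neighbors, value, schedule, max_steps=100000):
    for t in range(1, max_steps + 1):
        T = schedule(t)
        if T <= 0:
            return state
        nxt = random.choice(neighbors(state))   # a random move, not the best one
        delta = value(nxt) - value(state)
        # Always accept improvements; accept bad moves with probability
        # e^(delta/T), which shrinks as the move gets worse or as T cools.
        if delta > 0 or random.random() < math.exp(delta / T):
            state = nxt
    return state

# e.g. schedule = lambda t: 100 * 0.95 ** t   (exponential cooling)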

Solutions for nondeterministic problems are not sequences; what are they?

Solutions for nondeterministic problems can contain nested if-then-else statements; they are trees, not sequences. As in deterministic environments, in nondeterministic environments we build search trees to find the goal.

What are the two parts to a state space landscape?

State space landscape - has both "location" (defined by the state) and "elevation" (defined by the value of the heuristic cost function or objective function).

Sometimes we need a path to a goal, but sometimes we need something less. Sometimes we only need __________?

The searches discussed in Ch. 3 gave us systematic ways to find a path to a goal. Sometimes we do not need the path (the series of actions that gets to the goal); all we want is the goal state itself.

There are a number of variants of hill climbing: stochastic hill climbing, first-choice hill climbing, and random-restart hill climbing.

There are a number of variants of hill climbing:
• Stochastic hill climbing - chooses at random from among the uphill moves; the probability of selection can vary with the steepness of the uphill move
• First-choice hill climbing - implements stochastic hill climbing by generating successors randomly until one is generated that is better than the current state
• Random-restart hill climbing - conducts a series of hill-climbing searches from randomly generated initial states (see the sketch below)
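
A sketch of random-restart hill climbing, reusing the hill_climbing sketch given earlier; random_state is an assumed generator of random initial states:

def random_restart_hill_climbing(random_state, neighbors, value, restarts=25):
    # Run hill climbing from several random starts and keep the best result.
    best = None
    for _ in range(restarts):
        candidate = hill_climbing(random_state(), neighbors, value)  # defined above
        if best is None or value(candidate) > value(best):
            best = candidate
    return best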

How do we solve the sensorless problem?

To solve sensorless problems we search in the space of belief states rather than physical states. Note that in the belief state space, the problem is fully observable because the agent always knows its own belief state. Furthermore the solution (if any) is always a sequence of actions. This is because the percepts received after each action are completely predictable - they are always empty. So, there are no contingencies to plan for. This is true even if the environment is nondeterministic.
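
A small sketch of the belief state formulation, assuming a deterministic physical result function result_p and a physical goal test goal_test_p (hypothetical names); for nondeterministic actions you would union the outcome sets instead:

def belief_result(belief, action, result_p):
    # The new belief state is the set of outcomes over all physical states.
    return frozenset(result_p(s, action) for s in belief)

def belief_goal_test(belief, goal_test_p):
    # The goal is satisfied only if every physical state in b satisfies it.
    return all(goal_test_p(s) for s in belief)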

How do you solve the belief state search problem?

To solve the belief state search problem, we define the problem in terms of the components of the underlying physical problem P: Actions_P, Result_P, Goal-Test_P, and Step-Cost_P.

Constrained optimization

Constrained optimization - an optimization problem in which solutions must satisfy some hard constraints on the values of the variables (e.g. x must be less than 100).

