EECS 492

defining a problem

(1) initial state (2) description of the possible actions (3) transition model (4) goal test (5) path cost

Reflexive

- Actions are triggered based on preprogrammed conditions.

Simple-reflexive agents

- Agents select actions based on the current percept. - Moves away from the tabulated state of table-driven agents. - Think of if-then statements: if x then do this, else if y then do that, else if z then do something else.

types of state

- Atomic -Factored -Structured

problems with DFS

- DFS is good in that it saves space. However, it only saves space when it uses tree search, and tree search (no reached set) means that DFS can loop forever.

Definition of a game

- S0: the initial state
- PLAYERS(s): defines which player has the move in state s
- ACTIONS(s): returns the set of legal moves in state s
- RESULT(s,a): transition model; defines the result of a move
- TERMINAL-TEST(s): returns true when the game is over (in a terminal state), false otherwise
- UTILITY(s,p): utility function (how good/bad a terminal state is; the only states that have a utility value are the terminal states)
  . defines a numeric value for a game that ends in terminal state s for player p
  . zero-sum game: the total payoff to the players is the same for every instance of the game
  . because only end states have a utility value, the heuristic (evaluation) function is so important: it allows us to see how good a state is without actually needing to reach the end of the game
- (Note: we can't use local search, because local search gives us a configuration, not a move. We already know what winning looks like; we need to know what to actually do.)

Fixed model

- The designer provides a model.

Learning model

- The model can adapt automatically to the observed conditions.

Model-based reflexive agents

- Uses a model to estimate state from percepts.
- Keeps track of the part of the world it can't see right now.
- Unlike simple reflex agents, it can react to changes in the world because it can use past world states.
- Still bound by condition-action rules, so it can't learn from the world around it.
- The model and rules it follows are fixed and do not change. If, for example, you want to plug in a new destination, this requires a completely new set of rules.
- Used in partially observable worlds.
- Its limitations:
  . it is difficult to determine the exact current state of a partially observable environment.
  . "what the world looks like now" is only a best guess.

Two-player zero-sum game

- a special category of multiagent environment
- only one other agent (the adversary)
- fully observable
- deterministic
- characteristics of the adversary:
  . its objectives are exactly opposed to the primary agent's
  . zero-sum game (payoffs aren't changing)
  . agents move in alternating turns

Limitations of a simple-reflex agent

- action decisions are made based on the current percept: . this only works if the world is fully observable.

Table driven agents

- agents choose actions based on the state of the world. - actions and states are linked in a table. - explicit structure. - this quickly becomes intractable!

utility function

- an internalization of the agent's performance measure.

Tree Search

- assumption: the state space is a tree
- the frontier is a queue of nodes (partial paths)

FUNCTION TREE-SEARCH(problem) RETURNS a solution or failure
  initialize the frontier with the initial state of problem
  LOOP DO
    IF the frontier is empty THEN RETURN failure
    choose a leaf node and remove it from the frontier
    IF the node contains a goal state THEN RETURN the solution
    expand the chosen node, add the resulting nodes to the frontier

- a tree-like algorithm does not check for redundant paths

Breadth First Search

- best used when step costs are equal.
- what it means for search:
  . expand the root node
  . expand each child of the root node
  . expand each child of each child of the root node
- reached stores states, not nodes: each new layer is already more expensive than the layer above it, and since you always discover the shallower states first, the first time you find a state is also the cheapest, so there is no need to keep path costs.
- the frontier is a FIFO queue
- for BFS we assume that each action has a cost of one.
- does goal-state checking when generating child nodes during expansion, not when popping them off the frontier (early goal test).
- a minimal sketch follows below.
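
A minimal Python sketch of the BFS described above (not from the lecture), assuming a hypothetical problem object with initial, is_goal(s), actions(s), and result(s, a) members and hashable states; it tracks states only, so it returns the goal state rather than the full path:

    from collections import deque

    def breadth_first_search(problem):
        # BFS with an early goal test; reached stores states, not nodes.
        start = problem.initial
        if problem.is_goal(start):
            return start
        frontier = deque([start])          # FIFO queue
        reached = {start}
        while frontier:
            state = frontier.popleft()
            for action in problem.actions(state):
                child = problem.result(state, action)
                if problem.is_goal(child): # early goal test: check when generating, not when popping
                    return child
                if child not in reached:
                    reached.add(child)
                    frontier.append(child)
        return None                        # failure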

Graph Properties

- branching factor (b) - depth (d) - maximum path length (m) - these properties help us to determine space and time complexity, and the feasibility of running at all.

game tree

- composed of:
  . the ACTIONS function
  . the RESULT function
- the state space is made up of the initial state, the ACTIONS function, and the RESULT function. A search tree can be superimposed over part of the graph to determine what move to make. The game tree is a search tree that follows every sequence of moves all the way to a terminal state.
- (Note: we can't use local search, because local search gives us a configuration, not a move. We already know what winning looks like; we need to know what to actually do.)

cost optimality

- does the strategy find the optimal (lowest-cost) solution? - defined in terms of path cost.

local search

- don't care about paths. (still looking at the environment as observable, deterministic, and known; however, the solution is not a sequence of actions)
- overview:
  . evaluate and modify the current state(s)
  . no longer exploring paths from the initial state
- relevance:
  . sometimes we care only about the solution state
  . we may not care about the path cost
  . local search is good when you want a solution but don't care how you got there. There may be multiple solutions to a problem, but that does not matter: as long as you find a state that matches your criteria, that is all that matters.
- basics:
  . maintains only the current node (not multiple paths)
  . generally, moves only to a neighbor
  . paths are not retained
  . no real notion of path cost or goal test! (you don't know what the goal looks like, but you do know the properties of that state, so goal testing is not needed.)
- advantages:
  . very low memory cost
  . can often solve large continuous problems
  . can solve pure optimization problems: find the best state via an objective function (e.g., objective function as reproductive fitness)

Depth First Search

- expand the deepest node in the frontier
- as nodes are expanded they are dropped from the frontier (leading to better space complexity)
- the search backs up to the next deepest node
- the frontier is a LIFO (last-in-first-out) queue, so the most recently generated node is expanded.
- commonly implemented recursively.
- doesn't need reached if tree search, does if graph search; we want to think about it in terms of tree search. Not having a reached set leads to better space, but could mean that we loop.
- a minimal recursive sketch follows below.
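
A minimal recursive Python sketch of tree-search DFS (not from the lecture), assuming the same hypothetical problem interface as the BFS sketch above; with no reached set it may loop forever if the state space has cycles:

    def depth_first_search(problem, state=None):
        # Tree-search DFS: no reached set, so it can loop forever on cyclic state spaces.
        if state is None:
            state = problem.initial
        if problem.is_goal(state):
            return state
        for action in problem.actions(state):      # recursion acts as the LIFO frontier
            child = problem.result(state, action)
            found = depth_first_search(problem, child)
            if found is not None:
                return found
        return None                                # no goal in this subtree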

reflecting on hill climbing

- for hill climbing: - it never goes down - guaranteed to be incomplete - for purely random walk: - complete - incredibly inefficient

depth (d)

- how many levels are in our graph. - how many actions we take to get to our solution.

search strategies

- how to proceed through a tree graph. - the search strategy determines the way we use reached and the frontier!

properties of the minimax alg

- m = maximum depth - b = legal moves at each point - time complexity = O(b^m) (not good) - space complexity: (1) O(bm) if we generate all actions at once, (2) O(m) if we generate one action at a time - performs a complete DFS exploration of the game tree. - the exponential complexity makes it impractical for complex games with large branching factors.

graph search

- more generally, state space may be a graph - explored set holds nodes already visited

trouble with utility-based agents

- must keep track of environment: .perception .representation .reasoning .learning - choosing the correct action often is computationally complex.

How will we tackle two-player zero-sum game?

- new approaches:
  . redefine what an optimal move is.
  . identify an algorithm to find the optimal move.
- use pruning:
  . ignore less relevant portions of the search tree
  . cuts down on time and memory! (pruning minimizes the search space so we can search faster and further)
- heuristic evaluation functions (allow us to talk about how good or how bad an intermediate state is without having to make it all the way down to the termination of the game)
  . approximate the true utility of a state absent complete search (or with imperfect information)

limits of the state space tree

- not all problems have a simple state space: . state space can be infinite (including continuous) . states can repeat . result function can be very complicated (think simulation of the laws of physics) - it is important to allow ACTIONS() and RESULTS() to be generalized functions

problems with hill climbing

- not complete.

random restarts

- repeat hill climbing from randomly chosen initial state. - return best local maximum found. - if left to run forever in a finite space, will be complete and optimal

search effectiveness

- search cost (the time and memory spent by the search itself) - path cost (the cost of the path that is chosen) - total cost (search cost plus path cost)

stochastic hill climbing

- select among positive steps at random. - probability proportional to steepness.

state space of local search

- shoulder - global maximum - local maximum - flat local maximum ("plateau") - current state - these are represented on a graph where the x-axis is the state space and the y-axis is the objective function. The x-axis is the LOCATION (a function of state). The y-axis is the ELEVATION (defined by the value of the heuristic cost function or objective function). (When searching the space: elevation = cost -> find the global minimum; elevation = objective function -> find the global maximum.)

Uninformed Search

- simplest form of search - the only information available to the algorithm is what is provided by the problem definition. - functionality: . generate successors: generating nodes that would follow from being in one state and doing something. . test for goal state vs. not goal state - distinguishing factor: search order! (uninformed strategies differ only in the order in which they expand nodes.) - just because it is the simplest does not mean that it is the worst. It can help to gather information about the world.

Agent programs (types)

- table-driven agents - simple-reflex agents - model-based reflexive agents - goal-based agents - utility-based agents

Predictive

- the effects of potential actions (or action sequences) are predicted. - the most desirable action is performed.

search

- the process of looking for a sequence of actions that reaches the goal. - search algorithm: . input: the problem . output: a solution (if it exists) - agent design: formulate, search, execute (the agent ignores percepts when executing). - search overview: create a tree that has nodes: . root node - the initial state . fringe node - a path in the graph . leaf node - a node without any children (yet)

atomic state

- the world has no internal structure visible to the agent. - example, if the states are cities, then the atomic representation would be just the city itself.

BFS Limitations

- time complexity is a problem - space complexity is the killer. - rule of thumb: exponential-complexity search problems cannot be solved by uninformed methods for any but the smallest instances!

Dijkstra's algorithm/uniform-cost search

- use when step costs are not equal. - expands the node n with the lowest path cost g(n) - the frontier is a priority queue ordered by g - reached stores nodes - a minimal sketch follows below.
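
A Python sketch of uniform-cost search (an illustration, not the course's reference code), assuming a hypothetical problem.action_cost(s, a, s') and using a counter as a tie-breaker for the priority queue:

    import heapq, itertools

    def uniform_cost_search(problem):
        counter = itertools.count()                 # tie-breaker so heapq never compares states
        start = problem.initial
        frontier = [(0, next(counter), start)]      # priority queue ordered by g
        reached = {start: 0}                        # best known path cost per state
        while frontier:
            g, _, state = heapq.heappop(frontier)
            if problem.is_goal(state):              # late goal test: only when popping
                return state, g
            for action in problem.actions(state):
                child = problem.result(state, action)
                cost = g + problem.action_cost(state, action, child)
                if child not in reached or cost < reached[child]:
                    reached[child] = cost
                    heapq.heappush(frontier, (cost, next(counter), child))
        return None                                 # failure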

Utility-based agents

- uses a performance measure to differentiate between world states. - uses a utility function. - this function allows for more flexibility: . conflicting goals . a set of multiple uncertain goals. - it no longer has a binary of happy/sad; actions can now make the agent happy or sad along a spectrum. - designed for situations in which many different paths reach the goal: it is designed with preference. - unlike goal-based agents, it does not fail if it can't reach a goal; it can choose the least bad option. So if it can't achieve the goal, it can at least do something (choose the best action among the actions that get it closer to its goal). - rational utility-based agent: maximize expected utility.

problem solving agent

- when the correct action to take is not immediately obvious, an agent may need to plan ahead: to consider a sequence of actions that form a path to a goal state. - uses search. - sees the world in an atomic representation. - hopes that the environment is: episodic, single-agent, fully observable, deterministic, static, discrete, and known. - types of problems: . solution- fixed sequence of actions. . future actions don't depend on future observations.

Stateless

- world state = current percepts. - effects of time and actions are not modeled.

optimal decisions in games

- MAX's strategy must be a conditional plan: a contingent strategy specifying a response to each of MIN's possible moves.
- optimality of minimax:
  . need a contingent strategy
  . calculate the value of each state: MINIMAX(s)
  . this calculates the utility (for MAX) of being in a given state
  . assumes that both players play optimally! (if MIN does not play optimally, then MAX will do at least as well as against an optimal player)

MINIMAX(s) =
  UTILITY(s)                                    if TERMINAL-TEST(s)
  max_{a in ACTIONS(s)} MINIMAX(RESULT(s,a))    if PLAYER(s) = MAX
  min_{a in ACTIONS(s)} MINIMAX(RESULT(s,a))    if PLAYER(s) = MIN
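
A direct Python transcription of the MINIMAX(s) definition above (a sketch, assuming a hypothetical game object with is_terminal, utility, to_move, actions, and result methods):

    def minimax_value(game, state):
        # Utility (for MAX) of a state under optimal play by both sides.
        if game.is_terminal(state):
            return game.utility(state, "MAX")
        values = [minimax_value(game, game.result(state, a)) for a in game.actions(state)]
        return max(values) if game.to_move(state) == "MAX" else min(values)

    def minimax_decision(game, state):
        # MAX picks the action leading to the child with the highest minimax value.
        return max(game.actions(state),
                   key=lambda a: minimax_value(game, game.result(state, a)))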

Fully vs Partially Observable

- fully observable: . the agent's sensors give it access to the complete state of the environment (or all relevant aspects). . no internal state needed! . example: AlphaGo - partially observable: . caused by noisy, inaccurate, or missing sensors. . examples: autonomous shuttle, vacuum robot (it might know that the current block is dirty, but it doesn't know whether other blocks also have dirt). - unobservable: no sensors

hill climbing

- overview:
  . a loop that continually moves in the direction of increasing value
  . terminates when it reaches a peak
- important to note:
  . does not maintain a search tree
  . memory requirement: a single node
  . does not look beyond immediate neighbors
  . requires a heuristic h measuring the quality of the solution, the "objective function" (the heuristic needs to be good)
- in basic terms, the algorithm is:
  . find all incremental modifications of the candidate solution
  . pick the best one
  . repeat until improvements stop

Static vs. Dynamic

. Dynamic: - Environment can change as the agent is deliberating. - The agent must continually choose actions. (your environment can change as you are thinking.) - example: autonomous shuttle. . static: - otherwise (as you are thinking the environment does not change.) -example: go or chess. . semidynamic: - environment does not change with the passage of time, but the agent's performance score does.

Known vs. Unknown

. Unknown: - Agent does not know the laws of the environment. - Agent has to learn how the world works in order to make good choices. - example: outlet robot. . known: . Outcome (or outcome probabilities if the environment is stochastic) for all actions are given.

Goal-based agents

. Uses a goal to define the desirability of states.
. No more condition-action rules with direct mapping.
. Considers "what will happen if I take an action, and will it make me happy?"
. Goals are binary: either I am happy or I am not. There is no gradient.
. Since there is no more direct mapping, a goal can lead to more than just a single if-then condition-action rule. Saying "don't hit the car in front of you" could mean not only braking but also swerving. This means goal-based agents are more flexible than model-based ones.
Complexities of goal-based agents:
. How to define a goal? Define a goal by having a final definition of what it means to succeed.
. How to satisfy a goal? - the agent may have to consider a long sequence of actions - search - planning
Limitations of goal-based agents:
- cannot optimize across different paths that result in the goal state. If more than one path leads to the goal, the agent won't know which to prefer: its ability to ask whether an action makes it happy is a binary yes or no, so these different but still correct paths would all be marked as making the agent happy. In the same vein, if no path leads to the goal, the agent fails, even if one path got it closer to the end goal. In its eyes, progress is not good enough unless it leads to the goal being finished.

learning agent

. agent operates in an initially unknown environment and becomes more competent in time. . is made up of these components: - learning element - performance element - critic - problem generator

Queue data structure

. frontier must be stored in accessible manner, so use a queue. . the queue must have these operations: - empty(queue) - true if no more elements in queue - pop(queue) - removes first element of queue and returns element - insert(element, queue) - inserts element and returns queue. - three types: FIFO, LIFO (stack), priority
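
The three queue types map directly onto Python's standard library (an illustration, not part of the notes):

    from collections import deque
    import heapq

    fifo = deque()                        # FIFO queue (BFS frontier)
    fifo.append("a"); fifo.append("b")
    assert fifo.popleft() == "a"          # oldest element comes out first

    lifo = []                             # LIFO queue / stack (DFS frontier)
    lifo.append("a"); lifo.append("b")
    assert lifo.pop() == "b"              # newest element comes out first

    pq = []                               # priority queue (UCS frontier, ordered by cost)
    heapq.heappush(pq, (5, "a")); heapq.heappush(pq, (2, "b"))
    assert heapq.heappop(pq) == (2, "b")  # cheapest element comes out first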

episodic vs sequential

. Episodic: - The agent's experience is divided into atomic episodes. - The agent receives a percept and performs an action. - The current action does not depend on previous actions. - example: a single game of chess where you don't know anything about your opponent, so no action takes into account, say, the strategies the opponent might favor, which one would only learn from playing the same person more than once. . Sequential: - current decisions can affect future decisions. - short-term choices have long-term consequences. - example: within a single game of chess, moving a piece will affect the future actions that one will take.

Alpha-Beta Pruning

A method for eliminating branches during the search that will not affect the outcome. Minimax with alpha-beta pruning always returns the same scores as minimax would when using the same depth limit, and it is potentially much faster than minimax.
- can be applied to trees of any depth
- often can prune entire subtrees
- consider a node n somewhere in the tree:
  . if the player has a better choice m at a parent node of n, then n will never be reached during play.
  . once we've learned enough about n (via its descendants) we can prune it
- definitions:
  . alpha = the value of the best choice so far for MAX
  . beta = the value of the best choice so far for MIN
- since minimax is DFS, at any one time we just have to consider the nodes along a single path in the tree.
- a sketch follows below.
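
A Python sketch of minimax with alpha-beta pruning, using the same hypothetical game interface as the minimax sketch above; it returns the same value plain minimax would:

    def alpha_beta_value(game, state, alpha=float("-inf"), beta=float("inf")):
        if game.is_terminal(state):
            return game.utility(state, "MAX")
        if game.to_move(state) == "MAX":
            value = float("-inf")
            for a in game.actions(state):
                value = max(value, alpha_beta_value(game, game.result(state, a), alpha, beta))
                if value >= beta:          # MIN already has a better choice higher up: prune
                    return value
                alpha = max(alpha, value)  # best choice so far for MAX
            return value
        else:
            value = float("inf")
            for a in game.actions(state):
                value = min(value, alpha_beta_value(game, game.result(state, a), alpha, beta))
                if value <= alpha:         # MAX already has a better choice higher up: prune
                    return value
                beta = min(beta, value)    # best choice so far for MIN
            return value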

autonomous

A system is autonomous to the extent that its behavior is determined by its own percepts, rather than by the prior knowledge of its designer.

Perfect agent maximizes

Actual performance

Evaluation functions

Allow us to approximate the true utility of a state without doing a complete search.
- make use of our limited computation time: cut off the search at some point and treat non-terminal nodes as if they were terminal.
- replace the UTILITY function with EVAL, which estimates the state's utility.
- replace the terminal test with a cutoff test, which returns true for terminal states but otherwise is free to decide when to cut off the search based on properties like search depth.
- EVAL(s,p) = estimate of the expected utility of state s to player p
- if in a terminal state, EVAL(s,p) = UTILITY(s,p)
- if not in a terminal state, the evaluation must be between a win and a loss: UTILITY(loss,p) <= EVAL(s,p) <= UTILITY(win,p)
Good evaluation function:
- the computation must not take too long.
- the evaluation function should be strongly correlated with the actual chances of winning. (Cutting off a game early leads to uncertainty about its final outcome. If all players made the correct move at every step, the final outcome would be predetermined; by cutting off in the middle we are bound by uncertainty, and this uncertainty means the environment is now effectively stochastic.)
How the evaluation function works:
- you get an expected value based on how good or bad you expect your state to be, based on a set of properties. This calculation is not straightforward (if using win/loss percentages, you need extensive knowledge in advance). Instead, use features of the board, which give numeric descriptions of aspects of the current world state, and weight them:
  EVAL(s) = sum_{i=1}^{n} w_i * f_i(s),  where f_i = feature and w_i = weight (how important that feature is)
Limitations:
- it is a simplistic assumption that lets you do something very quickly.
- it assumes independence between features, or it may mis-model dependencies between them. (Overall it might miss relationships between characteristics of the environment that are quite meaningful.)
Benefits of EVAL functions:
- you don't need to reach a terminal state to get some information about your current path; it gives the nodes in the middle a value to be judged on. You don't need to rely on the utility function, but instead on an estimate of it.
Challenges of using an evaluation function:
- you can make short-sighted decisions.
- what is the proper cutoff point?
- short- vs. long-term goals
- the horizon effect
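
A toy Python sketch of the weighted-feature EVAL above; the features, weights, and state fields are made up for illustration:

    def weighted_eval(state, features, weights):
        # EVAL(s) = sum_i w_i * f_i(s): a weighted linear combination of features.
        return sum(w * f(state) for w, f in zip(weights, features))

    # hypothetical chess-like features: material balance and mobility
    features = [lambda s: s["my_material"] - s["their_material"],
                lambda s: s["my_moves"] - s["their_moves"]]
    weights = [9.0, 0.1]                   # how important each feature is
    state = {"my_material": 31, "their_material": 30, "my_moves": 20, "their_moves": 25}
    print(weighted_eval(state, features, weights))   # 9.0*1 + 0.1*(-5) = 8.5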

agent

Anything that can be viewed as perceiving its environment through sensors and acting upon that environment through actuators.

path cost

Assigns numeric cost to path: STEPCOST(s, a, s'). - what is the cost of going from s to s' via action a

Rationality of an action

Depends on the performance measure, the agent's prior knowledge, the actions the agent can perform, and the agent's percept sequence.

Deterministic vs. Nondeterministic

Deterministic: . The next state of the environment is completely determined by the current state of the environment and the action executed by the agent. (You always know the outcome of an action: if you flip a coin it will always land on the same side; if you plug something into an outlet you always know whether it zaps you or not. There is no probability involved.) . Nondeterministic: - otherwise.

Rational agent maximizes

Expected performance

Rationality

For each possible percept sequence, a rational agent should select an action that is expected to maximize its performance measure, given the evidence provided by the percept sequence and whatever built-in knowledge the agent has.

time complexity

How long does it take to find a solution?

Space Complexity

How much memory is needed to perform the search?

completeness

Is the algorithm guaranteed to find a solution when there is one?

An agents choice of action at any given time depends on:

Its built-in knowledge and the entire percept sequence observed to date, but not anything it hasn't perceived.

PEAS

Performance measure, Environment, Actuators, Sensors

rational behavior

Rational behavior depends only on the percept sequence to date. An important part of rational behavior is information gathering

Percept

The content an agent's sensors are perceiving

Rational Agent

The goal is to compute the right thing. It should maximize expected performance with respect to evidence provided by percept sequence, built-in (prior) knowledge, and its capabilities (available actions). It must also be rational.

Evaluation vs utility function

Utility tells you the true outcome: you won or you lost. An evaluation is an estimate of that utility.

Factored state

a state is represented by underlying pieces; the components of the state can be seen. For example, a city is composed of buildings like a school.

Agent structure

agent structure = agent program + agent architecture

Adversarial search

competitive environments in which two or more agents have conflicting goals.

local search complete and optimal

complete - always finds a goal if one exists (it always finds a global min/max). optimal - the state it returns is a global min/max.

DFS search performance criteria

completeness:
- complete: graph search (finite spaces) (because it won't loop, since it has a reached set)
- not complete: tree search (finite spaces) (tree search can loop)
- not complete: both versions (infinite spaces)
optimality:
- not optimal! (it returns the first solution it finds, which may be deeper and costlier than the best one)
time complexity:
- bounded by the size of the state space
- let m = maximum depth -> O(b^m)
space complexity:
- graph search offers no advantage over BFS
- tree search:
  . store the current path
  . store the unexpanded siblings
  . once a node has been expanded it can be removed from memory! this means the space complexity is O(bm), where b = branching factor and m = maximum depth

BFS search performance criteria

completeness:
- complete if the shallowest goal node is at a finite depth and the branching factor is finite
optimality:
- optimal if the path cost is a non-decreasing function of the depth of the node (simplest case: all path costs are the same). It is optimal because it checks all of the nodes on level d before level d+1; since it checks every node at each level before moving on to the next, it will find a solution on level d first, even if a solution also exists on level d+1, so the solution it finds is always the optimal one.
time complexity:
- assume branching factor b and depth d: O(b^d)
space complexity:
- assume the same branching factor and depth: O(b^d)
- generally, BFS runs out of space before it runs out of time!

infrastructure for search algorithm

each node is made up of: - n.STATE: the state in the state space - n.PARENT: the node in the search tree that generated n - n.ACTION: the action applied to the parent to generate n - n.PATH-COST: the cost of the path from the initial state to n - a new node is created by the function CHILD-NODE(problem, parent, action); a sketch follows below.
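
A sketch of this node bookkeeping as a Python dataclass (an assumption about the interface, with a hypothetical problem.result and problem.step_cost):

    from dataclasses import dataclass
    from typing import Any, Optional

    @dataclass
    class Node:
        state: Any                         # n.STATE: the state in the state space
        parent: Optional["Node"] = None    # n.PARENT: the node that generated this one
        action: Any = None                 # n.ACTION: the action applied to the parent
        path_cost: float = 0.0             # n.PATH-COST: g(n), cost of the path so far

    def child_node(problem, parent, action):
        # CHILD-NODE(problem, parent, action) from the notes.
        state = problem.result(parent.state, action)
        cost = parent.path_cost + problem.step_cost(parent.state, action)
        return Node(state=state, parent=parent, action=action, path_cost=cost)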

Properties of task environment

- fully vs. partially observable
- single vs. multi-agent
- deterministic vs. stochastic/nondeterministic
- episodic vs. sequential
- static vs. dynamic
- discrete vs. continuous
- known vs. unknown

first-choice hill climbing

generate successors randomly until you find a better one (doesn't have to be the best successor, just better than the current one)

Actuators

how the agent can affect the environment through chosen actions.

Sensors

how an agent can measure/know features of the environment, and how well.

branching factor (b)

how many things you can do, on average, in a given state. Tells us how wide our graph will get: the number of nodes at depth d is b^d. (A small worked example follows below.)
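
A quick worked example (the numbers are arbitrary) of how b and d drive the size of the search tree:

    b, d = 10, 6                                     # hypothetical branching factor and depth
    nodes_at_depth_d = b ** d                        # 1,000,000 nodes on the deepest level alone
    total_nodes = sum(b ** i for i in range(d + 1))  # 1,111,111 nodes in the whole tree
    print(nodes_at_depth_d, total_nodes)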

Agent Program

implements agent function (mapping percepts to actions)

Task Environment

is described in PEAS: -performance measure -environment -actuators -sensors

goal test

is the given state the goal state? e.g., IN(SB) - returns a binary yes/no

components of a learning agent

learning element - makes improvements. (allows us to understand how we can make improvements) performance element - selects external actions. (selects actions, and adds knowledge back in) critic- determines how the agent is doing - determines how the performance element should be modified. (the performance standard is added by the person making the agent. the agent does not make its own) problem generator - explores the space - exploration vs. exploitation

frontier

the leaves of the partial search tree: states you have seen but not yet gone to. These are the states to explore next.

structured state

like a factored state but the internal components of the state are also connected to each other.

Minimax

- a recursive algorithm that proceeds all the way down to the leaves of the tree, then backs up the minimax values through the tree as the recursion unwinds.

node

n.state = state in state-space n.parent = the node that generated this one n.action = the action from the parent to this node n.path-cost g(n) = cost of path thus far

UCS: search performance criteria

optimality:
- optimal! when node n is expanded, the optimal path to n has been found
- note: this is true because step costs are non-negative!
completeness:
- complete, given that the cost of every step is greater than some small constant epsilon
time and space complexity:
- C* = the cost of the optimal solution, epsilon = the minimum step cost
- O(b^(1 + floor(C*/epsilon)))

function CHILD-NODE(problem, parent, action)

returns a node with:
  state = problem.RESULT(parent.state, action)
  parent = parent
  action = action
  path-cost = parent.PATH-COST + problem.STEP-COST(parent.state, action)

DFS function BEST-FIRST-SEARCH(problem, f)

returns a solution node or failure
  node <- NODE(STATE = problem.INITIAL)
  frontier <- a priority queue ordered by f, with node as an element
  while not IS-EMPTY(frontier) do:
    node <- POP(frontier)
    if problem.IS-GOAL(node.STATE) then return node
    for each child in EXPAND(problem, node) do:
      s <- child.STATE
      if s is not in reached or child.PATH-COST < reached[s].PATH-COST then:
        reached[s] <- child
        add child to frontier
  return failure

bfs alg/ function BEST-FIRST-SEARCH(problem, f)

returns a solution node or failure
  node <- NODE(STATE = problem.INITIAL)
  frontier <- a FIFO queue, with node as an element
  reached <- a set of states, with one entry with value node.STATE
  if problem.IS-GOAL(node.STATE) then return node
  while not IS-EMPTY(frontier) do:
    node <- POP(frontier)
    for each child in EXPAND(problem, node) do:
      if problem.IS-GOAL(child.STATE) then return child
      s <- child.STATE
      if s is not in reached then:
        add s to reached
        add child to frontier
  return failure

function BEST-FIRST-SEARCH(problem, f)

returns a solution node or failure
  node <- NODE(STATE = problem.INITIAL)
  frontier <- a priority queue ordered by f, with node as an element
  reached <- a lookup table, with one entry with key problem.INITIAL and value node
  while not IS-EMPTY(frontier) do:
    node <- POP(frontier)
    if problem.IS-GOAL(node.STATE) then return node
    for each child in EXPAND(problem, node) do:
      s <- child.STATE
      if s is not in reached or child.PATH-COST < reached[s].PATH-COST then:
        reached[s] <- child
        add child to frontier
  return failure

UCS function BEST-FIRST-SEARCH(problem, f)

returns a solution node or failure
  node <- NODE(STATE = problem.INITIAL)
  frontier <- a priority queue ordered by f, with node as an element
  reached <- a lookup table, with one entry with key problem.INITIAL and value node
  while not IS-EMPTY(frontier) do:
    node <- POP(frontier)
    if problem.IS-GOAL(node.STATE) then return node
    for each child in EXPAND(problem, node) do:
      s <- child.STATE
      if s is not in reached or child.PATH-COST < reached[s].PATH-COST then:
        reached[s] <- child
        add child to frontier
  return failure

function GRAPH-SEARCH(problem)

returns a solution or failure
  initialize the frontier with the initial state of problem
  "initialize the explored set to be empty"
  loop do:
    if the frontier is empty then return failure
    choose a leaf node and remove it from the frontier
    if the node contains a goal state then return the solution
    "add the node to the explored set"
    expand the chosen node, adding the resulting nodes to the frontier "only if not in the frontier or explored set yet"

function SIMULATED-ANNEALING(problem, schedule)

returns a state
  current <- problem.INITIAL
  for t = 1 to infinity do
    T <- schedule(t)
    if T = 0 then return current
    next <- a randomly selected successor of current
    deltaE <- VALUE(next) - VALUE(current)
    if deltaE > 0 then current <- next
    else current <- next only with probability e^(deltaE/T)

function HILL-CLIMBING(problem)

returns a state that is a local maximum
  current <- problem.INITIAL
  while true do
    neighbor <- a highest-valued successor of current
    if VALUE(neighbor) <= VALUE(current) then return current
    current <- neighbor

description of the possible actions

set of actions applicable in state s: {a1, a2, a3, ...} = ACTIONS(s)

Single vs multi-agent

single agent: - agent A can treat entity B as simply part of the environment (not as another agent). multiagent: - is B's behavior best described as maximizing a performance measure whose value depends on agent A's behavior? - examples: . in chess, the opponent entity B is trying to maximize its performance measure, which, by the rules of chess, minimizes agent A's performance measure (a competitive multiagent environment). . in the taxi-driving environment, avoiding collisions maximizes the performance measure of all of the agents, so it is a partially cooperative multiagent environment.

Transition Model

state resulting from action a: s' = RESULT(s, a)

state vs node

state - a configuration of the world; node - a step along the path (a state is a component of a node)

Simulated Annealing

strategy: - instead of the best move, pick a random move - if the move improves things, accept it; else, accept it with probability p < 1, p = e^(-deltaf/T) - the probability decreases with: - the badness of the move - the temperature (T decreases over time, making -deltaf/T more negative, so the probability of shaking things up decreases over time) - if the schedule lowers T slowly enough, it will find the global optimum with probability approaching 1. - a small acceptance-rule sketch follows below.
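
A small Python sketch of just the acceptance rule, matching the SIMULATED-ANNEALING pseudocode card elsewhere in this set (deltaE = VALUE(next) - VALUE(current), so a worsening move has deltaE < 0):

    import math, random

    def accept(delta_e, temperature):
        # Accept an improving move always; a worsening move only with probability e^(deltaE/T).
        if delta_e > 0:
            return True
        if temperature <= 0:
            return False
        return random.random() < math.exp(delta_e / temperature)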

search algorithm

takes a search problem as input and returns a solution, or an indication of failure.

Percept Sequence

the complete history of everything the agent has ever perceived.

Agent Architecture

the computing device with physical sensors and actuators.

Discrete vs Continuous

this has to do with: - the state of the environment - the way in which time is handled - the percepts and actions of the agent. If the environment is not still or static, as in the case of a self-driving car, then it is continuous. If time is involved, it is usually continuous, because time is represented on a line of CONTINUOUS values; angles come in a CONTINUOUS set of values. However, in chess, for example, the set of moves available to a piece is discrete, and the number of ways the pieces can be placed is discrete. So, if the clock is ignored, chess is discrete.

environment

what features of the world are germane, and what values they can take

Performance Measure

what somebody wants to achieve thanks to the agent's actions.

initial state

where the agent starts: IN(AA)

function EXPAND(problem, node)

yields nodes
  s <- node.STATE                                                   (your current state)
  for each action in problem.ACTIONS(s) do:                         (all the things you could do in state s)
    s' <- problem.RESULTS(s, action)                                (where you could end up)
    cost <- node.PATH-COST + problem.ACTION-COSTS(s, action, s')    (how expensive it would be to get there)
    yield NODE(STATE = s', PARENT = node, ACTION = action, PATH-COST = cost)
(yield generates a sequence of values, one each time yield is encountered)

measuring problem-solving performance

you measure the performance of the search algorithm in respect to completeness, optimality, time complexity, and space complexity.

H-MINIMAX(s,d) =

H-MINIMAX(s, d) =
  EVAL(s, MAX)                                         if IS-CUTOFF(s, d)
  max_{a in ACTIONS(s)} H-MINIMAX(RESULT(s,a), d+1)    if TO-MOVE(s) = MAX
  min_{a in ACTIONS(s)} H-MINIMAX(RESULT(s,a), d+1)    if TO-MOVE(s) = MIN

Minimax used with an evaluation function. Makes alpha-beta pruning faster.

