AI Exam 2
Nearest neighbor k-NN
Base classification of a sample on the number of votes of k of its nearest neighbors rather than just the single nearest (denoted k-NN).
Optimal Even odd game
Best mixed strategy is the mixed-strategy Nash equilibrium (for this zero-sum game, the maximin solution). Look at notes / practice the calculation
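A minimal Python sketch of that calculation for a generic 2x2 zero-sum game (the payoff numbers below are placeholders, not the class payoffs):

def mixed_equilibrium_2x2(a11, a12, a21, a22):
    # Row player's payoffs in a 2x2 zero-sum game (column gets the negatives).
    # Assumes no pure-strategy saddle point, so the denominator is nonzero.
    denom = a11 - a12 - a21 + a22
    p = (a22 - a21) / denom   # prob. row plays strategy 1 (makes column indifferent)
    q = (a22 - a12) / denom   # prob. column plays strategy 1 (makes row indifferent)
    value = (a11 * a22 - a12 * a21) / denom   # value of the game to the row player
    return p, q, value

# Matching-pennies-style payoffs give the familiar 50/50 mix with value 0.
print(mixed_equilibrium_2x2(1, -1, -1, 1))   # (0.5, 0.5, 0.0)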
Most common representation for genetic algorithms
Bit level. Crossover and mutation can be used to directly produce potential solutions
Regression
Output is continuous (minimize the squared error)
Goal of machine learning
Develop algorithms that automate the decision making process
Classification
Discrete
Prisoner's Dilemma
Dominant strategy is for both players to testify. The Nash equilibrium (testify, testify) is not Pareto optimal
Classification Errors
When a known observation that belongs to one class is classified as another. A simple baseline is to classify all observations as belonging to the most prevalent class (the naive rule)
alpha beta pruning
method to cut off large parts of the game tree during minimax search
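A minimal Python sketch of minimax with alpha-beta pruning; the nested-list game tree used here is only an illustration, not from the notes:

def alphabeta(node, alpha, beta, maximizing):
    # Leaves are utility values; internal nodes are lists of child subtrees.
    if not isinstance(node, list):
        return node
    if maximizing:
        value = float('-inf')
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False))
            alpha = max(alpha, value)
            if alpha >= beta:   # remaining children cannot change MIN's choice
                break           # prune
        return value
    value = float('inf')
    for child in node:
        value = min(value, alphabeta(child, alpha, beta, True))
        beta = min(beta, value)
        if alpha >= beta:       # remaining children cannot change MAX's choice
            break               # prune
    return value

# Pruning skips branches but returns the same value as plain minimax.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(alphabeta(tree, float('-inf'), float('inf'), True))   # 3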
Nearest neighbor k-NN: problem with too small of a k
puts too much emphasis on chance locations of a few sample points
Tragedy of the Commons
situation in which people acting individually and in their own interest use up commonly available but limited resources, creating disaster for the entire community. Equivalently: if nobody has to pay for a common resource, it tends to be exploited, leading to less utility for all
Genetic algorithms
start out with a population of individuals (sample problem states) with each individual represented by a chromosome (bit string)
Markov decision process
stochastic, discrete states
Nearest neighbor k-NN: problem with too large of a k
suppresses the fine structure of underlying density
Overfitting
the result of overtraining: it produces a hypothesis so specific to the training data that it fits the rest of the data poorly
Alpha (alpha beta pruning)
the value of the best choice found so far along the path for MAX. If a node's value V is worse than alpha, MAX will avoid it, so that subtree is pruned
Basic Genetic Algorithm (5 steps)
1) Generate a random population; 2) If the termination criteria are satisfied, stop, else continue; 3) Determine the fitness of each individual; 4) Apply crossover and mutation to selected individuals of the current generation to create a new generation; 5) Return to step 2
Crossover (3 steps)
1) Select a random crossover point; 2) Break each chromosome into 2 parts at the crossover point; 3) Recombine the broken chromosomes by mixing and matching. Usually applied at one point but more points can be used
Anti coordination
A and B compete, and choosing X is better only if the other player chooses Y, so the players benefit from choosing differently. Outcomes where both choose the same option are not in Nash equilibrium
Boolean random variable
B(q) = -(qlogbase2(q) + (1-q)logbase2(1-q))
Estimation of Error Rate
After a classifier is developed, its performance needs evaluation, usually via counting-based error estimation or a confusion matrix
Mutation
After a new individual is generated, the mutation operation is applied to each bit of its chromosome individually, flipping that gene (bit) with a low probability
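A minimal Python sketch tying the GA cards together (bit-string chromosomes, proportionate selection, one-point crossover, per-bit mutation); the fitness function is a placeholder one-max problem, not one from the course:

import random

def run_ga(fitness, n_bits=8, pop_size=20, generations=50, p_mut=0.01):
    # 1) random initial population of bit-string chromosomes
    pop = [[random.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):                     # 2) loop until termination
        scores = [fitness(ind) for ind in pop]       # 3) fitness of each individual
        new_pop = []
        while len(new_pop) < pop_size:
            # proportionate (roulette-wheel) selection of two parents
            p1, p2 = random.choices(pop, weights=scores, k=2)
            point = random.randint(1, n_bits - 1)    # 4a) one-point crossover
            child = p1[:point] + p2[point:]
            # 4b) flip each gene (bit) independently with low probability
            child = [b ^ 1 if random.random() < p_mut else b for b in child]
            new_pop.append(child)
        pop = new_pop                                # 5) next generation
    return max(pop, key=fitness)

# Placeholder fitness: count of 1-bits, kept positive for the selection weights.
print(run_ga(lambda ind: sum(ind) + 1))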
Reinforcement Learning
Agent actively learns from experience and explores the world in order to do so.
Supervised Learning
Aka classification / regression. Output has labeled values. The classifier is designed using training sets of inputs with a known class (function output), to eventually be used for predicting unknown inputs.
Unsupervised learning
Aka clustering. Training data is unlabeled, so there is no defined output.
Mixed Strategy
Allows for unpredictability (chance events)
Problem of scale (nearest neighbor)
An arbitrary change in the unit of measure of one feature can skew results, as can vast differences in the range of values. To prevent this, scale factors should be applied as needed so that features all have ~ the same standard deviation, range, and other statistical measures across the dataset.
Coordination Game
Assuming common knowledge and rationality, coordinating on the same standard (Standard X / Standard Y) is the best option. The dominant strategy should be played if one exists
Inductive learning
Attempt to learn new knowledge from examples. Learn a general function / rule from specific input / output pairs. Construct or adjust a hypothesis function h to agree with the unknown function after being given training examples
Deductive learning
Attempts to deduce new knowledge from rules and facts using logic inference. Go from a general rule to a new rule logically
Nearest-neighbor classification
Each input is a point in the feature space. Classify unknown as belonging to a class of most similar or nearest sample point in training data. Nearest usually means smallest distance in an n-dimensional feature space
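A minimal Python sketch of k-NN classification with Euclidean distance; the data is made up, and the features are scaled first to address the problem of scale noted above:

import math
from collections import Counter

def fit_scaler(data):
    # per-feature mean and standard deviation, so all features get ~the same scale
    cols = list(zip(*data))
    means = [sum(c) / len(c) for c in cols]
    stds = [max(1e-12, math.sqrt(sum((x - m) ** 2 for x in c) / len(c)))
            for c, m in zip(cols, means)]
    return means, stds

def scale(row, means, stds):
    return [(x - m) / s for x, m, s in zip(row, means, stds)]

def knn_classify(train_x, train_y, query, k=3):
    # distance to every training sample: O(N) per query
    nearest = sorted(range(len(train_x)),
                     key=lambda i: math.dist(train_x[i], query))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]   # majority vote

# Made-up 2-feature training set; the second feature has a much larger range.
X_raw = [[1.0, 200], [1.2, 220], [3.0, 600], [3.2, 640]]
y = ['A', 'A', 'B', 'B']
means, stds = fit_scaler(X_raw)
X = [scale(r, means, stds) for r in X_raw]
print(knn_classify(X, y, scale([1.1, 230], means, stds), k=3))   # 'A'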
Gray Coding
Each number differs from its neighbors by exactly one bit, so the genetic operator between neighboring states is a single bit flip.
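A minimal Python sketch of the standard binary-to-Gray conversion (n XOR n>>1), showing the one-bit-flip property:

def to_gray(n):
    # adjacent integers map to codewords that differ in exactly one bit
    return n ^ (n >> 1)

for i in range(16):
    print(format(i, '04b'), '->', format(to_gray(i), '04b'))
# binary 0111 -> 1000 flips four bits; the Gray codes 0100 -> 1100 flip only one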
Information gain from querying A
Gain(A) = B(p/(p+n)) - Remainder(A) bits
Finite games
Guaranteed to end. Finite number of choices for each player. Has well defined rules. Played with intent to win.
H(output) = B(?)
H(output) = B(p/(p+n))
Nash Equilibrium
If an outcome is in equilibrium, then no player can gain from unilaterally changing strategy. A local optimum. A pure-strategy equilibrium may not exist, but at least one mixed-strategy equilibrium is guaranteed.
Nearest neighbor k-NN: Curse of dimensionality
In low dimensions with lots of data, k-NN works well but in higher dimensions, nearest neighbor is not as near. For N samples of a known class, the query for one sample takes O(N)
Non zero sum games
In zero-sum games, the players' payoffs are inversely related. In non-zero-sum games, one player's gain is not necessarily another player's loss. Communication and binding agreements (cooperative strategies) lead to a better outcome for all involved
Mixed Strategy Games
Involves chance. Strategy kept secret from opponent
Dominant strategy and rationality
Irrational to play dominated strategy. Irrational to not play dominant strategy if it exists
How does an agent learn?
It uses a collection of input / output pairs to learn a prediction function
Mixed Strategy Nash Equilibrium
Many games don't have a pure-strategy Nash equilibrium; in those cases the equilibrium is found in mixed strategies, with each player randomizing over pure strategies
Pure Strategy Games
No chance involved
Infinite Games
No definite end / beginning. Played with goal of continuation. Has partially defined or no rules
Pareto Optimal
No other strategy profile would make all players better off or the same (with at least one strictly better); i.e., no profile Pareto-dominates it
Even-odd game
No pure strategy Nash equilibrium. Assume players don't know what the best mixed strategies are, so they arbitrarily choose to play each number a certain percentage of the time.
Does pruning affect the final result?
No.
Non-parametric Classification
Not enough knowledge or data to be able to assume a general form of a model or to estimate the relevant parameters.
Fitness
Objective / cost function, determines the likelihood of reproduction
Cardinal payoffs
On interval scale. Need to know more than just ordering / preferences.
Ordinal payoffs
Only need to know ordering / preferences of outcomes
Strongly dominant strategy
Payoff is strictly better than that of every other strategy, no matter what the other players do
Weakly dominant strategy
Payoff is never worse than that of any other strategy, no matter what the other players do
Elitism
Places the most fit individual, without change, into the new generation. This guarantees that the best fitness in each generation only improves or stays the same
3 major components of Game theory
Players (the agents who play the game), Strategies (what agents do, how they respond in all possible situations), Payoffs (how much each player likes the result; a subjective score)
Ockham's razor
Prefer the simplest hypothesis that is consistent or best fit with the training data. Beware of underfitting.
Expected remainder of entropy after A
Remainder(A) = sumk[((pk+nk)/(p+n)) B(pk/(pk+nk))]
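A minimal Python sketch combining the B, Remainder, and Gain formulas; the example counts are made up:

import math

def B(q):
    # entropy of a Boolean random variable that is true with probability q
    if q in (0.0, 1.0):
        return 0.0
    return -(q * math.log2(q) + (1 - q) * math.log2(1 - q))

def remainder(splits, p, n):
    # splits: list of (pk, nk) counts, one pair per value of attribute A
    return sum(((pk + nk) / (p + n)) * B(pk / (pk + nk)) for pk, nk in splits)

def gain(splits, p, n):
    return B(p / (p + n)) - remainder(splits, p, n)

# 6 positive and 6 negative examples split by a hypothetical attribute
# into (2 pos, 4 neg) and (4 pos, 2 neg):
print(gain([(2, 4), (4, 2)], p=6, n=6))   # ~0.082 bits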
Simultaneous games
Represent events happening at the same time, aka static games. Each player makes a single decision without knowing what the other players are doing.
Sequential Games
Represent events unfolding over time, aka dynamic games. Players move in turn and know the moves the other players have already made.
Potential issues with binary representation (genetic algorithms)
Representation anomaly: adjacent numbers like 0111 and 1000 differ by many bits, which can cause problems for analysis and for the genetic operators. Fix with Gray coding
Parameterization
Representing a function in terms of a few optimization parameters
Learning decision tree
Represents a function that takes a vector of attribute values as input and returns a single output value (the decision). Evaluation starts at the root and makes decisions until arriving at a leaf
System integration and Euler's method
Sk+1 = Sk + h*f(Sk, Uk, tk). Integration process continues until last step reaches final node.
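A minimal Python sketch of that forward-Euler step; the one-dimensional dynamics and constant control are made up for illustration:

def euler_integrate(f, s0, u, t0, tf, h):
    # step the state forward, sk+1 = sk + h*f(sk, uk, tk), until the final time
    s, t = s0, t0
    while t < tf:
        s = s + h * f(s, u(t), t)
        t = t + h
    return s

# Hypothetical dynamics ds/dt = -s + u with a constant control input u = 1.
print(euler_integrate(lambda s, u, t: -s + u, s0=0.0, u=lambda t: 1.0,
                      t0=0.0, tf=5.0, h=0.01))   # approaches the steady state 1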
Dynamic system components
State variables s(t), action / control variables u(t), differential equations describing how the system evolves with time
Parametric
Statistical analysis based on an assumption about the population distribution
Non-parametric
Statistical analysis not based on assumptions about the population distribution
Schelling (focal) point
Strategy profile that players tend to choose in the absence of communication because it stands out as salient
Game theory
Study of strategic, interactive decision making among rational agents
Payoff matrix
Tensor of order n for n players. Each element of the tensor contains the cardinal payoff of each player for that combination of strategies.
Open loop control
The control history may be applied without sensing the environment (impractical): prior knowledge allows the agent to act blindly in the environment
Sequential decision problem
Time discretized into steps. At each step, an agent must take an action with a reward or cost. It ends when it reaches a terminal state with terminal rewards
No dominant strategy
Two Nash equilibria exist. Choose the Pareto-optimal one
Causes of unpredictability (mixed strategy)
Uncertainty about event outcome, game structure, pure strategy of a player
Closed loop control
Uses the previously found state trajectory Sref(t) as the reference trajectory to be tracked.
Reinforcement Learning
Where there are positive or negative rewards for correct / incorrect answers
Entropy
a measure of the uncertainty of a random variable. As information increases, the entropy decreases. H(V) = -sumk(P(Vk)logbase2(P(Vk)))
Pareto dominated
a strategy profile is Pareto dominated by a different profile that makes all players better off or the same (and at least one strictly better)
Model based Reinforcement learning
agent can explore a simulation of the world if it has a model (rules, state transitions, etc.). Otherwise it is model-free reinforcement learning
Confusion Matrix
aka a contingency table; it has two dimensions comparing actual to predicted outcomes
Machine learning
allows an agent to change its behavior based on the data it receives, by recognizing patterns and extrapolating to new situations
Counting-based error estimation
assumes true classification (ground-truth) is known for the sample test set
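A minimal Python sketch of counting-based error estimation and a confusion matrix on a made-up test set with known ground truth:

from collections import Counter

def error_rate(actual, predicted):
    # fraction of test samples whose predicted class differs from the true class
    return sum(a != p for a, p in zip(actual, predicted)) / len(actual)

def confusion_matrix(actual, predicted):
    # counts of (actual, predicted) pairs: the contingency table
    return Counter(zip(actual, predicted))

actual    = ['A', 'A', 'B', 'B', 'B', 'A']
predicted = ['A', 'B', 'B', 'B', 'A', 'A']
print(error_rate(actual, predicted))        # 2 of 6 misclassified -> ~0.33
print(confusion_matrix(actual, predicted))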
Degrees of freedom
the number of independent values or parameters that are free to vary; constraints reduce the degrees of freedom
Minkowski Distance
d(a,b) = [sumi(|bi-ai|^r)]^(1/r) where r is an adjustable parameter. When r = 2 -> Euclidean, r = 1 -> Manhattan, r = infinity -> maximum metric
Maximum distance metric
d(a,b) = maxi(|bi-ai|), finds the distance between the most dissimilar pair of features
Euclidean distance
d(a,b) = sqrt(sumi(|bi-ai|^2))
Absolute Distance
d(a,b) = sumi(|bi-ai|), aka city block distance / Manhattan distance
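A minimal Python sketch of the Minkowski distance, checking that r = 1, 2, and infinity recover the absolute, Euclidean, and maximum metrics (the points are made up):

import math

def minkowski(a, b, r):
    if math.isinf(r):
        return max(abs(bi - ai) for ai, bi in zip(a, b))    # maximum metric
    return sum(abs(bi - ai) ** r for ai, bi in zip(a, b)) ** (1 / r)

a, b = [0, 0], [3, 4]
print(minkowski(a, b, 1))           # 7.0  absolute / Manhattan
print(minkowski(a, b, 2))           # 5.0  Euclidean
print(minkowski(a, b, math.inf))    # 4    maximum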
dynamic system
deterministic, continuous states. Changes / evolves over time
Decision tree: perfect attribute
divides the examples into sets that are all positive or all negative, so those branches become terminal (leaves)
Decision tree: useless attribute
divides the examples into sets with the same proportions of positive and negative values as before the split, so it provides no information
Tracking error (closed loop control)
e(t) = Sref(t) - S(t). The feedback controller uses the tracking error to determine the closed-loop control u(t) in real time
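A minimal Python sketch of closed-loop tracking; the proportional feedback law u = K*e(t) and the simple plant dynamics are assumptions for illustration, not from the notes:

def simulate_closed_loop(s_ref, K=2.0, h=0.01, tf=5.0):
    # plant ds/dt = u, feedback u = K * e(t), tracking error e(t) = Sref(t) - S(t)
    s, t = 0.0, 0.0
    while t < tf:
        e = s_ref(t) - s      # tracking error measured in real time
        u = K * e             # closed-loop control computed from the error
        s = s + h * u         # Euler step of the plant
        t = t + h
    return s

print(simulate_closed_loop(lambda t: 1.0))   # state driven toward the reference 1.0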
Battle of the Sexes
harder coordination game. No dominant strategy and no clear solution unless a Schelling (focal) point exists
When is h (the hypothesis function) consistent?
if it agrees with the true function on all the training data. How h handles unseen input matters more, though. Exact consistency may not be feasible, so consider a best fit / curve fit instead
Non coop games
implies binding agreements between players are not possible
An agent is learning if
it performs better in the future after making observations about the world
Proportionate selection
use the fitness ratio to randomly select individuals to reproduce, with weights proportional to their fitness
Termination Criteria (List of 3)
usually: stop after a fixed number of generations; stop when the best individual reaches a specified fitness level; stop when the best individual succeeds in solving the problem within a specified tolerance
Beta (alpha beta pruning)
value of the best choice found so far along the path for MIN; if a node's value V is worse than beta, MIN will avoid it, so that subtree is pruned