AI Past Quizzes


A decision tree, tested on a set of test data, returns 50 True Positives, 100 True Negatives, 20 False Positives, and 30 False Negatives. What is the accuracy of this classifier?

(50+100) / (50 + 100 + 20 + 30) = 150 / 200 = 0.75 = 75%
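
As a quick check, the same computation in Python (a minimal sketch; the function name is just illustrative):

    def accuracy(tp, tn, fp, fn):
        # fraction of all predictions that are correct
        return (tp + tn) / (tp + tn + fp + fn)

    print(accuracy(50, 100, 20, 30))  # 0.75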

Write an example of a CNF formula that has three clauses and each clause has two literals.

(A v B) ^ (B v C) ^ (C v D)

We choose the attribute 'A1' to partition the database of Question #1 above into two parts. The first branch where A1=True has 30 records, of which 15 belong to the "+" class and 15 belong to the "-" class. The second branch where A1=False has a total of 70 records, of which 60 belong to the "+" class and 10 belong to the "-" class. What is the entropy of the first branch? What is the entropy of the second branch? What is the weighted entropy of both the children datasets of the original dataset?

Entropy of the first branch (A1=True): -(15/30)*log2(15/30) - (15/30)*log2(15/30) = 1.0
Entropy of the second branch (A1=False): -(60/70)*log2(60/70) - (10/70)*log2(10/70) ≈ 0.592
Weighted entropy of the children: (30/100)*1.0 + (70/100)*0.592 ≈ 0.714
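
A minimal Python sketch of the same arithmetic (the helper name is illustrative; the same function also reproduces the A2 and target-attribute entropies in the questions below):

    from math import log2

    def entropy(counts):
        # Shannon entropy (base 2) of a list of class counts
        total = sum(counts)
        return -sum((c / total) * log2(c / total) for c in counts if c > 0)

    e_true = entropy([15, 15])                          # 1.0
    e_false = entropy([60, 10])                         # ~0.592
    weighted = (30 / 100) * e_true + (70 / 100) * e_false
    print(e_true, e_false, weighted)                    # weighted ~0.714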

We choose the attribute 'A2' to partition the database of Question #1 above into two parts. The first branch where A2=True has 35 records, of which 14 belong to the "+" class and 21 belong to the "-" class. The second branch where A2=False has a total of 65 records, of which 61 belong to the "+" class and 4 belong to the "-" class. What is the entropy of the first (A2=True) branch? What is the entropy of the second (A2 = False) branch? What is the weighted entropy of both the children datasets of the original dataset?

Entropy of the A2=True branch: -(14/35)*log2(14/35) - (21/35)*log2(21/35) ≈ 0.971
Entropy of the A2=False branch: -(61/65)*log2(61/65) - (4/65)*log2(4/65) ≈ 0.334
Weighted entropy of the children: (35/100)*0.971 + (65/100)*0.334 ≈ 0.557

A database has five attributes and a Target (class) attribute. The target attribute column has 75 instances of the "+" class and 25 instances of the "-" class. What is the entropy of the target attribute for this database? State the base value of the log function you use in your computations.

Using logarithms to base 2: -(3/4)*log2(3/4) - (1/4)*log2(1/4) ≈ 0.811

What is the Modus Ponens inference rule - give an example to illustrate your answer.

From A and A -> B, infer B. Example: "If it is raining today, I will wear my raincoat. It is raining today. Therefore, I will wear my raincoat."

Which of the two attributes, A1 or A2, should be chosen for inclusion in the decision tree? Give reason for your answer.

Parent entropy = -(3/4)*log2(3/4) - (1/4)*log2(1/4) ≈ 0.811
Information gain of A1 = 0.811 - 0.714 ≈ 0.097
Information gain of A2 = 0.811 - 0.557 ≈ 0.254
Since A2 has the higher information gain, it should be chosen for inclusion in the decision tree.
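
A self-contained Python sketch of the information-gain comparison, using the class counts from the questions above (the helper name is illustrative):

    from math import log2

    def entropy(counts):
        total = sum(counts)
        return -sum((c / total) * log2(c / total) for c in counts if c > 0)

    parent = entropy([75, 25])                                            # ~0.811
    gain_a1 = parent - (0.30 * entropy([15, 15]) + 0.70 * entropy([60, 10]))
    gain_a2 = parent - (0.35 * entropy([14, 21]) + 0.65 * entropy([61, 4]))
    print(gain_a1, gain_a2)   # ~0.097 vs ~0.254, so A2 is preferred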

What is meant when we say that a heuristic function h1(n) dominates a heuristic function h2(n)?

For every node n, h1(n) is greater than or equal to h2(n), with both heuristics assumed admissible.

In the context of an MDP, what is meant by "discounted rewards"?

Future rewards are multiplied by a discount factor gamma (0 <= gamma < 1) raised to the number of steps away they are, so a reward received t steps in the future is worth gamma^t times its face value. This makes rewards received sooner worth more than the same rewards received later (encouraging the agent to act earlier rather than wait), and it keeps the total return finite over an infinite horizon.
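
A tiny Python sketch of how a discounted return is computed (the reward sequence and the value of gamma are made up for illustration):

    gamma = 0.9                      # discount factor, 0 <= gamma < 1
    rewards = [1, 0, 0, 10]          # hypothetical rewards at steps t = 0, 1, 2, 3
    discounted_return = sum(gamma ** t * r for t, r in enumerate(rewards))
    print(discounted_return)         # 1 + 0 + 0 + 0.729 * 10 = 8.29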

Consider the context of working with a propositional Knowledge Base to derive answers to your queries. What is the difference between Inference Rules and Equivalence Rules? Give one example of each type of rule.

Equivalence rules allow individual wffs to be rewritten in either direction, e.g. commutativity: R v S is equivalent to S v R. Inference rules allow new wffs to be derived from existing ones (in one direction only), e.g. Modus Ponens: from R and R -> S, infer S.

Briefly describe a problem situation in which you would prefer to use a hill climbing search instead of an A* search. In the context of your example problem, what is the main advantage of using hill climbing search instead of A* search?

Hill climbing is a greedy method, so it does not "waste" time considering open nodes that look less promising. Consider a problem where many child nodes have decent h(n) values: A* may repeatedly jump back to nodes further up the tree whose values are only slightly better, while hill climbing simply follows the best child on its current path. In the example above, hill climbing could have avoided expanding the M node altogether by deciding that D (a node on the same path with a lower h(n) value) was the more promising node and expanding it instead.

Very briefly outline the main steps of the Iterative Deepening Depth-First Search algorithm?

IDDFS combines the strengths of BFS and DFS. It performs a depth-limited DFS of the graph, then incrementally increases the depth limit (0, 1, 2, ...), restarting the search each time, until the goal is found. Like BFS it finds the goal at the shallowest possible depth, while using only the memory of a DFS.
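
A minimal Python sketch of iterative deepening, assuming the graph is given as an adjacency dict and nodes are labels (names and representation are illustrative):

    def depth_limited_dfs(graph, node, goal, limit):
        # return a path to the goal if one exists within 'limit' edges, else None
        if node == goal:
            return [node]
        if limit == 0:
            return None
        for child in graph.get(node, []):
            path = depth_limited_dfs(graph, child, goal, limit - 1)
            if path is not None:
                return [node] + path
        return None

    def iddfs(graph, start, goal, max_depth=50):
        # repeat depth-limited DFS with limits 0, 1, 2, ... until the goal is found
        for limit in range(max_depth + 1):
            path = depth_limited_dfs(graph, start, goal, limit)
            if path is not None:
                return path
        return None

    g = {'S': ['A', 'B'], 'A': ['G'], 'B': []}
    print(iddfs(g, 'S', 'G'))    # ['S', 'A', 'G']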

When is a search algorithm said to be Complete?

If at least one solution exists, the algorithm is guaranteed to find a solution in a finite amount of time.

What is meant by saying that a data set is linearly separable?

It is possible to draw a straight line (or, in more than two dimensions, a hyperplane) that separates the two types of data points from each other. This makes it easier to draw conclusions and predict future data.

What are the two conditions under which the Alpha-Beta game tree search algorithm decides to discontinue the search below some branches of a game's search tree?

(1) The node in question is a MAX node and its alpha value is greater than or equal to the beta value of some MIN node ancestor. The MAX node's value can only grow, so the MIN ancestor would never choose it, and searching further below it is pointless. (2) The node in question is a MIN node and its beta value is less than or equal to the alpha value of some MAX node ancestor. The MIN node's value can only shrink, so the MAX ancestor would never choose it, and searching further below it is pointless.
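
A compact Python sketch of alpha-beta pruning over an explicit game tree, where leaves are numeric evaluations and internal nodes are lists of children (the representation is just for illustration):

    def alphabeta(node, alpha, beta, maximizing):
        if not isinstance(node, list):        # leaf: a numeric evaluation
            return node
        if maximizing:
            value = float('-inf')
            for child in node:
                value = max(value, alphabeta(child, alpha, beta, False))
                alpha = max(alpha, value)
                if alpha >= beta:             # a MIN ancestor will never allow this branch
                    break
            return value
        else:
            value = float('inf')
            for child in node:
                value = min(value, alphabeta(child, alpha, beta, True))
                beta = min(beta, value)
                if beta <= alpha:             # a MAX ancestor will never allow this branch
                    break
            return value

    tree = [[3, 5], [2, [9, 1]], [0, 7]]
    print(alphabeta(tree, float('-inf'), float('inf'), True))   # 3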

During the learning of a decision tree, when would you stop growing the decision tree? Justify your answer by stating the intuition behind your rule for stopping the growth of decision tree.

If you keep growing the tree until every leaf has the lowest possible impurity, the tree will overfit the training data; if you stop too early, performance will be poor. A reasonable rule is to set an impurity threshold above the minimum and stop splitting a node once its impurity falls below that threshold, or once further splits yield negligible information gain.

Give a small example to illustrate how the Resolution rule of inference works.

If you know "alpha or beta" (a v b) and "not beta or gamma" (!b v y), then you can conclude "alpha or gamma" (a v y), because the complementary literals b and !b cancel. In a resolution refutation proof, the goal is to keep resolving until you derive the empty clause (a contradiction).
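
A tiny Python sketch of a single resolution step, with clauses represented as sets of literal strings and '!' marking negation (the representation is made up for illustration):

    def resolve(c1, c2, lit):
        # resolve two clauses on literal 'lit' (positive in c1, negated in c2)
        neg = '!' + lit
        if lit in c1 and neg in c2:
            return (c1 - {lit}) | (c2 - {neg})
        return None

    print(resolve({'a', 'b'}, {'!b', 'y'}, 'b'))   # {'a', 'y'}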

What do you understand from the term "Gradient Descent" in the context of neural networks?

It is an optimization algorithm that iteratively tweaks the network's parameters (weights) in the direction opposite to the gradient of the loss function, in order to drive the loss toward a minimum.
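
A minimal gradient-descent sketch in Python on a one-variable function (the function, learning rate, and step count are made up for illustration):

    def grad(w):
        # derivative of f(w) = (w - 3)^2, which has its minimum at w = 3
        return 2 * (w - 3)

    w = 0.0                 # initial parameter
    lr = 0.1                # learning rate (step size)
    for _ in range(100):
        w -= lr * grad(w)   # step opposite to the gradient
    print(w)                # close to 3.0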

What is meant when we say that a learned decision tree overfits the training data?

The tree has picked up noise or other random patterns in the training data and learned them as if they were real concepts, which negatively impacts the model's performance on new, unseen data.

Why do we add a bias variable, or an augmenting variable with a constant value of '1' when training a perceptron or a neuron?

The constant '1' input gives the neuron a learnable bias (threshold) weight: it shifts the decision boundary away from the origin, so the weighted sum of the other inputs does not have to cross zero exactly for the neuron to activate. Without it, the perceptron could only learn decision boundaries that pass through the origin.
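
A small perceptron-training sketch in Python in which each input is augmented with a constant 1, so the corresponding weight acts as a learnable bias/threshold (the AND data set is just an illustrative example):

    # training data for logical AND; each input is augmented with a constant 1
    data = [((0, 0, 1), 0), ((0, 1, 1), 0), ((1, 0, 1), 0), ((1, 1, 1), 1)]
    w = [0.0, 0.0, 0.0]      # the last weight multiplies the constant 1, i.e. the bias
    lr = 0.1
    for _ in range(20):
        for x, target in data:
            out = 1 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else 0
            for i in range(3):
                w[i] += lr * (target - out) * x[i]
    print(w)   # the learned bias weight shifts the decision boundary off the origin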

In your view what are the two most important advantages of the Depth-First Search over the Breadth-First search of a state space?

DFS can reach deep states quickly, which is useful when solutions lie deep in the state space. It also usually requires far less memory than BFS, since it stores only the current path and its siblings rather than an entire frontier.

What is the main difference between a decision tree model and a perceptron model of a data set?

A perceptron model is trained iteratively to fit a weighted (linear) combination of the features to the data set, while a decision tree is built quickly by recursively splitting on individual attributes. Perceptron models are also much harder for a human to interpret than decision trees.

What is the difference between a relation (predicate) and a function as they are used in FOPC formulas?

A relation (predicate) maps a tuple of terms to a truth value (true or false), whereas a function maps each tuple of arguments to exactly one object (term) in the domain, i.e. a many-to-one mapping for each function symbol.

You know that a search algorithm you are using is not Complete. The algorithm runs for a long while and doesn't return a path to the goal state. What can be said about the existence of a path from the start state to the goal state?

Nothing can be concluded about the existence of a path. Because the algorithm is not complete, it may fail to return a solution (or run forever) even when a path from the start state to the goal state does exist, so the path may or may not exist.

What does it mean for a rule of inference to be Sound?

This means the conclusions inferred from any set of wffs using the rule are logical consequences of the set of wffs.

How can we make game tree search more efficient using the knowledge of the relative advantages of various moves to the two players?

Assign values (an evaluation function) representing the reward or advantage each player would gain from each possible move, then design the search so that each player's reward is maximized on their own turn. Additionally, minimizing the opponent's reward may sometimes be more beneficial than maximizing your own, and the search should take this into account.

How would you process information at the chance nodes of a game that has a dice-throw involved by each player?

Use a depth-limited search with an evaluation function that is proportional to the expected payoff. At each chance node, weight the value of every possible dice outcome by its probability and take the sum (the expected value), rather than taking a plain max or min over the children.
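
A tiny Python sketch of the computation at a chance node: its value is the probability-weighted average of its children's values (the probabilities and payoffs below are made up for illustration):

    # a fair six-sided die: each outcome has probability 1/6
    outcomes = [(1 / 6, v) for v in (3, 3, 5, 5, 8, 12)]
    chance_value = sum(p * v for p, v in outcomes)
    print(chance_value)   # expected payoff at the chance node = 6.0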

What is meant by a "ground" instance of a FOPC formula?

A formula that contains no variables: every variable has been replaced by a constant (or some other variable-free term).

In the context of a Markov Decision Process, what is an Optimal Policy?

A policy that maximizes the expected value of all states at the same time; following it from any state yields the highest expected utility achievable from that state.

What idea is conveyed by the Ockham's razor in the context of machine learning?

Prefer the simplest hypothesis (model) that is consistent with the training data: among models that fit the data equally well, choose the one that is simplest to interpret, create, and maintain, since it is less likely to overfit and more likely to generalize.

What is the difference between the forward-chaining and backward-chaining methods of proving formulas in logic?

Forward chaining: start with the known facts and work "forward", applying inference rules to derive new facts until the goal is reached. Backward chaining: start from the goal and work "backward", using inference rules to find the facts (sub-goals) that would establish it.
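
A minimal forward-chaining sketch in Python over Horn-clause-style rules (the facts and rules are made up for illustration):

    facts = {'rain'}
    rules = [({'rain'}, 'wet_ground'),            # rain -> wet_ground
             ({'wet_ground'}, 'slippery')]        # wet_ground -> slippery

    changed = True
    while changed:                                # apply rules until nothing new is derived
        changed = False
        for premises, conclusion in rules:
            if premises <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    print(facts)   # {'rain', 'wet_ground', 'slippery'}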

What is the unit preference strategy for performing a search for a resolution proof for a knowledge base and a goal formula?

Prioritize resolution steps that use a clause containing only one literal (a unit clause). Resolving with a unit clause produces a shorter resolvent, and is therefore a step closer to the empty (zero-length) clause.

Consider the Arc-Consistency procedure for CSPs. We have a variable A with initial domain [1 ... 15] and a variable B with initial domain [4 ... 20]. Variables A and B are bound by the constraint: (A > B+5). What will be the new domain of 'A' when this constraint is enforced on the variables?

To enforce (A > B+5) on A, remove every value of A for which no value of B satisfies A > B+5. The smallest value of B is 4, and 4 + 5 = 9, so any A <= 9 has no supporting value of B: remove 1, 2, 3, 4, 5, 6, 7, 8, 9 from A's domain. Every remaining value is greater than B+5 for at least one value of B. New domain: A = [10 ... 15].
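
The same domain pruning sketched in Python, using the domains and constraint from the question:

    domain_a = list(range(1, 16))    # [1 .. 15]
    domain_b = list(range(4, 21))    # [4 .. 20]

    # keep a value of A only if some value of B satisfies A > B + 5
    new_domain_a = [a for a in domain_a if any(a > b + 5 for b in domain_b)]
    print(new_domain_a)              # [10, 11, 12, 13, 14, 15]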

We want to unify the following two formulas. What is the set of substitutions for the most general unifier. And what is the unified formula? Formula1: P(x, y, Blue) v Q(f(w)) Formula2: P(f(u), y, z) v Q(v)

Substitutions (most general unifier): {x/f(u), z/Blue, v/f(w)}
Unified formula: P(f(u), y, Blue) v Q(f(w))

