EXAM 2
Suppose we have the following data and we want to use the k Nearest Neighbor algorithm with k = 3 to predict whether a point is classified as a circle, square or triangle. Use the standard Euclidean distance. Triangles: T1: (4,0), T2: (4.5,0). Circles: C1: (2,0), C2: (3.1,0). Squares: S1: (0,0), S2: (1.5,0), S3: (-1,0). How do we classify the point (3.5,0)?
Triangle
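The vote in the question above can be checked with a short script. This is a minimal sketch, not a reference implementation: the function name knn_classify is made up for illustration, and ties are not handled specially.

```python
from collections import Counter
from math import dist

# Training data copied from the question: (point, label)
training = [
    ((4.0, 0.0), "triangle"),
    ((4.5, 0.0), "triangle"),
    ((2.0, 0.0), "circle"),
    ((3.1, 0.0), "circle"),
    ((0.0, 0.0), "square"),
    ((1.5, 0.0), "square"),
    ((-1.0, 0.0), "square"),
]

def knn_classify(query, data, k):
    """Return the majority label among the k training points nearest to query."""
    nearest = sorted(data, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# The three nearest points are C2 (0.4 away), T1 (0.5) and T2 (1.0):
# two triangles outvote one circle.
print(knn_classify((3.5, 0.0), training, k=3))  # triangle
```

With k=1 the same function returns "circle", because C2 alone is the closest point.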
Which of the following is NOT true about K-Means clustering?
all clusters must have the same number of points
An example of a type of Machine Learning algorithm is pattern recognition.
true
In Agglomerative Hierarchical clustering using either Single or Complete Linkage, at the first step each point is in its own cluster.
true
In supervised learning one first trains an algorithm with a training set.
true
Suppose we are using the K-Means algorithm with K=2 to cluster the following points: (-2,0), (-0.50,0), (0.25,0), (1.1,0), (1.9,0), (2.5,0). The initial guesses for the centroids are: Red centroid (-1,0), Blue centroid (1,0). How many points are in the Red cluster at the end of the first iteration?
2
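One K-Means iteration on these points can be sketched as follows. Since every y-coordinate is 0, the sketch works with x-coordinates only; the function name kmeans_step is an illustrative choice, not something from the course notes.

```python
points = [-2.0, -0.5, 0.25, 1.1, 1.9, 2.5]   # x-coordinates; every y is 0
centroids = {"red": -1.0, "blue": 1.0}       # initial guesses from the question

def kmeans_step(points, centroids):
    """Assign each point to its nearest centroid, then average each cluster."""
    clusters = {name: [] for name in centroids}
    for p in points:
        closest = min(centroids, key=lambda name: abs(p - centroids[name]))
        clusters[closest].append(p)
    # A cluster that received no points keeps its old centroid.
    new_centroids = {
        name: (sum(members) / len(members) if members else centroids[name])
        for name, members in clusters.items()
    }
    return clusters, new_centroids

clusters, new_centroids = kmeans_step(points, centroids)
print(clusters["red"])        # [-2.0, -0.5]
print(new_centroids["red"])   # -1.25
```

The point 0.25 goes to Blue because it is 0.75 from the Blue centroid but 1.25 from the Red one, which leaves two points in the Red cluster.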
Which of the following is NOT an example of Machine Learning?
Digital Signature Algorithms
In Agglomerative Hierarchical clustering using either Single or Complete Linkage (with the same distance measure) at the first step each point is in its own cluster. However, at the next step when we merge two clusters the results may be different.
false
The prediction from Linear Regression always gives a discrete (e.g., yes or no) result.
false
Machine learning has only been used to advance the sciences, not the arts.
false
Suppose we are using the K-Means algorithm with K=2 to cluster the following points: (-2,0), (-0.50,0), (0.25,0), (1.1,0), (1.9,0), (2.5,0). The initial guesses for the centroids are: Red centroid (-1,0), Blue centroid (1,0). What is the new Red centroid?
(-1.25, 0)
Suppose at an iteration of K-Means we have determined that the following points are closest to the initial Red centroid. (4,1), ( 0,-2), (2,3) What is the new Red centroid?
(2, 2/3)
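The new centroid is the coordinate-wise mean of the points assigned to it. A small sketch using exact fractions, so that 2/3 is not rounded (the helper name centroid is illustrative, and integer coordinates are assumed):

```python
from fractions import Fraction

def centroid(points):
    """Return the coordinate-wise mean of 2D points with integer coordinates."""
    n = len(points)
    mean_x = Fraction(sum(x for x, _ in points), n)
    mean_y = Fraction(sum(y for _, y in points), n)
    return mean_x, mean_y

red_points = [(4, 1), (0, -2), (2, 3)]
cx, cy = centroid(red_points)
print(cx, cy)  # 2 2/3
```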
Assume that a quantity y depends linearly on a variable x and that we know that when x changes from 10 to 20 then y changes from 2 to 1. What is the value of y when x = 30?
0
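The arithmetic behind this answer: the slope is (1 - 2) / (20 - 10) = -0.1, so each further step of 10 in x lowers y by 1. A minimal sketch, with line_through as an illustrative helper name:

```python
def line_through(p1, p2):
    """Return (slope, intercept) of the line through two points."""
    (x1, y1), (x2, y2) = p1, p2
    slope = (y2 - y1) / (x2 - x1)
    intercept = y1 - slope * x1
    return slope, intercept

slope, intercept = line_through((10, 2), (20, 1))
y_at_30 = slope * 30 + intercept
# Rounded to hide floating-point noise in the last digits.
print(round(slope, 6), round(intercept, 6), round(y_at_30, 6))
```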
If we use linear regression to find the line which best fits the data (x, y): (2, 4.1), (3, 5.9), (4, 7.9), then we get the line y = 1.9 x + 0.267. Use linear regression to predict the value of the data when x=1.
2.167
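The stated line can be reproduced with the closed-form least-squares formulas for one feature. This is a sketch in plain Python; fit_line is an illustrative name, not a library function.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line([2, 3, 4], [4.1, 5.9, 7.9])
prediction = slope * 1 + intercept
# Slope is about 1.9, intercept about 0.267, prediction about 2.167.
print(round(slope, 3), round(intercept, 3), round(prediction, 3))
```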
Assume linear regression was used on a set of data and we find that the equation which fits this data is y = 4x - 2 What is the predicted y-value when x=10?
38
Suppose you have the following clusters which have been formed using Single Linkage Hierarchical Clustering. What clusters do you merge next? C1: (0,0) (1.5,0), (-2.5,0) C2: (3,0),(4.5,0) C3: (-3,0),(-3.5,0)
C1 & C3
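Under Single Linkage, the distance between two clusters is the smallest distance between any point of one and any point of the other. A minimal sketch of the comparison (the helper names are illustrative):

```python
from itertools import combinations
from math import dist

clusters = {
    "C1": [(0, 0), (1.5, 0), (-2.5, 0)],
    "C2": [(3, 0), (4.5, 0)],
    "C3": [(-3, 0), (-3.5, 0)],
}

def single_linkage(a, b):
    """Smallest pairwise distance between clusters a and b."""
    return min(dist(p, q) for p in a for q in b)

def closest_pair(clusters, linkage):
    """Return the names of the two clusters with the smallest linkage distance."""
    return min(
        combinations(clusters, 2),
        key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
    )

# C1-C2: 1.5, C1-C3: 0.5, C2-C3: 6.0
print(closest_pair(clusters, single_linkage))  # ('C1', 'C3')
```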
Which of the following algorithms is NOT an example of supervised learning?
Hierarchical clustering
Suppose we use a training set with 100 items to build a Decision Tree to play tennis and we are trying to decide which attribute (Outlook, Humidity, Wind) is better to test at the root node. The outcome is YES (Y) or NO (N). Summarizing the results as we did in the notes, we have: Outlook: Sunny (Y-25, N-10), Overcast (Y-35, N-5), Rain (Y-5, N-20). Humidity: High (Y-0, N-25), Normal (Y-35, N-15), Low (Y-15, N-10). Wind: Strong (Y-5, N-20), Weak (Y-15, N-15), None (Y-20, N-25). Which attribute is best?
Humidity
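One common way to compare attributes is the weighted entropy of the branches after the split, where lower is better. The criterion used in the course notes is not shown here, so treat entropy as an assumption; other criteria may rank the attributes differently. The sketch below uses the counts from the question and illustrative function names.

```python
from math import log2

# (YES count, NO count) per branch, copied from the question
counts = {
    "Outlook": {"Sunny": (25, 10), "Overcast": (35, 5), "Rain": (5, 20)},
    "Humidity": {"High": (0, 25), "Normal": (35, 15), "Low": (15, 10)},
    "Wind": {"Strong": (5, 20), "Weak": (15, 15), "None": (20, 25)},
}

def entropy(yes, no):
    """Binary entropy in bits; an empty class contributes nothing."""
    total = yes + no
    result = 0.0
    for count in (yes, no):
        if count:
            p = count / total
            result -= p * log2(p)
    return result

def weighted_entropy(branches):
    """Average branch entropy, weighted by the number of items in each branch."""
    total = sum(y + n for y, n in branches.values())
    return sum((y + n) / total * entropy(y, n) for y, n in branches.values())

scores = {attr: weighted_entropy(b) for attr, b in counts.items()}
best = min(scores, key=scores.get)
for attr, score in scores.items():
    print(attr, round(score, 3))
print(best)  # Humidity
```

Under this assumed criterion, Humidity scores about 0.683, Outlook about 0.700 and Wind about 0.927. The High branch is pure (0 YES, 25 NO), which is what pulls Humidity ahead.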
Suppose we use linear regression to fit a set of experimental data which does not contain x=1 and get the line y = 3x - 1.4. The line predicts that at x=1, y = 1.6. However, if we take an experimental measurement at x=1 we get y=2.1. If we now use linear regression to fit the original data plus the new point (1, 2.1), which lies to the right of the data used to get y = 3x - 1.4, how should the slope of the new line change?
Increase. The line fitting the original data predicts a lower value for y at x=1 than the measured one, and the new point lies to the right of the original data, so the slope of the new line should increase.
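The effect can be illustrated numerically. The original data set is not given in the question, so the four points below are hypothetical: they are chosen to lie exactly on y = 3x - 1.4 and to the left of x = 1. The helper fit_line is an illustrative least-squares routine.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical original data on y = 3x - 1.4 (not taken from the question)
xs = [-3, -2, -1, 0]
ys = [3 * x - 1.4 for x in xs]
old_slope, _ = fit_line(xs, ys)

# Add the measured point (1, 2.1), which sits above the old line's value of 1.6
new_slope, _ = fit_line(xs + [1], ys + [2.1])
print(round(old_slope, 3), round(new_slope, 3))
```

For this hypothetical data the slope rises from 3 to about 3.1: a point above the line on the right-hand side tilts the fitted line upward.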
Suppose we have the following data and we want to use the Nearest Neighbor algorithm using the Euclidean distance measure to predict whether a point is classified as a circle, square or triangle. Triangles: T1: (4,1), T2: (4.5,3). Circles: C1: (2,-1), C2: (3.1,0). Squares: S1: (-1,0), S2: (-1.5,2), S3: (-1,1). How do we classify the point (0,2)?
square
Suppose we have 10 collinear (they lie on a line) data points and we want to use linear regression to find the line which fits all the data. Which of the following is true?
the line passes through all 10 points
Suppose we are using a Perceptron (an Artificial Neural Network with only one neuron) to predict whether a point should lie above or below a given line. If a point in the training set lies above the line and the Perceptron algorithm predicts that it lies below the line then
the weights should be modified
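How the weights are modified depends on the learning rule. A minimal sketch of the classic perceptron update is shown below; the starting weights, learning rate and training point are hypothetical values chosen for illustration, with labels +1 for "above" and -1 for "below".

```python
def predict(weights, bias, point):
    """Return +1 (above) or -1 (below) from the sign of the weighted sum."""
    activation = sum(w * x for w, x in zip(weights, point)) + bias
    return 1 if activation >= 0 else -1

def update(weights, bias, point, target, learning_rate=0.1):
    """Classic perceptron rule: change the weights only on a wrong prediction."""
    if predict(weights, bias, point) != target:
        weights = [w + learning_rate * target * x for w, x in zip(weights, point)]
        bias = bias + learning_rate * target
    return weights, bias

# Hypothetical state: the point is labeled "above" (+1) but predicted "below"
weights, bias = [0.5, -1.0], 0.0
point, target = (1.0, 2.0), 1
print(predict(weights, bias, point))       # -1, a misclassification

new_weights, new_bias = update(weights, bias, point, target)
print(new_weights != weights)              # True: the weights were modified
```

A correctly classified point leaves the weights and bias unchanged under this rule.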
A linear regression algorithm will make different predictions if different training sets are used.
true
Algorithms used for navigation systems such as those by Google are examples of supervised learning.
true
Any linear function has a constant rate of change.
true
Suppose we want to use Linear Regression to predict the listing price of houses. Our training set contains 100 houses which have 1700 - 4100 sq. ft. and our algorithm only uses the feature of sq. ft. of living space to determine the line which best fits the data. The line which is found by linear regression minimizes the sum of the squares of the vertical distances from the line to each data point.
true
The purpose of supervised learning algorithms is to predict the outcome of data not in the training set.
true
Assume we determined the line which best fits a set of 100 data points (x,y) using linear regression. Then to predict the outcome for a value of x not in our data set, we simply evaluate the line for this x value.
true
Suppose we have two data points (3,5) and (-1,1). If we use Linear Regression to find the line which fits these two data points, then what is the line?
y = x + 2
Suppose you have the following clusters which have been formed using Complete Linkage Hierarchical Clustering. What clusters do you merge next? C1: (0,0) (1.5,0), (-1.5,0) C2: (3,0),(3.1,0) C3: (-3,0),(-3.5,0)
C1 & C2
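Under Complete Linkage, the distance between two clusters is the largest distance between any point of one and any point of the other, and the pair with the smallest such distance is merged. A minimal sketch (helper names are illustrative):

```python
from itertools import combinations
from math import dist

clusters = {
    "C1": [(0, 0), (1.5, 0), (-1.5, 0)],
    "C2": [(3, 0), (3.1, 0)],
    "C3": [(-3, 0), (-3.5, 0)],
}

def complete_linkage(a, b):
    """Largest pairwise distance between clusters a and b."""
    return max(dist(p, q) for p in a for q in b)

def closest_pair(clusters, linkage):
    """Return the names of the two clusters with the smallest linkage distance."""
    return min(
        combinations(clusters, 2),
        key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
    )

# C1-C2: about 4.6, C1-C3: 5.0, C2-C3: about 6.6
print(closest_pair(clusters, complete_linkage))  # ('C1', 'C2')
```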
Suppose we have the following data and we want to use the Nearest Neighbor algorithm using the Euclidean distance measure to predict whether a point is classified as a circle, square or triangle. Triangles: T1: (4,0), T2: (4.5,0). Circles: C1: (2,0), C2: (3.1,0). Squares: S1: (0,0), S2: (1.5,0), S3: (-1,0). How do we classify the point (3.5,0)?
circle
Suppose we want to use Linear Regression to predict the listing price of houses. Our training set contains 100 houses which have 1700 - 4100 sq.ft. and our algorithm only uses the feature of sq. ft. of living space to determine the line which best fits the data. If we use the line to predict the listing price for a house of 5200 sq. ft. then this is called
extrapolation
Algorithms used for self-driving cars are examples of unsupervised machine learning
false
In an Artificial Neural Network each neuron can have multiple outputs.
false
In an Artificial Neural Network the output of each neuron is an integer between 0 and 10.
false
If we have 5 data points to use for linear regression, then the goal is to find a line which passes through these 5 points.
false, because in general no single line passes through 5 arbitrary points; linear regression finds the line that best fits them.
Suppose we are using linear regression to fit 10 data points, then the line
is not required to pass through any of the 10 points
If we have 2 data points in our training set for a linear regression algorithm, then the result is a line which
passes through the two points
Machine Learning is a type of Artificial Intelligence.
true
Once a line is found using linear regression which represents the data in the training set, it is used to predict the outcome at some other point.
true
The goal of Linear Regression is to predict a discrete outcome such as YES or NO.
false