Final - Intro to AI - review

Which of the following is NOT true about boxplots? A) The mode is easy to locate B) Boxplots alone can distinguish between unimodal and bimodal data C) Boxplots can tell you whether you have qualitative or quantitative data D) Boxplots can tell you about the 50th percentile E) Boxplots cannot tell you about the range of the data F) The center line in a boxplot signifies the mean

F) The center line in a boxplot signifies the mean

If the FPR and TPR were 0 and 1, respectively, for a set of thresholds, what can you say about such a classifier? o It is a bad classifier as it always predicts the wrong result o It is an excellent classifier as it always predicts the correct result o The accuracy of such a classifier is below 90% o The accuracy of such a classifier is below 50%

It is an excellent classifier as it always predicts the correct result
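
The two ROC corner cases can be checked numerically. The sketch below is only an illustration: the helper name `tpr_fpr` and the toy labels are assumptions, not part of the course material. It computes TPR and FPR for a classifier that is always right and for one that is always wrong.

```python
def tpr_fpr(y_true, y_pred):
    """Return (TPR, FPR) for binary labels, where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp / (tp + fn), fp / (fp + tn)

# Hypothetical ground-truth labels, chosen only for illustration.
y_true = [1, 1, 0, 0, 1, 0]
perfect = list(y_true)               # every prediction is correct
inverted = [1 - t for t in y_true]   # every prediction is wrong

print("perfect  (TPR, FPR):", tpr_fpr(y_true, perfect))
print("inverted (TPR, FPR):", tpr_fpr(y_true, inverted))
```

The perfect classifier sits at TPR = 1, FPR = 0, and the inverted one at TPR = 0, FPR = 1.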

A covariance level of 50.61 was found between two variables x and y. What does this mean? o It means there is a strong positive relationship between the two variables. o It means that the two variables move in the same direction as each other (as one rises so does the other). o Covariance does not give any useful information. The correlation must be calculated before interpreting the result. o None of the above

Covariance does not give any useful information. The correlation must be calculated before interpreting the result.
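
One way to see why a raw covariance value is hard to interpret: its magnitude depends on the units of the variables, while correlation does not. The sketch below uses made-up data and hand-written helpers (`covariance`, `correlation` are illustrative names) and assumes the sample (n - 1) convention. The sign of the covariance still indicates direction; only the magnitude is ambiguous.

```python
from math import sqrt

def covariance(x, y):
    """Sample covariance (divides by n - 1)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)

def correlation(x, y):
    """Pearson correlation: covariance normalized by both standard deviations."""
    return covariance(x, y) / sqrt(covariance(x, x) * covariance(y, y))

# Hypothetical data: y is measured once in one unit, then in a unit 100x smaller.
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
y_rescaled = [100 * v for v in y]

print(covariance(x, y), covariance(x, y_rescaled))    # changes with the units
print(correlation(x, y), correlation(x, y_rescaled))  # stays the same
```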

If the FPR and TPR were 1 and 0, respectively, for a set of thresholds, what can you say about such a classifier? o It is a bad classifier as it always predicts the wrong result o It is an excellent classifier as it always predicts the correct result o The accuracy of such a classifier is below 90% o The accuracy of such a classifier is below 50%

It is a bad classifier as it always predicts the wrong result

Which of the following statements about Outliers is not true? o Outliers are values very different from the rest of the data o Outliers should always be deleted o Outliers have an effect on the mean o b & c o None of the above

Outliers should always be deleted

Which of the following is a performance measure for regression? o Residual Sum of Squares o Precision o Accuracy o Recall

Residual Sum of Squares

Predicting whether a tumour is malignant or benign is an example of? o Supervised Regression Problem o Categorical Attribute o Unsupervised Learning o Supervised Classification Problem

Supervised Classification Problem

Price prediction in the domain of real estate is an example of? o Supervised Classification Problem o Supervised Regression Problem o Unsupervised Learning o Categorical Attribute

Supervised Regression Problem

How do you find the upper and bottom whiskers? A) Upper = Q3+IQR*1.5 Bottom = Q1-IQR*1.5 B) Upper = IQR*1.5 - Q3 Bottom = IQR*1.5 - Q1 C) Upper = IQR/1.5 + Q3 Bottom = IQR/1.5 - Q1 D) Q3 - Q1

A) Upper = Q3+IQR*1.5 Bottom = Q1-IQR*1.5
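
The formula in the answer can be sketched as a small function. The name `whisker_fences` and the sample quartiles are assumptions for illustration. Note that under the common Tukey convention these values are limits: the drawn whiskers stop at the most extreme data points that fall inside them.

```python
def whisker_fences(q1, q3, k=1.5):
    """Return (lower, upper) whisker limits from the quartiles, using k * IQR."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Hypothetical quartiles: Q1 = 10, Q3 = 20, so IQR = 10.
lower, upper = whisker_fences(10.0, 20.0)
print(lower, upper)
```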

Given: yᵢ = [5,12,15,20] and ŷᵢ = [4.8,10.6,14.3,19.1] The sum of squared errors (SSE) is: o 0.225 o 3.3 o 0.4 o 0.474 o None of the above

3.3
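
The arithmetic behind this answer is 0.2² + 1.4² + 0.7² + 0.9² = 0.04 + 1.96 + 0.49 + 0.81 = 3.3. A minimal sketch that reproduces it (the function name `sse` is an illustrative choice):

```python
def sse(y, y_hat):
    """Sum of squared errors between actual values y and predictions y_hat."""
    return sum((actual - predicted) ** 2 for actual, predicted in zip(y, y_hat))

y = [5, 12, 15, 20]
y_hat = [4.8, 10.6, 14.3, 19.1]
print(round(sse(y, y_hat), 3))
```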

Out of 300 emails, a classification model correctly predicted 150 spam emails and 60 ham emails. What is the accuracy of the model? o 10% o 70% o 80% o 30%

70%

Consider the following values for the confusion matrix: • True negatives (TN) = 300 • True positives (TP) = 500 • False negatives (FN) = 150 • False positives (FP) = 50 The recall is: o 76.92% o 80% o 90.9% o 85.71%

76.92%
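
Recall is TP / (TP + FN), here 500 / 650. A one-function sketch (the name `recall` is illustrative):

```python
def recall(tp, fn):
    """Fraction of actual positives that the model found: TP / (TP + FN)."""
    return tp / (tp + fn)

print(f"{recall(tp=500, fn=150):.2%}")
```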

Consider the following values for the confusion matrix: • True negatives (TN) = 300 • True positives (TP) = 500 • False negatives (FN) = 150 • False positives (FP) = 50 The accuracy is: o 10% o 80% o 90% o 50%

80%
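
Accuracy counts every correct prediction, positive or negative, over all predictions: (TP + TN) / (TP + TN + FP + FN) = 800 / 1000. A sketch with an illustrative function name:

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that were correct."""
    return (tp + tn) / (tp + tn + fp + fn)

print(f"{accuracy(tp=500, tn=300, fp=50, fn=150):.0%}")
```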

Out of 200 emails, a classification model correctly predicted 120 spam emails and 40 ham emails. What is the accuracy of the model? o 90% o 80% o 60% o 10%

80%

Consider the following values for the confusion matrix: • True negatives (TN) = 25 • True positives (TP) = 60 • False negatives (FN) = 10 • False positives (FP) = 5 The accuracy is: o 25% o 90% o 60% o 85%

85%

Consider the following values for the confusion matrix: • True negatives (TN) = 30 • True positives (TP) = 50 • False negatives (FN) = 15 • False positives (FP) = 5 The precision is: o 90.9% o 80% o 85.7% o 76.9%

90.9%
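
Precision is TP / (TP + FP), here 50 / 55. Unlike recall, it ignores false negatives. A sketch with an illustrative function name:

```python
def precision(tp, fp):
    """Fraction of predicted positives that were actually positive: TP / (TP + FP)."""
    return tp / (tp + fp)

print(f"{precision(tp=50, fp=5):.1%}")
```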

To solve the problem of overplotting, which of the following techniques can we use? (You can select multiple options.) A) Data Sampling B) Changing the transparency by altering the alpha value C) Combine the overplotted plot with another plot D) Use x_jitter or y_jitter or both E) All of the above

A) Data Sampling B) Changing the transparency by altering the alpha value D) Use x_jitter or y_jitter or both

Which of the following is an example of reinforcement learning? A) Learning to ride a bicycle B) Grouping related documents from an unannotated corpus C) Grouping students into groups - primary, high school and college D) Both A and C E) None of the above

A) Learning to ride a bicycle

Which of the following metrics is used for a continuous-output problem? A) R-square B) Precision C) Accuracy D) Confusion Matrix

A) R-square

For analyzing correlation values, which graph is used? A) Scatter Plot B) Line plot C) Bar plot D) Histogram E) None of the Above

A) Scatter Plot

In reinforcement learning, the weight of exploration is supposed to be: A) The agent starts learning with a high weight for exploration, which then decays gradually B) Very small, about 0.1 C) Very large, about 0.9 D) The agent learns with weight of 0.5 for exploration and ends with 0.9 E) None of the above

A) The agent starts learning with a high weight for exploration, which then decays gradually
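
A decaying exploration weight is often implemented as an epsilon schedule. The sketch below assumes exponential decay with a floor; the function name and the values of `eps_start`, `eps_min` and `decay` are hypothetical choices, not values from the course.

```python
def epsilon_schedule(step, eps_start=1.0, eps_min=0.05, decay=0.99):
    """Exploration weight at a given step: starts high, decays, never drops below eps_min."""
    return max(eps_min, eps_start * decay ** step)

for step in (0, 10, 100, 1000):
    print(step, round(epsilon_schedule(step), 4))
```

Early on the agent explores almost always; later it mostly exploits what it has learned, while the floor keeps a small amount of exploration.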

Which of the following questions cannot be answered by a violin plot? A) What is the center of the data? B) How many modes does my data set have? C) What is the spread of the data? D) What is the shape of my data? Is it symmetrical, skewed, uniform, or multimodal? E) Where is the median? F) None of the above

A) What is the center of the data?

Is a histogram the same as a bar graph? A) Yes, because they both use intervals B) No, because a histogram visualizes quantitative data while a bar graph visualizes qualitative data C) No, because a histogram displays qualitative data while a bar graph displays quantitative data

B) No, because a histogram visualizes quantitative data while a bar graph visualizes qualitative data

Height and weight are well known to be positively correlated. Ignoring the plot scales (the variables have been standardized), which of the two scatter plots (plot1, plot2) is more likely to be a plot showing the values of height (Var1 - X axis) and weight (Var2 - Y axis) A) Plot 1 B) Plot 2 C) Both D) None E) We can not tell

B) Plot 2

Which of the following is a supervised learning problem? You may have more than one correct answer. A) Grouping people on a social network B) Predicting credit approval based on historical data C) Predicting rainfall based on historical data D) all of the above E) None of the above

B) Predicting credit approval based on historical data C) Predicting rainfall based on historical data

The difference between a histogram and a bar chart is: (i) A bar chart is used to represent continuous values and a histogram represents discrete values. (ii) There is no gap between bars in a histogram, whereas in a bar chart gaps exist. A) (i) True, (ii) False B) (i) False, (ii) True C) Both are True D) Both are False

B) (i) False, (ii) True

In the above figure, what does it mean when the FPR and TPR are two points on the red line? Choose the best answer. A) The classifier at a particular threshold is very accurate B) The probabilities of classifying a TP or an FP are equal. Hence the classifier is useless at that threshold C) The TPR is dependent on FPR D) It means that the AUC equals 1 E) We cannot tell

B) The probabilities of classifying a TP or an FP are equal. Hence the classifier is useless at that threshold

Which of the following is a classification problem? You may have more than one correct answer. A) Predicting the amount of rainfall for a particular day B) Predicting whether it will rain or not on a particular day C) Given all the actors in a movie, predicting its genre D) Filtering of spam messages E) None of the above

B) Predicting whether it will rain or not on a particular day C) Given all the actors in a movie, predicting its genre D) Filtering of spam messages

In case there are too many outliers in the dataset, the most representative typical value is: A) Mean B) Median C) Variance D) All of the above E) None of the above

B) Median
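
A quick way to see why the median is preferred: a single extreme value drags the mean far away but barely moves the median. The data below are made up for illustration.

```python
from statistics import mean, median

# Hypothetical measurements, then the same data with one extreme outlier added.
data = [10, 11, 12, 13, 14]
with_outlier = data + [1000]

print("mean:  ", mean(data), "->", round(mean(with_outlier), 2))
print("median:", median(data), "->", median(with_outlier))
```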

In a heatmap, darker cells mean that the cells contain more data points: A) True B) False C) It can be specified so that darker cells indicate either more data points or fewer D) None of the above

C) It can be specified so that darker cells indicate either more data points or fewer

Which of the following is correct in a positively skewed distribution: A) Mean < median < mode B) Mean > median < mode C) Mean > median > mode D) None of the above

C) Mean > median > mode
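
The ordering can be checked on a small right-skewed sample. The data below are invented for illustration; the long right tail (9, 12) pulls the mean above the median, while the mode stays at the peak on the left. This ordering is a typical pattern for skewed data, not a guarantee for every sample.

```python
from statistics import mean, median, mode

# Hypothetical right-skewed sample: most values are small, a few are large.
data = [1, 1, 1, 2, 2, 3, 4, 5, 9, 12]

print("mean:", mean(data), "median:", median(data), "mode:", mode(data))
```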

If you have a column with categorical variables, which will be the appropriate method to fill in the NaNs present in the column? A) Mean B) Median C) Mode D) None of the above

C) Mode
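
Mean and median are undefined for categories, so the most frequent value is the usual fill. The sketch below stays in the standard library and uses `None` to stand in for NaN; the helper name `fill_with_mode` and the sample column are assumptions for illustration.

```python
from statistics import mode

def fill_with_mode(values, missing=None):
    """Replace missing entries with the most frequent observed category."""
    observed = [v for v in values if v is not missing]
    most_common = mode(observed)
    return [most_common if v is missing else v for v in values]

# Hypothetical categorical column with two missing entries.
colors = ["red", "blue", None, "red", None, "green"]
filled = fill_with_mode(colors)
print(filled)
```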

Which of the following statements is/are not correct? 1- Covariance and correlation give us the mathematical tools to check whether two different attributes are related or not. 2- Correlation always implies causation. 3- Interpreting correlation is easy. 4- Interpreting covariance is hard. A) Only 4 B) Only 3 C) Only 2 D) Only 1 E) Both 3 and 4 F) Both 1 and 2

C) Only 2

Which is true about KNN? A) KNN can be used for solving both classification and regression problems B) KNN works well with a small number of input variables C) It is a lazy learning algorithm D) All of the above E) None of the above

D) All of the above

Which of the following statements is not correct? A) Mean > Median > Mode in a normal distribution with positive skew B) Covariance can show negative or positive relationship between two variables C) Interpreting covariance is hard D) If a distribution has zero skew then it is a symmetric normal distribution E) Covariance and correlation give us the mathematical tools to check whether two different attributes are related or not.

D) If a distribution has zero skew then it is a symmetric normal distribution

You want to plot the probability distribution curve of a discrete random variable. Which of the following is the best to use? A) The normal distribution function. B) The scatter plot function. C) The probability density function. D) The probability mass functions.

D) The probability mass functions.

Which one of the following survey questions would generate categorical data? A) How many times do you eat at your favorite fast-food place in a typical week? B) How much do you usually spend buying your favorite fast food? C) How many items did you buy last time you went to your favorite fast food place? D) Which is your favorite fast food? E) None of the above

D) Which is your favorite fast food?

Given: yᵢ = [5,10,15,20] and ŷᵢ = [4.8,10.6,14.3,20.1] The sum of squared errors (SSE) is: o 0.4 o 0.474 o 0.16 o 0.9

0.9

Given: yᵢ = [5,12,15,20] and ŷᵢ = [4.8,10.6,14.3,19.1] Find R-Square: o 0.5 o 0.7 o 0.9 o 0.97

0.97
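
R-square compares the residual sum of squares with the total sum of squares around the mean: R² = 1 - SS_res / SS_tot. Here the mean of y is 13, SS_tot = 64 + 1 + 4 + 49 = 118 and SS_res = 3.3, so R² ≈ 0.97. A minimal sketch (the function name `r_squared` is illustrative):

```python
def r_squared(y, y_hat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_mean = sum(y) / len(y)
    ss_res = sum((actual - predicted) ** 2 for actual, predicted in zip(y, y_hat))
    ss_tot = sum((actual - y_mean) ** 2 for actual in y)
    return 1 - ss_res / ss_tot

y = [5, 12, 15, 20]
y_hat = [4.8, 10.6, 14.3, 19.1]
print(round(r_squared(y, y_hat), 2))
```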

To explore the relation between two qualitative variables, it is better to use: A) Scatter plot B) Clustered Bar chart C) Violin plot D) Histogram

B) Clustered Bar chart

Which of the following statements is/are NOT correct about the reinforcement learning discount factor? A) Is used to compensate for uncertainties about future rewards B) Is specified in the interval [-1,1]. C) Determines how much the reinforcement learning agent cares about rewards in the distant future relative to those in the immediate future D) When discount factor is 0, the agent will only learn about actions that produce an immediate reward E) B and D

B) Is specified in the interval [-1,1].
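
The discount factor is conventionally taken from [0, 1], not [-1, 1]. The sketch below computes a discounted return for a hypothetical reward sequence to show the effect of statements C and D: with a factor of 0 only the immediate reward counts, and larger values give more weight to later rewards. The function name and the rewards are assumptions for illustration.

```python
def discounted_return(rewards, gamma):
    """Sum of rewards, each weighted by gamma raised to its time step."""
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("discount factor must lie in [0, 1]")
    return sum(reward * gamma ** t for t, reward in enumerate(rewards))

# Hypothetical rewards received at steps 0, 1 and 2.
rewards = [1.0, 1.0, 1.0]
for gamma in (0.0, 0.5, 1.0):
    print(gamma, discounted_return(rewards, gamma))
```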

In a regression model, the distance between the predicted value and the actual value is called: o True Positive Rate o F-Measure o R-Square o Residual

Residual

