ITD 140 Quizzes 1,2,3
Quiz 1 - AI and ML Concepts, Supervised and Unsupervised Learning, Basic Algorithms True or False? Almost all machine learning algorithms are highly optimized and do not need large amounts of RAM.
False
Quiz 1 - AI and ML Concepts, Supervised and Unsupervised Learning, Basic Algorithms True or False? Artificial intelligence is a subset of machine learning.
False
Quiz 1 - AI and ML Concepts, Supervised and Unsupervised Learning, Basic Algorithms True or False? Data analysts normally use 50% of a data set for training and 50% for testing.
False
Quiz 1 - AI and ML Concepts, Supervised and Unsupervised Learning, Basic Algorithms True or False? Data clustering and data association use a supervised machine-learning process.
False
Quiz 2 - Feature Selection and Engineering True or False? Data analysts normally use 50% of a data set for training and 50% for testing.
False
Quiz 2 - Feature Selection and Engineering True or False? Eliminating features with dimensionality reduction always improves the accuracy of the model.
False
Quiz 2 - Feature Selection and Engineering True or False? Overfitting occurs when we use an overly simple model to fit our dataset.
False
Quiz 2 - Feature Selection and Engineering True or False? Reducing the number of features increases the complexity of the model.
False
Quiz 3 - Supervised and Unsupervised Learning True or False? A machine learning training algorithm modifies hyperparameters to iteratively improve training accuracy.
False
Quiz 3 - Supervised and Unsupervised Learning True or False? The K-means algorithm normally executes faster than K-means++.
False
Quiz 2 - Feature Selection and Engineering What approach to feature selection involves removing less relevant features based on a correlation matrix? 1. Filter 2. Wrapper 3. Embedded
Filter
Quiz 2 - Feature Selection and Engineering **question with image**
Given the model (blue line), this is the relative training accuracy and testing accuracy. Training accuracy: High. Testing accuracy: Low.
Quiz 3 - Supervised and Unsupervised Learning _____ clustering restricts each point to residing in only one cluster. A. Soft B. Hard C. Unified D. Segmented
Hard
Quiz 2 - Feature Selection and Engineering **question with image**
If you had to discard three (3) input features, which should you choose based on the correlation matrix above? 1. Pregnancies 2. Glucose 3. BloodPressure 4. SkinThickness 5. Insulin 6. BMI 7. DiabetesPedigreeFunction 8. Age
Quiz 1 - AI and ML Concepts, Supervised and Unsupervised Learning, Basic Algorithms **question with image**
In the following plot, what is the term for the feature that the arrows are indicating? A. decision boundary B. splitline C. dividing boundary D. phase-shift
Quiz 3 - Supervised and Unsupervised Learning What are the fundamental components of a basic neural network? A. Neurons, synapses, and neurotransmitters. B. Input layer, hidden layers, and output layer. C. Datasets, data preprocessing techniques, and feature extraction methods. D. Activation functions, linear regression models, and decision trees.
Input layer, hidden layers, and output layer
Quiz 3 - Supervised and Unsupervised Learning The _____ classifier assigns data to the class the data most resembles. A. K-nearest neighbor B. Naïve Bayes C. NClass D. linear regression
K-nearest neighbor
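As an illustration of "assigns data to the class the data most resembles", here is a minimal K-nearest-neighbor sketch in plain Python. The function name `knn_predict`, the sample points, and the labels are all made up for this example; it is not a library API.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest training points."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(p, query), label) for p, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    # The most common label among the k nearest neighbors wins.
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical training data: two well-separated groups.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (6.0, 6.0), (6.5, 7.0), (7.0, 6.5)]
labels = ["A", "A", "A", "B", "B", "B"]

near_a = knn_predict(points, labels, (2.0, 2.0), k=3)
near_b = knn_predict(points, labels, (6.0, 7.0), k=3)
print(near_a, near_b)
```

An odd k is a common choice for two-class problems because it avoids tied votes.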
Quiz 1 - AI and ML Concepts, Supervised and Unsupervised Learning, Basic Algorithms ________ is defined as the use of data pattern-recognition algorithms, which allow a program to solve problems, such as clustering, categorization, predictive analysis, and data association without the need for explicit step-by-step programming instructions to tell the algorithm how to perform tasks. A. Classification B. Data mining C. Pattern recognition D. Machine learning
Machine learning
Quiz 1 - AI and ML Concepts, Supervised and Unsupervised Learning, Basic Algorithms Briefly describe the difference between supervised and unsupervised machine learning. (just a couple of sentences)
Supervised machine learning involves training a model on a labeled dataset, where the algorithm learns to map input data to corresponding output labels. Unsupervised machine learning deals with unlabeled data, aiming to identify patterns, relationships, or structures within the data without explicit target labels. Unsupervised methods include clustering and dimensionality reduction. Supervised methods focus on classification and regression tasks.
Quiz 2 - Feature Selection and Engineering **question with image**
The numbers with the correct description: True positive = 15, True negative = 80, False positive = 2, False negative = 3
Quiz 1 - AI and ML Concepts, Supervised and Unsupervised Learning, Basic Algorithms **question with image**
The plots below indicate the decision boundary generated using KNN. How is the value of 'k' changing, if at all, in the following plots, from left to right? A. no change B. increasing C. alternating between negative and positive D. decreasing
Quiz 2 - Feature Selection and Engineering **question with image**
The relative bias and variance for the models in the image: Model 1: HIGH bias, LOW variance. Model 2: MEDIUM bias, MEDIUM variance. Model 3: LOW bias, HIGH variance.
Quiz 1 - AI and ML Concepts, Supervised and Unsupervised Learning, Basic Algorithms True or False? Correlation is a measure that describes the relationship between two variables.
True
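One common way to quantify such a relationship is the Pearson correlation coefficient, which ranges from -1 to 1. The sketch below computes it in plain Python; the function name and the study-hours data are invented for illustration.

```python
import math

def pearson_correlation(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (proportional to the covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (proportional to the variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data: hours studied vs. quiz score.
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 71, 80, 88]
r = pearson_correlation(hours, scores)
print(round(r, 3))
```

A value near 1 or -1 indicates a strong linear relationship; a value near 0 indicates little or none. This sketch does not handle a constant input list, which would cause a division by zero.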
Quiz 1 - AI and ML Concepts, Supervised and Unsupervised Learning, Basic Algorithms True or False? To determine a model's accuracy, you divide the number of predictions that are correct by the total number of items in the set.
True
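As a small worked example of that definition, the sketch below computes accuracy from confusion-matrix counts. The counts are the ones listed on the confusion-matrix card in this set (TP 15, TN 80, FP 2, FN 3); the function name is just for illustration.

```python
def accuracy(tp, tn, fp, fn):
    """Accuracy = correct predictions / all predictions."""
    correct = tp + tn            # true positives and true negatives are the correct ones
    total = tp + tn + fp + fn    # every item in the set
    return correct / total

acc = accuracy(tp=15, tn=80, fp=2, fn=3)
print(f"{acc:.0%}")
```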
Quiz 2 - Feature Selection and Engineering True or False? Reducing the number of features increases the model's training and prediction performance (i.e., speed).
True
Quiz 2 - Feature Selection and Engineering True or False? Underfitting occurs when we use a very simple model to fit our dataset.
True
Quiz 3 - Supervised and Unsupervised Learning True or False? A decision tree is a graph-based data structure that specifies a collection of decision points. By following a path through the decision points, a decision-tree classifier assigns data to specific classes.
True
Quiz 3 - Supervised and Unsupervised Learning True or False? Classification algorithms work by examining an input "training set" of data to learn how the data values combine to create a result.
True
Quiz 3 - Supervised and Unsupervised Learning True or False? Classification is more appropriate than regression when there are two or more finite values in the target variable.
True
Quiz 3 - Supervised and Unsupervised Learning True or False? K-means clustering selects K clusters based on optimizing the positions of centroids.
True
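The centroid optimization can be sketched as two alternating steps: assign each point to its nearest centroid, then move each centroid to the mean of its points. The version below is a simplified illustration in plain Python. It takes fixed starting centroids as an assumption for reproducibility, whereas real implementations usually pick them randomly or with K-means++.

```python
import math

def k_means(points, centroids, iterations=10):
    """Simplified K-means: alternate assignment and centroid-update steps."""
    centroids = list(centroids)
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        for i, cluster in enumerate(clusters):
            if cluster:  # keep the old centroid if a cluster ends up empty
                centroids[i] = tuple(sum(c) / len(cluster) for c in zip(*cluster))
    return centroids, clusters

# Hypothetical data with two obvious groups; K = 2 starting centroids.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
centroids, clusters = k_means(points, centroids=[(0, 0), (10, 10)])
print(centroids)
```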
Quiz 3 - Supervised and Unsupervised Learning True or False? The K-means clustering algorithm is a centroid clustering algorithm.
True
Quiz 3 - Supervised and Unsupervised Learning True or False? The logistic regression classifier is best suited for binary classification.
True
Quiz 3 - Supervised and Unsupervised Learning True or False? When you "overfit" the model, the model may start to treat noise or errant data as valid training data, similar to memorizing answers to particular questions rather than understanding the underlying concept.
True
Quiz 3 - Supervised and Unsupervised Learning True or False? Within a machine-learning program, the code will specify, for example, that 70% of the data will be training data and 30% will be used for testing.
True
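A 70/30 split might look like the following sketch. The helper `train_test_split` here is a hand-written stand-in, not the scikit-learn function of the same name, and the seed and data are arbitrary.

```python
import random

def train_test_split(rows, train_fraction=0.7, seed=42):
    """Shuffle the rows, then cut them into a training part and a testing part."""
    shuffled = list(rows)                      # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)      # seeded shuffle for reproducibility
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

data = list(range(100))  # hypothetical dataset of 100 rows
train, test = train_test_split(data, train_fraction=0.7)
print(len(train), len(test))
```

Shuffling before cutting matters when the rows are stored in some order (for example, sorted by class), since an unshuffled cut could leave a class out of the training set.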
Quiz 2 - Feature Selection and Engineering **question with image**
What is the accuracy of this model based on the confusion matrix above? A. 95% B. 80% C. 15% D. 98% E. 97%
Quiz 2 - Feature Selection and Engineering **question with image**
Which model is overfitting the data? A. Model 1 B. Model 2 C. Model 3 D. None of them
Quiz 2 - Feature Selection and Engineering **question with image**
Which model is underfitting the data? A. Model 1 B. Model 2 C. Model 3 D. None of them
Quiz 2 - Feature Selection and Engineering What approach to feature selection is used by the SelectKBest algorithm (note that it uses some form of machine learning)? A. Filter B. Wrapper C. Embedded
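Wrapper (uses machine learning). Note: scikit-learn's SelectKBest ranks features with a univariate scoring function such as chi2 or f_classif, which is usually classified as a filter method.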
Quiz 2 - Feature Selection and Engineering Given the following sample feature data, what kind of encoding should you apply before using it to train a machine learning model? Sex: ['M', 'F', 'F', 'M', 'F', 'M', 'M', 'F'] A. binary B. one hot C. none
binary one hot
Quiz 1 - AI and ML Concepts, Supervised and Unsupervised Learning, Basic Algorithms In k-means, the value k represents the number of __________, and it ___________ have to be specified. A. centroids, does B. classes, does not C. centroids, does not D. classes, does
centroids, does
Quiz 2 - Feature Selection and Engineering What is the effect of using the MinMaxScaler on a feature? A. changes the scale of the feature data to [0,1] or [-1,1], increasing the effect of any outliers B. reshapes distribution to mean=0, stddev=1 C. changes the scale of the feature data to [-1,1], decreasing the effect of any outliers
changes the scale of the feature data to [0,1] or [-1,1], increasing the effect of any outliers
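To see why outliers matter with min-max scaling, here is a plain-Python sketch of the same rescaling formula, (v - min) / (max - min). It is an illustration, not the scikit-learn MinMaxScaler itself, and the sample values are made up.

```python
def min_max_scale(values, new_min=0.0, new_max=1.0):
    """Linearly rescale values so the smallest maps to new_min and the largest to new_max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are identical; cannot scale")
    span = new_max - new_min
    return [new_min + (v - lo) / (hi - lo) * span for v in values]

# Hypothetical feature with one outlier (100).
with_outlier = min_max_scale([1, 2, 3, 4, 100])
print([round(v, 3) for v in with_outlier])
```

Because the outlier defines the top of the range, the four ordinary values are squeezed into a narrow band near 0, which is how a single extreme value can dominate the scaled feature.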
Quiz 1 - AI and ML Concepts, Supervised and Unsupervised Learning, Basic Algorithms To determine if an email is spam or not, data programmers would use a ________________ algorithm. A. matching B. predicting C. clustering D. classification
classification
Quiz 2 - Feature Selection and Engineering What type of supervised learning approach works best for qualitative, discrete target/output data?
classification
Quiz 1 - AI and ML Concepts, Supervised and Unsupervised Learning, Basic Algorithms Many machine learning programs display a(n) ____________, which shows the overall performance of the model with counts of correct and incorrect predictions. A. accuracy matrix B. confusion matrix C. error rate table D. prediction matrix
confusion matrix
Quiz 1 - AI and ML Concepts, Supervised and Unsupervised Learning, Basic Algorithms What term is used to indicate how well a trained model will predict labels on input feature data that it has never seen (that is, not used in training)? A. generalizability B. accuracy C. goodness D. validation
generalizability
Quiz 2 - Feature Selection and Engineering Given the following sample feature data for "distance", what kind of encoding should you apply before using it to train a machine learning model? Distance: [12.3, 53.8, 4.2, 10.5, 9.1, 20.9, 7.7] A. none B. one hot C. binary
none
Quiz 1 - AI and ML Concepts, Supervised and Unsupervised Learning, Basic Algorithms What does the 'k' in KNN refer to? A. 'killer' in "killer neural network" B. number of points to examine C. number of features D. number of epochs
number of points to examine
Quiz 2 - Feature Selection and Engineering Given the following sample feature data, what kind of encoding should you apply before using it to train a machine learning model? Species: ['dog', 'cat', 'rabbit', 'hamster'] A. one hot B. binary C. none
one hot
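A minimal sketch of one-hot encoding for the Species values above, in plain Python. The helper name `one_hot_encode` is invented for this example, and ordering the columns alphabetically is an arbitrary choice.

```python
def one_hot_encode(values):
    """Return (categories, rows) where each row has a single 1 in its category's column."""
    categories = sorted(set(values))                 # one column per distinct category
    index = {c: i for i, c in enumerate(categories)}
    encoded = []
    for v in values:
        row = [0] * len(categories)
        row[index[v]] = 1                            # mark the column for this value
        encoded.append(row)
    return categories, encoded

species = ['dog', 'cat', 'rabbit', 'hamster']
categories, encoded = one_hot_encode(species)
for name, row in zip(species, encoded):
    print(name, row)
```

Each category gets its own column, so the encoding does not imply any order among the species, which a single integer code (dog=0, cat=1, ...) would.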
Quiz 2 - Feature Selection and Engineering To assign numeric values in place of categorical data, machine-learning programs use: A. column substitution B. one hot encoding C. category swapping D. None of these is correct
one hot encoding
Quiz 3 - Supervised and Unsupervised Learning Neural networks are at the heart of machine learning and are used for many problems, including classification. Behind the scenes, neural networks use a collection of activation functions in the 'nodes', called _______, that simulate activities performed by the brain and nervous system. A. logics B. perceptrons C. neurologits D. AI methods
perceptrons
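A single perceptron can be sketched as a weighted sum of inputs followed by an activation function. The example below uses a step activation and hand-picked weights (an assumption for illustration, not learned values) so that the node behaves like a logical AND.

```python
def perceptron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through a step activation."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if weighted_sum > 0 else 0   # step activation: fire only above zero

# Hand-picked (not trained) parameters that make the node act as logical AND.
and_weights, and_bias = [1.0, 1.0], -1.5

outputs = [perceptron([a, b], and_weights, and_bias) for a in (0, 1) for b in (0, 1)]
print(outputs)
```

In a trained network the weights and bias are adjusted by the training algorithm rather than chosen by hand, and smoother activations such as sigmoid or ReLU are typically used instead of a hard step.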
Quiz 2 - Feature Selection and Engineering To perform data-set dimensionality reduction, analysts will often use a technique called __________. A. principal component analysis B. dimensionality elimination C. data alignment D. data association
principal component analysis
Quiz 2 - Feature Selection and Engineering Principal component analysis __________________. A. projects feature dimensions into lower-dimensional space B. determines the N most important features C. extracts the two most important features
projects feature dimensions into lower-dimensional space
Quiz 2 - Feature Selection and Engineering What type of supervised learning approach works best for continuous, numeric target/output data?
regression
Quiz 2 - Feature Selection and Engineering What type of machine learning does Principal Component Analysis (PCA) use? A. unsupervised B. supervised C. reinforcement
unsupervised
Quiz 1 - AI and ML Concepts, Supervised and Unsupervised Learning, Basic Algorithms Ideally, what amount of bias and variance would ensure the model generalizes well? Include a brief explanation of the relationship between bias and variance.
Achieving a balance between bias and variance is crucial for a model to generalize well. This concept is known as the bias-variance trade-off. A model with high bias tends to oversimplify the underlying patterns in the data, leading to systematic errors or underfitting. A model with high variance is too sensitive to fluctuations in the training data, leading to overfitting. Bias and variance are more or less inversely related: as one goes up, the other goes down.
Quiz 1 - AI and ML Concepts, Supervised and Unsupervised Learning, Basic Algorithms _______ is the general science of making intelligent machines that can perceive visual items, recognize voices, make decisions, and more. A. Data association B. Machine learning C. Data mining D. Artificial intelligence
Artificial intelligence
Quiz 3 - Supervised and Unsupervised Learning Which of these are clustering algorithms? (select all that apply) A. Logistic regression B. Connectivity algorithms, such as hierarchical clustering C. Centroid algorithms, such as K-means clustering D. Hybrid algorithms, such as K-nearest-neighbors
B. Connectivity algorithms, such as hierarchical clustering C. Centroid algorithms, such as K-means clustering
Quiz 2 - Feature Selection and Engineering **question with image**
Based on the correlation matrix above, which input feature has the most impact on the target ("Outcome")? A. Glucose B. Pregnancies C. BloodPressure D. SkinThickness E. Insulin F. BMI G. DiabetesPedigreeFunction H. Age
Quiz 3 - Supervised and Unsupervised Learning Which of the following describes the center of a cluster? A. Locus B. Centroid C. Dendrogram D. Central point
Centroid
Quiz 1 - AI and ML Concepts, Supervised and Unsupervised Learning, Basic Algorithms ______ is the process of assigning data to matching groups (categories), such as a tumor being benign or malignant, email being valid or spam, or a transaction being legitimate or fraudulent. A. Mapping B. Clustering C. Classification D. Association
Classification
Quiz 3 - Supervised and Unsupervised Learning ______ is the use of a supervised machine-learning algorithm to assign an observation into a specific category. A. Predicting B. Classification C. Clustering D. Data mining
Classification
Quiz 1 - AI and ML Concepts, Supervised and Unsupervised Learning, Basic Algorithms ______ is the process of grouping related items without providing or even having 'correct' answers. A. Gathering B. Clustering C. Grouping D. Factoring
Clustering
Quiz 1 - AI and ML Concepts, Supervised and Unsupervised Learning, Basic Algorithms _______ is the assignment of data items to a related group, and it is a type of unsupervised learning. A. Classification B. Prediction C. Association D. Clustering
Clustering
Quiz 1 - AI and ML Concepts, Supervised and Unsupervised Learning, Basic Algorithms In terms of KNN, explain how bias and variance change as k increases from a small value to a high value.
As the value of k increases in K-Nearest Neighbors (KNN), bias tends to increase while variance decreases. When k is small, the model becomes more flexible and closely fits the training data, resulting in lower bias but higher variance. As k increases, the model becomes more rigid, leading to higher bias but lower variance. Finding an optimal value for k involves balancing this bias-variance trade-off to achieve better generalization performance on unseen data.
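The two extremes can be seen on a toy dataset. In the sketch below (plain Python, invented data and function names), k=1 reproduces every training label, including a noisy one, while k equal to the dataset size always predicts the overall majority class.

```python
from collections import Counter

def knn_predict_1d(xs, ys, query, k):
    """Majority vote among the k training values closest to `query`."""
    nearest = sorted(zip(xs, ys), key=lambda pair: abs(pair[0] - query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def training_accuracy(xs, ys, k):
    """Fraction of training points whose own label is predicted correctly."""
    hits = sum(knn_predict_1d(xs, ys, x, k) == y for x, y in zip(xs, ys))
    return hits / len(xs)

# Hypothetical 1-D data: the "B" at x=4 is treated as label noise among the "A" points.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = ["A", "A", "A", "B", "A", "B", "B", "B", "B"]

acc_k1 = training_accuracy(xs, ys, k=1)    # each point's nearest neighbor is itself
acc_k9 = training_accuracy(xs, ys, k=9)    # every vote uses all nine points
smoothed = knn_predict_1d(xs, ys, 4, k=3)  # neighbors at 3 and 5 outvote the noisy label
print(acc_k1, round(acc_k9, 3), smoothed)
```

Perfect training accuracy at k=1 comes from memorizing the noise (low bias, high variance); at k=9 the model ignores the input entirely (high bias, low variance). An intermediate k such as 3 smooths over the noisy point.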
Quiz 2 - Feature Selection and Engineering Compare and contrast ordinal and nominal features. (i.e., state their commonalities as well as their differences in a few words)
Ordinal and nominal features are both types of categorical variables, but they differ in the nature of their values. Commonalities: 1. Both are categorical variables that represent qualitative characteristics. 2. Both are often used in classification and regression tasks. Differences: 1. Nominal features have categories with no inherent order or ranking, while ordinal features have categories with a meaningful order or hierarchy. 2. Statistical measures like the mode apply to both, but ordinal features additionally allow comparisons of relative order. In summary: both are categorical; nominal features have no inherent order or ranking; ordinal features have a meaningful order or ranking.
Quiz 3 - Supervised and Unsupervised Learning Which of the following describes a value that falls outside of the expected range of values? A. Edge value B. Border value C. Extrinsic value D. Outlier value
Outlier value
Quiz 1 - AI and ML Concepts, Supervised and Unsupervised Learning, Basic Algorithms What term describes a trained model that has effectively 'memorized' the mappings between input features and labels/outputs such that it does not do well with samples it has not seen? A. Mirroring B. Overfitting C. Memory committing D. Memorizing
Overfitting
Quiz 3 - Supervised and Unsupervised Learning Classification algorithms vary most prominently by which of these? (select all that apply) A. Performance B. Complexity C. Data format D. Programming language E. Memory use
Performance, Complexity, Memory use
Quiz 3 - Supervised and Unsupervised Learning You are working on a machine learning project where you have a large dataset and a complex model with several hyperparameters. Due to resource constraints, you have a limited amount of time to optimize these hyperparameters. Which hyperparameter tuning method would be more appropriate in this scenario? A. Random Search, because it randomly selects a subset of hyperparameters, which can be more efficient in high-dimensional spaces. B. Neither, as hyperparameter tuning is not necessary for complex models. C. Both Grid Search and Random Search are equally appropriate in this scenario. D. Grid Search, because it exhaustively searches through all possible combinations of hyperparameters.
Random Search, because it randomly selects a subset of hyperparameters, which can be more efficient in high-dimensional spaces.
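The cost difference can be sketched with a toy search space. In the example below, `score` is a hypothetical stand-in for "train the model and measure validation accuracy", and the hyperparameter names and values are invented. Grid search evaluates every combination, while random search evaluates only a fixed budget of sampled combinations.

```python
import itertools
import random

def score(params):
    """Hypothetical stand-in for training a model and returning validation accuracy."""
    return 1.0 - abs(params["learning_rate"] - 0.1) - abs(params["depth"] - 6) / 100

# Hypothetical search space: 4 x 5 = 20 combinations.
grid = {
    "learning_rate": [0.001, 0.01, 0.1, 0.5],
    "depth": [2, 4, 6, 8, 10],
}

def grid_search(grid):
    """Evaluate every combination in the grid."""
    keys = list(grid)
    candidates = [dict(zip(keys, combo)) for combo in itertools.product(*grid.values())]
    return max(candidates, key=score), len(candidates)

def random_search(grid, n_trials, seed=0):
    """Evaluate only n_trials randomly sampled combinations."""
    rng = random.Random(seed)
    candidates = [{k: rng.choice(v) for k, v in grid.items()} for _ in range(n_trials)]
    return max(candidates, key=score), len(candidates)

best_grid, grid_evals = grid_search(grid)
best_random, random_evals = random_search(grid, n_trials=8)
print(grid_evals, random_evals)
```

Random search is not guaranteed to find the best combination, but its cost is set by the trial budget rather than by the size of the grid, which grows multiplicatively with each added hyperparameter.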
Quiz 1 - AI and ML Concepts, Supervised and Unsupervised Learning, Basic Algorithms Which of the following are common machine-learning approaches? (select all that apply) 1. Dependent 2. Supervised 3. Categorized 4. Unsupervised
Supervised, Unsupervised