IME 470/471 Predictive Enterprise Analytics


Compare and contrast regression-based and time-series prediction methods.

1. Computation cost 2. Complexity 3. Prediction performance

3 criteria of success for classification algorithms.

1. Speed - lower computational cost and less time to compute 2. Robustness - the algorithm's ability to maintain performance even with noisy data 3. Scalability & interpretability - the algorithm's ability to handle varying amounts of data while its results remain easy to understand

This clustering algorithm initially assumes that each data instance represents a single cluster A. Agglomerative clustering B. Divisive clustering C. K-Means clustering D. Density based clustering

A. Agglomerative clustering
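A minimal sketch of the bottom-up idea above, using invented 1-D data and assuming single-linkage distance (the card does not specify a linkage):

```python
# Agglomerative (bottom-up) clustering sketch on 1-D points.
# Each instance starts as its own cluster; the two closest clusters
# are repeatedly merged until k clusters remain.

def single_linkage(c1, c2):
    # distance between the two closest members of the two clusters
    return min(abs(a - b) for a in c1 for b in c2)

def agglomerative(points, k):
    clusters = [[p] for p in points]      # one cluster per instance
    while len(clusters) > k:
        # find the closest pair of clusters
        i, j = min(
            ((i, j) for i in range(len(clusters))
                    for j in range(i + 1, len(clusters))),
            key=lambda ij: single_linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] += clusters.pop(j)    # merge the pair
    return clusters

print(agglomerative([1.0, 1.1, 8.0, 8.2, 8.3], 2))
```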

Which statement is true about neural network and linear regression models? A. Both models require input attributes to be numeric B. Both models require numeric attributes to range between 0 and 1 C. The output of both models is a categorical attribute value D. Both techniques build models whose output is determined by a linear sum of weighted input attribute values E. More than one of a, b, c, or d is true

A. Both models require input attributes to be numeric

Neural network training is accomplished by repeatedly passing the training data through the network while A. individual network weights are modified B. Training instance attribute values are modified C. The ordering of the training instances is modified D. Individual network nodes have the coefficients on their corresponding functional parameters modified

A. individual network weights are modified
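A sketch of the idea, using a single linear neuron trained by gradient descent on made-up data (learning rate and target function are invented): the training data never changes, only the weights do.

```python
# Iterative weight-update sketch: each pass over the training data
# (one epoch) adjusts the weights; the data itself is untouched.

def train(data, epochs=500, lr=0.05):
    w, b = 0.0, 0.0                      # weights start arbitrary
    for _ in range(epochs):              # each pass = one epoch
        for x, target in data:           # training data is unchanged
            y = w * x + b                # forward pass
            err = y - target
            w -= lr * err * x            # only the weights change
            b -= lr * err
    return w, b

# invented target function: y = 2x + 1
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
w, b = train(data)
print(round(w, 2), round(b, 2))
```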

Which of the following statements about Naïve Bayes is incorrect? A. Attributes are equally important B. Attributes are statistically dependent on one another given the class value C. Attributes are statistically independent of one another given the class value D. Attributes can be nominal or numeric E. All of the above

B. Attributes are statistically dependent on one another given the class value
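The conditional-independence assumption can be sketched on a toy nominal dataset (counts and attribute values invented): P(x1, x2 | class) is factored as P(x1 | class) * P(x2 | class).

```python
# Naive Bayes sketch: per-attribute likelihoods are multiplied,
# which is exactly the independence-given-class assumption.

from collections import Counter, defaultdict

data = [  # (outlook, windy) -> play
    (("sunny", "no"), "yes"),
    (("sunny", "yes"), "no"),
    (("rain", "no"), "yes"),
    (("rain", "yes"), "no"),
    (("sunny", "no"), "yes"),
]

classes = Counter(label for _, label in data)
cond = defaultdict(Counter)          # counts per (attribute index, class)
for attrs, label in data:
    for i, v in enumerate(attrs):
        cond[(i, label)][v] += 1

def predict(attrs):
    scores = {}
    for c, n in classes.items():
        p = n / len(data)                 # prior P(class)
        for i, v in enumerate(attrs):
            p *= cond[(i, c)][v] / n      # independence assumption
        scores[c] = p
    return max(scores, key=scores.get)

print(predict(("rain", "no")))
```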

This supervised learning technique can process both numeric and categorical input attributes A. Linear regression B. Bayes classifier C. K-nearest neighbors D. Backpropagation learning

B. Bayes classifier

Classification problems are distinguished from estimation problems in that A. Classification problems require the output attribute to be numeric B. Classification problems require the output attribute to be categorical C. Classification problems do not allow an output attribute D. Classification problems are designed to predict future outcome

B. Classification problems require the output attribute to be categorical

Assume you want to perform supervised learning to predict the number of newborns according to the size of the storks' population; this is an example of A. Classification B. Estimation C. Clustering D. Artificial Neural Networks

B. Estimation

With a Kohonen network, the output layer node that wins an input instance is rewarded by having A. A higher probability of winning the next training instance to be presented B. Its connection weights modified to more closely match those of the input instance C. Its connection weights modified to more closely match those of its neighbors D. Neighboring connection weights modified to become less similar to its own connection weights *Kohonen network is another name for a self-organizing map (SOM); Kohonen is the name of the scientist who first introduced it

B. Its connection weights modified to more closely match those of the input instance
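A minimal sketch of the winner-update rule (learning rate and data invented; neighborhood updates are omitted for brevity):

```python
# Kohonen/SOM update sketch: the winning output node's weight vector
# is nudged toward the input instance.

def winner(nodes, x):
    # node whose weight vector is closest (squared Euclidean) to the input
    return min(range(len(nodes)),
               key=lambda i: sum((w - v) ** 2 for w, v in zip(nodes[i], x)))

def update(nodes, x, lr=0.5):
    i = winner(nodes, x)
    # move the winner's weights toward the input instance
    nodes[i] = [w + lr * (v - w) for w, v in zip(nodes[i], x)]
    return i

nodes = [[0.0, 0.0], [1.0, 1.0]]
update(nodes, [0.9, 0.8])
print(nodes)   # the second node moved toward the input
```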

A two-layered neural network used for unsupervised clustering A. Backpropagation network B. Kohonen network C. Perceptron network D. Agglomerative network

B. Kohonen network

High value of Gini index means that the partitions in classification are A. Pure B. Not pure C. Useful D. Useless E. None of the above

B. Not pure
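A quick worked check with invented labels: the Gini index is 0 for a pure partition and grows as the class mix becomes more even, so a high value means the partition is not pure.

```python
# Gini index sketch: 1 minus the sum of squared class proportions.

from collections import Counter

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

print(gini(["a", "a", "a", "a"]))  # 0.0 - pure
print(gini(["a", "a", "b", "b"]))  # 0.5 - not pure
```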

Epochs represent the total number of A. Input layer nodes B. Passes of the training data through the network C. Network nodes D. Passes of the test data through the network

B. Passes of the training data through the network

The K-Means algorithm terminates when A. User-defined minimum value of the summation of squared error differences between instances and their corresponding cluster center is seen B. The cluster centers for the current iteration are identical to the cluster centers for the previous iteration C. The number of instances in each cluster for the current iteration is identical to the number of instances in each cluster of the previous iteration D. The number of clusters formed for the current iteration is identical to the number of clusters formed in the previous iteration

B. The cluster centers for the current iteration are identical to the cluster centers for the previous iteration
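A sketch of that termination rule on invented 1-D data (initial centers chosen by hand): the loop exits exactly when the recomputed centers equal the previous iteration's centers.

```python
# K-means sketch: assign points to nearest center, recompute centers,
# stop when the centers no longer change between iterations.

def kmeans(points, centers):
    while True:
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # recompute centers as cluster means
        new = [sum(c) / len(c) if c else centers[i]
               for i, c in enumerate(clusters)]
        if new == centers:          # centers identical -> terminate
            return centers, clusters
        centers = new

centers, clusters = kmeans([1.0, 2.0, 9.0, 10.0], [0.0, 5.0])
print(centers)
```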

A nearest neighbor approach is best used A. with large-sized datasets B. when irrelevant attributes have been removed from the data C. when a generalized model of the data is desirable D. when an explanation of what has been found is of primary importance

B. when irrelevant attributes have been removed from the data

Suppose we would like to convert a nominal attribute X with 4 values to a data table with only binary variables. How many new attributes are needed? A. 1 B. 2 C. 4 D. 8 E. 16

C. 4
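A sketch of the one-hot encoding this answer assumes (attribute values invented): each of the 4 nominal values gets its own binary attribute. Note that dummy coding with a reference category would need only 3, but the card's answer implies one-hot.

```python
# One-hot encoding sketch: a 4-valued nominal attribute X becomes
# 4 binary attributes, one per value.

values = ["red", "green", "blue", "yellow"]   # invented domain of X

def one_hot(v):
    return [1 if v == u else 0 for u in values]

print(one_hot("blue"))   # [0, 0, 1, 0]
```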

With Bayes theorem the probability of hypothesis Ho specified by P(Ho) is referred to as A. An A priori probability B. A conditional probability C. A posterior probability D. A bidirectional probability

A. An A priori probability *P(Ho) carries no conditioning on evidence, so it is the prior; P(Ho|E) would be the posterior
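A worked Bayes-theorem example with invented numbers, to keep the terminology straight: P(H) is the a priori (prior) probability, and P(H|E) is the posterior after seeing evidence E.

```python
# Bayes theorem sketch: posterior = P(E|H) * P(H) / P(E).

p_h = 0.01             # prior P(H): a priori probability (invented)
p_e_given_h = 0.9      # likelihood P(E|H) (invented)
p_e_given_not_h = 0.05 # likelihood under the alternative (invented)

# total probability of the evidence
p_e = p_e_given_h * p_h + p_e_given_not_h * (1 - p_h)
posterior = p_e_given_h * p_h / p_e    # P(H|E): a posteriori probability
print(round(posterior, 3))
```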

The average positive difference between computed and desired outcome values A. root mean squared error B. Mean squared error C. Mean Absolute error D. Mean positive error

C. Mean Absolute error
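A quick sketch contrasting the three error metrics on invented numbers: mean absolute error averages the absolute differences, while MSE and RMSE square them first.

```python
# Error-metric sketch: MAE vs MSE vs RMSE.

import math

computed = [2.0, 4.0, 7.0]   # invented model outputs
desired  = [1.0, 5.0, 7.0]   # invented true values

errors = [c - d for c, d in zip(computed, desired)]
mae  = sum(abs(e) for e in errors) / len(errors)    # mean absolute error
mse  = sum(e * e for e in errors) / len(errors)     # mean squared error
rmse = math.sqrt(mse)                               # root mean squared error
print(mae, mse, round(rmse, 3))
```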

The value input into a feed-forward neural network A. May be categorical or numeric B. Must be either all categorical or all numeric but not both C. Must be numeric D. Must be categorical

C. Must be numeric

Which one of the following is not a major strength of the neural network approach? A. Neural networks work well with datasets containing noisy data B. Neural networks can be used for both supervised learning and unsupervised clustering C. Neural network learning algorithms are guaranteed to converge to an optimal solution D. None of the above

C. Neural network learning algorithms are guaranteed to converge to an optimal solution

A telecommunications company wants to segment its customers into distinct groups in order to send appropriate subscription offers; this is an example of A. Supervised learning B. Data reduction C. Unsupervised learning D. Data summarization

C. Unsupervised learning

Compare/contrast co-occurrence grouping and clustering.

Clustering groups a set of objects with those most similar to them, whereas co-occurrence grouping finds associations between items based on the transactions involving them. Clustering looks at the characteristics of objects; co-occurrence grouping looks at behavior involving the data objects.

Use customer segmentation to discuss the difference between cluster and profiling DM tasks.

Clustering groups a set of objects with those most similar to them, whereas profiling describes the typical characteristics of groups of data objects. Clustering groups alike customers based on attributes such as age, income, and spending score. Profiling uses these attributes to describe each group in order to predict what sort of customer may come in if they match attributes similar to those already grouped/analyzed.

Which of the following statements is true for k-NN classifiers? A. The classification accuracy is better with larger values of k B. The decision boundary is smoother with smaller values of k C. The decision boundary is linear D. k-NN does not require an explicit training step

D. k-NN does not require an explicit training step
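A sketch of why D holds (toy data invented): the "model" is just the stored instances, and all distance computation happens lazily at prediction time.

```python
# k-NN sketch: no explicit training step; classification is done by
# majority vote among the k nearest stored instances.

from collections import Counter

train_data = [([1.0, 1.0], "a"), ([1.2, 0.8], "a"),
              ([4.0, 4.2], "b"), ([4.1, 3.9], "b")]

def knn_predict(x, k=3):
    # distances computed only now, at query time
    neighbors = sorted(train_data,
                       key=lambda item: sum((p - q) ** 2
                                            for p, q in zip(item[0], x)))
    votes = Counter(label for _, label in neighbors[:k])
    return votes.most_common(1)[0][0]

print(knn_predict([1.1, 1.0]))   # "a"
```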

Self-Organizing maps are an example of A. Unsupervised learning B. supervised learning C. Clustering algorithm D. Prediction Tool E. A and C

E. A and C A. Unsupervised learning C. Clustering algorithm

A proper design of a clustering task will include: A. Clear definition of the data points B. The appropriate selection of data attributes included in the analysis C. The specification of weights and standardization of the selected attributes D. The selection of the proper algorithm E. All of the above

E. All of the above A. Clear definition of the data points B. The appropriate selection of data attributes included in the analysis C. The specification of weights and standardization of the selected attributes D. The selection of the proper algorithm

In the example of predicting number of babies based on storks' population size, number of babies is: A. Independent variable B. Target C. Predictor D. Dependent Variable E. B and D

E. B and D B. Target D. Dependent Variable

Which statement about outliers is true? A. Outliers should be identified and removed from a dataset B. Outliers should be part of the training dataset but should not be present in the test data C. Outliers should be part of the test dataset but should not be present in the training data D. The nature of the problem determines how outliers are used E. More than one of a, b, c, or d is true

E. More than one of a, b, c, or d is true

Use the confusion matrix for Model X and the confusion matrix for Model Y to decide which statement is correct.

Model X    Computed Accept  Computed Reject
Accept     10               5
Reject     25               60

Model Y    Computed Accept  Computed Reject
Accept     6                9
Reject     15               70

A. Model X classified 35 instances as an "accept" B. Model Y is more successful when evaluated by error rate C. Model X is more successful in classifying rejection cases D. There are 100 instances in the dataset E. More than one of a, b, c, or d is correct F. None of a, b, c, or d is correct

E. More than one of a, b, c, or d is correct Note: C. is not correct!
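The statements can be checked arithmetically (rows taken as actual class, columns as computed class, per the matrices above):

```python
# Confusion-matrix check for Models X and Y.

model_x = [[10, 5],   # actual Accept: computed accept / computed reject
           [25, 60]]  # actual Reject
model_y = [[6, 9],
           [15, 70]]

def totals(m):
    n = sum(sum(row) for row in m)          # total instances
    computed_accept = m[0][0] + m[1][0]     # column 1 sum
    error_rate = (m[0][1] + m[1][0]) / n    # off-diagonal / total
    return n, computed_accept, error_rate

print(totals(model_x))  # (100, 35, 0.3)  -> statements A and D hold
print(totals(model_y))  # (100, 21, 0.24) -> Y has the lower error rate (B)
```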

What is the purpose of the validation set?

To ensure that the model's predictive performance is not a fluke and to help minimize error in the prediction model before it is evaluated on the test set

Which of the following statements about the following confusion matrix is correct?

Computed Decision  Class 1  Class 2  Class 3
Class 1            10       5        3
Class 2            5        15       3
Class 3            2        2        5

A. 60% of instances were incorrectly classified B. There are 22 instances of class 2 in the dataset C. 4 instances of the data were incorrectly classified with class 3 D. In total 18 instances were classified with class 1 E. More than one of a, b, c, or d is correct F. None of a, b, c, or d is correct

F. None of a, b, c, or d is correct
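A quick arithmetic check of why none of the statements hold (rows taken as actual class, columns as computed class):

```python
# Verification of statements a-d for the 3-class confusion matrix.

m = [[10, 5, 3],   # actual Class 1
     [5, 15, 3],   # actual Class 2
     [2, 2, 5]]    # actual Class 3

n = sum(map(sum, m))                           # 50 instances in total
correct = sum(m[i][i] for i in range(3))       # 30 on the diagonal
misclassified = n - correct                    # 20 -> 40%, not 60% (a)
class2_instances = sum(m[1])                   # 23, not 22 (b)
wrong_as_class3 = m[0][2] + m[1][2]            # 6, not 4 (c)
computed_class1 = m[0][0] + m[1][0] + m[2][0]  # 17, not 18 (d)
print(misclassified / n, class2_instances, wrong_as_class3, computed_class1)
```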

T/F: A critical skill in data science is the ability to decompose a data-analytics problem into pieces such that each piece matches a known DM algorithm.

False

T/F: After comparing 2 clusterings with Rand's measure, we get a value of 0.7 for R. Knowing that Rand can assume the values between 0 and 1, we decide that the clusterings are somewhat similar.

False

T/F: If we build a classifier and evaluate it on the training set and the test set: we would expect the test set to have the higher accuracy.

False

T/F: In backpropagation learning of an MLP, the connection weights will be updated in such a way that the next time the data points are exposed to the network the amount of error will be zero

False

T/F: The best way to include an ordinal attribute as an input to an MLP in a classification task is binary coding of the ordinal attribute.

False

T/F: The k-means clustering algorithm will automatically find the best value of k as part of its normal operation.

False

T/F: In a lie detection effort formulated as a classification task, the data objects are each individual.

False

T/F: the only difference between a trained & untrained MLP network is that the trained network is guaranteed to make accurate predictions.

False

Limitations/strengths of MLP as a prediction algorithm.

Strengths - handles complex data, adjustable, high success rate, resilient to outliers. Limitations - high computational cost, minimal insight into the relationships between data points

T/F: Clustering 1 has grouped 300 data points into 4 groups, and clustering 2 has grouped the same 300 data points into 5 groups. The value of Fowlkes-Mallows for the 2 clusterings is 1. That means H-criterion will be 0.

True

T/F: DT, KNN, & NB are similar with respect to the fact that all of them are non-random algorithms and are used only for classification tasks.

True

T/F: Decision Trees are easy to interpret.

True

T/F: Increasing value of k in KNN can combat noise in a dataset.

True

T/F: K-means can only work with numerical attributes.

True

T/F: The SMA forecasting method is basically a linear regression model with the following specifications: - the independent attributes are the n previous data points - B0 (beta) is assumed to be 0, and the rest of the betas are simply 1 divided by the number of independent attributes

True
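The equivalence in the statement can be sketched numerically (series and window size invented): an n-period simple moving average forecast equals a linear model over the n previous observations with intercept beta_0 = 0 and every slope beta_i = 1/n.

```python
# SMA as a constrained linear regression: identical forecasts
# (up to floating-point rounding).

series = [12.0, 15.0, 14.0, 16.0, 18.0]   # invented time series
n = 3                                     # SMA window

sma_forecast = sum(series[-n:]) / n

betas = [1.0 / n] * n                     # beta_i = 1/n for each lag
beta0 = 0.0                               # intercept assumed 0
regression_forecast = beta0 + sum(b * x for b, x in zip(betas, series[-n:]))

print(sma_forecast, regression_forecast)
```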

T/F: What separates the interesting patterns extracted with a DM algorithm from those found with human comprehension? DM-found patterns have a depth and complexity that human comprehension would not be able to spot.

True

T/F: a density-based clustering algorithm can generate non-globular (i.e., not globe-shaped) clusters.

True

T/F: classification and estimation are similar DM tasks, & the only difference is that classification estimates the class probability.

True

T/F: co-occurrence grouping, association rules discovery and market basket analysis are different names for the same DM task.

True

T/F: discriminating between spam and non-spam emails is a classification task.

True

T/F: social network friend suggestion is an example of a DM classification task

True

