Chapter 6 Test Bank
In the opening vignette, predictive modeling is described as A) estimating the future using the past. B) not yet accepted in the business world. C) the least practiced branch of data mining. D) unable to handle complex predictive problems.
A) estimating the future using the past.
No matter the topology or architecture of a neural network, they all use the same algorithm to adjust weights during training.
False
The k-nearest neighbor algorithm is overly complex when compared to artificial neural networks and support vector machines.
False
The task undertaken by a neural network does not affect the architecture of the neural network; in other words, architectures are problem-independent.
False
The opening vignette teaches us that ________ medicine is a relatively new term coined in the healthcare arena, where the main idea is to dig deep into past experiences to discover new and useful knowledge to improve medical and managerial procedures in healthcare.
evidence-based
In the process of image recognition (or categorization), images are first transformed into a multidimensional ________ and then, using machine-learning techniques, are categorized into a finite number of classes.
feature space
In the power generators case study, data mining-driven software tools, including data-driven ________ technologies with historical data, helped an energy company reduce emissions of NOx and CO.
predictive modeling
________ has proved the most popular of the techniques proposed for shedding light into the "black-box" characterization of trained neural networks.
Sensitivity analysis
In a neural network, groups of neurons can be organized in a number of different ways; these various network patterns are referred to as ________.
topologies
Predictive modeling is perhaps the most commonly practiced branch in data mining. What are three of the most popular predictive modeling techniques?
1. Artificial neural networks 2. Support vector machines 3. K-nearest neighbor
________ is the most widely used supervised learning algorithm in neural computing.
Backpropagation
All the following statements about hidden layers in artificial neural networks are true EXCEPT A) hidden layers are not direct inputs or outputs. B) more hidden layers increase required computation exponentially. C) many top commercial ANNs forgo hidden layers completely. D) more hidden layers include many more weights.
C) many top commercial ANNs forgo hidden layers completely.
What is a major drawback to the basic majority voting classification in kNN? A) It requires frequent human subjective input during computation. B) Classes that are more clustered tend to dominate prediction. C) Even the naive version of the algorithm is hard to implement. D) Classes with more frequent examples tend to dominate prediction.
D) Classes with more frequent examples tend to dominate prediction.
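A minimal Python sketch of this drawback, under stated assumptions: the toy data, the function names, and the inverse-distance weighting are illustrative and not taken from the chapter. Distance weighting is one commonly used remedy, shown here only for contrast with plain majority voting.

```python
from collections import Counter, defaultdict
import math

def nearest(train, query, k):
    """Return the k (distance, label) pairs closest to `query`."""
    return sorted((math.dist(point, query), label) for point, label in train)[:k]

def majority_vote(train, query, k):
    """Plain kNN: every one of the k neighbors gets one vote."""
    votes = Counter(label for _, label in nearest(train, query, k))
    return votes.most_common(1)[0][0]

def weighted_vote(train, query, k, eps=1e-9):
    """Distance-weighted kNN: closer neighbors count more (weight = 1 / distance)."""
    scores = defaultdict(float)
    for distance, label in nearest(train, query, k):
        scores[label] += 1.0 / (distance + eps)  # eps avoids division by zero
    return max(scores, key=scores.get)

# Hypothetical data: two "rare" points near the query, four "common" points farther away.
train = [((0.0, 0.0), "rare"), ((0.2, 0.0), "rare"),
         ((2.0, 0.0), "common"), ((2.0, 1.0), "common"),
         ((2.0, -1.0), "common"), ((3.0, 0.0), "common")]
query = (0.1, 0.0)

print(majority_vote(train, query, k=5))  # the more frequent class wins the vote
print(weighted_vote(train, query, k=5))  # the two very close neighbors win
```

With k = 5 the vote is three "common" neighbors against two "rare" ones, so the more frequent class wins even though the query sits next to the rare points.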
In the Coors case study, a neural network was used to more skillfully identify which beer flavors could be predicted.
True
The development process for an ANN application involves ________ steps.
nine
Neural networks are called "black boxes" due to the lack of ability to explain their reasoning.
True
Compared to the human brain, artificial neural networks have many more neurons.
False
Generally speaking, support vector machines are less accurate a prediction method than other approaches such as decision trees and neural networks.
False
In the Coors case study, genetic algorithms were of little use in solving the flavor prediction problem.
False
In the mining industry case study, the input to the neural network is a verbal description of a hanging rock on the mine wall.
False
In the opening vignette, the high accuracy of the models in predicting the outcomes of complex medical procedures showed that data mining tools are ready to replace experts in the medical field.
False
The k-nearest neighbor algorithm appears well-suited to solving image recognition and categorization problems.
True
The most complex problems solved by neural networks require one or more hidden layers for increased accuracy.
True
The network topology that allows only one-way links between layers, with no feedback linkage permitted, is known as backpropagation.
True
The use of hidden layers and new topologies and algorithms renewed waning interest in neural networks.
True
Though useful in business applications, neural networks are a rough, inexact model of how the brain works, not a precise replica.
True
Neural computing refers to a ________ methodology for machine learning.
pattern-recognition
Writing the SVM classification rule in its dual form reveals that classification is only a function of the ________, i.e., the training data that lie on the margin.
support vectors
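For reference, the dual-form decision rule is usually written as follows. The notation is the standard textbook one and is an assumption here, not taken from the chapter: \alpha_i are the learned multipliers, y_i the class labels (+1 or -1), K the kernel (the ordinary dot product in the linear case), and b the bias.

```latex
f(\mathbf{x}) = \operatorname{sign}\!\left( \sum_{i=1}^{n} \alpha_i \, y_i \, K(\mathbf{x}_i, \mathbf{x}) + b \right),
\qquad \alpha_i \ge 0
```

Training points with \alpha_i = 0 drop out of the sum, so only the points with \alpha_i > 0, the support vectors, affect the classification.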
Describe the k-nearest neighbor (kNN) data mining algorithm.
kNN is a type of instance-based learning in which the function is only approximated locally and all computation is deferred until the actual prediction. A new case is classified by a majority vote of its k nearest neighbors.
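A minimal Python sketch of this idea, assuming Euclidean distance and a simple majority vote. The function name and the toy data are hypothetical.

```python
from collections import Counter
import math

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Lazy learning: no model is built in advance; all work happens at prediction time.
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical training set: (feature vector, class label) pairs.
train = [((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((0.9, 1.1), "a"),
         ((5.0, 5.0), "b"), ((5.2, 4.9), "b"), ((4.8, 5.1), "b")]

print(knn_predict(train, (1.1, 1.0)))  # prints "a"
print(knn_predict(train, (5.0, 5.0)))  # prints "b"
```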
Describe the nine steps in the development process for an ANN application.
1. Data to be used for training and testing the network are collected. 2. Training data are identified, and a plan is made for testing the performance of the network. 3 & 4. A network architecture and a learning method are selected. 5. The network weights and parameters are initialized, then modified as training-performance feedback is received. 6. The application data are transformed into the type and format required by the neural network. 7 & 8. Training and testing are conducted iteratively by presenting input and desired or known output data to the network. 9. A stable set of weights is obtained.
What are the five steps in the backpropagation learning algorithm?
1. Initialize weights with random values and set other parameters. 2. Read in the input vector and the desired output. 3. Compute the actual output via the calculations, working forward through the layers. 4. Compute the error. 5. Change the weights by working backwards from the output layer through the hidden layer.
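The five steps above can be sketched in Python for a small network with one hidden layer. This is an illustration under stated assumptions, not the chapter's own implementation: the sigmoid transfer function, the squared-error measure, the learning rate, the fixed number of epochs, and the OR training data are all choices made for the example.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train(data, n_hidden=2, lr=0.5, epochs=2000, seed=0):
    rng = random.Random(seed)
    n_in = len(data[0][0])
    # Step 1: initialize weights with random values (last weight in each row is a bias).
    w_h = [[rng.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_o = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]
    history = []  # mean squared error per epoch
    for _ in range(epochs):
        total = 0.0
        for x, target in data:  # Step 2: read the input vector and the desired output.
            xb = list(x) + [1.0]
            # Step 3: compute the actual output, working forward through the layers.
            h = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in w_h]
            hb = h + [1.0]
            y = sigmoid(sum(w * v for w, v in zip(w_o, hb)))
            # Step 4: compute the error.
            err = target - y
            total += err * err
            # Step 5: change the weights, working backward from the output layer.
            delta_o = err * y * (1 - y)
            delta_h = [delta_o * w_o[j] * h[j] * (1 - h[j]) for j in range(n_hidden)]
            w_o = [w + lr * delta_o * v for w, v in zip(w_o, hb)]
            for j in range(n_hidden):
                w_h[j] = [w + lr * delta_h[j] * v for w, v in zip(w_h[j], xb)]
        history.append(total / len(data))
    return w_h, w_o, history

# Hypothetical training set: the logical OR function.
or_data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
_, _, history = train(or_data)
print(history[0], history[-1])  # error at the first and last epoch
```

In practice, training usually stops when the error falls below a preset tolerance rather than after a fixed number of epochs.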
What are the three steps in the process-based approach to the use of support vector machines (SVMs)?
1. Preprocess the data (numericize it, then normalize it). 2. Develop the model (select the kernel type and determine the kernel parameters). 3. Deploy the model.
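The numericizing and normalizing work can be sketched in plain Python. This is a sketch under assumptions: one-hot encoding and min-max scaling are common choices, but they are not prescribed by the chapter, and the records and function names are hypothetical.

```python
def one_hot(values):
    """Numericize a categorical column: one 0/1 indicator per distinct category."""
    categories = sorted(set(values))
    return [[1.0 if v == c else 0.0 for c in categories] for v in values]

def min_max(values):
    """Normalize a numeric column to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant column: nothing to scale
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical records: (color, size).
records = [("red", 10.0), ("blue", 20.0), ("red", 40.0)]

colors = one_hot([r[0] for r in records])  # numericize first
sizes = min_max([r[1] for r in records])   # then normalize
rows = [c + [s] for c, s in zip(colors, sizes)]
print(rows)
```

Numericizing comes before normalizing because scaling only makes sense once every attribute is a number.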
The student retention case study shows that, given sufficient data with the proper variables, data mining techniques are capable of predicting freshman student attrition with approximately ________ percent accuracy.
80
How is a general Hopfield network represented architecturally?
A single layer of neurons in which the neurons are connected to one another.
When using support vector machines, in which stage do you transform the data? A) preprocessing the data B) developing the model C) experimentation D) deploying the model
A) preprocessing the data
Why is sensitivity analysis frequently used for artificial neural networks? A) because it is required by all major artificial neural networks B) because some consequences of mistakes by the network might be fatal, so justification may matter C) because it is generally informative, although it cannot help to identify cause-and-effect relationships among variables D) because it provides a complete description of the inner workings of the artificial neural network
B) because some consequences of mistakes by the network might be fatal, so justification may matter
When using support vector machines, in which stage do you select the kernel type (e.g., RBF, Sigmoid)? A) preprocessing the data B) developing the model C) experimentation D) deploying the model
B) developing the model
Which element in an artificial neural network roughly corresponds to a dendrite in a human brain? A) node B) input C) output D) weight
B) input
Using the k-nearest neighbor machine learning algorithm for classification, larger values of k A) sharpen the distinction between classes. B) reduce the effect of noise on the classification. C) increase the effect of noise on the classification. D) do not change the effect of noise on the classification.
B) reduce the effect of noise on the classification.
Support vector machines are a popular machine learning technique primarily because of A) their relative cost and superior predictive power. B) their superior predictive power and their theoretical foundation. C) their relative cost and relative ease of use. D) their high effectiveness in the very few areas where they can be used.
B) their superior predictive power and their theoretical foundation.
In the Coors case study, why was a genetic algorithm paired with neural networks in the prediction of beer flavors? A) to replace the neural network in harder cases B) to complement the neural network by reducing the error term C) to enhance the neural network by pre-selecting output classes for the neural network D) to best model how the flavor of beer evolves as it ages
B) to complement the neural network by reducing the error term
In the student retention case study, which of the following variables was MOST important in determining whether a student dropped out of college? A) high school GPA and SAT high score math B) college and major C) completed credit hours and hours enrolled D) marital status and hours enrolled
C) completed credit hours and hours enrolled
In the student retention case study, of the four data mining methods used, which was the most accurate? A) ANN B) DT(C5) C) SVM D) LR
C) SVM
Neural networks have been described as "biologically inspired." What does this mean? A) They are faithful to the entire process of computation in the human brain. B) They were created to look identical to human brains. C) They crudely model the biological makeup of the human brain. D) They have the power to undertake every task the human brain can.
C) They crudely model the biological makeup of the human brain.
In developing an artificial neural network, all of the following are important reasons to pre-select the network architecture and learning method EXCEPT A) some configurations have better success than others with specific problems. B) development personnel may be more experienced with certain architectures. C) most neural networks need special purpose hardware, which may be absent. D) some neural network software may not be available in the organization.
C) most neural networks need special purpose hardware, which may be absent.
The k-nearest neighbor machine learning algorithm (kNN) is A) highly mathematical and computationally intensive. B) a method that has little in common with regression. C) regarded as a "lazy" learning method. D) very complex in its inner workings.
C) regarded as a "lazy" learning method.
In the opening vignette, which method was the best in both accuracy of predicted outcomes and sensitivity? A) ANN B) CART C) C5 D) SVM
D) SVM
For how long do SVM models continue to be accurate and actionable? A) for as long as the developers stay with the firm B) for as long as management support continues to exist for the project C) for as long as you choose to use them D) for as long as the behavior of the domain stays the same
D) for as long as the behavior of the domain stays the same
Backpropagation learning algorithms for neural networks are A) the least popular algorithm due to their inaccuracy. B) used without hidden layers for effectiveness. C) used without a training set of data. D) required to have error tolerance set in advance.
D) required to have error tolerance set in advance.
All of the following are disadvantages/limitations of the SVM technique EXCEPT A) model building involves complex and time-demanding calculations. B) selection of the kernel type and kernel function parameters is difficult. C) they have high algorithmic complexity and extensive memory requirements for complex tasks. D) their accuracy is poor in many domains compared to neural networks.
D) their accuracy is poor in many domains compared to neural networks.
Which element in an artificial neural network roughly corresponds to a synapse in a human brain? A) node B) input C) output D) weight
D) weight
Using support vector machines, you must normalize the data before you numericize it.
False
With a neural network, outputs are attributes of the problem while inputs are potential solutions to the problem.
False
In 1992, Boser, Guyon, and Vapnik suggested a way to create nonlinear classifiers by applying the kernel trick to maximum-margin hyperplanes. How does the resulting algorithm differ from the original optimal hyperplane algorithm proposed by Vladimir Vapnik in 1963?
The resulting algorithm is formally similar to the original, except that every dot product is replaced by a nonlinear kernel function.
Why have neural networks shown much promise in many forecasting and business classification applications?
Neural networks are designed to recognize patterns and can learn from sensory data in a way similar to the human brain, giving accurate interpretations of raw input data. They are a set of algorithms with the ability to generalize, which helps them give unbiased results.
Define the term sensitivity analysis as it relates to ANNs.
Sensitivity analysis is a method for extracting the cause-and-effect relationships among the inputs and the outputs for a trained neural network model.
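A minimal Python sketch of the idea, treating the trained network as a black box: each input is perturbed slightly, one at a time, and the change in the output is recorded. The `toy_model` function is a hypothetical stand-in for a trained network, and the perturbation size is an assumption.

```python
def sensitivity(model, baseline, delta=0.01):
    """Change in model output per unit change in each input, around `baseline`."""
    base_output = model(baseline)
    scores = []
    for i in range(len(baseline)):
        perturbed = list(baseline)
        perturbed[i] += delta  # nudge one input, hold the others fixed
        scores.append(abs(model(perturbed) - base_output) / delta)
    return scores

def toy_model(x):
    """Hypothetical stand-in for a trained neural network."""
    return 3.0 * x[0] + 0.5 * x[1]

scores = sensitivity(toy_model, [1.0, 2.0])
print(scores)  # the first input influences the output more than the second
```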
________ are of particular interest to modeling highly nonlinear, complex problems, systems, and processes and use hyperplanes to separate output classes in training data.
Support vector machines (SVMs)
Each ANN is composed of a collection of neurons that are grouped into layers. One of these layers is the hidden layer. Define the hidden layer.
The hidden layer is a layer of neurons that takes input from the previous layer and converts those inputs into outputs for further processing.
In the student retention case study, support vector machines used in prediction had proportionally more true positives than true negatives.
True
Prior to starting the development of a neural network, developers must carry out a requirements analysis.
True
Unlike other "black box" predictive models, support vector machines have a solid mathematical foundation in statistics.
True
In the formulation of the traffic accident study in the traffic case study, the five-class prediction problem was decomposed into a number of ________ models in order to obtain the granularity of information needed.
binary classification
Due largely to their better classification results, support vector machines (SVMs) have recently become a popular technique for ________-type problems.
classification
In an ANN, ________ express the relative strength (or mathematical value) of the input data or the many connections that transfer data from layer to layer.
connection weights
In a typical network structure of an ANN consisting of three layers (input, intermediate, and output), the intermediate layer is called the ________ layer.
hidden
In machine learning, the ________ is a method for converting a linear classifier algorithm into a nonlinear one by using a nonlinear function to map the original observations into a higher-dimensional space.
kernel trick
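A small Python check of why this works, assuming a degree-2 polynomial kernel on two-dimensional inputs (an illustrative choice, not one named in the chapter). The kernel computed in the original space equals an ordinary dot product in a higher-dimensional space, so the mapping never has to be carried out explicitly.

```python
import math

def poly_kernel(x, z):
    """Degree-2 polynomial kernel (x . z)^2, computed in the original 2-D space."""
    return (x[0] * z[0] + x[1] * z[1]) ** 2

def feature_map(x):
    """Explicit mapping of a 2-D point into the 3-D space this kernel implies."""
    return (x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2)

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

x, z = (1.0, 2.0), (3.0, 4.0)
kernel_value = poly_kernel(x, z)                    # no mapping needed
mapped_value = dot(feature_map(x), feature_map(z))  # same quantity, via the mapping
print(kernel_value, mapped_value)  # equal up to floating-point rounding
```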
In the mathematical formulation of SVMs, normalization and/or scaling are important steps to guard against variables/attributes with ________ that might otherwise dominate the classification formulae.
larger variance
A thorough analysis of an early neural network model called the ________, which used no hidden layer, in addition to a negative evaluation of the research potential by Minsky and Papert in 1969, led to a diminished interest in neural networks.
perceptron
Kohonen's ________ feature maps provide a way to represent multidimensional data in much lower dimensional spaces, usually one or two dimensions.
self-organizing
Historically, the development of ANNs followed a heuristic path, with applications and extensive experimentation preceding theory. In contrast to ANNs, the development of SVMs involved sound ________ theory first, then implementation and experiments.
statistical learning