SAS Enterprise Miner Certification without choices

builds classification models in an attempt to improve classification of rare events in the target.

Rule Induction Node

used to register a model to the SAS Metadata Server.

Register Model

Fit Statistics can provide information that affects decision predictions, but does not affect estimate predictions.

Which of the following is not true about results produced by the Regression node?

creates Generalized Linear models in the HP Env.

HP GLM Node

creates a sample dataset.

Sample Node

used to save data to a dataset or other specified location.

Save Data Node

Data Source

A ________________ is a link between SAS Enterprise Miner and a SAS table. It contains metadata that informs SAS Enterprise Miner of all the information about the data set that is required for the analysis project.

Doubling Amount

A __________________ gives the amount of change required for doubling the primary outcome odds. It is equal to log(2) ≈ 0.69 divided by the parameter estimate of the input of interest.
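
As a rough illustration of the doubling amount defined above, the sketch below divides log(2) by a parameter estimate. The function name and the estimate 0.25 are hypothetical, chosen only for this example.

```python
import math


def doubling_amount(parameter_estimate: float) -> float:
    """Change in an input needed to double the primary outcome odds."""
    return math.log(2) / parameter_estimate


# Hypothetical logistic regression estimate, used only for illustration.
beta = 0.25
amount = doubling_amount(beta)

# Raising the input by `amount` multiplies the odds by exp(beta * amount),
# which should be approximately 2.
print(amount, math.exp(beta * amount))
```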

Data Source

A __________________ is a metadata definition that provides SAS Enterprise Miner with information about a SAS data set or SAS table.

Nominal

A _____________variable (sometimes called a categorical variable) is one that has two or more categories, but there is no intrinsic ordering to the categories. For example, gender is a categorical variable having two categories (male and female) and there is no intrinsic ordering to the categories.

All of the Above

A data source in SAS Enterprise Miner differs from a raw data file because a data source has additional attached metadata. This metadata includes which of the following?

Logit Score

A linear combination of the inputs generates a ______________

51. There is one weight for each input, per hidden unit. Each hidden unit has a bias. The output layer has a bias and a weight for each hidden unit. This equates to (3*10) + 10 + 1 + 10 = 51

A multilayer perceptron neural network is using three interval inputs to model one interval target (outcome). The neural network has ten hidden units and one hidden layer. How many weights, including the biases, are being estimated?
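
The count in this card can be reproduced with a few lines of arithmetic. This is a minimal sketch for a network with a single hidden layer; the function name is hypothetical.

```python
def mlp_parameter_count(n_inputs: int, n_hidden: int, n_outputs: int = 1) -> int:
    """Count weights and biases in a one-hidden-layer perceptron."""
    input_to_hidden = n_inputs * n_hidden    # one weight per input, per hidden unit
    hidden_biases = n_hidden                 # one bias per hidden unit
    hidden_to_output = n_hidden * n_outputs  # one weight per hidden unit, per output
    output_biases = n_outputs                # one bias per output unit
    return input_to_hidden + hidden_biases + hidden_to_output + output_biases


print(mlp_parameter_count(n_inputs=3, n_hidden=10))  # 51
```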

Consequence

A profit value called a profit _________________ is assigned to both correct (and incorrect) outcome and decision combinations. The profit _____________ is the expected revenue (profit) or cost (loss or negative profit) that is associated with each outcome and decision combination.

Odds Ratio

An ______________ expresses the increase in primary outcome odds associated with a unit change in an input. It is obtained by exponentiating the parameter estimate of the input of interest.
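
A small sketch of the exponentiation step described above. The parameter estimate is a made-up value, not one taken from any SAS output.

```python
import math


def odds_ratio(parameter_estimate: float) -> float:
    """Odds ratio for a one-unit change in an input of a logistic regression."""
    return math.exp(parameter_estimate)


# Hypothetical estimate: a value of 0 means the input leaves the odds unchanged.
print(odds_ratio(0.0))   # 1.0
print(odds_ratio(0.25))  # greater than 1, so the odds increase with the input
```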

Association

An ________________ rule is a statement of the form (item set A) => (item set B). The goal of the analysis is to determine the strength of all the association rules among a set of items. The value of the generated rules is gauged by confidence, support, and lift.

Interval

An _________________variable is similar to an ordinal variable, except that the intervals between the values of the interval variable are equally spaced. For example, suppose you have a variable such as annual income that is measured in dollars, and we have three people who make $10,000, $15,000 and $20,000.

Ordinal

An ____________variable is similar to a categorical variable. The difference between the two is that there is a clear ordering of the variables. For example, suppose you have a variable, economic status, with three categories (low, medium and high). In addition to being able to classify people into these three categories, you can order the categories as low, medium and high.

57 % The confidence that the purchase of A implies the purchase of B = (total of purchase of A and B together) / (total purchases of A) = 100 / 175 = 57%

An analyst is performing a market basket analysis (affinity analysis) on the purchasing of shaving cream and seltzer water. The purchase data from a set of 250 customers is shown in the image: What is the confidence of the rule "Shaving Cream implies Seltzer Water"?
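
The arithmetic behind this card can be sketched as follows. The counts (100 joint purchases, 175 shaving cream purchases, 250 customers) come from the card itself; the function names are hypothetical.

```python
def confidence(n_a_and_b: int, n_a: int) -> float:
    """Estimated probability of B given A."""
    return n_a_and_b / n_a


def support(n_a_and_b: int, n_total: int) -> float:
    """Estimated probability that A and B occur together."""
    return n_a_and_b / n_total


# Counts stated in the card: 100 customers bought both items,
# 175 bought shaving cream, and 250 customers were observed in total.
print(f"{confidence(100, 175):.0%}")  # 57%
print(f"{support(100, 250):.0%}")     # 40%
```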

Binary

An example of a ___________________target variable is purchase or no-purchase, which is often used for modeling customer profiles.

Interval

An example of an ________________target variable is value of purchase, which is useful for modeling the best customers for particular products, catalogs, or sales campaigns.

Ordinal

An example of an _____________target is sales volume, which might contain a few discrete values such as low, medium, and high.

performs Association or Sequence discovery.

Association Node

Nominal The reason is that each level of the variable simply represents the fact that they are different from each other. No level is greater or smaller than the other level. These levels do not follow any natural order. Therefore, the underlying scale of measurement is nominal in nature.

Assume a variable is coded as: 1 = unmarried, 2 = married, 3 = divorced, 4 = widowed. Which measurement levels should be selected in SAS EMiner for this variable?

Gini coefficient

Assume in a data mining project that the task is to predict rankings of a target variable as accurately as possible. Which of the following should be used to judge prediction models?

Stratify Stratification will ensure that the segment scheme values will be distributed evenly between the partitions.

Assume that a company has an excellent customer segmentation in place and the segment scheme is a variable in the input data set. What is the best partition method that one should use?

Sample Method: Stratify and Criterion: Equal The binary target variable will require a Stratify sample method. If equal is selected, the node samples the same number of observations from each stratum.

Assume the Target has an event proportion of 2% in the original data. Which of the following property values should be used in the Sample node of SAS EMiner to create a sample from that data with a balanced 50/50 split for Target?
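
To make the idea of stratifying with an equal criterion concrete, here is a plain Python sketch. It is not the Sample node's algorithm; the function name, the row layout, and the choice to take as many rows per stratum as the smallest stratum holds are all assumptions for illustration.

```python
import random


def equal_stratified_sample(rows, target_key, seed=0):
    """Draw the same number of rows from each level of the target."""
    strata = {}
    for row in rows:
        strata.setdefault(row[target_key], []).append(row)
    # Assumption: sample as many rows per stratum as the rarest level has.
    per_stratum = min(len(group) for group in strata.values())
    rng = random.Random(seed)
    sample = []
    for group in strata.values():
        sample.extend(rng.sample(group, per_stratum))
    return sample


# Hypothetical data with a 2% event rate: 20 events among 1000 rows.
rows = [{"id": i, "target": 1 if i < 20 else 0} for i in range(1000)]
balanced = equal_stratified_sample(rows, "target")
events = sum(row["target"] for row in balanced)
print(len(balanced), events)  # 40 20
```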

Linear

Attempts to predict the value of a continuous target as a ___________function of one or more independent inputs.

automated tool to help find optimal configurations for a neural network model.

AutoNeural Node

Average Square Error

By changing the model Assessment Measure property to ___________________, you can construct what is known as a class probability tree. It can be shown that by doing so, you can minimize the imprecision of the tree. Analysts sometimes use this model assessment measure to select inputs for a flexible predictive model such as neural networks.

Average Square Error

By changing the model assessment measure to _______________________, you can construct what is known as a class probability tree. It can be shown that this action minimizes the imprecision of the tree. Analysts sometimes use this model assessment measure to select inputs for a flexible predictive model such as neural networks.

Standard Deviation

By default, the Neural Network node standardizes all input variables prior to the weight estimation step so that they have a mean of zero and a __________________ of one. This default setting is known as _____________________.
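
The standardization described in this card can be sketched in a few lines. Whether the node divides by the sample or the population standard deviation is not stated here, so the use of the sample standard deviation below is an assumption.

```python
import statistics


def standardize(values):
    """Rescale values to mean zero and standard deviation one."""
    mean = statistics.fmean(values)
    # Assumption: sample standard deviation (n - 1 in the denominator).
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


# Hypothetical interval input.
scaled = standardize([10.0, 15.0, 20.0, 35.0])
print(statistics.fmean(scaled), statistics.stdev(scaled))
```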

Each unique value has the potential of being the optimal split point. Split Point definition: partitioning the available training data. If the measurement scale of the selected input is interval, each unique value serves as a potential split point in the data.

Choose the correct statement that illustrates Decision Tree Split Search for continuous (interval) inputs:

performs observation clustering which can be used to segment databases.

Cluster Node

Interval

In Decision Tree modeling, If the measurement scale of the selected input is ______________, each unique value serves as a potential split point for the data.

Cluster Analysis

In _____________ analysis, the goal is to identify distinct groupings of cases across a set of inputs.

Transform input data, apply analysis, and generate deployment methods. SAS Enterprise Miner enables you to perform some of the steps of the full analytic workflow. Some tasks, such as defining the business problem or analytic objective, and selecting, extracting, validating, and repairing input data, must be done before you begin working with SAS Enterprise Miner. Other tasks, such as deploying the analysis, gathering, and assessing results, must be done after you have worked with SAS Enterprise Miner. Review: The Analytic Workflow

In a typical applied analytics project, which of the following tasks would you use SAS Enterprise Miner to perform?

Good

In the Cumulative Lift chart, high values of cumulative lift suggest that the model is doing a ____________ job of separating the primary and secondary cases?

used to modify columns metadata for variables.

Metadata Node

Specify a maximum percentage of missing values for variables to be rejected. The default value is 50.

Missing Percentage Threshold

Prediction is more important than explanation. What makes neural networks interesting is their ability to approximate virtually any continuous association between the inputs and the target. Neural networks are especially useful for prediction problems where no mathematical formula is known that relates inputs to outputs, prediction is more important than explanation, and there is a lot of training data. The Rule Induction tool combines decision tree and neural network models to predict nominal targets. It is intended to be used when one of the nominal target levels is rare. Review: Neural Network Structure

Neural networks are especially useful for prediction problems where which of the following is true?

False positive fraction

One of the most useful is ___________________, the proportion of secondary outcome cases that find their way into the top ranks of predicted cases.

Sensitivity

One quantity of interest for a selected fraction of cases is _________________, the proportion of primary outcome cases.

Impute

Regression input variables with missing values can cause problems such as biased predictions. Methods to _______________ these values include synthetic distributions, estimation, and tree algorithms.

Specify a maximum number of levels for a class variable before being marked REJECTED. The default value is 20.

Reject Levels Count Threshold

enables the user to manually build if-then-else rules.

Rules Builder Node

Stopped The name stopped training comes from the fact that the final model is selected as if training were stopped on the optimal iteration. Detecting when this optimal iteration occurs (while actually training) is somewhat problematic. To avoid stopping too early, the Neural Network tool continues to train until convergence on the training data or until reaching the maximum iteration count, whichever comes first.

SAS Enterprise Miner treats each iteration in the optimization process as a separate model. The iteration with the smallest value of the selected fit statistic is chosen as the final model. This method of model optimization is called ___________ training.
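
The selection rule in this card amounts to picking the iteration with the smallest fit statistic after training has finished. A minimal sketch, with made-up validation error values:

```python
def select_stopped_training_iteration(fit_by_iteration):
    """Return the index of the iteration with the smallest fit statistic."""
    return min(range(len(fit_by_iteration)), key=fit_by_iteration.__getitem__)


# Hypothetical validation average squared error recorded at each iteration.
validation_ase = [0.25, 0.20, 0.17, 0.16, 0.165, 0.18]
best = select_stopped_training_iteration(validation_ase)
print(best, validation_ase[best])  # 3 0.16
```

Training runs through every iteration first; the final model is then chosen as if training had stopped at the selected iteration.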

performs unsupervised learning using Kohonen vector quantization (VQ), Kohonen self-organizing maps (SOMs), or batch SOMs with Nadaraya-Watson or local-linear smoothing.

SOM/Kohonen Node

is a penalized likelihood statistic, which can be thought of as a weighted average square error

Schwarz Bayesian Criterion (SBC) / likelihood?

used to examine segmented or clustered data and identify factors that differentiate data segments from the population.

Segment Profile Node

Do not impute any missing values, because trees can handle them. Trees can treat missing values as a separate category. The split search criteria for decision trees assign the missing values along one side of a branch at the splitting node as a category. This is quite different from a regression or neural network, where each input variable is used in a mathematical equation and hence cannot have missing values.

Suppose your input variables have missing values. What should you do before running a decision tree with these input variables?

computes similarity measures associated with time-stamped or time series data.

TS Similarity Node

Role: Input Level: Nominal The Advanced Metadata Advisor assists the SAS EMiner user in assigning appropriate Role and Level metadata based upon a data set's variable data type, name and stored values. Although units_sold is a numeric variable, it will be assigned the Level Nominal because it contains fewer than 10 distinct values. The role of the variable units_sold will be Input because it does not contain any missing values and is below the Missing Percentage Threshold value.

The SAS data set credit_customers contains a numeric variable units_sold that holds only the values 1, 2, 3, 4. Based on the settings provided in the Advanced Advisor Options, what will be the Role and Level of the units_sold variable when the credit_customers data set is created using Advanced Metadata Advisor in the Data Source Wizard? Advanced Advisor Options (Property: Value): Missing Percentage Threshold: 50; Reject Vars with Excessive Missing Values: Yes; Class Levels Count Threshold: 10; Detect Class Levels: Yes; Reject Levels Count Threshold: 20; Reject Vars with Excessive Class Values: Yes; Database Pass-Through: Yes

True

The Score tool creates score code modules in the SAS, C, and Java languages. These score code modules can be saved and used outside of SAS Enterprise Miner or outside of the SAS System.

True

The Score tool is used to score new data inside SAS Enterprise Miner and to create scoring modules for use outside of SAS Enterprise Miner. The Score tool adds predictions to any data source that has a role of Score. This data source must have the same inputs as the training data.

True

The Variable Selection tool's chi-square approach is similar to a decision tree. The advantage of the tree-like approach is its ability to detect nonlinear and non-additive relationships between the inputs and the target. However, its method for handling categorical inputs makes it sensitive to spurious input/target correlations.

Lift The lift can be interpreted as a general measure of association between the two item sets. Lift values greater than 1 indicate positive correlation; values equal to 1 indicate zero correlation; and values less than 1 indicate negative correlation. If Lift=2 for the rule A => B, then a customer having A is twice as likely to have B than a customer chosen at random. Lift is symmetric, so the lift of the rule A => B is the same as the lift of the rule B => A.

The ______ of the rule A => B is the confidence of the rule divided by the expected confidence, assuming that the item sets are independent. The expected confidence of A => B is the probability that a customer has B.
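
The lift calculation described above can be sketched directly from transaction counts. All counts below are hypothetical and chosen so that the arithmetic is easy to follow.

```python
def lift(n_a_and_b: int, n_a: int, n_b: int, n_total: int) -> float:
    """Confidence of A => B divided by the expected confidence of A => B."""
    confidence = n_a_and_b / n_a          # P(B | A)
    expected_confidence = n_b / n_total   # P(B), assuming independence
    return confidence / expected_confidence


# Hypothetical counts: 1000 customers, 200 have A, 250 have B, 100 have both.
print(lift(n_a_and_b=100, n_a=200, n_b=250, n_total=1000))  # A => B
print(lift(n_a_and_b=100, n_a=250, n_b=200, n_total=1000))  # B => A, same value
```

Swapping the roles of A and B leaves the result unchanged, which is the symmetry the card mentions.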

Support

The __________ for the rule A => B is the probability that the two item sets occur together (or the probability that a customer has both A and B). __________ is symmetric, so the support of the rule A => B is the same as the support of the rule B => A. The support of the rule A => B is estimated by using the following formula:

heuristic

The ___________ algorithm alternately merges branches and reassigns consolidated groups of observations to different branches. The process stops when a binary split is reached. Among all candidate splits considered, the one with the best worth is chosen. The ____________ algorithm initially assigns each consolidated group of observations to a different branch, even if the number of such branches is more than the limit allowed in the final split. At each merge step, the two branches are merged that degrade the worth of the partition the least. After the two branches are merged, the algorithm considers reassigning consolidated groups of observations to different branches. Each consolidated group is considered in turn, and the process stops when no group is reassigned.

Results

The ____________ window for the cluster analysis includes the Segment Plot, Segment Size, Mean Statistics, and Output windows.

Outcome

The _____________ of market basket analysis is a set of association rules such as buying item A implies buying item B (A => B). The strength of the association is measured by the support, confidence, and lift of the rule.

Validation

The ______________ data set is used for monitoring and tuning the model to improve its generalization. The tuning process usually involves selecting among models of different types and complexities. The tuning process optimizes the selected model on the validation data. Consequently, a further holdout sample is needed for a final, unbiased assessment.

average squared error Average square error is a fundamental statistical measure of model performance. It is calculated by averaging (across all cases in a data set) the squared difference between the actual and the predicted target values of the target variable, as shown in the equation below.

The ______________ measures the difference between the prediction estimate and the observed target value.
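
The averaging described in this card can be written out as a short function. The actual and predicted values are invented for the example.

```python
def average_squared_error(actual, predicted):
    """Mean of the squared differences between actual and predicted targets."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sum(squared) / len(squared)


# Hypothetical binary target values and predicted probabilities.
actual = [1, 0, 1, 0]
predicted = [0.9, 0.2, 0.6, 0.1]
print(average_squared_error(actual, predicted))  # approximately 0.055
```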

File Import

The _______________ tool enables you to convert selected external flat files, spreadsheets, and database tables into a format that SAS Enterprise Miner recognizes as a data source.

DMNeural

The _______________ tool is designed to provide a flexible target prediction using an algorithm with some similarities to a neural network. A multi-stage prediction formula scores new cases. The problem of selecting useful inputs is circumvented by a principal components method. Model complexity is controlled by choosing the number of stages in the multi-stage prediction formula.

Link Analysis

The _______________ tool is used to discover and examine connections between items in a complex system. The tool transforms data from different sources into a data model that can be graphed. Centrality measures are derived from the graph and the tool can perform item-cluster detection for certain types of data. Recommendation tables can also be provided for transactional input data.

Input Tool

The _______________ tool represents the data source that you choose for your mining analysis. It also provides details (metadata) about the variables in the data source that you want to use.

Rule Induction The Rule Induction algorithm has three steps. 1. Using a decision tree, the first step attempts to locate "pure" concentrations of cases. These are regions of the input space containing only a single value of the target. The rules identifying these pure concentrations are recorded in the scoring code, and the cases in these regions are removed from the training data. The two-color example data does not contain any "pure" regions, so the step is skipped. 2. The second step attempts to filter easy-to-classify cases. This is done with a sequence of binary target decision trees. The first tree in the sequence attempts to distinguish the most common target level from the others. Cases found in leaves correctly classifying the most common target level are removed from the training data. Using this revised training data, a second tree is built to distinguish the second most common target class from the others. Again, cases in any leaf correctly classifying the second most common target level are removed from the training data. This process continues through the remaining levels of the target. With a binary target, no cases will remain in the training data after this step, so the Rule Induction node is essentially a decision tree.

The ________________ tool combines decision tree and neural network models to predict nominal targets. It is intended to be used when one of the nominal target levels is rare. New cases are predicted using a combination of prediction rules (from decision trees) and a prediction formula (from a neural network, by default).

Number of Surrogate

The __________________Rules, under the Node property, specifies the maximum number of surrogate rules that are sought in each non-leaf node. A surrogate rule is a backup to the main splitting rule. When the main splitting rule relies on an input whose value is missing, the first surrogate rule is invoked. If the first surrogate rule also relies on an input whose value is missing, the next surrogate rule is invoked. If missing values prevent the main rule and all of the surrogate rules from applying to an observation, then the main rule assigns the observation to the branch that it designated as receiving missing values.

Variable

The ___________________Selection tool provides selection based on one of two criteria: the R-square variable selection criterion or the chi-square selection criterion.

Dmine Regression The main distinguishing feature of Dmine regression versus traditional regression is its grouping of categorical inputs and binning of continuous inputs. • The levels of each categorical input are systematically grouped together using an algorithm reminiscent of a decision tree. Both the original and grouped inputs are made available for subsequent input selection. • All interval inputs are broken into a maximum of 16 bins in order to accommodate nonlinear associations between the inputs and the target. The levels of the maximally binned interval inputs are grouped using the same algorithm for grouping categorical inputs. These binned-and-grouped inputs and the original interval inputs are made available for input selection.

The ____________________ tool is designed to provide a regression model with more flexibility than a standard regression model. It should be noted that with increased flexibility comes increased chances of overfitting.

Curse of dimensionality

The ______________________ refers to the exponential increase in data required to densely populate space as the dimension increases. For example, the eight points fill the one-dimensional space but become more separated as the dimension increases. In 100-dimensional space, they would be like distant galaxies. The ______________________ limits your practical ability to fit a flexible model to noisy data (real data) when there are a large number of input variables. A densely populated input space is required to fit highly complex models. In assessing how much data is available for data mining, the dimension of the problem must be considered.

AutoNeural

The ____________________tool ignores decision processing data. Predictions from the ______________ tool are adjusted for prior probabilities, but its actual model selection process is based strictly on misclassification (without prior adjustment). When the primary and secondary outcome proportions are not equal, this process can lead to unexpected prediction results.

Bonferroni

The ___________________correction is an adjustment made to P values when several dependent or independent statistical tests are being performed simultaneously on a single data set. To perform a ________________correction, divide the critical P value (α) by the number of comparisons being made.
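
The division described in this card is a one-liner. The significance level and the number of comparisons below are example values only.

```python
def bonferroni_threshold(alpha: float, n_comparisons: int) -> float:
    """Critical P value after dividing alpha by the number of comparisons."""
    return alpha / n_comparisons


# Hypothetical setting: alpha = 0.05 with 10 simultaneous tests.
threshold = bonferroni_threshold(0.05, 10)
p_values = [0.001, 0.02, 0.004]
print(threshold, [p < threshold for p in p_values])
```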

Maximum

The ________________function takes the maximum of the prediction estimate from the different models as the prediction from the Ensemble tool.

Dimension

The ________________of a problem refers to the number of input variables that are available for creating a prediction. Data mining problems are often massive in ________________. Therefore, predictive models must have a means of selecting useful inputs from a potentially vast number of candidates. Predictive modeling methods such as decision trees, regressions, and neural networks have built-in mechanisms for ______________reduction.

Ensemble

The ________________tool creates a new model by combining the predictions from multiple models. For prediction estimates and rankings, this combination is usually done by averaging. When the predictions are decisions, this combination is done by voting. The commonly observed advantage of ______________ models is that the combined model provides a better prediction than the individual models that compose it. It is important to note that the _______________ model can be more accurate than the individual models only if the individual models disagree with one another. You should always compare the model performance of the _______________ model with the individual models. Using an _____________ model, you can combine predictions from multiple models to create a single consensus prediction.

Confidence

The ______________of an association rule A => B is the conditional probability of a transaction that contains item set B given that it contains item set A (or the probability that a customer has B given that the customer has A). The ________________is estimated by using the following formula:

Average

The _____________function takes the average of the prediction estimates from the different models as the prediction from the Ensemble tool. This function is the default method.
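
The average and maximum functions of the Ensemble tool can be illustrated with plain lists of prediction estimates. This is a sketch of the combination rule only, not of the node itself; the function name and the three sets of predictions are hypothetical.

```python
def combine_predictions(model_predictions, function="average"):
    """Combine per-model prediction estimates case by case."""
    per_case = list(zip(*model_predictions))  # regroup: one tuple per case
    if function == "average":
        return [sum(case) / len(case) for case in per_case]
    if function == "maximum":
        return [max(case) for case in per_case]
    raise ValueError(f"unknown function: {function}")


# Hypothetical estimates from three models for the same three cases.
tree = [0.2, 0.8, 0.5]
regression = [0.4, 0.6, 0.5]
neural = [0.3, 0.7, 0.8]
print(combine_predictions([tree, regression, neural], "average"))
print(combine_predictions([tree, regression, neural], "maximum"))
```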

Split

The ___________search starts by selecting an input for partitioning the available training data. If the measurement scale of the selected input is interval, each unique value serves as a potential _________ point for the data. If the input is categorical, the average value of the target is taken within each categorical input level. The averages serve the same role as the unique interval input values in the discussion that follows.
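
The first stage of the split search can be sketched as two small helpers: one that lists the candidate split points of an interval input, and one that computes the average target within each level of a categorical input. Both function names and the sample values are hypothetical.

```python
def candidate_split_points(values):
    """Each unique value of an interval input is a potential split point."""
    return sorted(set(values))


def level_target_averages(levels, targets):
    """Average target value within each level of a categorical input."""
    sums, counts = {}, {}
    for level, target in zip(levels, targets):
        sums[level] = sums.get(level, 0.0) + target
        counts[level] = counts.get(level, 0) + 1
    return {level: sums[level] / counts[level] for level in sums}


print(candidate_split_points([3.0, 1.5, 3.0, 2.0]))                # [1.5, 2.0, 3.0]
print(level_target_averages(["a", "b", "a", "b"], [1, 0, 0, 0]))   # {'a': 0.5, 'b': 0.0}
```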

True

The analysis role of each variable tells SAS Enterprise Miner the purpose of the variable in the current analysis. The measurement level of each variable distinguishes continuous numeric variables from categorical variables. The analysis role of the data set tells SAS Enterprise Miner how to use the selected data set in the analysis.

Hidden Units

The basic building blocks of multilayer perceptrons are called ____________. ______________ are modeled after the neuron. Each ______________ receives a linear combination of input variables. The coefficients are called the (synaptic) weights. An activation function transforms the linear combinations and then outputs them to another unit that can then use them as inputs.

Gini Index

The basis for calculating worth in the ___________ method is the proportion of individual class outcomes in the cases (for all cases and for the weighted subsets of the cases). Proportions are squared and then summed across classes. Worth is calculated as the difference between one and the sum of the squares of proportions.
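
The calculation described above (one minus the sum of squared class proportions) can be sketched for a single group of cases. The class counts are invented for the example.

```python
def gini_index(class_counts):
    """One minus the sum of squared class proportions."""
    total = sum(class_counts)
    return 1.0 - sum((count / total) ** 2 for count in class_counts)


# Hypothetical class counts in a node.
print(gini_index([50, 50]))  # 0.5, an evenly mixed binary node
print(gini_index([100, 0]))  # 0.0, a pure node
```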

DMine

The main distinguishing feature of ____________ regression versus traditional regression is its grouping of categorical inputs and binning of continuous inputs.

True

The metadata definition serves three primary purposes for a selected data set. It informs SAS Enterprise Miner of the following: the analysis role of each variable the measurement level of each variable the analysis role of the data set

Missing Values

The most prevalent problem for neural networks is __________________. Like regressions, neural networks require a complete record for estimation and scoring. Neural networks resolve this complication in the same way that regression does: by imputation.

True

The primary advantage of separate sampling is a reduction in the number of cases required to build a model (with little reduction in model quality).

Weights

Then you use the Decision _____________tab in the Decision Processing window to specify the values in the profit matrix.

Advanced

Using the default ____________options, the Metadata Advisor does the following: • rejects variables with more than 50% missing values • detects the number of class levels for numeric variables and assigns a level of Nominal to those with class counts below 20 • detects the number of class levels for character variables and assigns a role of Rejected to those with class counts above 20.

all of the above The Metadata Advisor Options step in the Data Source Wizard enables you to set the Metadata Advisor, which controls how SAS Enterprise Miner creates metadata for the variables in your data source. The Metadata Advisor has two modes: Basic and Advanced. You can use the Advanced Advisor Options window to customize the advanced metadata options. Review: Using the Advanced Metadata Advisor

Using the default advanced options, the Metadata Advisor performs which of the following options automatically?

To ensure that the choice of split is not influenced by input measurement scale. The Kass adjustment makes it easier for an interval input to be chosen as the best split.

What is the purpose of the Kass (Bonferroni) adjustment in the decision tree split search algorithm?

DMNeural

What prediction tool provides these features? Up to three PCs with highest target R square are selected. One of eight continuous transformations are selected and applied to selected PCs. The process is repeated three times with residuals from each stage.

False SAS scores the data internally in SAS EMiner by making a copy of the scored data table on the SAS Foundation Server assigned to your project.

When you score data internally in SAS Enterprise Miner, a copy of the scored data table is not stored on the SAS Foundation Server assigned to your project. If the data table to be scored is very large, you might want to consider scoring the data outside of SAS Enterprise Miner using score code modules.

AutoNeural You can use the AutoNeural tool to automatically explore alternative network architectures and hidden unit counts. The AutoNeural tool conducts limited searches in order to find better network configurations. There are several options that the tool uses to control the algorithm. Review: Introduction

Which SAS Enterprise Miner tool can be used to automatically explore alternative network architectures and hidden unit counts?

The Filter tool You use the Filter tool to apply a filter to exclude certain observations from your data for your analysis. The Input Data tool enables you to include a data source in your process flow. The Data Partition tool enables you to partition your training data into subsets for training, validation, and testing. The Explore window enables you to visually explore the details in your data source. Review: Introduction to Filtering Data

Which SAS Enterprise Miner tool would you use to exclude certain observations in your data source, such as extreme outliers, from your analysis?

growing decision trees interactively There are three methods for constructing decision tree models: interactively (by hand), automatically, and autonomously. Building models interactively, or by hand, enables you to select and modify the splits of your data to create the tree and its leaves. It is an informative method that is recommended when you are first learning about the Decision Tree node, and it remains valid even when you are more expert at predictive modeling. You grow a decision tree automatically by using the Train node function of the Interactive Decision Tree tool. Review: Constructing Decision Trees

Which method of growing decision trees is the most informative and hands-on approach?

Stepwise Forward evaluates the improvement upon baseline and builds increasingly complex models from one-input up - sequentially adding smallest p-value. Backwards contains all available inputs and then removes them - sequentially removing largest p-value. Stepwise builds from baseline as forward but scans all variables after each new variable for inclusion in the model. Those with large p-values are removed to the pool for reassessment.

Which method of input selection for regression analysis evaluates the statistical significance of all included inputs after each input is added?

A. Forward Forward evaluates the improvement upon baseline and builds increasingly complex models from one input up - sequentially adding smallest p-value. Backwards contains all available inputs and then removes them -sequentially removing largest p-value. Stepwise builds from baseline as forward but scans all variables after each new variable for inclusion in the model. Those with large p-values are removed to the pool for reassessment.

Which method of input selection for regression analysis evaluates the statistical significance of the total model to see if it improves on the baseline as the variables are added and once no further improvement is made then variable selection ends?

Forward Forward evaluates the improvement upon baseline and builds increasingly complex models from one-input up - sequentially adding smallest p-value.

Which method of input selection for regression analysis evaluates the statistical significance of the total model to see if it improves on the baseline as the variables are added and once no further improvement is made, then variable selection ends?

Memory-Based Reasoning The Memory-Based Reasoning (MBR) tool uses a k-nearest neighbor algorithm to categorize or predict observations.

Which modeling tool uses a k-nearest neighbor algorithm to categorize or predict observations?

the Score tool and the SAS Code tool You use the Score tool to apply a predictive model to scoring data, and you can use the SAS Code tool to save the scored data table to a different SAS library. The Model Implementation tool, the Java Code tool, and the C Code tool do not exist. Review: Introduction

Which of the following SAS Enterprise Miner tools might you use to perform model implementation?

Neural networks are universal approximators. Neural networks have no internal, automated process for selecting useful inputs.

Which of the following are true about neural networks in SAS Enterprise Miner?

A project can contain one or more diagrams. A diagram can contain one or more process flows. A process flow contains multiple nodes. Projects contain diagrams, diagrams contain process flows, and process flows contain nodes. Remember that a node is a SAS Enterprise Miner tool. Review: What Is a Project?

Which of the following correctly describes the hierarchical organization of an analysis within SAS Enterprise Miner?

all of the above The most widely used type of neural network in data analysis is the multilayer perceptron (MLP). Multilayer perceptrons are often represented by a network diagram instead of an equation. The network diagram is arranged in layers. The first layer, called the input layer, contains any number of inputs. The input layer connects to a hidden layer, which is composed of hidden units. The hidden layer connects to a final layer called the target, or output, layer. A multilayer perceptron can contain additional hidden layers with any number of hidden units. Review: Neural Network Structure

Which of the following is a characteristic of a multilayer perceptron?

all of the above Predictive models are widely used and come in many varieties. Any model must perform three essential tasks: predict new cases, select useful inputs, and optimize complexity. Different modeling tools use different methods to complete each task. Review: A Predictive Model's Tasks

Which of the following is an essential task for any predictive model?

All of the above. For cluster analysis, you should seek inputs that have all the listed attributes. Inputs should also be meaningful to the analysis objective, relatively independent, and have low kurtosis and skewness statistics (at least in the training data). Review: Selecting and Refining Inputs for Analysis

Which of the following is important to consider when selecting inputs for cluster analysis?

Another benefit is ease in model interpretation. A cost associated with "regularizing" the input distributions using a simple transformation is difficulty in model interpretation.

Which of the following is not a good reason to "regularize" input distributions using a simple transformation?

One benefit is improved model performance. A cost associated with "regularizing" the input distributions using a simple transformation is difficulty in model interpretation. Review: Introduction

Which of the following is not a good reason to "regularize" input distributions using a simple transformation?

A prediction estimate for the target variables is formed from a simple linear combination of the inputs.

Which of the following is not true about logistic regression?

all of the above All of these problems can occur if you do not adjust for separate sampling. Review: Adjusting for Separate Sampling

Which of the following problems can result if you do not adjust for separate sampling?

Stepwise The stepwise selection method compares each variable to the previous variable entered and based on a specified level, one variable is considered more significant than the other.

Which of the following sequential selection methods do you use so that SAS Enterprise Miner will look at all variables already included in the model and delete any variable that is not significant at the specified level?

When you impute a synthetic value, it eliminates the incomplete case problem. Imputing a synthetic value for a missing value eliminates the incomplete case problem but modifies the input's distribution, which can bias the model predictions. Review: Introduction

Which of the following solves problems for you when you impute missing values?

Unless a profit matrix is defined, the Model Comparison tool selects the model with the smallest validation misclassification rate by default. The Model Comparison tool (on the Assess tab) generates a variety of statistics of fit that are listed for both the training and validation data partitions. By default, unless a profit matrix is defined, the Model Comparison tool selects the model with the smallest validation misclassification rate. You should look at the validation fit statistics that are appropriate to the type of prediction that you are interested in. Whether the best fit is indicated by the highest value or the lowest value depends on the specific fit statistic. Review: Understanding Model Assessment

Which of the following statements about assessing model performance using the Model Comparison tool is true? a. Unless a profit matrix is defined, the Model Comparison tool selects the model with the smallest validation misclassification rate by default. b. The Model Comparison tool calculates values for up to three statistics at a time. c. For all fit statistics that the Model Comparison tool generates, the highest value indicates the best fit. d. The Model Comparison tool appears on the Explore tab.

Score data contains the same input variables as training data, but the target variables might be different or missing. Score data is data that is structured like the training data but that lacks a target variable. Review: Introduction

Which of the following statements about the data table that you use for scoring is true?

None of the above. Typically, you do not need to alter any of the default settings within the Data Source Wizard when you create a data source for your scoring data. Usually, you want to apply your predictive model to the data without making changes to the data first. Partitioning data into training, validation, and test subsets is a task you perform on your training data, but not on your scoring data. Review: Introduction

Which of the following statements about the score data source is true?

none of the above All of these statements about working with profit matrices are true. Review: About Profit Matrices

Which of the following statements about working with profit matrices is false?

A data source is a metadata definition that informs SAS Enterprise Miner about the name and location of a SAS table, the SAS code that is used to define a library path, and the variable roles, measurement levels, and other attributes that are important for the data mining project. A data source is a link between an existing SAS table and SAS Enterprise Miner. It contains metadata that informs SAS Enterprise Miner about the name and location of the SAS table, the SAS code that is used to define a library path, and the variable roles, measurement levels, and other attributes that guide the data mining process. Review: What Is a Data Source?

Which of the following statements best describes a SAS Enterprise Miner data source?

The Neural Network tool stops training when the final model is selected. SAS Enterprise Miner treats each iteration of the optimization process as a separate model. The iteration with the smallest value of the selected fit statistic is chosen as the final model. This method of model optimization is called stopped training. To avoid stopping too early, the Neural Network tool continues to train until the training data converges or until the maximum iteration count is reached, whichever comes first. Review: Introduction

Which of the following statements is false? a. The Neural Network tool treats each iteration of the optimization process for a neural network as a separate model. b. The Neural Network tool chooses the iteration with the smallest value of the selected fit statistic as the final model. c. The Neural Network tool stops training when the final model is selected. d. The Neural Network tool often requires trial and error to find the best architecture.
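
As a rough illustration of stopped training, the sketch below treats each iteration as a separate model and picks the one with the smallest validation fit statistic. The `validation_ase` values and the function name are made up for illustration; they are not output from SAS Enterprise Miner.

```python
# Hypothetical validation average squared error recorded after each iteration.
# Training keeps running past the best iteration; selection happens afterwards.
validation_ase = [0.250, 0.210, 0.190, 0.185, 0.188, 0.195]


def select_final_iteration(fit_statistics):
    """Return the index of the iteration with the smallest fit statistic."""
    return min(range(len(fit_statistics)), key=lambda i: fit_statistics[i])


best_iteration = select_final_iteration(validation_ase)
print(best_iteration, validation_ase[best_iteration])
```

Note that the list continues after the selected iteration, which mirrors why statement c above is the false one: training does not stop at the moment the final model is identified.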

All of the above. The Explore window enables you to visually explore the variables in your data. You can use the Explore window to search for anticipated trends, unanticipated trends, or anomalies. The Explore window includes a Variable Values window that displays all of the values in the data source. It can also contain a histogram for each variable in the data source, which enables you to graphically explore relationships between the variables. The Explore window also includes a Plot Wizard that enables you to create scatter plots and other types of charts to further explore your data. Review: Using the Explore Window

Which of the following tasks can you perform using the Explore window in SAS Enterprise Miner?

assessing a tree based on validation data The Interactive Decision Tree tool enables you to create decision trees either by interactive training or automatic training. You can use the Interactive Decision Tree tool to view results based on the training data, but in order to assess your tree based on the validation data, you must exit the Interactive Decision Tree tool and use the Results window of the Decision Tree node. Review: Interactively Creating a Decision Tree with One Split

Which of the following tasks cannot be performed in the Interactive Decision Tree tool?

sensitivity Sensitivity is the proportion of primary outcome cases in a selected fraction. Review: Sensitivity Charts

Which of the following terms refers to the proportion of primary outcome cases in a selected fraction?

devote more data to training and less data to validation You use the Data Partition tool to specify the fraction of input data devoted to the training, validation, and test partitions. When you select a partitioning strategy it is important to note that there are various trade-offs. More data devoted to training results in more stable predictive models but less stable model assessments. More data devoted to validation results in less stable predictive models but more stable model assessments. The test partition is used only for calculating fit statistics after the modeling and model selection is complete. It is not uncommon to omit the test partition and assign an equal number of cases to the training partition and the validation partition. Review: Using the Data Partition Tool

Which partitioning strategy results in more stable predictive models?
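
The partitioning trade-off can be tried out with a small sketch. This is plain Python, not the Data Partition tool itself; the function name, the seed, and the default 50/50 split with no test partition are assumptions chosen for illustration.

```python
import random


def partition(cases, train_frac=0.5, valid_frac=0.5, seed=12345):
    """Randomly split cases into training, validation, and test partitions.

    Whatever is left after the training and validation fractions
    becomes the test partition (empty for the 50/50 default).
    """
    shuffled = list(cases)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_frac)
    n_valid = round(len(shuffled) * valid_frac)
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]
    return train, valid, test


train, valid, test = partition(range(100))
print(len(train), len(valid), len(test))
```

Raising `train_frac` gives the model more cases to learn from, at the price of fewer cases for assessment.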

Cluster analysis is considered to be supervised classification because the k-means clustering algorithm assigns cases to clusters. Cluster analysis is considered to be unsupervised classification because it attempts to group training data set cases based on similarities in input variables. Review: Introduction

Which statement about cluster analysis is false?

Market basket analysis is performed on transactions data, and variables must be assigned to both the ID and Target roles. Market basket analysis is performed on transactions data in which one variable is assigned to the ID role and one variable is assigned to the Target role. The Sequence role does not need to be used unless you want to perform a sequence analysis. Market basket analysis results in a list of association rules. The strength of the association is measured by the support, confidence, and lift of the rule. Review: Introduction

Which statement about market basket analysis is true? a. To perform a market basket analysis, one variable must be assigned to the Sequence role. b. Market basket analysis is performed on transactions data, and variables must be assigned to both the ID and Target roles. c. Market basket analysis results in a list of association rules. The strength of the rules can be measured by the CCC plot. d. Market basket analysis is useful for identifying the order in which customers buy products or services.

Sequence analysis requires inputs in the ID, Target, and Sequence roles. Sequence analysis is useful for determining the order in which a customer acquires products or services over time. The analysis requires a variable in the Sequence role as well as the Target and ID roles. In SAS Enterprise Miner, you use the Association tool to perform sequence analysis. The strength of the rule is measured by the support and confidence of the rule. Review: Introduction

Which statement about sequence analysis is true?

They are performed to reduce the bias in model predictions.

Which statement below is true about transformations of input variables in a regression analysis?

The average target value is calculated for each level, and then passed on for testing if it is the optimal split point. Split Point Definition: if the input is categorical, the average value of the target is taken within each categorical input level. The averages serve the same role as the unique interval input values.

Which statement describes the Decision Tree Split Search mechanism for categorical inputs?
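
A small sketch of the first step for categorical inputs: compute the average target value within each level, then order the levels by that average so they can be treated like the unique values of an interval input. The data and function name are hypothetical, and the ordering step is an assumption about how the averages are used, not a description of the node's internal code.

```python
from collections import defaultdict


def level_means(levels, targets):
    """Average target value within each level of a categorical input."""
    totals = defaultdict(lambda: [0.0, 0])  # level -> [sum of target, count]
    for level, y in zip(levels, targets):
        totals[level][0] += y
        totals[level][1] += 1
    return {level: s / n for level, (s, n) in totals.items()}


# Hypothetical training data: one categorical input and a binary target.
levels = ["A", "B", "A", "C", "B", "C"]
targets = [1, 0, 1, 0, 1, 0]

means = level_means(levels, targets)
ordered_levels = sorted(means, key=means.get)
print(means, ordered_levels)
```

Candidate splits would then fall between neighboring levels in `ordered_levels`, just as they fall between neighboring values of an interval input.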

link graph You can use the link graph and rules table to help interpret the results of market basket analysis.

Which tool can be used to help interpret the results of market basket analysis?

link graph You can use the link graph and rules table to help interpret the results of market basket analysis. Review: Using the Link Graph and Rules Table to Interpret Results

Which tool can be used to help interpret the results of market basket analysis?

validation A validation data set is used for monitoring and tuning a model to improve its generalization. The tuning process usually involves selecting among models of different types and complexities. The tuning process optimizes the selected model on the validation data. Review: Introduction

Which type of data set is used to monitor and tune a predictive model?

ranking Ranking predictions order cases based on the inputs' relationships with the target. Using the training data, the prediction model attempts to rank high value cases higher than low value cases. It is assumed that a similar pattern exists in the scoring data so that high value cases have high scores. The actual scores produced are inconsequential; only the relative order matters. The most common example of a ranking prediction is a credit score. Review: Rankings

Which type of prediction orders cases based on the inputs' relationship to the target?

Train

You grow a decision tree automatically by using the __________ Node function in the Interactive Decision Tree.

Assessment Measure

You use the __________ property to specify the method that you want to use to select the best tree, based on the validation data. The options for this property include the following: Decision (profit), Average Square Error, Misclassification, and Lift. Misclassification, Lift, and specialized pruning metrics should be used sparingly.

Filter Tool

You use the _______________ to exclude certain observations from your analysis. You might want to remove any cases that seem in error or out of place in a data source. After you remove these cases, your data is ready for subsequent analysis.

Decision

_________ are the simplest type of prediction and are usually associated with some kind of action (such as classifying a donor or non-donor). For this reason, ___________ are also known as classifications. Examples of _________ prediction include handwriting recognition, fraud detection, and direct mail solicitation. _________ predictions usually relate to a categorical target variable. For this reason, they are identified as primary, secondary, and tertiary in correspondence with the levels of the target.

Sequence

__________ analysis is an extension of market basket analysis to include a time dimension in the analysis. In this way, transaction data is examined for ________________s of items that occur (or do not occur) more (or less) commonly than expected. A Webmaster might use _____________ analysis to identify patterns or problems of navigation through a Web site.

Profiling

__________ is a by-product of reduction methods such as cluster analysis. The idea is to create rules that isolate clusters or segments, often based on demographic or behavioral measurements. _____________ helps describe the attributes of the people in a cluster. A marketing analyst might develop profiles of a customer database to describe the consumers of a company's products.

Clustering

__________ is considered to be unsupervised classification because the goal is to group training data set cases based on similarities in input variables with no target variable or specific outcome.

Predictive

______________ modeling tries to find good rules for predicting the values of one or more variables in a data source from the values of other variables in the data source. After a good rule has been found, it can be applied to new data sources that might or might not contain the variable(s) that are being predicted.

Simple

______________ models (linear regression, logistic regression) are easy to interpret, but linear predictions might lead to prediction bias.

Complex

______________ models can eliminate prediction bias but are much harder to interpret.

Logistic

______________ regression is used by the tool if you have a categorical response (a binary or ordinal measurement level).

Stepwise

______________ selection combines elements from both the forward and backward selection procedures. The method begins in the same way as the forward procedure, sequentially adding inputs with the smallest p-value below the entry cutoff. After each input is added, however, the algorithm reevaluates the statistical significance of all included inputs. If the p-value of any of the included inputs exceeds the stay cutoff, the input is removed from the model and reentered into the pool of inputs that are available for inclusion in a subsequent step. The process terminates when all inputs available for inclusion in the model have p-values in excess of the entry cutoff and all inputs already included in the model have p-values below the stay cutoff.

Backward

______________ selection creates a sequence of models of decreasing complexity. The sequence starts with a saturated model, which is a model that contains all available inputs, and therefore, has the highest possible fit statistic. Inputs are sequentially removed from the model. At each step, the input chosen for removal least reduces the overall model fit statistic. This is equivalent to removing the input with the highest p-value. The sequence terminates when all remaining inputs have a p-value that is less than the predetermined stay cutoff.

Accuracy

_______________ measures the fraction of cases where the decision matches the actual target value.

Unsupervised

________________ classification (also known as clustering and segmenting) attempts to group training data set cases based on similarities in input variables.

Data Reduction

________________ is the most ubiquitous application: exploiting patterns in data to create a more compact representation of the original. Though vastly broader in scope, data reduction includes analytic methods such as cluster analysis, which you learn to perform in this lesson.

Synthetic Distribution

________________ methods use a "one size fits all" approach to handle missing values. Any case with a missing input measurement has the missing value replaced with a fixed number. The net effect is to modify an input's distribution to include a point mass at the selected fixed number. The location of the point mass in ____________________ methods is not arbitrary. Ideally, it should be chosen to have minimal impact on the magnitude of an input's association with the target. With many modeling methods, this can be achieved by locating the point mass at the input's mean value.
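
A minimal sketch of this kind of fixed-value imputation, using the input's mean as the point mass. Missing values are represented as `None` here, which is an assumption of the example rather than anything specific to SAS Enterprise Miner.

```python
def impute_mean(values):
    """Replace every missing value (None) with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]


# Hypothetical interval input with two missing measurements.
income = [10.0, None, 20.0, None, 30.0]
imputed_income = impute_mean(income)
print(imputed_income)
```

Every missing case receives the same number, which is what creates the point mass in the input's distribution.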

Input

________________ variables with extreme distributions can diminish the predictive power of regression models. Transformations of these variables to a less extreme, more symmetric form is recommended in most cases.

Market Basket

_________________ analysis (also known as association rule discovery or affinity analysis) is a popular data mining method for exploring associations between items.

Market Basket

_________________ analysis (or association rule discovery) is used to analyze streams of transaction data (for example, ________________) for combinations of items that occur (or do not occur) more (or less) commonly than expected. Retailers can use this as a way to identify interesting combinations of purchases or as predictors of customer segments.

Discordance

__________________ measures the fraction of primary-target cases with predicted score lower than the predicted score of secondary-target cases.

Market Basket

___________________ analysis (also known as association rule discovery or affinity analysis) is a popular data mining method for exploring associations between items. In the simplest situation, the data consists of two variables: a transaction and an item. For each transaction, there is a list of items. Typically, a transaction is a single customer purchase, and the items are the things that were bought. The result of __________________ analysis is a list of association rules. The value of the generated rules is gauged by confidence, support, and lift. In SAS Enterprise Miner, you use the Association tool to perform ___________________ analysis. After you have run the association analysis, you can use a variety of tools such as the Link Graph and Rules table to interpret the results of the analysis.
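
To make confidence, support, and lift concrete, here is a small sketch for a single rule A => B. It uses the standard textbook definitions (support = share of transactions containing both sides, confidence = support divided by the share containing A, lift = confidence divided by the share containing B); the transactions and function name are invented, and the Association tool may report these on a different scale.

```python
def rule_stats(transactions, lhs, rhs):
    """Return (support, confidence, lift) for the rule lhs => rhs.

    transactions is a list of sets of items; lhs and rhs are sets of items.
    """
    n = len(transactions)
    n_lhs = sum(lhs <= t for t in transactions)           # contain the left side
    n_rhs = sum(rhs <= t for t in transactions)           # contain the right side
    n_both = sum((lhs | rhs) <= t for t in transactions)  # contain both sides
    support = n_both / n
    confidence = n_both / n_lhs
    lift = confidence / (n_rhs / n)
    return support, confidence, lift


# Hypothetical purchases, one set of items per transaction.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk"},
]
support, confidence, lift = rule_stats(transactions, {"bread"}, {"milk"})
print(support, confidence, lift)
```

A lift below 1, as in this toy data, suggests the two items appear together less often than would be expected if they were independent.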

Misclassification

____________________ measures the fraction of cases where the decision does not match the actual target value.
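
Accuracy and misclassification are complements of each other, which a short sketch makes easy to check. The decision and target lists are hypothetical.

```python
def accuracy(decisions, targets):
    """Fraction of cases where the decision matches the actual target value."""
    matches = sum(d == t for d, t in zip(decisions, targets))
    return matches / len(targets)


def misclassification(decisions, targets):
    """Fraction of cases where the decision does not match the target value."""
    misses = sum(d != t for d, t in zip(decisions, targets))
    return misses / len(targets)


# Hypothetical decisions and actual outcomes (1 = primary, 0 = secondary).
decisions = [1, 0, 1, 1, 0]
targets = [1, 0, 0, 1, 1]
print(accuracy(decisions, targets), misclassification(decisions, targets))
```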

Forward

____________________ selection creates a sequence of models of increasing complexity. The sequence starts with the baseline model, a model predicting the overall average target value for all cases. The algorithm searches the set of one-input models and selects the model that most improves on the baseline model. It then searches the set of two-input models that contain the input selected in the previous step and selects the model showing the most significant improvement. By adding a new input to those selected in the previous step, a nested sequence of increasingly complex models is generated. The sequence terminates when no significant improvement can be made.

Estimate

_____________________ methods provide tailored imputations for each case with missing values. This is done by viewing the missing value problem as a prediction problem. You can train a model to predict an input's value from other inputs. Then, when an input's value is unknown, you can use this model to predict or estimate the unknown missing value. This approach is best suited for missing values that result from a lack of knowledge about values that have no match or are not disclosed. The ______________ method is not appropriate for not-applicable missing values.

Memory-based reasoning

_______________________ is a process that identifies similar cases and applies the information that is obtained from these cases to a new record. In SAS Enterprise Miner, the ____________________ tool uses a k-nearest neighbor algorithm to categorize or predict observations.
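
The k-nearest neighbor idea can be sketched in a few lines. This is a generic illustration with Euclidean distance and majority voting, both assumptions of the example; it is not the MBR tool's actual implementation.

```python
import math
from collections import Counter


def knn_classify(train, new_case, k=3):
    """Classify new_case by majority vote among its k nearest training cases.

    train is a list of (features, label) pairs; features are numeric tuples.
    """
    neighbors = sorted(train, key=lambda row: math.dist(row[0], new_case))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical training cases with two inputs each.
train = [
    ((1.0, 1.0), "donor"),
    ((1.2, 0.9), "donor"),
    ((0.9, 1.1), "donor"),
    ((5.0, 5.1), "non-donor"),
    ((4.8, 5.3), "non-donor"),
]
prediction = knn_classify(train, (1.1, 1.0), k=3)
print(prediction)
```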

Decisions

_____________________ usually are associated with some type of action (such as classifying a case as a donor or a non-donor). For this reason, ____________ are also known as classifications. ______________ prediction examples include handwriting recognition, fraud detection, and direct mail solicitation.

Model Model Implementation is the process of applying a predictive model to data that lacks a target variable in order to make predictions.

________________ implementation is the process of applying your chosen predictive model to score data.

Ranking

_____________ predictions order cases based on the input variables' relationships with the target variable. Using the training data, the prediction model attempts to ________ high value cases higher than low value cases.

Ranking

____________ predictions order cases based on the inputs' relationships with the target. Using the training data, the prediction model attempts to __________ high value cases higher than low value cases. It is assumed that a similar pattern exists in the scoring data so that high value cases will have high scores. The actual scores produced are inconsequential; only the relative order matters. The most common example of a ____________ prediction is a credit score.

computes summary statistics using the DMDB procedure.

DMDB Node

Categorical

Chi-Squared logworth, Entropy and Gini evaluate split worth for which type of variables?

Appends Data Sets

Append Node

creates variable transformations in the HP Env using high performance procedures.

HP Transform Node

Logistic

Attempts to predict the probability that a binary or ordinal target will acquire the event of interest as a function of one or more independent inputs.

equals 2 × (ROC index - 0.5)

Gini coefficient (for binary prediction) ?
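
The relationship is simple enough to check by hand; the sketch below just encodes it. The function name is illustrative.

```python
def gini_from_roc_index(roc_index):
    """Gini coefficient for binary prediction: 2 * (ROC index - 0.5)."""
    return 2 * (roc_index - 0.5)


# A ROC index of 0.5 (no better than chance) gives a Gini coefficient of 0,
# and a perfect ROC index of 1.0 gives a Gini coefficient of 1.
for roc_index in (0.5, 0.75, 1.0):
    print(roc_index, gini_from_roc_index(roc_index))
```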

generates tree models in the HP Env.

HP Tree Node

analyzes preprocessed web data.

Path Analysis Node

generates summary and association statistics.

StatExplorer Node

creates the classical seasonal decomposition of time series data.

TS Decomposition Node

Squared Error

The squared difference between a target and an estimate is called the ___________________.

Concordance

When a pair of primary and secondary cases is correctly ordered, the pair is said to be in ___________________.
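
One way to see concordance and discordance is to enumerate every primary/secondary pair and compare predicted scores. The sketch below does this by brute force on hypothetical scores; tied pairs count toward neither measure here, which is an assumption of the example.

```python
def concordance_discordance(scores, targets):
    """Return (concordance, discordance) over all primary/secondary pairs.

    A pair is concordant when the primary case (target 1) has the higher
    predicted score, and discordant when it has the lower one.
    """
    primary = [s for s, t in zip(scores, targets) if t == 1]
    secondary = [s for s, t in zip(scores, targets) if t == 0]
    n_pairs = len(primary) * len(secondary)
    concordant = sum(p > s for p in primary for s in secondary)
    discordant = sum(p < s for p in primary for s in secondary)
    return concordant / n_pairs, discordant / n_pairs


# Hypothetical predicted scores and actual outcomes.
scores = [0.9, 0.8, 0.3, 0.6, 0.2]
targets = [1, 0, 1, 1, 0]
concordance, discordance = concordance_discordance(scores, targets)
print(concordance, discordance)
```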

measures the difference between the prediction estimate and the observed target value

average squared error?
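
A short sketch of the calculation: square the difference between each target and its estimate, then average over the cases. The values are hypothetical.

```python
def average_squared_error(estimates, targets):
    """Mean of the squared differences between targets and estimates."""
    squared_errors = [(t - e) ** 2 for e, t in zip(estimates, targets)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical predicted probabilities and observed binary targets.
estimates = [0.8, 0.2, 0.6]
targets = [1, 0, 1]
ase = average_squared_error(estimates, targets)
print(ase)
```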

accuracy or misclassification

provides a tally of the correct or incorrect prediction decisions

0.69/b1 The logistic regression equation (for two input variables) can in general be written as log(p/(1-p)) = b0 + b1 X1 + b2 X2. A doubling amount gives the amount of change required in an input variable for doubling the primary outcome odds. It is equal to log(2) (about 0.69) divided by the parameter estimate of the input of interest.

A useful concept in logistic regression is the doubling amount. How would you calculate doubling amount for an input variable that has a parameter estimate of b1?
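
The doubling amount can be verified numerically: increasing the input by log(2)/b1 multiplies the odds by exp(b1 * log(2)/b1) = 2. The parameter estimate below is a made-up value for illustration.

```python
import math


def doubling_amount(parameter_estimate):
    """Change in an input that doubles the primary outcome odds."""
    return math.log(2) / parameter_estimate


b1 = 0.05  # hypothetical parameter estimate for the input of interest
delta = doubling_amount(b1)

# Increasing the input by delta multiplies the odds by exp(b1 * delta).
odds_multiplier = math.exp(b1 * delta)
print(delta, odds_multiplier)
```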

establishes a central connection to join multiple nodes.

Control Point Node

cutoff node for binary target decisions.

Cutoff Node

fits an additive non-linear model using bucketed principal components as inputs.

DMNeural Node

partitions data into test, training and validation tables.

Data Partition Node

Indicates if the advisor should use SQL pass-through to determine the measurement levels and compute the summary statistics.

Database Pass-Through

empirical tree represents a segmentation of the data that is created by applying a series of simple rules.

Decision Tree Node

overfitting The maximal tree represents the most complicated model you are willing to construct from a set of training data. To avoid potential overfitting, many predictive modeling procedures offer some mechanism for adjusting model complexity. For decision trees, this process is known as pruning. Review: Introduction

Decision Tree models use pruning to adjust model complexity and avoid the potential problem known as what?

used to create or modify decision data for building models based on the value of the decisions and/or prior probabilities.

Decisions Node

Specify whether to apply the "Class levels count threshold" property.

Detect Class Levels

computes a forward stepwise least squares regression optionally including 2-way interactions, AOV16 and group variables.

Dmine Regression Node

drops variables from the preceding training path.

Drop Node

establishes an end point for group processing. Must be used with a start group.

End Groups Node

creates a new model by taking a function of posterior probabilities (for class targets) or the predicted values (for interval models) from multiple models.

Ensemble Node

False Ensemble models combine model predictions to form a single consensus prediction.

Ensemble models combine predictions from multiple models to create a multiple consensus predictions.

tool that illustrates the various UI elements that can be used by extension nodes.

Ext Demo

imports an external file.

File Import Node

removes observations from data based upon specified criteria.

Filter Node

Categorical

For _____________________ inputs, replace any missing values with the most frequent category.
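
The replacement rule on this card can be sketched in a few lines of Python. This is only an illustration of the idea, not how the Impute node is implemented; the function name `impute_mode` and the use of `None` to mark a missing value are assumptions.

```python
from collections import Counter

def impute_mode(values):
    """Replace missing entries (None) with the most frequent observed category."""
    observed = [v for v in values if v is not None]
    most_frequent = Counter(observed).most_common(1)[0][0]
    return [most_frequent if v is None else v for v in values]

# Hypothetical categorical input with two missing values.
colors = ["red", "blue", None, "red", None]
filled = impute_mode(colors)
print(filled)
```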

Decision Tree

For ____________ models, the modeling algorithm automatically ignores irrelevant inputs. Other modeling methods must be modified or rely on additional tools to properly deal with irrelevant inputs.

Binary

For a ___________ target, the weight estimation process is driven by an attempt to maximize the log-likelihood function.

Estimate

For a binary target, _______________ predictions are the probability of the primary outcome for each case. Primary outcome cases should have a high predicted probability; secondary outcome cases should have a low predicted probability.

All of the above

For a k-means clustering analysis, which of the following statements is true about input variables?

Logit Link Function

For binary prediction, any monotonic function that maps the unit interval to the real number line can be considered as a link. The _________________ is one of the most common, because it makes it easy to interpret the model.

Segment

For higher dimension data when clustering (4+ variables), you can use the ______________ Profile tool to understand the generated partitions. This tool enables you to compare the distribution of a variable in an individual _______________ to the distribution of the variable overall. As a bonus, the variables are sorted by how well they characterize the ______________.

True

For interval inputs, replace any missing values with the mean of the non-missing values.
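
A minimal Python sketch of mean imputation for an interval input follows. The function name `impute_mean` and the `None` marker for missing values are assumptions chosen for the example; they do not come from SAS Enterprise Miner.

```python
from statistics import mean

def impute_mean(values):
    """Replace missing entries (None) with the mean of the non-missing values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]

# Hypothetical interval input with one missing value.
incomes = [2.0, None, 4.0, 6.0]
filled = impute_mean(incomes)
print(filled)
```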

It uses a squared correlation and then a stepwise regression to eliminate irrelevant inputs. When you use the R-square variable selection criterion, a two-step process is followed. 1. SAS Enterprise Miner computes the squared correlation for each variable and then assigns the Rejected role to those variables that have a value less than the squared correlation criterion (default = 0.005). 2. SAS Enterprise Miner evaluates the remaining (not rejected) variables using a forward stepwise R-square regression. Variables that have a stepwise R-square improvement less than the threshold criterion (default = 0.0005) are assigned the Rejected role.

For the Variable Selection node, which statement describes the R-squared variable selection criteria?

creates a series of decision trees by fitting the residual of the prediction from the earlier tree in the series.

Gradient Boosting Node

generates graphical reports and interactive graphs.

Graph Explore Node

performs clustering in the HP environment.

HP Cluster Node

generates an identifier variable identifying the observations to be used for training and validation.

HP Data Partition Node

generates summary statistics for HP Env.

HP Explore Node

creates random forest models in the HP Env.

HP Forest Node

generates values for missing variables in the HP Env.

HP Impute

generates neural networks in the HP Env.

HP Neural

generates principal components to be used as inputs for successor nodes in the HP Env.

HP Principal Components

creates high-performance linear and logistic regression models in the HP Env.

HP Regression

creates Support Vector Machine models in the HP Env.

HP SVM Node

selects variables using high-performance procedures.

HP Variable Selection

You must calculate statistics for the models generated in each step of the selection process using both the training and validation data sets. When you study results and assess iteration plots, you must compare the average squared error and the misclassification rate associated with the different models. Use Validation Misclassification if your predictions are decisions, and use Validation Error if your predictions are estimates.

How can you optimize complexity for regression models when using SAS Enterprise Miner?

all of the above Using SAS Enterprise Miner, you can save your score code as a score code module in SAS code, C code, or Java code. Review: Introduction

How does SAS Enterprise Miner enable you to save your score code?

Two hidden layers In a two-layer MLP, the inputs are fed into the input layer and are multiplied by the interconnection weights as they are passed from the input layer to the first hidden layer. Within the first hidden layer, they are summed and then processed by a nonlinear function (usually the hyperbolic tangent). As the processed data leaves the first hidden layer, it is again multiplied by the interconnection weights and then summed and processed by the second hidden layer. Finally, the data is multiplied by the interconnection weights and processed one last time within the output layer to produce the neural network output. Thus, the combination of the two hidden layers enables you to model complex and discontinuous relationships among the inputs and the target.

How many hidden layers are generally needed in an MLP-based neural network to capture a discontinuous relationship between inputs and target?
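
The forward pass described in the answer can be sketched in plain Python. This is an illustrative toy, not the Neural Network node's implementation: the helper names, the weight values, and the identity activation on the output layer are all assumptions.

```python
import math

def dense(inputs, weights, biases, activation):
    """One layer: weighted sum plus bias for each unit, then the activation."""
    return [
        activation(bias + sum(w * x for w, x in zip(row, inputs)))
        for row, bias in zip(weights, biases)
    ]

def two_hidden_layer_mlp(inputs, params):
    # Each hidden layer sums its weighted inputs and applies tanh.
    hidden1 = dense(inputs, params["w1"], params["b1"], math.tanh)
    hidden2 = dense(hidden1, params["w2"], params["b2"], math.tanh)
    # Identity output activation (an assumption; a binary target would
    # typically use a logistic output instead).
    output = dense(hidden2, params["w3"], params["b3"], lambda z: z)
    return hidden1, hidden2, output

# Hypothetical weights and biases for 2 inputs, 2 + 2 hidden units, 1 output.
params = {
    "w1": [[0.5, -0.2], [0.1, 0.8]], "b1": [0.0, 0.1],
    "w2": [[1.0, -1.0], [0.3, 0.3]], "b2": [0.0, -0.1],
    "w3": [[0.7, -0.4]], "b3": [0.05],
}
hidden1, hidden2, output = two_hidden_layer_mlp([1.0, 2.0], params)
print(output)
```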

Polynomial combinations of the model inputs enable predictions to better match the true input/target association. Polynomial combinations of the model inputs enable predictions to better match the true input/target association and minimize the chances of overfitting. Review: Standard Logistic Regression

How would you characterize the effects of adding polynomial combinations of the model inputs to a regression?

True

If a profit matrix is defined for a data source, the Model Comparison tool selects the model with the largest validation average profit by default. Otherwise, the Model Comparison tool selects the model with the smallest validation misclassification rate.

False Model Prediction estimates will be biased if the outcome proportions in the training sample and scoring populations do not match.

If the outcome proportions in the training sample and scoring populations do not match, model prediction estimates will not be biased.

Categorical

In decision tree modeling, if the input is ____________________, the average value of the target is taken within each ____________________ input level. The averages serve the same role as the unique interval input values in the discussion that follows.

Logworth

In decision tree modeling, at least one ______________ must exceed a threshold for a split to occur with that input. By default, this threshold corresponds to a chi-squared p-value of 0.20 or a ______________ of approximately 0.7.
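
The link between the two thresholds on this card is that logworth is the negative base-10 logarithm of the p-value. A short Python check, with a function name chosen only for illustration:

```python
import math

def logworth(p_value: float) -> float:
    """Logworth is -log10 of the split's p-value."""
    return -math.log10(p_value)

# The default p-value threshold quoted on the card.
threshold = logworth(0.20)
print(round(threshold, 3))
```

Smaller p-values give larger logworth values, which is why a split needs its logworth to exceed the threshold rather than fall below it.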

Inputs

In Partial Least Squares (PLS) regression, the goal is to have linear combinations of the ________________ (called latent vectors) that account for variation in both inputs and the target.

Logit

In Regression analysis, a linear combination of the inputs generates a ___________ score, the log of the odds of primary outcome.
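
To make the relationship concrete, the sketch below builds a logit score from a linear combination of inputs and converts it to a probability. The intercept, weights, and input values are hypothetical, and the function names are assumptions.

```python
import math

def logit_score(intercept, weights, inputs):
    """Linear combination of the inputs: the log of the primary outcome odds."""
    return intercept + sum(w * x for w, x in zip(weights, inputs))

def probability(logit):
    """Invert the logit link to recover the primary outcome probability."""
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical parameter estimates and one case's input values.
score = logit_score(-1.0, [0.5, 0.25], [2.0, 4.0])
p = probability(score)

# The log of the odds recovers the logit score.
log_odds = math.log(p / (1.0 - p))
print(round(score, 3), round(p, 3), round(log_odds, 3))
```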

all of the above

In SAS Enterprise Miner's Decision Tree node, which of the following types of target variable can be used?

Association Tool

In SAS Enterprise Miner, market basket and sequence analyses are handled by the _______________tool. The tool transforms a transactions data set into rules. The data source you use must have the role of Transaction and must have an ID variable and a Target variable. The ID variable is some identifier of the transaction and the Target variable is the item.

Segmentation

In _______________analysis, the goal is to partition cases from a cloud of data (data that doesn't necessarily have distinct groups) into contiguous groups.

Data Splitting

In predictive modeling, the standard strategy for honest assessment of model performance is ________________. A portion of the data source is used for fitting the model—the training data set. The rest of the data source (the validation data set and the test data set) is held out for empirical validation.
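
A simple way to picture the strategy is a seeded random shuffle followed by slicing. The sketch below is an assumption-laden illustration, not the Data Partition node's algorithm: the 50/25/25 proportions, the seed, and the function name are all made up for the example.

```python
import random

def split_data(cases, train=0.5, validate=0.25, seed=12345):
    """Shuffle the cases, then slice into training, validation, and test sets."""
    shuffled = list(cases)
    random.Random(seed).shuffle(shuffled)  # fixed seed so the split can be replicated
    n_train = int(train * len(shuffled))
    n_validate = int(validate * len(shuffled))
    training = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_validate]
    test = shuffled[n_train + n_validate:]
    return training, validation, test

# Hypothetical data source of 100 case identifiers.
training, validation, test = split_data(range(100))
print(len(training), len(validation), len(test))
```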

Incremental Response Modeling.

Incremental Response Node

Provides details about the variables used as input for data mining.

Input Data Node

groups variable values into classes that can be used as inputs for predictive modeling.

Interactive Binning Node

Inputs

It is important to ensure that ____________have desirable distribution properties. The formula builder can be used to modify existing ___________variables and create new variables with desirable properties. ___________should have roughly the same scale of measurement and be free of unusual or missing values.

describes the ability of the model to separate the primary and secondary outcomes

Kolmogorov-Smirnov (KS) statistic?
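
One common way to compute the KS statistic is as the largest gap between the empirical cumulative distributions of the predicted scores for the primary and secondary cases. The Python sketch below follows that definition; the function name and the scores are hypothetical.

```python
def ks_statistic(primary_scores, secondary_scores):
    """Largest gap between the two empirical score distributions."""
    thresholds = sorted(set(primary_scores) | set(secondary_scores))
    largest_gap = 0.0
    for t in thresholds:
        cdf_primary = sum(s <= t for s in primary_scores) / len(primary_scores)
        cdf_secondary = sum(s <= t for s in secondary_scores) / len(secondary_scores)
        largest_gap = max(largest_gap, abs(cdf_primary - cdf_secondary))
    return largest_gap

# Hypothetical predicted probabilities for each outcome group.
primary = [0.9, 0.8, 0.7, 0.4]
secondary = [0.6, 0.3, 0.2, 0.1]
ks = ks_statistic(primary, secondary)
print(ks)
```

A value of 1 would mean the two groups' scores do not overlap at all, and a value near 0 would mean the model barely separates them.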

least angle regressions.

LARS node

performs link analysis.

Link Analysis Node

creates a model for predicting binary and nominal targets, based on the k closest neighbors from a training data set.

MBR Memory Based Reasoning Node

performs market basket analysis for data with potential item taxonomy.

Market Basket Node

creates new datasets or views by combining columns from datasets.

Merge Node

compares model predictions from the prior modeling nodes and selects the best modeling candidate.

Model Comparison Node

enables you to import and assess a model that is not created with one of the EMiner modeling nodes.

Model Import Node

Training

More data devoted to ___________ results in more stable predictive models but less stable model assessments.

Validation

More data devoted to ______________ results in less stable predictive models but more stable model assessments.

generates various plots and charts.

MultiPlot Node

high correlations among input variables When input (X) variables are highly correlated with each other or are linear combinations of other input variables, the input variables are called collinear and the situation is said to exhibit multicollinearity.

Multicollinearity in regression refers to which of the following?

a class of flexible, nonlinear regression models, discriminant models, and data reduction models that are interconnected in a nonlinear dynamic system.

Neural Network Node

Cluster

One of the main uses of _____________ analysis is data reduction. This process involves identifying distinct groups of cases across a set of inputs. ____________ analysis is useful because it is easier to manage, explore, and model groups rather than individual observations. New observations can be assigned to the appropriate ___________ based on the values of their associated inputs.

submit programs written in open source languages.

Open Source Integration Node

provides several predictive modeling techniques using latent variables.

Partial Least Squares node

Cumulative lift

Plotting ______________________ for all selection fractions yields a ____________________ chart. High values of ______________________ suggest that the model is doing a good job of separating the primary and secondary cases.
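
Cumulative lift at a given selection fraction is the response rate among the top-scored cases divided by the overall response rate. A minimal Python sketch, using hypothetical scores and outcomes and an assumed function name:

```python
def cumulative_lift(scores, outcomes, depth):
    """Response rate in the top `depth` fraction divided by the overall rate."""
    ranked = sorted(zip(scores, outcomes), key=lambda pair: pair[0], reverse=True)
    n_selected = max(1, round(depth * len(ranked)))
    selected_rate = sum(y for _, y in ranked[:n_selected]) / n_selected
    overall_rate = sum(outcomes) / len(outcomes)
    return selected_rate / overall_rate

# Hypothetical predicted probabilities and observed outcomes (1 = primary).
scores = [0.95, 0.85, 0.75, 0.65, 0.55, 0.45, 0.35, 0.25, 0.15, 0.05]
outcomes = [1, 1, 0, 1, 0, 0, 0, 1, 0, 0]

lift_at_20 = cumulative_lift(scores, outcomes, 0.20)
lift_at_100 = cumulative_lift(scores, outcomes, 1.00)
print(lift_at_20, lift_at_100)
```

Selecting every case always gives a lift of 1, which is why the chart converges to 1 at a depth of 100%.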

Receiver operating characteristic

Plotting the trade-off between sensitivity and false positive fraction across all selected fractions of data creates a _____________________ curve. The curve that is generated by a given model can be compared to a diagonal line running from the lower left to the upper right of the ________________chart. This diagonal line represents the trade-off between sensitivity and false positive fraction for various-sized random selections of cases. The further a model's _________________ curve pushes upward and to the left, away from the diagonal line, the better it is in separating primary and secondary outcomes.

Failure to adjust for separate sampling. The advantage of separate sampling is that you are able to obtain (on the average) a model of similar predictive power with a smaller overall case count. This is in concordance with the idea that the amount of information in a data set with a categorical outcome is determined not by the total number of cases in the data set itself, but instead by the number of cases in the rarest outcome category.

Prediction estimates reflect target proportions in the training sample, not the population from which the sample was drawn. Score Rankings plots are inaccurate and misleading. Decision-based statistics related to misclassification or accuracy misrepresent the model performance on the population.

generates principal components to be used as inputs for successor nodes.

Principal Components Node

equals the percent of concordant cases plus one-half times the percent of tied cases

ROC index (concordance) ?
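
The definition on this card can be checked by enumerating every pairing of a primary case with a secondary case. The sketch below is illustrative only; the function name and the scores are hypothetical.

```python
def roc_index(primary_scores, secondary_scores):
    """Fraction of concordant pairs plus one-half the fraction of tied pairs."""
    concordant = 0
    tied = 0
    for p in primary_scores:
        for s in secondary_scores:
            if p > s:
                concordant += 1   # primary case correctly ranked higher
            elif p == s:
                tied += 1         # equal scores count as half
            # p < s is a discordant pair and contributes nothing
    pairs = len(primary_scores) * len(secondary_scores)
    return (concordant + 0.5 * tied) / pairs

# Hypothetical predicted scores: 4 concordant, 1 tied, 1 discordant pair.
index = roc_index([0.9, 0.6, 0.4], [0.6, 0.3])
print(index)
```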

140% The answer is derived from the cumulative lift chart at a depth of 20%

Refer to the graphs shown on the two exhibit slides. The graphs are from a study of response rate to a marketing campaign. How much more likely are the top 20% of targeted respondents to purchase the product than a randomly selected sample?

The answer is 17.6576 The measure of association is provided as the Chi Square value in the image.

Refer to the output from the StatExplorer node shown in the image. What is the measure of association of variable used_ind with the target?

fits both linear and logistic regression models.

Regression Node

Specify whether to apply the "Reject levels count threshold" property.

Reject Vars with Excessive Class Values

Specify whether to mark variables with more than "Max. Missing Percent" as REJECTED.

Reject Vars with Excessive Missing Values

Overfitting

Removing redundant or irrelevant inputs from a training data set often reduces __________________and improves prediction performance

replace specific values and unknown levels for class variables.

Replacement Node

generates a document for nodes in the process flow.

Reporter Node

used to run SAS code.

SAS Code Node

Multi-way Splits

SAS Enterprise Miner enables a multitude of variations of the default tree algorithm. One variation involves the use of __________________instead of binary splits. _________________affect decision trees in the following ways: trades tree height for tree width, complicates the split search, and uses heuristic shortcuts.

orphan nodes SAS Enterprise Miner stopping rules help to avoid orphan nodes, control sensitivity, and grow large trees. Increasing the minimum leaf size will avoid orphan nodes. For large data sets, you might want to increase the maximum leaf setting to obtain additional modeling resolution. If you want very large trees and use the chi-square split-worth criterion, deactivate the Split Adjustment option. Review: Stopping Rules

SAS Enterprise Miner stopping rules help to avoid which of the following:

extracts the source code and score metadata to a folder. Must be preceded by a score node.

Score Code Export Node

applies the score code from the preceding training path to a dataset with a role of Score. Can also be used to validate against training/validation data.

Score Node

Survival Data Mining.

Survival Node

analyze the autocorrelation and crosscorrelation of the time series data.

TS Correlation

provides time series data cleaning, summarization, transformation and transpose.

TS Data Preparation Node

reduces the dimension of time series using Discrete Wavelet Transform, Discrete Fourier Transform, Singular Value Decomposition, or Line Segment Approximation.

TS Dimension Reduction

generates forecasts by using exponential smoothing models with optimized smoothing weights for many time series.

TS Exponential Smoothing

The number of cluster centers With k selected, the k-means algorithm chooses cases to represent the initial cluster centers (also named seeds).

The 'k' in k-means clustering represents what?

decision tree and neural network The Rule Induction tool makes predictions using a combination of rules (from decision trees) and a formula (from a neural network, by default).

The Rule Induction tool combines which of the following models?

The overall distribution of bargain item sales is approximately normal and Segment 1 contains stores selling fewer than average bargain items. The segment profile node provides a graphical representation of overall input distribution - the red outlined bars - as well as segment-specific input distribution - the blue shaded bars. The red outlined bars specify overall case distribution and the blue shaded bars represent segment-specific case distribution.

The SAS data set retail contains information on the count of retail stores sales based on the following item types: bargain, essential, gourmet, and health. Based on the results from the Cluster Profile node, which statement is true?

Model Comparison

The ______________________ tool collates statistics from different modeling nodes for easy comparison. The output below shows some of the statistics that the _______________________ tool generates, such as the misclassification rate, the average squared error, the ROC index, and the Kolmogorov-Smirnov statistic.

Variance

The basis for calculating worth with the _________ method is the difference between the target value for each case outcome (in the node or subset) and the average of the target values for that case outcome (in the node or subset). These differences are squared and then summed to calculate worth.
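
A small Python sketch of the squared-deviation calculation follows. It assumes that split worth is taken as the drop in the sum of squared deviations from the parent node to its child subsets; that framing, the function names, and the target values are assumptions for illustration.

```python
def sum_squared_deviation(values):
    """Sum of squared differences between each target value and the node mean."""
    node_mean = sum(values) / len(values)
    return sum((v - node_mean) ** 2 for v in values)

def variance_reduction(parent, children):
    """Assumed worth: parent sum of squares minus the children's total."""
    return sum_squared_deviation(parent) - sum(
        sum_squared_deviation(child) for child in children
    )

# Hypothetical interval target values in a node and in two child subsets.
parent = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
children = [[1.0, 2.0, 3.0], [10.0, 11.0, 12.0]]
worth = variance_reduction(parent, children)
print(worth)
```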

Entropy

The basis for calculating worth with the ___________method is the product of the proportion of individual class outcomes in the cases (for all cases and for the weighted subsets of the cases) and the log (base 2) of the proportion of individual class outcomes in the cases (for all cases and for the weighted subsets of the cases). This product is summed over all classes to calculate worth.
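
The proportion-times-log calculation described here is the usual entropy formula. The sketch below computes it for a node and for weighted child subsets; treating worth as the reduction in entropy is an assumption, as are the function names and class labels.

```python
import math
from collections import Counter

def entropy(labels):
    """Negative sum over classes of proportion times log2(proportion)."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def entropy_reduction(parent, children):
    """Parent entropy minus the size-weighted entropy of the child subsets."""
    n = len(parent)
    weighted = sum(len(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted

# Hypothetical class outcomes in a node and two candidate splits.
parent = ["a"] * 4 + ["b"] * 4
mixed_split = [["a", "a", "a", "b"], ["a", "b", "b", "b"]]
pure_split = [["a"] * 4, ["b"] * 4]
print(entropy(parent), entropy_reduction(parent, mixed_split),
      entropy_reduction(parent, pure_split))
```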

the highest logworth The best split for an input is the split that yields the highest logworth. Review: Split Search

The best split for an input is a split that yields what?

Split Search

The first part of the algorithm is called the _______________. The _______________ starts by selecting an input for partitioning the available training data.

none of the above

The importance of an input variable in predicting a target in an MLP-based neural network can be figured out by which of the following?

Test

The ______________ data set has only one use: to give a final honest estimate of generalization. Consequently, cases in the __________ set must be treated just as new data would be treated. Therefore, they cannot be involved in the determination of the fitted prediction model.

Inputs

This term is also known as predictors, features, explanatory variables, or independent variables?

Target

This term is also known as response, outcome or dependent variables?

Categorical

To decrease a model's degrees of freedom and prevent model overfitting, you can consolidate _________________ inputs. First, each level of a _________________ variable is transformed into a numeric value using dummy variables. Because _________________ variables can have many levels, this can result in the addition of many variables to the model. To reduce the degrees of freedom used by the model, several levels of the _________________ variable can be assigned to a single dummy variable.

applies transformations to dataset variables and helps to equally distribute variables that are skewed.

Transform Variables Node

increase the performance of logistic regression Predictor variables with a high degree of right or left skewness are often transformed so that their distributions become approximately symmetrical, which improves the predictive performance of any regression-type model, including logistic regression.

Transformations of input variables to make their distributions more symmetric will likely have what impact in a logistic regression?

model a class and interval target variable. The interval target is usually the value that is associated with a level of the class target.

TwoStage Node

Accommodate

Unlike standard regression models, neural networks easily accommodate nonlinear and nonadditive associations between inputs and target. In fact, the main challenge is over-accommodation, that is, falsely discovering nonlinearities and interactions.

Expected

Using a completed profit matrix, SAS Enterprise Miner can calculate the ______________profit associated with each action (decision). The ____________profit is equal to the sum of the outcome/action profits multiplied by the outcome probabilities.
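
The calculation on this card is a probability-weighted sum for each action. The Python sketch below shows it for a single case; the profit values, outcome probabilities, action names, and function name are all hypothetical.

```python
def expected_profits(profit_matrix, outcome_probabilities):
    """For each action, sum outcome/action profit times outcome probability."""
    actions = next(iter(profit_matrix.values())).keys()
    return {
        action: sum(
            outcome_probabilities[outcome] * profit_matrix[outcome][action]
            for outcome in profit_matrix
        )
        for action in actions
    }

# Hypothetical profit matrix: rows are outcomes, columns are actions.
profit_matrix = {
    "primary": {"solicit": 14.5, "ignore": 0.0},
    "secondary": {"solicit": -0.5, "ignore": 0.0},
}
# Hypothetical predicted outcome probabilities for one case.
probabilities = {"primary": 0.1, "secondary": 0.9}

profits = expected_profits(profit_matrix, probabilities)
best_action = max(profits, key=profits.get)
print(profits, best_action)
```

The decision for the case is then the action with the largest expected profit.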

divides a set of input variables into disjoint or hierarchical clusters.

Variable Clustering Node

provides a tool to reduce the number of input variables using R-square, and Chi-Square selection criteria and so on.

Variable Selection Node

Interval

Variance and ProbF logworth evaluate split worth for which type of variables?

Model implementation is the process of applying your model to data other than your training data. Model implementation is the process of applying your model to score data in order to make predictions. Review: What Is Model Implementation?

What is model implementation?

Sample, Explore, Modify, Model, and Assess.

What is the SEMMA architecture?

The number of cases required to build a model is reduced, with little reduction in model quality. The main advantage of separate sampling is that the number of cases required to build a model is reduced, with little reduction in model quality. Review: What is Separate Sampling?

What is the main advantage of separate sampling?

Discordance

When a pair of primary and secondary cases is incorrectly ordered, the pair is said to be in ___________________.

Sample Properties, Sample Statistics, sample data for the selected variable and a histogram of the data for the selected variable.

When the Explore button is selected in the graphic, what information will be displayed?

Logit

When the target variable is binary, as in the demonstration data, the main neural network regression equation receives the same __________ link function featured in logistic regression. As with logistic regression, the weight estimation process changes from least squares to maximum likelihood.

Fit Statistics can provide information that affects decision predictions but does not affect estimate predictions. Fit Statistics can provide information that affects both decision predictions and estimate predictions. If the decision predictions are of interest, model fit can be judged by misclassification. If estimate predictions are the focus, model fit can be assessed by average squared error. Review: Interpreting the Results of the Regression Tool

Which of the following is not true about results produced by the Regression node?

missing values The most prevalent problem for neural networks is missing values. Like regressions, neural networks require a complete record for estimation and scoring. Extreme or unusual values also present a problem for neural networks. This problem is mitigated somewhat by the hyperbolic tangent activation functions in the hidden units. Review: Beyond the Prediction Formula

Which of the following is the most prevalent problem for neural networks?

Increasing a model's degrees of freedom increases the chances of a model overfitting.

Which of the following is true about collapsing categorical inputs?

Cutoff Value

Within the replacement node, you can control the number of standard deviations by selecting the ___________________ property.

Profit

You can enhance the process of model selection by using _______________as a measure of model performance. _____________ is used to weight outcomes to emphasize or de-emphasize their importance in generating statistical measures.

AutoNeural

You can use the ______________ tool to automatically explore alternative network architectures and hidden unit counts. The ________________ tool conducts limited searches in order to find better network configurations. There are several options that the tool uses to control the algorithm.

Estimate

____________ predictions approximate the expected value of the target, conditional on input values. For cases with numeric targets, this number can be thought of as the average value of the target for all cases having the observed input measurements. For cases with categorical targets, this number might equal the probability of a particular target outcome.

Linear Linear regression predicts values by fitting a linear equation to available data. If you plot the data, linear regression draws a line through the values. The line represents the predicted value of Y for each value of the input X.

_____________ regression is used by the Regression tool if you have a continuous response (interval measurement level).

Sequence

______________ analysis is a type of association analysis that analyzes the order in which services or products are acquired.

Sensitivity Charts

______________ contrast the sensitivity statistic (the ability to detect primary outcome cases) versus the selection fraction or false positive fraction.

Sample, Sampling

_________________ is recommended for extremely large databases because it can significantly decrease model training time. If the ______________is sufficiently representative, relationships found in the ________________can be expected to generalize to the complete data set. The ____________ tool writes the ___________ observations to an output data set. It saves the seed values that are used to generate the random numbers for the ___________ so that you can replicate the ________________.

Novelty Detection

_________________ methods seek unique or previously unobserved data patterns. These methods are used in business, science, and engineering. Business applications include fraud detection, warranty claims analysis, and general business process monitoring.

Response Rate Charts

_____________________ plot the proportion of primary outcome cases selected (or transformations of this statistic) versus the selection fraction.

Concordance

_______________ measures the fraction of primary-target cases with a predicted score that exceeds the predicted score of secondary-target cases.

Estimates

_______________predictions approximate the expected value of the target, conditioned on the input values. For cases with numeric targets, this number can be thought of as the average value of the target for all cases having the observed input measurements.

identifies the expected revenues and expected costs for each decision alternative for each level of a target variable

profit or loss? A. describes the ability of the model to separate the primary and secondary outcomes B. identifies the expected revenues and expected costs for each decision alternative for each level of a target variable
