ACCT 3130, Learnsmart Chapter 3
Salaries offered by other accounting firms, health of the economy, and current professional salaries.
In the example provided in the text regarding employee turnover, the analyst is trying to predict employee turnover based on current professional salaries, health of the economy (GDP), and salaries offered by other accounting firms. In this scenario, select the explanatory variables.
Identify the attribute you would like to reduce or focus on. Filter the results. Interpret the results. Follow up on the results.
Place the steps of data reduction in order.
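The four data reduction steps can be sketched in a few lines of Python. This is only an illustration: the transaction records, field names, and the 5,000 cutoff are invented assumptions, not taken from the text.

```python
# Hypothetical transaction records; field names and amounts are made up.
transactions = [
    {"id": 1, "vendor": "A", "amount": 120.00},
    {"id": 2, "vendor": "B", "amount": 9800.00},
    {"id": 3, "vendor": "A", "amount": 45.50},
    {"id": 4, "vendor": "C", "amount": 15250.00},
]

# Step 1: identify the attribute to reduce or focus on (here, the amount).
THRESHOLD = 5000.00  # assumed cutoff, for illustration only

# Step 2: filter the results.
large = [t for t in transactions if t["amount"] > THRESHOLD]

# Step 3: interpret the results (a short list that a person can review).
for t in large:
    print(f"Transaction {t['id']} to vendor {t['vendor']}: {t['amount']:.2f}")

# Step 4: follow up on the results, e.g. hand the IDs to a reviewer.
follow_up_ids = [t["id"] for t in large]
```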
Identify the classes you wish to predict. Manually classify an existing set of records. Select a set of classification models. Divide your data into training and testing sets. Generate your model. Interpret the results and select the "best" model.
Place the steps of classification in order.
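The classification steps can be walked through with a minimal sketch. Everything here is an assumption made for illustration: the loan records are invented, the split is fixed rather than random, and the "models" are simple one-feature cutoff rules rather than anything the text prescribes.

```python
from statistics import mean

# Steps 1-2: the classes are "approved" and "rejected", and each record
# has already been classified by hand.
# Each record is (debt_to_income, risk_score, label); all values are invented.
records = [
    (10, 720, "approved"), (15, 700, "approved"), (35, 640, "rejected"),
    (40, 600, "rejected"), (12, 650, "approved"), (38, 690, "rejected"),
    (14, 710, "approved"), (36, 620, "rejected"),
    (20, 660, "approved"), (33, 680, "rejected"),
]

# Step 4: divide the data into training and testing sets (fixed split here).
train, test = records[:6], records[6:]

def fit_threshold(rows, col):
    """Step 5: build a one-feature rule with a cutoff midway between class means."""
    approved = mean(r[col] for r in rows if r[2] == "approved")
    rejected = mean(r[col] for r in rows if r[2] == "rejected")
    cutoff = (approved + rejected) / 2
    approved_below = approved < rejected  # which side of the cutoff is "approved"

    def predict(row):
        below = row[col] < cutoff
        return "approved" if below == approved_below else "rejected"

    return predict

def accuracy(model, rows):
    """Share of rows whose predicted class matches the manual label."""
    return sum(model(r) == r[2] for r in rows) / len(rows)

# Step 3: the candidate models, one per feature column.
candidates = {"debt_to_income": 0, "risk_score": 1}

# Step 6: evaluate each model on the test set and select the "best" one.
scores = {name: accuracy(fit_threshold(train, col), test)
          for name, col in candidates.items()}
best = max(scores, key=scores.get)
print(scores, best)
```

The test set is used only to score the models, never to fit them, which is why the records are split before any cutoff is computed.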
Identify the objects or activity you want to profile.
Place the steps of profiling in order.
False
True or False: Classification requires that we know a great deal about the observation that we're attempting to place in a class.
False
True or False: Dependent variables can only be explained by one independent variable.
To identify groups of similar data elements and the underlying drivers of these groups.
What is the purpose of clustering?
To gain an understanding of a typical behavior of an individual, population, group, or sample.
What is the purpose of profiling?
It allows analysts to develop models to predict expected outcomes.
What is the purpose of regression analysis?
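As a rough illustration of predicting an expected outcome, the sketch below fits an ordinary least squares line with a single explanatory variable. The data are invented (a hypothetical salary gap versus competing firms and a turnover rate); a real model could use several explanatory variables at once.

```python
from statistics import mean

# Hypothetical data: salary gap versus competing firms (in $ thousands)
# and the observed turnover rate (in percent). Values are invented.
salary_gap = [1.0, 2.0, 3.0, 4.0]
turnover = [3.0, 5.0, 7.0, 9.0]

x_bar, y_bar = mean(salary_gap), mean(turnover)

# Ordinary least squares for one explanatory variable.
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(salary_gap, turnover))
sxx = sum((x - x_bar) ** 2 for x in salary_gap)
slope = sxy / sxx
intercept = y_bar - slope * x_bar

def predict(gap):
    """Expected turnover rate for a given salary gap."""
    return intercept + slope * gap

print(slope, intercept, predict(5.0))
```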
Linear Classifiers
What is the term for the items that are useful for ranking observations rather than simply predicting class probability?
It primarily uses structured data that is readily searchable.
Which of the following is true regarding the Data Reduction approach?
Category
A class is a manually assigned ___ applied to a record based on an event.
Decision Boundaries
A technique used to mark the split between one class and another.
Decision Tree
A tool that is used to divide data into smaller groups.
Unsupervised
A/an ___ approach is used when you don't have a specific question and are simply exploring the data for potential patterns of interest.
Manually classify an existing set of records.
After you have identified the classes you wish to predict, what is the next step?
Class
Classification is a supervised method that can be used to predict the ___ of a new observation.
Training Data
Existing data that have been manually evaluated and assigned a class.
Test Data
Existing data used to evaluate the model.
Debt-to-income ratio, length of employment, and credit (risk) score
In the example regarding the LendingClub data in which the analyst is researching loan rejection, they identified three possible indicators for why a loan would be rejected: the debt-to-income ratio, length of employment, and credit (risk) score. Which of the following are explanatory variables?
Patterns
Profiling is an unsupervised method that is used to discover ___ of behavior, based on the distance of the z-score from the mean.
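A z-score measures how many standard deviations an observation lies from the mean. The sketch below profiles a set of invented shipping times and flags any customer whose z-score exceeds an assumed threshold of 2; the customer IDs, values, and threshold are all hypothetical.

```python
from statistics import mean, pstdev

# Hypothetical shipping times in days, keyed by an invented customer ID.
shipping_days = {"C1": 2, "C2": 3, "C3": 3, "C4": 4,
                 "C5": 3, "C6": 2, "C7": 4, "C8": 15}

values = list(shipping_days.values())
mu = mean(values)
sigma = pstdev(values)  # population standard deviation

# z-score: distance from the mean, measured in standard deviations.
z_scores = {cust: (days - mu) / sigma for cust, days in shipping_days.items()}

Z_THRESHOLD = 2.0  # assumed boundary for "unusual" behavior
exceptions = [cust for cust, z in z_scores.items() if abs(z) > Z_THRESHOLD]
print(exceptions)
```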
Supervised
Regression is a ___ method used to predict specific values given an explanatory variable (or variables).
Searchable
Structured data is stored in a database or spreadsheet and is readily ___.