Acct 3130: Chapter 3 (Smartbook)
After you have identified the attribute you would like to reduce or focus on, what is the next step?
Filter the results.
True or false: Classification requires that we know a great deal about the observation that we're attempting to place in a class.
False
A specific type of data profiling that is used to look for correspondences between portions, or segments, of text for potential matches is called a ___ match.
Fuzzy
Classification predicts a class for a new observation based on the ___ identification of classes from previous observations.
Manual
After you have identified the classes you wish to predict, what is the next step?
Manually classify an existing set of records.
Profiling is used to discover ___ of behavior, based on the distance of z-scores from the mean.
Patterns
Clustering is an unsupervised method that is used to find ___ of similar data elements and the underlying relationships of those groups.
groupings
______ looks for similarities between portions, or segments, of the text of each potential match.
Fuzzy match
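A minimal sketch of a fuzzy match, assuming Python's standard `difflib.SequenceMatcher` as the similarity measure (one of several possible measures, and not necessarily the one the textbook has in mind). The vendor names and the helper `similarity` are made up for illustration.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Return a 0-to-1 similarity ratio between two text segments, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

# Hypothetical vendor names: the first two are likely the same vendor.
name_a = "Acme Supply Co."
name_b = "ACME Supply Company"
name_c = "Zenith Travel"

print(similarity(name_a, name_b))  # relatively high: a potential match
print(similarity(name_a, name_c))  # relatively low: probably not a match
```

In practice an analyst would choose a cutoff ratio and review the pairs above it by hand; the right cutoff depends on the data.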
What is the purpose of Data Reduction?
To reduce the amount of detailed information considered to focus on the most interesting or abnormal items.
In the example regarding the LendingClub data in which the analyst is researching loan rejection, they identified three possible indicators for why a loan would be rejected: the debt-to-income ratio, length of employment, and credit [risk] score. Which is the dependent variable?
Loan rejection
____ data are existing data that have been manually evaluated and assigned a class. ___ data are existing data used to evaluate the model.
Training; Test
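A minimal sketch of separating already-classified records into training and test data. The 70/30 split, the fixed seed, and the record layout are assumptions for illustration, not something the text prescribes.

```python
import random

def split_train_test(records, test_fraction=0.3, seed=42):
    """Shuffle a copy of the records and split it into (training, test) lists."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical loan records that have already been manually assigned a class.
labeled = [{"loan_id": i, "rejected": i % 2 == 0} for i in range(10)]

train_set, test_set = split_train_test(labeled)
print(len(train_set), len(test_set))  # 7 3
```

The model is built from `train_set` only; `test_set` is held back to evaluate it.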
In the following question, what would be the target? Given a set of customer data, we are trying to predict the total transaction amount based on a variety of attributes.
Transaction amount
Decision support systems are an example of ______.
prescriptive analytics
Regression is a/an ___ method used to predict specific values given an explanatory variable (or variables).
supervised
A decision ___ is a tool used to divide data into smaller groups. Decision ____ mark the split between one class and another.
tree; boundaries
A/an ___ approach is used when you don't have a specific question and are simply exploring the data for potential patterns of interest.
unsupervised
Knowing the mean and standard deviation, and assuming a normal distribution, which statistic can one compute to identify abnormal transactions?
z-score
What is XBRL used for?
to facilitate the exchange of financial reporting information between a company and the SEC.
Select the appropriate definition for regression:
A method used to predict specific values
Place the steps of profiling in order, from 1 through 5.
1.) Identify the objects or activity you want to profile. 2.) Determine the types of profiling you want to perform. 3.) Set boundaries or thresholds for the activity. 4.) Interpret the results and monitor the activity and/or generate a list of exceptions. 5.) Follow up on exceptions.
Any transaction that has a z-score of ______ or above would represent an abnormal transaction.
3
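A minimal sketch of the z-score test in this card, where z = (value - mean) / standard deviation. The transaction amounts and the helper names `z_scores` and `flag_abnormal` are made up for illustration, and using the population standard deviation is an assumption.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Return each value's distance from the mean, in standard deviations."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (assumed here)
    if sigma == 0:
        return [0.0 for _ in values]  # all values identical: nothing is abnormal
    return [(v - mu) / sigma for v in values]

def flag_abnormal(values, threshold=3.0):
    """Return the values whose z-score is at or above the threshold."""
    return [v for v, z in zip(values, z_scores(values)) if z >= threshold]

# Hypothetical transaction amounts: 19 ordinary ones and one very large one.
amounts = [100.0] * 19 + [5000.0]
print(flag_abnormal(amounts))  # [5000.0]
```

This follows the card literally and flags only unusually high values; comparing `abs(z)` to the threshold would also flag unusually low ones.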
Select the correct definition of class.
A manually assigned category applied to a record based on an event.
Select the correct definition of a target.
An expected attribute or value that we want to evaluate.
______ is an observation about the frequency of leading digits in many real-life sets of numerical data.
Benford's law
In the example regarding the LendingClub data in which the analyst is researching loan rejection, they identified three possible indicators for why a loan would be rejected: the debt-to-income ratio, length of employment, and credit [risk] score. Which of the following is/are the explanatory variable(s)?
Debt-to-income ratio; length of employment; credit [risk] score
______ are designed to be interactive and adapt to the information collected by the user.
Decision support systems
After you have identified the objects or activity you wish to profile, what should you do next?
Determine the types of profiling you want to perform.
What is the purpose of regression analysis?
It allows analysts to develop models to predict expected outcomes.
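A minimal sketch of regression as a predictive model, assuming the simplest case of one explanatory variable fitted by ordinary least squares. The data points and the helper `fit_line` are made up for illustration; the text's examples use several explanatory variables.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical observations: xs is the explanatory variable, ys the dependent variable.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]

slope, intercept = fit_line(xs, ys)
predicted = slope * 6 + intercept  # expected outcome for a new observation, x = 6
print(slope, intercept, predicted)  # 2.0 1.0 13.0
```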
Which of the following is true regarding the profiling approach?
It is generally performed on data that is readily available.
Which of the following is true regarding the Data Reduction approach?
It primarily uses structured data that is readily searchable.
What is the term for the items that are useful for ranking observations rather than simply predicting class probability?
Linear classifiers
______ might be used to identify areas where there is a lack of controls, changes in procedures, or individuals more willing to spend excessively in potential types of T&E expenses which might be associated with higher risk.
Profiling
What is the term for removing branches from a decision tree to avoid overfitting the model?
Pruning
A/an ___ approach is used when you are performing analysis that uses historical data to predict a future outcome based on a specific question.
Supervised
What is the purpose of profiling?
To gain an understanding of a typical behavior of an individual, group, population, or sample.
What is the purpose of clustering?
To identify groups of similar data elements and the underlying relationship of these groups.
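A minimal sketch of clustering, assuming one-dimensional k-means as the technique (the text does not name a specific algorithm). The data, the starting centers, and the helper `kmeans_1d` are made up for illustration.

```python
def kmeans_1d(values, centers, iterations=10):
    """Group values around the nearest center, then move each center to its group mean."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # An empty cluster keeps its previous center.
        centers = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters

# Hypothetical payment amounts with two natural groupings.
payments = [10, 12, 11, 100, 105, 98]
centers, clusters = kmeans_1d(payments, centers=[0, 50])
print(centers)   # [11.0, 101.0]
print(clusters)  # [[10, 12, 11], [100, 105, 98]]
```

No classes are supplied in advance; the groups emerge from the data, which is what makes the method unsupervised.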
What is the purpose of classification?
To predict which class an observation that we know little about will belong to.
Profiling is a/an ___ analytics method that is used to discover patterns of behavior, based on the distance of z-scores from the mean.
diagnostic
Generally, the more complex and complete the model, the higher the degree of the model ______ the data.
overfitting
In the profiling example regarding T&E Expenses, which of the following is NOT one of the areas that the analyst would try to uncover?
significant variances in standard cost
Clustering is a/an ___ method that is used to find natural groupings within the data.
unsupervised
In the example provided in the text regarding employee turnover, the analyst is trying to predict employee turnover based on current professional salaries, health of the economy (GDP), and salaries offered by other accounting firms. In this scenario, select the explanatory variable(s). Select all that apply.
Salaries offered by other accounting firms; health of the economy (GDP); current professional salaries
Place the steps of Data Reduction in order:
1.) Identify the attribute you would like to reduce or focus on. 2.) Filter the results. 3.) Interpret the results. 4.) Follow up on the results.
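A minimal sketch of the first two Data Reduction steps, assuming a small made-up list of expense records. The field names, the 500.00 threshold, and the helper `filter_records` are illustrative assumptions.

```python
# Hypothetical structured expense records.
expenses = [
    {"id": 1, "employee": "Avery", "amount": 42.50},
    {"id": 2, "employee": "Blake", "amount": 1250.00},
    {"id": 3, "employee": "Casey", "amount": 18.75},
    {"id": 4, "employee": "Blake", "amount": 980.00},
]

# Step 1: identify the attribute to reduce or focus on.
ATTRIBUTE = "amount"
THRESHOLD = 500.00  # assumed cutoff for "interesting" items

def filter_records(records, attribute, threshold):
    """Step 2: keep only the records whose attribute exceeds the threshold."""
    return [r for r in records if r[attribute] > threshold]

flagged = filter_records(expenses, ATTRIBUTE, THRESHOLD)

# Steps 3 and 4 (interpret, follow up) are left to the analyst.
for record in flagged:
    print(record["id"], record["employee"], record["amount"])
```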
True or False: Dependent variables can only be explained by a maximum of one independent variable.
False
True or False: Diagnostic analytics forecast future performance.
False
True or False: The alternative hypothesis assumes the hypothesized relationship does not exist.
False
True or False: The co-occurrence grouping data approach is associated with predictive analytics.
False
True or False: Time series analysis is a predictive analytics technique used to predict future values based on past values of other variables.
False
True or false: When clustering works well, observations within a cluster should be different, and the data across clusters should be very similar.
False
______ include both unsupervised exploratory analysis and supervised model generation to provide insight and predictive foresight into the business and decisions made by accountants and auditors.
Machine learning and artificial intelligence
Which analytics type works to identify the best possible options given constraints or changing conditions?
Prescriptive analytics
Which of the following data approaches are associated with diagnostic analytics?
Profiling
Structured data are stored in a database or spreadsheet and are readily ____.
Searchable
XBRL is used to facilitate the exchange of financial reporting information between the company and the ______.
Securities and Exchange Commission (SEC)
Benford's law states that in many naturally occurring collections of numbers, the significant leading digit is likely to be ______.
Small
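A minimal sketch of why small leading digits dominate under Benford's law, which gives the expected share of leading digit d as log10(1 + 1/d): about 30.1% for 1 and about 4.6% for 9. The helper names and the use of powers of 2 as sample data are assumptions for illustration.

```python
import math
from collections import Counter

def benford_expected(d):
    """Expected share of numbers whose leading digit is d (1 through 9)."""
    return math.log10(1 + 1 / d)

def leading_digit(x):
    """Return the first non-zero digit of a non-zero number."""
    return int(str(abs(x)).lstrip("0.")[0])

def leading_digit_frequencies(values):
    """Observed share of each leading digit, skipping zeros."""
    digits = [leading_digit(v) for v in values if v != 0]
    counts = Counter(digits)
    return {d: counts.get(d, 0) / len(digits) for d in range(1, 10)}

# Sample data (an assumption): powers of 2 are known to follow Benford's law closely.
observed = leading_digit_frequencies([2 ** k for k in range(1, 101)])
for d in range(1, 10):
    print(d, round(benford_expected(d), 3), round(observed[d], 3))
```

Comparing the observed column with the expected column is the basic idea behind using Benford's law to spot unusual sets of numbers.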
In the example of profiling for management accounting regarding Advanced Environmental Recycling Technologies, what are they looking for significant variances in?
Standard Cost
A class is a manually assigned ___ applied to a record based on an event.
category
Using a ___ model, you can predict whether a new vendor belongs to one class or another based on the behavior of others.
classification
When evaluating classifiers, you need to be careful to strike a balance between what two things?
complexity of the model and accuracy of the classification
Variance analysis, a common practice in management accounting, is an example of ______ analytics.
diagnostic
In the example provided in the text regarding employee turnover, the analyst is trying to predict employee turnover based on current professional salaries, health of the economy (GDP), and salaries offered by other accounting firms. In this scenario, what is the dependent variable?
employee turnover