chapter 3 smartbook
decision support systems are an example of _____
Prescriptive analytics
clustering is an unsupervised method that is used to find natural ____ within the data
groupings
what is the terminology for the items that are useful for ranking observations rather than simply predicting class probability?
linear classifiers
______ might be used to identify areas where there is a lack of controls, changes in procedures, or individuals more willing to spend excessively in potential types of T&E expenses which might be associated with higher risk
profiling
profiling is a/an _____ method that is used to discover patterns of behavior, based on the distance of z-scores from the mean
unsupervised
machine learning, artificial intelligence, and decision support systems are all examples of ____
prescriptive analytics
what is the terminology for removing branches from a decision tree to avoid overfitting the model
pruning
structured data is stored in a database or spreadsheet and are readily ____
searchable
what body mandates submission of XBRL to facilitate the exchange of financial reporting information?
Securities and Exchange Commission (SEC)
A/an _________ approach is used when you don't have a specific question and are simply exploring the data for potential patterns of interest
unsupervised
clustering is a/an _______ method that is used to find natural groupings within the data.
unsupervised
knowing the mean and standard deviation, and assuming a normal distribution, one can compute which statistic that can be used to identify abnormal transactions?
z-score
what is the purpose of profiling?
to gain an understanding of typical behavior of an individual, group, population, or sample
_____ data are existing data that have been manually evaluated and assigned a class. ____ data are existing data used to evaluate the model
1. training 2. test
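note: a minimal sketch of dividing already-classified records into training and test data, using only the Python standard library. The function name `split_train_test`, the 30% test share, and the fixed seed are assumptions made for illustration; they do not come from the chapter.

```python
import random

def split_train_test(records, test_fraction=0.3, seed=0):
    """Shuffle a copy of the records, then hold back a share as test data."""
    shuffled = list(records)               # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_test = round(len(shuffled) * test_fraction)
    test = shuffled[:n_test]               # used only to evaluate the model
    train = shuffled[n_test:]              # used to build the model
    return train, test

# hypothetical records: ten manually classified loan applications, by id
train, test = split_train_test(list(range(10)))
print(len(train), len(test))
```

Keeping the test records out of model building is what lets them give an honest read on how the model handles data it has not seen.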
A decision _______ is a tool used to divide data into smaller groups. Decision ______ are a technique used to mark the split between one class and another
1. tree 2. boundaries
any transaction that has a z-score of _____ or above would be considered abnormal
3
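note: a minimal sketch of applying the cutoff above, where a z-score is (value - mean) / standard deviation. The function name `flag_abnormal`, the use of the population standard deviation, and the sample amounts are assumptions for illustration. Following the card, only values at or above the cutoff are flagged, so unusually low values are not reported.

```python
from statistics import mean, pstdev

def flag_abnormal(amounts, threshold=3.0):
    """Return the amounts whose z-score is at or above the threshold."""
    mu = mean(amounts)
    sigma = pstdev(amounts)  # population standard deviation (an assumption)
    if sigma == 0:
        return []            # every amount is identical, so nothing stands out
    return [x for x in amounts if (x - mu) / sigma >= threshold]

# hypothetical data: twenty routine charges and one very large one
amounts = [100.0] * 20 + [1000.0]
print(flag_abnormal(amounts))
```

With very few observations no value can reach a z-score of 3, so this kind of check only makes sense on a reasonably large set of transactions.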
in the example regarding the LendingClub data in which the analyst is researching loan rejection, they identified three possible indicators for why a loan would be rejected, the debt-to-income ratio, length of employment, and credit [risk] score. which of the following is/are the explanatory variable(s) a. credit [risk] score b. length of employment c. debt-to-income ratio d. loan rejected
a, b, c
in the example provided in the text regarding employee turnover, the analyst is trying to predict employee turnover based on current professional salaries, health of the economy (GDP), and salaries offered by other accounting firms. in this scenario, select the explanatory variable(s) a. employee turnover b. salaries offered by other accounting firms c. current professional salaries d. health of the economy
b, c, d
____ looks for similarities between portions, or segments, of the text of each potential match a. fuzzy match b. similarity match c. text matching d. data matching
a. fuzzy match
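note: a minimal sketch of a fuzzy match using `difflib.SequenceMatcher` from the Python standard library, which scores how much of two strings lines up. The helper names, the 0.8 cutoff, and the vendor names are hypothetical; the chapter does not prescribe a particular tool or threshold.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Score from 0.0 to 1.0 for how closely two strings match, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def fuzzy_matches(name, candidates, threshold=0.8):
    """Return the candidates that are similar enough to name."""
    return [c for c in candidates if similarity(name, c) >= threshold]

# hypothetical vendor names: one near-duplicate, one unrelated
vendors = ["ACME Supply Co.", "Zenith Partners"]
print(fuzzy_matches("Acme Supply Co", vendors))
```

An exact comparison would miss "ACME Supply Co." because of the capitalization and the trailing period, which is the kind of near-duplicate a fuzzy match is meant to catch.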
after you have identified the classes you wish to predict, what is the next step? a. select a set of classification models b. manually classify an existing set of records c. generate your model d. interpret the results and select the "best" model
b.
what is the purpose of data reduction? a. to predict the class of a new observation b. to reduce the amount of detailed information considered to focus on the most interesting or abnormal items c. to estimate or predict, for each unit, the numerical value of some variable. d. to gain an understanding of a typical behavior of an individual, group, population, or sample.
b.
when evaluating classifiers, you need to be careful to strike a balance between what two things? a. explanatory and response variables b. complexity of the model and accuracy of the classifications c. positive and negative relationships in the model
b.
place the steps of data reduction in order: a. follow up on the results b. identify the attribute you would like to reduce or focus on c. interpret the results d. filter the results
b. d. c. a.
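note: a minimal sketch of the first two data reduction steps above (pick an attribute, then filter on it). The function name `reduce_data`, the expense records, and the 1000 cutoff are all hypothetical. Interpreting the results and following up remain manual steps.

```python
def reduce_data(records, attribute, keep):
    """Keep only the records whose chosen attribute passes the keep test."""
    return [r for r in records if keep(r[attribute])]

# hypothetical expense records
expenses = [
    {"employee": "A", "amount": 120.0},
    {"employee": "B", "amount": 4300.0},
    {"employee": "C", "amount": 85.5},
]

# step 1: focus on the amount attribute; step 2: filter for large items
flagged = reduce_data(expenses, "amount", lambda amount: amount >= 1000)
print(flagged)
```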
_______ is an observation about the frequency of leading digits in many real-life sets of numerical data
Benford's law
after you have identified the objects or activity you wish to profile, what should you do next? a. interpret the results and monitor the activity b. set boundaries or thresholds for the activity c. determine the types of profiling you want to perform d. follow up on exceptions
c.
select the correct definition of a target a. a method for simplifying large datasets into obvious categories b. a manually assigned category applied to a record based on an event c. an expected attribute or value that we want to evaluate
c.
select the correct definition of class: a. summary statistics, such as minimums, maximums; and averages in a dataset b. an expected attribute or value that we want to evaluate in a dataset c. a manually assigned category applied to a record based on an event
c.
what is XBRL used for? a. a technique used by analysts to develop models to predict expected outcomes b. to provide a description of each field in the tables of a relational database c. to facilitate the exchange of financial reporting information between a company and the SEC d. to look up correspondences between portions, or segments, of a set of text for a potential match
c.
which of the following is true regarding the profiling approach? a. it is rarely used to assess internal controls b. it is primarily done using unstructured data c. it is generally performed on data that is readily available d. it is never as simple as calculating summary statistics
c.
after you have identified the attribute you would like to reduce or focus on, what is the next step? a. follow up on results b. set boundaries or thresholds for the activity c. filter the results d. interpret the results
c. filter the results
which of the following is true regarding the data reduction approach? a. it is most useful when performed on a small dataset b. it works best when there is not any particular attribute you would like to focus on. c. it primarily uses structured data that is readily searchable.
c. it primarily uses structured data that is readily searchable
in the following question, what would be the target? given a set of customer data, we are trying to predict the total transaction amount based on a variety of attributes a. the entire dataset b. the number of customers c. transaction amount d. customer name
c. transaction amount
_____ are designed to be interactive and adapt to the information collected by the user
decision support systems
variance analysis, a common practice in management accounting, is an example of _______ analytics
diagnostic
place the steps of profiling in order, from 1 through 5. a. interpret the results and monitor the activity and/or generate a list of exceptions b. set boundaries or thresholds for the activity c. follow up on exceptions d. determine the types of profiling you want to perform e. identify the objects or activity you want to profile
e. d. b. a. c.
In the example regarding the LendingClub data in which the analyst is researching loan rejection, they identified three possible indicators for why a loan would be rejected, the debt-to-income ratio, length of employment, and credit [risk] score. Which is the response variable?
loan rejection
______ include both unsupervised exploratory analysis and supervised model generation to provide insight and predictive foresight into the business and decisions made by accountants and auditors
machine learning and artificial intelligence
classification predicts a class for a new observation based on _____ identification of classes from previous observations
manual
generally, the more complex and complete the model, the higher the degree of the model _____ the data
overfitting
profiling is a/an unsupervised method that is used to discover ____ of behavior, based on the distance of z-scores from the mean
patterns
Benford's law states that in many naturally occurring collections of numbers, the significant leading digit is likely to be ____
small
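note: a minimal sketch of why "small" is the answer. Under Benford's law the expected share of leading digit d is log10(1 + 1/d), so 1 leads about 30% of the time and 9 under 5%. The helper names and the sample amounts are hypothetical.

```python
import math
from collections import Counter

def benford_expected(d):
    """Expected share of numbers whose leading digit is d (1 through 9)."""
    return math.log10(1 + 1 / d)

def leading_digit(amount):
    """First nonzero digit of a number, or None if there is none (e.g. 0)."""
    for ch in str(abs(amount)):
        if ch in "123456789":
            return int(ch)
    return None

def observed_shares(amounts):
    """Share of amounts starting with each digit 1 through 9."""
    digits = [d for d in map(leading_digit, amounts) if d is not None]
    counts = Counter(digits)
    return {d: counts[d] / len(digits) for d in range(1, 10)}

# hypothetical amounts, far too few for a real test, just to show the shape
shares = observed_shares([120, 135, 19, 250, 900])
for d in range(1, 10):
    print(d, round(benford_expected(d), 3), shares[d])
```

In practice the observed shares would be computed over a large population of transactions and compared with the expected shares; large gaps are a prompt for follow-up, not proof of a problem.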
in the example of profiling for management accounting regarding Advanced Environmental Recycling Technologies, what are they looking for significant variances in?
standard cost
a/an _____ approach is used when you are performing analysis that uses historical data to predict a future outcome based on a specific question
supervised
A class is a manually assigned ________ applied to a record based on an event
category
classification is a method that can be used to predict the _____ of a new observation
category
when evaluating classifiers, you need to be careful to strike a balance between what two things?
complexity of the model and accuracy of the classification
what is the purpose of clustering? a. to reduce the amount of detailed information considered to focus on the most interesting or abnormal items b. it allows analysts to develop models to predict expected outcomes c. to gain an understanding of typical behavior of an individual, group, population, or sample d. to identify groups of similar data elements and the underlying drivers of these groups
d
in the profiling example regarding T&E expenses, which of the following is NOT one of the areas that the analyst would try to uncover? a. individuals more willing to spend excessively b. change in procedures c. lack of controls d. significant variances in standard cost
d.
place the steps of classification into order: a. divide your data into training and testing sets b. select a set of classification models c. manually classify an existing set of records d. identify the classes you wish to predict e. generate your model f. interpret the results and select the "best" model
d. c. b. a. e. f.
true or false: classification requires that we know a great deal about the observation that we're attempting to place in a class
false
true or false: when clustering works well, observations within a segment should be different, and the data across segments should be very similar
false
a specific type of data profiling that is used to look for correspondences between portions, or segments, of text for potential matches is called a _____ match
fuzzy