ACC 615 Chapter 3 SmartBook, Lecture Video & PowerPoint
After you have identified the attribute you would like to reduce or focus on, what is the next step?
filter the results
descriptive analytics help
summarize what has happened in the past
types of descriptive analytics
summary statistics and data reduction or filtering
pruning
removes branches from a decision tree to avoid overfitting the model
In the profiling example regarding T&E Expenses, which of the following is NOT one of the areas that the analyst would try to uncover?
significant variances in standard cost
__________ is a set of data used to assess the degree and strength of a predicted relationship.
test data
What is XBRL used for?
to facilitate the exchange of financial reporting information between a company and the SEC.
support vector machine
a discriminating classifier that is defined by a separating hyperplane that works first to find the widest margin and then works to find the middle line
decision support systems
rule-based systems that gather data and recommend actions based on the input
a financial accountant would
sum all of the sales transactions within a period to calculate the value for sales revenue that appears on the income statement
A/an ______________ approach is used when you are performing analysis that uses historical data to predict a future outcome based on a specific question.
supervised
Clustering is a/an _______________ method that is used to find natural groupings within the data.
unsupervised
__________ is a discriminating classifier that is defined by a separating hyperplane that works first to find the widest margin (or biggest pipe) and then works to find the middle line.
support vector machines
Models associated with regression and classification data approaches have all but this important part:
test data
What is the purpose of classification?
To predict which class an observation that we know little about will belong to.
What is the purpose of Data Reduction?
To reduce the amount of detailed information considered to focus on the most interesting or abnormal items.
The observation about the frequency of leading digits in many real-life sets of numerical data is called:
Benford's law
_____ is an observation about the frequency of leading digits in many real-life sets of numerical data.
Benford's law
After you have identified the classes you wish to predict, what is the next step?
manually classify an existing set of records
decision boundaries
mark the split between one class and another
In general, the more complex the model, the greater the chance of
overfitting the data
Profiling is a/an unsupervised method that is used to discover ____________ of behavior, based on the distance of z-scores from the mean.
patterns
classification
predicts a class or category for a new observation based on the manual identification of classes from previous observations
link prediction
predicts a relationship between two data items, such as members of a social media platform
Decision support systems are an example of _____.
prescriptive
diagnostic analytics
procedures that explore the current data to determine why something has happened the way it has, typically comparing the data to a benchmark
prescriptive analytics
procedures that model data to enable recommendations for what should be done in the future
descriptive analytics
procedures that summarize existing data to determine what has happened in the past
_____ might be used to identify areas where there is a lack of controls, changes in procedures, or individuals more willing to spend excessively in potential types of T&E expenses which might be associated with higher risk.
profiling
types of diagnostic analytics
profiling, clustering, similarity matching, co-occurrence grouping
What is the terminology for removing branches from a decision tree to avoid overfitting the model?
pruning
types of predictive analytics
regression, classification, link prediction
structured data are stored in a database or spreadsheet and are readily
searchable
Benford's law states that in many naturally occurring collections of numbers, the significant leading digit is likely to be ______.
small
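The Benford's law cards above can be checked with a short sketch. It assumes the usual statement of the law, that a leading digit d is expected with frequency log10(1 + 1/d); the function names and the nonzero-amount restriction are illustrative assumptions, not part of the course material.

```python
import math
from collections import Counter


def benford_expected(digit):
    """Expected share of values whose leading digit is `digit` (1-9)."""
    return math.log10(1 + 1 / digit)


def leading_digit(amount):
    """First significant digit of a nonzero number (assumption: amount != 0)."""
    digits = str(abs(amount)).lstrip("0.")  # drop leading zeros and the decimal point
    return int(digits[0])


def observed_frequencies(amounts):
    """Share of amounts starting with each digit 1-9."""
    counts = Counter(leading_digit(a) for a in amounts)
    total = len(amounts)
    return {d: counts.get(d, 0) / total for d in range(1, 10)}


if __name__ == "__main__":
    for d in range(1, 10):
        print(d, round(benford_expected(d), 3))
```

Comparing `observed_frequencies` for a large set of transactions against `benford_expected` shows where the first-digit distribution departs from the pattern; small digits such as 1 and 2 should dominate.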
In the example of profiling for management accounting regarding Advanced Environmental Recycling Technologies, what are they looking for significant variances in?
standard cost
Data that are organized and reside in a fixed field with a record or a file. Such data are generally contained in a relational database or spreadsheet and are readily searchable by search algorithms. The term matching this definition is:
structured data
In the following question, what would be the target? Given a set of customer data, we are trying to predict the total transaction amount based on a variety of attributes.
transaction amount
In the example of profiling for management accounting regarding Advanced Environmental Recycling Technologies, what are they looking for significant variances in?
standard cost
In general, the more simple the model, the greater the chance of
underfitting the data
Any transaction that has a z-score of ____ or above would represent an abnormal transaction.
3
Select the correct definition of class.
A manually assigned category applied to a record based on an event.
Select the correct definition of a target.
An expected attribute or value that we want to evaluate.
Which of the following is true regarding the profiling approach?
It is generally performed on data that is readily available.
_____ include both unsupervised exploratory analysis and supervised model generation to provide insight and predictive foresight into the business and decisions made by accountants and auditors.
Machine learning and artificial intelligence
What body mandates submission of XBRL to facilitate the exchange of financial reporting information?
SEC
What is the purpose of clustering?
To identify groups of similar data elements and the underlying drivers of these groups.
in auditing, regression may be used to determine the
appropriateness of allowance accounts
clustering algorithms
calculate the minimum distance of all observations and group those elements
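As a rough illustration of grouping by minimum distance, here is a minimal one-dimensional k-means sketch. K-means is swapped in as one common clustering algorithm; the cards do not name a specific one. The starting centroids and sample values are made up for the example.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Assign each value to its nearest centroid, then recompute the centroids."""
    groups = [[] for _ in centroids]
    for _ in range(iterations):
        groups = [[] for _ in centroids]
        for v in values:
            # index of the centroid with the minimum distance to this value
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            groups[nearest].append(v)
        # new centroid = mean of its group; keep the old one if the group is empty
        centroids = [sum(g) / len(g) if g else c for g, c in zip(groups, centroids)]
    return groups, centroids


if __name__ == "__main__":
    amounts = [10, 12, 11, 95, 100, 98]  # hypothetical transaction amounts
    groups, centers = kmeans_1d(amounts, centroids=[0, 50])
    print(groups, centers)
```

Observations inside a group end up close to each other and far from the other group, which is the property the clustering cards describe.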
Classification is a method that can be used to predict the _________ of a new observation.
class
Which approach to data analytics attempts to assign each unit in a population into a small set of classes where the unit belongs?
classification
When evaluating classifiers, you need to be careful to strike a balance between what two things?
complexity of the model and accuracy of the classification
an analyst would
count the number of records in a data extract to ensure the data are complete before running a more complex analysis
__________ mark the split between one class and another.
decision boundaries
types of prescriptive analytics
decision support systems, and machine learning and artificial intelligence
summary statistics
describe a set of data in terms of their location, range, shape and size
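A minimal sketch of summary statistics using Python's standard `statistics` module. The mapping is an assumption made for illustration: mean and median stand in for location, min/max for range, the sample standard deviation as a rough indicator of spread, and the count for size.

```python
import statistics


def summarize(values):
    """Return basic descriptive statistics for a list of numbers."""
    return {
        "count": len(values),                   # size
        "mean": statistics.mean(values),        # location
        "median": statistics.median(values),    # location
        "min": min(values),                     # range
        "max": max(values),                     # range
        "range": max(values) - min(values),
        "stdev": statistics.stdev(values),      # spread (sample standard deviation)
    }


if __name__ == "__main__":
    print(summarize([10, 20, 30, 40, 50]))  # hypothetical amounts
```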
Variance analysis, a common practice in management accounting, is an example of _____ analytics.
diagnostic
co-occurrence grouping
discovers associations between individuals based on common events, such as transactions they are involved in
In the example provided in the text regarding employee turnover, the analyst is trying to predict employee turnover based on current professional salaries, health of the economy (GDP), and salaries offered by other accounting firms. In this scenario, what is the response variable?
employee turnover
in managerial accounting, regression may predict
employee turnover
A specific type of data profiling that is used to look for correspondences between portions, or segments, of text for potential matches is called ______________ match.
fuzzy
_____ looks for similarities between portions, or segments, of the text of each potential match.
fuzzy match
data reduction or filtering
is used to reduce the number of observations to focus on relevant items. It does this by taking a large set of data and reducing it to a smaller set that has the vast majority of the critical information of the larger set
Which of the following is true regarding the Data Reduction approach?
it primarily uses structured data that is readily searchable
machine learning and artificial intelligence
learning models or intelligent agents that adapt to new external data to recommend a course of action
What is the terminology for the items that are useful for ranking observations rather than simply predicting class probability?
linear classifiers
fuzzy matching
locates approximate matches
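One way to sketch fuzzy matching is with `difflib.SequenceMatcher` from the Python standard library, which scores how similar two strings are on a 0 to 1 scale. The vendor names and the `best_match` helper are hypothetical; real tools may use other similarity measures.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Similarity ratio between two strings, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_match(name, candidates):
    """Return the candidate most similar to `name`, with its score."""
    scored = [(candidate, similarity(name, candidate)) for candidate in candidates]
    return max(scored, key=lambda pair: pair[1])


if __name__ == "__main__":
    vendors = ["ACME Supply Company", "Apex Supplies", "Zenith Corp"]  # hypothetical
    print(best_match("Acme Supply Co", vendors))
```

An approximate match such as "Acme Supply Co" against "ACME Supply Company" scores high even though the strings are not identical, which is how fuzzy matching can surface likely duplicate vendors or employee-vendor overlaps.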
decision trees
used to divide data into smaller groups
linear classifiers
useful for ranking items rather than simply predicting class probability
Using a classification model, you can predict __________________ a new vendor belongs to one class or another based on the behavior of others.
whether
Knowing the mean and standard deviation, and assuming a normal distribution, one can compute which statistic that can be used to identify abnormal transactions?
z score
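The z-score calculation can be sketched as z = (value - mean) / standard deviation, flagging anything at 3 or above as the cards describe. The sample amounts are invented, and using the sample standard deviation (`statistics.stdev`) is an assumption; the population version would work the same way.

```python
import statistics


def z_scores(values):
    """z = (value - mean) / standard deviation for each value."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def flag_abnormal(values, threshold=3.0):
    """Return the values whose z-score is at or above the threshold."""
    return [v for v, z in zip(values, z_scores(values)) if z >= threshold]


if __name__ == "__main__":
    # 19 ordinary expense amounts and one unusually large one (hypothetical data)
    amounts = [120, 95, 110, 105, 98, 102, 115, 99, 101, 108,
               97, 104, 111, 93, 106, 100, 109, 96, 103, 2500]
    print(flag_abnormal(amounts))
```

Note that a very small data set cannot produce a z-score of 3 at all, so the technique is most useful on larger populations of transactions.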
What is the purpose of profiling?
To identify the typical behavior of an individual, group, or population.
predictive analytics
procedures used to generate a model that can be used to determine what is likely to happen in the future
After you have identified the objects or activity you wish to profile, what should you do next?
Determine the types of profiling you want to perform.
regression
estimates or predicts the numerical value of a dependent variable based on the slope and intercept of a line and the value of an independent variable
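A minimal sketch of simple linear regression, assuming ordinary least squares with one independent variable: the fitted line is y = intercept + slope * x. The data points and function names are made up for illustration.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for one independent variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Predicted value of the dependent variable for a given x."""
    return intercept + slope * x


if __name__ == "__main__":
    xs = [1, 2, 3, 4]   # hypothetical independent variable
    ys = [3, 5, 7, 9]   # hypothetical dependent variable
    slope, intercept = fit_line(xs, ys)
    print(slope, intercept, predict(5, slope, intercept))
```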
training data
existing data that have been manually evaluated and assigned a class
test data
existing data used to evaluate the model
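The split between training data and test data can be sketched as below. The 80/20 proportion, the fixed seed, and the `split_records` name are assumptions chosen for the example, not values given in the course material.

```python
import random


def split_records(records, test_fraction=0.2, seed=42):
    """Shuffle labeled records and hold out a fraction as test data."""
    shuffled = list(records)                 # copy so the input is not modified
    random.Random(seed).shuffle(shuffled)    # seeded for a repeatable split
    n_test = max(1, round(len(shuffled) * test_fraction))
    test = shuffled[:n_test]
    train = shuffled[n_test:]
    return train, test


if __name__ == "__main__":
    labeled = [(i, "approved" if i % 2 else "rejected") for i in range(10)]  # hypothetical
    train, test = split_records(labeled)
    print(len(train), len(test))
```

The model is built on `train` only; `test` is held back so the model is evaluated on records it has not seen, which helps reveal overfitting.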
True or false: When clustering works well, observations within a segment should be different, and the data across segments should be very similar.
false
an auditor would
filter the data to limit the scope to transactions that represent the highest risk. In all these cases, basic analysis provides an understanding of what has happened in the past to help decision makers achieve good results and correct poor results
similarity matching
grouping technique used to identify similar individuals based on data known about them
Clustering is an unsupervised method that is used to find natural ___________ within the data.
groupings
clustering
helps identify groups or individuals that share common underlying characteristics
profiling
identifies the typical behavior of an individual, group, or population by compiling summary statistics about the data and comparing individuals to the population
data reduction involves the following steps
identify the attribute you would like to reduce or focus on; filter the results; interpret the results; follow up on results
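The first two data reduction steps can be sketched as a simple filter. The attribute (`amount`), the threshold, and the sample records are hypothetical; interpreting and following up on the results remain manual steps.

```python
def reduce_by_amount(records, minimum):
    """Keep only records whose amount is at or above the minimum."""
    return [r for r in records if r["amount"] >= minimum]


if __name__ == "__main__":
    # Hypothetical transactions; "amount" is the attribute chosen to focus on
    transactions = [
        {"id": 1, "vendor": "A", "amount": 120.00},
        {"id": 2, "vendor": "B", "amount": 9800.00},
        {"id": 3, "vendor": "C", "amount": 45.50},
        {"id": 4, "vendor": "A", "amount": 15000.00},
    ]
    high_value = reduce_by_amount(transactions, minimum=5000)
    for record in high_value:
        print(record["id"], record["vendor"], record["amount"])
```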
IMPACT Cycle
identify the questions; master the data; perform test plan; address and refine results; communicate insights; track outcomes
examples of profiling
internal auditors analyze travel and entertainment expenses for violations of internal controls; managers use profiling to compare variances from target ranges; in a continuous audit, an auditor may use Benford's law to evaluate the frequency distribution of the first digits from a large set of numerical data
Which approach to data analytics attempts to predict a relationship between two data items?
link prediction