Ch.3


T or F: When clustering works well, observations within a segment should be different, and the data across segments should be very similar

False; it's the opposite

Machine learning, artificial intelligence and decision support systems are all examples of

Prescriptive Analytics

Clustering is an unsupervised method that's used to find natural ______ within the data

groups

A decision ______ is a tool used to divide data into smaller groups. Decision ______ mark the split between one class and another

tree; boundaries

Profiling is a/an ________ method that's used to discover patterns of behavior, based on how far observations fall from the mean as measured by z-scores

unsupervised

_______ data are existing data that have been manually evaluated and assigned a class; _______ data are existing data used to evaluate the model

Training; Test
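A minimal sketch of dividing manually labeled records into training and test sets. The function name `split_train_test`, the 30% test fraction, the fixed seed, and the sample records are all assumptions made for illustration, not part of the study set.

```python
import random

def split_train_test(records, test_fraction=0.3, seed=42):
    """Shuffle labeled records, then split them into (training, test) lists."""
    shuffled = list(records)               # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # seeded for a reproducible split
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical records: (attributes, manually assigned class)
labeled = [({"amount": 100 + 10 * i}, "approved" if i % 2 else "denied")
           for i in range(10)]

train, test = split_train_test(labeled)
print(len(train), "training records;", len(test), "test records")
```

The model is generated from `train` only; `test` is held back so the evaluation uses records the model has not seen.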

A/an _______ approach is used when you're performing analysis that uses historical data to predict a future outcome based on a specific question

supervised

What would be the target? Given a set of customer data, we're trying to predict the total transaction amount based on a variety of attributes

the transaction amount

A/an ______ approach is used when you don't have a specific question and are simply exploring the data for potential patterns of interest

unsupervised

Knowing the mean and standard deviation, and assuming a normal distribution, which statistic can be computed to identify abnormal transactions?

z-score
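As an illustration, a z-score is the distance of a value from the mean, measured in standard deviations: z = (x - mean) / standard deviation. The sketch below flags transactions whose absolute z-score exceeds a threshold. The function names, the threshold of 2, and the sample amounts are assumptions chosen for the example.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Return the z-score of each value: (value - mean) / standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return [0.0 for _ in values]  # all values identical, nothing stands out
    return [(v - mu) / sigma for v in values]

def flag_abnormal(values, threshold=2.0):
    """Return the values whose absolute z-score exceeds the threshold."""
    return [v for v, z in zip(values, z_scores(values)) if abs(z) > threshold]

# Hypothetical transaction amounts with one obvious outlier
amounts = [52, 48, 50, 51, 49, 50, 47, 53, 50, 500]
flagged = flag_abnormal(amounts, threshold=2.0)
print(flagged)
```

A cutoff of 3 is a common rule of thumb on larger data sets; with only ten observations no single value can reach a z-score of 3, which is why the example uses 2.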

Data Reduction Steps

1. Identify the attribute you would like to reduce or focus on
2. Filter the results
3. Interpret the results
4. Follow up on results

Classification Steps

1. Identify the classes you wish to predict
2. Manually classify an existing set of records
3. Select a set of classification models
4. Divide your data into training and testing sets
5. Generate your model
6. Interpret the results and select the "best" model
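The classification steps can be sketched end to end on a toy data set. Everything here is an assumption for illustration: the two classes ("low" and "high"), the single numeric attribute, and the two deliberately simple models (a majority-class baseline and a one-nearest-neighbor rule). A real analysis would compare richer models.

```python
from collections import Counter

def majority_model(train):
    """Baseline model: always predict the most common class in the training data."""
    most_common = Counter(label for _, label in train).most_common(1)[0][0]
    return lambda x: most_common

def nearest_neighbor_model(train):
    """Predict the class of the training record whose value is closest to x."""
    def predict(x):
        _, label = min(train, key=lambda rec: abs(rec[0] - x))
        return label
    return predict

def accuracy(model, test):
    """Share of test records whose predicted class matches the assigned class."""
    return sum(model(x) == label for x, label in test) / len(test)

# Steps 1-2 and 4: manually classified records, already divided into two sets
train = [(20, "low"), (35, "low"), (50, "low"), (400, "high"), (650, "high")]
test = [(30, "low"), (500, "high"), (45, "low"), (700, "high")]

# Steps 3 and 5: select the candidate models and generate each from the training set
models = {
    "majority": majority_model(train),
    "nearest": nearest_neighbor_model(train),
}

# Step 6: evaluate on the test set and select the "best" model
scores = {name: accuracy(model, test) for name, model in models.items()}
best = max(scores, key=scores.get)
print(scores, "best:", best)
```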

Data Profiling Steps

1. Identify the objects or activity you want to profile
2. Determine the types of profiling you want to perform
3. Set boundaries or thresholds for the activity
4. Interpret the results and monitor the activity and/or generate a list of exceptions
5. Follow up on exceptions

_____ is an observation about the frequency of leading digits in many real-life sets of numerical data

Benford's Law
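Benford's Law gives the expected share of numbers with leading digit d as log10(1 + 1/d), so about 30.1% of values start with 1 and about 4.6% start with 9. The sketch below compares observed leading-digit frequencies against that expectation. The helper names are assumptions, and powers of 2 stand in for real transaction data because they are known to follow the law closely.

```python
import math
from collections import Counter

def benford_expected(d):
    """Expected proportion of values whose leading digit is d (1 through 9)."""
    return math.log10(1 + 1 / d)

def leading_digit(x):
    """First non-zero digit of a non-zero number."""
    return int(next(ch for ch in str(abs(x)) if ch in "123456789"))

def leading_digit_frequencies(values):
    """Observed proportion of each leading digit, ignoring zero values."""
    digits = [leading_digit(v) for v in values if v != 0]
    counts = Counter(digits)
    return {d: counts.get(d, 0) / len(digits) for d in range(1, 10)}

# Stand-in data set: the first 100 powers of 2
values = [2 ** n for n in range(1, 101)]
observed = leading_digit_frequencies(values)

for d in range(1, 10):
    print(d, f"observed={observed[d]:.3f}", f"expected={benford_expected(d):.3f}")
```

Large gaps between the observed and expected columns are a prompt for follow-up, not proof of a problem.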

__________ are designed to be interactive and adapt to the information collected by the user

Decision Support Systems

Variance analysis, a common practice in management accounting, is an example of

Diagnostic Analytics

What is true regarding the Data Reduction approach?

It primarily uses structured data that is readily searchable

What's the terminology for the items that are useful for ranking observations rather than simply predicting class probability?

Linear Classifiers

______ might be used to identify areas where there is a lack of controls, changes in procedures, or individuals more willing to spend excessively in potential types of T&E expenses, which might be associated with higher risk

Profiling

What's the terminology for removing branches from a decision tree to avoid overfitting the model?

Pruning

What is XBRL used for?

To facilitate the exchange of financial reporting information between a company and the SEC

What's the purpose of profiling?

To gain an understanding of the typical behavior of an individual, group, population, or sample

What's the purpose of clustering?

To identify groups of similar data elements and the underlying drivers of these groups
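One way to picture clustering is a small k-means routine on a single attribute: each value is assigned to its nearest cluster center, then each center moves to the mean of its members, and the two steps repeat. This is a sketch under assumptions (one-dimensional data, two hand-picked starting centers, a fixed number of iterations), not the only clustering method.

```python
def k_means_1d(values, centers, iterations=10):
    """Group values around the nearest center, then move each center to its group's mean."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # An empty cluster keeps its previous center
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

# Hypothetical purchase amounts that fall into two natural groups
amounts = [10, 12, 11, 95, 100, 105]
centers, clusters = k_means_1d(amounts, centers=[0, 50])
print(centers)
print(clusters)
```

Values inside each resulting group are close together while the two groups sit far apart, which is what a well-behaved clustering should show.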

A class is a manually assigned _______ applied to a record based on an event

category

Classification predicts a class for a new observation based on the _________ identification of classes from previous observations

manual

Generally, the more complex and complete the model, the higher the chance that the model is _____ the data

overfitting

T or F: Classification requires that we know a great deal about the observation that we're attempting to place in a class

False

________ include both unsupervised exploratory analysis and supervised model generation to provide insight and predictive foresight into the business and decisions made by accountants and auditors

Machine Learning and Artificial Intelligence

When evaluating classifiers, you need to be careful to strike a balance between which two things?

complexity of the model and accuracy of the classification

A specific type of data profiling that is used to look for correspondences between portions, or segments, of text for potential matches is called

fuzzy match
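A fuzzy match can be sketched with Python's standard-library `difflib.SequenceMatcher`, which scores how similar two strings are on a scale from 0 to 1. The vendor names, the 0.8 threshold, and the helper names below are assumptions for illustration.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Case-insensitive similarity ratio between two strings (0.0 to 1.0)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def fuzzy_matches(name, candidates, threshold=0.8):
    """Return the candidates whose similarity to name meets the threshold."""
    return [c for c in candidates if similarity(name, c) >= threshold]

# Hypothetical vendor master list containing near-duplicate entries
vendors = ["ACME Supplies, Inc", "Apex Logistics LLC", "Acme Suplies Inc."]
matches = fuzzy_matches("Acme Supplies Inc.", vendors)
print(matches)
```

Near-duplicates such as a misspelled vendor name score high even though they are not exact matches, which is what makes this useful for spotting possible duplicate vendors or payments.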
