MIDTERM #1
True or false: Data analytics expands auditors' capabilities in services like testing for fraudulent transactions.
True
What are the additional 3 V's that characterize Big Data?
Veracity: Quality or trustworthiness of the data; Variability: To what extent and how often the data structure changes; Value: Worth of the data being extracted
Regression
a SUPERVISED method used to predict specific values given an explanatory variable or variables
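As a rough illustration of regression as a supervised method, the sketch below fits a one-variable least-squares line in plain Python and then predicts a value for a new input. The advertising and sales figures and the function name `fit_line` are made up for illustration.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-style and variance-style sums around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical labeled data: advertising spend (explanatory) and sales (target).
ad_spend = [1.0, 2.0, 3.0, 4.0]
sales = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(ad_spend, sales)
predicted = slope * 5.0 + intercept  # predicted sales for a spend of 5.0
```

The method is supervised because the known sales values are used to fit the line before any prediction is made.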
Fuzzy match
a specific type of data profiling that is used to look for correspondence between portions, or segments, of text for potential matches
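One possible way to try a fuzzy match is with Python's standard `difflib` module, which scores how closely two strings correspond. The vendor names, the helper names, and the 0.8 threshold below are assumptions chosen for illustration, not values from the course material.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Return a 0-to-1 similarity ratio between two strings, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def fuzzy_matches(name, candidates, threshold=0.8):
    """Return the candidates whose similarity to name meets the threshold."""
    return [c for c in candidates if similarity(name, c) >= threshold]

# Hypothetical vendor master list containing a near-duplicate spelling.
candidates = ["Acme Suply Co", "Zenith Partners"]
matches = fuzzy_matches("Acme Supply Co.", candidates)
```

In an audit setting the same idea could be used to compare vendor names against employee names, with the threshold tuned to balance missed matches against false positives.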
Similarity Matching
a testing approach that attempts to identify similar individuals based on data known about them
Audit firms are increasingly considering operational data such as manufacturing logs, customer relationship management data and supply chain data primarily to
help companies refine processes
What is Data?
information that a company collects about its customers, employees, products/services, etc.
Unsupervised approach
used when you don't have a specific question and are simply exploring the data for potential patterns of interest.
How do you characterize Big Data? Hint: the 3 V's
Volume: Size of the data; Velocity: Speed at which the data are being generated, collected, and analyzed; Variety: Number of types of data
Clustering
an UNSUPERVISED method that is used to find natural groupings within the data
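To make "natural groupings" concrete, here is a minimal one-dimensional k-means sketch. K-means is only one of several clustering techniques, and the transaction amounts and starting centroids are invented for the example.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Very small k-means for one-dimensional data; returns (centroids, groups)."""
    for _ in range(iterations):
        groups = [[] for _ in centroids]
        for v in values:
            # Assign each value to the index of its nearest centroid.
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            groups[nearest].append(v)
        # Move each centroid to the mean of its group (keep it if the group is empty).
        centroids = [sum(g) / len(g) if g else c for g, c in zip(groups, centroids)]
    return centroids, groups

# Hypothetical transaction amounts with two obvious groupings.
amounts = [10, 12, 11, 95, 100, 105]
centroids, groups = kmeans_1d(amounts, centroids=[0.0, 50.0])
```

No labels are supplied anywhere, which is what makes the approach unsupervised: the groups emerge from the distances between the values alone.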
Pruning
Removing branches from a decision tree to avoid overfitting the model
Target
an expected attribute or value that you want to evaluate
Which of the following is an accurate description of the Audit Data Standards?
A guide for formatting the way in which data are provided to auditors.
Class
A manually assigned category applied to a record based on an event
Which would NOT be considered as one of the seven skills that analytic minded accountants should have?
Ability to house huge data sets
Financial accounting often has challenges with valuation and estimation in all but the following area:
Accounts Payable
When you need to retrieve data that is stored in more than one table, which type of clause should you use in your SQL query?
Join
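A small sketch of a join, using Python's built-in `sqlite3` module and an in-memory database. The table names, column names, and rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE invoices (invoice_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Alpha LLC'), (2, 'Beta Inc');
    INSERT INTO invoices VALUES (100, 1, 250.0), (101, 2, 75.5), (102, 1, 40.0);
""")

# The JOIN clause pulls the customer name from one table and the
# invoice details from another, matched on the shared customer_id.
rows = conn.execute("""
    SELECT c.name, i.invoice_id, i.amount
    FROM invoices AS i
    JOIN customers AS c ON c.customer_id = i.customer_id
    ORDER BY i.invoice_id
""").fetchall()
conn.close()
```

Each returned row combines attributes that live in different tables, which a single-table query could not produce.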
Descriptive attribute
attributes that exist to provide business information
Data Analytics can be applied to taxes by helping to predict the tax consequences of a potential international transaction, a proposed merger or acquisition, or an
Investment in R&D
What are ways to validate the data for completeness and integrity?
1) Comparing descriptive statistics for numeric fields from the extracted data in order to communicate the results of the test plan 2) Comparing the number of records that were extracted to the number in the source database 3) Validating date/time fields and comparing the descriptive stats from the extracted data to those from the source database.
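The record-count and descriptive-statistics comparisons can be sketched in a few lines of Python. The amounts and the helper name `profile` are hypothetical; in practice the source figures would come from the database administrator or a query against the source system.

```python
import statistics

def profile(amounts):
    """Summarize a numeric field so two copies of the data can be compared."""
    return {
        "records": len(amounts),
        "total": round(sum(amounts), 2),
        "mean": round(statistics.mean(amounts), 2),
        "min": min(amounts),
        "max": max(amounts),
    }

source_amounts = [120.00, 75.50, 310.25, 42.00]
extracted_amounts = [120.00, 75.50, 310.25]  # one record lost during extraction

source_profile = profile(source_amounts)
extract_profile = profile(extracted_amounts)

# Any statistic that differs signals an incomplete or altered extract.
mismatches = {k for k in source_profile if source_profile[k] != extract_profile[k]}
```

Here the record count, total, mean, and minimum all disagree, which points to a missing record rather than a formatting problem.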
What are the benefits of storing data in a relational database?
1) Completeness of data 2) Integration of business processes 3) Business rules are enforced
Any transaction that has a Z-score of BLANK or above would represent an abnormal transaction
3
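A minimal sketch of the z-score rule, assuming the sample standard deviation is used and using invented transaction amounts:

```python
import statistics

def z_scores(values):
    """Return each value's distance from the mean in standard deviations."""
    mean = statistics.mean(values)
    std = statistics.stdev(values)  # sample standard deviation (an assumption)
    return [(v - mean) / std for v in values]

# Hypothetical amounts: twenty ordinary transactions and one unusually large one.
amounts = [98, 102, 100, 101, 99] * 4 + [500]

# Flag any transaction whose z-score is 3 or above.
flagged = [v for v, z in zip(amounts, z_scores(amounts)) if z >= 3]
```

Note that a very small dataset can never produce a z-score this high, so the rule is only meaningful with a reasonable number of observations.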
What is a data dictionary useful for?
It helps database administrators maintain databases.
Slicing and dicing the data, finding correlations, revising and rerunning the analysis would be considered to be part of which stage of the IMPACT cycle?
Address and Refine Results
What is the fourth step of the IMPACT Model?
Address and Refine Results: Identify issues with the analyses, possible root causes, and refine the model
Accountants need to be able to
Articulate business problems; Communicate data needs with data specialists; Draw appropriate conclusions; Present results in an accessible manner
Primary Key
Unique identifier of each record in a table.
The law that states that in many naturally occurring collections of numbers, the significant leading digit is likely to be small.
Benford's Law
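Benford's Law gives the expected share of each leading digit d as log10(1 + 1/d), so a leading 1 is expected about 30.1% of the time. The sketch below computes those expected shares and compares them with the leading digits of a small, made-up list of amounts; the helper names are illustrative only.

```python
import math
from collections import Counter

# Expected share of each leading digit under Benford's Law.
expected = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

def leading_digit(amount):
    """Return the first nonzero digit of a nonzero amount."""
    digits = str(abs(amount)).lstrip("0.")
    return int(digits[0])

def observed_shares(amounts):
    """Return the observed share of each leading digit from 1 to 9."""
    counts = Counter(leading_digit(a) for a in amounts)
    return {d: counts[d] / len(amounts) for d in range(1, 10)}

# Hypothetical amounts; a real test would use far more transactions.
amounts = [123.45, 19.99, 1050, 2300, 0.37, 880, 145, 3.2]
observed = observed_shares(amounts)
```

Large gaps between `observed` and `expected` on a sizable dataset are a reason to investigate further, not proof of fraud on their own.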
The firm's practice of monitoring competitors, customers, and suppliers to better understand its opportunities and threats is called
Business Intelligence
Foreign Key
An attribute that points to the primary key of another table, creating the relationship between the two tables
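The sketch below shows a primary key and a foreign key working together in SQLite through Python's `sqlite3` module. The vendor and purchase-order tables are hypothetical, and SQLite only enforces foreign keys after the `PRAGMA foreign_keys = ON` statement is issued.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE vendors (vendor_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("""
    CREATE TABLE purchase_orders (
        po_id INTEGER PRIMARY KEY,
        vendor_id INTEGER NOT NULL REFERENCES vendors (vendor_id),
        amount REAL
    )
""")
conn.execute("INSERT INTO vendors VALUES (1, 'Northwind Supply')")
conn.execute("INSERT INTO purchase_orders VALUES (500, 1, 1200.0)")  # vendor 1 exists

try:
    # Vendor 99 does not exist, so the foreign key constraint rejects this row.
    conn.execute("INSERT INTO purchase_orders VALUES (501, 99, 300.0)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
conn.close()
```

Rejecting the orphan purchase order is one example of a relational database enforcing a business rule automatically.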
Transformation includes
Cleaning the data and validating the data for completeness and integrity
Data visualization would be part of which step of the IMPACT cycle?
Communicate Insights
Which step in the IMPACT model comes after slicing and dicing the data, finding correlations, asking further questions, and revising and running the analysis?
Communicate Insights
What is the fifth step of the IMPACT Model?
Communicate Insights: Communicate effectively using clear language and visualizations
When evaluating classifiers, you need to be careful to strike a balance between what two things?
Complexity of the model and accuracy of the classification
The process of evaluating data with the purpose of drawing conclusions to address business questions is defined as:
Data Analytics
Semi-Structured Data
Data that contain tags and other markers to separate data elements
Unstructured Data
Data that do not reside in fixed or defined fields
These mark the split between one class and another
Decision boundaries
These are used to divide data into smaller groups
Decision trees
True or False: A dependent variable can be explained by at most one independent variable
FALSE
True or false: A data dictionary will be more robust and will have more attributes to keep track of for a dataset stored as a flat file.
FALSE
When using a VLOOKUP function to add a column to a table from an existing table, the range_lookup option that you should use is BLANK.
FALSE
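The difference between the two range_lookup settings can be mimicked in Python. This is only an analogy for the spreadsheet behavior, with a made-up chart of accounts: FALSE asks for an exact match, while TRUE settles for the largest key that does not exceed the lookup value.

```python
from bisect import bisect_right

# Hypothetical lookup table: account number -> account name, sorted by key.
table = [(1000, "Cash"), (1200, "Accounts Receivable"), (2000, "Accounts Payable")]

def lookup_exact(key):
    """Like range_lookup = FALSE: return a value only for an exact key match."""
    return dict(table).get(key)  # None plays the role of #N/A

def lookup_approximate(key):
    """Like range_lookup = TRUE: use the largest key that is <= the lookup value."""
    keys = [k for k, _ in table]
    position = bisect_right(keys, key)
    return table[position - 1][1] if position else None

exact = lookup_exact(1500)              # no account 1500, so nothing is returned
approximate = lookup_approximate(1500)  # silently falls back to account 1200
```

When joining tables on an identifier, the approximate behavior would attach the wrong record without any warning, which is why the exact-match setting is the safer choice.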
True or false: Classification requires that we know a great deal about the observation that we're attempting to place in a class.
False: we can know very little about a given observation and still predict which class it will belong to
IMPACT Model
I: Identify the Questions; M: Master the Data; P: Perform the Test Plan; A: Address and Refine Results; C: Communicate Insights; T: Track Outcomes
What is the first step of the IMPACT Model?
Identify the Questions: Understand the business problems that need to be addressed.
Which of the following is true regarding the data reduction approach? 1) It is most useful when performed on a small dataset 2) It works best when there is not any particular attribute you would like to focus on 3) It primarily uses structured data that is readily searchable
It primarily uses structured data that is readily searchable.
Loading
Load the data for data analysis
Which of the following is not an existing Audit Data Standard? 1) General Ledger 2) Inventory subledger 3) Order-to-Cash subledger 4) Manufacturing subledger 5) Procure-to-Pay subledger
Manufacturing
ETL (extraction, transformation, and loading) would be an example of which step in the IMPACT cycle?
Master the Data
What is the second step of the IMPACT Model?
Master the Data: Know what data are available and how they relate to the problem.
Extraction includes
Determining the purpose and scope of the data request, and obtaining the data
What is the third step of the IMPACT Model?
Perform the Test Plan: Select an appropriate model to find a target variable
Profiling
an UNSUPERVISED method that is used to discover patterns of behavior, often by using z-scores to measure how far an observation lies from the mean
In the profiling example regarding T&E expenses, which of the following is not one of the areas that the analyst would try to uncover? 1) Individuals more willing to spend excessively 2) Change in procedures 3) Lack of controls 4) Significant variances in standard cost
Significant variances in standard cost
In the example of profiling for management accounting regarding Advanced Environmental Recycling Technologies, what are they looking for significant variances in?
Standard costs
What are the steps of ETL?
Step 1: Determine the purpose and scope of the data request. Step 2: Obtain the data. Step 3: Validate the data for completeness and integrity. Step 4: Clean the data. Step 5: Load the data for data analysis.
What are the steps of Data Reduction?
Step 1: Identify the attribute you would like to reduce or focus on. Step 2: Filter the results. Step 3: Interpret the results. Step 4: Follow up on the results.
What are the Classification Steps?
Step 1: Identify the classes you wish to predict. Step 2: Manually classify an existing set of records. Step 3: Select a set of classification models. Step 4: Divide your data into training and testing sets. Step 5: Generate your model. Step 6: Interpret the results and select the best model.
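The classification steps above can be sketched end to end in plain Python. The records, the class labels, and the choice of a one-nearest-neighbor model are all assumptions made to keep the example short; a real project would compare several models on far more data.

```python
# Steps 1-2: hypothetical, manually classified records: (days_past_due, label).
records = [
    (2, "paid"), (5, "paid"), (8, "paid"), (12, "paid"),
    (60, "default"), (75, "default"), (90, "default"), (120, "default"),
]

# Step 4: divide the data into training and testing sets
# (every fourth record is held out for testing).
testing = records[3::4]
training = [r for r in records if r not in testing]

# Steps 3 and 5: a one-nearest-neighbor model, chosen only for brevity.
def predict(days_past_due):
    """Return the label of the training record closest to the input."""
    nearest = min(training, key=lambda r: abs(r[0] - days_past_due))
    return nearest[1]

# Step 6: interpret the results by measuring accuracy on the testing set.
correct = sum(predict(x) == label for x, label in testing)
accuracy = correct / len(testing)
```

Holding out the testing records matters because accuracy measured on the training data alone would overstate how well the model handles new observations.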
What are the profiling steps?
Step 1: Identify the objects or activity you want to profile. Step 2: Determine the types of profiling you want to perform. Step 3: Set boundaries or thresholds for the activity. Step 4: Interpret the results and monitor the activity and/or generate a list of exceptions. Step 5: Follow up on exceptions.
What is the purpose of profiling?
To gain an understanding of the typical behavior of an individual, group, population, or sample
What is the purpose of clustering?
To identify groups of similar data elements and the underlying drivers of these groups
What is the purpose of a data request form?
To make communication easier between data requestor and provider
What is the purpose of classification?
To predict which class an observation that we know little about will belong to
A digital dashboard would be part of which step of the IMPACT cycle?
Track Outcomes
What is the sixth step of the IMPACT Model?
Track Outcomes: Follow up on the results of the analysis
The Forbes Insight/KPMG report, "Audit 2020: A Focus on Change," found that a vast majority of survey respondents believe that technology will enhance the quality, BLANK, and accuracy of the audit
Transparency
Classification
a SUPERVISED method that can be used to predict the class of a new observation; an attempt to assign each unit in a population to one of a few categories
Structured Data
data that reside in fixed fields
As part of mastering the data, data analysts perform data BLANK to reduce data redundancy and improve data integrity
normalization
Big Data
refers to datasets which are too large and complex to be analyzed traditionally; has the potential to be mined for information
A UML Class Diagram is used to support and design a BLANK database
relational
Data Analytics
the ability to understand the data that is collected; the process of evaluating data with the purpose of drawing conclusions to address business questions
Data Insights
the actions a company takes after collecting and analyzing data
XBRL
eXtensible Business Reporting Language; a standard used to facilitate the exchange of financial reporting information between a company and the SEC
Accountants must be comfortable with
Data scrubbing and data preparation; Data quality; Descriptive data analysis; Data analysis through data manipulation; Defining and addressing problems through statistical analysis; Data visualization and data reporting
