Data Analytics for ACCT - Exam 1 Study Guide
_________ is a set of data used to assess the degree and strength of a predicted relationship. A. structured data B. training data C. test data D. unstructured data
C. test data
The purpose of transforming data is: A. to obtain the data from the appropriate source. B. to identify which data are necessary to complete the analysis. C. to validate the data for completeness and integrity. D. to load the data into the appropriate tool for analysis.
C. to validate the data for completeness and integrity.
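One way to picture "validating for completeness and integrity" is to compare an extract against figures reported by the source system. The sketch below is only an illustration under assumptions: the function name, the record layout (`id`, `amount`), and the three checks chosen are hypothetical, not a prescribed procedure.

```python
def validate_extract(source_count, source_total, extracted_rows):
    """Compare an extract against the record count and control total
    reported by the source system, and flag rows with a blank id."""
    amounts = [row["amount"] for row in extracted_rows]
    return {
        # Completeness: did every record arrive?
        "count_matches": len(extracted_rows) == source_count,
        # Integrity: does the control total still agree?
        "total_matches": round(sum(amounts), 2) == round(source_total, 2),
        # Positions of rows whose identifier is missing or empty.
        "missing_ids": [i for i, row in enumerate(extracted_rows) if not row.get("id")],
    }

# Hypothetical extract, made up for illustration.
rows = [
    {"id": "INV-1", "amount": 100.0},
    {"id": "", "amount": 50.25},
    {"id": "INV-3", "amount": 20.0},
]
report = validate_extract(source_count=3, source_total=170.25, extracted_rows=rows)
print(report)
```

Here the counts and totals agree, but the second row is flagged because its identifier is blank.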
The IMPACT cycle includes all of the following processes except: A. track outcomes B. perform test plan C. visualize the data D. master the data
C. visualize the data
The advantages of storing data in a relational database include which of the following? Option A - Help in enforcing business rules Option B - Increased information redundancy Option C - Integrating business processes A. Option B B. Option B and C C. Option A and B D. Option A E. Option C F. All of these are advantages G. Option A and C
G. Option A and C are the only advantages
Which approach to Data Analytics attempts to identify similar individuals based on data known about them? A. Similarity matching B. Classification C. Data reduction D. Regression
A. Similarity matching
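Similarity matching can be sketched as a nearest-neighbor lookup: describe each individual with numbers and pick the known individual closest to the new one. This is a minimal illustration, assuming Euclidean distance and two made-up features; the customer names and values are hypothetical.

```python
import math

def euclidean(p, q):
    """Straight-line distance between two equal-length feature tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def most_similar(target, candidates):
    """Name of the candidate whose features are closest to the target."""
    return min(candidates, key=lambda name: euclidean(candidates[name], target))

# Hypothetical customers: (orders per year, average order value in $100s).
known = {"Avery": (12, 3.0), "Blake": (2, 9.5), "Casey": (11, 3.4)}
new_customer = (10, 3.2)

match = most_similar(new_customer, known)
print(match)
```

In practice the features would usually be rescaled first so that no single attribute dominates the distance.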
An observation about the frequency of leading digits in many real-life sets of numerical data is called: A. Benford's law B. Moore's law C. leading digits hypothesis D. clustering
A. Benford's law
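Benford's law is commonly stated as P(d) = log10(1 + 1/d) for leading digits d = 1 to 9. The sketch below compares that expected share with the observed share in a small list of amounts. The invoice values are made up for illustration, and the helper names are hypothetical.

```python
import math
from collections import Counter

def benford_expected(digit):
    """Expected share of values whose leading digit is `digit` (1-9)."""
    return math.log10(1 + 1 / digit)

def leading_digit(value):
    """First nonzero digit of a nonzero number, ignoring the sign."""
    # Scientific notation puts the leading digit first, e.g. '4.5...e-03'.
    return int(f"{abs(value):.15e}"[0])

def leading_digit_shares(values):
    """Observed share of each leading digit among the nonzero values."""
    counts = Counter(leading_digit(v) for v in values if v != 0)
    total = sum(counts.values())
    return {d: counts.get(d, 0) / total for d in range(1, 10)}

# Hypothetical invoice amounts, far too few for a real test.
invoices = [120.50, 1890.00, 23.10, 145.00, 310.75, 1020.00, 98.40, 17.25]
observed = leading_digit_shares(invoices)

for d in range(1, 10):
    print(d, round(benford_expected(d), 3), round(observed[d], 3))
```

A real analysis would use a much larger population and a formal goodness-of-fit test before drawing any conclusion.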
______ mark the split between one class and another. A. Decision boundaries B. Linear classifiers C. Decision trees D. Identified questions
A. Decision boundaries
______ data would be considered the least sophisticated type of data. A. Nominal B. Ordinal C. Interval D. Ratio
A. Nominal
Which attribute is required to exist in each table of a relational database and serves as the "unique identifier" for each record in a table? A. Primary key B. Unique identifier C. Foreign key D. Key attribute
A. Primary key
Which of these terms is defined as being a central repository of descriptions for all the data attributes of the dataset? A. data dictionary B. Big Data C. Data Analytics D. Data warehouse
A. data dictionary
When there is NO alarm in a continuous audit, but there is an abnormal event, we would call that a: A. false negative B. true negative C. false positive D. true positive
A. false negative
When there is an alarm in a continuous audit, but it is associated with a normal event, we would call that a: A. false positive B. true negative C. true positive D. false negative
A. false positive
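The four alarm outcomes from the two questions above can be written as a small lookup on two yes/no facts: whether an alarm was raised and whether the event was actually abnormal. The function name is hypothetical.

```python
def classify_outcome(alarm_raised, event_abnormal):
    """Label a continuous-audit result by comparing the alarm to reality."""
    if alarm_raised and event_abnormal:
        return "true positive"    # alarm, and something really was wrong
    if alarm_raised and not event_abnormal:
        return "false positive"   # alarm on a normal event
    if not alarm_raised and event_abnormal:
        return "false negative"   # abnormal event slipped through
    return "true negative"        # no alarm, nothing wrong

for alarm in (True, False):
    for abnormal in (True, False):
        print(alarm, abnormal, classify_outcome(alarm, abnormal))
```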
Gold, silver, and bronze medals would be examples of: A. ordinal data B. structured data C. nominal data D. test data
A. ordinal data
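Ordinal data can be ranked, but the gaps between ranks carry no meaning. A minimal sketch, assuming the integer codes below are arbitrary labels chosen only to preserve order:

```python
from enum import IntEnum

class Medal(IntEnum):
    # The numbers encode order only; GOLD is not "three times" BRONZE.
    BRONZE = 1
    SILVER = 2
    GOLD = 3

results = [Medal.SILVER, Medal.GOLD, Medal.BRONZE]
ranked = sorted(results, reverse=True)  # best medal first

print([m.name for m in ranked])
print(Medal.GOLD > Medal.SILVER)
```

Sorting and comparing are valid operations here; averaging the codes would not be.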
Data that are organized and reside in a fixed field with a record or a file are generally contained in a relational database or spreadsheet and are readily searchable by search algorithms. The term matching this definition is: A. structured data B. unstructured data C. test data D. training data
A. structured data
Which skills were not emphasized that analytic-minded accountants should have? A. Data scrubbing and data preparation B. Classification of test approaches C. Define and address problems through statistical data analysis D. Develop an analytics mindset
B. Classification of test approaches
(CHAPTER FIVE) A company has two divisions, one in the United States and the other in China. One uses Oracle and the other uses SAP for its basic accounting system. What would we call this? A. Homogeneous systems B. Heterogeneous systems C. Dual lingo accounting systems D. Dual data warehouse systems
B. Heterogeneous systems
Who is most likely to have a working knowledge of the various ERP systems that are in use in the company? A. Chief executive officer B. Internal auditor C. IT Staff D. External auditor
B. Internal auditor
Which audit data standards ledger defines product master data, location data, inventory on hand data, and inventory movement? A. Procure to Pay Subledger B. Inventory Subledger C. Base Subledger D. Order to Cash Subledger
B. Inventory Subledger
Why is Supplier ID considered to be a primary key for a Supplier table? A. It can either be for a vendor or miscellaneous provider. B. It contains a unique identifier for each supplier. C. It is used to identify different supplier categories. D. It is a 10-digit number.
B. It contains a unique identifier for each supplier.
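The uniqueness requirement can be seen directly in a database that enforces primary keys. The sketch below uses Python's built-in `sqlite3` module with an in-memory database; the table layout, column names, and supplier values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE supplier (
        supplier_id   INTEGER PRIMARY KEY,  -- unique identifier for each record
        supplier_name TEXT NOT NULL         -- descriptive attribute
    )
""")
conn.execute("INSERT INTO supplier VALUES (1001, 'Acme Paper')")
conn.execute("INSERT INTO supplier VALUES (1002, 'Bolt Hardware')")

def try_insert(supplier_id, name):
    """Return True if the row was accepted, False if the key was rejected."""
    try:
        conn.execute("INSERT INTO supplier VALUES (?, ?)", (supplier_id, name))
        return True
    except sqlite3.IntegrityError:
        return False

# Reusing supplier_id 1001 violates the primary key constraint.
duplicate_accepted = try_insert(1001, "Acme Paper Duplicate")
row_count = conn.execute("SELECT COUNT(*) FROM supplier").fetchone()[0]
print(duplicate_accepted, row_count)
```

The duplicate is refused, so the table still holds exactly one row per supplier.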
By the year 2020, about 1.7 megabytes of new information will be created every: A. minute B. week C. second D. day
C. second
Which audit data standards ledger identifies data needed for purchase orders, goods received, invoices, payments, and adjustments to accounts? A. Base Subledger B. Procure to Pay Subledger C. Inventory Subledger D. Order to Cash Subledger
B. Procure to Pay Subledger
(CHAPTER FOUR) Line charts are not recommended for what type of data? A. Trend lines B. Qualitative data C. Normalized data D. Continuous data
B. Qualitative data
In the late 1960s, Ed Altman developed a model to predict if a company was at severe risk of going bankrupt. He called his statistic Altman's Z-score, now a widely used score in finance. Based on the name of the statistic, which statistical distribution would you guess this came from? A. Poisson distribution B. Standardized normal distribution C. Normal distribution D. Uniform distribution
B. Standardized normal distribution
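A z-score in the general statistical sense expresses a value as its distance from the mean in standard deviations. The sketch below shows only that generic standardization on made-up numbers; it is not Altman's bankruptcy model, whose inputs and weights are not given in this guide.

```python
import statistics

def z_scores(values):
    """Standardize values: subtract the mean, divide by the population SD."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]

# Hypothetical sample with mean 5 and population standard deviation 2.
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
standardized = z_scores(sample)
print(standardized)
```

After standardizing, the values have mean 0 and standard deviation 1, which is what makes them comparable against the standardized normal distribution.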
Models associated with regression and classification data approaches have all except this important part: A. identifying which variables (we'll call these independent variables) might help predict an outcome (we'll call this the dependent variable). B. test data. C. the numeric parameters of the model (detailing the relative weights of each of the variables associated with the prediction). D. the functional form of the relationship (linear, nonlinear, etc.).
B. test data
Which approach to Data Analytics attempts to assign each unit in a population into a small set of classes (or groups) where the unit best fits? A. similarity matching B. classification C. co-occurrence grouping D. regression
B. classification
(CHAPTER TWO) The metadata that describes each attribute in a database is which of the following? A. descriptive attributes B. Data dictionary C. Flat file D. composite primary key
B. Data dictionary
Big Data is often described by the three Vs, or A. variability, velocity, and variety. B. volume, velocity, and variety. C. volume, velocity, and variability. D. volume, volatility, and variability.
B. volume, velocity, and variety
Which approach to data analytics attempts to assign each unit in a population into a small set of classes where the unit belongs? A. Regression B. Similarity matching C. Classification D. Co-occurrence grouping
C. Classification
(CHAPTER ONE) Which skills were not emphasized that analytic-minded accountants should have? A. Data visualization B. Data quality C. Data and systems analysis and design D. Descriptive data analysis
C. Data and systems analysis and design
What are attributes that exist in a relational database that are neither primary nor foreign keys? A. Composite key B. Nondescript attributes C. Descriptive attributes D. Relational table attributes
C. Descriptive attributes
Which of the following is not typical of nominal data? A. Ethnic group B. Hair color C. SAT scores D. Gender
C. SAT scores
Under the guidance of the chief audit executive (CAE) or another manager, these individuals build teams to develop and implement analytical techniques to aid all of the following audits except: A. Process efficiency and effectiveness B. Support for the financial statement audit C. Tax compliance D. Governance, risk, and compliance including internal controls effectiveness
C. Tax compliance
The IMPACT cycle includes all of the following processes except: A. address and refine results B. communicate insights C. data preparation D. perform test plan
C. data preparation
Mastering the data can also be described via the ETL process. The ETL process stands for: A. extract, total, and load data. B. enter, total, and load data. C. extract, transform, and load data. D. enter, transform, and load data.
C. extract, transform, and load data.
Exhibit 4-8 (in book) gives chart suggestions for what data you'd like to portray. Those options include all of the following except: A. relationship B. comparison C. normalization D. distribution
C. normalization
In general, the more complex the model, the greater the chance of: A. the need to reduce the amount of data considered. B. underfitting the data. C. overfitting the data. D. pruning the data.
C. overfitting the data
Which of the following defines the time period, the level of materiality, and the expected time for an audit? A. Procedures and specific tasks B. Potential risk C. Methodology D. Audit scope
D. Audit scope
As mentioned in the chapter, which of the following is not a common way that data will need to be cleaned after extraction and validation? A. Format negative numbers B. Remove headings and subtotals C. Correct inconsistencies across data D. Clean up trailing zeroes
D. Clean up trailing zeroes
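The three cleaning tasks named as common in the question above (formatting negative numbers, removing headings and subtotals, correcting inconsistencies) can be sketched on a tiny extract. The row layout, the accounting-style negative formats handled, and the rule "a data row starts with a numeric invoice number" are all assumptions made for illustration.

```python
def parse_amount(text):
    """Convert text such as '(300.00)' or '75.50-' to a signed float."""
    cleaned = text.strip().replace("$", "").replace(",", "")
    negative = False
    if cleaned.startswith("(") and cleaned.endswith(")"):
        negative, cleaned = True, cleaned[1:-1]   # parentheses mean negative
    elif cleaned.endswith("-"):
        negative, cleaned = True, cleaned[:-1]    # trailing minus sign
    value = float(cleaned)
    return -value if negative else value

def clean(rows):
    """Drop heading/subtotal rows and normalize names and amounts."""
    cleaned = []
    for invoice, customer, amount in rows:
        if not invoice.isdigit():   # headings and subtotals have no invoice number
            continue
        cleaned.append((int(invoice), customer.strip().upper(), parse_amount(amount)))
    return cleaned

# Hypothetical raw extract, made up for illustration.
raw_rows = [
    ["Invoice", "Customer", "Amount"],
    ["1001", "acme corp", "1,250.00"],
    ["1002", "ACME Corp ", "(300.00)"],
    ["Subtotal", "", "950.00"],
    ["Invoice", "Customer", "Amount"],
    ["1003", "Bolt LLC", "75.50-"],
]
tidy = clean(raw_rows)
print(tidy)
```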
All of the following may serve as standards for the audit methodology except: A. PCAOB's auditing standards B. COSO's ERM framework C. ISACA's COBIT framework D. FASB's accounting standards
D. FASB's accounting standards
Which of these is not included in the five steps of the ETL process? A. Determine the purpose and scope of the data request. B. Obtain the data. C. Validate the data for completeness and integrity. D. Learn what data is available in the data warehouse.
D. Learn what data is available in the data warehouse.
(CHAPTER THREE) Which approach to data analytics attempts to predict a relationship between two data items? A. Co-occurrence grouping B. Classification C. Similarity matching D. Link prediction
D. Link prediction
Which approach to Data Analytics attempts to predict a relationship between two data items? A. Regression B. Profiling C. Classification D. Link prediction
D. Link prediction
__________ data would be considered the most sophisticated type of data. A. Interval B. Ordinal C. Nominal D. Ratio
D. Ratio
What is the most appropriate chart when showing a relationship between two variables (according to Exhibit 4-8 in book)? A. Pie graph B. Histogram C. Bar chart D. Scatter chart
D. Scatter chart
__________ is a discriminating classifier that is defined by a separating hyperplane that works first to find the widest margin (or biggest pipe) and then works to find the middle line. A. Multiple regression B. Linear classifier C. Decision trees D. Support vector machines
D. Support vector machines
In general, the simpler the model, the greater the chance of: A. overfitting the data. B. pruning the data. C. the need to reduce the amount of data considered. D. underfitting the data.
D. Underfitting the data
If purchase orders are monitored for unauthorized activity in real time while month-end adjusting entries are evaluated once a month, those transactions monitored in real time would be an example of a: A. traditional audit B. continuous monitoring C. periodic test of internal controls D. continuous audit
D. continuous audit
Which of the following describes part of the goal of the ETL process: A. identify which approach to data analytics should be used. B. load the data into a relational database for storage. C. communicate the results and insights found through the analysis. D. identify and obtain the data needed for solving the problem.
D. identify and obtain the data needed for solving the problem.
The Fahrenheit scale of temperature measurement would best be described as an example of: A. discrete data B. nominal data C. continuous data D. interval data
D. interval data
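Fahrenheit is interval rather than ratio data because its zero point is arbitrary: differences are meaningful, but ratios are not. A short sketch with two made-up readings shows that "twice as many degrees" does not survive a change of scale, while the difference converts consistently.

```python
def fahrenheit_to_celsius(f):
    """Standard conversion between the two temperature scales."""
    return (f - 32) * 5 / 9

low_f, high_f = 50.0, 100.0
low_c, high_c = fahrenheit_to_celsius(low_f), fahrenheit_to_celsius(high_f)

ratio_f = high_f / low_f   # ratio on the Fahrenheit scale
ratio_c = high_c / low_c   # same two temperatures, different ratio
diff_f = high_f - low_f    # a 50-degree gap in Fahrenheit
diff_c = high_c - low_c    # the same gap, rescaled by 5/9

print(ratio_f, round(ratio_c, 3))
print(diff_f, round(diff_c, 3))
```

Because the ratio depends on which scale is used, statements like "twice as hot" are not supported by interval data.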
Justin Zobel suggests that revising your writing requires you to "be egoless—ready to dislike anything you have previously written," suggesting that it is __________ you need to please. A. your boss B. the customer C. yourself D. the reader
D. the reader