Advanced Spreadsheet Exam 1

Interval Data

Zero is just another value on the scale and does not mean absence (e.g., temperature in degrees Fahrenheit)

The trend line in your chart should take up to _______% of the chart.

66%

Asking questions like "Are our customers paying us in a timely manner" would be the first step in which of the following processes?

IMPACT cycle

IMPACT Cycle

Identify the questions, Master the data, Perform the test plan, Address and refine results, Communicate insights, Track outcomes

Which approach to Data Analytics attempts to predict relationship between two data items?

Link prediction

Which of the following is not one of the means of cleaning the data after extraction and validation?

Load the data into the software program in preparation for analysis

_____ data are considered the most sophisticated type of data.

Ratio

__________ data would be considered the most sophisticated type of data.

Ratio

What type of database are you most likely to come across when extracting and using accounting and financial data?

Relational

By the year 2020, about 1.7 megabytes of new information will be created every:

Second.

Which approach to Data Analytics attempts to identify similar individuals based on data known about them?

Similarity matching

Data Analytics may use what source to assess the probability of a goodwill write-down, warranty claims or the collectibility of bad debts?

Social media

Foreign Key

attribute that points to a primary key in another table

The Forbes Insight/KPMG report, "Audit 2020: A Focus on Change.", found that the vast majority of survey respondents believe that technology will:

enhance the quality, transparency and accuracy of the audit

A target is an expected attribute or value that you want to ________

evaluate

Steps of classification, in order

- Identify the classes you wish to predict
- Manually classify an existing set of records
- Select a set of classification models
- Divide your data into training and testing sets
- Generate your model
- Interpret the results and select the "best" model

Steps of profiling, in order

- Identify the object or activity you want to profile
- Determine the types of profiling you want to perform
- Set boundaries or thresholds for the activity
- Interpret the results and monitor the activity and/or generate a list of exceptions
- Follow up on exceptions

In the example provided in the text regarding employee turnover, the analyst is trying to predict employee turnover based on current professional salaries, health of the economy (GDP), and salaries offered by other accounting firms. In this scenario, select the explanatory variable(s).

- salaries offered by other accounting firms
- current professional salaries
- health of the economy

ETL Process

- Determine the purpose and scope of the data request
- Obtain the data
- Validate the data for completeness and integrity
- Clean the data
- Load the data for data analysis

According to the text, cleaning the data takes between ____ and ____ percent of data analytic professional's time.

50; 90

What is a common difference between a bar chart and a pie chart?

A bar chart can easily show comparisons, where a pie chart cannot.

Which of the following is an accurate description of the Audit Data Standards?

A guide for formatting the way in which data are provided to auditors.

Select the correct definition of class.

A manually assigned category applied to a record based on an event.

Asking colleagues what they think of the analysis would be considered to be part of which stage of the IMPACT cycle.

Address and Refine Results

In which step of the IMPACT cycle do data analysts slice and dice the data, find correlations, ask ourselves further questions, ask colleagues what they think, and revise and rerun the analysis?

Address and Refine Results

Slicing and dicing the data, finding correlations, revising and rerunning the analysis would be considered to be part of which stage of the IMPACT cycle.

Address and Refine Results

Revising and refining your testing are in what stage of the IMPACT model?

Address and refine results

Which of the following is not a benefit of storing data in a relational database?

All of the data is stored in the same table

When determining the data scale, which of the following decisions is relevant?

Are there outliers? Will the data be skewed? What is the context?

_____ is an observation about the frequency of leading digits in many real-life sets of numerical data.

Benford's law

An observation about the frequency of leading digits in many real-life sets of numerical data is called:

Benford's law.
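
As an illustration of the card above, here is a minimal Python sketch of a leading-digit test. It assumes the usual statement of Benford's law, where the expected share of leading digit d is log10(1 + 1/d). The invoice amounts and the function names are hypothetical, made up for this example.

```python
import math

def benford_expected(digit):
    """Expected share of values whose leading digit is `digit` (1-9)."""
    return math.log10(1 + 1 / digit)

def leading_digit(value):
    """First nonzero digit of a nonzero number, ignoring sign and decimal point."""
    # Scientific notation puts the first significant digit at position 0.
    return int(f"{abs(value):.15e}"[0])

def leading_digit_shares(values):
    """Observed share of each leading digit among the nonzero values."""
    nonzero = [v for v in values if v != 0]
    counts = {d: 0 for d in range(1, 10)}
    for v in nonzero:
        counts[leading_digit(v)] += 1
    return {d: counts[d] / len(nonzero) for d in counts}

# Hypothetical invoice amounts (not from the source).
amounts = [120.50, 1890.00, 23.10, 145.00, 3100.00,
           17.75, 960.00, 1024.00, 250.00, 11.99]

observed = leading_digit_shares(amounts)
for d in range(1, 10):
    print(d, round(benford_expected(d), 3), round(observed[d], 3))
```

With only ten values the observed shares are noisy; in practice the comparison is made on a large population of transactions.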

Data sets that are too large and complex for businesses' existing systems to handle are called

Big Data

Which of the following is an example of discrete data?

Birth date

Which of the following visualizations is useful for showing the purchasing amounts of different groups of customers?

Box and whisker plot

Qualitative Data

Categorical - nominal and ordinal

Decision Boundaries

A technique used to mark the split between one class and another

Which approach to Data Analytics attempts to assign each unit in a population into a small set of classes (or groups) where the unit best fits?

Classification

Which approach to data analytics attempts to assign each unit in a population into a small set of classes where the unit belongs?

Classification

As mentioned in the chapter, which of the following is not a common way that data will need to be cleaned after extraction and validation?

Clean up trailing zeroes.

Which testing approach would be considered to be an attempt to divide individuals (like customers) into groups (or clusters) in a useful or meaningful way?

Clustering

Which testing approach would be considered to be an attempt to discover associations between individuals based on transactions involving them?

Co-occurrence Grouping

After data analysts slice and dice the data, find correlations, ask ourselves further questions, ask colleagues what they think, and revise and rerun the analysis, what comes next in the IMPACT cycle?

Communicate Insights

Data Visualization would be part of which step of the IMPACT cycle?

Communicate Insights

Deciding whether to use declarative and exploratory visualizations fit in which phase of the IMPACT model?

Communicate Results

Once the data has been analyzed and the results have been refined, what is the next step in the IMPACT cycle?

Communicate insights

Which of the following is not a step for validating the data for completeness and integrity?

Creating a visualization that describes the data that was extracted in order to communicate the results of the test plan

In the example regarding the LendingClub data in which the analyst is researching loan rejection, they identified three possible indicators for why a loan would be rejected, the debt-to-income ratio, length of employment, and credit [risk] score. Which of the following is/are the explanatory variable(s)?

- Credit [risk] score
- Debt-to-income ratio
- Length of employment

What process involves the technologies, systems, practices, methodologies, databases, statistics, and applications used to analyze diverse business data to give organizations the information they need to make decisions?

Data Analytics

Which would not be considered as one of the seven skills that analytic-minded accountants should have?

Data description Become a data scientist

The metadata that describes each attribute in a database is which of the following?

Data dictionary

Which of these terms is defined as being a central repository of descriptions for all of the data attributes of the dataset?

Data dictionary

_____ are used to make communication easier between the data requester and the data provider.

Data request forms

__________ mark the split between one class and another.

Decision boundaries

Which of the following options are possible answers to the question 'Are you explaining the results of previously done analysis, or are you exploring the data through the visualization?'

- Declarative
- Exploratory

What are attributes that exist in a relational database that are neither primary nor foreign keys?

Descriptive attributes

After you have identified the objects or activity you wish to profile, what should you do next?

Determine the types of profiling you want to perform

Which of the following is not one of the considerations for determining the purpose and scope of the data request?

Determining how the data will be cleaned.

Which of the following steps is completed during the Communicate Results stage of the IMPACT model?

Determining if you are explaining the results of previously done analysis, or if you are exploring the data through the visualization.

When is a primary key required?

Every table in a relational database requires a primary key.

While SQL can be used to create, update, and delete records, we will focus on doing which of the following with SQL?

Extracting data

What type of information would be useful to communicate a data analysis project to a programmer or database administrator?

Extraction, transforming, and loading details

True or false: Classification requires that we know a great deal about the observation that we're attempting to place in a class.

False

True or false: Data analytics involves only the analysis of unstructured data.

False

True or false: Data visualizations are just for "visual" learners.

False

True or false: Pie charts are most useful for numerical data.

False

True or false: When clustering works well, observations within a segment should be different, and the data across segments should be very similar.

False

Which of the following visualizations is useful for showing the relative spending of customers in different locations?

Filled geographic map

After you have identified the attribute you would like to reduce or focus on, what is the next step?

Filter the results.

In which format do analysts typically prefer to analyze data?

Flat file (such as Excel)

Which type of attribute is required to facilitate a relationship between two tables in a normalized, relational database?

Foreign Key

_____ looks for similarities between portions, or segments, of the text of each potential match.

Fuzzy match
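
One way to sketch a fuzzy match in Python is with the standard library's `difflib.SequenceMatcher`, which scores how much two strings overlap. This is an illustrative choice, not necessarily the technique the textbook or any particular tool uses. The vendor names are hypothetical.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Return a 0-1 similarity score, ignoring case and surrounding spaces."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()

# Hypothetical vendor name and candidate matches from another table.
vendor = "Acme Supply Co."
candidates = ["ACME Supply Company", "Apex Staffing LLC", "Acme Suply Co"]

scores = {name: similarity(vendor, name) for name in candidates}
best = max(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: {score:.2f}")
print("Best match:", best)
```

A cutoff score for accepting a match is a judgment call and would need to be tuned on real data.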

Which of the following common visualizations is most useful for showing the relative size of a value by using a color scale.

Heat map

Due to its iterative nature, what comes next in the iterative IMPACT cycle after outcomes are tracked?

Identify the questions

Which of the following is not one of the considerations for obtaining the data?

Identifying any risks that exist in data integrity, as well as the mitigation plan.

___________________ data might be used to address many of the questions facing financial reporting.

Internal and external

What type of information would be useful to communicate a data analysis project to a manager?

Interpretation of results and visualization

Why is Supplier ID considered to be a primary key for a Supplier table?

It contains a unique identifier for each supplier.

Which of the following is true regarding the profiling approach?

It is generally performed on data that is readily available.

What is a benefit of storing data in a relational database?

It maintains "one version of the truth" across multiple data elements.

Which of the following is true regarding the Data Reduction approach?

It primarily uses structured data that is readily searchable.

When you need to retrieve data that is stored in more than one table, which type of clause should you use in your SQL query?

Join

Which of the following visualizations is useful for showing the change in stock price over time?

Line chart

Which of these charts does not do a good job visually representing qualitative data?

Line graph

What is the terminology for the items that are useful for ranking observations rather than simply predicting class probability?

Linear classifiers

Which approach to data analytics attempts to predict a relationship between two data items?

Link prediction

In the example regarding the LendingClub data in which the analyst is researching loan rejection, they identified three possible indicators for why a loan would be rejected, the debt-to-income ratio, length of employment, and credit [risk] score. Which is the response variable?

Loan rejection

_____ include both unsupervised exploratory analysis and supervised model generation to provide insight and predictive foresight into the business and decisions made by accountants and auditors.

Machine learning and artificial intelligence

After you have identified the classes you wish to predict, what is the next step?

Manually classify an existing set of records.

ETL (Extraction, Transformation and Loading) would be an example of which step in the IMPACT cycle?

Master the Data

Reviewing data availability in a firm's internal and external systems would be an example of which step in the IMPACT cycle?

Master the Data

Scrubbing the data would be an example of which step in the IMPACT cycle?

Master the Data

After "Identifying the Question", the next step in the IMPACT cycle is to:

Master the data

__________ data include data that contains simple data such as categories, gender, or ethnic group.

Nominal

__________ data would be considered the least sophisticated type of data.

Nominal

The advantages of storing data in a relational database include which of the following? Option A - Help in enforcing business rules Option B - Increased information redundancy Option C - Integrating business processes

Only Option A and Option C.

If the rank or order of the data matters, what kind of data are you working with?

Ordinal data

__________________ discovered from past archives enable business to identify opportunities and risks and better plan for the future.

Patterns

Which type of attribute is required in each table in a normalized, relational database?

Primary Key

Purchase Order Number, Date, Supplier ID

- Purchase Order Number: primary key
- Date: descriptive attribute
- Supplier ID: foreign key

Which attribute is required to exist in each table of a relational database and serves as the "unique identifier" for each record in a table?

Primary key

_____ might be used to identify areas where there is a lack of controls, changes in procedures, or individuals more willing to spend excessively in potential types of T&E expenses which might be associated with higher risk.

Profiling

What is the terminology for removing branches from a decision tree to avoid overfitting the model?

Pruning

Which of the following options are possible answers to the question 'What type of data are being analyzed'?

- Qualitative
- Quantitative

Line charts are not recommended for what type of data?

Qualitative data

Which of the following visualizations is useful for showing the relationship between income and spending?

Scatter plot

SQL can extract data from two related tables. Place the following lines of SQL code in order to create a query that would retrieve all of the data from the Sales_Subset and the Customer tables.

SELECT *
FROM Customer
INNER JOIN Sales_Subset
ON customer ID
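
A runnable sketch of this join, using Python's built-in `sqlite3` module with an in-memory database. The table names come from the card above; the column names (`Customer_ID`, `Customer_Name`, `Sale_ID`, `Amount`) and the sample rows are assumptions made for illustration, since the card does not list them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Column names and rows below are hypothetical.
conn.executescript("""
    CREATE TABLE Customer (
        Customer_ID   INTEGER PRIMARY KEY,
        Customer_Name TEXT
    );
    CREATE TABLE Sales_Subset (
        Sale_ID     INTEGER PRIMARY KEY,
        Customer_ID INTEGER REFERENCES Customer(Customer_ID),  -- foreign key
        Amount      REAL
    );
    INSERT INTO Customer VALUES (1, 'Acme'), (2, 'Birch');
    INSERT INTO Sales_Subset VALUES (10, 1, 250.0), (11, 1, 75.5), (12, 2, 40.0);
""")

# The join matches the foreign key in Sales_Subset to the primary key in Customer.
query = """
    SELECT *
    FROM Customer
    INNER JOIN Sales_Subset
        ON Customer.Customer_ID = Sales_Subset.Customer_ID
    ORDER BY Sales_Subset.Sale_ID
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)

conn.close()
```

Each result row carries the customer columns followed by the sale columns, so the shared field appears twice when `SELECT *` is used.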

Which testing approach would be considered to be an attempt to identify similar individuals based on data known about them?

Similarity Matching

In the late 1960s, Ed Altman developed a model to predict if a company was at severe risk of going bankrupt. He called his statistic Altman's Z-score, now a widely used score in finance. Based on the name of the statistic, which statistical distribution would you guess this came from?

Standardized normal distribution

__________ is a discriminating classifier that is defined by a separating hyperplane that works first to find the widest margin (or biggest pipe) and then works to find the middle line.

Support vector machines

Which of the following common visualizations is most useful for showing geographic areas?

Symbol map

__________ is a set of data used to assess the degree and strength of a predicted relationship.

Test data

When you need to extract data from more than one table in a SQL query, what do you need to identify in order to properly join the tables?

The two fields that the tables have in common.

What is the purpose of profiling?

To gain an understanding of a typical behavior of an individual, group, population, or sample.

Which of the following is a good use of exploratory visualizations?

To identify an appropriate model

What is the purpose of clustering?

To identify groups of similar data elements and the underlying drivers of these groups.

What is the purpose of a data request form?

To make communication easier between data requester and provider.

What is the purpose of classification?

To predict which class an observation that we know little about will belong to.

Which of the following is a reason to use a declarative visualization?

To prompt conversation and debate

________ data are existing data that have been manually evaluated and assigned a class. _________ data are existing data used to evaluate the model.

Training; test
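
A minimal sketch of dividing manually classified records into training and test sets, using only the Python standard library. The 25% test share, the fixed seed, and the sample loan records are assumptions chosen for the example, not values from the text.

```python
import random

def train_test_split(records, test_fraction=0.25, seed=42):
    """Shuffle a copy of the records and split it into (training, test) lists."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical labeled records: (debt-to-income ratio, manually assigned class).
labeled = [
    (0.12, "approved"), (0.45, "rejected"), (0.30, "approved"), (0.52, "rejected"),
    (0.18, "approved"), (0.61, "rejected"), (0.25, "approved"), (0.48, "rejected"),
]

training, test = train_test_split(labeled)
print(len(training), "training records;", len(test), "test records")
```

The model is generated from `training` only; `test` is held back so the model can be evaluated on records it has not seen.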

In the following question, what would be the target? Given a set of customer data, we are trying to predict the total transaction amount based on a variety of attributes.

Transaction amount

True or false: Comparing the number of records extracted to the number of records in the source database is a means of validating the data for completeness and integrity.

True

True or false: Data Analytics can impact Financial Accounting by helping evaluate estimates and valuations.

True

True or false: Exploratory visualization aligns with performing the test plan, gaining insights while you are interacting with the data.

True

True or false: When there are many categories, it may make sense to use a rank-ordered bar chart rather than a pie chart.

True

Which of the following is an example of continuous data?

Turnover ratio

The 3 V's describing Big Data include: Velocity, Variety and ________________.

Volume

Which of the following common visualizations is most useful for showing the frequency of words in a document?

Word cloud

A _________ is used to rescale a distribution so that its mean becomes 0 and each standard deviation becomes 1.

Z-score

Composite key

a combination of two foreign keys used for line items

Decision Tree

a tool that is used to divide data into smaller groups

Financial accounting often has challenges with valuation and estimation in all but the following area:

accounts payable

The Forbes Insight/KPMG report, "Audit 2020: A Focus on Change.", found that the vast majority of survey respondents believe that technology will enhance the quality, transparency, and ____________ of the audit.

accuracy

Charts for quantitative data

all of the charts for qualitative data except word clouds, plus line charts, box and whisker plots, scatter plots, and filled geographic maps

When revising your data analysis plan and communications, it is always a good idea to

ask others to read your writing and make sure it is clear.

Visualizations should help ______.

avoid bias. share information in a clear, concise manner. minimize distractions.

Data sets that are too large and complex for businesses' existing systems to handle are called _______________.

big data

Continuous Data

can take on any value within a range (4.5 feet)

Financial accounting often has challenges with valuation and estimates in all but the following area:

cash

A class is a manually assigned _____________ applied to a record based on an event.

category or group

Classification is a method that can be used to predict the ________ of a new observation.

class or category

An attempt to assign each unit (or individual) in a population into a few categories would be called the _____________ approach.

classification

All of the following are considered to be steps for validating the data after extraction except the following:

clean leading zeroes and nonprintable characters

The purpose of comparing the number of records and descriptive statistics for numeric fields is to ensure that the data were extracted _________.

completely
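
The check described above can be sketched in Python as follows. The function name, the `amount` field, and the sample rows are hypothetical. The sketch assumes both row lists are non-empty and that the field holds numbers.

```python
import math

def validate_extract(source_rows, extracted_rows, numeric_field):
    """Compare record counts and simple descriptive statistics for one numeric field."""
    src = [row[numeric_field] for row in source_rows]
    ext = [row[numeric_field] for row in extracted_rows]
    return {
        "count_matches": len(src) == len(ext),
        "sum_matches": math.isclose(sum(src), sum(ext)),
        "min_matches": min(src) == min(ext),
        "max_matches": max(src) == max(ext),
    }

# Hypothetical source table and an extract that silently dropped one record.
source = [{"amount": 100.0}, {"amount": 250.5}, {"amount": 75.25}, {"amount": 40.0}]
extract = source[:3]

result = validate_extract(source, extract, "amount")
print(result)
```

Any `False` value signals that the extract may be incomplete and should be investigated before analysis.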

When evaluating classifiers, you need to be careful to strike a balance between what two things?

complexity of the model and accuracy of the classification

Data that are represented by values within a range and include decimals, such as measurements in inches, are considered:

continuous data.

The number 4.67 is an example of a _____ data point.

continuous or quantitative

Ordinal data

counted and categorized like nominal but can be ranked

The extraction process requires two steps. One of the steps is obtaining the __________.

data

When obtaining the data yourself, one of the best tools to use to identify the tables that you could use would be a ______dictionary.

data

The firm practice of monitoring competitors, customers and suppliers to better understand its opportunities and threats is called __________.

data analytics

The real value inherent in data comes from ____________, discovering the various buying patterns of customers, investigating anomalies that were not predicted in firm operations, forecasting future demand and supply and so on.

data analytics

Ratio data

has a meaningful zero: a value of 0 means the "absence" of the thing measured (e.g., dollars)

Discrete Data

data that are represented by whole numbers (e.g., points in a football game)

A(n) _____ visualization is used to present findings to an audience.

declarative

Excel is more useful than Tableau if your data analysis project is more _______________.

declarative.

Variance analysis, a common practice in management accounting, is an example of _____ analytics.

diagnostic

As Justin Zobel says in Writing for Computer Science, "good style for science is ultimately, nothing more than writing that is easy to understand. [It should be] clear, unambiguous, correct, interesting, and ___________.

direct

A(n) _____ visualization is used to gain insights while you are interacting with the data.

exploratory

In the example provided in the text regarding employee turnover, the analyst is trying to predict employee turnover based on current professional salaries, health of the economy (GDP), and salaries offered by other accounting firms. In this scenario, what is the response variable?

employee turnover

Training Data

existing data that have been manually evaluated and assigned a class

Test Data

existing data used to evaluate the model

According to Exhibit 4.3, interactive charts would generally be considered to be ______________ in nature.

exploratory

Mastering the data can also be described via the ETL process. The ETL process stands for:

extract, transform, and load data.

A specific type of data profiling that is used to look for correspondences between portions, or segments, of text for potential matches is called __________ match.

fuzzy

Clustering is an unsupervised method that is used to find natural ________ within the data.

groupings, categories, groups, or similarities

Audit firms are increasingly considering operational data such as manufacturing logs, customer relationship management data and supply chain data primarily to ______________________:

help companies refine their operations

Which of the following describes part of the goal of the ETL process:

identify and obtain the data needed for solving the problem.

Descriptive attributes

include everything else

Quantitative data are most easily expressed as ____________ data.

interval

The Fahrenheit scale of temperature measurement would best be described as an example of:

interval data.

According to the textbook, Data Analytics can be applied to taxes by helping to predict the tax consequences of a potential international transaction, a proposed merger or acquisition or ___________.

investment in R&D (research and development)

Identifying trends over time is best visualized in a _____.

line chart

Quantitative Data

made up of numerical observations that can be counted, ranked, and averaged - ratio, interval, and continuous

Classification predicts a class for a new observation based on the ____________ identification of classes from previous observations.

manual

Tax compliance deals primarily with filing tax returns. In contrast, tax planning primarily helps

minimize the amount of taxes paid.

As part of mastering the data, data analysts perform data ________________ to reduce data redundancy and improve data integrity.

normalization

Gold, silver, and bronze medals would be examples of:

ordinal data.

Generally the more complex and complete the model, the higher degree of the model _____ the data.

overfitting

In general, the more complex the model, the greater the chance of:

overfitting the data.

Charts for Qualitative Data

pie charts, bar graphs, stacked bar charts, tree maps and heat maps, symbol maps, word clouds

Machine learning, artificial intelligence and decision support systems are all examples of _____ analytics.

prescriptive

Decision support systems are an example of _____.

prescriptive analytics

The extraction process requires two steps. One of the steps is determining the _________ and __________ of the data request.

purpose; scope

Nominal data and ordinal data are examples of _____ data.

qualitative

According to Exhibit 4.3, conventional and static charts would be considered to be declarative and ____________.

quantitative

Data that has a meaningful difference between data points is considered

quantitative data.

Qualitative data are most easily expressed as ____________ data.

ratio

With _______ data the number 0 means that the item is now missing or absent.

ratio

The four benefits of storing data in a relational database are completeness of data, no __________ data, business rules are enforced, and communication and integration of business processes.

redundant

An attempt to estimate or predict, for each unit, the numerical value of some variable using some type of statistical model would be called the _____________ approach.

regression
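
As a small illustration of the regression approach, the sketch below fits a straight line by ordinary least squares with one explanatory variable. Simple linear regression is only one of several regression models, and the advertising and sales figures are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: advertising spend (explanatory) and sales (response).
ad_spend = [10, 20, 30, 40]
sales = [25, 45, 65, 85]

slope, intercept = fit_line(ad_spend, sales)
predicted = intercept + slope * 50  # predict sales for a new spend level
print(slope, intercept, predicted)
```

The numerical prediction for a new unit is what separates regression from classification, which predicts a class instead.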

A UML Class Diagram is used to support and design a _________ database.

relational

Traditional audit approaches tested a _____________ of the transactions; in contrast, audits that fully integrate big data and analytics will test the full _______________ of data.

sample; population

Structured data are stored in a database or spreadsheet and are readily ___________.

searchable

nominal data

simple form, but can't be ranked - e.g., hair color, gender

Data visualization can be used with _____ data to make the results easier to interpret.

- small
- big

In a significant paradigm shift, data analytics will allow auditors to:

stay engaged with clients beyond the audit

Data that are organized and reside in a fixed field with a record or a file are generally contained in a relational database or spreadsheet and are readily searchable by search algorithms. The term matching this definition is:

structured data.

A/an ________ approach is used when you are performing analysis that uses historical data to predict a future outcome based on a specific question.

supervised

Models associated with regression and classification data approaches have all except this important part:

test data.

Justin Zobel suggests that revising your writing requires you to "be egoless—ready to dislike anything you have previously written," suggesting that it is __________ you need to please.

the reader

What is XBRL used for?

to facilitate the exchange of financial reporting information between a company and the SEC.

The purpose of transforming data is:

to validate the data for completeness and integrity.

The Forbes Insight/KPMG report, "Audit 2020: A Focus on Change.", found that the vast majority of survey respondents believe that technology will enhance the quality, __________, and accuracy of the audit.

transparency

A decision _________ is a tool used to divide data into smaller groups. Decision ________ is a technique used to mark the split between one class and another.

tree (or trees); boundaries (or boundary)

McKinsey Global Institute estimates that Data Analytics could generate up to $3 _____ in value each year.

trillion

In general, the simpler the model, the greater the chance of:

underfitting the data.

Primary key

unique identifier

Profiling is a/an _______ method that is used to discover patterns of behavior, based on the distance of z-scores from the mean.

unsupervised

As part of mastering the data, data analysts perform data ________________ to provide a sense of the reliability of the data.

validation and completeness testing

Big Data is often described by the three Vs, or

volume, velocity, and variety

Using a classification model, you can predict ___________ a new vendor belongs to one class or another based on the behavior of others.

whether

Knowing the mean and standard deviation, and assuming a normal distribution, one can compute which statistic that can be used to identify abnormal transactions?

z-score
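
A minimal sketch of the calculation: each value's z-score is its distance from the mean, measured in standard deviations. The expense amounts are hypothetical, and the cutoff of 2 standard deviations is an assumption for this example rather than a rule from the text.

```python
import statistics

def z_scores(values):
    """Return (value - mean) / standard deviation for each value."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)  # population standard deviation
    return [(v - mean) / stdev for v in values]

# Hypothetical expense claims with one unusually large amount.
expenses = [100, 102, 98, 101, 99, 100, 103, 97, 100, 400]

scores = z_scores(expenses)
threshold = 2.0  # assumed cutoff for "abnormal"
flagged = [v for v, z in zip(expenses, scores) if abs(z) > threshold]
print("Flagged:", flagged)
```

After standardizing, the scores have a mean of 0 and a standard deviation of 1, which is what lets one cutoff be applied across data measured on different scales.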

