ACC Systems 2 Trinkle Ch 1-4


As mentioned in the chapter, which of the following is not a common way that data will need to be cleaned after extraction and validation?

Clean up trailing zeroes.

Big Data is often described by the three V's, or

Volume, velocity, and variety

__________ mark the split between one class and another.

Decision boundaries

Which approach to Data Analytics attempts to identify similar individuals based on data known about them?

Similarity matching

Models associated with regression and classification data approaches have all but this important part:

Test data.

The observation about the frequency distribution of leading digits in many real-life sets of numerical data is called:

Benford's law.
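
As a rough illustration of Benford's law: the expected share of values with leading digit d is log10(1 + 1/d), so about 30.1% of values start with 1 and about 4.6% start with 9. The sketch below computes those expected shares and tallies observed leading digits for a list of amounts. The function names are hypothetical and chosen only for this example.

```python
import math


def benford_expected(digit: int) -> float:
    """Expected proportion of values whose leading digit is `digit` (1-9)."""
    if not 1 <= digit <= 9:
        raise ValueError("leading digit must be between 1 and 9")
    return math.log10(1 + 1 / digit)


def leading_digit(value: float) -> int:
    """First nonzero digit of a nonzero number, e.g. 4520 -> 4, 0.0071 -> 7."""
    # Scientific notation always puts the leading digit first: '4.52...e+03'.
    return int(f"{abs(value):.15e}"[0])


def observed_frequencies(values):
    """Share of nonzero values starting with each digit 1-9."""
    nonzero = [v for v in values if v != 0]
    counts = {d: 0 for d in range(1, 10)}
    for v in nonzero:
        counts[leading_digit(v)] += 1
    total = len(nonzero)
    return {d: (counts[d] / total if total else 0.0) for d in counts}
```

Comparing observed_frequencies(amounts) with benford_expected(d) for each digit is one simple way to flag a data set whose leading digits look unusual.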

Which approach to Data Analytics attempts to assign each unit in a population into a small set of classes where the unit belongs?

Classification

Which skills were not emphasized as ones that analytic-minded accountants should have?

Classification of test approaches & Data and systems analysis and design

The metadata that describes each attribute in a database is which of the following?

Data dictionary

Which of these terms is defined as being a central repository of descriptions for all of the data attributes of the dataset?

Data dictionary

What are attributes that exist in a relational database that are neither primary nor foreign keys?

Descriptive attributes

Mastering the data can also be described via the ETL process. The ETL process stands for:

Extract, transform, and load data.
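
A minimal sketch of the three ETL stages, assuming a tiny in-memory CSV as the source and a plain Python list as the load target. The column names, sample rows, and function names are made up for illustration and do not come from the chapter.

```python
import csv
import io

# Hypothetical raw export with stray spaces and a missing balance.
RAW_CSV = "supplier_id,supplier_name,balance\n 1 ,Acme ,100.50\n2,Globex,\n"


def extract(text):
    """Extract: read the raw rows exactly as they appear in the source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: trim whitespace, convert types, mark missing values as None."""
    cleaned = []
    for row in rows:
        balance = row["balance"].strip()
        cleaned.append({
            "supplier_id": int(row["supplier_id"].strip()),
            "supplier_name": row["supplier_name"].strip(),
            "balance": float(balance) if balance else None,
        })
    return cleaned


def load(rows, target):
    """Load: append the cleaned rows to the target store and report the count."""
    target.extend(rows)
    return len(rows)
```

Calling load(transform(extract(RAW_CSV)), warehouse) with an empty list named warehouse runs the three stages in order.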

The goal of the ETL process is to

Identify and obtain the data needed for solving the problem.

The Fahrenheit scale of temperature measurement would best be described as an example of:

Interval data.

Why is Supplier ID considered to be a primary key for a Supplier table?

It contains a unique identifier for each supplier.
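
To illustrate why a primary key must be unique, the sketch below builds a Supplier table in an in-memory SQLite database and shows that a second row with the same Supplier ID is rejected. The table layout, column names, and the try_insert helper are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE supplier ("
    "supplier_id INTEGER PRIMARY KEY, "  # unique identifier for each supplier
    "supplier_name TEXT NOT NULL)"       # descriptive attribute
)


def try_insert(connection, supplier_id, supplier_name):
    """Return True if the row was stored, False if the key already exists."""
    try:
        connection.execute(
            "INSERT INTO supplier (supplier_id, supplier_name) VALUES (?, ?)",
            (supplier_id, supplier_name),
        )
        return True
    except sqlite3.IntegrityError:
        # The database refuses a duplicate primary key value.
        return False
```

Inserting supplier 1 twice succeeds the first time and fails the second, while supplier 2 is accepted.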

Which of these is not included in the five steps of the ETL process?

Learn what data is available in the data warehouse.

Which approach to Data Analytics attempts to predict a relationship between two data items?

Link prediction

__________ data would be considered the least sophisticated type of data.

Nominal

The advantages of storing data in a relational database include which of the following: Option A - Help in enforcing business rules. Option B - Increased information redundancy. Option C - Integrating business processes.

Only Option A and Option C.

Gold, silver and bronze medals would be examples of

Ordinal data.

In general, the more complex the model, the greater the chance of

Overfitting the data.

Which attribute is required to exist in each table of a relational database and serves as the "unique identifier" for each record in a table?

Primary key

Line charts are not recommended for what type of data?

Qualitative data

__________ data would be considered the most sophisticated type of data.

Ratio

Which of the following is not a typical example of nominal data?

SAT scores

By the year 2020, about 1.7 megabytes of new information will be created every:

Second.

In the late 1960s, Ed Altman developed a model to predict if a company was at severe risk of going bankrupt. He called his statistic Altman's Z-score, and it is now a widely used score in finance. Based on the name of the statistic, which statistical distribution would you guess this came from?

Standardized normal distribution
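
For reference, a general z-score standardizes a value as z = (x - mean) / standard deviation, which is how observations are placed on the standardized normal scale. The sketch below applies that formula to a list of numbers. It shows only the general standardization, not Altman's bankruptcy formula, and the function name is hypothetical.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize each value using the population mean and standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("all values are identical, so z-scores are undefined")
    return [(v - mu) / sigma for v in values]
```

For the values 2, 4, 4, 4, 5, 5, 7, 9 the mean is 5 and the population standard deviation is 2, so the first value standardizes to -1.5 and the last to 2.0.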

Data that are organized and reside in a fixed field within a record or a file. Such data are generally contained in a relational database or spreadsheet and are readily searchable by search algorithms. The term matching this definition is:

Structured data.

__________ is a discriminating classifier that is defined by a separating hyperplane that works first to find the widest margin (or biggest pipe) and then works to find the middle line.

Support vector machine

__________ is a set of data used to assess the degree and strength of a predicted relationship.

Test data

Justin Zobel suggests that revising your writing requires you to "be egoless—ready to dislike anything you have previously written," suggesting that it is __________ you need to please:

The reader

The purpose of transforming data is:

To validate the data for completeness and integrity.

In general, the more simple the model, the greater the chance of

Underfitting the data.

The IMPACT cycle includes all but the following process:

Visualize the data. & Data preparation.

