MIS ch 6

Why is an effective ETL process essential to data warehousing?

"Dirty data" can result in incorrect or misleading statistics used for decision making.

Donna is a member of a team trying to select the best type of database for a business problem. The database must handle a variety of data, the data should be stored on a group of servers, and the data structures must be very flexible. What would you suggest?

A NoSQL database would likely be a better fit than a relational database.

Because they must deal with large quantities of data from so many different sources, IS employees at financial institutions may be at increased risk of failing to comply with government regulations designed to prevent money laundering, such as the _____.

Bank Secrecy Act

_____ is the term used to describe enormous and complex data collections that traditional data management software, hardware, and analysis processes are incapable of handling.

Big data

Which of the following is IBM's BI product?

Cognos Business Intelligence

_____ is used to explore large amounts of data for hidden patterns to predict future trends.

Data mining

____ is the last phase of the six-phase CRISP-DM method.

Deployment

Suppose your employer has four manufacturing units, each with its own data operations. They are considering using BI and analytics tools. Which of the following is a valid recommendation regarding the BI and analytics initiative?

Ensure that an effective company-wide data management program, including data governance, is in place.

Which of the following is required to create a traditional data warehouse but NOT a data lake?

Extract Transform Load process

What open-source software framework includes several software modules that provide a means for storing and processing extremely large data sets, organized into two primary components?

Hadoop

Which statement about Hadoop is correct?

Hadoop's HDFS divides data into subsets and distributes them onto different servers.

Which of the following is a potential disadvantage of self-service analytics?

It can lead to over-spending on unapproved data sources and business analytics tools.

You are to advise XYZ Corporation so that their BI and analytics efforts are fruitful. Which among the following is the most crucial advice of all?

Management must have a strong commitment to data-driven decision making.

Hadoop processes data using a Java-based system called _____.

MapReduce

Which statement regarding processing task completion using Hadoop's MapReduce program is correct?

MapReduce employs a JobTracker residing on the master server and TaskTrackers residing on other servers.

A newly discovered entity or attribute can be added to a NoSQL database dynamically because _____.

NoSQL databases do not require a predefined schema
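
The schema-less idea can be sketched with plain Python dictionaries standing in for documents. This is only an illustration, not the API of any real NoSQL product; the collection name `customers` and the field `loyalty_tier` are hypothetical.

```python
# A list of dicts stands in for a document collection (not a real NoSQL client).
customers = [
    {"id": 1, "name": "Ana"},
    {"id": 2, "name": "Raj"},
]

# A newly discovered attribute is added to one document directly;
# no table definition has to be altered first.
customers[1]["loyalty_tier"] = "gold"

# Code that reads the collection must tolerate documents lacking the new field.
tiers = [doc.get("loyalty_tier", "none") for doc in customers]
print(tiers)
```

The trade-off is that readers, not the database, are responsible for handling documents that lack the new field.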

One difference between NoSQL and relational databases is that _____.

NoSQL databases have a greater horizontal scaling capability

Sometimes Theodore's queries for data from his employer's NoSQL database do not return the most current data. Why is this?

NoSQL databases provide for "eventual consistency" when processing transactions.

Once customer orders are organized into queues based on product ID, the component of the Hadoop software that performs a summary operation, such as determining how frequently each product was ordered, is the _____.

Reduce method
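
A rough single-process sketch of the map, group, and reduce steps described in this card. It imitates the pattern only and does not use Hadoop's actual Java API; the order data and function names are made up for illustration.

```python
from collections import defaultdict

# Hypothetical order records: (order_id, product_id).
orders = [(101, "P1"), (102, "P2"), (103, "P1"), (104, "P3"), (105, "P1")]

def map_order(order):
    """Map step: emit a (product_id, 1) pair for each order."""
    _, product_id = order
    return (product_id, 1)

def shuffle(pairs):
    """Group the emitted values into queues keyed by product ID."""
    queues = defaultdict(list)
    for key, value in pairs:
        queues[key].append(value)
    return queues

def reduce_counts(product_id, values):
    """Reduce step: summarize one queue into a single order count."""
    return (product_id, sum(values))

queues = shuffle(map_order(o) for o in orders)
order_counts = dict(reduce_counts(k, v) for k, v in queues.items())
print(order_counts)
```

In a real cluster the map and reduce calls would run on different servers; here they run in one process so the data flow is easy to follow.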

From which vendor is the BI product Business Objects available?

SAP

_________ encourages nontechnical end users to make decisions based on facts and analyses rather than intuition.

Self-service analytics

The graphical representation that summarizes the steps a consumer takes in making the decision to buy your product and become a customer is called _____.

a conversion funnel

Hadoop's two major components are _____.

a data processing component and a distributed file system

Which of the following events is an example of a transformation that might occur during the second stage of the ETL process?

a sales district is substituted for a customer's street address
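
One way such a transformation might look, as a small sketch. The ZIP-to-district lookup table, the record layout, and the `UNASSIGNED` fallback are all assumptions made for the example.

```python
# Hypothetical lookup from ZIP code to sales district.
DISTRICT_BY_ZIP = {"45202": "Midwest-1", "10001": "Northeast-3"}

def transform(record):
    """Replace the customer's address details with a sales district."""
    district = DISTRICT_BY_ZIP.get(record["zip"], "UNASSIGNED")
    return {"customer_id": record["customer_id"], "sales_district": district}

# Records as they might arrive from the extract stage.
extracted = [
    {"customer_id": 7, "street": "12 Elm St", "zip": "45202"},
    {"customer_id": 8, "street": "9 Oak Ave", "zip": "99999"},
]
transformed = [transform(r) for r in extracted]
print(transformed)
```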

Marina is a data scientist at a large financial corporation, and therefore she _____.

aims to uncover insights that will influence organizational decisions

During the modeling phase of the CRISP-DM method, the team conducting the data mining project ______.

applies selected modeling techniques

Graph NoSQL databases _____.

are well-suited for analyzing interconnections

Which of the following is the LEAST essential characteristic for success as a data scientist?

business leadership skills

One challenge presented by the volume of big data is that _____.

business users can have a hard time finding the information they need to make decisions

An important role of the data scientist is to _____.

communicate his or her findings to organizational leaders

Melanie's company takes a "store everything" approach to big data, saving all of it in a raw, unaltered form. Only when she needs to analyze some of the data is it extracted from this _____.

data lake

A _____ is a subset of a data warehouse that is used by small- and medium-sized businesses and departments within large companies to support decision making.

data mart

Which of the following is NOT considered a component of business intelligence?

data transaction processing

Suppose you have access to quantitative data on populations by ZIP code and crime rates. You wish to determine if there is a relationship between the two variables and display the results in a graph. Which BI tool will be most useful?

data visualization tool

Barry's job responsibilities include helping maintain a large database that holds business information from over a dozen source systems, covering all aspects of his company's processes, products, and customers. This database contains not only enterprise data but also data from other organizations. Barry works with a(n) _____.

data warehouse

The Amazon DynamoDB and Oracle NoSQL Database products both support which data storage and retrieval models?

document and key-value

Jan is using her firm's data warehouse. If she is starting from monthly sales data, and wishes to get weekly sales data, she should use the ___ feature.

drill down
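
Drill-down amounts to re-aggregating the same facts at a finer grain. A minimal sketch, assuming hypothetical daily sales records and using ISO week numbers as the weekly grain:

```python
from collections import defaultdict
from datetime import date

# Hypothetical daily sales facts: (sale date, amount).
sales = [
    (date(2024, 3, 4), 100.0),
    (date(2024, 3, 6), 50.0),
    (date(2024, 3, 12), 75.0),
]

def total_by(records, key_func):
    """Sum amounts after grouping each record by key_func(sale date)."""
    totals = defaultdict(float)
    for day, amount in records:
        totals[key_func(day)] += amount
    return dict(totals)

# Summary level: one total per (year, month).
monthly = total_by(sales, lambda d: (d.year, d.month))

# Drill down: the same data grouped by (ISO year, ISO week).
weekly = total_by(sales, lambda d: tuple(d.isocalendar()[:2]))

print(monthly)
print(weekly)
```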

If not well managed, self-service BI and analytics may lead to poor decisions based on _____.

erroneous analysis and reporting

During which step of the ETL process can data that fails to meet expected patterns or values be rejected to help clean up "dirty data"?

extract

Which step of the ETL process has the goal of collecting source data from all the desired sources and converting it into a single format suitable for processing?

extract
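
The two extract cards above can be illustrated together: source rows are converted into one common format, and rows that fail an expected pattern are rejected. This is a sketch only; the five-digit ZIP rule and the field names are assumptions.

```python
import re

# Assumed validation rule: a ZIP code is exactly five digits.
ZIP_PATTERN = re.compile(r"\d{5}")

def extract(raw_rows):
    """Convert rows to a single format; reject rows that fail the pattern."""
    accepted, rejected = [], []
    for row in raw_rows:
        zip_code = str(row.get("zip", "")).strip()
        if ZIP_PATTERN.fullmatch(zip_code):
            accepted.append({"customer_id": int(row["id"]), "zip": zip_code})
        else:
            rejected.append(row)
    return accepted, rejected

# Rows from different hypothetical sources use different types for the same fields.
raw = [
    {"id": "7", "zip": "45202"},
    {"id": "8", "zip": "4520X"},
    {"id": 9, "zip": 10001},
]
accepted, rejected = extract(raw)
print(accepted)
print(rejected)
```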

A conversion funnel is a visual depiction of a set of words that have been grouped together because of the frequency of their occurrence.

false

A marketing manager who does not have deep knowledge of information systems or data science will NOT be able to use BI and analytics tools.

false

Business analytics can be used only for forecasting future business results.

false

Creative data scientists--a key component of BI efforts in an organization--are people who are primarily focused on coming up with novel ways of analyzing data.

false

Data scientists do not need much business domain knowledge.

false

In the regression equation y = ax1 + bx2, the coefficients a and b represent the dependent variables.

false

Organizations can collect many types of data from a wide variety of sources, but typically they only collect structured data that fits neatly into traditional relational database management systems.

false

Regression analysis focuses on plotting a sequence of well-defined data points measured at uniform time intervals.

false

Scenario analysis tools are the best choice for solving optimization-type problems.

false

Suppose a manager wishes to analyze historical trends in sales. He would use the online transaction processing (OLTP) system.

false

Users still need help from the IT function of the organization to create customer reports using modern reporting tools.

false

Jerome recommends that his company's IS team consider moving from a database on secondary storage to an in-memory database because the in-memory database would provide _____.

faster access to data

What does Hadoop's Map procedure from its MapReduce program do?

filters and sorts

What determines the size of words in a word cloud?

frequency of occurrence of the word in source documents
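
A small sketch of how word frequency could drive word size. The sample text and the linear scaling between 10 and 40 points are assumptions; real word cloud tools may scale differently.

```python
from collections import Counter
import re

text = "data drives decisions and good data beats opinion; data wins"

# Count how often each word occurs in the source text.
words = re.findall(r"[a-z]+", text.lower())
frequencies = Counter(words)

# Scale each word's font size linearly by its count (assumed point sizes).
MIN_PT, MAX_PT = 10, 40
top = max(frequencies.values())
font_sizes = {
    word: MIN_PT + (MAX_PT - MIN_PT) * count / top
    for word, count in frequencies.items()
}
print(font_sizes)
```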

Mollie has observed that her company's leadership lacks a strong commitment to data-driven decision making. As a result, when she learns that her company will be initiating a business intelligence and analytics program, she anticipates that _____.

her company will miss out on the real value of their BI and analytics

Which of these analysis methods describes neural computing?

historical data is examined for patterns that are then used to make predictions

A hospital system that wants to utilize big data can use HIPAA regulations to help them _____.

identify which data needs to be protected from unauthorized access

A database system that stores the entire database in random access memory is known as a(n) _____.

in-memory database

KDDI Corporation chose to consolidate its servers into a single Oracle SuperCluster running the Oracle TimesTen in-memory database in order to _____.

increase data access rates and efficiency

Which of the following is NOT a recognized BI and analytics technique?

online transaction processing

One of the goals of business intelligence is to _________.

present the results of analysis in an easy-to-understand manner

Some people are alarmed that big data applications allow organizations to develop extensive profiles of individuals without their knowledge or consent. This represents which type of concern related to big data?

privacy

Data scientists are a necessary component to ensure an organization's business intelligence and analytics efforts are effective because they _____.

pull together knowledge of the business and data analytics tools and techniques

The key challenges associated with big data include the difficulty of locating and deriving value from _____.

relevant data to make decisions

Self-service BI and analytics can exacerbate problems by _____.

removing checks and balances on data preparation and use

Jamie's corporation comes under scrutiny by the media when former employees allege that the IS department failed to correctly identify which data needed protection from unauthorized access. These accusers say that this organization is not ensuring its big data is _____.

secure

To identify and make predictions about various alternative scenarios, a manager would use _______.

simulation techniques

The purpose of business intelligence is to _____.

support improved decision making

Big data veracity is a measure of _____.

the accuracy, completeness, and currency of the data

During the load phase of the ETL process, _____.

the data is checked against the constraints defined in the database schema
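
A sketch of constraint checking at load time, using Python's built-in `sqlite3` module as a stand-in for the warehouse database. The `sales` table, its constraints, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The schema defines the constraints every loaded row must satisfy.
conn.execute(
    "CREATE TABLE sales ("
    " sale_id INTEGER PRIMARY KEY,"
    " district TEXT NOT NULL,"
    " amount REAL NOT NULL CHECK (amount >= 0))"
)

# Transformed rows ready to load; two of them violate the schema.
rows = [(1, "Midwest-1", 250.0), (2, None, 90.0), (3, "Northeast-3", -5.0)]

loaded, rejected = [], []
for row in rows:
    try:
        conn.execute("INSERT INTO sales VALUES (?, ?, ?)", row)
        loaded.append(row[0])
    except sqlite3.IntegrityError:
        # Raised when a NOT NULL or CHECK constraint is violated.
        rejected.append(row[0])
conn.commit()

row_count = conn.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
print(loaded, rejected, row_count)
```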

According to the McKinsey Global Institute, _____.

the demand for data scientists could outpace supply by up to 250,000 jobs in 2024

The use of in-memory databases for processing big data has become feasible in recent years, thanks to _____.

the increase in RAM capacities

Between 2017 and 2025, _____.

the volume of data in the digital universe is expected to grow tenfold

One key difference between a relational database and a NoSQL database is _____.

the way data storage and retrieval are modeled

Roberta's IS team processes data using an in-memory database with a multiple-core CPU. This means that _____.

they can process large amounts of data rapidly

Why is it recommended that data managers determine key metrics, an agreed-upon vocabulary, and how to define and implement security and privacy policies when setting up a self-service analytics program?

to mitigate the associated risks

A well-designed series of rules or algorithms is a key component of which stage of the ETL process?

transform

Data governance involves identifying people who are responsible for fixing and preventing issues with data.

true

Data mining is used to explore large amounts of data, looking for hidden patterns that can be used to predict future trends and behaviors.

true

During drill-down, you go from high-level summary data to detailed levels of data.

true

For an organization to get real value from its BI and analytics efforts, it must have a solid data management program.

true

If you wish to study a visual depiction of the relative frequencies of words in a document, a word cloud would be an appropriate option.

true

Regression analysis is useful when you wish to predict the value of a quantitative variable based on another quantitative variable.

true
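
As an illustration of predicting one quantitative variable from another, here is a minimal least-squares fit of a straight line in plain Python. The advertising and sales figures are invented, and the one-predictor form y = slope * x + intercept is a simplification chosen for the example.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: x is the independent variable, y the dependent variable.
ad_spend = [1.0, 2.0, 3.0, 4.0]
sales = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(ad_spend, sales)

# Use the fitted coefficients to predict sales for a new ad spend of 5.0.
predicted = slope * 5.0 + intercept
print(slope, intercept, predicted)
```

The fitted slope and intercept are coefficients; the dependent variable is the quantity being predicted (here, sales).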

Some insurance companies can detect fraudulent claims using BI and analytics software.

true

Suppose you are good in math and statistics. Adding programming to your skill-set will likely be necessary if you want to be a data scientist.

true

The data for BI (business intelligence) comes from many sources.

true

To solve a linear programming problem in order to maximize profits for a certain product, you can use Excel's Solver add-in.

true

Unstructured data comes from sources such as word-processing documents and surveillance video.

true

Haley's employer has asked her to review a database containing thousands of social media posts about their company's products and extract the data the executive team needs to make decisions about these products and their marketing. In terms of the characteristics of big data, Haley is focusing on ________.

value

Guillarme, a data scientist, utilizes data from company documents, machine logs, Data.gov, and Facebook Graph in his work. What characteristic of big data does this best demonstrate?

variety

One key characteristic of big data is that it is being generated at a rate of 2.5 quintillion bytes per day. This is known as big data's _____.

velocity

Which of the following is NOT a component required for effective BI and analytics?

well-maintained NoSQL databases

Marshall's company currently maintains their data on in-house servers, but his supervisor has asked him to research their options for having some or all of it hosted by a cloud service provider. Which challenge of big data is Marshall helping to address?

where and how to store the data

