ISDS 2001 CH. 6


________ refers to the conformity to facts: accuracy, quality, truthfulness, or trustworthiness of the data.

Veracity

In-motion ________ is often overlooked today in the world of BI and Big Data.

analytics

In the eBay use case study, load ________ helped the company meet its Big Data needs with the extremely fast data handling and application availability requirements.

balancing

As volumes of Big Data arrive from multiple sources such as sensors, machines, social media, and clickstream interactions, the first step is to ________ all the data reliably and cost effectively.

capture

HBase is a nonrelational ________ that allows for low-latency, quick lookups in Hadoop.

database

Hadoop is primarily a(n) ________ file system and lacks capabilities we'd associate with a DBMS, such as indexing, random access to data, and support for SQL.

distributed

As the size and the complexity of analytical systems increase, the need for more ________ analytical systems is also increasing to obtain the best performance.

efficient

Data ________, or the pulling of data from multiple subject areas and numerous applications into one repository, is the raison d'être for data warehouses.

integration

Most Big Data is generated automatically by ________.

machines

In open-source databases, the most important performance enhancement to date is the cost-based ________.

optimizer

Big Data employs ________ processing techniques and nonrelational data storage capabilities in order to process unstructured and semistructured data.

parallel

In the energy industry, ________ grids are one of the most impactful applications of stream analytics.

smart

In the U.S. telecommunications company case study, the use of analytics via dashboards has helped to improve the effectiveness of the company's ________ assessments and to make their systems more secure.

threat

A job ________ is a node in a Hadoop cluster that initiates and coordinates MapReduce jobs, or the processing of the data.

tracker

The ________ of Big Data is its potential to contain more useful patterns and interesting anomalies than "small" data.

value proposition

Allowing Big Data to be processed in memory and distributed across a dedicated set of nodes can solve complex problems in near-real time with highly accurate insights. What is this process called? A) in-memory analytics B) in-database analytics C) grid computing D) appliances

A

Companies with the largest revenues from Big Data tend to be A) the largest computer and IT services firms. B) small computer and IT services firms. C) pure open source Big Data firms. D) non-U.S. Big Data firms.

A

In the Big Data and Analytics in Politics case study, which of the following was an input to the analytic system? A) census data B) assessment of sentiment C) voter mobilization D) group clustering

A

A newly popular unit of data in the Big Data era is the petabyte (PB), which is A) 10^9 bytes. B) 10^12 bytes. C) 10^15 bytes. D) 10^18 bytes.

C
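
To check the arithmetic behind the four options, here is a minimal sketch that assumes decimal (SI) units, where each step up is a factor of 1,000. The names `UNITS` and `to_bytes` are made up for illustration and do not come from the chapter.

```python
# Decimal (SI) byte units expressed as powers of ten (assumed convention).
UNITS = {"GB": 10**9, "TB": 10**12, "PB": 10**15, "EB": 10**18}

def to_bytes(value, unit):
    """Convert a quantity in the given unit to a number of bytes."""
    return value * UNITS[unit]

petabyte = to_bytes(1, "PB")

# How many terabytes fit in one petabyte under this convention.
terabytes_per_petabyte = petabyte // UNITS["TB"]

print(petabyte, terabytes_per_petabyte)
```

Under these assumptions one petabyte is 1,000 terabytes, which matches option C.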

In a Hadoop "stack," what is a slave node? A) a node where bits of programs are stored B) a node where metadata is stored and used to organize data processing C) a node where data is stored and processed D) a node responsible for holding all the source programs

C

In the Big Data and Analytics in Politics case study, what was the analytic system output or goal? A) census data B) assessment of sentiment C) voter mobilization D) group clustering

C

In the health sciences, the largest potential source of Big Data comes from A) accounting systems. B) human resources. C) patient monitoring. D) research administration.

C

Under which of the following requirements would it be more appropriate to use Hadoop over a data warehouse? A) ANSI 2003 SQL compliance is required B) online archives alternative to tape C) unrestricted, ungoverned sandbox explorations D) analysis of provisional data

C

Using data to understand customers/clients and business operations to sustain and foster growth and profitability is A) easier with the advent of BI and Big Data. B) essentially the same now as it has always been. C) an increasingly challenging task for today's enterprises. D) now completely automated with no human intervention required.

C

Which Big Data approach promotes efficiency, lower cost, and better performance by processing jobs in a shared, centrally managed pool of IT resources? A) in-memory analytics B) in-database analytics C) grid computing D) appliances

C

Which of the following sources is likely to produce Big Data the fastest? A) order entry clerks B) cashiers C) RFID tags D) online customers

C

HBase, Cassandra, MongoDB, and Accumulo are examples of ________ databases.

NoSQL

________ bring together hardware and software in a physical unit that is not only fast but also scalable on an as-needed basis.

Appliances

In a Hadoop "stack," which node periodically replicates and stores data from the Name Node in case it fails? A) backup node B) secondary node C) substitute node D) slave node

B

In the Luxottica case study, what technique did the company use to gain visibility into its customers? A) visibility analytics B) data integration C) focus on growth D) customer focus

B

Traditional data warehouses have not been able to keep up with A) the evolution of the SQL language. B) the variety and complexity of data. C) expert systems that run on them. D) OLAP.

B

What is Big Data's relationship to the cloud? A) Hadoop cannot be deployed effectively in the cloud just yet. B) Amazon and Google have working Hadoop cloud offerings. C) IBM's homegrown Hadoop platform is the only option. D) Only MapReduce works in the cloud; Hadoop does not.

B

What is the Hadoop Distributed File System (HDFS) designed to handle? A) unstructured and semistructured relational data B) unstructured and semistructured non-relational data C) structured and semistructured relational data D) structured and semistructured non-relational data

B

All of the following statements about MapReduce are true EXCEPT A) MapReduce is a general-purpose execution engine. B) MapReduce handles the complexities of network communication. C) MapReduce handles parallel programming. D) MapReduce runs without fault tolerance.

D

Data flows can be highly inconsistent, with periodic peaks, making data loads hard to manage. What is this feature of Big Data called? A) volatility B) periodicity C) inconsistency D) variability

D

How does Hadoop work? A) It integrates Big Data into a whole so large data elements can be processed as a whole on one computer. B) It integrates Big Data into a whole so large data elements can be processed as a whole on multiple computers. C) It breaks up Big Data into multiple parts so each part can be processed and analyzed at the same time on one computer. D) It breaks up Big Data into multiple parts so each part can be processed and analyzed at the same time on multiple computers.

D
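
The idea in option D can be sketched in a few lines. This is only an illustration, not how Hadoop is implemented: worker threads on one machine stand in for the separate computers in a cluster, and the function names, block size, and sample data are all assumptions.

```python
from concurrent.futures import ThreadPoolExecutor

def split_into_blocks(records, block_size):
    """Break a large list of records into fixed-size parts."""
    return [records[i:i + block_size] for i in range(0, len(records), block_size)]

def process_block(block):
    """Work done on one part; here, a simple partial sum."""
    return sum(block)

def process_in_parallel(records, block_size=4, workers=3):
    """Process every block at the same time, then combine the partial results."""
    blocks = split_into_blocks(records, block_size)
    # Each worker thread is a stand-in for a separate node (assumption).
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(process_block, blocks))
    return sum(partials)

readings = list(range(1, 11))  # hypothetical sample data: 1 through 10
total = process_in_parallel(readings)
print(total)
```

The combined result is the same as processing the whole list in one place; the point is that each part can be handled independently of the others.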

In the Discovery Health insurance case study, the analytics application used available data to help the company do all of the following EXCEPT A) predict customer health. B) detect fraud. C) lower costs for members. D) open its own pharmacy.

D

________ speeds time to insights and enables better data governance by performing data integration and analytic functions inside the database.

In-database analytics

In the world of Big Data, ________ aids organizations in processing and analyzing large volumes of multi-structured data. Examples include indexing and search, graph analysis, etc.

MapReduce
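
To make the map and reduce steps concrete, here is a minimal single-machine sketch of a word count in the MapReduce style. It is not the Hadoop API; the function names and sample documents are hypothetical, and a real cluster would run the map and reduce phases on many nodes.

```python
from collections import defaultdict

def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Reduce: collapse each key's list of values into one total."""
    return {key: sum(values) for key, values in grouped.items()}

def word_count(documents):
    pairs = []
    for doc in documents:
        pairs.extend(map_phase(doc))
    return reduce_phase(shuffle_phase(pairs))

counts = word_count(["big data big insights", "data in motion"])
print(counts)
```

Because each document is mapped on its own and each key is reduced on its own, both phases could be spread across machines without changing the result.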

The ________ Node in a Hadoop cluster provides client information on where in the cluster particular data is stored and if any nodes fail.

Name

