BI Chapter 7 S/A
The ________ Node in a Hadoop cluster tells clients where in the cluster particular data is stored and whether any nodes have failed.
Name
HBase, Cassandra, MongoDB, and Accumulo are examples of ________ databases.
NoSQL
As volumes of Big Data arrive from multiple sources such as sensors, machines, social media, and clickstream interactions, the first step is to ________ all the data reliably and cost effectively.
capture
HBase is a nonrelational ________ that allows for low-latency, quick lookups in Hadoop.
database
Hadoop is primarily a(n) ________ file system and lacks capabilities we'd associate with a DBMS, such as indexing, random access to data, and support for SQL.
distributed
As the size and the complexity of analytical systems increase, the need for more ________ analytical systems is also increasing to obtain the best performance.
efficient
Big Data comes from ________.
everywhere
________ bring together hardware and software in a physical unit that is not only fast but also scalable on an as-needed basis.
Appliances
________ speeds time to insights and enables better data governance by performing data integration and analytic functions inside the database.
In-database analytics
________ of data provides business value; pulling of data from multiple subject areas and numerous applications into one repository is the raison d'être for data warehouses.
Integration
In the world of Big Data, ________ aids organizations in processing and analyzing large volumes of multistructured data. Examples include indexing and search, graph analysis, etc.
MapReduce
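Study note: the indexing example mentioned in the question above can be sketched in a few lines of plain Python. This is only a single-machine illustration of the map, shuffle, and reduce steps, not Hadoop's actual API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `build_index`) and the sample documents are hypothetical and chosen for this note.

```python
from collections import defaultdict


def map_phase(doc_id, text):
    # Map: emit one (word, doc_id) pair for every word occurrence.
    for word in text.lower().split():
        yield word, doc_id


def shuffle(pairs):
    # Shuffle: group values by key, as a framework would do
    # between the map and reduce steps.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(word, doc_ids):
    # Reduce: collapse the grouped values into a sorted list
    # of unique document ids for this word.
    return word, sorted(set(doc_ids))


def build_index(docs):
    # Run all three steps in sequence on an in-memory dict of documents.
    pairs = [
        pair
        for doc_id, text in docs.items()
        for pair in map_phase(doc_id, text)
    ]
    return dict(reduce_phase(w, ids) for w, ids in shuffle(pairs).items())


# Hypothetical sample data for illustration only.
docs = {
    "d1": "big data needs parallel processing",
    "d2": "hadoop stores big data",
}
index = build_index(docs)
```

In a real cluster the map and reduce functions would run in parallel on different nodes, each working on its own slice of the data; here they run in one process purely to show how the key/value pairs flow.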
________ refers to the conformity to facts: accuracy, quality, truthfulness, or trustworthiness of the data.
Veracity
The problem of forecasting economic activity or microclimates based on a variety of data beyond the usual retail data is a very recent phenomenon and has led to another buzzword — ________.
alternative data
In-motion ________ is often overlooked today in the world of BI and Big Data.
analytics
In open-source databases, the most important performance enhancement to date is the cost-based ________.
optimizer
Big Data employs ________ processing techniques and nonrelational data storage capabilities in order to process unstructured and semistructured data.
parallel
In the energy industry, ________ grids are one of the most impactful applications of stream analytics.
smart
A job ________ is a node in a Hadoop cluster that initiates and coordinates MapReduce jobs, or the processing of the data.
tracker
The ________ of Big Data is its potential to contain more useful patterns and interesting anomalies than "small" data.
value proposition
Organizations are working with data that meets the three V's characterizations: variety, volume, and ________.
velocity