Database chapter 10

¡Supera tus tareas y exámenes ahora con Quizwiz!

Data in MongoDB is represented in:

BSON.

Hive uses ________ to query data.

HiveQL

The Hadoop framework consists of the ________ algorithm to solve large scale problems.

MapReduce

HBASE is a wide-column store database that runs on top of HDFS (modeled after Google).

True

HP HAVEn integrates HP technologies with open source big data technologies.

True

NoSQL databases DO NOT support ACID (atomicity, consistency, isolation, and durability).

True

NoSQL stands for 'Not only SQL.'

True

The 'schema on read' approach often incorporates JSON or XML.

True

The original three 'v's' attributed to big data include volume, variety, and velocity.

True

NoSQL focuses on:

flexibility.

The NoSQL model that includes a simple pair of a key and an associated collection of values is called a:

key-value store.

It is true that in an HDFS cluster the DataNodes are the:

large number of slaves.

Big data includes:

large volumes of data with many different data types that are processed at very high speeds.

Big data requires effectively processing:

many data types.

Structured Query Language (SQL) is a set of methodologies, processes, architectures, and technologies that transform raw data into meaningful information.

False

MongoDB databases are composed of:

collections.

When a data repository (including internal and external data) does NOT follow a predefined schema, this is called a:

data lake.

The Hadoop Distributed File System (HDFS) is the foundation of a ________ infrastructure of Hadoop.

data management

With HDFS it is less expensive to move the execution of computation to data than to move the:

data to computation.

Big data:

does not require a strictly defined data model.

Value (related to the five 'v's' of big data) addresses the pursuit of a meaningful goal.

True

________ includes NoSQL accommodation of various data types.

Variety

The NoSQL model that is specifically designed to maintain information regarding the relationships (often real-world instances of entities) between data items is called a:

graph-oriented database.

NoSQL includes data storage and retrieval:

not based on the relational model.

Hive is a(n) ________ data warehouse software.

Apache

The dive in anywhere characteristic of a data lake overrides constraints related to confidentiality.

False

The schema on write and schema on read are considered synonymous approaches.

False

Transaction processing and management reporting tend to fit big data databases better than relational databases.

False

An organization that requires a sole focus on performance with the ability for keys to include strings, hashes, lists, and sorted sets would select ________ database management system.

Redis

________ is the most popular key-value store NoSQL database management system.

Redis

________ includes the value of speed in a NoSQL database.

Velocity

________ includes concern about data quality issues.

Veracity

________ is an important scripting language to help reduce the complexity of MapReduce.

Pig

Although volume, variety, and velocity are considered the initial three v dimensions, two additional Vs of big data were added and include:

veracity and value.

The three 'v's' commonly associated with big data include:

volume, variety, and velocity.

Big data allows for two different data types (text and numeric).

False

The target market for Hadoop is small to medium companies using local area networks.

False

Word processing documents are commonly stored in a 'document store' NoSQL database model.

False

Hive creates MapReduce jobs and executes them on a Hadoop Cluster.

True

NoSQL systems allow ________ by incorporating commodity servers that can be easily added to the architectural solution

scaling out

When reporting and analysis organization of the data is determined when the data is used is called a(n):

schema on read

Economies of storage indicate data storage costs increase every year.

False

Neo4j is a wide-column NoSQL database management system developed by Oracle.

False

NoSQL focuses on avoidance of replication and minimizing storage space.

False

An organization using HDFS realizes that hardware failure is a(n):

norm.

An organization that decides to adopt the most popular NoSQL database management system would select:

MongoDB.

An organization that requires a graph database that is highly scalable would select the ________ database management system.

Neo4j

According to your text, NoSQL stands for:

Not Only SQL.

________ generally processes the largest quantities of data.

Transaction processing

A business owner that needs carefully normalized tables would likely need a relational database instead of a NoSQL database.

True

Apache Cassandra is a wide-column NoSQL database management system.

True

Big data databases tend to sacrifice consistency for availability.

True

Collect everything is a characteristic of a data lake.

True

JSON is commonly used in conjunction with the 'document store' NoSQL database model.

True

At a basic level, analytics refers to:

analysis and interpretation of data.

NoSQL systems enable automated ________ to allow distribution of the data among multiple nodes to allow servers to operate independently on the data located on it.

sharding

It is true that in an HDFS cluster the NameNode is the:

single master server.

The primary use of Pig is to:

transform raw data into a format that is useful for analysis.

Apache Cassandra is a leading producer of ________ NoSQL database management systems.

wide-column

The NoSQL model that incorporates 'column families' is called a:

wide-column store.


Conjuntos de estudio relacionados

Science - What are the Planets in our Solar System?

View Set

HUMAN BIOLOGY - Possible Final Questions

View Set

Binary, Denary and Hexadecimal Conversion HJ

View Set

New Issues - Corporate Underwritings Practice Questions

View Set