IST 210 Databases Chapter 14: Big Data Analytics and NoSQL

Graph Database

A NoSQL database model based on graph theory that stores relationship-rich data as a collection of nodes and edges.

Column Family Database

A NoSQL database model that organizes data into key-value pairs, in which the value component is composed of a set of columns that vary by row.

Key-Value Database (KV)

A NoSQL database model that stores data as a collection of key-value pairs in which the value component is unintelligible to the DBMS.

Document Database

A NoSQL database model that stores data in key-value pairs in which the value component is a tag-encoded document.

Job Tracker

A central control program used to accept, distribute, monitor, and report on MapReduce processing jobs in a Hadoop environment.

Volume

A characteristic of Big Data that describes the quantity of data to be stored.

Velocity

A characteristic of Big Data that describes the speed at which data enters the system and must be processed.

Variety

A characteristic of Big Data that describes the variations in the structure of data to be stored.

Batch Processing

A data processing method that runs data processing tasks from beginning to end without any user interaction.

NewSQL

A database model that attempts to provide ACID-compliant transactions across a highly distributed infrastructure.

Hadoop Distributed File System (HDFS)

A highly distributed, fault-tolerant file storage system designed to manage large amounts of data at high speeds.

JSON (JavaScript Object Notation)

A human-readable text format for data interchange that defines attributes and values in a document.
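
As a small illustration of the definition above, the sketch below serializes a record to JSON text and parses it back using Python's standard `json` module. The record and its field names are hypothetical, chosen only to show attributes and values in a document.

```python
import json

# Hypothetical customer record; the field names are made up for illustration.
customer = {"id": 101, "name": "Ada", "tags": ["vip", "newsletter"]}

# Serialize to JSON text: attribute names and values are readable as plain text.
text = json.dumps(customer, sort_keys=True)
print(text)

# Parse the text back into Python objects.
restored = json.loads(text)
```

The round trip returns an object equal to the original, which is what makes JSON useful for data interchange.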

BSON (Binary JSON)

A computer-readable binary format for data interchange that expands the JSON format to include additional data types, including binary objects.

Scaling Out

A method for dealing with data growth that involves distributing data storage structures across a cluster of commodity servers.

Scaling Up

A method for dealing with data growth that involves migrating the same structure to more powerful systems.

Sentiment Analysis

A method of text analysis that attempts to determine if a statement conveys a positive, negative, or neutral attitude.
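
A toy sketch of the idea, assuming a tiny hand-made word list (the `POSITIVE` and `NEGATIVE` sets are hypothetical; real sentiment analysis uses far richer models):

```python
# Hypothetical mini-lexicons; real systems use much larger resources.
POSITIVE = {"great", "love", "good"}
NEGATIVE = {"bad", "hate", "slow"}

def sentiment(statement):
    """Classify a statement by counting positive and negative words."""
    words = [w.strip(".,!?") for w in statement.lower().split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(sentiment("I love this great phone!"))
print(sentiment("The app is slow."))
print(sentiment("It arrived on Tuesday."))
```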

NoSQL

A new generation of database management systems that is not based on the traditional relational database model.

Column-Centric Storage

A physical data storage technique in which data is stored in blocks, which hold data from a single column across many rows.

Row-Centric Storage

A physical data storage technique in which data is stored in blocks, which hold data from all columns of a given set of rows.
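
To contrast row-centric and column-centric storage, the sketch below lays out the same hypothetical three-row table both ways using plain Python lists. The table contents and block sizes are assumptions for illustration only, not how any particular DBMS sizes its blocks.

```python
# Hypothetical three-row table with columns (id, name, city).
rows = [
    (1, "Ada", "Paris"),
    (2, "Bo", "Oslo"),
    (3, "Cy", "Paris"),
]

# Row-centric: each block holds all columns for a set of rows.
row_blocks = [rows[0:2], rows[2:3]]

# Column-centric: each block holds one column's values across many rows.
column_blocks = {
    "id": [r[0] for r in rows],
    "name": [r[1] for r in rows],
    "city": [r[2] for r in rows],
}

# A query that touches only "city" reads a single column block...
paris_count = column_blocks["city"].count("Paris")

# ...while fetching one complete record is a single read from a row block.
first_record = row_blocks[0][0]
```

Column-centric layouts tend to favor analytic queries over a few columns; row-centric layouts tend to favor reading or writing whole records.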

Algorithm

A process or set of operations in a calculation.

Data Mining

A process that employs automated tools to analyze data in a data warehouse and other sources and to proactively identify possible relationships and anomalies.

Task Tracker

A program in the MapReduce framework responsible for running map and reduce tasks on a node.

Mapper

A program that performs a map function.

Reducer

A program that performs a reduce function.

Traversal

A query in a graph database.

Data Analytics

A subset of business intelligence functionality that encompasses a wide range of mathematical, statistical, and modeling techniques with the purpose of extracting knowledge from data.

MapReduce

An open-source application programming interface (API) that provides fast data analytics services; one of the main Big Data technologies that allows organizations to process massive data stores.

Feedback Loop Processing

Analyzing stored data to produce actionable results.

Explanatory Analytics

Data analysis that provides ways to discover relationships, trends, and patterns among data.

Predictive Analytics

Data analytics that use advanced statistical and modeling techniques to predict future business outcomes with great accuracy.

Unstructured Data

Data that does not conform to a predefined data model.

Structured Data

Data that conforms to a predefined data model.

Column Family

In a column family database, a collection of columns or super columns related to a collection of rows.

Super Column

In a column family database, a column that is composed of a group of other related columns.
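
One way to picture these structures is as nested key-value data. The sketch below models a hypothetical `customers` column family as a Python dictionary; the row keys and column names are invented for illustration and do not reflect any specific product's API.

```python
# Hypothetical column family "customers": row key -> that row's columns.
customers = {
    "row1": {
        "name": "Ada",
        # Super column: a named group of related columns.
        "address": {"street": "1 Main St", "city": "Paris"},
    },
    "row2": {
        "name": "Bo",
        "email": "bo@example.com",  # the set of columns can vary by row
    },
}

# Reading a column inside a super column.
city = customers["row1"]["address"]["city"]

# Rows in the same column family need not share the same columns.
row2_columns = sorted(customers["row2"])
```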

Edge

In a graph database, the representation of a relationship between nodes.

Node

In a graph database, the representation of a single entity instance.

Properties

In a graph database, the attributes or characteristics of a node or edge that are of interest to the users.
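
The graph terms (node, edge, properties, traversal) can be sketched together in a few lines. This is a minimal in-memory model, not a real graph database; the people, the `FRIEND_OF` relationship, and the `traverse` helper are all hypothetical.

```python
from collections import deque

# Nodes: one entry per entity instance, each with its own properties.
nodes = {
    "alice": {"label": "Person", "age": 30},
    "bob": {"label": "Person", "age": 25},
    "carol": {"label": "Person", "age": 41},
}

# Edges: (source, relationship, target, properties).
edges = [
    ("alice", "FRIEND_OF", "bob", {"since": 2019}),
    ("bob", "FRIEND_OF", "carol", {"since": 2021}),
]

def traverse(start, relationship):
    """Breadth-first traversal that follows edges of one relationship type."""
    seen, order, queue = {start}, [], deque([start])
    while queue:
        current = queue.popleft()
        order.append(current)
        for src, rel, dst, _props in edges:
            if src == current and rel == relationship and dst not in seen:
                seen.add(dst)
                queue.append(dst)
    return order

# Everyone reachable from alice through FRIEND_OF edges.
reachable = traverse("alice", "FRIEND_OF")
```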

Bucket

In a key-value database, a logical collection of related key-value pairs.
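
A minimal sketch of buckets, assuming an in-memory dictionary stands in for the store. The `put` and `get` helpers and the bucket names are hypothetical. Values are raw bytes that the store never inspects, and the same key can appear in more than one bucket.

```python
# Hypothetical store: bucket name -> {key: opaque value}.
store = {}

def put(bucket, key, value: bytes):
    """Save a value under a key inside the named bucket."""
    store.setdefault(bucket, {})[key] = value

def get(bucket, key):
    """Fetch a value by bucket and key; the store does not interpret it."""
    return store[bucket][key]

# The same key can exist independently in different buckets.
put("sessions", "u101", b"\x01\x02token")
put("carts", "u101", b"sku=42;qty=2")

cart = get("carts", "u101")
```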

Block Report

In the Hadoop Distributed File System (HDFS), a report sent every 6 hours by the data node to the name node, informing the name node which blocks are on that data node.

Heartbeat

In the Hadoop Distributed File System (HDFS), a signal sent every 3 seconds from the data node to notify the name node that the data node is still available.
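
The sketch below shows how a name node might use heartbeats to decide which data nodes are still available. It is a simplified model, not Hadoop code: `MISSED_LIMIT` is a made-up threshold for illustration, and timestamps are passed in as plain numbers so the example runs instantly.

```python
HEARTBEAT_INTERVAL = 3  # seconds, per the definition above
MISSED_LIMIT = 10       # assumption: missed beats before a node is presumed unavailable

# Name node's view: data node id -> time of the last heartbeat received.
last_seen = {}

def record_heartbeat(node_id, now):
    """Note that a data node reported in at time `now` (in seconds)."""
    last_seen[node_id] = now

def available_nodes(now):
    """Return data nodes whose last heartbeat is recent enough."""
    cutoff = now - HEARTBEAT_INTERVAL * MISSED_LIMIT
    return sorted(n for n, t in last_seen.items() if t >= cutoff)

record_heartbeat("dn1", now=0)
record_heartbeat("dn2", now=0)
record_heartbeat("dn1", now=27)  # dn2 has gone silent

alive = available_nodes(now=40)
```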

Visualization

The ability to graphically present data in such a way as to make it understandable to users.

Variability

The characteristic of Big Data for the same data values to vary in meaning over time.

Polyglot Persistence

The coexistence of a variety of data storage and data management technologies within an organization's infrastructure.

Value

The degree to which data can be analyzed to provide meaningful insights.

Reduce

The function in a MapReduce job that collects and summarizes the results of map functions to produce a single result.

Map

The function in a MapReduce job that sorts and filters data into a set of key-value pairs as a subtask within a larger job.
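
The classic word-count example shows how the map and reduce functions fit together. This is a single-process sketch in plain Python, not the Hadoop API; the function names and the `shuffle` grouping step are assumptions for illustration.

```python
from collections import defaultdict

def map_words(line):
    """Map: turn one line of text into (word, 1) key-value pairs."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Group values by key so each key can be reduced on its own."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_counts(key, values):
    """Reduce: summarize all values for one key into a single result."""
    return key, sum(values)

lines = ["big data big ideas", "data moves fast"]
mapped = [pair for line in lines for pair in map_words(line)]
counts = dict(reduce_counts(k, v) for k, v in shuffle(mapped).items())
```

In a real cluster, each map call could run on a different node against its local block of data, with the reduce step combining the partial results.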

Stream Processing

The processing of data inputs in order to make decisions about which data to keep and which data to discard before storage.
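
A minimal sketch of stream processing, assuming a hypothetical feed of sensor readings. Each reading is examined as it arrives, and out-of-range values are discarded before anything is stored. The field names and thresholds are invented for illustration.

```python
def keep_readings(readings, low, high):
    """Yield only in-range readings, one at a time, discarding the rest."""
    for reading in readings:
        if low <= reading["temp"] <= high:
            yield reading

# Hypothetical incoming stream; -999.0 stands in for a faulty sensor value.
incoming = iter([
    {"sensor": "a", "temp": 21.5},
    {"sensor": "b", "temp": -999.0},
    {"sensor": "c", "temp": 19.0},
])

# Only the readings that pass the filter ever reach storage.
stored = list(keep_readings(incoming, low=-50.0, high=60.0))
```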

Veracity

The trustworthiness of a set of data.

