Chapter 8
____________ is the process of extracting novel patterns and knowledge structures from continuous, rapid data records.
Data stream mining
In typical data stream mining applications, the purpose is to predict the class or value of new instances in the data stream, given some knowledge about the class membership or values of previous instances in the data stream.
True
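The predict-then-learn pattern described in the statement above can be sketched as follows. This is a minimal illustration, not a method taken from the chapter: the `MajorityClassLearner` class, the `prequential_accuracy` function, and the tiny labeled stream are all hypothetical names and data made up for the example. Each instance is first used to test the model and only then used to update it.

```python
from collections import Counter


class MajorityClassLearner:
    """Hypothetical toy learner: predicts the most frequent label seen so far."""

    def __init__(self):
        self.label_counts = Counter()

    def predict(self, features):
        # No knowledge yet: return None before the first labeled instance.
        if not self.label_counts:
            return None
        return self.label_counts.most_common(1)[0][0]

    def learn(self, features, label):
        # Incremental update: one instance at a time, no stored history.
        self.label_counts[label] += 1


def prequential_accuracy(stream, learner):
    """Test on each instance first, then train on it (test-then-train)."""
    correct = total = 0
    for features, label in stream:
        if learner.predict(features) == label:
            correct += 1
        total += 1
        learner.learn(features, label)
    return correct / total if total else 0.0


# Made-up stream of (features, class label) pairs, arriving in order.
stream = [((1.0,), "spam"), ((0.9,), "spam"), ((0.2,), "ham"), ((0.8,), "spam")]
accuracy = prequential_accuracy(stream, MajorityClassLearner())
```

A realistic stream miner would use the features and would usually adapt to changes in the data over time; the toy learner ignores both to keep the loop visible.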
The term velocity in big data analytics refers to how fast the digitized data is created and processed.
True
Which of the following is not among the V's used to define Big Data?
Variance
Which of the following is not considered a key to the success of Big Data analytics?
A fact-based transaction system
Which of the following facts are related to Hadoop?
All of the other answers are true
Which of the following are the main problems that can be addressed by big data analytics?
All the answers are true
In a big data analytics initiative, to create a fact-based decision-making culture, senior management needs to:
Be a vocal supporter
____________ refers to the analytic process of extracting actionable information from continuously generated data.
Both A and B are true
A stream in stream analytics is defined as a discrete and aggregated level of data elements.
False
The Hadoop Distributed File System (HDFS) was invented before Google developed MapReduce. Hence, the early versions of MapReduce relied on HDFS.
False
Hadoop is the replacement for a data warehouse that stores and processes large amounts of structured data.
False
MapReduce is a contemporary programming language designed to be used by computer programmers.
False
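MapReduce is a programming model rather than a language, so the same map and reduce steps can be written in an ordinary language. The sketch below simulates the model in plain Python on a single machine for a word count; it does not use Hadoop, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and sample documents are assumptions made for illustration.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group emitted values by key, as a framework would do between the phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    # In a real cluster the map calls would run in parallel on different nodes;
    # here they run sequentially to keep the sketch self-contained.
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


counts = word_count(["big data big insight", "data stream"])
```

The point of the split is that each `map_phase` call depends only on its own record and each `reduce_phase` call only on its own key, which is what lets a framework distribute the work.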
The main benefit of Hadoop is that it allows enterprises to process and analyze large volumes of structured and semi-structured data on specialized hardware.
False
The term veracity in big data analytics refers to the processing of different types and formats of data, structured and unstructured.
False
Which of the following is not a product from the Apache Hadoop foundation?
Hana
Application examples of MapReduce include:
Indexing, graph analysis, text analysis, machine learning
___________ is the node in a Hadoop cluster that initiates and coordinates MapReduce jobs or the processing of the data.
Job tracker
Which of the following is not true for MapReduce?
MapReduce code can be written in SQL
Which of the following is not considered a key component of Hadoop?
SQL
Hadoop is not just about volume, but also about processing a diversity of data types.
True
The main drawback of NoSQL functions in database processing is:
They have traded ACID compliance for performance and scalability
A data scientist's main objective is to organize and analyze large amounts of data, to solve complex problems, often using software specifically designed for the task.
True
Among the variety of factors, the key driver for big data analytics is business need at any level, including strategic, tactical, or operational.
True
Hadoop is a batch-oriented computing framework, which implies it does not support real-time data processing and analysis.
True