Final Exam Practice (Chapters 6-9 quiz questions)
Which of the following is not considered a key to the success of Big Data analytics?
A fact-based transaction system
_____________ is a collection of neurons that takes inputs from the previous layer and converts those inputs into outputs for further processing.
A hidden layer
Why is cognitive search different from traditional search?
Can handle a variety of data types
Which of the following is not among the data sampling methods?
Clustering
____________ refers to computing systems that use mathematical models to emulate the human cognition process to find solutions to complex problems and situations where the potential answers can be imprecise.
Cognitive computing
In predictive analytics, the term variance commonly refers to
Consistency
MapReduce is a contemporary programming language designed to be used by computer programmers.
False
Polygraph is a non-intrusive deception-detection technique commonly used to assess the level of truthfulness in the textual content.
False
The main characteristic of convolutional networks is having two or more layers involving a convolution weight function instead of general matrix multiplication.
False
The purpose of artificial intelligence is to augment human capability.
False
The term veracity in big data analytics refers to the processing of different types and formats of data, structured and unstructured.
False
Which of the following technologies is not part of artificial intelligence?
Fast Fourier transform
SCM and ERP are the first two beneficiaries of the NLP and WordNet.
False
Which of the following from the bullseye diagram interprets the predictions that are neither consistent nor accurate?
High bias, high variance
Application examples of MapReduce include:
Indexing, graph analysis, text analysis, machine learning
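As a study aid for the text analysis use case, here is a minimal single-machine sketch of the map, shuffle, and reduce steps applied to word counting. It is not Hadoop code: the function names (map_phase, shuffle_phase, reduce_phase, word_count) are hypothetical, and everything runs in one Python process rather than across a cluster.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle step: group all emitted values under their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce step: sum the grouped counts for each word."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(documents):
    """Run the three steps in sequence (hypothetical helper, single process)."""
    pairs = []
    for document in documents:
        # In a real cluster each document (or split) could be mapped on a different node.
        pairs.extend(map_phase(document))
    return reduce_phase(shuffle_phase(pairs))


if __name__ == "__main__":
    print(word_count(["big data big insight", "data stream"]))
```

The point of the split is that the map and reduce functions only ever see one document or one key at a time, which is what would let a framework distribute them.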
____________ is the node in a Hadoop cluster that initiates and coordinates MapReduce jobs or the processing of the data.
Job tracker
Which of the following methods is used to explain the prediction of any classifier in a human-interpretable manner by learning a surrogate model locally based on the specifics of the prediction?
LIME
Which of the following are the best options available to manage the TDM matrix size?
Labor-intensive process, singular value decomposition, eliminate terms.
Which of the following is not among the steps involved in sentiment analysis?
Latent Dirichlet allocation
Which of the following are the most commonly used normalization methods?
Log, Binary and Inverse document frequencies
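The three normalization methods can be sketched as small functions applied to a raw term frequency. The formulas below are one common textbook formulation and are an assumption here, since the quiz does not state them; the function and parameter names are hypothetical.

```python
import math


def log_frequency(tf):
    """Log normalization: dampens large raw counts; zero stays zero."""
    return 1 + math.log(tf) if tf > 0 else 0.0


def binary_frequency(tf):
    """Binary normalization: records only whether the term occurs."""
    return 1 if tf > 0 else 0


def inverse_document_frequency(tf, n_docs, doc_freq):
    """Inverse document frequency weighting (assumed formulation).

    tf       -- raw count of the term in this document
    n_docs   -- total number of documents in the collection
    doc_freq -- number of documents that contain the term
    """
    if tf == 0:
        return 0.0
    return (1 + math.log(tf)) * math.log(n_docs / doc_freq)


if __name__ == "__main__":
    # A term that appears in every document carries no discriminating weight.
    print(inverse_document_frequency(3, n_docs=10, doc_freq=10))
    # A term that appears in only one of ten documents is weighted up.
    print(inverse_document_frequency(3, n_docs=10, doc_freq=1))
```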
Which of the following from the bullseye diagram interprets predictions that are inconsistent but represent a reasonably well-performing prediction model?
Low Bias, High Variance
In the context of text mining, lemmatization is a process of statically reducing words to their stem/root form.
True
In the context of the text mining process, both structured and unstructured data are extracted from the data sources and converted into context-specific knowledge.
True
Model ensembles are known to be more robust against outliers and noise in the data, compared to individual models.
True
Multilayer perceptron type deep networks are also known as feedforward networks because the flow of information that goes through them is always forwarding, and no feedback connections are allowed.
True
Overfitting is the notion of making the model too specific to the training data to capture not only the signal but also the noise in the data set.
True
Sensitivity analysis based on the leave-one-out methodology can be applied to any predictive analytics method because of its model agnostic implementation methodology.
True
Singular value decomposition helps reduce the overall structure of the term-document matrix to a lower-dimensional space for further pattern/knowledge discovery.
True
Text-to-speech is a text processing function that can read textual content and detect and correct syntactic and semantic errors.
True
The main aim of NLP is to move away from word counting to a real understanding and processing of natural human language.
True
The term velocity in big data analytics refers to how fast the digitized data is created and processed.
True
The data elements in a stream are often referred to as
Tuples
Which of the following methods are included in Deep Learning (select the most inclusive option)?
All (CNN, RNN, LSTM)
Structured data is usually organized into records with simple data values that include:
All of the answers are true
Which of the following are the main problems that can be addressed by big data analytics?
All of the answers are true
Which of the following facts are related to Hadoop?
All of the other answers are true
Which of the following are considered as the common use cases for cognitive computing?
All the answers are true
Which of the following is/are among the common challenges that are associated with the implementation of NLP:
All the answers are true
Which of the following skills should a data scientist have?
All the answers are true
Which of the following is not considered a key component of Hadoop?
AQL
What are the key attributes that cognitive computing systems must have?
Adaptive, Interactive, Iterative, Contextual
_____________ is typically a simplified abstraction of the human brain and its complex biological networks of neurons.
Artificial neural networks
In explainable AI, the LIME and SHAP methods are considered as global interpreters.
False
____________ is the most popular neural network learning method that applies the chain rule of calculus to compute the derivatives of functions.
Backpropagation
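To make the chain-rule idea concrete, here is a minimal sketch for a single sigmoid neuron with a squared-error loss. The setup (one input, one weight, one bias, the loss function, and all names) is an assumption chosen for illustration, not something specified in the quiz.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(w, b, x):
    """Forward pass: weighted sum followed by the sigmoid transfer function."""
    return sigmoid(w * x + b)


def loss(w, b, x, target):
    """Squared-error loss (assumed for this sketch)."""
    return 0.5 * (forward(w, b, x) - target) ** 2


def gradients(w, b, x, target):
    """Backward pass: multiply the local derivatives along the chain.

    dL/dw = dL/dy * dy/dz * dz/dw, with z = w*x + b and y = sigmoid(z).
    """
    y = forward(w, b, x)
    dloss_dy = y - target      # derivative of 0.5 * (y - target)^2
    dy_dz = y * (1.0 - y)      # derivative of the sigmoid
    dz_dw = x
    dz_db = 1.0
    return dloss_dy * dy_dz * dz_dw, dloss_dy * dy_dz * dz_db


def train(w, b, x, target, learning_rate=0.5, steps=200):
    """Plain gradient descent using the backpropagated gradients."""
    for _ in range(steps):
        grad_w, grad_b = gradients(w, b, x, target)
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


if __name__ == "__main__":
    w, b = train(0.1, 0.0, x=1.5, target=1.0)
    print(loss(0.1, 0.0, 1.5, 1.0), loss(w, b, 1.5, 1.0))
```

In a multilayer network the same multiplication of local derivatives is repeated layer by layer, from the output back toward the inputs.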
In which of the following model ensemble methods are multiple decision trees created from resampled data and their predicted values then combined through averaging?
Bagging
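The description above can be sketched as follows: draw bootstrap resamples, fit one small tree per resample, and average the predictions. To stay short, the sketch uses one-split regression trees (stumps) on a single input variable; the function names and the toy data are hypothetical.

```python
import random


def fit_stump(xs, ys):
    """Fit a one-split regression tree by minimizing squared error."""
    best = None
    for threshold in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        if not left or not right:
            continue
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((y - left_mean) ** 2 for y in left)
                 + sum((y - right_mean) ** 2 for y in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    if best is None:
        # Every x in this resample is identical: fall back to the mean.
        mean = sum(ys) / len(ys)
        return lambda x: mean
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def fit_bagged_stumps(xs, ys, n_models=25, seed=0):
    """Train one stump per bootstrap resample (sampling with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def predict_bagged(models, x):
    """Combine the individual predictions through simple averaging."""
    return sum(model(x) for model in models) / len(models)


if __name__ == "__main__":
    xs = [1, 2, 3, 4, 5, 6, 7, 8]
    ys = [1.0, 1.2, 0.9, 1.1, 3.0, 3.2, 2.9, 3.1]
    models = fit_bagged_stumps(xs, ys)
    print(predict_bagged(models, 1), predict_bagged(models, 8))
```

Each tree here is trained independently of the others, which is the contrast with boosting, where each new tree depends on the errors of the previous ones.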
The other names for heterogeneous model ensembles are:
Both Information Fusion and Stacking
In prediction modeling, reducing bias also reduces the variance.
False
Which of the following techniques are used to solve the imbalanced data problems?
Data Sampling Methods, Cost Sensitive Methods, Algorithmic Methods, Ensemble Methods
____________ is the process of extracting novel patterns and knowledge structures from continuous, rapid data records.
Data stream mining
In the context of text mining, structured data is for humans to process while unstructured data is for computers to process and understand.
False
Information fusion type model ensembles utilize meta-modeling called super learners.
False
Which of the following industries most commonly uses cognitive computing?
Entertainment
In predictive analytics, the term bias commonly refers to
Error
A data set is imbalanced when the distribution of different classes in the input variables is significantly dissimilar.
False
A model with low variance is the one that captures both noise and generalized patterns in the data and therefore produces an overfit model.
False
A stream in stream analytics is defined as a discrete and aggregated level of data elements.
False
Automatic summarization is a program that is used to assign documents into a predefined set of categories.
False
Big data comes from a variety of sources within the organization, including marketing and sales transactions, inventory records, financial transactions, and human resources and accounting records.
False
Connection weights are the key elements of an ANN. They produce the final value through the summation and transfer function.
False
Deep learning analytics is a term that refers to the computing-branded technology platforms, such as IBM Watson, that specialize in processing and analyzing large, unstructured data sets.
False
Delta (or an error) is defined as the difference between the network weights in two consecutive iterations.
False
Hadoop distributed file system was invented before Google developed MapReduce. Hence, the early versions of MapReduce relied on HDFS.
False
Hadoop is the replacement for a data warehouse that stores and processes large amounts of structured data.
False
In ensemble modeling, boosting builds several independent simple trees for the resultant prediction model.
False
Which of the following is not true for MapReduce?
MapReduce code can be written in SQL
Which of the following applications do not utilize the capabilities of text mining?
None of the answers is true
_____________ is the system that automatically translates images of handwritten documents into machine-editable textual documents.
Optical character recognition
Creating a good prediction model requires finding a(n) ___________ between the errors related to bias and variance.
Optimal Balance
In the text mining process, which of the following is not a method category used for knowledge extraction?
Regression
____________ is the technique that is used to detect the direction of opinions about specific products and/or services using large textual data sources.
Sentiment analysis
Which of the following is not among the popular tasks performed by NLP?
Speech acts
Among the variety of factors, the key driver for big data analytics is the business needs at any level, including strategic, tactical, or operational.
True
Artificial intelligence has the capability to find hidden patterns in a variety of data sources to identify problems and provide potential solutions.
True
Bias is often defined as the difference between a model's prediction output and the actual values for a given prediction problem.
True
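The bias and variance terms used throughout these questions can be illustrated with a small calculation: bias as the gap between the average prediction and the actual value (error), and variance as the spread of the predictions around their own average (consistency). The function name and the numbers below are made up for illustration.

```python
def bias_and_variance(predictions, actual):
    """Return (bias, variance) for repeated predictions of one actual value."""
    mean_prediction = sum(predictions) / len(predictions)
    bias = mean_prediction - actual
    variance = sum((p - mean_prediction) ** 2 for p in predictions) / len(predictions)
    return bias, variance


# Hypothetical predictions of the same target (actual value 10.0),
# e.g. from models trained on different samples.
consistent_but_off = [14.0, 14.0, 14.0, 14.0]   # high bias, low variance
scattered_on_target = [6.0, 14.0, 8.0, 12.0]    # low bias, high variance

if __name__ == "__main__":
    print(bias_and_variance(consistent_but_off, 10.0))
    print(bias_and_variance(scattered_on_target, 10.0))
```

On the bullseye diagram, the first list corresponds to a tight cluster away from the center and the second to shots scattered around the center.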
Cognitive computing has the capability to simulate human thought processes to assist humans in finding solutions to complex problems.
True
Deep learning is an extension of neural networks that deal with more complicated tasks with a higher level of sophistication by employing many layers of connected neurons.
True
Hadoop is an open-source framework for processing, storing, and analyzing massive amounts of distributed data of a wide variety.
True
Hadoop is not just about the volume, but also the processing of diversity of data types.
True
In artificial neural networks, neurons are processing units also called processing elements that perform predefined mathematical operations on the numeric values from the input variables or the other neuron outputs to create and push out their own outputs.
True
In marketing applications, text mining can be used to assess and help predict a customer's propensity to attrite.
True
