Predictive Analytics and BI Exam 2
Which of the following tasks are involved in the learning process of an ANN?
-Adjust the weights and repeat the process -Compute temporary outputs -Compare outputs with desired targets
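The three tasks above can be sketched as a training loop for a single neuron. This is only an illustration: the step transfer function, the learning rate of 0.1, and the logical OR training set are assumptions chosen for brevity, not part of the exam material.

```python
# Single-neuron learning loop: compute output, compare with target, adjust weights.
# The OR data set, learning rate, and step activation are illustrative assumptions.

def step(value):
    """Threshold transfer function: output 1 when the weighted sum is positive."""
    return 1 if value > 0 else 0

def train_neuron(samples, learning_rate=0.1, max_epochs=100):
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(max_epochs):
        total_error = 0
        for inputs, target in samples:
            # 1. Compute a temporary output.
            output = step(sum(x * w for x, w in zip(inputs, weights)) + bias)
            # 2. Compare the output with the desired target.
            delta = target - output
            total_error += abs(delta)
            # 3. Adjust the weights, then repeat the process.
            weights = [w + learning_rate * delta * x for w, x in zip(weights, inputs)]
            bias += learning_rate * delta
        if total_error == 0:  # every sample matched its target in this pass
            break
    return weights, bias

OR_SAMPLES = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
weights, bias = train_neuron(OR_SAMPLES)

def predict(inputs):
    return step(sum(x * w for x, w in zip(inputs, weights)) + bias)
```

Here `delta` is the difference between the desired target and the actual output, which is the quantity the weight update is scaled by.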
Which of the following methods are included in Deep Learning?
-CNN, RNN -LSTM
Which of the following skills should a data scientist have?
-Communication and Interpersonal skills -Programming and Scripting -Creativity and out of the box thinking -Domain expertise
Which of the following facts are related to Hadoop?
-Hadoop is a file management system that employs several products -Hadoop is not a single product, it is an ecosystem -Hadoop empowers analytics
Which of the following are considered as the common use cases for cognitive computing?
-Language translation -Fraud detection -Context-based sentiment analysis
Structured data is usually organized into records with simple data values that include:
-Ordinal -Nominal -Numeric -Categorical
____________ refers to the analytic process of extracting actionable information from continuously generated data.
-Stream analytics -Data-in-motion analytics
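As a rough sketch of analyzing data in motion, the generator below updates a moving average each time a new reading arrives instead of waiting for a complete data set. The window size and the sample readings are hypothetical.

```python
from collections import deque

def sliding_average(stream, window_size=3):
    """Yield the mean of the most recent `window_size` readings as each one arrives."""
    window = deque(maxlen=window_size)  # older readings fall out automatically
    for reading in stream:
        window.append(reading)
        yield sum(window) / len(window)

# Hypothetical sensor readings arriving one at a time.
readings = iter([10, 20, 30, 40])
averages = list(sliding_average(readings))
```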
Which of the following is/are among the common challenges that are associated with the implementation of NLP:
-Text-parsing task that requires the identification of word boundaries. -Processing of regional accents. -To mark up the terms in a text that correspond to a particular part of speech. -Ambiguous syntax and semantics.
_____________ is a collection of neurons that takes inputs from the previous layer and converts those inputs into outputs for further processing.
A hidden layer
Why is cognitive search different from traditional search?
Can handle a variety of data types
Which of the following is not among the data sampling methods?
Clustering
In predictive analytics, the term variance commonly refers to
Consistency
In the context of text mining, the large and structured set of texts that is commonly stored and processed electronically and prepared for the purpose of conducting knowledge discovery is referred to as a ______________.
Corpus
Information fusion type model ensembles utilize meta-modeling called super learners.
False
MapReduce is a contemporary programming language designed to be used by computer programmers.
False
Polygraph is a non-intrusive deception-detection technique commonly used to assess the level of truthfulness in the textual content.
False
SCM and ERP are the first two beneficiaries of the NLP and WordNet.
False
Text-to-speech is a text processing function that can read textual content and detect and correct syntactic and semantic errors.
False
The main characteristic of deep learning solutions is that they use AI to understand and organize data, predict the intent of a search query, improve the relevancy of results, and automatically tune the relevancy of results over time.
False
The term veracity in big data analytics refers to the processing of different types and formats of data, structured and unstructured.
False
Underfitting is mainly characterized on the bias-variance trade-off continuum as a low-bias/low-variance outcome.
False
The most popular approach in a neural network is __________, which allows all neurons in one layer to link their output to the next layer's input.
Feedforward multi-layered perceptron
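A forward pass through such a network can be sketched as follows. The layer sizes, weights, biases, and the sigmoid transfer function are all made-up values for illustration; information flows only from one layer to the next, with no feedback connections.

```python
import math

def sigmoid(value):
    return 1.0 / (1.0 + math.exp(-value))

def layer_forward(inputs, weights, biases):
    """Each neuron sums every output of the previous layer, then applies the transfer function."""
    return [
        sigmoid(sum(x * w for x, w in zip(inputs, neuron_weights)) + b)
        for neuron_weights, b in zip(weights, biases)
    ]

# Hypothetical network: 2 inputs -> 2 hidden neurons -> 1 output neuron.
hidden_weights = [[0.5, -0.4], [0.3, 0.8]]
hidden_biases = [0.0, 0.1]
output_weights = [[1.0, -1.0]]
output_biases = [0.0]

hidden = layer_forward([1.0, 2.0], hidden_weights, hidden_biases)
output = layer_forward(hidden, output_weights, output_biases)
```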
Which of the following is not a product from the Apache Hadoop foundation?
HANA
Which of the following from the bullseye diagram interprets the predictions that are neither consistent nor accurate?
High bias, high variance
Application examples of MapReduce include:
Indexing, graph analysis, text analysis, machine learning
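For text analysis, the MapReduce pattern is often introduced with a word count. The sketch below imitates the map, shuffle, and reduce phases in a single Python process; it is not Hadoop code, and the function names and sample documents are hypothetical.

```python
from collections import defaultdict

def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle_phase(pairs):
    """Group all emitted values by key, as the framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Combine the values collected for each key into a single count."""
    return {key: sum(values) for key, values in grouped.items()}

documents = ["big data big insight", "data in motion"]
pairs = [pair for doc in documents for pair in map_phase(doc)]
word_counts = reduce_phase(shuffle_phase(pairs))
```

In a real cluster the map and reduce calls would run in parallel on different nodes; here they run sequentially to keep the data flow visible.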
____________ is often used to characterize the relationships between the individual terms and individual documents.
Indices
Which of the following are the best options available to manage the TDM matrix size?
Labor-intensive process, singular value decomposition, eliminate terms.
_____________ provides an explanation for an individual data point of the joint distribution of independent variables.
Local interpretability
Which of the following are the most commonly used normalization methods?
Log, Binary and Inverse document frequencies
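The three normalization methods can be sketched as small functions applied to a raw term frequency. The formulas below follow one common textbook formulation and are stated as an assumption, since the exam itself gives no formulas; other variants of inverse document frequency exist.

```python
import math

def log_frequency(tf):
    """Dampen raw counts: 1 + log(tf) for terms that occur, 0 otherwise."""
    return 1 + math.log(tf) if tf > 0 else 0.0

def binary_frequency(tf):
    """Record only whether the term occurs in the document."""
    return 1 if tf > 0 else 0

def inverse_document_frequency(tf, total_docs, docs_with_term):
    """Weight the log frequency by how rare the term is across the collection."""
    if tf == 0:
        return 0.0
    return (1 + math.log(tf)) * math.log(total_docs / docs_with_term)
```

Under this formulation a term that appears in every document receives a weight of zero, because it does not help to tell documents apart.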
Model ensembles are known to be more robust against outliers and noise in the data, compared to individual models.
True
Multilayer perceptron type deep networks are also known as feedforward networks because the flow of information that goes through them is always forwarding, and no feedback connections are allowed.
True
Sensitivity analysis based on input value perturbation is often used in trained feed-forward neural network modeling where all of the input variables are numeric and standardized.
True
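Input value perturbation can be illustrated by nudging one input at a time and measuring how much the prediction moves. The `model` function below is a stand-in with made-up coefficients, not a trained network, and the perturbation size is an arbitrary choice.

```python
def model(inputs):
    """Stand-in for a trained network; the coefficients are made up."""
    x1, x2, x3 = inputs
    return 2.0 * x1 + 0.5 * x2 + 0.0 * x3

def perturbation_sensitivity(predict, baseline, step=0.1):
    """Change in output per unit change in each input, holding the others fixed."""
    base_output = predict(baseline)
    sensitivities = []
    for index in range(len(baseline)):
        perturbed = list(baseline)
        perturbed[index] += step  # perturb exactly one input
        sensitivities.append(abs(predict(perturbed) - base_output) / step)
    return sensitivities

# Baseline at zero, as if the inputs were standardized.
sensitivities = perturbation_sensitivity(model, [0.0, 0.0, 0.0])
```

Because every input is perturbed by the same amount, the comparison is only meaningful when the inputs share a scale, which is why the statement above requires numeric, standardized variables.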
Sensitivity analysis based on the leave-one-out methodology can be applied to any predictive analytics method because of its model agnostic implementation methodology.
True
Singular value decomposition helps reduce the overall structure of the term-document matrix to a lower-dimensional space for further pattern/knowledge discovery.
True
The main aim of NLP is to move away from word counting to a real understanding and processing of natural human language.
True
The term velocity in big data analytics refers to how fast the digitized data is created and processed.
True
Which of the following is not among the V's used to define Big Data?
Variance
What is the proper summation function of a single neuron with two inputs and corresponding weights?
Y = X1W1 + X2W2
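The summation function is a weighted sum, and it generalizes to any number of inputs. A minimal sketch with arbitrary example values:

```python
def weighted_sum(inputs, weights):
    """Y = X1*W1 + X2*W2 + ... for any number of inputs."""
    return sum(x * w for x, w in zip(inputs, weights))

# Two inputs with example weights: 1.0*0.5 + 2.0*0.25
y = weighted_sum([1.0, 2.0], [0.5, 0.25])
```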
____________ is the most popular neural network learning method that applies the chain rule of calculus to compute the derivatives of functions.
Backpropagation
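The chain rule at the heart of backpropagation can be shown on a single sigmoid neuron. The squared-error loss and the numeric values are assumptions chosen for illustration; the finite-difference estimate is included only as a sanity check on the analytic gradient.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss(w, b, x, target):
    """Squared error of a single sigmoid neuron."""
    return 0.5 * (sigmoid(w * x + b) - target) ** 2

def gradient_w(w, b, x, target):
    """dLoss/dw = dLoss/dy * dy/dz * dz/dw, multiplied together by the chain rule."""
    y = sigmoid(w * x + b)
    dloss_dy = y - target
    dy_dz = y * (1.0 - y)  # derivative of the sigmoid
    dz_dw = x
    return dloss_dy * dy_dz * dz_dw

w, b, x, target = 0.5, 0.1, 2.0, 1.0
analytic = gradient_w(w, b, x, target)

# Central finite difference as an independent check.
eps = 1e-6
numeric = (loss(w + eps, b, x, target) - loss(w - eps, b, x, target)) / (2 * eps)
```

In a multi-layer network the same product of local derivatives is extended backward through each layer, which is where the method gets its name.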
In which of the following model ensemble methods are multiple decision trees created from resampled data and their predicted values then combined through averaging?
Bagging
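Bagging can be sketched with one-split regression trees (stumps) standing in for full decision trees. The data points, the number of trees, and the random seed are hypothetical; the two essential steps are the bootstrap resampling and the averaging of predictions.

```python
import random

def fit_stump(points):
    """Fit a one-split regression tree: pick the threshold with the lowest squared error."""
    best = None
    for threshold in sorted({x for x, _ in points}):
        left = [y for x, y in points if x <= threshold]
        right = [y for x, y in points if x > threshold]
        if not right:
            continue  # a split must leave cases on both sides
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((y - left_mean) ** 2 for y in left)
                 + sum((y - right_mean) ** 2 for y in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    if best is None:  # all x values identical: fall back to a constant prediction
        mean = sum(y for _, y in points) / len(points)
        return lambda x: mean
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean

def bagging_predict(points, x, n_trees=25, seed=0):
    rng = random.Random(seed)
    predictions = []
    for _ in range(n_trees):
        # Bootstrap: draw as many cases as the original data, with replacement.
        sample = [rng.choice(points) for _ in points]
        predictions.append(fit_stump(sample)(x))
    # Combine the individual trees by averaging their predictions.
    return sum(predictions) / len(predictions)

data = [(1, 1.0), (2, 1.2), (3, 0.9), (6, 5.0), (7, 5.2), (8, 4.9)]
low = bagging_predict(data, 2)
high = bagging_predict(data, 7)
```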
In a big data analytics initiative, to create a fact-based decision-making culture, the senior management needs to:
Be a vocal supporter
Which of the following methods is classified as a model ensemble?
Boosting
The other names for heterogeneous model ensembles are:
Both Information Fusion and Stacking
In explainable AI, the LIME and SHAP methods are considered as global interpreters.
False
In prediction modeling, reducing bias also reduces the variance.
False
Which of the following techniques are used to address imbalanced data problems?
Data Sampling Methods, Cost Sensitive Methods, Algorithmic Methods, Ensemble Methods
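Of the four families listed, data sampling is the simplest to illustrate. The sketch below performs random oversampling, duplicating minority-class records until every class matches the largest one. The record layout and the `label_of` helper are hypothetical, and other sampling schemes (such as undersampling the majority class) are equally valid.

```python
import random

def random_oversample(records, label_of, seed=0):
    """Duplicate randomly chosen minority-class records until all classes are the same size."""
    rng = random.Random(seed)
    by_class = {}
    for record in records:
        by_class.setdefault(label_of(record), []).append(record)
    target_size = max(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(group)
        # Draw with replacement to fill the gap to the largest class.
        balanced.extend(rng.choice(group) for _ in range(target_size - len(group)))
    return balanced

# Hypothetical records: (identifier, class label), with class 1 as the rare class.
records = [("a", 0), ("b", 0), ("c", 0), ("d", 0), ("e", 1)]
balanced = random_oversample(records, label_of=lambda record: record[1])
```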
____________ is the process of extracting novel patterns and knowledge structures from continuous, rapid data records.
Data stream mining
In the context of text mining, lemmatization is a process of statically reducing words to their stem/root form.
False
In the first task of the text mining process, the data is structured and preprocessed to achieve hidden patterns and knowledge nuggets.
False
In the term-document-matrix, the columns represent the documents and the rows represent the terms, and the cells represent the frequencies.
False
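A small term-document matrix can be built as follows, assuming the convention implied by the answer above: rows represent documents, columns represent terms, and cells hold frequencies. The whitespace tokenizer and the two sample documents are simplifications for illustration.

```python
from collections import Counter

def term_document_matrix(documents):
    """Return the sorted vocabulary and one row of term counts per document."""
    tokenized = [doc.lower().split() for doc in documents]
    terms = sorted({term for tokens in tokenized for term in tokens})
    rows = []
    for tokens in tokenized:
        counts = Counter(tokens)
        rows.append([counts[term] for term in terms])  # one column per term
    return terms, rows

terms, matrix = term_document_matrix(["risk model risk", "model data"])
```

With these inputs, `terms` is `['data', 'model', 'risk']` and `matrix` is `[[0, 1, 2], [1, 1, 0]]`.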
Which of the following industries most commonly uses cognitive computing?
Entertainment
Which of the following sequences of tasks represents the text mining process?
Establish the Corpus, Preprocess the Data, Extract the Knowledge
A stream in stream analytics is defined as a discrete and aggregated level of data elements.
False
Big data comes from a variety of sources within the organization, including marketing and sales transaction, inventory records, financial transaction, and human resources and accounting records.
False
Connection weights are the key elements of an ANN. They produce the final value through the summation and transfer function.
False
Deep learning analytics is a term that refers to the computing-branded technology platforms, such as IBM Watson, that specialize in processing and analyzing large, unstructured data sets.
False
Delta (or an error) is defined as the difference between the network weights in two consecutive iterations.
False
Hadoop distributed file system was invented before Google developed MapReduce. Hence, the early versions of MapReduce relied on HDFS.
False
Hadoop is the replacement for a data warehouse that stores and processes large amounts of structured data.
False
In a typical neural network, the goal of the testing process is to adjust the network weights and biases, such that the network output for each set of inputs is adequately close to its corresponding target value.
False
In ensemble modeling, boosting builds several independent simple trees for the resultant prediction model.
False
Which of the following is not true for MapReduce?
MapReduce code can be written in SQL
In the context of text mining, which of the following is a part of NLP that studies the internal structure of words (i.e., the patterns of word formation within a language or across languages).
Morphology
_____________ is the system that automatically translates images of handwritten documents into machine-editable textual documents.
Optical character recognition
Creating a good prediction model requires finding a(n) ___________ between the errors related to bias and variance.
Optimal balance
The neural networks in which feedback connections are allowed are called ____________.
Recurrent neural networks
In the text mining process, which of the following is not a method category used for knowledge extraction?
Regression
Which of the following is not considered a drawback of model ensembles?
Reliable
Which of the following is not considered as a key component of Hadoop?
SQL
A data scientist's main objective is to organize and analyze large amounts of data, to solve complex problems, often using software specifically designed for the task.
True
Among the variety of factors, the key driver for big data analytics is the business needs at any level, including strategic, tactical, or operational.
True
Artificial intelligence has the capability to find hidden patterns in a variety of data sources to identify problems and provide potential solutions.
True
Bias is often defined as the difference between a model's prediction output and the actual values for a given prediction problem.
True
Cognitive computing has the capability to simulate human thought processes to assist humans in finding solutions to complex problems.
True
Hadoop is a batch-oriented computing framework, which implies it does not support real-time data processing and analysis.
True
Human-computer interaction is a critical component of cognitive systems that allow users to interact with cognitive machines and define their needs.
True
In artificial neural networks, neurons are processing units also called processing elements that perform predefined mathematical operations on the numeric values from the input variables or the other neuron outputs to create and push out their own outputs.
True
In ensemble modeling, bagging uses the bootstrap sampling of cases to create a collection of decision trees.
True
In marketing applications, text mining can be used to assess and help predict a customer's propensity to attrite.
True
In text mining, associations refer to direct relationships between terms or sets of concepts.
True
