MSIS Final, Final Business Intelligence

Which of the following is/are among the common challenges associated with the implementation of NLP? A)Ambiguous syntax and semantics. B)Text-parsing task that requires the identification of word boundaries. C)All the answers are true D)Processing of regional accents. E)To mark up the terms in a text that correspond to a particular part of speech.

C

Which of the following technologies is not part of artificial intelligence? A)Bayesian belief networks B)Machine learning C)Fast Fourier transformation D)Deep learning E)Natural language processing

C

____________ refers to the analytic process of extracting actionable information from continuously generated data. A)Text stream analytics B)Stream analytics C)Both A and B are true D)All the answers are true E)Data-in-motion analytics

C

_____________ is a collection of neurons that takes inputs from the previous layer and converts those inputs into outputs for further processing. A)Connection weights B)An input layer C)A hidden layer D)An output layer E)Transfer function

C

_____________ is the system that automatically translates images of handwritten documents into machine-editable text documents. A)Speech recognition B)Text segmentation C)Optical character recognition D)Text proofing E)Machine translation

C

_____________ is typically a simplified abstraction of the human brain and its complex biological networks of neurons. A)Machine learning B)Deep learning C)Artificial neural networks D)Cognitive learning E)Representation learning

C

Which of the following is not true for MapReduce? A)MapReduce can be used in machine learning applications B)MapReduce code can be written in SQL C)Shuffle, sorting, and combiner are the key steps in MapReduce D)MapReduce has two main components, which are Mapper and Reducer E)MapReduce is a programming model

B
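The Mapper, shuffle/sort, and Reducer steps named in the card above can be sketched in plain Python. This is only an in-memory illustration of the programming model, not Hadoop code; the function names (`mapper`, `shuffle`, `reducer`, `word_count`) and the sample lines are assumptions made for the example.

```python
from collections import defaultdict

def mapper(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Shuffle/sort step: group all emitted values by key, keys in sorted order."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(sorted(groups.items()))

def reducer(key, values):
    """Reduce step: collapse the values for one key into a single count."""
    return key, sum(values)

def word_count(lines):
    """Run map, shuffle, and reduce in sequence on a list of text lines."""
    mapped = [pair for line in lines for pair in mapper(line)]
    grouped = shuffle(mapped)
    return dict(reducer(key, values) for key, values in grouped.items())

# Hypothetical input lines, used only for illustration.
counts = word_count(["big data big insight", "data in motion"])
print(counts)
```

In a real cluster the map and reduce calls run in parallel on different nodes; here they run sequentially so the data flow is easy to follow.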

Which of the following methods is used to explain the prediction of any classifier in a human-interpretable manner by learning a surrogate model locally based on the specifics of the prediction? A)Boosting B)LIME C)Bagging D)Random Forest E)SMOTE

B

____________ is the technique that is used to detect the direction of opinions about specific products and/or services using large textual data sources. A)Data mining B)Sentiment analysis C)NLP analysis D)Web mining E)Consumer analytics

B

In the text mining process, which of the following is not a method category used for knowledge extraction? A)Clustering B)Classification C)Regression D)Trend analysis E)Association

C

In the knowledge extraction method of the text mining process, ____________ refers to the natural grouping, analysis, and navigation of large text collections, such as web pages. A)Classification B)Association C)Clustering D)Regression E)Trend analysis

C

In which of the following model ensemble methods are multiple decision trees created from resampled data and their predicted values combined through averaging? A)Bootstrapping B)Stacking C)Bagging D)Boosting E)All the answers are true

C

The optimization of performance in a neural network is usually done by an algorithm called __________? A)Back-weight propagation B)None of the answers are true C)Stochastic gradient descent D)Network optimization E)Forward propagation

C

The other names for heterogeneous model ensembles are: A)Information Fusion B)Both Boosting and Stacking C)Both Information Fusion and Stacking D)Boosting E)Stacking

C

Which of the following from the bullseye diagram represents predictions that are inconsistent but come from a reasonably well-performing prediction model? A)Low bias, medium variance B)Low bias, low variance C)Low bias, high variance D)High bias, low variance E)High bias, high variance

C

Which of the following is not among the V's used to define Big Data? A)Velocity B)Veracity C)Variance D)Volume E)Variety

C

A model with low variance is the one that captures both noise and generalized patterns in the data and therefore produces an overfit model.

False

A stream in stream analytics is defined as a discrete and aggregated level of data elements.

False

Automatic summarization is a program that is used to assign documents into a predefined set of categories.

False

Big data comes from a variety of sources within the organization, including marketing and sales transactions, inventory records, financial transactions, and human resources and accounting records.

False

Clustering is a supervised learning process in which objects are assigned to a pre-determined number of artificial groups called clusters.

False

Connection weights are the key elements of an ANN. They produce the final value through the summation and transfer function.

False

Deep learning analytics is a term that refers to the computing-branded technology platforms, such as IBM Watson, that specialize in processing and analyzing large, unstructured data sets.

False

Delta (or an error) is defined as the difference between the network weights in two consecutive iterations.

False

Hadoop distributed file system was invented before Google developed MapReduce. Hence, the early versions of MapReduce relied on HDFS.

False

Hadoop is the replacement for a data warehouse that stores and processes large amounts of structured data.

False

In ensemble modeling, boosting builds several independent simple trees for the resultant prediction model.

False

In explainable AI, the LIME and SHAP methods are considered as global interpreters.

False

In prediction modeling, reducing bias also reduces the variance.

False

In the term-document-matrix, the columns represent the documents and the rows represent the terms, and the cells represent the frequencies.

False
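A minimal sketch of a term-document matrix, assuming the common textbook orientation in which documents are the rows, terms are the columns, and each cell holds a frequency (the orientation the "False" answer above appears to rely on). The function name `term_document_matrix`, the whitespace tokenization, and the two sample documents are assumptions made for the example.

```python
from collections import Counter

def term_document_matrix(documents):
    """Return (terms, rows): one row per document, one column per term."""
    tokenized = [doc.lower().split() for doc in documents]
    # Column headers: every distinct term, in alphabetical order.
    terms = sorted({token for tokens in tokenized for token in tokens})
    rows = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Cell value: how often the term occurs in this document.
        rows.append([counts[term] for term in terms])
    return terms, rows

# Hypothetical documents, used only for illustration.
terms, matrix = term_document_matrix(["risk project risk", "project management"])
print(terms)
for row in matrix:
    print(row)
```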

Information fusion type model ensembles utilize meta-modeling called super learners.

False

Model ensembles are much easier and faster to develop than individual models.

False

SCM and ERP are the first two beneficiaries of the NLP and WordNet.

False

Text-to-speech is a text processing function that can read textual content and detect and correct syntactic and semantic errors.

False

The main characteristic of convolutional networks is having two or more layers involving a convolution weight function instead of general matrix multiplication.

False

The purpose of artificial intelligence is to augment human capability.

False

The term veracity in big data analytics refers to the processing of different types and formats of data, structured and unstructured.

False

Underfitting is mainly characterized on the bias-variance trade-off continuum as a low-bias/low-variance outcome.

False

Grid computing increases efficiency, lowers total cost, and enhances production by processing computational jobs in a shared, centrally managed ordinary pool of computing resources.

True

Hadoop is an open-source framework for processing, storing, and analyzing massive amounts of distributed data of a wide variety.

True

Hadoop is not just about the volume, but also the processing of diversity of data types.

True

Human-computer interaction is a critical component of cognitive systems that allows users to interact with cognitive machines and define their needs.

True

In artificial neural networks, neurons are processing units also called processing elements that perform predefined mathematical operations on the numeric values from the input variables or the other neuron outputs to create and push out their own outputs.

True

In ensemble modeling, bagging uses the bootstrap sampling of cases to create a collection of decision trees.

True
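The bagging idea in the card above (bootstrap samples of cases, one tree per sample, predictions averaged) can be sketched with the standard library only. This is an illustrative sketch, not a production implementation: each "tree" is a one-split regression stump, and the names `bootstrap_sample`, `fit_stump`, `bagged_predict`, and the toy data are assumptions made for the example.

```python
import random

def bootstrap_sample(cases, rng):
    """Draw len(cases) cases with replacement (the bootstrap step)."""
    return [rng.choice(cases) for _ in cases]

def fit_stump(cases):
    """Fit a one-split regression tree (a stump) on (x, y) cases."""
    xs = sorted({x for x, _ in cases})
    best = None
    # Candidate thresholds are midpoints between consecutive distinct x values.
    for lower, upper in zip(xs, xs[1:]):
        threshold = (lower + upper) / 2
        left = [y for x, y in cases if x <= threshold]
        right = [y for x, y in cases if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((y - left_mean) ** 2 for y in left)
               + sum((y - right_mean) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:
        # Every x in the sample is identical: predict the overall mean.
        overall = sum(y for _, y in cases) / len(cases)
        return lambda x: overall
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean

def bagged_predict(cases, x, n_trees=25, seed=0):
    """Train one stump per bootstrap sample and average their predictions."""
    rng = random.Random(seed)
    trees = [fit_stump(bootstrap_sample(cases, rng)) for _ in range(n_trees)]
    return sum(tree(x) for tree in trees) / n_trees

# Hypothetical (x, y) cases forming a low group and a high group.
cases = [(1, 1.0), (2, 1.2), (3, 0.9), (7, 5.1), (8, 4.8), (9, 5.0)]
low = bagged_predict(cases, 2)
high = bagged_predict(cases, 8)
print(low, high)
```

Because every stump sees a different resample, individual trees disagree slightly; averaging them is what smooths out the effect of any single unusual sample.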

In marketing applications, text mining can be used to assess and help predict a customer's propensity to attrite.

True

In text mining, associations refer to direct relationships between terms or sets of concepts.

True

In the context of the text mining process, both structured and unstructured data are extracted from the data sources and converted into context-specific knowledge.

True

In typical data stream mining applications, the purpose is to predict the class or value of new instances in the data stream, given some knowledge about the class membership or values of previous instances in the data stream.

True

Model ensembles are known to be more robust against outliers and noise in the data, compared to individual models.

True

The main aim of NLP is to move away from word counting to a real understanding and processing of natural human language.

True

The term long short-term memory network refers to a network that is used to remember what happened in the past for a long enough time that it can be leveraged in accomplishing the task when needed.

True

Tokenizing refers to the process of breaking sentences into blocks of text that perform a specific linguistic function.

True

Bias is often defined as the difference between a model's prediction output and the actual values for a given prediction problem.

True

Cognitive computing has the capability to simulate human thought processes to assist humans in finding solutions to complex problems.

True

The data elements in a stream are often referred to as: A)Variables B)Tuples C)Vectors D)Quartiles E)Doubles

B

____________ is the node in a Hadoop cluster that initiates and coordinates MapReduce jobs or the processing of the data. A)Name supervisor B)Execution manager C)Data node identifier D)Node tracker E)Job tracker

E

In a big data analytics initiative, to create a fact-based decision-making culture, senior management needs to: A)Be a vocal supporter B)Establish alignment between business and IT strategy C)Have a clear business need D)Hire personnel with advanced analytical skills E)Establish a strong data infrastructure

A

The main drawback of NoSQL functions in database processing is: A)They have traded ACID compliance for performance and scalability B)They can't handle batch processing C)They can't be used for querying unstructured data D)None of the answers is true E)They can't handle large amounts of data

A

Which of the following from the bullseye diagram interprets the predictions that are neither consistent nor accurate? A)High bias, high variance B)Low bias, low variance C)Low bias, medium variance D)Low bias, high variance E)High bias, low variance

A
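The bullseye cards treat bias as how far predictions land from the target on average and variance as how scattered they are. A minimal sketch of that reading, assuming repeated predictions for one known actual value; the function name `bias_and_variance` and the numbers are assumptions made for the example.

```python
from statistics import mean, pvariance

def bias_and_variance(predictions, actual):
    """Return (bias, variance) for repeated predictions of one actual value."""
    bias = mean(predictions) - actual   # average miss: accuracy
    variance = pvariance(predictions)   # spread of the predictions: consistency
    return bias, variance

actual = 5
# Hypothetical predictions: consistent but off target (high bias, low variance).
tight_bias, tight_var = bias_and_variance([8, 8, 8, 8], actual)
# Hypothetical predictions: off target and scattered (high bias, high variance).
wide_bias, wide_var = bias_and_variance([2, 14, 6, 10], actual)
print(tight_bias, tight_var)
print(wide_bias, wide_var)
```

Both sets miss the target by the same average amount, but only the second is also inconsistent, which is the "neither consistent nor accurate" case.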

Which of the following skills should a data scientist have? A)All the answers are true B)Creativity and out of the box thinking C)Communication and Interpersonal skills D)Programming and Scripting E)Domain expertise

A

In the context of text mining, the large and structured set of texts that is commonly stored and processed electronically and prepared for the purpose of conducting knowledge discovery is referred to as ______________. A)Lemmatization B)Corpus C)Stemming D)Terms E)Concepts

B

In the context of text mining, which of the following is the part of NLP that studies the internal structure of words (i.e., the patterns of word formation within a language or across languages)? A)Concepts B)Terms C)Stemming D)Morphology E)Corpus

D

What is the proper summation function of a single neuron with two inputs and corresponding weights? A)Y = 2(X1W1 + X2W2) B)Y = X1W2 + X2W1 C)Y = W1 + W2 D)Y = X1W1 + X2W2 E)Y = X1 + X2

D
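The summation function in the card above, Y = X1W1 + X2W2, is a weighted sum of the inputs. A minimal sketch; the function name `weighted_sum` and the sample inputs and weights are assumptions made for the example.

```python
def weighted_sum(inputs, weights):
    """Summation function of a single neuron: Y = X1*W1 + X2*W2 + ..."""
    if len(inputs) != len(weights):
        raise ValueError("each input needs exactly one weight")
    return sum(x * w for x, w in zip(inputs, weights))

# Hypothetical inputs X1 = 2.0, X2 = 4.0 with weights W1 = 0.5, W2 = 0.25.
y = weighted_sum([2.0, 4.0], [0.5, 0.25])
print(y)  # 2.0
```

In a full neuron this sum would then be passed through a transfer function to produce the output.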

Which of the following are considered as the common use cases for cognitive computing? A)Language translation B)Language translation C)Context-based sentiment analysis D)All the answers are true E)Fraud detection

D

Which of the following industries most commonly uses cognitive computing? A)Securities B)Finance C)Manufacturing D)Entertainment E)Banking

D

Which of the following is not among the popular tasks performed by NLP? A)Foreign language reading and writing B)Speech recognition C)Text to speech D)Speech acts E)Machine translation

D

Why is cognitive search different from traditional search? A)Works on a narrow search space B)Uses advanced statistical technologies C)Builds general purpose search applications D)Can handle a variety of data types E)Focuses on the syntactic nature of the searched data

D

____________ is often used to characterize the relationships between the individual terms and individual documents. A)Term-document matrix B)Dimensionality C)Confusion matrix D)Indices E)Modeling

D

_____________ provides an explanation for an individual data point of the joint distribution of independent variables. A)Accuracy B)Sensitivity C)None of the answers are true D)Local interpretability E)SMOTE

D

In predictive analytics, the term bias commonly refers to A)Variance B)Constant C)Accuracy D)Consistency E)Error

E

The most popular approach in a neural network is __________, which allows all neurons to link the output in one layer to the next layer's input. A)LSTM B)Complex topology design C)Recurrent neural connections D)None of the answers are true E)Feedforward multilayered perceptron

E

Which of the following applications do not utilize the capabilities of text mining? A)Biomedical applications B)Academic literature applications C)Marketing applications D)Security applications E)None of the answers is true

E

Which of the following are the benefits of model ensembles? A)Robustness B)Accuracy C)Coverage D)Stable E)All the answers are true

E

Which of the following facts are related to Hadoop? A)Hadoop is not a single product, it is an ecosystem B)Hadoop empowers analytics C)Hadoop is a file management system that employs several products D)All of the other answers are true E)Only A and B are true

E

Which of the following is not a product from Apache Hadoop foundation? A)Hive B)MapReduce C)Pig D)HBase E)HANA

E

Which of the following methods is classified under model ensemble? A)Clustering B)SMOTE C)SVM variants D)Bootstrapping E)Boosting

E

Which of the following methods are included in Deep Learning? A)PCA B)LSTM C)CNN, RNN D)Both B and C are true E)Both A and C are true

D

Which of the following techniques are used to solve the imbalanced data problems? A)Data Sampling Methods, Cost Sensitive Methods, Algorithmic Methods, Resemble Methods B)Data Sampling Methods, Cost Sensitive Methods, Data effective Methods, Ensemble Methods C)Data Sampling Methods, Cost Sensitive Methods, Simulation Methods, Ensemble Methods D)Data Sampling Methods, Cost Insensitive Methods, Algorithmic Methods, Ensemble Methods E) Data Sampling Methods, Cost Sensitive Methods, Algorithmic Meth

E

A data scientist's main objective is to organize and analyze large amounts of data, to solve complex problems, often using software specifically designed for the task.

True

Among the variety of factors, the key driver for big data analytics is the business needs at any level, including strategic, tactical, or operational.

True

Artificial intelligence has the capability to find hidden patterns in a variety of data sources to identify problems and provide potential solutions.

True

