Analytics II Natural Language Processing

How are prescriptive analytics methods different from the other two types?

"what to do?" queries, not "what-is?" queries.

Imagine you are solving a classification problem with highly imbalanced classes. The majority class is observed 99% of the time in the training data. Your model has 99% accuracy on the test data. Which of the following is true in such a case? (1) Accuracy metric is not a good idea for imbalanced class problems. (2) Accuracy metric is a good idea for imbalanced class problems. (3) Precision and recall metrics are good for imbalanced class problems. (4) Precision and recall metrics aren't good for imbalanced class problems. a. 1 and 3 b. 1 and 4 c. 2 and 3 d. 2 and 4

(1) Accuracy metric is not a good idea for imbalanced class problems. (3) Precision and recall metrics are good for imbalanced class problems.
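To see why accuracy misleads here, the following is a minimal sketch using made-up data: a hypothetical test set of 1000 labels with 1% positives and a "model" that always predicts the majority class. The helper names `accuracy` and `precision_recall` are illustrative, not from any library, and precision is reported as 0.0 by convention when nothing is predicted positive.

```python
# Hypothetical test set: 990 majority-class (0) and 10 minority-class (1) labels.
y_true = [0] * 990 + [1] * 10
# A trivial "model" that always predicts the majority class.
y_pred = [0] * 1000

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def precision_recall(y_true, y_pred, positive=1):
    """Precision and recall for the chosen positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    # Convention assumed here: report 0.0 when the denominator is zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

acc = accuracy(y_true, y_pred)
prec, rec = precision_recall(y_true, y_pred)
print(acc, prec, rec)
```

The always-majority predictor scores high accuracy while finding none of the minority-class examples, which is what precision and recall expose.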

Suppose we train a model to predict whether an email is Spam or Not Spam. After training the model, we apply it to a test set of 500 new emails and the model produces the following table. What is the precision of this model?

70%

Overfitting occurs when: A. Performance on the training data is good but performance on the testing data is bad B. The model is too simple C. Performance on both the training and testing data is bad D. Performance on both the training and testing data is good

A. Performance on the training data is good but performance on the testing data is bad

Which of the following makes a neural network non-linear?

Activation function
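As a rough illustration of this answer, the sketch below stacks two one-dimensional linear layers with and without a ReLU in between. The weights are arbitrary illustrative values, and `midpoint_gap` is a hypothetical helper: it returns 0 for any affine function, so a non-zero value signals non-linearity.

```python
def linear(w, b):
    """Return a one-dimensional linear layer x -> w * x + b."""
    return lambda x: w * x + b

def relu(x):
    """Rectified linear unit activation."""
    return max(0.0, x)

# Arbitrary illustrative weights and biases.
layer1 = linear(2.0, -1.0)
layer2 = linear(-3.0, 0.5)

def stacked_linear(x):
    # Two linear layers with no activation collapse to one linear map.
    return layer2(layer1(x))

def stacked_relu(x):
    # The same layers with a ReLU in between.
    return layer2(relu(layer1(x)))

def midpoint_gap(f, a, b):
    """f(midpoint) minus the average of f(a) and f(b); zero for affine f."""
    return f((a + b) / 2) - (f(a) + f(b)) / 2

gap_linear = midpoint_gap(stacked_linear, 0.0, 2.0)
gap_relu = midpoint_gap(stacked_relu, 0.0, 2.0)
print(gap_linear, gap_relu)
```

Without the activation, adding layers does not add expressive power; with it, the composed function bends.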

Choose from the following areas where NLP can be useful.

All of the above

Convolutional Neural Network is used in____ a. Image classification b. Text classification c. Computer vision d. All of the above

All of the above

How to Improve a Model Performance? A Add more data to Improve data quality B Choose more advanced models C Search for most appropriate parameters D All of the above

All of the above

What are the common types of data analytics? A Predictive. B Descriptive. C Prescriptive. D All of the above.

All of the above

What is learning in deep learning? a. Learning, in the context of machine learning, describes an automatic search process for better representation b. A process that is "learned" from exposure to known examples of inputs and outputs c. The procedure of finding the weights that minimize the loss function d. All of the above e. None of the above

All of the above

Which of the following algorithms has adopted a pre-training paradigm? a. GPT-3 b. BERT c. Autoencoder d. All of the above e. None of the above

All of the above

What is the purpose of a loss function? a. Calculate the error value of the forward network b. Optimize the error values according to the error rate c. Both A and B d. None of the above

Both A and B

What is the use of forward propagation? A. Calculate errors B. Train the model C. Improve the accuracy of the model D. Recalculate the weight of the parameter

Calculate errors

Which data analytics application partitions a collection of objects into non-predefined groupings with similar features? a) Classification

Clustering

Facebook's facial recognition is an example of _________.

Computer vision

What is NOT true about data mining? a. Data analytics is defined as the procedure of extracting information from huge sets of data to support business decision-making b. Data analytics also involves other processes such as Data Cleaning, Data Integration, Data Transformation c. Data analytics is the procedure of mining knowledge from data d. Data analytics will always improve business decision-making e. None of the above

Data analytics will always improve business decision-making

Data obtained during customer review can be used in which stage of data analytics? A Descriptive Analytics B Predictive Analytics C Prescriptive Analytics D None of Above

Descriptive Analytics

Which of the following is a common use of unsupervised clustering?

Detect outliers

Over N experiments, what will be the average value of a random variable? A Mean Squared Error B Expected Value C Mean Absolute Error D Area Under Curve

Expected Value

A model with 91% accuracy suggests that the model must work very well.

False

Adding nonlinearity allows the model to express simple nonlinear functions A. True B. False

False

Clustering is the most common type of supervised learning? A. True B. False

False

Dropout can be used to prevent underfitting problems during the deep learning training process.

False

Increase in size of a convolutional kernel would necessarily increase the performance of a convolutional network.

False

One Hot Representations use continuous vectors to retain syntactic and semantic word relationships? A. True B. False

False

Text summarization is the process of assigning a topic label to a piece of text.

False

Topic Modeling is a type of supervised learning. A. True B. False

False

True/false question: One of the pros of training language models is that all prior context information is helpful

False

True/false question: Skip-gram (SG): predict center word from context/surrounding words

False

Which of the following is FALSE about Deep Learning and Machine Learning?

Feature Extraction needs to be done manually in both ML and DL algorithms

What is the purpose of gradient descent? A. Minimize training time B. Compress data hidden layers C. Find the parameters that can give lowest errors D. Train the model to learn new problems faster

Find the parameters that can give lowest errors

Consider the scenario. The problem you are trying to solve has a small amount of data. Fortunately, you have a pre-trained neural network that was trained on a similar problem. Which of the following methodologies would you choose to make use of this pre-trained network?

Freeze the first several layers and fine-tune the remaining layers on the new dataset

How well a model applies to new data that has not been used to build the model. This is the definition of? a/ Underfitting b/ Bad Generalization c/ Generalization

Generalization

Word2vec is used to_______

Generate vectors out of words

In which area are RNNs used to achieve the best results? A Handwriting and speech recognition B Handwriting and images recognition C Speech and images recognition D Financial predictions

Handwriting and speech recognition

Assume we have built a spam-email detection system. Simon doesn't even know where the "Junk" directory is. He would much prefer to see spam emails in his inbox than to miss genuine emails without knowing. Which of the following evaluation metrics is most important for Simon?

High precision

The main reason why text data is particularly hard to analyze is

Is unstructured and difficult to represent effectively

Which of the following is true about max-pooling?

It allows the network to reduce the number of parameters and in the meantime, to retain the relevant information as much as possible

Which of the following sentences is FALSE regarding regression? a. It relates inputs to outputs b. It is used for prediction c. It may be used for interpretation d. It discovers causal relationships

It discovers causal relationships

What is deep in deep learning?

It stands for the idea of successive layers of representations in deep learning

knowledge

Knowledge: use information for a given task

What if we use a learning rate that is too large?

Model training may never converge and may even diverge
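A toy sketch of this effect, assuming the simple cost function f(w) = w**2 (chosen only for illustration): each update multiplies w by (1 - 2 * learning_rate), so a rate above 1.0 makes the iterate grow in magnitude instead of shrinking. The function name `run` and the two rates are hypothetical choices.

```python
def run(learning_rate, w=1.0, steps=20):
    """Gradient descent on f(w) = w**2, whose gradient is 2 * w."""
    for _ in range(steps):
        w -= learning_rate * 2 * w
    return w

small = run(0.1)   # each step scales w by 0.8, so w shrinks toward 0
large = run(1.1)   # each step scales w by -1.2, so |w| keeps growing
print(small, large)
```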

Suppose we want to compute the 10-fold cross-validation error on 100 training examples. We need to compute the error N1 times, and the cross-validation error is the average of those errors. To compute each error, we need to build a model with data of size N2 and test the model on data of size N3. What are the appropriate numbers for N1, N2, N3?

N1 = 10;N2 = 90; N3 = 10
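The counts can be checked with a small sketch that splits 100 example indices into 10 contiguous folds. `k_fold_indices` is a hypothetical helper written for this illustration; it assumes the number of samples divides evenly by k and does no shuffling.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    fold_size = n_samples // k  # assumes n_samples is divisible by k
    indices = list(range(n_samples))
    for fold in range(k):
        start, stop = fold * fold_size, (fold + 1) * fold_size
        test = indices[start:stop]
        train = indices[:start] + indices[stop:]
        yield train, test

folds = list(k_fold_indices(100, 10))
n1 = len(folds)        # number of models built and errors computed
n2 = len(folds[0][0])  # training size for each model
n3 = len(folds[0][1])  # held-out size for each model
print(n1, n2, n3)
```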

___________, is the sub-field of AI that is focused on enabling computers to understand and process human languages.

Natural Language Processing

What is not an advantage of using RNN language models? A. They consider word context B. They can process any length input C. They use the same weight at all sequence points D. None of the above

None of the above

Which of the following models does not apply transfer learning techniques?

None of the above

Which of the following techniques does NOT prevent a model from overfitting? a. Data augmentations b. Dropout c. Early stopping d. None of the above

None of the above

What is gradient descent?

Optimization algorithm

tanh activation function is often used in ? A RNN B CNN C Both D Not applicable

RNN

You are given data about seismic activity in Japan, and you want to predict whether an earthquake will happen in the next year. This is an example of

Supervised learning

Spam email detection comes under which domain?

Text classification

Prescriptive Analytics

The set of analytical techniques that yield a best course of action.

What does a gradient descent algorithm do?

Tries to find the parameters of a model that minimizes the cost function based on mathematical calculations
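A minimal sketch of the idea, assuming a one-parameter model y = w * x, a mean squared error cost, and made-up data generated by y = 2x. The names `cost`, `gradient`, and `gradient_descent`, along with the learning rate and step count, are illustrative choices rather than a standard API.

```python
# Made-up data generated by y = 2x, so the best parameter is w = 2.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

def cost(w):
    """Mean squared error of the model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def gradient(w):
    """Derivative of the cost with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)

def gradient_descent(w=0.0, learning_rate=0.01, steps=500):
    """Repeatedly step against the gradient to lower the cost."""
    for _ in range(steps):
        w -= learning_rate * gradient(w)
    return w

w_fit = gradient_descent()
print(w_fit, cost(w_fit))
```

Each step moves w a little in the direction that lowers the cost, which is the feedback loop the card describes.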

Building effective traditional machine learning models needs expert knowledge about the relevant business process and knowledge about data attributes.

True

Deep Artificial Neural Networks and Deep Learning are generally the same thing and mostly used interchangeably. (True/False)

True

Distributed Vector Representation's advantage over the bag-of-words technique lies in the capability to represent text with limited number of entries it displays, regardless of the length of the text.

True

During classification in data mining, a false positive is an occurrence classified as true by the algorithm while being false in reality.

True

If we increase the size of the training data, this will likely improve the performance of the model on new data.

True

K-fold cross-validation mitigates the bias that comes from picking one particular training and testing split.

True

Learning means finding a set of values for the weights of all layers in a network, such that the network will correctly map example inputs to their associated targets.

True

One major advantage of using Long Short-Term Memory (LSTM) model, rather than vanilla RNN, is that LSTM can deal with the exploding gradients problem.

True

One typical sign of a learning rate being too small is that the cost function may take a very long time to converge.

True

Recurrent Neural Networks handle sequence input A. True B. False

True

Sentiment analysis using Deep Learning is a many-to one prediction task.

True

The fundamental trick in deep learning is to use this score as a feedback signal to adjust the value of the weights a little, in a direction that will lower the loss score.

True

The neural networks get the optimal weights and bias values through an Error Gradient. (True/False)

True

The purpose of performing model evaluation is to judge how the trained model performs outside the sample on test data.

True

True/false question: Sentiment analysis (for example, detecting negative words in an email) is one of the common types of NLP techniques

True

Webpages can be created using HTML (HyperText Markup Language).

True

While recurrent neural networks (RNNs) can handle a sequence of arbitrary length, training RNNs is hard because of vanishing and exploding gradient problems.

True

Natural language processing is divided into the two sub fields of

Understanding and Generation

For a balanced binary dataset, suppose your model has 60% training performance and 55% testing performance, which of the following is a valid way to try and resolve this problem?

Use a more powerful model

What is the basic concept of Recurrent Neural Network? A Use previous inputs to find the next output according to the training set. B Use loops between the most important features to predict next output C Use a loop between inputs and outputs in order to achieve the better prediction. D Use recurrent features from dataset to find the best answers.

Use previous inputs to find the next output according to the training set.

Based on what we have covered about deep learning, we have learned that: A neural network is a (crude) mathematical representation of a brain, which consists of smaller components called neurons. Each neuron has an input, a processing function, and an output. These neurons are stacked together to form a network, which can be used to approximate any function. To get the best possible neural network, we can use techniques like gradient descent to update our neural network model. Given above is a description of a neural network. When does a neural network model become a deep learning model?

When you add more layers and increase depth of neural network

Which of the following statements is false? a. When creating a model, a key goal is to ensure that it is capable of making accurate predictions for data it has not yet seen. Two common problems that prevent accurate predictions are overfitting and underfitting b. Underfitting occurs when a model is too simple to make accurate predictions, based on its training data. An example of underfitting is using a linear model, such as simple linear regression, when in fact, the problem really requires a more sophisticated non-linear model c. Overfitting occurs when your model is too complex. In the most extreme case of overfitting, a model memorizes its training data d. When you make predictions with an overfit model, the model won't know what to do with new data that matches the training data, but the model will make excellent predictions with data it has never seen

When you make predictions with an overfit model, the model won't know what to do with new data that matches the training data, but the model will make excellent predictions with data it has never seen

Autoencoder

an unsupervised approach for learning a lower dimensional feature representation from unlabeled training data

Supervised learning differs from unsupervised clustering in that supervised learning requires

at least one output attribute

Classification problems are distinguished from regression problems in that

classification problems require the output attribute to be categorical

What is not a key component of data analytics? A. Data mining B. Data C. Modeling D. Business Strategies

data mining

Data intelligence/analytics is the conversion of large raw ___ into a smaller amount of more useful _____.

data; information

The Bag-of-Words approach_________

disregards word order, keeps word multiplicity
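A short sketch of both properties using `collections.Counter` from the standard library. The tokenizer (lowercasing and splitting on whitespace) is a simplifying assumption, and the two sentences are made-up examples.

```python
from collections import Counter

def bag_of_words(text):
    """Count each token; lowercasing and whitespace splitting are simplifications."""
    return Counter(text.lower().split())

a = bag_of_words("the dog bit the man")
b = bag_of_words("the man bit the dog")

same_bag = a == b        # word order is disregarded
the_count = a["the"]     # word multiplicity is kept
print(same_bag, the_count)
```

The two sentences mean different things but produce identical bags, which is the information the representation discards.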

Predictive Analytics

extracts information from data and uses it to predict future trends and identify behavioral patterns

Data analytics is best described as the process of?

identifying patterns in data

Select a non-linear model from below- A linear regression B neural network C logistic regression D support vector machine (SVM)

neural network

CNN

often used for visual data

RNN

often used for sequence

A trader who wants to predict short-term movements in stock prices is likely to use ________analytics.

predictive

information

processed data with meaning

data

raw facts, no specific meaning

Which of these applications will derive the LEAST benefit from text mining?

sales transaction files

types of data

structured: categorical/numerical; classification/regression
unstructured: textual/image or video; clustering/association

Descriptive Analytics

the use of data to understand past and current business performance and make informed decisions

Supervised learning

train the model using labeled data

unsupervised learning

train the model using unlabeled data

Data used to build a data mining model

training data

Overfitting, underfitting

training good, testing good: good
training good, testing bad: overfitting
training bad, testing bad: underfitting
training bad, testing good: unlikely

___________ refers to a model that can neither model the training data nor generalize to new data.

underfitting

