big data test 3

Support vector machine

Covers three related classifiers: 1) the maximal margin classifier, 2) the support vector classifier, and 3) the support vector machine

CRISP-DM

1. business understanding 2. data understanding 3. data preparation 4. model building 5. testing and evaluation 6. deployment

unsupervised learning

An algorithm explores input data without being given an explicit output variable. The algorithm identifies groups of data that exhibit similar behavior.

supervised learning

An algorithm uses training data and feedback from humans to learn the relationship of given inputs to a given output. The model is trained on the data to find the connection between the input variables and the output.

K-Means

An approach for partitioning a data set into K distinct, non-overlapping clusters. It is a simple iterative method that partitions a given dataset into a user-specified number of clusters, k, creating k groups from a set of objects so that the members of a group are more similar to one another than to members of other groups. It is a popular cluster analysis technique for exploring a dataset. A good clustering is one for which the within-cluster variation is as small as possible.
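
The iterative partitioning described above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the function names, the sample points, and the choice to start from the first k points (made only so the run is deterministic) are assumptions for this example.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iters=100):
    """Alternate assignment and update steps until assignments stop changing."""
    centroids = list(points[:k])  # assumption: naive start from the first k points
    labels = None
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids

def within_cluster_variation(points, labels, centroids):
    """Sum of squared distances from each point to its assigned centroid."""
    return sum(squared_distance(p, centroids[lab]) for p, lab in zip(points, labels))

# Made-up sample data: two tight groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
labels, centroids = kmeans(points, k=2)
wcv = within_cluster_variation(points, labels, centroids)
```

In practice the starting centroids are usually chosen at random and the algorithm is run several times, keeping the result with the smallest within-cluster variation.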

most common standard processes

CRISP-DM (Cross-Industry Standard Process for Data Mining), SEMMA (Sample, Explore, Modify, Model, and Assess), and KDD (Knowledge Discovery in Databases)

CRISP-DM stands for

Cross-Industry Standard Process for Data Mining

Refers to knowing what is happening in an organization and understanding the underlying trends and causes of such occurrences.

Descriptive Analytics

Artificial Neural Networks (ANN)

Each neuron: 1) calculates a weighted sum of the incoming values, 2) transforms this sum using the activation function, and 3) passes the resulting value on to the subsequent neuron(s)
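
The three steps can be sketched as a single function. This is a minimal illustration under stated assumptions: the sigmoid is picked as the activation function, the optional `bias` term is not mentioned in the card, and the input values and weights are made up.

```python
import math

def sigmoid(z):
    """Squashes any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron_output(inputs, weights, bias=0.0, activation=sigmoid):
    # Step 1: weighted sum of the incoming values (bias is an added assumption).
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    # Step 2: transform the sum with the activation function.
    # Step 3: the returned value is what gets passed to the next neuron(s).
    return activation(z)

# Made-up inputs and weights: 0.5*1.0 + (-0.25)*2.0 = 0.0
out = neuron_output([1.0, 2.0], [0.5, -0.25])
```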

Clustering is supervised learning?

False

Given a set of news articles found on the web, group them into sets of articles about the same story. This is a classification task.

False

Clustering segments data into groups that are previously defined.

False, NOT previously defined

K-means is used for classification?

False, clustering

K-Means

Hierarchical clustering, Principal component analysis (PCA), Singular value decomposition (SVD), Time series clustering

Lift

If lift (milk -> bread) is greater than one, it implies that the two items are found together more often than one would expect by chance. A large lift value is therefore a strong indicator that a rule is important, and reflects a true connection between the items.

Network information processing

Inputs and outputs; connection weights: the relative strength or importance of each input to a processing element; summation function: computes the weighted sum of all the input elements entering each processing element

Maximum margin classifier

The margin is the smallest perpendicular distance from any training point to the separating line. Find the line that maximizes the margin. The classification of a point depends on which side of the line it falls on.
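
To make the definition concrete, the sketch below computes the margin of a given line and compares two candidate lines. It does not search for the maximum margin line itself; the function names, the points, and the two candidate lines are assumptions chosen for illustration.

```python
import math

def signed_distance(point, w, b):
    """Signed perpendicular distance from a 2-D point to the line w.x + b = 0."""
    return (w[0] * point[0] + w[1] * point[1] + b) / math.hypot(w[0], w[1])

def margin(points, w, b):
    """Smallest perpendicular distance from any point to the line."""
    return min(abs(signed_distance(p, w, b)) for p in points)

def classify(point, w, b):
    """The class depends only on which side of the line the point falls on."""
    return 1 if signed_distance(point, w, b) >= 0 else -1

# Made-up, linearly separable points.
points = [(1, 1), (2, 1), (4, 4), (5, 4)]
line_a = ((1.0, 1.0), -5.5)  # the line x + y = 5.5
line_b = ((1.0, 1.0), -4.0)  # the line x + y = 4.0

margin_a = margin(points, *line_a)
margin_b = margin(points, *line_b)
better = "line_a" if margin_a > margin_b else "line_b"
```

Both lines separate the two groups, but the one with the larger margin is preferred.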

Apriori

Rules of the form: condition -> result. E.g. [peanut butter, jelly] -> [bread]. This association rule states that if peanut butter and jelly are purchased together, then bread is also likely to be purchased.

( ) refers to how frequently the rule occurs.

Support

Association rule mining uses three metrics (________), (_________), and (________).

Support; Confidence; Lift

maximal margin classifier

Suppose that the two classes are "linearly separable", i.e. one can draw a straight line such that all points on one side belong to the first class and all points on the other side belong to the second class. A natural approach is to find the straight line that gives the biggest separation between the classes, i.e. the line from which the points are as far as possible.

true negative rate (specificity)

TN / (TN + FP)

Accuracy

(TP + TN) / (TP + TN + FP + FN)

True positive rate (sensitivity)

TP / (TP + FN)

Precision

TP / (TP + FP)
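
The four ratios above (specificity, accuracy, sensitivity, precision) can be computed together from the confusion-matrix counts. A minimal sketch; the function name and the counts are made up for illustration.

```python
def classification_metrics(tp, tn, fp, fn):
    """Evaluation ratios computed from the four confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),  # true positive rate, also called recall
        "specificity": tn / (tn + fp),  # true negative rate
        "precision": tp / (tp + fp),
    }

# Made-up counts for 100 test cases.
m = classification_metrics(tp=40, tn=45, fp=5, fn=10)
```

Note the parentheses around each denominator: without them, TP/TP+FN would be read as (TP/TP) + FN.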

Three examples of the activation function

Threshold function, Sigmoid function, Rectifier function
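
The three functions can be written out as a short sketch. These are the common textbook forms; the cutoff parameter `theta` and its default of 0 are assumptions for this example.

```python
import math

def threshold(z, theta=0.0):
    """Outputs 1 when the input reaches the cutoff theta, otherwise 0."""
    return 1.0 if z >= theta else 0.0

def sigmoid(z):
    """Smooth S-shaped curve with outputs between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def rectifier(z):
    """Passes positive inputs through unchanged and clips negatives to 0."""
    return max(0.0, z)

values = [-2.0, 0.0, 2.0]
table = [(z, threshold(z), sigmoid(z), rectifier(z)) for z in values]
```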

Given a dataset of patients diagnosed as either having diabetes or not, learn to classify new patients as having diabetes or not. This is an example of a problem you would address using a supervised learning algorithm.

True

Prediction

Understanding the possibility of future values based on past patterns.

A telecommunications company wants to segment its customers into distinct groups in order to send appropriate subscription offers. This is an example of ________.

Unsupervised Learning

The problem of finding hidden structure in unlabeled data is called:

Unsupervised Learning

hidden layer

a layer of neurons that takes input from the previous layer and converts those inputs into outputs for further processing.

Predictive analytics

aims to determine what is likely to happen in the future. It is based on statistical techniques as well as other more recently developed techniques that fall under the categories of data mining or machine learning. What will happen? Why will it happen?

support vector machine (SVM)

an extension of the support vector classifier that results from enlarging the feature space in a specific way, using kernels. Kernel functions: linear, polynomial, sigmoid, etc.

logistic regression

The appropriate regression analysis to conduct when the dependent variable is dichotomous (binary). It models the probability of an event occurring depending on the values of the independent variables, which can be categorical or numerical. Rather than modeling the response Y directly, logistic regression models the probability that Y belongs to a particular category.
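
A minimal sketch of the prediction side of this idea, assuming the coefficients have already been fitted. The intercept, the coefficient, the 0.5 cutoff, and the input values are all made-up numbers for illustration.

```python
import math

def predict_proba(x, intercept, coefs):
    """Probability that Y = 1, obtained by passing a linear score through the logistic function."""
    z = intercept + sum(c * xi for c, xi in zip(coefs, x))
    return 1.0 / (1.0 + math.exp(-z))

def predict_class(x, intercept, coefs, cutoff=0.5):
    """Turns the modeled probability into a binary category."""
    return 1 if predict_proba(x, intercept, coefs) >= cutoff else 0

# Hypothetical fitted values, not taken from any real dataset.
intercept, coefs = -4.0, [0.5]
p_low = predict_proba([4.0], intercept, coefs)    # linear score is -2
p_mid = predict_proba([8.0], intercept, coefs)    # linear score is 0
p_high = predict_proba([12.0], intercept, coefs)  # linear score is +2
```

The output is always a probability between 0 and 1, which is why the model suits a binary dependent variable better than a straight line would.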

processing element (PE)

artificial neuron, receiving inputs, processing them, and delivering a single output

The phases of CRISP-DM are ( ), ( ), ( ), Model building, Testing and evaluation, and Deployment.

business understanding, data understanding, data preparation

ANN

can have one or more layers of neurons. These neurons can be fully connected, or only certain layers can be connected

two major types of prediction

classification, regression

two most common DNNs

convolutional neural networks (CNN), recurrent neural networks (RNN)

deep neural network (dnn)

differs from traditional machine learning techniques in that deep learning techniques can automatically learn representations from data such as images, video, or text, without hand-coded rules or human domain knowledge.

linear regression

establishes a relationship between a dependent variable (Y) and one or more independent variables (X) using a best-fit straight line (also known as the regression line)

negative predicted class positive true class

false negative count

negative true class positive predicted class

false positive count

Activation function (Transfer function)

in an artificial neural network, the activation function of a neuron defines the output of that neuron given a set of inputs; defines how to pass the value from inputs through the neuron and make the output

lift values > 1.0

indicate that transactions containing the condition tend to contain the result more often than transactions that do not contain the condition

Network architecture

input, hidden and output layers

how to obtain best fit line?

least squares method
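
For a single independent variable, the least squares line has a closed-form solution. The sketch below uses the standard formulas for slope and intercept; the function names and the sample data are assumptions for illustration.

```python
def fit_line(xs, ys):
    """Slope and intercept of the line that minimizes the sum of squared vertical deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def sum_squared_residuals(xs, ys, slope, intercept):
    """Total squared vertical distance between the data points and the line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

# Made-up data lying exactly on y = 2x + 1.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
slope, intercept = fit_line(xs, ys)
ssr = sum_squared_residuals(xs, ys, slope, intercept)
```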

kernel functions

linear, polynomial, sigmoid, etc.
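
The three named kernels can be sketched as plain functions of two vectors. The forms below are the commonly used ones; the parameter names `degree`, `coef0`, and `gamma` and their default values are assumptions for this example.

```python
import math

def linear_kernel(x, y):
    """Plain dot product of the two vectors."""
    return sum(a * b for a, b in zip(x, y))

def polynomial_kernel(x, y, degree=2, coef0=1.0):
    """Dot product shifted by coef0 and raised to a power."""
    return (linear_kernel(x, y) + coef0) ** degree

def sigmoid_kernel(x, y, gamma=1.0, coef0=0.0):
    """Hyperbolic tangent of a scaled and shifted dot product."""
    return math.tanh(gamma * linear_kernel(x, y) + coef0)

# Made-up vectors.
x, y = (1.0, 0.0), (0.5, 2.0)
k_lin = linear_kernel(x, y)
k_poly = polynomial_kernel(x, y)
k_sig = sigmoid_kernel(x, y)
```

Each kernel returns a similarity score for a pair of points; swapping the kernel changes the shape of the decision boundary the SVM can draw.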

Confidence

a measurement of the rule's predictive power; it tells us the proportion of transactions in which the presence of item (or item-set) X results in the presence of item (or item-set) Y.

A (_______) is the relative importance of each input to a processing element in a neural network.

connection weight

classification

predicts categorical variables

regression

predicts continuous variables

The goal of (_______) analytics is to provide a decision or a recommendation for a specific action.

prescriptive

Descriptive analytics

refers to knowing what is happening in an organization and understanding the underlying trends and causes of such occurrences. What happened? What is happening?

Assume you want to perform supervised learning to predict the number of newborns from the size of the stork population. This is an example of _______

regression

The goal of which task is to predict continuous variables?

regression

Prescriptive analytics

seeks to make decisions that achieve the best performance possible; provides recommendations on what to do to achieve goals. What should I do? Why should I do it?

Deep learning

subset of machine learning that uses multi-layered artificial neural networks (also known as deep neural networks) to deliver state-of-the-art accuracy in tasks such as object detection, speech recognition, and language translation. Refers to a neural network with more than one hidden layer. It gets its name from the deep layers associated with the networks; typically there are many hidden layers.

classification

supervised predictive model that segments data by assigning them to groups that are already defined; examines already classified data and develops a predictive pattern (rule)

data mining

the intersection of machine learning, statistics, database systems, and many other disciplines; a process that uses statistical, mathematical, and artificial intelligence techniques to extract and identify useful information and knowledge (or patterns) from large data sets.

Support

the percentage of baskets where the rule was true; how frequently the rule occurs

Recall

TP / (TP + FN)

Data mining focuses on discovering useful information.

true

Data mining is a cross-disciplinary field

true

Given email labeled as spam/non-spam, building a spam filter model is a classification task.

true

Regression works by minimizing the vertical deviation between each data point in the dataset and the regression model.

true

negative predicted class negative true class

true negative count

positive true class positive predicted class

true positive count
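
The four counts (true positive, true negative, false positive, false negative) can be tallied by comparing each predicted class with the true class. A minimal sketch; the function name, the use of 1 as the positive label, and the sample labels are assumptions for illustration.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Tally TP, TN, FP, and FN by comparing predicted and true classes pairwise."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            counts["TP"] += 1  # positive predicted class, positive true class
        elif pred == positive:
            counts["FP"] += 1  # positive predicted class, negative true class
        elif truth == positive:
            counts["FN"] += 1  # negative predicted class, positive true class
        else:
            counts["TN"] += 1  # negative predicted class, negative true class
    return counts

# Made-up labels for eight cases.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]
counts = confusion_counts(y_true, y_pred)
```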

clustering

unsupervised learning method to segment data into groups that are NOT previously defined

clustering

unsupervised learning technique that attempts to create partitions in the data according to some distance metric. Divides data into different groups; finds groups that are different from each other AND whose members are similar

convolutional neural network (CNN)

used for image classification

Recurrent neural networks (RNN)

used for natural language processing and for sequential data

Lift

indicates whether the result product is more likely to be present when the condition product is present than when it is not. Generally looking for: support as high as possible, confidence close to 1.0, and lift higher than 1.0

Confidence

Confidence(X -> Y) = Support(X, Y) / Support(X)

Lift

Lift(X -> Y) = Support(X, Y) / (Support(X) × Support(Y))

support

Support(X) = Count(X) / N, where N is the number of transactions and Count(X) is the number of transactions containing item-set X.
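
The three association rule metrics can be computed directly from a list of transactions. A minimal sketch following the formulas on these cards; the function names and the five sample baskets are made up for illustration.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in the item-set."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= set(t)) / len(transactions)

def confidence(transactions, x, y):
    """Support(X, Y) / Support(X)."""
    return support(transactions, set(x) | set(y)) / support(transactions, x)

def lift(transactions, x, y):
    """Support(X, Y) / (Support(X) * Support(Y))."""
    joint = support(transactions, set(x) | set(y))
    return joint / (support(transactions, x) * support(transactions, y))

# Made-up baskets.
transactions = [
    {"milk", "bread"},
    {"milk", "bread", "eggs"},
    {"milk"},
    {"bread"},
    {"eggs"},
]
sup_rule = support(transactions, {"milk", "bread"})
conf_rule = confidence(transactions, {"milk"}, {"bread"})
lift_rule = lift(transactions, {"milk"}, {"bread"})
```

Here the lift for milk -> bread comes out above 1.0, meaning the two items appear together more often than their individual frequencies alone would suggest.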

