AI Midterm


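Three Laws of Robotics

- 1st: a robot may not injure a human being or, through inaction, allow a human being to come to harm
- 2nd: a robot must obey orders given by humans, except where they conflict with the First Law
- 3rd: a robot must protect its own existence, as long as that does not conflict with the First or Second Law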

Linear Regression

- a statistical method used to fit a linear model to a given data set
- a way to teach a computer to make predictions from past data by finding the best-fitting straight line
- example question: what will the temperature be tomorrow? (the answer is a number; "will it be hot or cold tomorrow?" asks for a category, which is classification)
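As an illustration of "finding the best-fitting straight line", here is a minimal least-squares sketch in plain Python. The function name `fit_line` and the day/temperature numbers are made up for this example and are not part of the original card.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept (one input feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (both left unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical past data: day number vs. temperature.
days = [1, 2, 3, 4, 5]
temps = [20.0, 22.0, 24.0, 26.0, 28.0]

slope, intercept = fit_line(days, temps)
prediction = slope * 6 + intercept  # predicted temperature for day 6
```

The fitted line is then used to answer the "what is the temperature tomorrow?" question by plugging in the next day number.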

confusion matrix

- a table of actual class vs. predicted class
- n x n, where n is the number of classes

ANI (Artificial Narrow Intelligence) examples

- performs one specialized task or a limited set of specialized tasks
- examples: smart speaker, self-driving car, web search

Cross-Entropy Error

- helps us understand how well our guesses match what is actually true
- measures the performance of a classification model whose output is a probability value between 0 and 1
- the goal is to minimize the difference between the predicted probability and the true label: the better the predictions, the lower the cross-entropy error
- if the model predicts with high confidence but is wrong, the error is large
- if it predicts correctly with high confidence, the error is small
- formula: E = −(y⋅log(yhat) + (1−y)⋅log(1−yhat))
- y = true label (0 or 1)
- yhat = predicted probability that the label is 1
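The formula can be checked with a short sketch. It assumes the natural logarithm and clips yhat away from exactly 0 and 1 to avoid log(0); the name `binary_cross_entropy` and the `eps` value are choices made for this example.

```python
import math

def binary_cross_entropy(y, y_hat, eps=1e-12):
    """Cross-entropy error for one example: y is 0 or 1, y_hat is P(label = 1)."""
    # Clip the probability so that log() never receives 0 (an assumption of this sketch).
    y_hat = min(max(y_hat, eps), 1 - eps)
    return -(y * math.log(y_hat) + (1 - y) * math.log(1 - y_hat))

confident_right = binary_cross_entropy(1, 0.95)  # true label 1, predicted 95%
confident_wrong = binary_cross_entropy(1, 0.05)  # true label 1, predicted 5%
```

A confident correct prediction gives an error close to 0, while a confident wrong prediction gives a much larger error.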

What is AI

- the intelligence of machines or software
- the science of making machines think like humans
- making machines do things that are considered "smart"

Transistor

- microscopic semiconductor devices, typically made of silicon
- act as tiny switches or amplifiers for electrical signals
- used in digital logic circuits, memory chips, and microprocessors

True negative

- predicted negative, and the actual class is negative (correct prediction)

False negative

- predicted negative, but the actual class is positive
- Type II error

True positive

- predicted positive, and the actual class is positive (correct prediction)

False positive

- predicted positive, but the actual class is negative
- Type I error

SVM Margin

- the distance between the hyperplane and the nearest data points (the support vectors) from both classes. - SVM aims to maximize this margin, meaning it wants to make the gap between the classes as wide as possible. A larger margin means better separation and potentially more accurate predictions.
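To make the margin concrete, the sketch below measures the distance from each point to a given hyperplane w⋅x + b = 0 and takes the smallest one. The hyperplane and the four points are invented for illustration, and the hyperplane is simply assumed rather than learned.

```python
import math

def distance_to_hyperplane(w, b, x):
    """Distance from point x to the hyperplane w . x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return abs(sum(wi * xi for wi, xi in zip(w, x)) + b) / norm

def margin(w, b, points):
    """Distance from the hyperplane to its nearest point(s)."""
    return min(distance_to_hyperplane(w, b, x) for x in points)

# Hypothetical hyperplane: the vertical line x1 = 3 in a 2-D feature space.
w, b = [1.0, 0.0], -3.0
points = [[1.0, 2.0], [2.0, 0.0], [4.0, 1.0], [6.0, 3.0]]

m = margin(w, b, points)
```

Here the points at distance `m` (the ones with x1 = 2 and x1 = 4) play the role of the support vectors.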

Gradient Descent

- goal: find the values of the weights w that minimize the error E(w)
1. Start with initial values for the parameters (weights) of the model, often chosen randomly.
2. Calculate the gradient of the cost function with respect to the parameters. This tells you how much the cost function changes as you change each parameter.
3. Adjust the parameters in the opposite direction of the gradient: new value = current value − step size × slope
4. Repeat steps 2 and 3 until the error stops decreasing.
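The update rule "new value = current value − step size × slope" can be sketched for a single weight. The error function E(w) = (w − 3)^2, the step size, and the number of steps are assumptions chosen so the result is easy to check by hand.

```python
def gradient_descent(grad, w0, step_size=0.1, steps=100):
    """Repeatedly move w against the gradient of the error."""
    w = w0
    for _ in range(steps):
        w = w - step_size * grad(w)  # new value = current value - step size * slope
    return w

# Hypothetical error function E(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w_min = gradient_descent(lambda w: 2 * (w - 3), w0=0.0)
```

With this step size, each step shrinks the distance to the minimum at w = 3 by a constant factor, so `w_min` ends up very close to 3.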

How to find misclassification formula?

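1) start the input row with the number 1 (the input that pairs with the bias weight w0)
2) write the input values (x1, x2, ...) after it
3) line up the matching weights (w0, w1, w2, ...) underneath
4) multiply each input by the weight under it
5) add the products together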

Perceptron Learning Algorithm

1. Select a random sample from the training set as input
2. If the classification is correct, do nothing
3. If the classification is incorrect, modify the weights
y = sign(W⋅X + b)
- X: input (features of the fruit, e.g., color, shape)
- W: weights (how important each feature is)
- b: bias (shifts the decision boundary to get better predictions)
- y: the output (the guess: is it an apple or a banana?)
- sign(): a function that gives +1 (if it thinks apple) or −1 (if it thinks banana)
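The steps above can be sketched as follows. Two things are assumptions of this sketch rather than statements from the card: the update used when a sample is misclassified is the common rule w ← w + y⋅x and b ← b + y, and samples are visited in order instead of at random so that the run is repeatable. The toy data is invented.

```python
def sign(z):
    return 1 if z >= 0 else -1

def predict(w, b, x):
    """y = sign(W . X + b)"""
    return sign(sum(wi * xi for wi, xi in zip(w, x)) + b)

def train_perceptron(samples, labels, epochs=100):
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in zip(samples, labels):
            if predict(w, b, x) != y:
                # Misclassified: nudge the weights toward the correct label.
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
                mistakes += 1
        if mistakes == 0:  # every sample classified correctly, so stop
            break
    return w, b

# Hypothetical, linearly separable data: +1 = apple, -1 = banana.
samples = [[2.0, 1.0], [3.0, 2.0], [-1.0, -1.0], [-2.0, -3.0]]
labels = [1, 1, -1, -1]

w, b = train_perceptron(samples, labels)
```

On data that is not linearly separable this loop never reaches zero mistakes and simply stops after `epochs` passes.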

ML is separated into what 3 categories?

1. supervised learning 2. unsupervised learning 3. reinforcement learning

AI vs Machine Learning vs Deep Learning

AI: the ability of a machine to imitate intelligent human behavior
ML: an application of AI that allows a system to automatically learn and improve from experience
Deep Learning: an application of ML that uses complex algorithms and deep neural nets to train a model

Linear Classification vs Linear Regression

Classification: predicts which category an observation belongs to
Regression: predicts continuous values along a straight line

Alan Turing's 1950 paper is called ___

Computing Machinery and Intelligence (Alan Turing)

train/test split

Dividing data into training and testing sets for model evaluation
- the test set checks generalization: how well the model predicts data it has not seen
- if not used, overfitting can go unnoticed (overfitting = the model gives accurate predictions for training data but not for new data)

K-fold cross-validation

The data is split into k folds. In each round, one fold is used for testing and the remaining folds are used for practice (training). This way, every data point gets to be in the test set exactly once!
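A minimal sketch of how the rounds can be generated, working on sample indices only. The helper name `k_fold_indices` is hypothetical, and the sketch does not shuffle the data first, which is usually done in practice.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of the k rounds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first few folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

folds = list(k_fold_indices(10, 5))  # 10 samples, 5 folds of 2 samples each
```

Each round trains on the `train` indices and evaluates on the `test` indices; the k scores are typically averaged.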

MLE (Maximum Likelihood Estimation)

MLE is used to estimate the coefficients that best explain the data.

Support Vectors

Support vectors are the closest data points to the hyperplane. These points are critical in determining the hyperplane and the margin in SVM. (For a hard margin, the support vectors are the data samples on the margin lines; for a soft margin, the support vectors are the data samples on and inside the margin lines.)

C (SVM parameter)

The C parameter in SVM is a regularization term that balances margin maximization against the penalty for misclassifications. A higher C value imposes a stricter penalty for margin violations, leading to a smaller margin but fewer misclassifications of the training data.
- the C parameter decides how strict or flexible the model should be when classifying data

F1 score

The F1-score gives equal importance to both precision and recall. If either precision or recall is low, the F1-score will also be low.
F1 = 2 × (Precision × Recall) / (Precision + Recall)

Max likelihood

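The goal of MLE is to find the parameter values that make the observed data most likely.
For a yes/no outcome with success probability p: successes × (1 − p) = failures × p, so p = successes / (successes + failures)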
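For a yes/no outcome, the condition success × (1 − p) = fail × p rearranges to p = success / (success + fail). The sketch below compares that closed form with a brute-force search over candidate values of p; the counts (7 successes, 3 failures) and the grid of candidates are invented for illustration.

```python
import math

def log_likelihood(p, successes, failures):
    """Log of p^successes * (1 - p)^failures."""
    return successes * math.log(p) + failures * math.log(1 - p)

def mle_closed_form(successes, failures):
    return successes / (successes + failures)

successes, failures = 7, 3

# Try p = 0.01, 0.02, ..., 0.99 and keep the value that makes the data most likely.
candidates = [i / 100 for i in range(1, 100)]
best_p = max(candidates, key=lambda p: log_likelihood(p, successes, failures))
```

Both approaches pick the same value of p, which is the observed fraction of successes.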

hyperplane

The hyperplane is the decision boundary used to separate data points of different classes in a feature space. In geometry, a hyperplane is a subspace whose dimension is one less than that of its ambient space (a line in 2-D, a plane in 3-D).

kernel

The kernel is a mathematical function used in SVM to group data samples by measuring the similarity between pairs of data samples. This allows the SVM to find a separating hyperplane in cases where the data points are not linearly separable in the original feature space. Common kernel functions include linear, polynomial, radial basis function (RBF), and sigmoid.
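Two of the kernels named above can be written directly as similarity functions on a pair of samples. The parameter names `gamma`, `degree`, and `coef0` and their default values are assumptions of this sketch, not values from the card.

```python
import math

def rbf_kernel(x, z, gamma=1.0):
    """RBF kernel: exp(-gamma * squared distance); 1 for identical samples."""
    sq_dist = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-gamma * sq_dist)

def polynomial_kernel(x, z, degree=2, coef0=1.0):
    """Polynomial kernel: (x . z + coef0) ** degree."""
    dot = sum(xi * zi for xi, zi in zip(x, z))
    return (dot + coef0) ** degree

same = rbf_kernel([1.0, 2.0], [1.0, 2.0])  # identical samples
near = rbf_kernel([0.0, 0.0], [1.0, 0.0])  # one unit apart
far = rbf_kernel([0.0, 0.0], [3.0, 0.0])   # three units apart
```

The RBF similarity shrinks toward 0 as two samples move apart, which is what lets the SVM treat nearby points as alike.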

Perceptron Formula

X = input vector, W = weight vector, w0 = bias weight (1 in this example)
1.) yhat = g(w0 + X^T⋅W)
2.) yhat = g(1 ± (x1⋅w1) ± (x2⋅w2)), filling in each product with its sign
3.) yhat = g(sum)

Alan Turing is known for developing ________. (1950)

- the Turing test: a test of a machine's ability to display intelligent behavior
- he also showed that a simple machine (the Turing machine) could solve any problem, provided it could be described algorithmically

Linear Classification

a way for computers to learn how to sort things into two (or more) groups by drawing a line (or hyperplane) between them.

Soft Margin

- allows for some misclassification of training data points (room for error, not perfect)
- particularly useful when the data contains outliers or is not perfectly separable by a hyperplane, meaning there is some overlap or noise in the data
- introduces a slack variable for each data point to allow some misclassifications while balancing between maximizing the margin and minimizing violations
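One common way to write this trade-off is the objective 0.5 × ||w||^2 + C × (sum of slacks), where the slack of a point is max(0, 1 − y × (w⋅x + b)). The sketch below only evaluates that objective for a fixed, made-up hyperplane; it does not train an SVM, and the data and names are invented for illustration.

```python
def soft_margin_objective(w, b, samples, labels, C=1.0):
    """Return (objective, slacks) for labels in {+1, -1}."""
    margin_term = 0.5 * sum(wi * wi for wi in w)  # small ||w|| means a wide margin
    slacks = []
    for x, y in zip(samples, labels):
        score = sum(wi * xi for wi, xi in zip(w, x)) + b
        # Slack is 0 outside the margin, positive inside it or on the wrong side.
        slacks.append(max(0.0, 1.0 - y * score))
    return margin_term + C * sum(slacks), slacks

w, b = [1.0, 0.0], 0.0
samples = [[2.0, 0.0], [0.5, 0.0], [-2.0, 0.0], [0.5, 0.0]]
labels = [1, 1, -1, -1]

loose, slacks = soft_margin_objective(w, b, samples, labels, C=1.0)
strict, _ = soft_margin_objective(w, b, samples, labels, C=10.0)
```

The second point sits inside the margin (slack 0.5) and the fourth is misclassified (slack 1.5); raising C makes those same violations cost more.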

Frank Rosenblatt is known for

The Perceptron (introduced in 1958): the basic structure of a neural network

hold-out sampling

divide the full data into training, validation, and test sets

Maximal Margin Classifier (hard margin)

- finds the maximum-margin hyperplane: the hyperplane that perfectly separates the data points of the two classes, without any misclassifications, and with the largest possible margin
- cannot be solved when the data is non-separable
- sensitive to outliers

Support Vector Machines (SVM)

- helps find the best line (or hyperplane) that divides the two groups (classes) so that you can easily identify which item belongs to which class
- used for both classification and regression tasks
1. SVM takes the training data and finds the best hyperplane that separates the classes. It does this by optimizing the margin between the support vectors and the hyperplane.
2. The goal is to solve a mathematical optimization problem to maximize the margin. This involves finding the weights for the hyperplane that minimize classification errors while keeping the margin as large as possible.
3. Once the SVM is trained, you can use it to classify new data points. You check which side of the hyperplane the new point falls on to determine its class.

Mean Squared Error (MSE)

measures how close our predictions are to the actual values
formula:
1. subtract: actual − predicted (yhat)
2. square each difference
3. add them all together
4. divide by the sample size
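The four steps map directly onto a few lines of code. The three actual/predicted values are invented so the arithmetic is easy to follow.

```python
def mean_squared_error(actual, predicted):
    # Steps 1 and 2: subtract actual - predicted, then square each difference.
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    # Steps 3 and 4: add them together and divide by the sample size.
    return sum(squared_errors) / len(squared_errors)

# Differences are 1, 0, and -2, so the squared errors are 1, 0, and 4.
mse = mean_squared_error([3.0, 5.0, 7.0], [2.0, 5.0, 9.0])
```

Because the differences are squared, one large miss (here the last prediction) dominates the result.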

MNIST Database

large database of handwritten digits that is commonly used for training various image processing systems

Reinforcement Learning

- learns from mistakes through trial and error
- uses positive and negative rewards
- (input, some output, grade for this output)

likelihood

measures how probable the observed data is given a particular set of parameters in a model.

parameters vs hyperparameters

parameters: what the model learns during training
hyperparameters: the settings you choose before training starts

Pocket Algorithm

pockets the best set of weights it has seen so far, even if it hasn't found a perfect solution.
1.) Go through the data points. If the guess is wrong, update the weights as in the Perceptron algorithm.
2.) After each update, compare the new solution with the one in the pocket. If the new weights give fewer errors, save them in the pocket.
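A minimal sketch of the two steps, assuming the usual perceptron update w ← w + y⋅x and b ← b + y (the card does not spell the update out). The one-feature data set is invented and deliberately not separable: the last point is labeled −1 although it sits among the +1 points.

```python
def sign(z):
    return 1 if z >= 0 else -1

def predict(w, b, x):
    return sign(sum(wi * xi for wi, xi in zip(w, x)) + b)

def count_errors(w, b, samples, labels):
    return sum(1 for x, y in zip(samples, labels) if predict(w, b, x) != y)

def pocket_algorithm(samples, labels, epochs=20):
    w, b = [0.0] * len(samples[0]), 0.0
    pocket_w, pocket_b = list(w), b
    pocket_errors = count_errors(w, b, samples, labels)
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            if predict(w, b, x) != y:
                # Step 1: perceptron-style update on a wrong guess.
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
                # Step 2: keep the new weights only if they make fewer errors.
                errors = count_errors(w, b, samples, labels)
                if errors < pocket_errors:
                    pocket_w, pocket_b, pocket_errors = list(w), b, errors
    return pocket_w, pocket_b, pocket_errors

samples = [[1.0], [2.0], [3.0], [-1.0], [-2.0], [2.5]]
labels = [1, 1, 1, -1, -1, -1]

pocket_w, pocket_b, pocket_errors = pocket_algorithm(samples, labels)
```

The working weights keep changing because the data cannot be separated, but the pocket holds on to the best weights found, which misclassify only the noisy point.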

kernel function

- quantifies the similarity between data samples by summarizing the relationship between each pair of samples in the data set
- used to separate points that a straight line cannot separate in the original plane

hyperparameter

a setting that is fixed to a chosen value before training, then tuned by testing different values

Supervised Learning

the data is already labeled

Moore's Law

the number of transistors on a microchip doubles about every two years, while the cost of computers is halved

Recall/Sensitivity

the proportion of actual positive cases that are correctly identified
TP / (TP + FN)

Precision

the proportion of predicted positive cases that are actually positive
TP / (TP + FP)

What is ACC (accuracy) and its formula from the confusion matrix?

the proportion of correct predictions out of all predictions
(TN + TP) / (TN + TP + FN + FP)
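The confusion-matrix counts and the metrics built from them (accuracy, precision, recall, F1) can be computed together. This sketch assumes binary labels with 1 = positive and 0 = negative, uses made-up predictions, and does not guard against division by zero when a denominator is empty.

```python
def confusion_counts(actual, predicted):
    """Count TP, FP, TN, FN for binary labels (1 = positive, 0 = negative)."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == 1 and a == 1:
            tp += 1
        elif p == 1 and a == 0:
            fp += 1  # Type I error
        elif p == 0 and a == 0:
            tn += 1
        else:
            fn += 1  # Type II error
    return tp, fp, tn, fn

def metrics(tp, fp, tn, fn):
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    f1 = 2 * (precision * recall) / (precision + recall)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

tp, fp, tn, fn = confusion_counts(actual, predicted)
scores = metrics(tp, fp, tn, fn)
```

In this example there are 3 true positives, 1 false positive, 5 true negatives, and 1 false negative.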

cross-validation

- repeatedly dividing the data into training and testing sets and adjusting the model
- in SVM, used to determine how many misclassifications and observations to allow inside the soft margin to get the best classification

AGI (Artificial General Intelligence)

type of AI that can learn, understand, and perform any intellectual task a human can do.

Unsupervised Learning

use unlabeled data to discover patterns - (input, ?)

