Lecture 7 - Fundamentals of Machine Learning & Intro to Keras


Three things to consider when choosing an evaluation protocol

1) Data Representativeness; 2) Arrow of Time; 3) Redundancy in the data

Two reasons why feature engineering is important.

1) Good features allow for more elegant and efficient problem-solving by removing work that the model would otherwise have to do itself. 2) Feature engineering is critical when data is limited, because deep-learning models can learn features on their own only when they have lots of training data.

Four most common ways to prevent overfitting?

1) More training data; 2) reduce capacity of the network; 3) add weight regularization; 4) add dropout

Four key aspects of Data-Preprocessing

1) Vectorization; 2) Normalization; 3) Dealing with Missing Values; 4) Feature Extraction

The purpose of Normalization is to ___

To help machine learning models learn more effectively and efficiently, particularly when the data has a wide range of values.

How is the model's capacity set?

By the number of learnable parameters in the model (the number of layers and number of units per layer).
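
As a rough illustration of how layers and units translate into learnable parameters, the sketch below counts the weights and biases in a stack of fully connected layers. The input size (10000) and the layer sizes are hypothetical values chosen for the example, not figures from the lecture.

```python
def dense_param_count(input_dim, layer_units):
    """Count weights and biases in a stack of fully connected layers."""
    total = 0
    fan_in = input_dim
    for units in layer_units:
        # Each layer has a (fan_in x units) weight matrix plus one bias per unit.
        total += fan_in * units + units
        fan_in = units
    return total


# Hypothetical networks: same depth, different width.
small = dense_param_count(10000, [4, 4, 1])
large = dense_param_count(10000, [16, 16, 1])
print(small, large)
```

The wider network has roughly four times as many parameters, so it has more capacity to memorize the training data.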

What is the current state of Reinforcement learning?

Reinforcement learning is currently mostly limited to research use cases because it has not yet proven reliable enough for widespread practical use.

Why is dealing with Missing Values important?

Dealing with missing values is an important part of data preprocessing and analysis, as missing values can affect the accuracy and validity of statistical analyses and models. NOTE - It's important to choose the appropriate method for dealing with missing values based on the characteristics of the data and the specific analysis being conducted.
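
The card does not name a specific method, so the sketch below assumes one common choice, mean imputation, where each missing entry (represented here as `None`) is replaced by the mean of the observed values. The function name and sample data are hypothetical.

```python
from statistics import mean


def impute_with_mean(column):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in column if v is not None]
    if not observed:
        raise ValueError("column has no observed values to average")
    fill = mean(observed)
    return [fill if v is None else v for v in column]


ages = [20.0, None, 40.0, None]  # hypothetical feature column
print(impute_with_mean(ages))
```

Whether mean imputation is appropriate depends on why the values are missing; other options include dropping rows or using a sentinel value.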

Define Feature Engineering

Feature engineering makes algorithms work more effectively by applying hard-coded (non-learned) transformations to the data before it goes into the model.
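
A hypothetical illustration of a hard-coded transformation: suppose the task is reading the time from a clock, and the raw input has already been reduced to the (x, y) position of the hour hand's tip relative to the clock center, with y pointing up. Converting that position to an angle gives the model a much simpler feature. The setup and function name are assumptions made for this sketch.

```python
import math


def hand_angle_degrees(x, y):
    """Angle of a clock hand, measured clockwise from 12 o'clock, in [0, 360)."""
    # atan2(x, y) measures the angle from the positive y axis (12 o'clock),
    # increasing clockwise when y points up.
    return math.degrees(math.atan2(x, y)) % 360.0


def hour_from_hand(x, y):
    """Each hour on the dial spans 30 degrees."""
    return hand_angle_degrees(x, y) / 30.0


print(hour_from_hand(1.0, 0.0))   # hand pointing right
print(hour_from_hand(0.0, -1.0))  # hand pointing down
```

With the angle as input, a simple rounding rule is enough to read the hour, so no learned model is needed for this step.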

Purpose of Feature Extraction

The purpose of feature extraction is to improve performance by transforming raw data (for example, raw image data) into more meaningful features that the model can use to make accurate predictions.

Define Generalization

Generalization is how well the trained model performs on unseen data.

Key goal of machine learning is to achieve a model that can____

Generalize. Generalization occurs when a model is capable of performing on unseen data

What is hold-out validation?

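Hold-out validation means intentionally setting aside a fraction of the data, training on the rest, and evaluating on the held-out fraction. Keeping the held-out data away from training prevents information leaks into the evaluation.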
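
A minimal sketch of a hold-out split, assuming the samples can be shuffled freely. The function name, the 20% default, and the fixed seed are choices made for this example.

```python
import random


def hold_out_split(samples, validation_fraction=0.2, seed=0):
    """Shuffle the samples and set a fraction aside for validation."""
    if not 0.0 < validation_fraction < 1.0:
        raise ValueError("validation_fraction must be between 0 and 1")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_val = max(1, int(len(shuffled) * validation_fraction))
    # The first n_val shuffled samples are held out; the rest are for training.
    return shuffled[n_val:], shuffled[:n_val]


train, val = hold_out_split(range(10), validation_fraction=0.2)
print(len(train), len(val))
```

Shuffling is not appropriate when the data has an arrow of time; in that case the validation data should come after the training data chronologically.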

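What is meant by the hyperparameters of a model?

Hyperparameters are configuration choices made before training, such as the number of layers or the size of the layers, as opposed to the parameters that the model learns during training.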

Describe hyperparameter leaks.

Information leaks. Every time you tune a hyperparameter of the model based on the model's performance on the validation set, some information about the validation data leaks into the model.

Describe K-Fold Validation

K-fold validation splits the data into K partitions (folds) of roughly equal size. For each fold, the model is trained on the remaining K-1 folds and evaluated on that fold. The final score is the average of the K scores.
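
The procedure can be sketched as index bookkeeping plus an averaging step. This is an illustrative sketch, not a library API: `train_and_score` is a hypothetical callback that would train a fresh model on the training indices and return its score on the validation indices.

```python
def k_fold_indices(num_samples, k):
    """Yield (train_indices, validation_indices) for each of the k folds."""
    if not 2 <= k <= num_samples:
        raise ValueError("k must be between 2 and num_samples")
    fold_size = num_samples // k
    for fold in range(k):
        start = fold * fold_size
        # The last fold absorbs any remainder samples.
        stop = num_samples if fold == k - 1 else start + fold_size
        val_idx = list(range(start, stop))
        train_idx = list(range(0, start)) + list(range(stop, num_samples))
        yield train_idx, val_idx


def k_fold_score(num_samples, k, train_and_score):
    """Average the validation score over all k folds."""
    scores = [train_and_score(tr, va) for tr, va in k_fold_indices(num_samples, k)]
    return sum(scores) / len(scores)


folds = list(k_fold_indices(10, 3))
print([len(va) for _, va in folds])
```

Every sample is used for validation exactly once, which is why the averaged score is usually more reliable than a single hold-out split when data is scarce.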

What is the key feature of self-supervised learning?

No humans in the loop. Models map input data to targets, given a set of examples whose labels are machine-generated from the input data rather than annotated by humans.

Define Normalization

Normalization is the process of transforming input data to ensure that it falls within a similar range or scale, typically with a mean of 0 and a standard deviation of 1.
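
A minimal sketch of feature-wise standardization for a single feature, written with the standard library. The function names are hypothetical; the statistics are computed on the training values only and then reused for any other data, which is an assumption of this sketch about how the split is handled.

```python
from statistics import mean, pstdev


def fit_standardizer(train_values):
    """Compute the mean and standard deviation from training data only."""
    mu = mean(train_values)
    sigma = pstdev(train_values)
    if sigma == 0:
        raise ValueError("feature is constant; cannot scale to unit variance")
    return mu, sigma


def standardize(values, mu, sigma):
    """Shift to mean 0 and scale to standard deviation 1."""
    return [(v - mu) / sigma for v in values]


train_values = [2.0, 4.0, 6.0, 8.0]  # hypothetical feature column
mu, sigma = fit_standardizer(train_values)
scaled = standardize(train_values, mu, sigma)
print(scaled)
```

Reusing the training mean and standard deviation on validation and test data avoids leaking information from those sets into preprocessing.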

Define Optimization

Optimization is the process of adjusting a model to get the best performance possible on the training data

Define Overfitting

Overfitting occurs when a machine learning model is trained too well on the training data and starts to fit the noise in the data rather than the underlying pattern. In other words, the model becomes too complex and starts to memorize the training data instead of learning the general patterns that can be applied to new, unseen data.

What are the parameters of a model?

Parameters are the network's weights.

What is dropout (overfitting context)?

The process of randomly setting to zero (aka dropping out) a number of output features of a layer during training. The dropout rate is the fraction of features that are zeroed out, usually set between 0.2 and 0.5.
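
A sketch of dropout applied to one layer's output at training time. It assumes the common "inverted" variant, where the surviving features are scaled up by 1 / (1 - rate) so that nothing needs to change at test time. The function name is hypothetical; in Keras the equivalent behavior is provided by the `Dropout` layer.

```python
import random


def dropout(features, rate, rng):
    """Zero each feature with probability `rate`; rescale the survivors."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    keep_scale = 1.0 / (1.0 - rate)
    return [0.0 if rng.random() < rate else x * keep_scale for x in features]


rng = random.Random(0)  # fixed seed so the example is repeatable
layer_output = [1.0] * 1000
dropped = dropout(layer_output, rate=0.5, rng=rng)
print(sum(1 for x in dropped if x == 0.0), "features zeroed out of", len(dropped))
```

Dropout is applied only during training; at test time the layer output is used unchanged.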

What is weight regularization?

Putting constraints on the complexity of a network by forcing the model's weights to take only small values, thereby making the distribution of weight values more regular.

Simplest way to prevent overfitting?

Reduce the network size by limiting the number of parameters in the model.

Define Regularization

Regularization is the process of fighting overfitting by constraining the model's parameters, shrinking the coefficient estimates towards zero. Regularization discourages learning a more complex or flexible model.

Define Reinforcement Learning

Reinforcement learning is a type of machine learning in which an agent receives information about its environment and learns to choose actions that maximize a reward.

Why are hyperparameter leaks a problem?

Repeatedly tuning hyperparameters (running one experiment, evaluating on the validation set, and modifying the model as a result) leaks an increasingly large amount of information about the validation set into the model. The model can end up overfitting to the validation set even though it is never trained on it directly.

List 3 theoretical examples of reinforcement learning?

Self-driving cars, robotics, resource management

Three ways to assess a model's accuracy (hint: data splitting)

Split the data into three sets: 1) training, 2) validation, and 3) test. Train the model on the training set, tune it using its accuracy on the validation set, and measure its final accuracy once on the test set.

What is the purpose of supervised learning?

Supervised learning models map input data to known targets, when given a set of tagged examples (often tagged by humans)

Purpose of Vectorization

The purpose of vectorization is to transform the data to process into tensors of floating point data.

Describe the purpose of weight regularization

To avoid overfitting, weight regularization adds to the loss function of the network a cost associated with having large weights.
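
A sketch of the idea, assuming an L2 penalty (the cost is proportional to the sum of the squared weights). The function names and the default strength of 0.001 are choices made for this example; in Keras a comparable penalty can be attached to a layer with `keras.regularizers.l2`.

```python
def l2_penalty(weights, strength):
    """Cost proportional to the sum of squared weight values."""
    return strength * sum(w * w for w in weights)


def regularized_loss(data_loss, weights, strength=0.001):
    """Total loss = loss on the data + cost for having large weights."""
    return data_loss + l2_penalty(weights, strength)


weights = [3.0, -4.0]  # hypothetical weight values
total = regularized_loss(0.5, weights, strength=0.01)
print(total)
```

Because large weights now increase the loss, the optimizer is pushed towards smaller weight values during training.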

What is the cause of Underfitting?

Underfitting is usually caused by a model that is too simple or not complex enough to capture the relationships between the input and output variables

Define Underfitting

Underfitting is when a model is not able to capture the underlying patterns in the data and performs poorly on both the training data and the test data.

What is the purpose of unsupervised learning?

Unsupervised learning models seek to find structure in the input data, such as correlations or groupings, without the help of any targets.

Define Vectorization

Vectorization is the process of transforming data that needs to be processed into tensors
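
As one illustration, the sketch below turns lists of integer indices (for example, word indices in a text) into fixed-size rows of floating-point values using multi-hot encoding. The function name and sample data are hypothetical, and nested lists stand in for a real tensor type.

```python
def multi_hot(sequences, dimension):
    """Encode each sequence of indices as a row of 0.0/1.0 floats."""
    rows = []
    for seq in sequences:
        row = [0.0] * dimension
        for index in seq:
            row[index] = 1.0  # mark every index that appears in the sequence
        rows.append(row)
    return rows


encoded = multi_hot([[0, 2], [1]], dimension=3)
print(encoded)
```

Every row has the same length regardless of how long the original sequence was, which is what lets the data be stacked into a single 2D tensor.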

Distinguish Hold-Out from K-Fold Validation

Hold-out validation is simpler and faster, but its estimate can have high variance, while K-fold cross-validation provides a more reliable estimate but can be computationally expensive. The choice of technique depends on the size of the dataset, the computational resources available, and the desired accuracy of the performance estimate.

Describe Feature Extraction

Feature extraction is the process of selecting and transforming raw data features to improve the performance of machine learning models. It involves creating new features, selecting relevant features, and transforming features to improve the quality of the data.

