BCOR 2205 Final Exam Information and Analytics Management

Round 1 %

32%

Round 2 %

64%

LogLoss

A measure of accuracy; Rather than evaluating the model directly on whether it assigns cases (rows) to the correct label, LogLoss evaluates the probabilities the model generates and how far they are from the correct answer; Lower scores are BETTER
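
The idea can be sketched in a few lines of Python. This is a minimal illustration of binary log loss, not the code any particular tool uses; the function name, the clipping constant `eps`, and the sample probabilities are all made up for the example.

```python
import math

def log_loss(y_true, y_prob, eps=1e-15):
    """Average binary log loss over all rows; lower is better."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        # Clip the probability so log(0) can never occur (eps is an assumption).
        p = min(max(p, eps), 1 - eps)
        # Penalty grows as the predicted probability moves away from the true label.
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

labels = [1, 0, 1, 0]                  # true classes (made-up data)
confident = [0.9, 0.1, 0.8, 0.2]       # probabilities close to the truth
hesitant = [0.6, 0.4, 0.55, 0.45]      # probabilities barely on the right side

print(log_loss(labels, confident))
print(log_loss(labels, hesitant))
```

Both sets of predictions would assign every row to the correct label at a 0.5 threshold, yet the confident set earns the lower (better) score because its probabilities sit closer to the correct answers.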

Algorithm

A step-by-step procedure for solving a problem or achieving a specific result. Used in computer programming, mathematics, engineering, etc.

Holdout Set

A subsection of a dataset used to provide a final estimate of the MLS model's performance after it has been trained and validated. Should never be used to make decisions about which algorithms to use or for improving or tuning algorithms

Machine Learning

A subset of AI; The practice of using algorithms to parse data, learn from it, and then make a determination or prediction about something in the world

8 Criteria of Auto ML Excellence

Accuracy, Productivity, Ease of use, Understanding and learning, Resource availability, Process transparency (affects understanding and learning), Generalizability across contexts, Recommended action

Blending

After cross validation has run, models are internally sorted by cross validation score, and then the best models are blended
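
A rough sketch of that sort-then-blend step follows. It assumes the blend is a simple average of predicted probabilities and that the cross validation score is a LogLoss (lower is better); the card does not say which blending method is actually used, and the model names, scores, and predictions here are hypothetical.

```python
def blend_top_models(models, top_n=2):
    """Sort models by cross validation score and average the best ones."""
    # Lower score is better here because the assumed metric is LogLoss.
    ranked = sorted(models, key=lambda m: m["cv_logloss"])
    best = ranked[:top_n]
    n_rows = len(best[0]["predictions"])
    # Simple average blend (an assumption): mean probability per row.
    return [sum(m["predictions"][i] for m in best) / len(best)
            for i in range(n_rows)]

# Hypothetical models with made-up scores and predictions.
models = [
    {"name": "model_a", "cv_logloss": 0.40, "predictions": [0.9, 0.2, 0.6]},
    {"name": "model_b", "cv_logloss": 0.55, "predictions": [0.5, 0.5, 0.5]},
    {"name": "model_c", "cv_logloss": 0.35, "predictions": [0.7, 0.4, 0.8]},
]

blended = blend_top_models(models)
print(blended)
```

Here model_c and model_a have the two best scores, so each blended prediction is the mean of their two probabilities for that row.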

Features

Can be thought of as the independent variables we will use to predict

Example of Discrete Data

Course Letter Grade

Supervised ML

Data scientist tells the machine what it wants it to learn (identifies target)

ML Life Cycle

Define Project and Objectives, Acquire and Explore Data, Model Data, Interpret and Communicate, Implement, Document and Maintain

False Negative Rate

FN/(FN+TP)

False Positive Rate

FP/(FP+TN)
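
As a small worked example, the two error rates above can be computed by counting the four confusion-matrix cells from actual and predicted labels. The helper name and the ten sample rows are made up for illustration; 1 stands for the positive class and 0 for the negative class.

```python
def confusion_counts(actual, predicted):
    """Count true/false positives and negatives for 0/1 labels."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            tp += 1          # positive case correctly flagged
        elif a == 0 and p == 1:
            fp += 1          # negative case wrongly flagged
        elif a == 0 and p == 0:
            tn += 1          # negative case correctly left alone
        else:
            fn += 1          # positive case missed
    return tp, fp, tn, fn

# Made-up data: 4 actual positives, 6 actual negatives.
actual    = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

tp, fp, tn, fn = confusion_counts(actual, predicted)
false_negative_rate = fn / (fn + tp)   # share of actual positives that were missed
false_positive_rate = fp / (fp + tn)   # share of actual negatives that were flagged

print(false_negative_rate, false_positive_rate)
```

With these rows, one of the four positives is missed and one of the six negatives is flagged, so the false negative rate is 1/4 and the false positive rate is 1/6.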

Speed (DataRobot)

Fastest on left, slowest on right

Feature Effects

Feature Impact for specific feature values

ChatGPT

Large Language Model created by OpenAI (Generative Pretrained Transformer)

Accuracy (DataRobot)

Least accurate at top, most accurate at bottom

Artificial Intelligence

Machines that can perform tasks that are characteristic of human intelligence

Binary Data

Nominal attribute with only two categories/states

Example of Nominal Data

Occupation

Categorical Data

Qualitative; Described by words rather than numbers

Numerical Data

Quantitative: Arise from counting, measuring, or some kind of mathematical operation

Machine Learning Pipeline

Raw data to features to models to deploy in production to predictions

Example of Algorithm

Recipe for baking a cake, the method we use to solve long division problems, process of doing laundry, etc.

Business Problem Requirements

State the problem in language of business, specify action, include specifics, explain bottom line impact

Training Set

Subsection of a dataset from which the MLS uncovers or learns relationships between the features and the target variable

Validation Set

Subsection of a dataset to which we apply the MLS to see how accurately it identifies relationships between the known outcomes for the target variable and the dataset's other features
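
One way to picture the training, validation, and holdout sets is as three non-overlapping slices of a shuffled dataset. The sketch below is only an illustration: the 64/16/20 fractions, the function name, and the fixed seed are assumptions, not values stated in these cards.

```python
import random

def split_dataset(rows, train_frac=0.64, valid_frac=0.16, seed=0):
    """Shuffle rows, then cut them into training, validation, and holdout."""
    shuffled = rows[:]                      # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)   # fixed seed keeps the split repeatable
    n_train = round(len(shuffled) * train_frac)
    n_valid = round(len(shuffled) * valid_frac)
    training = shuffled[:n_train]                      # model learns from these rows
    validation = shuffled[n_train:n_train + n_valid]   # used to compare and tune models
    holdout = shuffled[n_train + n_valid:]             # untouched until the final estimate
    return training, validation, holdout

rows = list(range(100))   # stand-in for 100 dataset rows
training, validation, holdout = split_dataset(rows)
print(len(training), len(validation), len(holdout))
```

Every row lands in exactly one of the three sets, which is what lets the holdout give an honest final estimate.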

Specificity

TN/(TN+FP)

Accuracy

(TP+TN)/All Cases

Sensitivity

TP/(TP+FN)
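
A quick worked example of specificity, accuracy, and sensitivity, using made-up confusion-matrix counts for 100 cases (the counts are purely illustrative):

```python
# Hypothetical counts for 100 cases.
tp, fp, tn, fn = 40, 10, 45, 5

specificity = tn / (tn + fp)                  # 45 of the 55 actual negatives identified
accuracy = (tp + tn) / (tp + fp + tn + fn)    # 85 of all 100 cases classified correctly
sensitivity = tp / (tp + fn)                  # 40 of the 45 actual positives identified

print(specificity, accuracy, sensitivity)
```

The sum of TP and TN is taken before dividing by the total number of cases, so the parentheses in the accuracy line matter.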

Learning Curves

Tells us whether additional cases will help or not; Shows how predictive ability changes with 'sample size'

Over Training/Over Fitting

The model simply memorizes the training examples and cannot give correct outputs for patterns that were not in the training dataset

Feature Impact

The overall impact of a feature adjusted for the impact of the other features

Importance

The overall impact of a feature without consideration of the impact of other features

Target

The variable we are trying to predict and gain insights about

Cross Validation

Top four models; Only if validation set is <=10,000 rows

Large Language Model

Type of computer program that has been trained on a lot of text to understand and generate human-like text

Discrete Data

Under Categorical; Finite number of options

Continuous Data

Under Numerical; Infinite number of possible responses, like any point on a number line

Unsupervised ML

Up to the machine to decide what it wants to learn

Nominal Data

You can identify that groups are different, but there is no meaningful ranking

Text (Strings)

You specify a number of characters

