Exam 1 - Generative AI and Chapters 1-3

What is the purpose of Data Reduction?

The purpose of Data Reduction is to reduce the size of a data set to a more manageable and suitable size for business analysis.

What are hallucinations?

Hallucinations are words or phrases that the software makes up and presents as if they were true.

** know the chart from the textbook that describes the purpose of visualizations, and the common types of visualizations used to meet each purpose **

What does LLM stand for?

Large Language Model

What is "artificial general intelligence"?

AI that can do any intellectual task that a human can

What are the TWO general things to evaluate to determine if augmentation or automation of a task is worth pursuing?

Can AI do it? How valuable is it for the business?

What are the TWO types of data, and what are their subtypes?

Categorical: Nominal and Ordinal
Numerical: Interval and Ratio

** know the difference between random sampling, stratified random sampling, cluster sampling, and convenience, non-probability sampling **
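
A minimal sketch of how the four sampling methods differ, using only Python's standard library. The population (20 made-up customers in four regions) and all function names are assumptions for illustration, not from the textbook.

```python
import random

# Hypothetical population: 20 customers, 5 in each of 4 regions.
population = [(i, region) for i, region in enumerate(["North", "South", "East", "West"] * 5)]

def group_by_region(items):
    """Group items into a dict keyed by region."""
    groups = {}
    for item in items:
        groups.setdefault(item[1], []).append(item)
    return groups

def simple_random_sample(items, k, rng):
    """Every item has the same chance of being picked."""
    return rng.sample(items, k)

def stratified_random_sample(items, k_per_stratum, rng):
    """Pick randomly WITHIN every group, so each group is represented."""
    sample = []
    for members in group_by_region(items).values():
        sample.extend(rng.sample(members, k_per_stratum))
    return sample

def cluster_sample(items, n_clusters, rng):
    """Pick whole groups at random and keep everyone in the chosen groups."""
    groups = group_by_region(items)
    chosen = rng.sample(sorted(groups), n_clusters)
    return [item for name in chosen for item in groups[name]]

def convenience_sample(items, k):
    """Non-probability: take whatever is easiest to reach (here, the first k)."""
    return items[:k]

rng = random.Random(0)  # fixed seed so the run is repeatable
simple = simple_random_sample(population, 5, rng)
stratified = stratified_random_sample(population, 2, rng)
cluster = cluster_sample(population, 2, rng)
convenience = convenience_sample(population, 5)
```

Stratified sampling draws a few items from every group, while cluster sampling keeps every item from a few groups.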

Tableau Prep

- a basic but powerful tool for preparing data for analysis

Data Dictionary

- a centralized repository of information about data containing a separate record for each field or variable in a database

String

- a collection of one or more characters that are stored as categorical data

OLAP

- a computing method that enables users to easily and selectively extract and query data in order to analyze it from different points of view

R or Python

- a programming language that can be used to clean data and conduct a business analysis

Tableau

- a data visualization software for advanced analysis of data

Excel

- a spreadsheet software for basic analysis of data

T-Test

- a statistical test that is used to determine if there is a significant difference between the means of TWO groups or sets of data
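
A minimal sketch of the number a t-test is built on, assuming the unequal-variance (Welch) form of the statistic. The function name and the two groups of values are made up for illustration. Turning the statistic into a p-value needs a t-distribution table or a statistics package, which is left out here.

```python
from math import sqrt
from statistics import mean, variance

def welch_t_statistic(group_a, group_b):
    """t = (mean_a - mean_b) / sqrt(var_a / n_a + var_b / n_b), using sample variances."""
    standard_error = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / standard_error

# Made-up data: two groups whose means differ by 4.
group_a = [5, 6, 7, 8, 9]
group_b = [1, 2, 3, 4, 5]
t_value = welch_t_statistic(group_a, group_b)
print(t_value)
```

The further the statistic is from zero, the stronger the evidence that the two group means differ.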

ANOVA

- a statistical test that is used to determine if there is a significant difference between the means of THREE or more groups or sets of data

Microsoft Power Query

- a tool built into Excel and Power BI that allows them to connect to a variety of different data sources

Histogram

- a visual representation of a frequency distribution

Box Plot

- a visual representation of how data is distributed across its quartiles
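
A minimal sketch of the five numbers a box plot draws (minimum, first quartile, median, third quartile, maximum), using Python's `statistics.quantiles`. The data and the function name are made up, and the "inclusive" quartile method is one of several conventions, so other tools may give slightly different quartiles.

```python
from statistics import quantiles

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a list of numbers."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return min(values), q1, q2, q3, max(values)

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # made-up data
summary = five_number_summary(data)
print(summary)
```

The box spans Q1 to Q3, the line inside the box is the median, and the whiskers typically reach toward the minimum and maximum.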

Alteryx

- an advanced, powerful tool for preparing data for analysis

Measure

- an attribute that is characterized as numerical

Hadoop

- an open-source framework for storing and processing Big Data

Dimension

- any attribute that is characterized as categorical

Geographic

- any data that can be linked to a map

Data Warehouse

- integrates data from different databases across a company

Data Lake

- integrates data from different sources

What is correlation?

- a measure of the relationship between two variables, based on how they change with respect to each other
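
A minimal sketch of one common way to measure correlation, the Pearson correlation coefficient, written out by hand. The function name and the data are assumptions for illustration. The result ranges from -1 (the variables move in opposite directions) to +1 (they move together).

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # How the two variables vary together.
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each variable varies on its own.
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when a variable never changes")
    return co_variation / (spread_x * spread_y)

hours_studied = [1, 2, 3, 4]   # made-up data
exam_points = [2, 4, 6, 8]     # rises in step with hours_studied
print(pearson_r(hours_studied, exam_points))
```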

Microsoft Power BI

- a tool for building basic and advanced data analytic models and visualizations

What is the common file format to deliver structured TABULAR data?

.csv

What is the common file format to deliver structured TEXT data?

.txt

What are the THREE general tips for creating a prompt for Generative AI?

1. Be detailed and specific
2. GUIDE the model to think through its answer
3. EXPERIMENT and iterate

What are the FOUR general steps to prepare data for analysis?

1. ENSURE data QUALITY
2. VALIDATE the data for COMPLETENESS and INTEGRITY
3. CLEANSE the data
4. PERFORM preliminary "exploratory analysis"

What are FOUR ways to improve a model's performance?

1. Prompting
2. RAG
3. Fine-tune Model
4. Pretrain Model

What is a Z-Score?

A Z-Score tells you how many standard deviations a data point is from the mean.
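
A minimal sketch of the calculation, z = (value - mean) / standard deviation. The function name and the data are made up, and the population standard deviation is assumed (a sample standard deviation would give slightly different numbers).

```python
from statistics import mean, pstdev

def z_scores(values):
    """Return the z-score of every value, using the population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("z-scores are undefined when every value is the same")
    return [(v - mu) / sigma for v in values]

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data: mean 5, standard deviation 2
print(z_scores(scores))
```

In this data the value 9 has a z-score of 2 because it sits two standard deviations above the mean.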

What does AI automate?

AI doesn't automate a job, but it does automate a task.

What is an example of augmentation of a task with Generative AI?

Augmentation is meant to help a human with a task. EX: Generative AI can recommend a response for a customer-service agent to edit or approve.

What is Big Data? What are the FOUR V's that describe Big Data?

Big Data: - data that is too large and complex for a business's centralized system to capture, store, manage, and analyze
The FOUR V's: VOLUME - VARIETY - VELOCITY - VERACITY

What are some characteristics that data should have?

Data should be RELEVANT and RELIABLE.

What are the different types of analytics?

Descriptive, Diagnostic, Predictive, Prescriptive, and Adaptive or Autonomous

What is the difference between descriptive and inferential statistics?

Descriptive: - a measure that describes a group of interest
Inferential: - a measure calculated from only a sample and used to draw conclusions about the desired population

What are some tips for responsible AI?

Fairness, Transparency, Privacy, Security, and Ethical Use

ETL

EXTRACT - TRANSFORM - LOAD
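
A minimal sketch of the three ETL steps using only Python's standard library. The CSV text, the `sales` table, and the function names are all made up for illustration. A real pipeline would read from source systems and load into a data warehouse.

```python
import csv
import io
import sqlite3

# Made-up raw data with messy spacing and capitalization.
RAW_CSV = "name,amount\n alice ,10.5\nBOB,4\n"

def extract(text):
    """EXTRACT: read the raw rows from the source."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """TRANSFORM: clean names and convert amounts to numbers."""
    return [(row["name"].strip().title(), float(row["amount"])) for row in rows]

def load(rows, connection):
    """LOAD: write the cleaned rows into the destination table."""
    connection.execute("CREATE TABLE IF NOT EXISTS sales (name TEXT, amount REAL)")
    connection.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    connection.commit()

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
load(transform(extract(RAW_CSV)), conn)
loaded = conn.execute("SELECT name, amount FROM sales ORDER BY name").fetchall()
print(loaded)
```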

Which knowledge workers are expected to be impacted the most?

Educator, Legal Professional, etc...

What does ERP mean? What is it?

Enterprise Resource Planning
ERP is a type of business management software that integrates applications from throughout the business into one system.

What are exploratory and explanatory data visualizations?

Exploratory: - a graphical representation that is useful for uncovering patterns and useful insights in the data
Explanatory: - a graphical representation that is useful for communicating the findings of the analysis to stakeholders

What is Generative AI? Who are the major players in the market at this time?

Generative AI is an AI system that can produce high-quality content: text, images, and audio. EX: ChatGPT (OpenAI), Bard (Google), Bing Chat (Microsoft)

What are the FOUR AI tools mentioned in the course?

Generative AI, Supervised Learning, Unsupervised Learning, and Reinforcement Learning

What is a TOKEN, and what is it used for related to Generative AI?

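A token is a word or a piece of a word, the unit of text that an LLM reads and generates. Token counts measure the length of the input and output, and they are commonly the basis for what you pay to use the model.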

What does it mean that Generative AI is a "general purpose technology"?

It is a technology that is useful for many different tasks rather than for one application. This kind of technology comes around "once in a generation", and it affects just about every human.

What kind of job(s) will have more impact from Generative AI?

It will have more impact on "higher-paid" jobs.

What are some common measures of central tendency?

Mean, Median, and Mode
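
A minimal sketch of the three measures using Python's built-in `statistics` module. The data is made up for illustration.

```python
from statistics import mean, median, mode

data = [3, 5, 5, 6, 8, 9]  # made-up data

center_mean = mean(data)      # sum of the values divided by how many there are
center_median = median(data)  # middle value (average of the two middle values here)
center_mode = mode(data)      # most frequent value

print(center_mean, center_median, center_mode)
```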

Can Generative AI filter out bias automatically?

No

Does Generative AI work well with structured data?

No

What are the types of bias that may be an issue when working with data?

Nonresponse, Selection, Confirmation, and Outlier

What is a Parameter vs. a Statistic when doing analyses on data?

Parameter: - a characteristic of a population
Statistic: - a characteristic of a sample

What is a Primary and Foreign Key in a relational database?

Primary Key: - a field that functions as the unique identifier for each record in a table
Foreign Key: - a field that creates a relationship between one table and another by referring to the other table's primary key
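
A minimal sketch of both kinds of key using Python's built-in `sqlite3` module. The `customers` and `orders` tables, their columns, and the helper function are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")        # throwaway in-memory database
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked

# customer_id is the PRIMARY KEY: it uniquely identifies each customer record.
conn.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# orders.customer_id is a FOREIGN KEY: it links each order to one customer.
conn.execute(
    """CREATE TABLE orders (
           order_id INTEGER PRIMARY KEY,
           customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
           amount REAL NOT NULL
       )"""
)

conn.execute("INSERT INTO customers VALUES (1, 'Ada')")
conn.execute("INSERT INTO orders VALUES (100, 1, 25.0)")

def try_orphan_order(connection):
    """Try to add an order for a customer that does not exist; return True if accepted."""
    try:
        connection.execute("INSERT INTO orders VALUES (101, 999, 10.0)")
    except sqlite3.IntegrityError:
        return False
    return True

# The foreign key lets the two tables be joined back together.
rows = conn.execute(
    "SELECT c.name, o.amount FROM orders o JOIN customers c ON c.customer_id = o.customer_id"
).fetchall()
print(rows, try_orphan_order(conn))
```

The database rejects the order for customer 999 because no record in `customers` has that primary key.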

What are the instructions given to an LLM to perform a task called?

Prompt

What are some common measures of dispersion?

Range, Variance, and Standard Deviation
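
A minimal sketch of the three measures using Python's built-in `statistics` module. The data is made up, and the population versions of variance and standard deviation are assumed (the sample versions, `variance` and `stdev`, divide by n - 1 instead of n).

```python
from statistics import pstdev, pvariance

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data with mean 5

data_range = max(data) - min(data)  # distance between the largest and smallest value
data_variance = pvariance(data)     # average squared distance from the mean
data_std_dev = pstdev(data)         # square root of the variance

print(data_range, data_variance, data_std_dev)
```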

What does RLHF stand for? What is it used for?

Reinforcement Learning from Human Feedback
It is used to train the system to produce the answers that users prefer.

What does RAG stand for and what does it do?

Retrieval Augmented Generation
It gives LLMs access to external data sources.
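
A toy sketch of the idea: first retrieve the most relevant document, then add it to the prompt so the LLM can answer from it. Everything here is an assumption for illustration: the documents, the function names, and the word-overlap scoring. No real LLM is called, and real systems typically use more capable search than counting shared words.

```python
import re

# Made-up "external data source": a few company documents.
DOCUMENTS = [
    "Employees may park in the north parking lot.",
    "The cafeteria opens at 8 am.",
    "Expense reports are due on the last Friday of the month.",
]

def words(text):
    """Lowercase a text and split it into a set of words, dropping punctuation."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, documents, top_k=1):
    """Step 1 (retrieval): rank documents by how many words they share with the question."""
    ranked = sorted(documents, key=lambda doc: len(words(question) & words(doc)), reverse=True)
    return ranked[:top_k]

def build_prompt(question, documents):
    """Step 2 (augmentation): put the retrieved text into the prompt given to the LLM."""
    context = "\n".join(retrieve(question, documents))
    return f"Use the following context to answer the question.\nContext:\n{context}\nQuestion: {question}"

prompt = build_prompt("Where can employees park?", DOCUMENTS)
print(prompt)
```

The final step (generation) would send this prompt to an LLM, which can then answer from the retrieved text.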

What are the FOUR components to the SOARs analytics model?

S: SPECIFY THE QUESTION
O: OBTAIN THE DATA
A: ANALYZE THE DATA
R: REPORT THE RESULTS

What are some functional business areas where businesses spend money on Generative AI, and they receive a significant impact?

Sales, Marketing, Software Engineering, Product R&D, etc...

What is a general time-frame for developing and running a supervised learning AI technique with experienced personnel?

Scope the Project - Build and Improve the System - Internal Evaluation - Deploy and Monitor

What are some common roles that people play in building Generative AI software? What does each one do?

Software Engineer: - responsible for writing the application
Machine-Learning Engineer: - responsible for implementing the system
Product Manager: - responsible for identifying and scoping the project

What are the THREE basic components of a relational database?

TABLE - FIELD - RECORD

Can Generative AI create programming code?

Yes

Does ChatGPT capture all publicly available information on the Internet at some point in time?

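Yes, roughly. Its training data is a snapshot of a large share of publicly available text up to a knowledge cutoff point, and it does not know anything newer unless that information is provided to it.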

Is Generative AI a replacement for doing a web-search?

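Not reliably. It can answer many of the same questions, but it can hallucinate and its knowledge stops at its cutoff point, so important facts should be checked against a trusted source.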

Is it a true statement that LLMs are used as a reasoning engine to process information rather than just as a source of information?

Yes

Are there limits to how much data can be included in the input to or output from a LLM?

Yes. Usually, the limit is a few thousand words.

How can you determine the current knowledge cutoff point?

You can review the date of the most recent information that the technology is referencing.
