AIGP Key Terms

OECD Risk Dimensions

- People and planet
- Economic context
- Data and input
- AI model
- Task and output

Artificial Intelligence

A broad term used to describe an engineered system that uses various computational techniques to perform or automate tasks. This may include techniques, such as machine learning, where machines learn from experience, adjusting to new input data and potentially performing tasks previously done by humans. More specifically, it is a field of computer science dedicated to simulating intelligent behavior in computers. It may include automated decision-making.

Turing Test

A check of a machine's ability to exhibit intelligent behavior equivalent to, or indistinguishable from, that of a human. Alan __ (1912-1954) originally thought of the check to be an AI's ability to converse through a written text, such that a human reader would not be able to tell a computer-generated response from that of a human.

Overfitting

A concept in machine learning in which a model (see machine learning model) becomes too specific to the training data and cannot generalize to unseen data, which means it can fail to make accurate predictions on new datasets. Occurs when an algorithm fits too closely, or even exactly, to its training data, resulting in a model that can't make accurate predictions or draw conclusions from any data other than the training data.

Underfitting

A concept in machine learning in which a model (see machine learning model) fails to fully capture the complexity of the training data. This may result in poor predictive ability and/or inaccurate outputs. Factors leading to underfitting may include too few model parameters, too high a regularization rate, or an inappropriate or insufficient set of features in the training data.
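
The contrast between overfitting and underfitting can be illustrated with a small sketch. The data, model names and error metric below are assumptions made for illustration: a constant model stands in for underfitting, a nearest-neighbor lookup that reproduces every noisy training target stands in for overfitting, and a least-squares line sits in between.

```python
# Toy data: the true relationship is y = 2x, but the training targets carry noise.
train_x = [0, 1, 2, 3, 4, 5]
train_y = [0.5, 1.5, 4.5, 5.5, 8.5, 9.5]   # 2x plus or minus 0.5
test_x = [0.5, 1.5, 2.5, 3.5, 4.5]
test_y = [1.0, 3.0, 5.0, 7.0, 9.0]          # noise-free 2x

def mse(model, xs, ys):
    """Mean squared error of a model over a dataset."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

mean_x = sum(train_x) / len(train_x)
mean_y = sum(train_y) / len(train_y)

def constant_model(x):
    # Underfits: ignores the input and always predicts the training mean.
    return mean_y

def nearest_neighbor_model(x):
    # Overfits: returns the target of the closest training point, noise included.
    i = min(range(len(train_x)), key=lambda j: abs(train_x[j] - x))
    return train_y[i]

# Ordinary least-squares line fitted to the training data.
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(train_x, train_y))
         / sum((x - mean_x) ** 2 for x in train_x))
intercept = mean_y - slope * mean_x

def linear_model(x):
    return slope * x + intercept

for name, model in [("constant", constant_model),
                    ("nearest neighbor", nearest_neighbor_model),
                    ("linear", linear_model)]:
    print(name, "train:", round(mse(model, train_x, train_y), 3),
          "test:", round(mse(model, test_x, test_y), 3))
```

In this toy setup the nearest-neighbor model has zero training error but a larger test error than the line, while the constant model is poor on both sets.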

Computer Vision

A field of AI that enables computers to process and analyze images, videos and other visual inputs.

Generative AI

A field of AI that uses deep learning trained on large datasets to create new content, such as written text, code, images, music, simulations and videos. Unlike discriminative models, which learn to distinguish between classes of existing data, __ learns the patterns of its training data well enough to produce new data that resembles it. These models are capable of generating novel outputs based on input data or user prompts.

Chatbot

A form of AI designed to simulate human-like conversations and interactions that uses natural language processing and deep learning to understand and respond to text or other media. Because these are often used for customer service and other personal help applications, they often ingest users' personal information.

Expert System

A form of AI that draws inferences from a knowledge base to replicate the decision-making abilities of a human expert within a specific field, like a medical diagnosis.

Large Language Model (LLM)

A form of AI that utilizes deep learning algorithms to create models (see machine learning model) pre-trained on massive text datasets for the general purpose of language learning to analyze and learn patterns and relationships among characters, words and phrases. There are generally two types: generative models that make text predictions based on the probabilities of word sequences learned from its training data (see generative AI) and discriminative models that make classification predictions based on probabilities of data features and weights learned from its training data (see discriminative model). The term "large" generally refers to the model's capacity measured by the number of parameters and to the enormous datasets that it is trained on.

Corpus

A large collection of texts or data that a computer uses to find patterns, make predictions or generate specific outcomes. May include structured or unstructured data and cover a specific topic or a variety of topics.

Foundation Model

A large-scale, pretrained model with AI capabilities, such as language (see large language model), vision, robotics, reasoning, search or human interaction, that can function as the base for use-specific applications. The model is trained on extensive and diverse datasets.

Machine Learning Model

A learned representation of underlying patterns and relationships in data, created by applying an AI algorithm to a training dataset. It can then be used to make predictions or perform tasks on new, unseen data. Some models behave as black boxes, meaning they lack transparency and explainability.

Bootstrap Aggregating (bagging)

A machine learning method that aggregates multiple versions of a model (see machine learning model) trained on random subsets of a dataset. This method aims to make a model more stable and accurate.
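
A minimal sketch of the idea, under stated assumptions: the dataset is invented, the base model is a least-squares slope through the origin (chosen only because it fits in one line), and the helper names `fit_slope`, `bagged_slopes` and `bagged_predict` are hypothetical.

```python
import random

# Hypothetical toy dataset in which y is roughly 2x.
data = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 7.8),
        (5, 10.1), (6, 12.2), (7, 13.8), (8, 16.1)]

def fit_slope(sample):
    # Base model: least-squares slope for a line through the origin.
    return sum(x * y for x, y in sample) / sum(x * x for x, _ in sample)

def bagged_slopes(dataset, n_models, seed=0):
    rng = random.Random(seed)
    slopes = []
    for _ in range(n_models):
        # Bootstrap sample: same size as the dataset, drawn with replacement.
        sample = rng.choices(dataset, k=len(dataset))
        slopes.append(fit_slope(sample))
    return slopes

def bagged_predict(slopes, x):
    # Aggregate by averaging the predictions of all base models.
    return sum(s * x for s in slopes) / len(slopes)

slopes = bagged_slopes(data, n_models=25)
prediction = bagged_predict(slopes, 10)
print(round(prediction, 2))
```

Each base model sees a slightly different resample, so the averaged prediction is less sensitive to any single data point than one model fitted once.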

Federated Learning

A machine learning method that allows models (see machine learning model) to be trained on the local data of multiple edge devices or servers. Only the updates of the local model, not the training data itself, are sent to a central location where they get aggregated into a global model — a process that is iterated until the global model is fully trained. This process enables better privacy and security controls for the individual user data.
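
The loop described above can be sketched as simple federated averaging. Everything here is an illustrative assumption: the client names, the one-parameter model y = w * x, the learning rate and the number of rounds.

```python
# Each client keeps its (x, y) pairs private; the true relationship is y = 3x.
clients = {
    "phone_a": [(1.0, 3.0), (2.0, 6.0)],
    "phone_b": [(3.0, 9.0)],
    "server_c": [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0), (4.0, 12.0)],
}

def local_update(w, local_data, lr=0.01, steps=5):
    # A few steps of gradient descent on mean squared error for y = w * x.
    for _ in range(steps):
        grad = sum(2 * x * (w * x - y) for x, y in local_data) / len(local_data)
        w -= lr * grad
    return w

def federated_round(global_w):
    total = sum(len(d) for d in clients.values())
    # Only the updated weight leaves each client, never the raw data.
    # The server averages the weights, weighted by local dataset size.
    return sum(local_update(global_w, d) * len(d) / total
               for d in clients.values())

global_w = 0.0
for _ in range(50):
    global_w = federated_round(global_w)

print(round(global_w, 4))
```

The central aggregation step only ever sees one number per client, which is the property that supports the privacy controls mentioned in the definition.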

Reinforcement Learning

A machine learning method that trains a model to optimize its actions within a given environment to achieve a specific goal, guided by feedback mechanisms of rewards and penalties. This training is often conducted through trial-and-error interactions or simulated experiences that do not require external data. For example, an algorithm can be trained to earn a high score in a video game by having its efforts evaluated and rated according to success toward the goal.
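
As a rough illustration of learning from rewards alone, the sketch below uses tabular Q-learning on a made-up five-cell corridor. The environment, reward of 1 for reaching the last cell, and the hyperparameters are all assumptions chosen for brevity; Q-learning is only one of several reinforcement learning methods.

```python
import random

N_STATES = 5            # states 0..4; reaching state 4 ends the episode
GOAL = N_STATES - 1
ACTIONS = (-1, 1)       # move left or move right

def step(state, action):
    # The simulated environment: walls at both ends, reward only at the goal.
    next_state = min(max(state + action, 0), GOAL)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward

def train(episodes=300, alpha=0.5, gamma=0.9, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        while state != GOAL:
            # Explore with random actions; Q-learning is off-policy,
            # so it can still learn the value of acting greedily.
            action = rng.choice(ACTIONS)
            next_state, reward = step(state, action)
            best_next = max(q[(next_state, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (reward + gamma * best_next
                                           - q[(state, action)])
            state = next_state
    return q

q_table = train()
# Greedy policy: in each non-terminal state, pick the action with the highest value.
policy = [max(ACTIONS, key=lambda a: q_table[(s, a)]) for s in range(GOAL)]
print(policy)
```

No external dataset is involved: the agent's only training signal is the reward returned by the simulated environment.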

Robotics

A multidisciplinary field that encompasses the design, construction, operation and programming of robots. Allows AI systems and software to interact with the physical world.

Transformer Model

A neural network architecture that learns context and maintains relationships between sequence data, such as words in a sentence. It does so by leveraging the technique of attention, i.e. it focuses on the most important and relevant parts of the input sequence. This helps to improve model accuracy. For example, in language-learning tasks, by attending to the surrounding words, the model is able to comprehend the meaning of a word in the context of the whole sentence.
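
The attention technique mentioned above is commonly implemented as scaled dot-product attention. The sketch below is a single-query, pure-Python version with invented two-dimensional vectors; real transformer layers add learned projections, multiple heads and batching, none of which are shown.

```python
import math

def softmax(scores):
    # Subtract the maximum for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    d = len(query)
    # Similarity of the query to each key, scaled by sqrt(dimension).
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    weights = softmax(scores)
    # Output is the weighted average of the value vectors.
    output = [sum(w * v[i] for w, v in zip(weights, values))
              for i in range(len(values[0]))]
    return weights, output

keys = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
weights, output = attention([4.0, 0.0], keys, values)
print([round(w, 3) for w in weights])
```

The query points in the same direction as the first key, so the first position receives the largest weight and dominates the output.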

Challenger Model

A new model used to test and compare against the existing model to assess drift, unexpected results, etc.

Federated Learning

A way to train models without sharing potentially sensitive data among different locations. A central model is kept in a central location (e.g., in the cloud). Each local site downloads the central model and trains it on its own data. The sites then send the results of that training back to the central location, where they are aggregated.

Algorithm

A procedure or set of instructions and rules designed to perform a specific task or solve a particular problem, using a computer.

Retrieval-augmented generation

A process that optimizes the output of a large language model (LLM) by referencing a knowledge base outside of its training data sources before it generates a response.
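
A toy sketch of the retrieval step, assuming a three-sentence in-memory knowledge base and a naive word-overlap ranking. The function names and prompt wording are hypothetical, and no language model is actually called; production systems typically use embedding-based search instead.

```python
# Hypothetical in-memory knowledge base; a real system would use a document store.
knowledge_base = [
    "The refund window for online orders is 30 days.",
    "Support is available on weekdays from 9:00 to 17:00.",
    "Shipping to Canada takes five to seven business days.",
]

def tokenize(text):
    # Lowercase and strip basic punctuation so words can be compared.
    return set(text.lower().replace("?", " ").replace(".", " ").split())

def retrieve(question, documents, top_k=1):
    # Rank documents by how many words they share with the question.
    q = tokenize(question)
    ranked = sorted(documents, key=lambda d: len(q & tokenize(d)), reverse=True)
    return ranked[:top_k]

def build_prompt(question, documents):
    # The retrieved text is placed in the prompt that would be sent to the LLM.
    context = "\n".join(retrieve(question, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

prompt = build_prompt("How long is the refund window?", knowledge_base)
print(prompt)
```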

Variance

A statistical measure that reflects how far a set of numbers is spread out from its average value in a dataset. A high __ indicates that the data points are spread widely around the mean. A low __ indicates the data points are close to the mean. In machine learning, higher __ can lead to overfitting. The trade-off between __ and bias is a fundamental concept in machine learning. Increasing model complexity tends to reduce bias but increase __. Decreasing complexity reduces __ but increases bias.
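
The statistical definition can be checked with a short calculation. The two lists below are invented examples; the sketch computes the population variance by hand and compares it with `statistics.pvariance` from the Python standard library.

```python
from statistics import mean, pvariance

tight = [4, 5, 5, 5, 6]               # values close to the mean
spread = [2, 4, 4, 4, 5, 5, 7, 9]     # values spread widely around the mean

def population_variance(values):
    # Average squared distance from the mean.
    mu = mean(values)
    return sum((v - mu) ** 2 for v in values) / len(values)

print(population_variance(tight), population_variance(spread))
print(pvariance(tight), pvariance(spread))
```

Both lists have a mean of 5, but the second has the larger variance because its points lie farther from that mean.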

Deep Learning

A subfield of AI and machine learning that uses artificial neural networks. Especially useful in fields where raw data needs to be processed, like image recognition, natural language processing and speech recognition.

Active Learning / Query Learning

A subfield of AI and machine learning where an algorithm can select some of the data it learns from. Instead of learning from all the data it is given, it requests additional data points that will help it learn the best.

Machine Learning (ML)

A subfield of AI involving algorithms that enable computer systems to iteratively learn from and then make decisions, inferences or predictions based on input data. These algorithms build a model from training data to perform a specific task on new data without being explicitly programmed to do so. Implements various algorithms that learn and improve by experience in a problem-solving process that includes data cleansing, feature selection, training, testing and validation. Companies and government agencies deploy these algorithms for tasks such as fraud detection, recommender systems, customer inquiries, health care, or transport and logistics.

Natural Language Processing

A subfield of AI that helps computers understand, interpret and manipulate human language by transforming information into content. It enables machines to read text or spoken language, interpret its meaning, measure sentiment and determine which parts are important for understanding.

Semi-supervised Learning

A subset of machine learning that combines both supervised and unsupervised learning by training the model on a large amount of unlabeled data and a small amount of labeled data. This avoids the challenges of finding large amounts of labeled data for training the model. Generative AI commonly relies on this.

Supervised Learning

A subset of machine learning where the model (see machine learning model) is trained on labeled input data with known desired outputs. These two groups of data are sometimes called predictors and targets, or independent and dependent variables, respectively. This type of learning is useful for classification or regression. The former refers to training an AI to group data into specific categories and the latter refers to making predictions by understanding the relationship between two variables.

Unsupervised Learning

A subset of machine learning where the model is trained by looking for patterns in an unclassified dataset with minimal human supervision. The AI is provided with preexisting unlabeled datasets and then analyzes those datasets for patterns. This type of learning is useful for training an AI for techniques such as clustering data (outlier detection, etc.) and dimensionality reduction (feature learning, principal component analysis, etc.).

Transfer Learning Model

A type of model (see machine learning model) used in machine learning in which an algorithm learns to perform one task, such as recognizing cats, and then uses that learned knowledge as a basis when learning a different but related task, such as recognizing dogs.

Discriminative Model

A type of model (see machine learning model) used in machine learning that directly maps input features to class labels and analyzes for patterns that can help distinguish between different classes. It is often used for text classification tasks, like identifying the language of a piece of text. Examples are traditional neural networks, decision trees and random forests.

Classification Model

A type of model (see machine learning model) used in machine learning that is designed to take input data and sort it into different categories or classes.

Neural Networks

A type of model (see machine learning model) used in machine learning that mimics the way neurons in the brain interact with multiple processing layers, including at least one hidden layer. This layered approach enables __ to model complex nonlinear relationships and patterns within data. Artificial __ have a range of applications, such as image recognition and medical diagnosis.

Multimodal Models

A type of model used in machine learning (see machine learning model) that can process more than one type of input or output data, or 'modality,' at the same time. For example, a __ can take both an image and text caption as input and then produce a unimodal output in the form of a score indicating how well the text caption describes the image. These models are highly versatile and useful in a variety of tasks, like image captioning and speech recognition.

Decision Tree

A type of supervised learning model used in machine learning (see machine learning model) that represents decisions and their potential consequences in a branching structure.
- Not a black box
- More explainable
- Changing just a little bit of training data can have a significant impact on the algorithm itself
- Susceptible to security attacks and hacks

Artificial General Intelligence (AGI)

AI that is considered to have human-level intelligence and strong generalization capability to achieve goals and carry out a variety of tasks in different contexts and environments. AGI still remains a theoretical field of research. It is contrasted with "narrow" AI, which is used for specific tasks or problems.

Conformity Assessment

An analysis, often performed by a third-party body, on an AI system to determine whether requirements, such as establishing a risk-management system, data governance, record keeping, transparency and cybersecurity practices, have been met. Often referred to as an audit.

Robustness

An attribute of an AI system that ensures a resilient system that maintains its functionality and performs accurately in a variety of environments and circumstances, even when faced with changed inputs or adversarial attacks.

Reliability

An attribute of an AI system that ensures it behaves as expected and performs its intended function consistently and accurately, even with new data that it has not been trained on.

Fairness

An attribute of an AI system that prioritizes relatively equal treatment of individuals or groups in its decisions and actions in a consistent, accurate manner. Every model must identify the appropriate standard of fairness that best applies, but most often it means the AI system's decisions should not adversely impact, whether directly or disparately, sensitive attributes like race, gender or religion.

AI System Inventory

An organized database of artifacts relating to an AI system or model. May include system documentation, incident response plans, data dictionaries, links to implementation software or source code, names and contact information for relevant AI actors, or other information that may be helpful for model or system maintenance and incident response purposes.

Clustering (Clustering Algorithms)

An unsupervised machine learning method where patterns in the data are identified and evaluated, and data points are grouped accordingly based on their similarity.
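
One common clustering algorithm is k-means. The sketch below runs it on six invented one-dimensional points with two clusters; seeding the centroids with the smallest and largest point is an assumption made to keep the example deterministic.

```python
def kmeans_1d(points, centroids, iterations=10):
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

points = [1, 2, 3, 10, 11, 12]
centroids, clusters = kmeans_1d(points, centroids=[min(points), max(points)])
print(centroids, clusters)
```

No labels are supplied; the two groups emerge purely from the distances between points.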

Deepfakes

Audiovisual content that has been altered or manipulated using AI techniques. Can be used to spread misinformation and disinformation.

Disinformation

Audiovisual content, information and synthetic data that is intentionally manipulated or created to cause harm. Can be spread through deepfakes by those with malicious intentions.

Model card

A brief document that discloses information about an AI model, such as explanations of its intended use, performance metrics and benchmarked evaluation in various conditions, for example across different cultures, demographics or races. Primarily focused on documenting the model's behavior and intended deployment context rather than serving as a complete system governance tool.

Exploratory Data Analysis

Data discovery process techniques that take place before training a machine learning model in order to gain preliminary insights into a dataset, such as identifying patterns, outliers, and anomalies and finding relationships among variables.

Synthetic Data

Data generated by a system or model that can mimic and resemble the structure and statistical properties of real data. It is often used for testing or training machine learning models, particularly in cases where real-world data is limited, unavailable or too sensitive to use.

Input Data

Data provided to or directly acquired by a learning algorithm or machine learning model for the purpose of producing an output. It forms the basis upon which the machine learning model will learn, make predictions and/or carry out tasks.

Strict liability regimes

Victims do not need to prove intentional wrongdoing, only that the product itself was defective and that the defect caused the harm they suffered. Example: the reform of the EU's 1985 product liability law.

Misinformation

False audiovisual content, information or synthetic data that is unintentionally misleading. It can be spread through deepfakes by those who lack intent to cause harm.

Variables (Features)

In the context of machine learning, a __ is a measurable attribute, characteristic or unit that can take on different values. Can be numerical/quantitative or categorical/qualitative.

Hallucinations

Instances where a generative AI model creates content that either contradicts the source or creates factually incorrect output under the appearance of fact.

Containerization

Involves packaging the model and its dependencies (everything the model needs to run) into a self-contained unit. Can help reduce compatibility issues and make it easier to deploy the model in different environments.

Fault liability regimes

It must be proven that an action or inaction by the product maker caused harm (e.g., non-compliance with a product safety law, or negligence). Example: the EU's proposed AI Liability Directive.

Exposing the model

Making the model accessible for real-world use so that systems or applications can interact with it. Many options exist for this step, including REST APIs (representational state transfer application programming interfaces) and embedding the model into an application.

Model Drift

Occurs when the relationship between input data and output predictions changes over time. This means the conditions under which the model was trained no longer apply, causing a decline in model performance. Example: a spam detection model that fails to recognize new types of spam as the nature of spam evolves.
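
One simple way to notice the performance decline described above is to compare recent accuracy with a baseline. The sketch below is an assumption-laden illustration: the baseline figure, the 10-point tolerance, the labels and the function names are all hypothetical, and real monitoring setups often also track changes in the input data itself.

```python
def accuracy(predictions, labels):
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)

def detect_drift(baseline_accuracy, recent_predictions, recent_labels,
                 tolerance=0.10):
    # Flag drift when recent accuracy falls more than `tolerance`
    # below the accuracy measured when the model was deployed.
    recent = accuracy(recent_predictions, recent_labels)
    return recent < baseline_accuracy - tolerance, recent

# Hypothetical spam filter outputs (1 = spam, 0 = not spam).
baseline = 0.95
labels      = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
predictions = [1, 0, 0, 0, 0, 1, 0, 0, 0, 1]   # misses the newer spam
drifted, recent_accuracy = detect_drift(baseline, predictions, labels)
print(drifted, recent_accuracy)
```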

Trustworthy AI (Responsible AI / Ethical AI)

Principle-based AI governance and development, including the principles of security, safety, transparency, explainability, accountability, privacy, nondiscrimination/nonbias (see bias), among others.

Negligence liability claim

The product maker failed to exercise due care, and that failure led to harm.

Breach of Warranty liability claim

Promises made about a product have not been met.

OECD AI Principles

Promote use of AI that is innovative and trustworthy and that respects human rights and democratic values. Set standards for AI that are practical and flexible enough to stand the test of time.
- Inclusive growth, sustainable development and well-being
- Human rights and democratic values, including fairness and privacy
- Transparency and explainability
- Robustness, security and safety
- Accountability

US NIST AI Risk Management Framework

Provides practical guidance on risk management activities for AI principles.
Seven characteristics of trustworthy AI:
- Valid and reliable
- Safe
- Secure and resilient
- Accountable and transparent
- Explainable and interpretable
- Privacy-enhanced
- Fair, with harmful bias managed
Key steps: test, evaluate, verify, validate.
Core functions:
- Govern: cultivate and implement a culture of risk management
- Map: identify uses and the risks related to each use
- Measure: assess, analyze and track risks
- Manage: prioritize risks and act based on projected impact

Compute

Refers to the processing resources that are available to a computer system. This includes hardware components such as the central processing unit or graphics processing unit. __ is essential for memory, storage, processing data, running applications, rendering graphics for visual media and powering cloud computing, among others.

Post Processing

Steps performed after a machine learning model has been run to adjust the output of that model. This can include adjusting a model's outputs and/or using a holdout dataset — data not used in the training of the model — to create a function that is run on the model's predictions to improve fairness or meet business requirements.

Preprocessing

Steps taken to prepare data for a machine learning model, which can include cleaning the data, handling missing values, normalization, feature extraction and encoding categorical variables. Can play a crucial role in improving data quality, mitigating bias, addressing algorithmic fairness concerns, and enhancing the performance and reliability of machine learning algorithms.
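
Three of the steps named above (handling missing values, normalization and encoding categorical variables) can be sketched in a few lines. The helper names and sample values are hypothetical, and mean imputation and min-max scaling are just one choice among several for each step.

```python
def impute_mean(values):
    # Replace missing entries (None) with the mean of the known values.
    known = [v for v in values if v is not None]
    fill = sum(known) / len(known)
    return [fill if v is None else v for v in values]

def min_max_normalize(values):
    # Rescale to the range [0, 1]; assumes the values are not all identical.
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def one_hot(categories):
    # One column per distinct category, in sorted order.
    levels = sorted(set(categories))
    return [[1 if c == level else 0 for level in levels] for c in categories]

ages = impute_mean([20, None, 40])
scaled = min_max_normalize(ages)
colors = one_hot(["red", "blue", "red"])
print(ages, scaled, colors)
```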

Generalization

The ability of a machine learning model to understand the underlying patterns and trends in its training data and apply what it has learned to make predictions or decisions about new, unseen data.

Explainability (XAI)

The ability to describe or provide sufficient information about how an AI system generates a specific output or arrives at a decision in a specific context to a predetermined addressee. Important in maintaining transparency and trust in AI.

Safety

The development of AI systems that are designed to minimize potential harm, including physical harm, to individuals, society, property and the environment.

Transparency

The extent to which information regarding an AI system is made available to stakeholders, including disclosing whether AI is used and explaining how the model works. It implies openness, comprehensibility and accountability in the way AI algorithms function and make decisions.

Parameters

The internal variables that an algorithmic model learns from the training data. They are values that the model adjusts during the training process so it can make predictions on new data. Specific to the architecture of the model. For example, in neural networks, __ are the weights and biases of each neuron in the network.

Entropy

The measure of unpredictability or randomness in a set of data used in machine learning. A higher __ signifies greater uncertainty in predicting outcomes.
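
A common formalization is Shannon entropy, measured in bits. The sketch below assumes that formulation and applies it to invented label lists; the function name is illustrative.

```python
import math
from collections import Counter

def entropy(labels):
    # Shannon entropy in bits: sum over classes of p * log2(1 / p).
    counts = Counter(labels)
    n = len(labels)
    return sum((c / n) * math.log2(n / c) for c in counts.values())

print(entropy(["spam", "spam", "spam", "spam"]))   # one class only
print(entropy(["spam", "ham", "spam", "ham"]))     # evenly split
```

A set with a single class is perfectly predictable and has zero entropy, while an even split between two classes is the least predictable two-class case.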

Accountability

The obligation and responsibility of the creators, operators and regulators of an AI system to ensure the system operates in a manner that is ethical, fair, transparent and compliant with applicable rules and regulations (see fairness and transparency). Ensures that actions, decisions and outcomes of an AI system can be traced back to the entity responsible for it.

Model Training

The point where you train, test, evaluate and retrain different models. This is done to determine the best model to use, and the best settings for that model, in order to achieve a desired outcome for the AI system.

Contestability (Redress)

The principle of ensuring that AI systems and their decision-making processes can be questioned or challenged. This ability to challenge the outcomes, outputs and/or actions of AI systems can help promote transparency and accountability within AI governance.

Oversight

The process of effectively monitoring and supervising an AI system to minimize risks, ensure regulatory compliance and uphold responsible practices. Important for effective AI governance, and mechanisms may include certification processes, conformity assessments and regulatory authorities responsible for enforcement.

Automated Decision Making

The process of making a decision by technological means without human involvement, either in whole or in part.

Bias

There are several types within the AI field. Computational __ is a systematic error or deviation from the true value of a prediction that originates from a model's assumptions or the input data itself. Cognitive __ refers to inaccurate individual judgment or distorted thinking, while societal __ leads to systemic prejudice, favoritism and/or discrimination in favor of or against an individual or group. __ can impact outcomes and pose a risk to individual rights and liberties.

Feature Engineering

Transforming raw data into useful representations (features), using domain knowledge from experts. Performed when defining the features during the development phase of the AI development cycle.

Training Data

A subset of the dataset that is used to train a machine learning model until it can accurately predict outcomes, find patterns or identify structures within the training data.

Validation Data

A subset of the dataset used to assess the performance of the machine learning model during the training phase. __ is used to fine-tune the parameters of a model and prevent overfitting before the final evaluation using the test dataset.

Testing Data

A subset of the dataset used to test and evaluate a trained model. It is used to test the performance of the machine learning model with new data at the very end of the initial model development process and for future upgrades or variations to the model.
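
The training, validation and testing subsets described above are usually produced by shuffling the dataset and slicing it. The 70/15/15 ratio, the fixed seed and the function name below are assumptions for illustration, not a prescribed standard.

```python
import random

def split_dataset(records, seed=0):
    # Shuffle a copy so the original order does not leak into the split.
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * 70 // 100
    n_val = len(shuffled) * 15 // 100
    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]      # whatever remains
    return train, validation, test

train, validation, test = split_dataset(range(20))
print(len(train), len(validation), len(test))
```

The three subsets do not overlap, so the test set stays unseen until the final evaluation.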

Random Forest

A supervised machine learning (see supervised learning) algorithm that builds multiple decision trees and merges them together to get a more accurate and stable prediction. Each decision tree is built with a random subset of the training data (see bootstrap aggregating), hence the name "random forest." Random forests are helpful to use with datasets that are missing values or are very complex.

AI Governance

A system of laws, policies, frameworks, practices and processes at international, national and organizational levels. AI governance helps various stakeholders implement, manage and oversee the use of AI technology. It also helps manage associated risks to ensure AI aligns with stakeholders' objectives, is developed and used responsibly and ethically, and complies with applicable requirements.

Differential Privacy

A technique that protects information about training data from being revealed by "blurring" data points, using an algorithm to generate values that remain meaningful yet nonspecific. "Injecting noise" in this way reduces the utility of the data. The number of queries allowed against the data is typically limited.
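
One widely used way to inject noise is the Laplace mechanism, sketched below for a counting query. The records, the epsilon value and the function names are assumptions for illustration; this is a toy, not a vetted privacy implementation.

```python
import random

def laplace_noise(scale, rng):
    # The difference of two independent exponential draws with the same
    # rate is Laplace-distributed with the given scale.
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)

def private_count(records, predicate, epsilon, rng):
    true_count = sum(1 for r in records if predicate(r))
    # A count changes by at most 1 when one record is added or removed,
    # so its sensitivity is 1 and the noise scale is 1 / epsilon.
    return true_count + laplace_noise(1 / epsilon, rng)

rng = random.Random(0)
ages = [23, 35, 41, 29, 52, 61, 38, 47]
noisy = private_count(ages, lambda a: a >= 40, epsilon=0.5, rng=rng)
print(noisy)
```

A smaller epsilon means more noise and stronger protection but a less useful answer, and every additional query spends more of the privacy budget, which is why the number of queries is limited.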

Adversarial Machine Learning

A technique that raises a safety and security risk to the model and can be seen as an attack. These attacks can be instigated by manipulating the model, such as by introducing malicious or deceptive input data. Such attacks can cause the model to malfunction and generate incorrect or unsafe outputs, which can have significant impacts. For example, manipulating the inputs of a self-driving car may fool the model to perceive a red light as a green one, adversely impacting road safety.

Greedy Algorithms

A type of __ that makes the optimal choice to achieve an immediate objective at a particular step or decision point, based on the available information and without regard for the longer-term optimal solution.
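
Coin change is a standard illustration: at each step the largest coin that still fits is taken. The function name and coin sets below are chosen for the example; the second call shows a case where the locally best choice does not give the best overall answer.

```python
def greedy_change(amount, denominations):
    coins = []
    # Always take the largest coin that still fits, never reconsidering.
    for coin in sorted(denominations, reverse=True):
        while amount >= coin:
            coins.append(coin)
            amount -= coin
    return coins

print(greedy_change(63, [25, 10, 5, 1]))   # greedy happens to be optimal here
print(greedy_change(6, [4, 3, 1]))         # greedy uses 3 coins; 3 + 3 needs only 2
```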

Inference

A type of machine learning process where a trained model (see machine learning model) is used to make predictions or decisions based on input data.

