AIGP
Fuzzy Rules
"If-Then" stmts to express r-ships b/w variables. Ex: If temp "hot", then fan "high".
3 Ecosystem Harm Types
(1) Emissions; (2) Energy; (3) Water
3 AI Governance Model Types
CHD: (1) Centralized; (2) Hybrid; (3) Decentralized (Local).
Misinformation
False A/V content, info or synthetic data that is unintentionally misleading. Can be spread via deepfakes.
Large Language Models (LLMs)
Form of AI using DL a-rithms to create models pre-trained on massive text datasets for gen. purpose of language learning to analyze/learn patterns & r-ships among characters, words & phrases. Generally 2 LLM types: generative models predicting text based on probabilities of word sequences & discriminative models making classification predictions based on probabilities of data features & weights. The term "large" generally refers to model's capacity measured by # of parameters & to enormous datasets it's trained on.
Cognitive Bias
Inaccurate individual judgment or distorted thinking.
Parameters
Internal variables an a-rithmic model learns from training data. They're values model adjusts to during training so it can make predictions on new data & are specific to model's architecture. Ex: in NNs, they're weights/biases of each neuron in the network.
3 Elements - Expert Systems
KI-UI: (1) Knowledge base; (2) Inference engine; (3) User Interface.
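Sketch of the 3 elements in Python, assuming a toy car-troubleshooting domain. The rules, fact names & the `infer` helper are hypothetical illustrations, not taken from any real expert-system product.

```python
# (1) Knowledge base: hypothetical if-then rules as (conditions, conclusion).
KNOWLEDGE_BASE = [
    ({"engine_wont_start", "lights_dim"}, "battery_weak"),
    ({"battery_weak"}, "recommend_charge_battery"),
]


def infer(observed_facts):
    """(2) Inference engine: forward-chain until no rule adds a new fact."""
    facts = set(observed_facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in KNOWLEDGE_BASE:
            if conditions <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts


# (3) User interface: here just a printed answer for the user's observations.
result = infer({"engine_wont_start", "lights_dim"})
print(sorted(result))
```

The inference engine is kept separate from the rules, so the knowledge base can grow w/out changing the engine code.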
Autonomy
Level of independence to achieve objectives.
Overfitting
Model OK for training data but not new data.
Classification Model
Model sorts input into diff. categories/classes (classifiers).
Decision Tree
Supv. learning model type that represents decisions & potential consequences in branching structure.
5 Vs of Data Wrangling
VALUE VARIETY VELOCITY VERACITY VOLUME
Testing Data
A dataset subset used to test & evaluate a trained model. Used to test perf. of ML model w/new data at the very end of the initial model development process & for future upgrades or variations to the model.
Training Data
A dataset subset used to train a ML model until it can accurately predict outcomes, find patterns or ID structures w/in the training data.
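Sketch of splitting one dataset into training & testing subsets in Python. The `train_test_split` helper, the 80/20 ratio & the fixed seed are assumptions for illustration only.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of the rows, then hold out a fraction for testing."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    # Training rows come first in the return value, held-out test rows second.
    return shuffled[n_test:], shuffled[:n_test]


rows = list(range(10))  # stand-in for 10 labeled examples
train, test = train_test_split(rows)
print(len(train), len(test))  # 8 2
```

The test rows are never shown to the model during training, so they can stand in for new data.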
Greedy Algorithms
A-rithm chooses to achieve immediate goal per available info & w/no regard for long-term optimal solution.
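Sketch of a greedy a-rithm in Python, using coin change as an assumed example. The coin set (1, 3, 4) is chosen on purpose so the greedy pick is not the best overall answer.

```python
def greedy_change(amount, coins):
    """Always take the largest coin that still fits (the immediate best move)."""
    picked = []
    for coin in sorted(coins, reverse=True):
        while amount >= coin:
            amount -= coin
            picked.append(coin)
    return picked


print(greedy_change(6, [1, 3, 4]))  # [4, 1, 1]
```

Greedy uses 3 coins (4 + 1 + 1), while the long-term optimal solution uses 2 (3 + 3).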
Disinformation
A/V content, info & synthetic data intentionally manipulated/created to cause harm. Can spread via deepfakes.
Deepfakes
A/V generated/altered to portray diff. reality (is % true?).
Fairness
AI attribute prioritizing relatively equal treatment of individuals/groups in decisions/actions in consistent, accurate manner. Each model must ID appropriate std of fairness that best applies, but gen. means AI decisions shouldn't adversely impact individuals/groups (directly or disparately) based on sensitive attributes like race/gender/religion.
Accountability
AI creators/operators/regulators' obligation & resp. to ensure AI operates in ethical, fair, transparent & compliant way (w/appl. rules & regs). Ensures AI's actions, decisions & outcomes can be traced back to resp. entity.
Chatbot
AI designed to simulate human-like conversations & interactions. Uses NLP & DL to u-std & respond to text/other media. Often used for customer service & other help applications; may ingest users' personal data.
Computer Vision
AI enabling computers to process/analyze images, videos & other visual inputs.
Machine Learning (ML)
AI subfield involving algorithms enabling computer systems to iteratively learn from & make decisions, inferences or predictions based on input data. A-rithms build model from training data to perf. specific task on new data w/out being explicitly programmed to do so. ML implements various a-rithms that learn/improve by experience in problem-solving process incl. data cleansing, feature selection, training, testing & validation. Companies & govt. agencies deploy ML algorithms for tasks such as fraud detection, recommender systems, customer inquiries, health care, or transport & logistics.
Natural Language Processing
AI subfield that helps computers understand, interpret & manipulate human language by transforming info into content. Enables machines to read text/spoken language, interpret its meaning, measure sentiment & determine which parts are important for understanding.
Reliability
AI system attribute ensuring AI behaves as expected & performs intended function consistently & accurately, even w/new data it wasn't trained on.
Robustness
AI system attribute ensuring a resilient system that maintains its functionality & performs accurately in a variety of environments & circumstances, even when faced with changed inputs or adversarial attacks.
Generative AI
AI using DL trained on large datasets to create new content (ex: text, code, pics/videos, music, simulations). Unlike discriminative models, GenAI makes predictions on existing data, not new data, & can create new outputs based on input data/user prompts.
Artificial General Intelligence (AGI)
AI w/human-level IQ & strong generalization capability to achieve goals & carry out variety of tasks in diff. contexts/environments. AGI still theoretical research. Contrasted w/"narrow" AI used for specific tasks/problems.
Deep Learning
AI/ML subfield using artificial NNs. Esp. useful in fields where raw data must be processed (image/speech recog, NLP).
Active Learning
AI/ML subfield where an a-rithm can select some of data it learns from. Instead of just learning from all data given, AL model requests add'l data points to help it learn.
Generalization
Ability of model to u-std underlying patterns & trends in training data & apply what it learned to make predictions or decisions about new, unseen data.
Explainability (XAI)
Ability to describe/give sufficient info about how AI generates specific output/arrives at decision in specific context to predetermined addressee. Important in maintaining transparency & trust in AI.
Training Data Poisoning
Alter training data w/bad data to degrade model perf. [Note: Hackers can use AI to hack other AI models.]
Conformity Assessment (Audit)
Analysis, often by 3rd pty, on AI to det. if requirements, such as est. risk-mgmt system, data gov., recordkeeping, transparency & cybersecurity practices were met.
AI Deployment - 7 Risks to Test/Validate
BAR-PISR: Bias Accuracy Robustness - Privacy Interpretability Safety Reliability
Societal Bias
Broad bias leads to systemic prejudice, favoritism &/or discrimination in favor of/against an individual/group.
8 FIPPs/OECD Principles
CUP-SAID-O: (1) Collection Limitation/Data Minimization; (2) Use Limitation; (3) Purpose Specification; (4) Security Safeguards; (5) Accountability; (6) Individual Participation/Access; (7) Data Quality/Relevance; (8) Openness/Notice.
5 OECD AI Systems Classification Categories
DEPTA: (1) Data & Input; (2) Economic Context; (3) People & Planet (**incl. PRIVACY**); (4) Tasks & Output; (5) AI Model.
Spillover Data
Data coll. on ppl not target of purpose (ex: broad surv.).
Exploratory Data Analysis
Data discovery process techniques occurring *before* training ML model to gain preliminary insights into dataset, such as identifying patterns, outliers & anomalies & finding relationships among variables.
Synthetic Data
Data generated by a system/model that mimics & resembles structure & statistical properties of real data. Often used to test/train ML models, esp. where real-world data is limited, unavailable or too sensitive to use.
Input Data
Data provided to or directly acquired by a learning algorithm or ML model to produce an output. Forms basis on which ML model will learn, make predictions and/or carry out tasks.
Safety
Development of AI designed to minimize potential harm, incl. physical harm, to individuals, society, property & the environment.
Artificial Intelligence
Engineered system using various computational techniques to perf./automate tasks. May incl. techniques (ex: ML) where machines learn from experience, adjust to new inputs & can perf. tasks previously done by humans. Also compsci field dedicated to simulating intelligent behavior in computers - may incl. automated dec-mkg.
Contestability (Redress)
Ensure AI systems & decision-making processes can be questioned/challenged. Ability to contest/challenge AI's outcomes, outputs &/or actions can help promote transparency & accountability w/in AI governance.
4 Fuzzy Logic Steps
F-READ: (1) Fuzzification; (2) Rule Evaluation; (3) Aggregation; (4) Defuzzification.
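Sketch of the 4 steps in Python for a temp-to-fan-speed controller. The membership ramp (15 to 35 C), the 2 rules & the weighted-average defuzzification are assumptions picked for brevity; real fuzzy systems often use other membership shapes & centroid defuzzification.

```python
def fuzzify(temp_c):
    """Step 1: map a crisp temperature to membership degrees (0..1)."""
    # Hypothetical membership function: linear ramp between 15 and 35 C.
    hot = min(1.0, max(0.0, (temp_c - 15) / 20))
    return {"cold": 1.0 - hot, "hot": hot}


# Hypothetical rules: "if temp is <label>, then fan speed is <percent>".
RULES = {"cold": 20.0, "hot": 90.0}


def fan_speed(temp_c):
    memberships = fuzzify(temp_c)
    # Step 2: rule evaluation - each rule fires with its condition's strength.
    fired = [(memberships[label], speed) for label, speed in RULES.items()]
    # Step 3: aggregation - combine all fired rules into weighted totals.
    total_strength = sum(strength for strength, _ in fired)
    weighted_sum = sum(strength * speed for strength, speed in fired)
    # Step 4: defuzzification - collapse to one crisp value (weighted average).
    return weighted_sum / total_strength


print(fan_speed(15))  # 20.0 (fully "cold")
print(fan_speed(25))  # 55.0 (half "cold", half "hot")
print(fan_speed(35))  # 90.0 (fully "hot")
```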
5 Areas Potentially Impacted by Core AI Harms
G-SICE: (1) Groups; (2) Society; (3) Individuals; (4) Companies/Institutions; (5) Ecosystems
Hallucinations
GenAI output contradicts source/factually wrong but appears to state facts.
Filter Bubbles / Echo Chambers
GenAI repeats back what user told it/already believes.
Clustering
Groups data by common attributes (ex: DNA patterns).
3 Trustworthy AI Attributes
HAT: (1) Human-centric; (2) Accountability; (3) Transparency.
Risk Assessment Formula
Harm Severity X Occurrence Probability
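Worked example of the formula in Python. The risk names, the 1-5 severity scale & the probabilities are hypothetical values used only to show the arithmetic.

```python
def risk_score(severity, probability):
    """Risk = harm severity x occurrence probability."""
    return severity * probability


# Hypothetical risks: (severity on a 1-5 scale, probability from 0 to 1).
risks = {
    "biased_hiring_output": (5, 0.2),
    "chatbot_typo": (1, 0.9),
    "data_breach": (4, 0.5),
}
scores = {name: risk_score(sev, prob) for name, (sev, prob) in risks.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)  # ['data_breach', 'biased_hiring_output', 'chatbot_typo']
```

A likely but trivial harm (chatbot_typo) ranks below a less likely but severe one, which is the point of multiplying both factors.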
4 Stages of AI
IPTO: Ingestion, Preparation, Training, Output
4 Responsible AI Operationalization Steps
ISP-O: (1) Inventory; (2) Standards; (3) Playbooks; (4) Org Structures.
Association Rule Learning
Identifies r-ships b/w data points (ex: detect fraud, predict buying habits, detect anomalies for mechanical faults, genetics, segment consumers for mktg).
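Sketch of 2 common association-rule measures (support & confidence) in Python, for the buying-habits case. The shopping baskets & helper names are hypothetical.

```python
# Hypothetical shopping baskets, one set of items per transaction.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset):
    """Share of transactions that contain every item in the itemset."""
    hits = sum(1 for basket in transactions if itemset <= basket)
    return hits / len(transactions)


def confidence(antecedent, consequent):
    """Of the baskets with the antecedent, the share that also has the consequent."""
    return support(antecedent | consequent) / support(antecedent)


print(support({"bread", "milk"}))                   # 0.5
print(round(confidence({"bread"}, {"milk"}), 2))    # 0.67
```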
Corpus
Large collection of texts/data computer uses to find patterns, make predictions or generate specific outcomes. May incl. structured/unstructured data & cover specific topic/variety of topics.
Foundation Model
Large-scale, pretrained model w/AI capabilities, such as language, vision, robotics, reasoning, search or human interaction, that can function as base for use-specific apps. Model trained on extensive and diverse datasets.
Machine Learning Model
Learned representation of underlying data patterns & relationships, created by applying AI algorithm to a training dataset. Model can then make predictions or perform tasks on new, unseen data.
Bootstrap Aggregating (Bagging)
ML method aggregating multiple model versions trained on random subsets of a dataset; goal - make model more stable & accurate.
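Sketch of bagging in Python. To stay short, each "model" here is just the median of its bootstrap sample (an assumption; in practice the base model is usually a decision tree or similar). The data values, `n_models` & seed are hypothetical.

```python
import random
import statistics


def bootstrap_sample(data, rng):
    """Random subset drawn with replacement, same size as the original."""
    return [rng.choice(data) for _ in data]


def bagged_estimate(data, n_models=25, seed=0):
    rng = random.Random(seed)
    predictions = []
    for _ in range(n_models):
        sample = bootstrap_sample(data, rng)
        # "Train" one simple model on this sample: its median.
        predictions.append(statistics.median(sample))
    # Aggregate: average the predictions of all model versions.
    return statistics.mean(predictions)


data = [12, 15, 14, 13, 90, 16, 15]  # one outlier (90)
estimate = bagged_estimate(data)
print(estimate)
```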
Reinforcement Learning
ML method to train model to optimize its actions w/in a given environment to achieve a specific goal, guided by reward/penalty feedback. Training often conducted via trial-&-error interactions or simulated experiences that don't need external data. Ex: a-rithm can be trained to earn a high score in a video game by having its efforts evaluated & rated per success toward the goal.
Neural Networks
ML model type mimicking how brain neurons interact w/multiple processing layers, incl. at least 1 hidden layer. Layered approach enables NNs to model complex nonlinear relationships & patterns w/in data. Artificial NNs have range of applications (ex: image recognition & medical diagnosis).
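Sketch of a tiny NN forward pass in Python w/1 hidden layer. The weights & biases (the model's parameters) are hand-set here as an assumption so the example stays short; a real NN would learn them from training data. The network computes XOR, a nonlinear r-ship that needs the hidden layer.

```python
def relu(x):
    return max(0.0, x)


def dense(inputs, weights, biases, activation):
    """One layer: weighted sum plus bias per neuron, then the activation."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


# Hypothetical hand-set parameters (normally learned during training).
HIDDEN_W = [[1.0, 1.0], [1.0, 1.0]]
HIDDEN_B = [0.0, -1.0]
OUT_W = [[1.0, -2.0]]
OUT_B = [0.0]


def forward(x1, x2):
    hidden = dense([x1, x2], HIDDEN_W, HIDDEN_B, relu)
    return dense(hidden, OUT_W, OUT_B, lambda v: v)[0]


for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, forward(a, b))  # outputs 0.0, 1.0, 1.0, 0.0
```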
Transfer Learning Model
ML model type where an a-rithm learns to perform one task, such as recognizing cats, and then uses that learned knowledge as a basis when learning a different but related task, such as recognizing dogs.
Inference
ML process where trained model makes predictions or decisions based on input data.
Semi-supervised Learning
ML subset combining supervised & unsupervised learning by training model on a large amount of unlabeled data & a small amount of labeled data. Avoids challenges of finding large amounts of labeled data for training. GenAI commonly relies on semi-supervised learning.
Unsupervised Learning
ML subset where model is trained by looking for patterns in unclassified dataset w/minimal human supervision. AI is given preexisting unlabeled datasets & analyzes them for patterns. This learning type is useful for training an AI for techniques such as clustering data (outlier detection, etc.) & dimensionality reduction (feature learning, principal component analysis, etc.).
Adversarial Machine Learning
ML technique w/AI safety/security risk; can be seen as attack. Can be done by manipulating model (adding malicious or deceptive input). Causes AI to malfunction & generate incorrect/unsafe outputs, which can have significant impacts. Ex: manipulating self-driving car inputs - AI may see red lights as green.
Discriminative Model
ML that directly maps input features to class labels & analyzes for patterns to help distinguish b/w different classes. Often used for text classification (ID languages). Ex: traditional NNs, decision trees & random forests.
Supervised Training
ML where model trained on labeled input data w/known desired outputs. 2 data groups sometimes called predictors & targets, or independent & dependent variables. Useful for classification or regression.
Transparency
Extent to which info about an AI system is made available to s-holders, incl. disclosing whether AI is used & explaining how it works. Implies openness, comprehensibility & accountability in way AI a-rithms function & make decisions.
Automated Decision Making
Decision made by tech w/out any human involvement.
Entropy
Measure of unpredictability/randomness in ML dataset. > entropy = > uncertainty in predicting outcomes.
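Sketch of Shannon entropy (in bits) over a dataset's labels in Python. The `entropy` helper & the spam/ham labels are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy in bits: sum of p * log2(1 / p) over label shares."""
    counts = Counter(labels)
    n = len(labels)
    return sum((c / n) * math.log2(n / c) for c in counts.values())


print(entropy(["spam", "spam", "spam", "spam"]))  # 0.0 (fully predictable)
print(entropy(["spam", "spam", "ham", "ham"]))    # 1.0 (coin-flip uncertainty)
```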
Expert Systems
Mimics human expertise in specific field. Infers from knowledge base, replicates expert judgment/behavior to help (not repl.) humans. Ex: Cancer tumor diagnosis. Deployed x-sectors (Ag, $$$, med, engineering).
Multimodal Models
Model can process >1 input/output data type (or 'modality') at same time. Ex: Can take in an image & text caption & produce unimodal output score indicating how well text describes image. Highly versatile & useful in variety of tasks (image captioning & speech recognition).
Underfitting
Model fails to fully capture training data complexity. May result in poor predictive ability and/or inaccurate outputs. Factors leading to underfitting: too few parameters, too high a regularization rate, or an inappropriate or insufficient set of features in training data.
Federated Learning
Models train on local data of multiple edge devices/servers. Local model updates (not training data) sent to central location & aggregated into global model; process repeats until global model fully trained. Enables better privacy/security for individual user data.
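Sketch of federated averaging in Python for a 1-parameter model (y ~ w * x). The client datasets, learning rate & round count are hypothetical, & the server uses a plain unweighted average (a simplification; weighting by each client's data size is common).

```python
def local_update(w, data, lr=0.01):
    """One gradient-descent step on a client's private data for y ~ w * x."""
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    return w - lr * grad


def federated_round(global_w, clients):
    # Each client trains locally; only the updated weight leaves the device.
    local_weights = [local_update(global_w, data) for data in clients]
    # Central server aggregates the updates into the new global model.
    return sum(local_weights) / len(local_weights)


# Hypothetical private datasets on 3 devices (all follow y = 2x).
clients = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(3.0, 6.0)],
    [(1.5, 3.0), (2.5, 5.0)],
]

w = 0.0
for _ in range(200):  # repeat rounds until the global model settles
    w = federated_round(w, clients)
print(round(w, 3))  # 2.0
```

The raw (x, y) pairs never appear in `federated_round`; it only sees weights.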
Robotics
Multidisciplinary field incl. design, construction, operation & programming of robots. Robotics allow AI systems & s/w to interact w/physical world.
Linguistic Variables
NL concepts, ex: L/M/H, warm/hot/boiling.
Transformer Model
NN architecture that learns context & maintains r-ships b/w sequence data (ex: words in a sentence). It leverages technique of attention, i.e. it focuses on most important & relevant parts of input sequence. This helps to improve model accuracy. Ex: in language-learning tasks, by attending to surrounding words, model can comprehend the meaning of a word in context of whole sentence.
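Sketch of the attention technique (scaled dot-product form) in pure Python. The 2-dimensional query/key/value vectors are hypothetical toy values; real transformers learn these vectors & run many attention heads in parallel.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(query, keys, values):
    scale = math.sqrt(len(query))
    # Relevance of each sequence position to the query (scaled dot product).
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    # Output: weighted mix of the values, focused on the most relevant parts.
    dim = len(values[0])
    output = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return weights, output


query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
values = [[10.0, 0.0], [0.0, 10.0], [-10.0, 0.0]]
weights, output = attention(query, keys, values)
print([round(w, 3) for w in weights])
```

The key most similar to the query gets the largest weight, so its value dominates the output.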
Challenger Model
New model to test/compare against 1st one ("Champion").
Champion Model
Orig. model; 2nd one ("Challenger") may be used to test against it.
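Sketch of a champion/challenger comparison in Python. Both "models" are hypothetical threshold rules & the holdout rows are made up; the point is only that both are scored on the same unseen data.

```python
def accuracy(model, rows):
    """Share of holdout rows where the model's prediction matches the label."""
    correct = sum(1 for value, label in rows if model(value) == label)
    return correct / len(rows)


def champion(score):
    return score >= 50  # original model in production


def challenger(score):
    return score >= 40  # new candidate model


# Hypothetical holdout data: (input score, true label).
holdout = [(30, False), (45, True), (55, True), (42, True), (35, False)]

champion_acc = accuracy(champion, holdout)
challenger_acc = accuracy(challenger, holdout)
print(champion_acc, challenger_acc)  # 0.6 1.0
```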
Narrow/Weak AI
Perf. single/narrow set of related tasks at > proficiency. Narrow constraints/ltd scope. >productivity/eff. -- a-mate repet. tasks, enable smarter d-mkg & optimize by trend analysis (chess). Embed in > sectors ($$$, mfg, med, c-svc) to benefit orgs/users.
Regression
Predicts continuous value per past data (ex: stock price).
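Sketch of simple linear regression (ordinary least squares, 1 input) in Python. The past data points & the `fit_line` helper are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y ~ slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / spread
    return slope, mean_y - slope * mean_x


# Hypothetical past data: day number vs. price.
days = [1, 2, 3, 4]
prices = [3, 5, 7, 9]
slope, intercept = fit_line(days, prices)
print(slope, intercept)       # 2.0 1.0
print(slope * 5 + intercept)  # 11.0 (predicted continuous value for day 5)
```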
Algorithm
Procedure/set of instructions & rules designed to perf. a specific task/solve particular problem, using a computer.
Oversight
Process of effectively monitoring & supervising an AI system to minimize risks, ensure regulatory compliance & uphold responsible practices. Important for effective AI governance & mechanisms may include certification processes, conformity assessments & regulatory authorities responsible for enforcement.
Compute
Processing resources avail. to computer system, incl h/w such as CPU/GPU. Essential for memory, storage, processing data, running apps, rendering graphics for visual media, powering cloud computing, etc.
3 AI Overall Risk Categories
SO-PB: Security/Operational Privacy Business
Platform
Software to develop/test/deploy/refresh AI applications.
Classification (output)
Sorts output into specific categories (ex: puppy pics).
Variance
Statistical measure showing how far set of #s spread out from avg value in dataset. > variance = data points spread widely around mean. < variance = data points close to mean. In ML, > variance can lead to overfitting. Trade-off b/w variance & bias - fundamental ML concept. Increasing model complexity tends to < bias but > variance. Decreasing complexity tends to < variance but > bias.
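Worked example of the statistical measure in Python (population variance). The 2 sample lists are hypothetical & share the same mean (10) so only the spread differs.

```python
def variance(values):
    """Population variance: mean squared distance from the average."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


tight = [9, 10, 10, 11]  # points close to the mean
wide = [1, 5, 15, 19]    # points spread widely around the mean
print(variance(tight))  # 0.5
print(variance(wide))   # 53.0
```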
Post-processing
Steps performed *after* ML model has been run to adjust its output. Can include adjusting a model's outputs &/or using holdout dataset (data not used in model training) to create a function run on the model's predictions to improve fairness or meet business requirements.
Preprocessing
Steps taken to prepare data for ML model, which can incl. cleaning data, handling missing values, normalization, feature extraction & encoding categorical variables. Data preprocessing can play a crucial role in improving data quality, mitigating bias, addressing algorithmic fairness concerns & enhancing perf. & reliability of ML algorithms.
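Sketch of 3 of the preprocessing steps in Python: handling missing values, normalization & encoding categorical variables. Helper names & sample values are hypothetical; mean imputation & min-max scaling are just 1 choice each among several.

```python
def impute_mean(values):
    """Handle missing values: replace None with the mean of known values."""
    known = [v for v in values if v is not None]
    mean = sum(known) / len(known)
    return [mean if v is None else v for v in values]


def min_max(values):
    """Normalization: rescale values to the 0..1 range."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(values):
    """Encode a categorical variable as 0/1 columns, one per category."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]


ages = [20, None, 40]
print(min_max(impute_mean(ages)))      # [0.0, 0.5, 1.0]
print(one_hot(["red", "blue", "red"]))  # [[0, 1], [1, 0], [0, 1]]
```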
Neural Networks
Structure based on human brain - neuron-like nodes in layers. Continuously improves ability to find right answer through training. Doesn't need explicit programming to make complex nonlinear inferences in unstructured data.
Random Forest
Supervised ML a-rithm building multiple decision trees & merging them together for more accurate/stable prediction. Each decision tree built w/a random training data subset (bootstrap aggregating), hence "random forest." Helpful to use w/datasets that are missing values or are very complex.
AI Governance
System of laws, policies, frameworks, practices & processes at internat'l, nat'l & org levels. Helps various s-holders implement, manage & oversee AI tech use, + manage associated risks to ensure AI aligns w/ s-holder objectives, is developed/used responsibly & ethically, and complies w/applicable requirements.
Computational Bias
Systematic error/deviation from true value of a prediction originating from model's assumptions or the input data.
4 Elements of AI
TA-HI-O: (1) Technology; (2) Autonomy; (3) Human Involvement; (4) Output.
Turing Test
Test of machine ability to exhibit intelligent behavior equivalent to, or indistinguishable from, a human. Alan Turing (1912-1954) originally considered it as an AI's ability to converse via text, such that a human reader couldn't discern a computer-generated response from a human's.
Data Leakage
Unauth. data disclosure to 3rd party (common w/fed. learning).
Clustering Algorithms
Unsupervised ML method - data patterns IDd/evaluated, & data points grouped into clusters based on similarity.
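Sketch of 1 clustering a-rithm (k-means, on 1-D data) in Python. The data points, starting centers & iteration count are hypothetical; no labels are given to the a-rithm.

```python
def kmeans_1d(points, centers, iterations=10):
    """Alternate between assigning points and re-centering each cluster."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            # Similarity here = distance to a cluster center; pick the nearest.
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Move each center to the mean of its points (keep it if cluster empty).
        centers = [
            sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)
        ]
    return centers, clusters


points = [1.0, 1.5, 2.0, 9.0, 9.5, 10.0]
centers, clusters = kmeans_1d(points, centers=[0.0, 5.0])
print(centers)   # [1.5, 9.5]
print(clusters)  # [[1.0, 1.5, 2.0], [9.0, 9.5, 10.0]]
```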
Trustworthy AI
Used interchangeably w/responsible AI & ethical AI, all referring to principle-based AI governance & dev., incl. principles of security, safety, transparency, explainability, accountability, privacy, nondiscrimination/nonbias, among others.