AWS AI Practitioner

Initial Steps to train a ML model on AWS

- Step 1: Upload the dataset to Amazon S3.
- Step 2: Create a training job in Amazon SageMaker.
- Step 3: Configure the training job to use the dataset from Amazon S3.
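
The three steps can be sketched as a request payload. This is a minimal sketch that assumes the field names of the SageMaker CreateTrainingJob API; the job name, bucket, role ARN, image URI, and instance settings are hypothetical placeholders, and no AWS call is made.

```python
def build_training_job_request(job_name, s3_train_uri, s3_output_uri, role_arn, image_uri):
    """Build a dict shaped like the input of SageMaker's CreateTrainingJob API."""
    return {
        "TrainingJobName": job_name,
        "RoleArn": role_arn,
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,
            "TrainingInputMode": "File",
        },
        # Step 3: point the job at the dataset uploaded to S3 in step 1.
        "InputDataConfig": [
            {
                "ChannelName": "train",
                "DataSource": {
                    "S3DataSource": {
                        "S3DataType": "S3Prefix",
                        "S3Uri": s3_train_uri,
                        "S3DataDistributionType": "FullyReplicated",
                    }
                },
            }
        ],
        "OutputDataConfig": {"S3OutputPath": s3_output_uri},
        # Instance type, count, and volume size are illustrative assumptions.
        "ResourceConfig": {
            "InstanceType": "ml.m5.xlarge",
            "InstanceCount": 1,
            "VolumeSizeInGB": 10,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
    }


# All values below are hypothetical placeholders.
request = build_training_job_request(
    job_name="example-training-job",
    s3_train_uri="s3://example-bucket/train/",
    s3_output_uri="s3://example-bucket/output/",
    role_arn="arn:aws:iam::123456789012:role/ExampleSageMakerRole",
    image_uri="123456789012.dkr.ecr.us-east-1.amazonaws.com/example-image:latest",
)
```

With boto3 installed and credentials configured, step 2 would then be `boto3.client("sagemaker").create_training_job(**request)`.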

foundation model

A large, pre-trained model that can be adapted for multiple tasks.

Retrieval Augmented Generation (RAG)

A method for enhancing model output by integrating retrieved information.
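
A toy sketch of the idea, assuming a simple word-overlap retriever in place of a real vector search. The function names and sample documents are hypothetical, and the prompt would be sent to a model in a real system.

```python
def retrieve(query, documents, top_k=1):
    """Rank documents by how many lowercase words they share with the query."""
    query_words = set(query.lower().split())
    ranked = sorted(
        documents,
        key=lambda doc: len(query_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:top_k]


def build_rag_prompt(query, documents, top_k=1):
    """Place the retrieved text in front of the question before calling a model."""
    context = "\n".join(retrieve(query, documents, top_k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


docs = [
    "Amazon Polly converts text to speech.",
    "Amazon Textract extracts text and data from documents.",
]
prompt = build_rag_prompt("Which service converts text to speech?", docs)
```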

chunking

A technique to split large inputs into smaller, manageable segments.
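
A minimal sketch of word-based chunking with overlap between neighboring chunks. The chunk size and overlap values are arbitrary assumptions; real pipelines often chunk by tokens or sentences instead.

```python
def chunk_text(text, chunk_size=5, overlap=1):
    """Split text into chunks of `chunk_size` words that share `overlap` words."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current chunk already reaches the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks


chunks = chunk_text("one two three four five six seven eight nine", chunk_size=4, overlap=1)
```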

Amazon QuickSight

BI tool for dashboards and reports

Amazon SageMaker Canvas

Build ML models with no code by simply interacting with data and obtaining predictions. Best for nontechnical people.

AWS Cloud Adoption Framework

Six perspectives: Business, People, Governance, Platform, Security, Operations

Batch Inference

Choose for processing large sets of data offline without requiring a persistent endpoint

Amazon Polly

Converts text to speech—important for voice-enabled applications.

Amazon SageMaker Model Cards

Create documents of the lifecycle of ML models. Provide info on training data, performance metrics, and intended use cases

A data scientist is tasked with ensuring that an AI system is designed in a way that prioritizes user trust and comprehension. What does human-centered design mean in this context?

Creating AI systems that prioritize human needs and understanding

RLHF Steps

Supervised fine-tuning, then creating a reward model

Amazon SageMaker

Crucial for building, training, and deploying ML models. You don't need to be an expert, but understanding SageMaker's feature services (like Clarify, Model Monitor, and Ground Truth) is essential.

key difference between data access control and data integrity

Data access control handles user authentication and authorization; data integrity ensures data is accurate and unchanged.

How best to tackle skewed examples in training data?

Data augmentation. Also can create more data from existing data.
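
One simple way to create more data from existing data is random oversampling of the minority class. The sketch below shows that technique only (it duplicates examples rather than transforming them), and the sample data is made up.

```python
import random
from collections import Counter


def oversample(examples, labels, seed=0):
    """Duplicate randomly chosen minority-class examples until classes are balanced."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_examples, out_labels = list(examples), list(labels)
    for label, count in counts.items():
        pool = [x for x, y in zip(examples, labels) if y == label]
        for _ in range(target - count):
            out_examples.append(rng.choice(pool))
            out_labels.append(label)
    return out_examples, out_labels


x, y = oversample(["a", "b", "c", "d", "e"], [0, 0, 0, 0, 1])
balanced = Counter(y)
```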

AWS Glue

Data cleaning service. Extract, transform, and load (ETL) service that can categorize, clean, and transform unstructured data, like medical records, into a structured format

Most important chatbot capabilities for ensuring regulatory compliance

Data protection, Monitoring threat detection

Why use Decision Trees over KNN and SVM?

Decision trees clearly illustrate how different factors influence the outcome, making the predictions easy to understand and interpret.

distinction between discriminative and generative models

Generative models create new data from learned patterns, while discriminative models distinguish between classes in the data.

Transparent Model

Gives insight into decision making

Significant challenges of generative AI

Hallucinations, toxicity, intellectual property, plagiarism, disruption of the nature of work, knowledge cutoff, lack of explainability, security and privacy concerns

Amazon Comprehend

Handles natural language processing (NLP) tasks like sentiment analysis and entity recognition. Finds insights and relationships in text.

Model Monitoring

Helps with Model Explainability, Detecting Drift, and the Model Update Pipeline

Underfitting

High bias and low variance. Bad on both test and training data.

Human vs automatic evaluation

Human evaluation assesses qualitative aspects like interpretability, while automatic evaluation provides quantitative scores using metrics like F1 and BERTScore.

Amazon SageMaker Clarify

Makes a model's predictions transparent and explainable to stakeholders. Can help assess and mitigate bias before and after training.

Serverless Inference in Amazon SageMaker

Ideal for synchronous workloads with spiky traffic patterns that can tolerate latency variations

Amazon Bedrock

A fully managed service that makes high-performing foundation models (FMs) from top AI startups and Amazon available through a common API. Released in 2023 for building scalable AI applications. Expect questions on how Bedrock integrates with modern AI applications, especially generative AI.

Gen AI techniques typically used today?

LLMs and RAG

What factors directly influence the latency of a machine learning model?

Length of the input and output sequences

Examples of supervised learning algorithms

Linear Regression, Neural Network

Overfitting

Low bias and high variance. Bad on test data but good on training data.

MLOps meaning

Machine Learning Operations

Which characteristics of a Generative AI system are relevant for creating personalized product recommendations?

Personalization, Data efficiency, Scalability

Amazon SageMaker Data Wrangler

Prepare and transform data. Data cleansing and handling

Amazon VPC Gateway Endpoint

Privately connect AWS VPC to S3 and DynamoDB without a VPN. Stays within AWS network.

Amazon SageMaker Inference Options

Real Time, Serverless, Asynchronous, Batch

What is a primary difference between real-time inference and batch inference?

Real-time inference provides immediate predictions with low latency, while batch inference processes large data volumes at once with higher latency.

What can evaluate the performance of a model for text summarization?

Recall-Oriented Understudy for Gisting Evaluation-N (ROUGE-N)
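
A minimal sketch of the recall form of ROUGE-N: the share of reference n-grams that also appear in the generated summary. Whitespace tokenization and lowercasing are simplifying assumptions; standard implementations add stemming and also report precision and F1.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count every contiguous n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, reference, n=1):
    """Overlapping n-grams divided by the number of n-grams in the reference."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Counter intersection clips each n-gram count to the smaller of the two.
    overlap = sum((cand & ref).values())
    return overlap / total


score = rouge_n_recall("the cat sat on the mat", "the cat lay on the mat", n=2)
```

Here three of the five reference bigrams ("the cat", "on the", "the mat") appear in the candidate, so the score is 0.6.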

A legal firm has a large collection of scanned legal documents and case files in PDF format. They need a system to automate text extraction, identify key elements like tables and forms, and analyze image content to minimize manual intervention. Which AWS services will most efficiently meet the firm's needs with LEAST operational management?

Rekognition, Textract

Amazon RDS

Relational database service. Supports efficient vector data storage and querying for embeddings from ML models.

A machine learning engineer trained a model on AWS, but the training dataset included confidential information. How can the engineer ensure that the model does not generate inferences containing sensitive data?

Remove confidential data from the training dataset, retrain the model to eliminate traces of sensitive content.

Which two AWS services can provide detailed documentation of the model's training and ensure its predictions are fully explainable for audits?

SageMaker Model Cards, SageMaker Clarify

Amazon Macie

Service that uses machine learning to discover, classify, and protect sensitive data automatically

Amazon SageMaker Feature Store

Share ML features across projects

Amazon S3

Simple Storage Service (S3), a scalable, high-speed, low-cost, web-based cloud storage service designed for online backup and archiving of data and application programs

Amazon OpenSearch Service

Storing embeddings in a vector database for efficient similarity searches

Types of machine learning

Supervised Learning, Unsupervised Learning, Reinforcement Learning

When must you purchase Provisioned Throughput

To use a custom model for inference.

Few-shot prompting

Use a prompt that includes several examples of the task to guide the model in producing accurate responses.
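
A minimal sketch of assembling such a prompt. The sentiment task, the "Review:" and "Sentiment:" labels, and the sample reviews are illustrative assumptions.

```python
def few_shot_prompt(examples, query, instruction="Classify the sentiment as positive or negative."):
    """Put labeled examples before the new input so the model can copy the pattern."""
    lines = [instruction, ""]
    for text, label in examples:
        lines.append(f"Review: {text}")
        lines.append(f"Sentiment: {label}")
        lines.append("")
    # The final entry is left unlabeled for the model to complete.
    lines.append(f"Review: {query}")
    lines.append("Sentiment:")
    return "\n".join(lines)


prompt = few_shot_prompt(
    [("Great battery life.", "positive"), ("Stopped working after a week.", "negative")],
    "The screen is sharp and bright.",
)
```

Passing an empty example list yields a zero-shot prompt with the same layout.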

Real-time Inference in Amazon SageMaker

Use for low-latency workloads with predictable traffic patterns that need consistent latency characteristics and are always available

AWS PrivateLink

Connects a VPC directly to AWS services without exposure to the public internet. Similar to VPC Gateway Endpoint, but works with more services.

How best to address concerns about data quality and model trustworthiness?

Veracity and Robustness, Explainability

Amazon SageMaker

Lets developers create, train, and deploy machine learning (ML) models. Released in 2017.

Context window

amount of text an AI model can handle and respond to at once

Zero-shot prompting

asks the model to perform tasks without any prior examples

Amazon Bedrock Model Evaluation

assess, compare, and choose the most suitable foundational models

Intelligent Document Processing (IDP)

automates data processing using OCR, computer vision, NLP, and machine learning

Amazon SageMaker Model Registry

catalog models, manage model versions, and associate metadata with the models

AWS Security Hub

comprehensive view of your high-priority security alerts and compliance status

Embeddings

convert textual content into numerical vectors; represent data in a high-dimensional space
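
A small sketch of how such vectors are compared, using cosine similarity. The three-dimensional vectors are hand-written stand-ins; real embeddings come from a model and have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest(query_vec, index):
    """Return the key whose stored vector is most similar to the query vector."""
    return max(index, key=lambda name: cosine_similarity(query_vec, index[name]))


# Made-up vectors for illustration only.
index = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.1],
    "car": [0.0, 0.1, 0.9],
}
best = nearest([1.0, 0.0, 0.0], index)
```

A vector database performs the same kind of nearest-neighbor lookup at scale, usually with approximate search.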

Amazon Redshift

data warehouse service for big data analytics

model drift

degradation of model behavior due to changes in the underlying data distribution
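
One rough way to flag a shift in a single numeric feature is to measure how far the live mean has moved from the training-time mean, in units of the baseline standard deviation. This is a simple heuristic under an assumed threshold of 2.0, not how any particular AWS service detects drift, and the sample values are made up.

```python
from statistics import mean, pstdev


def drift_score(baseline, live):
    """Shift of the live mean from the baseline mean, in baseline standard deviations."""
    spread = pstdev(baseline)
    if spread == 0:
        return 0.0 if mean(live) == mean(baseline) else float("inf")
    return abs(mean(live) - mean(baseline)) / spread


def has_drifted(baseline, live, threshold=2.0):
    """Flag drift when the shift exceeds the (assumed) threshold."""
    return drift_score(baseline, live) > threshold


baseline = [10, 12, 11, 9, 10, 11]   # feature values seen at training time
live = [18, 19, 20, 21, 19, 20]      # feature values seen in production
drifted = has_drifted(baseline, live)
```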

AWS Artifact

download and review compliance documents anytime

PartyRock

environment for experimenting with generative AI models

BERTScore

evaluation metric that uses contextual embeddings to compare generated text with a reference text, making it well-suited for assessing the semantic similarity of chatbot responses

Chain-of-thought prompting

guide the reasoning process of a model in a logical sequence

Negative prompting

guiding a generative AI model to avoid certain topics or content

Data efficiency

how well a model can learn from limited data

Discriminative models

learn the boundary between classes

AWS CloudTrail

logs all account activity so you can audit, govern, and ensure compliance and security within your AWS account

Self-refine prompting

model iteratively solves a problem, critiques its own solution, and then revises the solution based on the critique
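
The loop can be sketched with the model calls passed in as plain functions. The `generate`, `critique`, and `revise` callables below are hypothetical stubs standing in for real model invocations, and the round limit is an assumption.

```python
def self_refine(task, generate, critique, revise, max_rounds=3):
    """Draft an answer, then alternate critique and revision until no issue remains."""
    draft = generate(task)
    for _ in range(max_rounds):
        feedback = critique(task, draft)
        if feedback is None:  # the critique found nothing to fix
            break
        draft = revise(task, draft, feedback)
    return draft


# Stub functions for illustration; a real system would call a model in each one.
def generate(task):
    return "2 + 2 = 5"


def critique(task, draft):
    return None if draft.endswith("= 4") else "The sum is wrong."


def revise(task, draft, feedback):
    return "2 + 2 = 4"


answer = self_refine("What is 2 + 2?", generate, critique, revise)
```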

Amazon Q in QuickSight

natural language query feature in Amazon QuickSight that enables users to ask questions in plain language and receive relevant visualizations and dashboards without the need for coding or complex querying.

Instruction-based fine-tuning

pre-trained foundation model is further trained with specific instructions to perform particular tasks

Amazon Bedrock Guardrails

set up protections for your AI applications to meet accuracy and regulatory compliance standards

Amazon Q

tailored for conversational AI applications. Can generate code snippets, manage reference tracking, and monitor open-source license compliance

Amazon Lex

used for building conversational interfaces like chatbots with voice and text.

AZ

Availability Zone. Provides redundancy and isolation from other AZs in a given Region.

Prompt Leaking

AI inappropriately recalls or references previous queries

Data Poisoning

Malicious or misleading data is injected into the training set, so the AI's response includes harmful or misleading content

Which AWS service is primarily responsible for managing user access and permissions to secure AI resources?

AWS Identity and Access Management (IAM)

Amazon Rekognition

AWS service for image and video analysis, such as facial recognition, object detection, and image moderation tasks. Has a Content Moderation feature.

Amazon SageMaker JumpStart

Accelerate your AI journey with pre-trained models and pre-built solutions

Why use Accuracy over Precision or Recall?

Accuracy gives the whole picture where Recall and Precision give partial pictures.
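
A short sketch of how the three metrics are computed from confusion-matrix counts. The counts are made up; accuracy uses all four cells, while precision and recall each use only two.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, and recall from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        # Accuracy: share of all predictions that were correct.
        "accuracy": (tp + tn) / total if total else 0.0,
        # Precision: share of predicted positives that were truly positive.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Recall: share of actual positives that the model found.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


metrics = classification_metrics(tp=40, fp=10, fn=10, tn=40)
```

On a balanced dataset like this one the three values agree; on heavily imbalanced data they can differ sharply.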

AWS tools to evaluate a model's performance and integrate human review

Amazon SageMaker Model Monitor and Amazon A2I (Amazon Augmented AI)

Which AWS machine learning services can detect and read text from images?

Amazon Textract, Amazon Rekognition

Amazon Inspector

Analyze security of EC2 instances by identifying potential vulnerabilities

Best AWS vector search services

DocumentDB, OpenSearch, Neptune ML, Aurora (PostgreSQL-compatible), MemoryDB

Amazon Personalize

Easily generate personalized recommendations.

Amazon EFS stands for what

Elastic File System. When you need a traditional file system.

A data engineer uses an Amazon Bedrock base model to analyze chat interactions for customer support. To track model input and output data for monitoring, which strategy should the engineer implement?

Enable invocation logging directly in Amazon Bedrock.

Best way to evaluate Amazon Bedrock models for business value?

Evaluate the models using a human workforce and custom prompt datasets.

Amazon Textract

Extract text and data from documents

Core principles of responsible AI

Fairness, Explainability, Privacy & Security, Veracity & Robustness, Controllability, Transparency, Governance, Safety

The Batch Transform Inference is best for what

Make predictions in batches where immediate access is not required.

Role of Agents within Amazon Bedrock

Managing complex AI workflows with multiple steps.

AWS Regions and AZs

Most AWS Regions have a minimum of three Availability Zones (AZs), though some exceptions exist.

Federal solution outlines standards to protect the confidentiality, integrity, and availability of the data accessed by the AI.

NIST AI RMF

A financial technology startup is developing an innovative tool to predict stock market trends. The tool analyzes vast amounts of historical stock data to forecast future price movements. Which of the following statements accurately describes the neural networks in this financial application?

Neural networks are utilized as deep learning models that simulate the human brain's pattern recognition capabilities, learning from historical financial data to anticipate future stock market trends.

Best pricing model for pay for what you consume?

On-Demand

