Stephane Marek - AWS Certified AI Practitioner - Failed Questions
- When temperature is set to 1. Temperature is a value between 0 and 1 that regulates the creativity of the model's responses. Use a lower temperature for more deterministic responses, and a higher temperature for more creative or varied responses to the same prompt on Amazon Bedrock. Higher temperatures are also where hallucinated responses become more likely.
A company is using the Amazon Titan Text model with Amazon Bedrock. In which scenario is the model most likely to hallucinate?
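To see why a higher temperature spreads probability onto less likely tokens, here is a minimal sketch of temperature-scaled softmax. The logits are made-up scores for three candidate tokens, and the function name is hypothetical; it is not how Amazon Bedrock or Titan is implemented internally, only the usual textbook formulation. The sketch requires a positive temperature.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("this sketch requires a positive temperature")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate next tokens.
logits = [2.0, 1.0, 0.1]
low = softmax_with_temperature(logits, 0.2)   # sharply peaked on the top token
high = softmax_with_temperature(logits, 1.0)  # flatter, unlikely tokens gain mass
```

At the low temperature almost all probability sits on the first token, while at temperature 1 the least likely token has a noticeably larger share, which is where off-target output can creep in.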
SageMaker Ground Truth enables the creation of high-quality labeled datasets by incorporating human feedback in the labeling process, which can be used to improve reinforcement learning models.
A company developing AI-powered customer service chatbots is exploring ways to improve the quality and accuracy of responses using Reinforcement Learning from Human Feedback (RLHF). The data science team is considering using Amazon SageMaker Ground Truth to assist with gathering and processing human feedback during model training. To ensure this solution aligns with their needs, they want to understand how SageMaker Ground Truth supports the key capabilities required for implementing RLHF, such as collecting, labeling, and managing human input effectively. What do you suggest?
Top P. Top P represents the percentage of most likely candidates that the model considers for the next token. Choose a lower value to decrease the size of the pool and limit the options to more likely outputs. Choose a higher value to increase the size of the pool and allow the model to consider less likely outputs.
A company is using Amazon Bedrock and it wants to regulate the percentage of most-likely candidates considered for the next word in the model's output. Which of the following inference parameters would you recommend for the given use case?
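A rough sketch of how a Top P cutoff shrinks or grows the candidate pool. It uses the common "nucleus" formulation (keep the most likely tokens until their cumulative probability reaches the threshold), which is an assumption here rather than a statement about Bedrock internals; the token probabilities and the function name are invented for illustration.

```python
def top_p_filter(probs, top_p):
    """Keep the most likely tokens until their cumulative probability reaches top_p."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, cumulative = [], 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(p for _, p in kept)
    # Renormalize so the surviving candidates sum to 1.
    return {token: p / total for token, p in kept}

# Hypothetical next-token distribution.
probs = {"cat": 0.5, "dog": 0.3, "fish": 0.15, "rock": 0.05}
narrow = top_p_filter(probs, 0.5)  # small pool: only the most likely token
wide = top_p_filter(probs, 0.9)    # larger pool: less likely tokens allowed
```

With the lower value only "cat" survives; with the higher value "dog" and "fish" join the pool, while "rock" is still excluded.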
Response length. Response length represents the minimum or maximum number of tokens to return in the generated response. The other parameters: Stop sequences specify the sequences of characters that stop the model from generating further tokens; if the model generates a stop sequence that you specify, it will stop generating after that sequence. Top P represents the percentage of most likely candidates that the model considers for the next token. Top K represents the number of most likely candidates that the model considers for the next token.
A company is using Amazon Bedrock and it wants to set an upper limit on the number of tokens returned in the model's response. Which of the following inference parameters would you recommend for the given use case?
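The difference between a response-length cap and a stop sequence can be sketched with a toy generation loop. The token stream and the function name are hypothetical, and whether a real service returns the stop sequence itself as part of the output may differ from this sketch.

```python
def generate_with_limits(token_stream, max_tokens, stop_sequences):
    """Collect tokens until the token cap is hit or a stop sequence appears."""
    output = []
    for token in token_stream:
        if len(output) >= max_tokens:
            break  # response length: upper limit on returned tokens
        output.append(token)
        text = "".join(output)
        if any(text.endswith(stop) for stop in stop_sequences):
            break  # stop sequence: nothing is generated after it
    return "".join(output)

# Hypothetical stream of tokens a model might emit.
stream = ["Hello", ",", " world", ".", " More", " text"]
capped = generate_with_limits(stream, 3, [])
stopped = generate_with_limits(stream, 10, ["."])
```

Here `capped` ends after three tokens regardless of content, while `stopped` ends at the first period even though the cap was not reached.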
The company should use a VPC endpoint for Amazon S3 that allows secure, private connectivity between the VPC and Amazon S3, without the need for an internet connection, ensuring data is transferred securely within the AWS network. A VPC endpoint for Amazon S3 is the most appropriate choice because it creates a private connection between the VPC and Amazon S3 over the AWS network, without requiring internet access. This VPC endpoint allows the SageMaker model deployed within the VPC to securely access data from Amazon S3 directly, using the internal AWS network paths. It provides enhanced security by keeping data traffic within the AWS infrastructure and not exposing it to the public internet.
A financial analytics company has deployed a machine learning model using Amazon SageMaker within a Virtual Private Cloud (VPC) to analyze sensitive customer data. To meet security guidelines, the VPC is configured with no internet access. However, the model needs to regularly access and read data stored in Amazon S3. The company is looking for a solution that allows secure data transfer between the SageMaker model in the VPC and Amazon S3 without exposing data traffic to the public internet. What do you recommend?
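As a sketch of what creating such an endpoint might look like, the helper below builds the request for a gateway-type S3 endpoint. The choice of a gateway endpoint (rather than an interface endpoint), the helper name, and the VPC and route table IDs are all assumptions for illustration; the keys follow the parameters of the EC2 `create_vpc_endpoint` call in boto3.

```python
def s3_gateway_endpoint_request(vpc_id, region, route_table_ids):
    """Build keyword arguments for an S3 gateway VPC endpoint request."""
    return {
        "VpcEndpointType": "Gateway",
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.s3",
        "RouteTableIds": list(route_table_ids),
    }

# Placeholder IDs, not real resources.
request = s3_gateway_endpoint_request(
    "vpc-0123456789abcdef0", "us-east-1", ["rtb-0123456789abcdef0"]
)

# With AWS credentials configured, the request could be sent like this:
#   import boto3
#   boto3.client("ec2", region_name="us-east-1").create_vpc_endpoint(**request)
```

Traffic from the VPC's subnets to S3 then follows the route added to the listed route tables instead of needing an internet gateway or NAT.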
Feature Engineering involves selecting, modifying, or creating features from raw data to improve the performance of machine learning models, and it is important because it can significantly enhance model accuracy and efficiency. Feature Engineering is the process of selecting, modifying, or creating new features from raw data to enhance the performance of machine learning models. It is crucial because it can lead to significant improvements in model accuracy and efficiency by providing the model with better representations of the data.
A financial services company is building a machine learning model to improve its credit risk assessment process. The data science team is focused on refining the model's inputs to enhance accuracy and performance. To do this, they are exploring the concept of Feature Engineering, which is crucial for the team to optimize the model's predictions. What is Feature Engineering in the context of machine learning?
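A small illustration of "creating features from raw data" for a credit-risk setting. The field names and the two derived features are invented for this sketch; a real team would choose features based on its own data.

```python
def engineer_features(applicant):
    """Derive model-friendly features from a raw applicant record."""
    income = applicant["annual_income"]
    debt = applicant["total_debt"]
    return {
        # A ratio is often more informative than either raw amount alone.
        "debt_to_income": debt / income if income else None,
        # Collapse a count into a simple yes/no flag.
        "has_prior_default": int(applicant["prior_defaults"] > 0),
    }

raw = {"annual_income": 50000, "total_debt": 10000, "prior_defaults": 2}
features = engineer_features(raw)
```

The raw record is unchanged; the model would be trained on the derived representation in `features`.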
The bias versus variance trade-off refers to the challenge of balancing the error due to the model's complexity (variance) and the error due to incorrect assumptions in the model (bias), where high bias can cause underfitting and high variance can cause overfitting. The bias versus variance trade-off in machine learning is about finding a balance between bias (error due to overly simplistic assumptions in the model, leading to underfitting) and variance (error due to the model being too sensitive to small fluctuations in the training data, leading to overfitting). The goal is to achieve a model that generalizes well to new data.
A financial services company is building a machine learning model to predict loan defaults, but the data science team is struggling to find the right balance between model complexity and accuracy. They are aware of the bias-variance trade-off, as understanding this trade-off is critical for optimizing the model's performance and ensuring it generalizes well. What is the bias versus variance trade-off in machine learning?
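For reference, the trade-off is usually written as a decomposition of expected squared error. The notation is an assumption added here: the data follow $y = f(x) + \varepsilon$ with noise of mean 0 and variance $\sigma^2$, and $\hat{f}$ is the model fitted on a random training set.

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```

A simpler model tends to raise the first term (underfitting) and a more flexible model tends to raise the second (overfitting); the noise term cannot be reduced by model choice.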
Interpretability is about understanding the internal mechanisms of a machine learning model, whereas explainability focuses on providing understandable reasons for the model's predictions and behaviors to stakeholders. Interpretability refers to how easily a human can understand the reasoning behind a model's predictions or decisions. It's about making the inner workings of a machine learning model transparent and comprehensible. Explainability goes a step further by providing insights into why a model made a specific prediction, especially when the model itself is complex and not inherently interpretable. It involves using methods and tools to make the predictions of complex models understandable to humans.
A financial services company is deploying AI models to assess credit risk and make lending decisions. As part of ensuring ethical AI use, the company wants to build models that are both interpretable and explainable to regulators, stakeholders, and customers. The data science team needs to understand the distinction between interpretability and explainability in the context of Responsible AI to choose the right techniques for transparency. This distinction will guide the company in making its AI models more trustworthy and compliant. Which of the following represents the best option for the given use case?
AWS Audit Manager. AWS Audit Manager helps automate the collection of evidence to continuously audit your AWS usage. It simplifies the process of assessing risk and compliance with regulations and industry standards, making it an essential tool for governance in AI systems.
A financial services company is deploying AI systems on AWS to analyze customer transactions and detect fraud. To meet stringent regulatory requirements, the company's compliance team needs a tool that can continuously audit AWS usage, automate evidence collection, and streamline risk assessments. This tool should help ensure that the AI systems comply with industry standards and reduce the manual effort involved in compliance reporting. Which AWS tool meets these requirements?
IAM Identity Center. With IAM Identity Center, you can create or connect workforce users and centrally manage their access across all their AWS accounts and applications. You need to configure an IAM Identity Center instance for your Amazon Q Business application environment with users and groups added. Amazon Q Business supports both organization and account-level IAM Identity Center instances.
A financial services firm is adopting Amazon Q Business to streamline its data-driven decision-making processes. As part of the implementation, the company needs a robust solution for managing user access, ensuring that employees across various departments have appropriate permissions to interact with dashboards and reports. The team is evaluating options for user management that offer secure, scalable, and easy-to-administer controls within Amazon Q Business. Which of the following would you recommend for user management in Amazon Q Business?
Agility
A fintech company is looking to improve its software development lifecycle by adopting cloud-based solutions that allow for faster innovation and more efficient deployment of new features. The development team wants to leverage AWS Cloud to rapidly build, test, and launch its applications, while also minimizing infrastructure management overhead. To achieve this, they need to identify the specific AWS feature that supports accelerated development and faster time-to-market. Which feature of AWS Cloud offers the ability to innovate faster and rapidly develop, test, and launch software applications?
Self-supervised learning. Foundation models use self-supervised learning to create labels from input data. In self-supervised learning, models are given vast amounts of raw data that is almost entirely or completely unlabeled, and the models generate the labels themselves. This means no one has instructed or trained the model with labeled training data sets.
A healthcare analytics company is exploring the use of Foundation Models to automate the process of labeling vast amounts of medical data, such as patient records and clinical notes, to enhance its machine learning models for diagnosis and treatment recommendations. The company wants to understand the specific techniques that Foundation Models use to generate labels from raw input data, helping streamline the data annotation process without requiring extensive manual effort. Which of the following techniques is used by Foundation Models to create labels from input data?
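One common way a model can "generate the labels itself" is next-word prediction: each word in raw text becomes the label for the words before it. The sketch below shows only that label-creation step on an invented sentence; the function name and context size are assumptions, and real foundation models work on tokens at far larger scale.

```python
def next_word_pairs(text, context_size=2):
    """Turn unlabeled text into (context, label) training pairs."""
    words = text.split()
    return [
        (tuple(words[i:i + context_size]), words[i + context_size])
        for i in range(len(words) - context_size)
    ]

# No human labeling involved: the labels come from the text itself.
pairs = next_word_pairs("the patient has a fever")
```

Each pair is a supervised-style example (input context, target word) even though nobody annotated the sentence.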
SageMaker model cards include information about the model such as intended use and risk rating of a model, training details and metrics, evaluation results, and observations. AI service cards provide transparency about AWS AI services' intended use, limitations, and potential impacts
A healthcare company is deploying AI models using Amazon SageMaker to predict patient outcomes and ensure compliance with healthcare regulations. The data science team wants to document important details about their models, such as performance, bias assessments, and intended use. They are considering using SageMaker model cards for this purpose but also want to understand how AI service cards fit into the broader documentation of their AI services. Understanding the differences between these two tools will help the team select the right one for tracking and managing their AI models. Given this context, how would you highlight the key differences between SageMaker model cards and AI service cards?
- Sentiment analysis
- Fraud identification
Semi-supervised learning is when you apply both supervised and unsupervised learning techniques to a common problem. This technique relies on using a small amount of labeled data and a large amount of unlabeled data to train systems. First, the labeled data is used to partially train the machine learning algorithm. After that, the partially trained algorithm labels the unlabeled data. This process is called pseudo-labeling. The model is then re-trained on the resulting data mix without being explicitly programmed.
A healthcare company is developing a machine learning model to analyze medical images and patient records to assist with diagnostics. The team has access to a large amount of unlabeled data and a smaller set of labeled data, and they are considering using semi-supervised learning to maximize the utility of both datasets. To make an informed decision on the approach, the data science team wants to understand which methods fall under semi-supervised learning. Which of the following are examples of semi-supervised learning? (Select two)
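The pseudo-labeling steps described above can be sketched with a deliberately tiny classifier. A one-dimensional nearest-centroid model and the made-up "legit"/"fraud" data are assumptions chosen to keep the example short; any classifier could play the same role.

```python
def train(labeled):
    """Fit a nearest-centroid model: one mean value per label."""
    groups = {}
    for x, label in labeled:
        groups.setdefault(label, []).append(x)
    return {label: sum(xs) / len(xs) for label, xs in groups.items()}

def predict(model, x):
    """Return the label whose centroid is closest to x."""
    return min(model, key=lambda label: abs(x - model[label]))

# Hypothetical data: a few labeled points and some unlabeled ones.
labeled = [(1.0, "legit"), (2.0, "legit"), (9.0, "fraud"), (10.0, "fraud")]
unlabeled = [1.5, 3.0, 8.0, 9.5]

model = train(labeled)                                # 1. partial training on labeled data
pseudo = [(x, predict(model, x)) for x in unlabeled]  # 2. pseudo-label the unlabeled data
model = train(labeled + pseudo)                       # 3. retrain on the combined mix
```

After step 3 the centroids reflect both the original labels and the pseudo-labels, which is the "resulting data mix" the note refers to.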
FMs use unlabeled training data sets for self-supervised learning. In supervised learning, you train the model with a set of input data and a corresponding set of paired labeled output data. Unsupervised machine learning is when you give the algorithm input data without any labeled output data. Then, on its own, the algorithm identifies patterns and relationships in and between the data. Self-supervised learning is a machine learning approach that applies unsupervised learning methods to tasks usually requiring supervised learning. Instead of using labeled datasets for guidance, self-supervised models create implicit labels from unstructured data. Foundation models use self-supervised learning to create labels from input data. This means no one has instructed or trained the model with labeled training data sets.
A healthcare company is evaluating the use of Foundation Models (FMs) in generative AI to automate tasks such as medical report generation, data analysis, and personalized patient communications. The company's data science team wants to better understand the key features and benefits of Foundation Models, particularly how they can be applied to various tasks with minimal fine-tuning and customization. To ensure they choose the right model for their needs, the team is seeking to clarify the essential characteristics of FMs in generative AI. Which of the following is correct regarding Foundation Models (FMs) in the context of generative AI?
By using techniques such as cross-validation, regularization, and pruning to simplify the model and improve its generalization. Cross-validation helps ensure the model generalizes well to unseen data by dividing the data into multiple training and validation sets. Regularization techniques, such as L1 and L2 regularization, penalize complex models to reduce overfitting. Pruning simplifies decision trees by removing branches that have little importance.
A healthcare startup is developing a machine learning model to predict patient outcomes based on historical medical data. During the training process, the data science team notices signs of overfitting, where the model performs well on the training data but struggles with new, unseen data. To ensure the model generalizes effectively and avoids memorizing the training data, the team needs to implement strategies to prevent overfitting. How can you prevent model-overfitting in machine learning?
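Two of the techniques above are easy to sketch in plain Python: splitting data into k training/validation folds, and adding an L2 penalty to a loss. The function names and the example numbers are assumptions for illustration; libraries such as scikit-learn provide production versions of both.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size

def l2_penalized_loss(data_loss, weights, lam):
    """Add an L2 penalty so large weights (complex models) cost more."""
    return data_loss + lam * sum(w * w for w in weights)

folds = list(k_fold_indices(10, 3))
penalized = l2_penalized_loss(1.0, [1.0, 2.0], 0.5)
```

Every sample lands in exactly one validation fold, so each data point is used to check generalization once.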
Feature engineering for structured data often involves tasks such as normalization and handling missing values, while for unstructured data, it involves tasks such as tokenization and vectorization
A healthcare technology company is developing machine learning models to analyze both structured data, such as patient records, and unstructured data, such as medical images and clinical notes. The data science team is working on feature engineering to extract the most relevant information for the models but is aware that the process differs depending on whether the data is structured or unstructured. To ensure they approach each data type correctly, they need to understand the key differences in feature engineering tasks for structured versus unstructured data in machine learning. What is a key difference in feature engineering tasks for structured data compared to unstructured data in the context of machine learning?
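To make the contrast concrete, the sketch below handles a structured numeric column (mean imputation, then min-max normalization) and an unstructured note (whitespace tokenization, then a bag-of-words vector). The sample values, vocabulary, and function names are invented, and these are only the simplest variants of each task.

```python
def impute_and_normalize(values):
    """Structured data: fill missing values with the mean, then scale to [0, 1]."""
    present = [v for v in values if v is not None]
    mean = sum(present) / len(present)
    filled = [mean if v is None else v for v in values]
    lo, hi = min(filled), max(filled)
    if hi == lo:
        return [0.0 for _ in filled]
    return [(v - lo) / (hi - lo) for v in filled]

def bag_of_words(text, vocabulary):
    """Unstructured data: tokenize on whitespace, then count vocabulary words."""
    tokens = text.lower().split()
    return [tokens.count(word) for word in vocabulary]

ages = impute_and_normalize([20, None, 40])
vector = bag_of_words(
    "Patient reports chest pain and chest tightness", ["chest", "pain", "fever"]
)
```

Both outputs are numeric lists a model can consume, but they were reached through different steps because the raw data had different shapes.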
Model parameters are values that define a model and its behavior in interpreting input and generating responses. Hyperparameters are values that can be adjusted for model customization to control the training process
A machine learning team at a tech company is developing a generative AI model to automate text generation for customer support. As part of optimizing the model's performance, the team needs to adjust both model parameters and hyperparameters but wants to clearly understand the distinctions between the two. Understanding these differences is crucial for fine-tuning the model and improving its output. Which of the following highlights the key differences between model parameters and hyperparameters in the context of generative AI?
- Test set is used to determine how well the model generalizes
- Validation sets are optional
A retail company is developing a machine learning model to predict customer churn and is in the process of preparing its dataset. The data science team plans to divide the data into a training set, validation set, and test set to ensure the model performs well across different stages of development and evaluation. To proceed effectively, the team needs to fully understand the roles of each of these sets and how they contribute to building a robust model. Which of the following is correct regarding the training set, validation set, and test set used in the context of machine learning? (Select two)
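A minimal sketch of the three-way split. The 70/15/15 proportions, the fixed seed, and the function name are assumptions for this example, not a recommendation from the course.

```python
import random

def split_dataset(rows, train_pct=70, val_pct=15, seed=0):
    """Shuffle rows, then cut them into training, validation, and test sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_train = len(shuffled) * train_pct // 100
    n_val = len(shuffled) * val_pct // 100
    train = shuffled[:n_train]                      # used to fit the model
    validation = shuffled[n_train:n_train + n_val]  # used for tuning; optional
    test = shuffled[n_train + n_val:]               # held out to check generalization
    return train, validation, test

train, validation, test = split_dataset(range(100))
```

Setting `val_pct=0` gives a plain train/test split, which matches the point that the validation set is optional.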
Model training in deep learning involves using large datasets to adjust the weights and biases of a neural network through multiple iterations, using techniques such as gradient descent to minimize the error
A robotics company is developing an AI system to improve the autonomous navigation of its robots. The team is exploring Deep Learning to enhance the system's ability to recognize and respond to its environment. To ensure the AI model performs well, the team needs to understand how model training works in Deep Learning, specifically the process through which the model learns from large datasets by adjusting its internal parameters. This understanding is essential to optimize the model for real-time decision-making. How does model training work in Deep Learning?
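The loop of "adjust weights and biases over many iterations to minimize error" can be shown with a single linear unit trained by gradient descent. This stands in for a full neural network only as a sketch; the data, learning rate, and epoch count are assumptions. It also illustrates the parameter/hyperparameter distinction: `w` and `b` are learned parameters, while `learning_rate` and `epochs` are hyperparameters chosen before training.

```python
def train_linear(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y = w*x + b by gradient descent on mean squared error."""
    w, b = 0.0, 0.0  # model parameters, learned from data
    n = len(xs)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Step against the gradient to reduce the error.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# Hypothetical data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
w, b = train_linear(xs, ys)
```

After training, `w` and `b` end up very close to 2 and 1, the values used to generate the data.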
Amazon Q in Connect. Amazon Connect is the contact center service from AWS. Amazon Q helps customer service agents provide better customer service. Amazon Q in Connect uses real-time conversation with the customer along with relevant company content to automatically recommend what to say or what actions an agent should take to better assist customers.
A telecom company is seeking to improve the efficiency and effectiveness of its customer service operations by integrating generative AI. The goal is to equip customer service agents with AI-driven tools that can assist in generating accurate, context-aware responses to customer inquiries, offer real-time suggestions, and help automate routine tasks. The company is evaluating several generative AI solutions to determine which one best fits their need for enhancing customer service interactions. Which of the following is the best fit for this use case?
The company should use an optimized small language model (SLM) deployed directly on the edge device, allowing for real-time, low-latency inference
An Internet-of-Things (IoT) company is developing a suite of smart sensors and devices that rely on real-time data processing to enable applications like predictive maintenance, environmental monitoring, and immediate anomaly detection. To provide immediate feedback and actions, the company needs to deploy machine learning models directly on its edge devices, ensuring that these models can perform inference with minimal latency. The company is evaluating different approaches to optimize performance and maintain low-latency inference on these edge devices. Which approach would be the most suitable for meeting this requirement?
The developer should create a rule-based application that uses predefined mathematical rules and formulas to answer probability questions accurately. A rule-based application is the most suitable choice for this scenario. Probability questions, like calculating the chance of drawing a spade from a deck of cards, are based on well-defined mathematical rules and formulas. A rule-based system can be programmed with these rules to provide precise answers to such questions, making it an efficient and straightforward solution. This approach ensures accuracy, is easy to implement, and requires no training data, making it ideal for helping students understand fundamental mathematical concepts.
An app developer is building an educational application to help high-school students understand fundamental concepts in mathematics, such as calculating the probability of drawing a spade from a deck of cards. Which approach would be the most suitable for this purpose?
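A rule-based answer to the spade example needs nothing more than counting favorable outcomes over total outcomes. The deck construction and function name below are one possible way to write it.

```python
from fractions import Fraction

SUITS = ["spades", "hearts", "diamonds", "clubs"]
RANKS = ["A"] + [str(n) for n in range(2, 11)] + ["J", "Q", "K"]

def probability_of_suit(suit, deck):
    """Rule: favorable outcomes divided by total outcomes."""
    matching = sum(1 for _, card_suit in deck if card_suit == suit)
    return Fraction(matching, len(deck))

deck = [(rank, suit) for suit in SUITS for rank in RANKS]  # standard 52 cards
p_spade = probability_of_suit("spades", deck)  # 13 spades out of 52 cards
```

The result is exact (1/4) and requires no training data, which is why a rule-based approach suits this kind of question.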
The company should use a multi-modal embedding model, which is designed to represent and align different types of data (such as text and images) in a shared embedding space, allowing the chatbot to understand and interpret both forms of input simultaneously
An e-commerce company is developing a chatbot to enhance its user experience by allowing customers to submit queries that include both text descriptions and images, such as product photos or screenshots of issues. The company aims for the chatbot to understand these multi-modal inputs and provide accurate and context-aware responses, seamlessly combining visual and textual information to address customer needs effectively. Which approach would be the most cost-effective for enabling the chatbot to process such multi-modal queries effectively?
Use Knowledge Bases for Amazon Bedrock to supplement contextual information from the company's private data to the FM using Retrieval Augmented Generation (RAG). With the comprehensive capabilities of Amazon Bedrock, you can experiment with a variety of top FMs, customize them privately with your data using techniques such as fine-tuning and retrieval-augmented generation (RAG), and create managed agents that execute complex business tasks (from booking travel and processing insurance claims to creating ad campaigns and managing inventory), all without writing any code. Using Knowledge Bases for Amazon Bedrock, you can provide foundation models with contextual information from your company's private data for Retrieval Augmented Generation (RAG), enhancing response relevance and accuracy. This fully managed feature handles the entire RAG workflow, eliminating the need for custom data integrations and management.
An insurance company is transitioning to AWS Cloud and wants to use Amazon Bedrock for product recommendations. The company wants to supplement organization-specific information to the underlying Foundation Model (FM). Which of the following represents the best-fit solution for the given use case?
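The RAG idea (retrieve relevant private data, then hand it to the FM as context) can be sketched without any AWS calls. Everything below is a stand-in: Knowledge Bases for Amazon Bedrock is a managed feature, and this toy uses naive word overlap as the retrieval step with invented documents and function names, purely to show the shape of the workflow.

```python
def retrieve(query, documents, top_n=1):
    """Rank documents by how many query words they share (toy retrieval)."""
    query_words = set(query.lower().split())
    ranked = sorted(
        documents,
        key=lambda doc: len(query_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:top_n]

def build_prompt(query, documents):
    """Prepend the retrieved context to the question sent to the model."""
    context = "\n".join(retrieve(query, documents))
    return f"Use the context to answer.\n\nContext:\n{context}\n\nQuestion: {query}"

# Hypothetical private company documents.
documents = [
    "Our home insurance plan covers flood damage.",
    "Our auto insurance plan covers collision repair.",
]
query = "Which plan covers flood damage?"
prompt = build_prompt(query, documents)
```

The foundation model itself is unchanged; it simply receives the company-specific passage alongside the question.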
- Amazon SageMaker JumpStart
- Amazon Bedrock
Large language models (LLMs) are very large deep learning models that are pre-trained on vast amounts of data. The underlying transformer is a set of neural networks that consist of an encoder and a decoder with self-attention capabilities. The encoder and decoder extract meanings from a sequence of text and understand the relationships between words and phrases in it. LLMs are one class of Foundation Models. For example, OpenAI's generative pre-trained transformer (GPT) models are LLMs. LLMs are specifically focused on language-based tasks such as summarization, text generation, classification, open-ended conversation, and information extraction. AWS recommends Amazon Bedrock and Amazon SageMaker JumpStart as the best-fit services for developing LLM-based solutions.
As a developer specializing in Large Language Models (LLMs) at a technology company, you have been tasked with migrating the company's AI infrastructure to AWS Cloud to support the development of LLM-based solutions for various applications, such as natural language processing, text generation, and chatbots. The company is looking for AWS services that offer robust support for training, deploying, and managing LLMs while ensuring scalability, security, and integration with other cloud services. Which AWS services would you recommend for developing LLM-based solutions in this environment? (Select two)
- Model Evaluation on Amazon Bedrock
- Guardrails for Amazon Bedrock
Model evaluation on Amazon Bedrock involves a comprehensive process of preparing data, training models, selecting appropriate metrics, testing and analyzing results, ensuring fairness and bias detection, tuning performance, and continuous monitoring. Model Evaluation on Amazon Bedrock helps you incorporate generative AI into your application by letting you select the foundation model that gives the best results for your particular use case. Guardrails for Amazon Bedrock enables you to implement safeguards for your generative AI applications based on use cases and responsible AI policies. You can create multiple guardrails tailored to different use cases and apply them across multiple foundation models (FMs), providing a consistent user experience and standardizing safety and privacy controls across generative AI applications.
The development team at a company needs to select the most appropriate large language model (LLM) for the company's flagship application. Given the vast array of LLMs available, the team is uncertain about the best choice. Additionally, since the application will be publicly accessible, the team has concerns about the possibility of generating harmful or inappropriate content. Which AWS solutions should the team implement to address both the selection of the appropriate model and the mitigation of harmful content generation? (Select two).
- Each Availability Zone (AZ) consists of one or more discrete data centers.
- Each AWS Region consists of a minimum of three Availability Zones (AZ).
AWS has the concept of a Region, which is a physical location around the world where AWS clusters its data centers. AWS calls each group of logical data centers an Availability Zone (AZ). Each AWS Region consists of a minimum of three isolated and physically separate AZs within a geographic area. Each AZ has independent power, cooling, and physical security and is connected via redundant, ultra-low-latency networks. An Availability Zone (AZ) is one or more discrete data centers with redundant power, networking, and connectivity in an AWS Region. All AZs in an AWS Region are intercon
Which of the following are correct statements regarding the AWS Global Infrastructure? (Select two)
Gives the ability to use machine learning to generate predictions without the need to write any code. Amazon SageMaker Canvas gives you the ability to use machine learning to generate predictions without needing to write any code. With Canvas, you can chat with popular large language models (LLMs), access ready-to-use models, or build a custom model trained on your data.
Which of the following best describes the Amazon SageMaker Canvas ML tool?
