Cloud Final Prep
Which of the following code blocks would execute successfully:
#!/usr/bin/env python3
import boto3

ec2 = boto3.client("ec2")
response = ec2.describe_instances(DryRun=False)
print(response)

The "shebang" line correctly invokes python3, boto3 is imported, and the EC2 client is created properly (not left empty or pointed at the wrong service). The describe_instances method is invoked with DryRun set to False; if DryRun were True, the call would only check permissions rather than actually execute. The output is then printed.
Your VPC has a CIDR address range of 172.31.0.0/16. How many possible usable IPv4 addresses do you have to work with for your own infrastructure?
64,256. A /16 block contains 2^16 = 65,536 addresses in total, but the .0, .1, .2, .3, and .255 addresses of the final octet are reserved for use by AWS infrastructure. No additional addresses are reserved in the first two octets (fixed by the /16 CIDR block), all 256 values are available in the third octet, but only 251 values are usable in the final octet. Therefore, free address space in this block is 256 * 251 = 64,256 addresses.
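A quick sanity check of that arithmetic with Python's standard ipaddress module, assuming the /16 is carved into 256 /24 subnets as in the reasoning above:

import ipaddress

vpc = ipaddress.ip_network("172.31.0.0/16")
subnets = list(vpc.subnets(new_prefix=24))  # 256 possible /24 subnets
reserved_per_subnet = 5                     # .0, .1, .2, .3, and .255
print(len(subnets) * (256 - reserved_per_subnet))  # 64256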
What is a Dockerfile? What do I do with it? What's the difference between a container and an image?
A Dockerfile is a manifest that describes the base image to use for your Docker image and what you want installed and running on it. For more information about Dockerfiles, see the Dockerfile Reference. A Docker image is an immutable (unchangeable) file that contains the source code, libraries, dependencies, tools, and other files needed for an application to run. A Docker container is a virtualized run-time environment where users can isolate applications from the underlying system. These containers are compact, portable units in which you can start up an application quickly and easily. Docker images vs. containers: when discussing the difference between images and containers, it isn't fair to contrast them as opposing entities. Both elements are closely related and are part of a system defined by the Docker platform. Given the definitions above, you may already have some understanding of how the two establish a relationship. Images can exist without containers, whereas a container needs to run an image to exist. Therefore, containers are dependent on images and use them to construct a run-time environment and run an application. The two concepts exist as essential components (or rather phases) in the process of running a Docker container. Having a running container is the final "phase" of that process, indicating it is dependent on previous steps and components. That is why Docker images essentially govern and shape containers.
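As a concrete illustration, a minimal, hypothetical Dockerfile for a Python application might look like the sketch below; the base image, file layout, and start command are all assumptions for the example:

# Base image to build on
FROM python:3.9-slim
# Working directory inside the image
WORKDIR /app
# Bake the application files into the image
COPY . /app
# Install dependencies at build time
RUN pip install -r requirements.txt
# Command a container runs at start-up
CMD ["python", "app.py"]

Running docker build against this file produces an image; docker run then starts a container from that image, which matches the image/container distinction above.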
What function does a VPC (virtual private cloud) serve? Choose the most accurate answer.
A VPC is a logical definition of the virtual network(s) of your cloud resources. Amazon Virtual Private Cloud (VPC) lets you provision a logically isolated section of the AWS Cloud where you can launch resources in a virtual network that you define.
What is a data lake and why would I use one?
A data lake is a storage repository that can rapidly ingest large amounts of raw data in its native format. As a result, business users can quickly access it whenever needed and data scientists can apply analytics to get insights.
If you create a VPC by running the CloudFormation template snippet below, which additional elements will be automatically created with it?

AWSTemplateFormatVersion: 2010-09-09
Resources:
  VPC:
    Type: 'AWS::EC2::VPC'
    Properties:
      CidrBlock: 192.168.0.0/16
      Tags:
        - Key: Application
          Value: !Ref 'AWS::StackId'
A default Network ACL in the VPC you created. The action of creating a VPC automatically creates a default route table, a default ACL, and a default security group. However, it does not create any subnets or subnet associations with the Route Table or ACL. It does not create an Internet or NAT Gateway either.
What function does a CloudFormation stack serve? Be familiar with any limitations to deployment stacks.
AWS CloudFormation ensures all stack resources are created or deleted as appropriate. Because AWS CloudFormation treats the stack resources as a single unit, they must all be created or deleted successfully for the stack to be created or deleted. If a resource can't be created, AWS CloudFormation rolls the stack back and automatically deletes any resources that were created. If a resource can't be deleted, any remaining resources are retained until the stack can be successfully deleted.
Lambda*
AWS's event-driven functions-as-a-service platform for building automated applications that can be triggered by events or schedules. Supports Python, Node.js, C#, Go, etc.
DynamoDB*
AWS's NoSQL database-as-a-service supporting key-value and document data structures. It is scalable, serverless, and easy to implement.
IAM* (Identity Access Management)
Allows for the definition of users, groups, and roles for human and machine access to your AWS resources using policies that govern access to particular services.
What function does an EC2 "reserved instance" provide?
Allows you to lock in a lower operational cost by committing to a specific instance type for a set period of time. Reserved instances are a financial category: you pay up front for a lower operational rate than on-demand. Reservations come in 1- and 3-year lengths, and commit you to paying for the instance whether you run that instance type or not. You do not "run" a reserved instance: you purchase a reservation that matches the instance type and region of your actual instance. EBS storage is not covered as part of a reservation.
Which of the following storage services requires you to specify a capacity when you start using it?
Amazon EBS. Elastic Block Storage requires that you provision a specific capacity (30GB, 2TB, etc.) and IOPS for storage attached to EC2 instances. All other answers in this question are pay-as-you-go, with no pre-provisioned capacity.
Your code relies on many fast reads/writes to a very large file in the course of its operation, and must be available to users for a year. What storage solution would give you the BEST performance regardless of cost?
Amazon EBS with provisioned IOPS. Provisioned IOPS volumes are backed by solid-state drives (SSD) and are the highest-performance EBS volumes, designed for critical, I/O-intensive database applications.
Which AWS service correlates to "cold storage"?
Amazon Glacier. Amazon Glacier is "cold storage," which means it is extremely inexpensive but not immediately accessible for retrieval. You can pay more for faster "resurfacing" of Glacier archives, which then appear within your S3 console for a brief period of time.
Which AWS networking service offers the ability to create a virtual network:
Amazon Virtual Private Cloud (VPC). Amazon VPC is the service for creating and defining elements of a private virtual network in the Amazon cloud. It consists of subnets, routing tables, ACLs, security groups, and other interfaces to connect with outside resources, such as an Internet gateway, NAT gateway, DirectConnect, VPN tunneling, VPC peering, etc.
Your company wants to run their EC2 application servers securely in the private subnets of your VPC. But for those servers to reach the Internet for basic OS updates and to download code, what networking elements do they need to enable this? (Select TWO correct answers)
An entry in the subnet's route table directing traffic outbound for 0.0.0.0/0 to a NAT gateway, and a NAT host / NAT gateway attached to your VPC. Hosts in a private subnet can only reach out to the Internet indirectly through a NAT host or NAT gateway. The route table for the private subnets would then need to be updated so that traffic bound for 0.0.0.0/0 is routed through the NAT gateway. Elastic IPs attached to hosts in private subnets will have no connectivity since there is no inbound Internet Gateway. Ingress rules for a security group have no bearing on these hosts being able to reach out to the Internet.
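A minimal boto3 sketch of that route-table entry; the route table and NAT gateway IDs are placeholders:

import boto3

ec2 = boto3.client("ec2")

# Route all Internet-bound traffic from the private subnet through the NAT gateway
ec2.create_route(
    RouteTableId="rtb-0123456789abcdef0",   # hypothetical private-subnet route table
    DestinationCidrBlock="0.0.0.0/0",
    NatGatewayId="nat-0123456789abcdef0",   # hypothetical NAT gateway
)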
How do tools like Apache Spark, Apache Kafka, or Amazon Kinesis play into analyzing big data? How do they exist in relation to data lakes or other storage solutions?
Apache Spark is an open-source, distributed processing system used for big data workloads. It utilizes in-memory caching and optimized query execution for fast analytic queries against data of any size. It provides development APIs in Java, Scala, Python and R, and supports code reuse across multiple workloads: batch processing, interactive queries, real-time analytics, machine learning, and graph processing. You'll find it used by organizations in every industry, including FINRA, Yelp, Zillow, DataXu, Urban Institute, and CrowdStrike. Apache Spark has become one of the most popular big data distributed processing frameworks, with 365,000 meetup members in 2017. Apache Kafka is an open-source distributed event streaming platform used by thousands of companies for high-performance data pipelines, streaming analytics, data integration, and mission-critical applications. Amazon Kinesis makes it easy to collect, process, and analyze real-time, streaming data so you can get timely insights and react quickly to new information. Amazon Kinesis offers key capabilities to cost-effectively process streaming data at any scale, along with the flexibility to choose the tools that best suit the requirements of your application. With Amazon Kinesis, you can ingest real-time data such as video, audio, application logs, website clickstreams, and IoT telemetry data for machine learning, analytics, and other applications. Amazon Kinesis enables you to process and analyze data as it arrives and respond instantly instead of having to wait until all your data is collected before the processing can begin. In summary, the concept of "processing" relates to your data lake in two general ways: processing data that goes into your data lake and processing data from your data lake. The former uses your data lake as a destination, whereas the latter uses your data lake as a data source. The process of ingesting data into your data lake often involves some form of an ETL process. Spark is commonly a major component of that, be it in its own Hadoop cluster or as part of EMR. In the latter case, other EMR technologies may be used to ETL data into a data lake. AWS Glue can help with both ETLing and cataloging your data lake data for future analysis by BI tools, via products that run fast queries against your data lake, such as Dremio. Spark and EMR can also play a vital role in reading and processing data from your data lake, often in data science-related endeavors such as machine learning and artificial intelligence.
Place these in order of scale, starting with the largest conceptual unit and moving to the smallest:
Beginning with the largest and moving to the smallest unit: a. Region b. Availability Zone c. Datacenter d. Rack
Which of the following storage choices must be provisioned with a specific capacity when you create it?
Block Storage. Block storage must be partitioned and formatted to a specific size before consumption. In the case of Amazon EBS, you create a new volume and specify its capacity.
Infrastructure as code
CloudFormation in Amazon, Terraform by HashiCorp, boto3 for Python, the AWS CLI. Means your infrastructure is easily re-creatable and can be distributed to other environments: automated, versioned, repeatable, and distributed.
What does "latency-based routing" mean when CloudFront determines which edge location to serve content from?
CloudFront selects the edge location that returns content most quickly. Latency Based Routing calculates the lowest possible latency between a user and a variety of Edge locations. This may or may not be the closest physical edge location. Proximity is less important than responsiveness.
Your company's digital products are sold on a subscription basis, and customer access is renewed annually with a membership fee. A concern among your management team is the piracy of content or non-subscribers gaining access to your products. A common technique for this is when paying subscribers share links to digital resources with others. Which of the following solutions would both safeguard from non-subscribers gaining access to digital content AND be the most affordable for your company?
Create an S3 bucket and disable the "Block all public access" setting. Store your company's digital media files in the bucket without granting public access to them. In your company's website, require users to sign in and then programmatically offer them URLs to your files using the presign feature of S3 to create signed, expiring URLs. Creating an EC2 instance that stores all media files on a local EBS volume does not necessarily protect your site from users stealing content through direct URLs to your media, AND it costs several times more per month than S3 storage. An S3 bucket configured as a website, with all content made public, does not protect your content from piracy. Any known URL to your media files can be shared with non-subscribers. An S3 bucket configured with an origin access identity (OAI) behind CloudFront also does not protect your content from piracy. The bucket is protected from direct access but nothing limits the sharing of CloudFront URLs. The best solution makes use of presigned, expiring URLs in S3 that are rendered each time a user signs in. It allows users to access the media they paid for, but even if they share a URL it will be valid for only a limited length of time. S3 is also less than one fourth the cost of EBS storage.
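A minimal sketch of generating such an expiring URL with boto3; the bucket and key names are hypothetical:

import boto3

s3 = boto3.client("s3")

# Signed URL valid for one hour; after that, shared links stop working
url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": "example-media-bucket", "Key": "videos/lesson-01.mp4"},
    ExpiresIn=3600,
)
print(url)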
Push/Pull architecture
Deli counter analogy: consumers pull the next ticket when they are ready. Push is via APIs and hard-wired connections; pull is via messaging queues, such as SQS.
· Docker
Docker is a software platform that allows you to build, test, and deploy applications quickly. Docker packages software into standardized units called containers that have everything the software needs to run including libraries, system tools, code, and runtime. Using Docker, you can quickly deploy and scale applications into any environment and know your code will run.
Block storage
EBS in Amazon. Device attached, local to server. Pre-allocated capacity.
What does CloudFront use to deliver fast, low-latency content to end users?
Edge locations. CloudFront uses a wide network of over 200 edge locations to deliver content quickly to end users. This is often geographically close to the user, but the edge location for a specific user is generally determined by lowest latency.
Which of the following AWS services provides persistent local storage for EC2 compute instances?
Elastic Block Store. Amazon Elastic Block Store (EBS) is an easy-to-use, high-performance block storage service designed for use with Amazon Elastic Compute Cloud (EC2) for both throughput- and transaction-intensive workloads at any scale.
EC2*
Elastic compute cloud for creating instances of virtual machines with a base set of AMIs. Has options for status checking, monitoring, and protection. An EC2 instance must be in a public subnet to have a public IP address.
Which of the following steps would most help secure and encrypt the data in my EC2 applications?
Enable EBS encryption for attached volumes. EBS encryption occurs on the servers that host EC2 instances, ensuring the security of both data-at-rest and data-in-transit between an instance and its attached EBS storage.
(True or False) AWS Support can easily log into your EC2 instance to help you troubleshoot server issues.
False. AWS has no access into, and no legal responsibility for, your virtual machines and their contents. Since all authentication to EC2 instances is controlled by key pairs and credentials that only you hold, AWS Support cannot log into your servers.
What is unique about the security model and constraints of Lambda functions?
For AWS Lambda, AWS manages the underlying infrastructure and foundation services, the operating system, and the application platform. You are responsible for the security of your code and identity and access management (IAM) to the Lambda service and within your function.
Be able to describe the design components for EC2 auto-scaling - what goes into configuring it? What are methods for triggering scale-out and scale-in events?
Groups: Your EC2 instances are organized into groups so that they can be treated as a logical unit for the purposes of scaling and management. When you create a group, you can specify its minimum, maximum, and desired number of EC2 instances.
Configuration templates: Your group uses a launch template, or a launch configuration (not recommended, offers fewer features), as a configuration template for its EC2 instances. You can specify information such as the AMI ID, instance type, key pair, security groups, and block device mapping for your instances.
Scaling options: Amazon EC2 Auto Scaling provides several ways for you to scale your Auto Scaling groups. For example, you can configure a group to scale based on the occurrence of specified conditions (dynamic scaling) or on a schedule.
The Auto Scaling group in your Elastic Beanstalk environment uses two Amazon CloudWatch alarms to trigger scaling operations. The default triggers scale when the average outbound network traffic from each instance is higher than 6 MB or lower than 2 MB over a period of five minutes. To use Amazon EC2 Auto Scaling effectively, configure triggers that are appropriate for your application, instance type, and service requirements. You can scale based on several statistics including latency, disk I/O, CPU utilization, and request count.
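A hedged boto3 sketch of these pieces; the group name, launch template, and subnet IDs are placeholders, and the target-tracking policy stands in for the CloudWatch-alarm triggers described above:

import boto3

autoscaling = boto3.client("autoscaling")

# Group of instances treated as one logical unit, built from a launch template
autoscaling.create_auto_scaling_group(
    AutoScalingGroupName="web-asg",                        # hypothetical name
    LaunchTemplate={"LaunchTemplateName": "web-template",  # hypothetical template
                    "Version": "$Latest"},
    MinSize=2,
    MaxSize=10,
    DesiredCapacity=2,
    VPCZoneIdentifier="subnet-aaaa1111,subnet-bbbb2222",   # hypothetical subnets
)

# Dynamic scaling: add/remove instances to hold average CPU near 50%
autoscaling.put_scaling_policy(
    AutoScalingGroupName="web-asg",
    PolicyName="cpu-target-tracking",
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization"
        },
        "TargetValue": 50.0,
    },
)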
Scaling - horizontal / vertical.
Horizontal scaling means increasing/decreasing the number of nodes. Vertical scaling means increasing/decreasing the base instance size.
Which of the following is NOT required when creating an EC2 instance?
IAM Role. IAM Roles are not required when creating an instance. You do not necessarily need an EC2 compute instance to have any authorized access to other AWS services when it runs.
What does an IAM policy look like related to S3 buckets/objects?
IAM policies specify what actions are allowed or denied on what AWS resources (e.g. allow ec2:TerminateInstance on the EC2 instance with instance_id=i-8b3620ec). You attach IAM policies to IAM users, groups, or roles, which are then subject to the permissions you've defined. In other words, IAM policies define what a principal can do in your AWS environment. S3 bucket policies, on the other hand, are attached only to S3 buckets. S3 bucket policies specify what actions are allowed or denied for which principals on the bucket that the bucket policy is attached to (e.g. allow user Alice to PUT but not DELETE objects in the bucket). S3 bucket policies are a type of access control list, or ACL ("ACL" here in the generic sense, not to be confused with S3 ACLs, which are a separate S3 feature).
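For contrast, a hypothetical S3 bucket policy expressing the Alice example above (the account ID and bucket name are placeholders); note the Principal element, which IAM identity policies do not carry:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowAlicePutNotDelete",
      "Effect": "Allow",
      "Principal": {"AWS": "arn:aws:iam::123456789012:user/Alice"},
      "Action": ["s3:PutObject"],
      "Resource": "arn:aws:s3:::bucket-name/*"
    }
  ]
}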
What does IOPS mean functionally?
IOPS measures potential input/output operations per second. IOPS are a unit of measure representing input/output operations per second. Think of this as the "speed" of connected storage, which is different from the "speed" of networking attached to an instance.
CIDR block ranges
IP addressing in the format 10.0.0.0/8, describing 4 octets of addressable IP space and a "/n" netmask, where the value of the netmask determines the size of the range: /8 is the largest common range and /32 is the smallest (a single address). In each AWS subnet, .0 is the network address, .1 is the VPC router, .2 and .3 are reserved for AWS use, and .255 is the broadcast address.
A project for your company requires that outside developers be granted access to some of your company's cloud resources in the AWS Console. Which service would help you manage that?
Identity and Access Management (IAM). AWS Identity and Access Management (IAM) enables you to manage access to AWS services and resources securely. Using IAM, you can create and manage AWS users and groups, and use permissions to allow and deny their access to AWS resources.
Which service would help you manage granting AWS Console access to an outside user?
Identity and Access Management (IAM). Amazon IAM allows you to create specific security policies, users, groups and roles for access to your AWS account. An outside user would most likely get an IAM user account with a custom IAM policy attached to it, or be placed into an IAM group attached to that policy.
Service Discovery
In scaling cloud design, service discovery is the mechanism by which any resource that comes online can "discover" the services it needs in order to run. Consider it a directory for SOA services to find each other. Microservices, Consul.
What makes a "private subnet" private? (Select TWO correct answers)
It does not have a route to 0.0.0.0/0 through an Internet Gateway. It cannot be reached from outside the VPC. Instances in a public subnet can send outbound traffic directly to the Internet, whereas instances in a private subnet cannot by default. In addition, instances in a public subnet can be configured to receive inbound traffic directly from the Internet, whereas instances in a private subnet cannot. Security groups are not attached to subnets and do not control whether a subnet is public or not.
Be familiar with the built-in triggers for AWS Lambda. Understand what an event payload is and what it might contain.
Many AWS services can emit events that trigger Lambda functions. Services that invoke Lambda functions synchronously include: Elastic Load Balancing (Application Load Balancer), Amazon Cognito, Amazon Lex, Amazon Alexa, Amazon API Gateway, Amazon CloudFront (Lambda@Edge), and Amazon Kinesis Data Firehose. Services that invoke Lambda functions asynchronously include: Amazon Simple Storage Service, Amazon Simple Notification Service, Amazon Simple Email Service, AWS CloudFormation, Amazon CloudWatch Logs, Amazon CloudWatch Events, AWS CodeCommit, and AWS Config. An event payload carries the information the function needs about what happened. For example, by default AWS IoT Events generates a standard JSON payload for any action, containing all attribute-value pairs with information about the detector model instance and the event that triggered the action; the payload can be customized using contentExpression.
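To make the payload idea concrete, here is a minimal, hypothetical Python handler for an S3-triggered Lambda; the Records/bucket/key fields follow the standard S3 event structure:

import json

def lambda_handler(event, context):
    # An S3 event payload carries a Records list; each record identifies
    # the bucket and object key that triggered this invocation.
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        print(f"Object {key} changed in bucket {bucket}")
    return {"statusCode": 200, "body": json.dumps("ok")}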
DevOps
Merger of development and operations, where the developer and operations teams are integrated to improve collaboration. The principles of DevOps are automation, consistent environments, and small iterations.
What are the advantages of microservices and what is their appropriate use?
Microservices are a way of breaking large software projects into loosely coupled modules, which communicate with each other through simple Application Programming Interfaces (APIs). The advantages of microservices seem strong enough to have convinced some big enterprise players such as Amazon, Netflix, and eBay to adopt the methodology. Compared to more monolithic design structures, microservices offer:
· Improved fault isolation: Larger applications can remain mostly unaffected by the failure of a single module.
· Eliminated vendor or technology lock-in: Microservices provide the flexibility to try out a new technology stack on an individual service as needed. There won't be as many dependency concerns and rolling back changes becomes much easier. With less code in play, there is more flexibility.
· Ease of understanding: With added simplicity, developers can better understand the functionality of a service.
· Smaller and faster deployments: Smaller codebases and scope = quicker deployments, which also allow you to start to explore the benefits of Continuous Deployment.
· Scalability: Since your services are separate, you can more easily scale the most needed ones at the appropriate times, as opposed to the whole application. When done correctly, this can impact cost savings.
What are the advantages of object storage? (Select THREE correct answers)
Object storage can be used separately from computing resources in the cloud. Object storage is "bottomless," which means it does not have a set capacity. Object storage allows for custom metadata. Cloud Object Storage like S3 is "bottomless" in its capacity, and does not have to be provisioned with a specific amount of capacity. Object storage allows for extensive metadata to be added to objects, which allows for both the automation of data management and richer data discovery. Object storage is always accessible, whether or not you are using any cloud-based computing resources.
What are the pricing models for EC2 compute resources and how do I choose between them?
On-Demand: With On-Demand instances, you pay for compute capacity by the hour or the second depending on which instances you run. No longer-term commitments or upfront payments are needed. You can increase or decrease your compute capacity depending on the demands of your application and only pay the specified hourly rates for the instance you use. On-Demand instances are recommended for:
· Users that prefer the low cost and flexibility of Amazon EC2 without any up-front payment or long-term commitment
· Applications with short-term, spiky, or unpredictable workloads that cannot be interrupted
· Applications being developed or tested on Amazon EC2 for the first time
Spot instances: Amazon EC2 Spot instances allow you to request spare Amazon EC2 computing capacity for up to 90% off the On-Demand price. Spot instances are recommended for:
· Applications that have flexible start and end times
· Applications that are only feasible at very low compute prices
· Users with urgent computing needs for large amounts of additional capacity
Savings Plans: Savings Plans are a flexible pricing model that offer low prices on EC2 and Fargate usage, in exchange for a commitment to a consistent amount of usage (measured in $/hour) for a 1- or 3-year term.
Dedicated Hosts: A Dedicated Host is a physical EC2 server dedicated for your use. Dedicated Hosts can help you reduce costs by allowing you to use your existing server-bound software licenses, including Windows Server, SQL Server, and SUSE Linux Enterprise Server (subject to your license terms), and can also help you meet compliance requirements.
· Can be purchased On-Demand (hourly).
· Can be purchased as a Reservation for up to 70% off the On-Demand price.
Per-second billing: With per-second billing, you pay for only what you use. It takes the cost of unused minutes and seconds in an hour off of the bill, so you can focus on improving your applications instead of maximizing usage to the hour. Instances running for irregular periods of time, such as dev/testing, data processing, analytics, batch processing, and gaming applications, especially benefit. EC2 usage is billed in one-second increments, with a minimum of 60 seconds. Similarly, provisioned storage for EBS volumes is billed in per-second increments, with a 60-second minimum. Per-second billing is available for instances launched in On-Demand, Reserved, and Spot forms; in all regions and Availability Zones; and for Amazon Linux, Windows, and Ubuntu. For details on related costs like data transfer, Elastic IP addresses, and EBS-Optimized Instances, see the On-Demand pricing page.
Perform workload cost modeling: Consider the requirements of the workload components and understand the potential pricing models. Define the availability requirement of the component. Determine if there are multiple independent resources that perform the function in the workload, and what the workload requirements are over time. Compare the cost of the resources using the default On-Demand pricing model and other applicable models. Factor in any potential changes in resources or workload components. Perform regular account-level analysis: performing regular cost modeling ensures that opportunities to optimize across multiple workloads can be implemented.
For example, if multiple workloads use On-Demand, at an aggregate level, the risk of change is lower, and implementing a commitment-based discount will achieve a lower overall cost. It is recommended to perform analysis in regular cycles of two weeks to 1 month. This allows you to make small adjustment purchases, so the coverage of your pricing models continues to evolve with your changing workloads and their components. Use the AWS Cost Explorer recommendations tool to find opportunities for commitment discounts. To find opportunities for Spot workloads, use an hourly view of your overall usage, and look for regular periods of changing usage or elasticity.
Reserved instance
Pay for a pre-set amount of time (1 or 3 years) for EC2/compute resources, locked in at a lower rate.
Public vs. Private Cloud
Private cloud is what you build, usually an infrastructure operated solely for a single organization. Public cloud comes from a large cloud vendor: servers rendered over a network open for public use.
What are the differences between relational and no-SQL databases?
Relational and NoSQL are two types of database systems commonly implemented in cloud-native apps. They're built differently, store data differently, and are accessed differently. Relational databases have been a prevalent technology for decades. They're mature, proven, and widely implemented. Competing database products, tooling, and expertise abound. Relational databases provide a store of related data tables. These tables have a fixed schema, use SQL (Structured Query Language) to manage data, and support ACID guarantees. NoSQL databases refer to high-performance, non-relational data stores. They excel in their ease-of-use, scalability, resilience, and availability characteristics. Instead of joining tables of normalized data, NoSQL stores unstructured or semi-structured data, often in key-value pairs or JSON documents. NoSQL databases typically don't provide ACID guarantees beyond the scope of a single database partition. High-volume services that require sub-second response time favor NoSQL datastores.
If an IAM user is bound to the policy below, which of the following actions is she NOT allowed to perform?

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Stmt1601401986280",
      "Action": ["s3:Get*", "s3:List*", "s3:Put*"],
      "Effect": "Allow",
      "Resource": [
        "arn:aws:s3:::bucket-name",
        "arn:aws:s3:::bucket-name/*"
      ]
    }
  ]
}
Remove an object from the bucket. The IAM policy allows all GET, LIST, and PUT actions on the bucket and objects stored within it. It does not allow for any deletion or removal of objects.
What are some features of Docker containers that make them useful?
Return on Investment and Cost Savings
The first advantage of using Docker is ROI. The biggest driver of most management decisions when selecting a new product is the return on investment. The more a solution can drive down costs while raising profits, the better a solution it is, especially for large, established companies, which need to generate steady revenue over the long term. In this sense, Docker can help facilitate this type of savings by dramatically reducing infrastructure resources. The nature of Docker is that fewer resources are necessary to run the same application. Because of the reduced infrastructure requirements Docker has, organizations are able to save on everything from server costs to the employees needed to maintain them. Docker allows engineering teams to be smaller and more effective.
Standardization and Productivity
Docker containers ensure consistency across multiple development and release cycles, standardizing your environment. One of the biggest advantages of a Docker-based architecture is actually standardization. Docker provides repeatable development, build, test, and production environments. Standardizing service infrastructure across the entire pipeline allows every team member to work in a production parity environment. By doing this, engineers are more equipped to efficiently analyze and fix bugs within the application. This reduces the amount of time wasted on defects and increases the amount of time available for feature development. As mentioned, Docker containers allow you to commit changes to your Docker images and version control them. For example, if you perform a component upgrade that breaks your whole environment, it is very easy to roll back to a previous version of your Docker image. This whole process can be tested in a few minutes. Docker is fast, allowing you to quickly make replications and achieve redundancy. Also, launching Docker images is as fast as running a machine process.
CI Efficiency
Docker enables you to build a container image and use that same image across every step of the deployment process. A huge benefit of this is the ability to separate non-dependent steps and run them in parallel. The length of time it takes from build to production can be sped up notably.
Compatibility and Maintainability
Eliminate the "it works on my machine" problem once and for all. One of the benefits that the entire team will appreciate is parity. Parity, in terms of Docker, means that your images run the same no matter which server or whose laptop they are running on. For your developers, this means less time spent setting up environments, debugging environment-specific issues, and a more portable and easy-to-set-up codebase. Parity also means your production infrastructure will be more reliable and easier to maintain.
Simplicity and Faster Configurations
One of the key benefits of Docker is the way it simplifies matters. Users can take their own configuration, put it into code, and deploy it without any problems. As Docker can be used in a wide variety of environments, the requirements of the infrastructure are no longer linked with the environment of the application.
Rapid Deployment
Docker manages to reduce deployment to seconds. This is due to the fact that it creates a container for every process and does not boot an OS. Data can be created and destroyed without worry that the cost to bring it up again would be higher than what is affordable.
Continuous Deployment and Testing
Docker ensures consistent environments from development to production. Docker containers are configured to maintain all configurations and dependencies internally; you can use the same container from development to production, making sure there are no discrepancies or manual intervention. If you need to perform an upgrade during a product's release cycle, you can easily make the necessary changes to Docker containers, test them, and implement the same changes to your existing containers. This sort of flexibility is another key advantage of using Docker. Docker really allows you to build, test, and release images that can be deployed across multiple servers. Even if a new security patch is available, the process remains the same. You can apply the patch, test it, and release it to production.
Multi-Cloud Platforms
One of Docker's greatest benefits is portability. Over the last few years, all major cloud computing providers, including Amazon Web Services (AWS) and Google Compute Platform (GCP), have embraced Docker's availability and added individual support. Docker containers can be run inside an Amazon EC2 instance, Google Compute Engine instance, Rackspace server, or VirtualBox, provided that the host OS supports Docker. If this is the case, a container running on an Amazon EC2 instance can easily be ported between environments, for example to VirtualBox, achieving similar consistency and functionality. Docker also works very well with other providers like Microsoft Azure and OpenStack, and can be used with various configuration managers like Chef, Puppet, and Ansible.
Isolation
Docker ensures your applications and resources are isolated and segregated. Docker makes sure each container has its own resources that are isolated from other containers. You can have various containers for separate applications running completely different stacks. Docker helps you ensure clean app removal since each application runs on its own container. If you no longer need an application, you can simply delete its container. It won't leave any temporary or configuration files on your host OS. On top of these benefits, Docker also ensures that each application only uses the resources that have been assigned to it. A particular application won't use all of your available resources, which would normally lead to performance degradation or complete downtime for other applications.
Security
The last of these benefits of using Docker is security. From a security point of view, Docker ensures that applications running on containers are completely segregated and isolated from each other, granting you complete control over traffic flow and management. No Docker container can look into processes running inside another container. From an architectural point of view, each container gets its own set of resources ranging from processing to network stacks.
SOA
Service Oriented Architecture. This is counter to running a "monolith" application that bundles together all necessary services into one. SOA tries to break apart all functionality into separate services, which can then each be analyzed, scaled, throttled, and made more efficient. This improves HA.
What can AWS users do to control costs in their account? (pick 2)
Set up billing budgets and alarms. Stop running instances or services to halt on-demand spending. To monitor and control your AWS costs, you should set up a budget and get alerts as you approach that monthly spend, and stop or delete unused resources to limit operational costs from adding up.
Which of the following would incur the greatest cost?
Storing 164GB of data in Amazon S3 for one month. Let's break down each possible charge: m5.xlarge instances in us-east-1 cost $0.192/hr. 17 hours x 0.192 = $3.264 total. 164GB in S3 costs $0.023/GB/month. 1 month = 164 x 0.023 = $3.772 total. VPC with 5 subnets and 12 attached elastic IP addresses = no cost. IAM users / IAM policies = no cost.
What is a "miss" when discussing CDNs? A miss means ...
The CDN cache does not have the content, and must go to the origin to fetch it. CDNs operate on the premise that they should only cache requested content. When content is requested that the cache does not contain, or if the content has expired, the CDN must fetch it from the origin, i.e. an S3 bucket, an API or website, or some other content source.
When adding a bootstrap script to EC2 user-data, which of the following should be considered:
The script runs non-interactively, so commands must take account of that. A bootstrapping script runs non-interactively, so commands must take account of that. It runs as root, so sudo is not required. It runs only once, when the instance is created. Bootstrapping scripts can be many lines long, but cannot exceed 16KB in raw form, before it is base64 encoded and injected into the instance. Windows bootstrapping is possible, as either .bat batch scripts, or .ps1 PowerShell scripts.
When adding a bootstrap script to EC2 user-data, which of the following is true:
The script runs non-interactively, so commands must take account of that. Bootstrapping runs as root, so sudo should never be required. Repository updates, package installations, and all other commands must always be performed non-interactively since you are not present to approve or deny them. Windows can be bootstrapped using .bat or .ps1 scripts. Bootstrapping scripts can be up to 16KB in size. Bootstrapping occurs only once, when an instance is first provisioned.
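A sketch of injecting user-data at launch with boto3; the AMI ID and script contents are placeholders:

import boto3

ec2 = boto3.client("ec2")

# The script runs once, as root, non-interactively (note the -y flags)
user_data = """#!/bin/bash
yum update -y
yum install -y httpd
systemctl enable --now httpd
"""

ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # hypothetical AMI
    InstanceType="t3.micro",
    MinCount=1,
    MaxCount=1,
    UserData=user_data,               # boto3 base64-encodes this for you
)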
In the networking diagram below, what is the relationship between resources A and B?
They are peered VPCs
Why are cloud services more economical than traditional data centers?
They maximize the computing power of physical hardware by using hypervisors. Cloud Services are more economical because they maximize what can be run on physical infrastructure. This process makes sophisticated use of hypervisors and automation.
One of the primary purposes of an EC2 security group is:
To control which remote addresses can connect to EC2 instances over which specific TCP or UDP ports. EC2 security groups are a networking filter much like a firewall. They control ingress and egress of specific address blocks (in CIDR form) coupled with specific TCP and UDP ports. For example, the security group for public-facing web servers might include an INGRESS rule that allows traffic over TCP ports 80 and 443 from 0.0.0.0/0.
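A sketch of such an ingress rule with boto3; the security group ID is a placeholder:

import boto3

ec2 = boto3.client("ec2")

# Allow HTTP and HTTPS from anywhere to a public-facing web server group
ec2.authorize_security_group_ingress(
    GroupId="sg-0123456789abcdef0",  # hypothetical security group
    IpPermissions=[
        {"IpProtocol": "tcp", "FromPort": 80, "ToPort": 80,
         "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
        {"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443,
         "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
    ],
)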
In the design below, what function does Travis CI play?
Travis builds new GitHub content and publishes it to S3. Travis CI is a continuous integration / continuous deployment tool. In this design it takes new pushes to GitHub, builds them into web content, publishes them to S3, which is presented through a CloudFront cache.
Explain what DevOps means for how you organize teams and define their roles.
Under a DevOps model, development and operations teams are no longer "siloed." Sometimes, these two teams are merged into a single team where the engineers work across the entire application lifecycle, from development and test to deployment to operations, and develop a range of skills not limited to a single function. In some DevOps models, quality assurance and security teams may also become more tightly integrated with development and operations and throughout the application lifecycle. When security is the focus of everyone on a DevOps team, this is sometimes referred to as DevSecOps. These teams use practices to automate processes that historically have been manual and slow. They use a technology stack and tooling which help them operate and evolve applications quickly and reliably. These tools also help engineers independently accomplish tasks (for example, deploying code or provisioning infrastructure) that normally would have required help from other teams, and this further increases a team's velocity.
Which of the following actions would help enable an EC2 instance to securely retrieve a sensitive file from a private S3 bucket? Select the TWO most secure actions.
Use the AWS CLI tools or SDKs to retrieve the S3 file to the EC2 server that has an IAM role with appropriate permissions. Attach the instance to an IAM role with a custom IAM policy including the GetObject action. An IAM policy should only include the GetObject action, since the instance does not need to list, describe, put, or delete any objects in S3. The IAM policy should also define a specific S3 bucket and path as the resource. Then the EC2 instance, running under an IAM role attached to the policy, should use the CLI or SDK to fetch the file.
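A minimal sketch of the retrieval step with boto3, assuming the instance role is already attached; the bucket, key, and local path are placeholders:

import boto3

# No keys are configured here: on an EC2 instance with an attached IAM role,
# boto3 picks up temporary credentials from the instance metadata service.
s3 = boto3.client("s3")
s3.download_file("example-private-bucket", "config/secret-file.txt",
                 "/tmp/secret-file.txt")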
"Undifferentiated heavy lifting"
What cloud vendors try to provide for you - services, not servers. A hard/repetitive task that does not require any great talent or insight to perform, such as racking+stacking servers, keeping power on, etc. These do not add any particular value to your company or project, but they must be done.
What are AWS Service Quotas?
Your AWS account has default quotas, formerly referred to as limits, for each AWS service. Unless otherwise noted, each quota is Region-specific. You can request increases for some quotas, and other quotas cannot be increased. Service Quotas is an AWS service that helps you manage your quotas for many AWS services, from one location.
Stateless design
a design pattern that allows supporting infrastructure to be swapped in/out without interruption to user activities. Session data does not need to be stored on any individual server.
High availability (HA)
a design principle taking "fault tolerance" into account. The app stays up even with failures of components since you remove all single points of failure. Also incorporates easy crossover so when the system fails, another instance can take over.
Peering (VPCs)
a logical connection between two or more VPCs that allows you to then define additional routes, ACLs, and subnet rules to allow traffic to flow from one VPC to another.
NAT host/NAT gateway
allows OUTBOUND ONLY traffic
Orchestration
allows containers to be balanced and monitored. Kubernetes, Docker. Includes provisioning/deployment of containers, scaling up/down, load balancing, and automated resilience.
VPC* (Virtual Private Cloud)
an AWS service that allows you to define a private network space with scoped access to cloud services. This includes subnets (located in availability zones)
Right-sizing
appropriate instance sizing for your workload. Not too large, not too small. Mem/CPU.
Security Groups
are bound to instances, not to VPCs or subnets
Network ACL
are blunt, stateless allow/deny rules for ingress/egress, applied per subnet (the default ACL covers the whole VPC)
Elastic IPs
can be attached to EC2 instances
Edge computing
computing done at or near the source of the data (local places) rather than being transmitted to a data center.
Internet gateway
connects the internet to a public subnet, uses a route. Allows INBOUND and OUTBOUND
IoT / Internet of Things
describes the network of physical objects connected to the Internet.
Route table
directs traffic. a set of rules that tells traffic where to flow
CloudFront*
edge caching CDN (content delivery network) service using 200+ edge locations. CloudFront routes by latency rather than geography, uses cached content, and is helpful for serving content.
EBS*
elastic block storage created in the same availability zone as your server. SSD available, magnetic available, high IOPS available, highly durable, comes in fixed increments, attached to a single node.
Public subnet
has connectivity to 0.0.0.0/0 ("the internet") for ingress/egress, and routes in place.
Private subnet
has no connectivity to 0.0.0.0/0 for ingress/egress, unless a NAT is used and route rules are in place.
· SNS / SQS
i. Amazon Simple Queue Service (SQS) and Amazon SNS are both messaging services within AWS, which provide different benefits for developers. Amazon SNS allows applications to send time-critical messages to multiple subscribers through a "push" mechanism, eliminating the need to periodically check or "poll" for updates. Amazon SQS is a message queue service used by distributed applications to exchange messages through a polling model, and can be used to decouple sending and receiving components. Amazon SQS provides flexibility for distributed components of applications to send and receive messages without requiring each component to be concurrently available.
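A minimal boto3 sketch of the SQS polling model; the queue URL is a placeholder:

import boto3

sqs = boto3.client("sqs")
queue_url = "https://sqs.us-east-1.amazonaws.com/123456789012/example-queue"

# Producer: the decoupled sender does not care when the consumer runs
sqs.send_message(QueueUrl=queue_url, MessageBody="order-12345 created")

# Consumer: polls the queue, processes, then deletes each message
resp = sqs.receive_message(QueueUrl=queue_url, MaxNumberOfMessages=1,
                           WaitTimeSeconds=10)  # long polling
for msg in resp.get("Messages", []):
    print(msg["Body"])
    sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])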
· SDKs / CLI / boto3
i. SDK stands for "Software Development Kit", which is a great way to think about it: a kit. Think about putting together a model car or plane. When constructing this model, a whole kit of items is needed, including the kit pieces themselves, the tools needed to put them together, assembly instructions, and so forth. ii. boto3 is the AWS SDK for Python. Boto3 makes it easy to integrate your Python application, library, or script with AWS services. iii. The AWS Command Line Interface (CLI) is a unified tool to manage your AWS services. With just one tool to download and configure, you can control multiple AWS services from the command line and automate them through scripts.
· CloudFormation
i. Using CloudFormation, you can create an entire stack with one function call. The stack can be comprised of multiple Amazon EC2 instances, each one fully decked out with security groups, EBS (Elastic Block Store) volumes, and an Elastic IP address (if needed). The stack can contain Load Balancers, Auto Scaling Groups, RDS (Relational Database Service) Database Instances and security groups, SNS (Simple Notification Service) topics and subscriptions, Amazon CloudWatch alarms, Amazon SQS (Simple Queue Service) message queues, and Amazon SimpleDB domains.
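A minimal boto3 sketch of that one call, assuming a template file like the VPC snippet earlier in this document; the stack and file names are placeholders:

import boto3

cloudformation = boto3.client("cloudformation")

# Launch an entire stack of resources from a template with a single call
with open("template.yaml") as f:  # hypothetical template file
    template_body = f.read()

cloudformation.create_stack(
    StackName="example-stack",
    TemplateBody=template_body,
)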
Spot instance
lowest of all pricing for EC2 resources. You make a bid on the spot market based upon what you're willing to pay for unused capacity. Subject to fluctuations and getting outbid, but useful for short-term large jobs.
Object storage
non-incremental storage with unlimited capacity. It is distributed/resilient, allows external access, and has eventual consistency. Full file read/write. Best to use for web content, media files, and PDFs.
CI/CD (Continuous Integration / Continuous Deployment)
one of the main principles of DevOps where CI is making sure that small changes are integrated and CD is making sure deployment occurs often to minimize the risk of having large problems when merging
RDS*
relational database service. Database-as-a-service that allows you to make a TCP connection and speak SQL directly to the service.
S3*
simple storage service. Amazon's object storage service, an example of IaaS. Eleven nines (99.999999999%) of durability, with 99.99% designed availability.
On-Demand instance
simplest, pay-as-you-go pricing. Highest of all pricing for compute resources.
What is the "Shared Security Model" of AWS?
the AWS Shared Security Model dictates which security controls are AWS's responsibility, and which are the customer's.
Serverless computing
the cloud fully manages the virtual machines as necessary, such as running code in response to events and automatically scaling.
· RedShift
uses SQL to analyze structured and semi-structured data across data warehouses, operational databases, and data lakes, using AWS-designed hardware and machine learning to deliver the best price performance at any scale