Cloud Computing


Argue cloud security is better than enterprise

1) dedicated security experts 2) regular auditing 3) better support for data recovery 4) physical security (Amazon building security) 5) amortized hardware 6) patched OSes --> up to date 7) blessed OSes --> verified 8) no public cloud security breaches known to the public yet 9) no possibility of disgruntled enterprise employees

Argue enterprise security is better than cloud

1) only my humans, no Amazon employees 2) I know where my data is 3) more specific design (maybe you don't need something too sophisticated, depending on the nature of the data) 4) faster, more specific reaction to a breach 5) smaller attack surface 6) no wire attacks, no "in-flight" attacks

Why virtualization?

1) reduce capital expenditures (like buying hardware) and OpEx (operational expenditures). 2) Avoid downtime with VM relocation. 3) dynamically re-balance workload to guarantee application SLAs. 4) enforce security policy

MAC address

A MAC address is given to a network adapter when it is manufactured. It is hardwired or hard-coded onto your computer's network interface card (NIC) and is unique to it. The ARP (Address Resolution Protocol) translates an IP address into a MAC address.

Elastic beanstalk

A PaaS that we used in PA2 to run Docker containers. Takes care of provisioning ECS for you.

containers

An isolated system. Multiple containers run on a single control host and share a single kernel. Because containers share the same OS kernel as the host, they can be more efficient than VMs, which require separate OS instances. Containers hold the components necessary to run the desired software. The host OS constrains each container's access to physical resources (CPU, memory) so that a single container cannot consume them all.

EFS (elastic file system)

A mountable file system that can be mounted on multiple VMs at the same time. Pro: remote access; well-known interface (NFS: network file system). Con: lack of structure; NFS scalability

DynamoDB

A NoSQL database, which gives up some benefits of querying but gains scalability. Pro: scalability and performance. Con: poor analytics/aggregates; does not have a well-known interface; may be difficult to see everything in its entirety (can look at one entry really easily, but hard to get an overview)

public IP

A public (or external) IP address is the one that your ISP (Internet Service Provider) provides to identify the network to the outside world. It is an IP address that is unique throughout the entire Internet. It can be an address that never changes (a fixed/static IP address), but most ISPs provide an address that can change (a dynamic IP address). A machine may or may not have a public IP, but it always has a private IP.

Script vs. program

A script is interpreted at runtime; a program is compiled. A script gives rapid prototyping and allows for dynamic typing.

AWS security group

A security group acts as a virtual firewall that controls the traffic to and from virtual instances

Protocol

A sequence of well defined messages. Ex: HTTP

sockets

A socket is one endpoint of a two-way communication link between two programs running on the network. A socket is bound to a port number so that the TCP layer can identify the application that data is destined to be sent to. An endpoint is a combination of an IP address and a port number.
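
The two endpoints described above can be sketched with Python's standard `socket` module. This is a minimal illustration, not a production server: the loopback address, the OS-chosen port, and the helper names `echo_once` and `round_trip` are all assumptions made for the example.

```python
import socket
import threading


def echo_once(server_sock: socket.socket) -> None:
    """Accept one connection and send back whatever arrives, uppercased."""
    conn, _addr = server_sock.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data.upper())


def round_trip(message: bytes) -> bytes:
    """Send `message` from a client socket to a local server socket."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        # Port 0 asks the OS for any free port (an assumption, so the
        # sketch never collides with a port that is already in use).
        server.bind(("127.0.0.1", 0))
        server.listen(1)
        host, port = server.getsockname()  # the server's endpoint: IP + port

        worker = threading.Thread(target=echo_once, args=(server,))
        worker.start()

        # The client socket is the other endpoint of the same TCP link.
        with socket.create_connection((host, port)) as client:
            client.sendall(message)
            reply = client.recv(1024)

        worker.join()
        return reply
```

Calling `round_trip(b"hello")` sends five bytes to the server endpoint and reads the reply back over the same connection.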

Monolithic

A software system is called "monolithic" if it has a monolithic architecture, in which functionally distinguishable aspects (for example data input and output, data processing, error handling, and the user interface) are all interwoven, rather than containing architecturally separate components.

Synchronous vs. asynchronous

A synchronous operation blocks a process till the operation completes. An asynchronous operation is non-blocking and only initiates the operation
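
A rough way to see the difference is to time the same simulated work both ways. The sketch below uses `time.sleep` as a stand-in for a blocking call and `asyncio.sleep` for a non-blocking one; the function names and delays are hypothetical.

```python
import asyncio
import time


def blocking_fetch(delay: float) -> str:
    # Synchronous: the caller is stuck here until the sleep completes.
    time.sleep(delay)
    return "done"


async def non_blocking_fetch(delay: float) -> str:
    # Asynchronous: awaiting yields control so other tasks can proceed.
    await asyncio.sleep(delay)
    return "done"


def run_sequentially(delay: float, count: int) -> float:
    """Run `count` blocking calls one after another; return elapsed seconds."""
    start = time.perf_counter()
    for _ in range(count):
        blocking_fetch(delay)
    return time.perf_counter() - start


async def run_concurrently(delay: float, count: int) -> float:
    """Initiate `count` non-blocking calls at once; return elapsed seconds."""
    start = time.perf_counter()
    await asyncio.gather(*(non_blocking_fetch(delay) for _ in range(count)))
    return time.perf_counter() - start
```

With three calls of 0.05 s each, the sequential version takes roughly the sum of the delays, while the concurrent version takes roughly one delay, because each operation is only initiated and then awaited together.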

VPN

A virtual private network (VPN) extends a private network across a public network, and enables users to send and receive data across shared or public networks as if their computing devices were directly connected to the private network.

Lambda functions

AWS Lambda is a serverless compute service that runs your code in response to events and automatically manages the underlying compute resources for you. Functions must have an endpoint and some triggering event. Used in PA3 and PA6.
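
As a minimal sketch, a Python Lambda function is a handler that receives the triggering event and a context object. The `"name"` key in the event and the response shape (the form commonly returned when API Gateway is the endpoint) are assumptions for illustration, not the code used in PA3 or PA6.

```python
import json


def lambda_handler(event, context):
    """Respond to a triggering event with a small JSON greeting."""
    # `event` carries the trigger's payload; the "name" key is hypothetical.
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }
```

The handler can be exercised locally by calling it with a plain dictionary, for example `lambda_handler({"name": "PA3"}, None)`, since it does not use the context object.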

S3

AWS blob storage. Pro: simple, good for large data. Con: poor analytics/aggregates, no structure.

horizontal scaling (scale out)

Add more machines to ensure reasonable performance as traffic increases (what most cloud systems do)

Xen

Allows multiple operating systems to execute on the same computer hardware concurrently

AMI

Amazon machine image. PRO: faster boot (than DevOps script). Con: less transparent than DevOps script

VPC

Amazon virtual private cloud......

RDS

Amazon's DB services: Relational database as a service. Pro: queries, Con: not super scalable.

Amdahl's law

Amdahl's law states that in parallelization, if P is the proportion of a system or program that can be made parallel, and 1-P is the proportion that remains serial, then the maximum speedup that can be achieved using N processors is 1/((1-P)+(P/N)). As N tends to infinity, the maximum speedup tends to 1/(1-P).
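
The formula can be checked numerically with a short sketch. The function names and the sample values of P and N below are chosen for illustration only.

```python
def amdahl_speedup(p: float, n: int) -> float:
    """Maximum speedup with parallel fraction p on n processors."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1.0 / ((1.0 - p) + p / n)


def amdahl_limit(p: float) -> float:
    """Upper bound on speedup as the number of processors grows without bound."""
    if p == 1.0:
        return float("inf")  # a fully parallel program has no serial bottleneck
    return 1.0 / (1.0 - p)
```

For example, with P = 0.9 and N = 10 the speedup is 1/(0.1 + 0.09), a little over 5x, and no number of processors can push it past 10x, because the 10% serial portion always remains.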

EBS (elastic block storage)

An external drive. A single device attached to a single VM. Pro: not intra-VM. Con: only one VM.

ports

An identifier to a particular subsystem on a machine --> a gateway

sandbox

An isolated computing environment in which a program or file can be executed without affecting the application in which it runs. Sandboxes are used by developers to test. Sandboxes cannot: write to the filesystem, open a socket and access another host directly, spawn a subprocess or thread, or make other system calls. An example of a PaaS.

Heroku

Another example of a PaaS. Hosted and managed within AWS. Bought by salesforce.

API

Application programming interface: a collection of signatures of functions

Why choose one cloud storage option over another?

Availability, scalability, security, interface, speed, cost, preference for latency or bandwidth?

typical scripting languages

Bash, Javascript, PHP, Perl, Python

Availability

Can I get the data when I want it?

Import/Export snowball

Cloud storage: Amazon service for transferring terabytes of data into the cloud. You can ship the data (e.g. via UPS) and Amazon will upload it. AKA sneakernet AKA fedexnet

DLL

Collection of code that is the body of the API. When an app is booted, the app is combined w/ all the other libraries it uses at runtime

DevOps

Combining the roles of software developers and IT professionals; automating the process of software delivery and infrastructure changes. Docker is an example of a DevOps tool.

CLI

Command line interface

declarative vs. procedural

Declarative = what, procedural = how
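
One way to see the contrast is to compute the same total twice: once with an explicit loop (how) and once with a SQL query (what). The sketch uses Python's built-in `sqlite3` module; the table name, column names, and sample orders are made up for the example.

```python
import sqlite3

# Hypothetical sample data: (customer name, order amount).
ORDERS = [("alice", 30), ("bob", 5), ("alice", 12), ("carol", 50)]


def total_procedural(orders, customer):
    """Procedural: spell out how to walk the data and accumulate the sum."""
    total = 0
    for name, amount in orders:
        if name == customer:
            total += amount
    return total


def total_declarative(orders, customer):
    """Declarative: state what result is wanted; the engine decides how."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE orders (name TEXT, amount INTEGER)")
        conn.executemany("INSERT INTO orders VALUES (?, ?)", orders)
        row = conn.execute(
            "SELECT COALESCE(SUM(amount), 0) FROM orders WHERE name = ?",
            (customer,),
        ).fetchone()
        return row[0]
    finally:
        conn.close()
```

Both functions return the same number for the same customer; only the second leaves the iteration strategy to the database engine.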

Load balancer

Determines what web server to go to. Could be stateful or stateless, depending on architecture (stateful if it stores info about previous server traffic). CON: single point of failure.
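
The stateful/stateless distinction can be sketched with two toy selection strategies. Neither is how ELB is implemented; the server names, the CRC32 hash choice, and the class name are assumptions for illustration.

```python
import zlib

# Hypothetical pool of web servers.
SERVERS = ["web-1", "web-2", "web-3"]


def pick_stateless(client_ip: str, servers=SERVERS) -> str:
    """Stateless: the choice depends only on the incoming request."""
    return servers[zlib.crc32(client_ip.encode()) % len(servers)]


class LeastConnectionsBalancer:
    """Stateful: remembers how much traffic each server is handling."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def pick(self) -> str:
        # Choose the server with the fewest open connections so far.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server: str) -> None:
        # Call when a connection to `server` closes.
        self.active[server] -= 1
```

The stateless picker always sends the same client IP to the same server and needs no memory, while the least-connections balancer must track previous traffic to make its decision.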

Docker

Docker is an open-source program that enables a Linux application and its dependencies to be packaged as a container. Makes containers portable amongst Linux systems. Recently expanded to support Windows containers. Capabilities: packaging --> creating images with dependencies, execution/scheduling, versioning/deployment

Durability

Does the data persist (in bad times)?

OS-level virtualization

Dynamically create/destroy containers that are "bigger" than processes but "smaller" than virtual machines. PRO: lightweight (fast creation/destruction, little overhead switching between instances, no emulation), good isolation (security, resource usage). CON: generally runs "the same OS" as the host machine (i.e. you cannot run Windows on Linux via OS-level virtualization).

AutoScaling

Dynamically scaling and descaling. Takes time: condition must exist for some time, cloudwatch takes time --> separate infrastructure, autoscale is a service --> takes time, VMs take time to boot, ELB (elastic load balancer) takes time to determine liveness

ECS

EC2 container service. Elastic beanstalk managed ECS for us in PA2. You can use ECS directly if you want more control. Steeper learning curve than beanstalk. Supports Docker containers and allows you to easily run applications on a managed cluster of Amazon EC2 instances.

ELB

Elastic load balancer. Determines what web server to go to.

EULA

End-user license agreement: what the customer can/cannot do with the product, and ramifications of violations.

Why would you pick one OS over another?

File system architecture (speed or reliability), it's what you used last time, cost, efficiency (resource management, boot time, minimal operating resources), U.I., security, dev tools

Full virtualization vs. paravirtualization

Full: Unmodified OS/application code. Performance hit because of hypervisor mediation. x86 architecture problems. Full is slow! Para: OS cooperates with hypervisor. OS code must be modified for this cooperation. Xen uses paravirtualization.

Least privilege

Giving access only to the bare minimum of information and resources necessary to complete the legitimate purpose

google app engine

Good for running web apps. Client interacts over HTTP(S). Supports Java, Python, Go, PHP. Simple app configuration. Scalability is no longer the dev's concern.

IAM

Identity and access management.

AWS region

Independent geographic location. Each region consists of multiple availability zones.

IaaS

Infrastructure as a service. Maintain complete control of software, but don't want to maintain the hardware. IaaS providers provide the VM, and you can put whatever software you like on it. Ex: EC2

IDE

Integrated development environment: some type of built-in debugging

IIS

Internet Information Services (IIS) is a flexible, general-purpose web server from Microsoft that runs on Windows systems to serve requested HTML pages or files. Used this in PA1.

latency

Latency is the amount of time a message takes to traverse a system. In a computer network, it is an expression of how much time it takes for a packet of data to get from one designated point to another. It is sometimes measured as the time required for a packet to be returned to its sender. Analogous to length of pipe.

LXC

Linux container: looks like a VM, but has the speed of a process. Makes each process have its own filesystem.

DNS

Maps names (strings) to IP addresses. Amazon has a DNS server that resolves an instance's hostname to its IP address. Does not assign IP addresses (that's DHCP).

Microservices

Microservices architecture is an approach to application development in which a large application is built as a suite of modular services. Each module supports a specific business goal and uses a simple, well-defined interface to communicate with other sets of services. Containers and microservices are a good fit --> scale individual microservices, roll out new version of a microservice

AWS availability zone

Multiple availability zones per region. Isolated zones, but connected via low latency links to the other zones in its region.

Defense in Depth

Multiple layers of security are placed throughout a system

NLP

Natural language processing

NIC

Network interface card: when the machine comes on, this thing has a logical/physical sequence of bits used as identification (the MAC address)

force.com

PaaS for business apps by salesforce

PaaS

Platform as a service. Provides application developers with tools to develop for that particular platform. Not a blank slate like IaaS, but not a finished product like SaaS. Write/run/debug in a local emulation environment, then give to the platform for deployment. No concept of SSH here: use a browser to control the application. Ex: Microsoft Windows Azure. Pro: simplifies work if you do things the way the platform wants you to. Con: more app logic to learn, can cause vendor lock-in.

Intra-VM cloud storage

Pro: It works; fast development and prototyping. Con: VM death = data death! Single point of failure. Can't scale out, only scale up.

Database schema (pros & cons)

Pro: keeps data homogeneous. Con: conforming to a schema means you require data for each field, and you may not know or care about a specific field for an entry. Requires design. Difficult to add attributes halfway through.

HTTP(S)

Protocol for sending webpages. A means of communication between server and client. 4 HTTP verbs: post, put, get, delete. HTTP over SSL is HTTPS.

programming assignment 1 (health plans)

Purpose: gain experience setting up an IaaS cloud application and using RDS. Created a client program to hit our site to measure latency before scaling. Used RDP to get on the Windows VM, because the Windows VM doesn't support SSH.

Cloud storage options

RDS (relational database as a service), BLOB storage (unstructured, ex: S3), Block device (disk abstraction, single device attached to a single VM), EFS-NFS, DynamoDB (noSQL)

SSH

Secure SHell. Program designed to allow users to log into another computer over a network, to execute commands on that computer and to move files to and from that computer. No certificates. Involves public key cryptography.

SSL

Secure Sockets Layer. Protocol by which to secure websites. Server authentication, optional client authentication. Involves public key cryptography.

"Serverless"

Serverless computing, AKA function as a service (FaaS), is a cloud computing code execution model in which the cloud provider fully manages starting and stopping of a function's container as necessary to serve requests. Requests are billed by an abstract measure of the resources required to satisfy the request, rather than per VM per hour (never pay for idle time). Does not actually involve running code without servers. Called "serverless computing" b/c the person that owns the system does not have to purchase, rent or provision servers or VMs for the back-end code to run on.

SOA

Service oriented architecture. Good: incremental update (versioning), speed to market/customers. Bad: communication cost, requires discovery.

SaaS

Software as a service. Uses the web to deliver applications managed by a third-party vendor. Interface is accessed on the client's side. Eliminates the need to install and run applications on individual machines. Examples: Gmail, Salesforce

SDK

Software development kit: The extra "stuff" other than code itself that one might use in an IDE. Ex: the documentation associated with each function

Stateless

Stateless: a stateless protocol does not require the server to retain information

Why would a restarted VM have the same IP address/DNS name?

The IP lease is not yet up. Starting/stopping the VM does not automatically generate a new IP address.

Why would a restarted VM have a different IP address/DNS name?

The IP lease is up

TCP

Transmission control protocol. TCP enables two hosts to establish a connection and exchange streams of data. TCP guarantees delivery of data and also guarantees that packets will be delivered in the same order in which they were sent.

vertical scaling (scale up)

Use a better machine to ensure reasonable performance as traffic increases. CON: single point of failure.

Programming assignment 2 (auto-grader)

Used the PaaS elastic beanstalk to run a Docker container to implement an autograder system.

programming assignment 3 (slack)

Using AWS lambda functions to create a slack chatbot. API gateway is the function's endpoint

sudo

A Linux program that allows users to run programs with the security privileges of another user, by default the superuser

DevOps concerns/issues

expressiveness/complexity of the language, efficiency/speed of language implementation, sub-changes to a deployed infrastructure, offline error checking, tooling/ease of use, security

private IP

Local devices see another device on the same network via its private IP address. However, devices residing outside of your local network cannot communicate directly via the private IP address; they use your router's public IP address instead. If you ask a machine "who am I?", it only knows its private IP.

Glacier

Long-term archival storage, reportedly on Blu-ray discs. Pro: super cheap. Con: slow (about 4-hour retrieval latency)

MFA

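Multi-factor authentication: combines two or more of 1) what you know (e.g. passwords or security questions), 2) what you have (e.g. a phone or hardware token), 3) who you are (e.g. a fingerprint). A username identifies you but is not a factor by itself.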

SLA

Service-level agreement: what the customer gets. Public cloud SLAs are "lousy". Amazon promises a monthly uptime percentage of 99.95%. If they violate it, they give post-facto service credit (not a refund).
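
To put the 99.95 figure in concrete terms, the sketch below converts an uptime percentage into the downtime it still permits. The 30-day month is an assumption, and the function name is hypothetical.

```python
def allowed_downtime_minutes(uptime_percent: float, days: int = 30) -> float:
    """Minutes of downtime permitted in `days` days at the given uptime."""
    if not 0.0 <= uptime_percent <= 100.0:
        raise ValueError("uptime_percent must be between 0 and 100")
    total_minutes = days * 24 * 60
    return total_minutes * (100.0 - uptime_percent) / 100.0
```

At 99.95% over a 30-day month (43,200 minutes), the provider can be down for about 21.6 minutes and still meet the SLA.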

bandwidth

The amount of data that can be transmitted in a fixed amount of time. Analogous to width of pipe.
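
Latency and bandwidth combine in a simple first-order estimate of transfer time: one latency delay plus the payload size divided by the bandwidth. This model is an assumption for illustration (it ignores protocol overhead, congestion, and round trips), and the numbers in the usage note are made up.

```python
def transfer_time_seconds(size_bytes: int, latency_s: float,
                          bandwidth_bits_per_s: float) -> float:
    """Estimate time to deliver a payload: pipe length plus size over pipe width."""
    if bandwidth_bits_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    size_bits = size_bytes * 8
    return latency_s + size_bits / bandwidth_bits_per_s
```

On a link with 50 ms latency and 8 Mbit/s bandwidth, a 1 MB payload takes about 1.05 s (bandwidth dominates), while a 1 KB payload takes about 0.051 s (latency dominates). This is one reason a storage choice may favor latency or bandwidth depending on object size.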

Virtualization

Virtualization is software that separates physical infrastructure to create various dedicated resources. It is the fundamental technology that powers cloud computing.

