Midterm Prep for CS 498: Cloud Computing Applications

Remote Procedure Call

"RPC" stands for Remote Procedure Call Rolling Policy Check Really Predictable Code Realtime Performance Clusters

Google Protocol Buffer

Which of the following RPC frameworks was created by Google to deal with communication in big data deployments? Apache Thrift Google Protocol Buffer JVM

False Big Data, by definition, cannot be processed in a timely manner by a single standard computer.

A commercial off the shelf laptop, disconnected from the internet, has the storage, memory, and computational capacity to process Big Data in a timely manner. True False

No Cloud computing only makes economic sense when the Utility Premium is less than the ratio of Peak demand to Average demand. If Peak demand falls while Average demand remains steady, that ratio shrinks, so cloud computing becomes less economically attractive than before.

A company determines that the most economical decision is to use in-house servers. Over time the company's peak demand for computing resources decreases sharply, while its average demand remains steady. Should the company consider switching from in-house servers to a cloud approach? Yes No
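
The rule in the answer above can be checked with a little arithmetic. The following is a minimal sketch, assuming the simplified model stated in the answer (cloud wins only when the utility premium is below the peak-to-average ratio); the function name and the demand numbers are hypothetical.

```python
def cloud_is_cheaper(utility_premium: float, peak_demand: float, average_demand: float) -> bool:
    """Return True when the utility premium is below the peak-to-average demand ratio."""
    return utility_premium < peak_demand / average_demand

# Hypothetical numbers: the cloud charges 2x the in-house unit cost.
before = cloud_is_cheaper(utility_premium=2.0, peak_demand=150, average_demand=100)  # ratio 1.5
after = cloud_is_cheaper(utility_premium=2.0, peak_demand=110, average_demand=100)   # ratio 1.1
print(before, after)
```

In both cases the premium exceeds the ratio, and the gap widens once peak demand falls, which is why the company should stay in-house.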

GSI It is tolerable if a user's tweets show up on one client but do not show up on another client immediately. GSI follows the eventual consistency model and ensures that the clients' views will eventually be the same. GSI also does not constrain table size and hence can be used to store a large number of tweets.

A social media company wants to use DynamoDB for storing posts of users. Which secondary indexing method should it use? LSI GSI

False It is an interval, not a single value.

A timestamp in the TrueTime API is a single value, exposing clock uncertainty. True False

False The interval exposes the clock uncertainty.

A timestamp in the TrueTime API is an interval, which masks the clock uncertainty. True False
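
The two TrueTime cards above can be illustrated with a small model of an interval timestamp. This is a sketch only: the class and function names are hypothetical Python stand-ins, not Google's API, and the uncertainty bound `epsilon` is passed in by hand.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TTInterval:
    earliest: float  # the true time is guaranteed to be >= earliest
    latest: float    # the true time is guaranteed to be <= latest

def tt_now(local_clock: float, epsilon: float) -> TTInterval:
    """Expose clock uncertainty as an interval instead of hiding it in a single value."""
    return TTInterval(local_clock - epsilon, local_clock + epsilon)

def tt_after(now: TTInterval, t: float) -> bool:
    """True only if time t has definitely passed."""
    return t < now.earliest

def tt_before(now: TTInterval, t: float) -> bool:
    """True only if time t has definitely not arrived yet."""
    return t > now.latest

now = tt_now(local_clock=100.0, epsilon=4.0)  # interval [96.0, 104.0]
print(tt_after(now, 95.0), tt_after(now, 98.0), tt_before(now, 105.0))
```

A time that falls inside the interval (98.0 here) is neither definitely past nor definitely future, which is exactly the uncertainty the interval exposes.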

False, AWS Lambda functions are not allowed to run for more than a few minutes

AWS Lambda is a good technology to use when you have a function that will take several days to run. True, AWS Lambda is optimized for long-running jobs False, AWS Lambda functions are not allowed to run for more than a few minutes

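False Once an object is deleted, it may take some time before all of the replicated copies of the object are deleted. Other processes may still read those replicated copies before they are all deleted. (This describes S3's original eventual-consistency behavior; since December 2020, S3 provides strong read-after-write consistency, including for deletes.)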

Amazon S3 BLOB Storage's consistency model guarantees that once an object is deleted by a process it cannot be read by any other process. True False

False

Apache Spark cannot read from any Hadoop input. True False

MapReduce

Apache Spark was created to solve the shortcomings of which technology? Juju Apache Storm YARN MapReduce

False In Ceph, the data and metadata are decoupled. Ceph improves performance by: * limiting interaction between clients and servers * leveling metadata load * offloading decision making to the many data servers

Ceph achieves high performance in part by storing data and its accompanying metadata in the same server for faster access. True False

located on disks that are physically attached to the host computer well-suited for caching well-suited for temporary logs

Check all that apply: AWS Instance Store is located on disks that are physically separated from the host computer located on disks that are physically attached to the host computer well-suited for long-term persistent storage well-suited for caching well-suited for temporary logs

ALL of them

Choose all that apply: Which are examples of clustered file systems? Lustre NFS SMB Ceph

Microsoft OneDrive Apple iCloud Drive Dropbox

Choose all that apply: Which of these are Internet-Level Personal Filesystems? Microsoft OneDrive Glacier Redis Apple iCloud Drive Dropbox

Negative Correlation Jobs with negative correlation have a small coefficient of variation, which can lead to higher utilization.

Cloud providers prefer variable jobs that have Positive Correlation Negative Correlation
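
The effect of correlation on the coefficient of variation can be seen with two toy demand traces. This is a minimal sketch with made-up numbers; the helper name is hypothetical, and it uses the population standard deviation from Python's `statistics` module.

```python
from statistics import mean, pstdev

def coefficient_of_variation(demand):
    """Standard deviation divided by mean: lower means smoother aggregate demand."""
    return pstdev(demand) / mean(demand)

job_a = [10, 30, 10, 30]
job_b_negative = [30, 10, 30, 10]  # peaks exactly when job_a dips
job_b_positive = [10, 30, 10, 30]  # peaks exactly when job_a peaks

combined_negative = [a + b for a, b in zip(job_a, job_b_negative)]
combined_positive = [a + b for a, b in zip(job_a, job_b_positive)]

cv_negative = coefficient_of_variation(combined_negative)
cv_positive = coefficient_of_variation(combined_positive)
print(cv_negative, cv_positive)
```

The negatively correlated pair sums to a flat line, so the provider can size capacity close to the average and keep utilization high.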

False Cloud services do not need to be cheaper to be economical, as long as the utility premium is less than the ratio between peak demand and average demand.

Cloud services need to be cheaper (e.g. through economies of scale) to be economical compared to in-house servers. True False

False Partitions are immutable

Each topic has partitions and each partition is ordered, numbered, and mutable. True False

False AWS Instance Stores can handle higher throughput because they are located on disks that are physically attached to the host computer. Even so, experiments show that their edge over EBS in throughput tests is not orders of magnitude, a testament to the efficiency of NVMe over Fabrics technologies and data center networking designs.

Experiments have shown that AWS Instance Stores perform at least an order of magnitude or more better than AWS Elastic Block Stores in throughput tests in comparable settings. True False

some availability for stronger consistency

Compared to NoSQL, NewSQL sacrifices partition tolerance for high availability some availability for stronger consistency strong consistency for high availability strong consistency for partition tolerance

False CosmosDB supports multiple consistency models, and one of them is strong consistency

CosmosDB is an eventually-consistent system, so if you want strong consistency, you must use a different system. True False

False The CAP theorem says that no system can achieve guaranteed perfection in consistency, availability, and partition tolerance. Google Spanner can only achieve consistency and partition tolerance (CP).

Google Spanner is the first real-world system to achieve guaranteed consistency, availability, and partition tolerance (CAP) True False

False Index servers and document servers are backend workers, and the Google Web server is the frontend.

Google's cluster architecture uses index servers as frontend load-balancers, and document servers to hold the indexed information. True False

True

Graph databases are considered one of the four NoSQL categories. True False

False HBase is built on HDFS.

HDFS is built on HBase True False

False HFiles are immutable, so the key-value pairs cannot be updated.

HFiles are optimized so users can quickly update the values associated with keys. True False

binlog is on the master, and the relay log is on the slave

In multinode RDBMS with master/slave asynchronous replication, normally the binlog and relay log are both on the master binlog is on the master, and the relay log is on the slave binlog is on the slave, and the relay log is on the master binlog and relay log are both on the slave

permanently saving state

Once you upload your code, AWS Lambda does NOT automatically handle autoscaling capacity provisioning fault tolerance permanently saving state

Data parallelism

Select all of the solutions provided by Hadoop. Caching for loop-invariant data Caching for fixpoint evaluation Data parallelism Loop-aware task scheduling

ALL except for extract

Select all of the transformations which create an RDD. extract groupBy map join filter

ALL of them

Select all of which are properties of HDFS: Synergistic with Hadoop Throughput scales with attached HDs Massive throughput Optimized for reads, sequential writes, and appends

Is a persistent ordered immutable map from keys to values Is an on disk file format representing a map from a string to a string Is stored in HDFS

Select all that apply: An HFile: Is a persistent ordered immutable map from keys to values Is a distributed collection of objects Is an on disk file format representing a map from a string to a string Is stored in HDFS

ALL of them

Select all that apply: Spark SQL can read data from HIVE tables HDFS JSON

True

The HBase master assigns an HRegion to HRegion servers. True False

True

The Hive approach translates queries into map/reduce jobs. True False

Voice over IP (VoIP) Occasional packet loss is an acceptable tradeoff given the speedy packet delivery for VoIP

UDP is best suited for which of the following tasks? Encrypted communications Voice over IP (VoIP) Communication between computers on the same rack in a data center Reliably transmitting a medium-sized file over a congested link

In a massive multiplayer game

Under which scenario would we be better off using HTTP Streaming API? Initiating a database table In a massive multiplayer game Requesting an invoice data structure from a financial server

Yet Another Resource Negotiator

What is YARN? Your Adept Resource Negotiator Your Adept Reason Negotiator Your Applicable Resource Negotiator Yet Another Resource Negotiator

Infrastructure-as-a-Service

What is the best model of delivery for the following scenario? "A custom, lightning-fast storage solution for a gigantic amount of data" Software-as-a-Service Platform-as-a-Service Infrastructure-as-a-Service

Platform-as-a-Service

What is the best model of delivery for the following scenario? "A web hosting solution for PHP web applications" Platform-as-a-Service Infrastructure-as-a-Service Software-as-a-Service

Software-as-a-Service

What is the best model of delivery for the following scenario? "An Electronic Health Record system for clinics and doctors" Platform-as-a-Service Infrastructure-as-a-Service Software-as-a-Service

Reduce phase can't start until Map phase is completely finished

What is the bottleneck of the MapReduce programming model? Combine phase can't start until Reduce phase is completely finished Reduce phase can't start until Map phase is completely finished Combine phase can't start until Map phase is completely finished Map phase can't start until Reduce phase is completely finished
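
The barrier between the two phases follows from the data flow: a reducer needs every value for its key, and those values can come from any mapper. The following single-process word count is a minimal sketch of that flow, not Hadoop's API; all function names are hypothetical.

```python
from collections import defaultdict

def map_phase(documents):
    """Emit a (word, 1) pair for every word in every document."""
    pairs = []
    for doc in documents:
        for word in doc.split():
            pairs.append((word.lower(), 1))
    return pairs

def shuffle(pairs):
    """Group values by key. This needs the complete map output, hence the barrier."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Sum the grouped counts for each word."""
    return {key: sum(values) for key, values in groups.items()}

docs = ["the cloud", "the map and the reduce"]
counts = reduce_phase(shuffle(map_phase(docs)))
print(counts)
```

If `reduce_phase` ran before the second document had been mapped, the count for "the" would be wrong, which is why the reduce phase has to wait.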

JSON has no namespaces, while XML does

Which of the following about JSON and XML is correct? JSON and XML do not have security issues JSON does not have security issues while XML does JSON has namespaces while XML does not JSON has no namespaces, while XML does

Bandwidth

Which of the following is not an advantage of VPC? Security Bandwidth Flexibility Data control

Amazon AWS Lambda Lambda is an example of a FaaS

Which of the following is NOT considered a PaaS? Amazon AWS Lambda Amazon Elastic BeanStalk Google AppEngine Microsoft Azure App Service

XML JSON

Which of the following are main formats of data representation? XML JSON HTTP

fetch

Which of the following is not a transport method? flush read fetch write

Resend

Which of the following is not an event type of WebSocket? Message Error Open Resend

Message Queuing Systems can handle change in demand.

Which of the following is true? Producer and Consumers have to coordinate with each other in Message Queue systems. Producers and Consumers must communicate synchronously in Message Queue systems. Message Queuing Systems can handle change in demand. AWS Simple Queue Service follows the Publish Subscriber Model.

Write Through

Which of the following prevents stale data in cache? Cache Aside Write Through Write Back Both "Write through" and "Write Back"
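
A write-through cache avoids stale entries because every write updates the backing store and the cache together. The following is a minimal sketch using plain dictionaries; the class name and keys are hypothetical, and a real cache would also handle eviction and failures.

```python
class WriteThroughCache:
    def __init__(self, backing_store: dict):
        self.store = backing_store
        self.cache = {}

    def write(self, key, value):
        # Write-through: update the store and the cache in the same operation.
        self.store[key] = value
        self.cache[key] = value

    def read(self, key):
        # Fill the cache on a miss, then serve from the cache.
        if key not in self.cache:
            self.cache[key] = self.store[key]
        return self.cache[key]

db = {"user:1": "alice"}
cache = WriteThroughCache(db)
cache.read("user:1")             # warms the cache
cache.write("user:1", "bob")     # goes through the cache
fresh = cache.read("user:1")

db["user:1"] = "carol"           # a write that bypasses the cache
stale = cache.read("user:1")     # the cache still holds the old value
print(fresh, stale)
```

The last two lines show the failure mode that write-through is meant to rule out: a write that reaches only the store leaves the cached copy out of date.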

Incoming traffic from the Internet cannot access the private subnets Outgoing traffic from a private subnet cannot access the Internet

Which of the following statements about private subnets are true? Outgoing traffic from a private subnet can access the Internet Incoming traffic from the Internet cannot access the private subnets Outgoing traffic from a private subnet cannot access the Internet Incoming traffic from the Internet can access the private subnets

NAT

Which of the following technologies can help a private subnet access the Internet? NAT CIDR Internet Gateway

Apache Kafka

Which of the following would be best for the following use case? "Record user activity like page views, searches, and clicks on a website and make that data readily accessible and available to process in a streaming manner" Apache Kafka HBase Cassandra AWS Lambda

Enhance programmability Extend the MapReduce model to better support two common classes of analytic applications

Which of the following state the purpose of Apache Spark? Enhance programmability Extend the MapReduce model to better support two common classes of analytic applications Eliminate locking found in the MapReduce model Add concurrency to the MapReduce model

Apple

Which of these companies is least involved in providing IaaS? Apple Microsoft Amazon Google

D-Streams

Which of these frameworks is not built on Spark? MLlib SparkSQL GraphX D-Streams

GMail

Which of these is an example of Software as a Service (SaaS) VMWare vCloud Google AppEngine Juju GMail

Metal as a Service (MaaS)

Which of these is not considered serverless computing? Function as a Service (FaaS) Platform as a Service (PaaS) Metal as a Service (MaaS)

Data parallelism across many machines

Which one of the following features is a main factor in the philosophy of Apache Hadoop? Caching for loop-invariant data Caching for fixpoint evaluation Data parallelism across many machines Loop-aware task scheduling

Asynchronous replication

Which replication strategy would likely have the fastest commits (at the expense of possibly weaker consistency)? Asynchronous replication Semi-synchronous replication Synchronous replication

Glacier Deep Archive

Which storage technology is the best for the following scenario? "An application that archives 1,000 TB of data for two years for compliance reasons" Dropbox Glacier Deep Archive AWS S3

Instance store Instance store provides storage whose data is lost when the instance stops or terminates.

Which storage technology is the best for the following scenario? "An application that stores 200 GBs of binary data for a few minutes and doesn't need it to be persistent if the instance running it fails" Glacier Instance store AWS S3

AWS S3

Which storage technology is the best for the following scenario? "An application that stores and queries 2,000 TB of binary data for a few weeks" Redis Glacier AWS S3

AWS S3 AWS S3 is ideal for storing data for just a few weeks, analogous to using cloud computing rather than owning the machines.

Which storage technology is the best for the following scenario? "An application that works with 2 TBs of sound files for a few weeks" Instance Store Lambda AWS S3

Swift Swift is best when dealing with huge sizes of images

Which storage technology is the best for the following scenario? "An application that works with 80 GBs of images on in-house data center" Swift AWS S3 HIVE

HIVE Hive's driver component supports complicated SQL-like queries, which are a natural fit for structured data.

Which storage technology is the best for the following scenario? "An application which needs complicated queries on structured data" HIVE AWS S3 Memcached

Ceph Ceph is ideal when dealing with structured data; it has better performance and speed.

Which storage technology is the best for the following scenario? "Application needs to update structured data frequently" Swift Ceph HIVE

Local hard disk or instance store

Which storage technology is the best for the following scenario? "Application runs on a single node, needs to store 10GB of data for a few minutes" AWS S3 Local hard disk or instance store HDFS

Swift Swift is ideal when dealing with unstructured data like operating system's data and binaries

Which storage technology is the best for the following scenario? "Store an operating system and application binaries remotely" HIVE HDFS Swift

Dropbox

Which storage technology is the best for the following scenario? "Sync files on a few personal devices" AWS Glacier AWS S3 Dropbox Swift

Dropbox

Which storage technology is the best for the following scenario? "Sync files on a few personal devices" Dropbox AWS DynamoDB AWS S3 Google App Engine

JSON

Which technology can best address the following need? "Human readable representation of data" Thrift RPC JSON WebRTC

XML

Which technology can support addressing the following need? "Data representation for (un)marshalling on different machines and programming languages" WebRTC REST RMI XML

YARN

Which technology is the best suited for the following use case? "Assigning resources to a highly parallel application" YARN HDFS Spark Hadoop

Hadoop

Which technology is the best suited for the following use case? "Finding the set of words utilized in the Wikipedia website" Spark Hadoop YARN HDFS

Spark

Which technology is the best suited for the following use case? "Interactively exploring a new large dataset" YARN Hadoop Spark HDFS Amazon S3

HDFS

Which technology is the best suited for the following use case? "Storing a large set of images on thousands of computers" YARN Hadoop HDFS Spark

Spark

Which technology is the best suited for the following use case? "Training a machine learning model on a large dataset with several iterations" HADOOP Spark YARN HDFS

REST

Which technology will address the following need? "Create, Update, Read, and Remove objects over the web" JSON XML REST RMI

REST

Which technology will address the following need? "Create, Update, Read, and Remove objects over the web" JSON-RPC REST XML Websockets

XML

Which technology will address the following need? "Data representation for (un)marshalling on different machines and programming languages" MBaaS RMI XML REST

JSON

Which technology will address the following need? "Data representation using a dictionary with key and value" HTML XML TXT JSON

JSON

Which technology will address the following need? "Human readable representation of data" RPC REST JSON RMI

MBaaS

Which technology will address the following need? "Provides a way for mobile web applications to link to backend storage" MBaaS XML JSON REST

SOAP SOAP is a remote procedure call technology. CORBA could also work.

Which technology will address the following need? "Send method execution requests to a remote object" SSH SOAP XML Websockets

CORBA

Which technology will address the following need? "Send requests to a remote object" CORBA XML SSH Juju

Read replicas Read replicas for scalability; Multi-AZ deployments for high availability; Multi-Region deployments for disaster recovery and local performance.

Which would you choose for scalability (as opposed to availability or disaster recovery)? Multi-AZ deployments Read replicas Multi-Region deployments

Apache Thrift, because it is scalable and its auto-generated RPC functions are easy to use. Note: RMI and MPI are for communication in a local network.

What communication framework or technology do many Big Data systems use, and why? Apache Thrift, because it is scalable and its auto-generated RPC functions are easy to use. SOAP, because these frameworks are all enterprise systems. Remote Method Invocation (RMI), because it is the standard library of choice in Java. MPI, since MPI is extremely lightweight and therefore provides high throughput and low latency.

PUT, GET, DELETE The four verbs are PUT, POST, GET, and DELETE.

What HTTP verbs are used in RESTful APIs? PUT, POST, APPEND XML, JSON, SOAP PUT, GET, DELETE ATTACH, APPEND, PATCH
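
The four verbs line up with create, read, update, and remove operations on a resource. The following is a minimal in-memory sketch of that mapping, not a real web framework; the `handle` function, the paths, and the choice of status codes are assumptions for illustration.

```python
def handle(store: dict, verb: str, key: str, body=None):
    """Return an (HTTP status code, payload) pair for one request."""
    if verb == "POST":      # Create
        if key in store:
            return 409, None
        store[key] = body
        return 201, body
    if verb == "GET":       # Read
        return (200, store[key]) if key in store else (404, None)
    if verb == "PUT":       # Update (or replace)
        store[key] = body
        return 200, body
    if verb == "DELETE":    # Remove
        if key in store:
            del store[key]
            return 204, None
        return 404, None
    return 405, None        # anything else, e.g. APPEND, is rejected

objects = {}
created = handle(objects, "POST", "/notes/1", {"text": "draft"})
updated = handle(objects, "PUT", "/notes/1", {"text": "final"})
fetched = handle(objects, "GET", "/notes/1")
removed = handle(objects, "DELETE", "/notes/1")
rejected = handle(objects, "APPEND", "/notes/1")
print(created, updated, fetched, removed, rejected)
```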

Kubernetes

What underlying technologies are typically utilized to offer serverless compute cloud offerings? HDFS file system Kubernetes Analytics and AI packages Metal as a Service provisioning systems

A process writes a new object to Amazon S3 and immediately attempts to read it. Until the change is fully propagated, Amazon S3 might report "key does not exist." A process writes a new object to Amazon S3 and immediately lists keys within its bucket. Until the change is fully propagated, the object might not appear in the list. Amazon S3 has a weak consistency model

Which of the following statements are true regarding Amazon S3? A process writes a new object to Amazon S3 and immediately attempts to read it. Until the change is fully propagated, Amazon S3 might report "key does not exist." Object writes to S3 immediately take effect. A process writes a new object to Amazon S3 and immediately lists keys within its bucket. Until the change is fully propagated, the object might not appear in the list. Amazon S3 has a weak consistency model

UDP, TCP

Which of the following technologies are transport layer systems in the Internet protocol suite? IMAP, SSL, HTTP HTTP, WebSocket and SOAP TCP, IP UDP, TCP

Microsoft Azure CosmosDB Amazon DynamoDB

Which of the following are examples of NoSQL key/value store cloud offerings? Microsoft Azure CosmosDB IBM Object Storage Google Cloud Filestore Amazon DynamoDB

SOAP

What evolved as the successor to XML-RPC? JSON SOAP REST HTTP/2 Push

Distributed NOSQL key/value storage service

Amazon DynamoDB is an example of a Cloud-optimized SQL database Centralized Big-Data blob storage Distributed NOSQL key/value storage service Function as a Service (FaaS) dynamic container offering

replicating the data to multiple machines

Amazon S3 BLOB Storage aims to provide high availability primarily by relying on subcontractors to provide excess capacity using proprietary, expensive, high quality storage hardware that rarely fails only offering the service to users with predictable, regular workloads to limit congestion replicating the data to multiple machines

True

Amazon S3 BLOB Storage uses a weak consistency model. False True

Software as a Service (SaaS)

Applications are managed for you when using Platform as a Service (PaaS) Infrastructure as a Service (IaaS) Software as a Service (SaaS) Metal as a Service (MaaS)

True

BLOB stands for "Binary Large OBjects" True False

Longest prefix match

In a VPC, how is the optimal route for network traffic selected? Longest prefix match Shortest prefix match Any prefix match
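
Longest prefix match means that, among all routes whose CIDR block contains the destination, the most specific one wins. The following is a minimal sketch using Python's standard `ipaddress` module; the route table contents and target names are hypothetical.

```python
import ipaddress

def select_route(route_table: dict, destination: str):
    """Return the target of the matching route with the longest prefix, or None."""
    dest = ipaddress.ip_address(destination)
    matches = []
    for cidr, target in route_table.items():
        network = ipaddress.ip_network(cidr)
        if dest in network:
            matches.append((network.prefixlen, target))
    if not matches:
        return None
    # The largest prefix length is the most specific match.
    return max(matches, key=lambda match: match[0])[1]

routes = {
    "0.0.0.0/0": "internet-gateway",
    "10.0.0.0/16": "local",
    "10.0.1.0/24": "peering-connection",
}
print(select_route(routes, "10.0.1.7"))
print(select_route(routes, "10.0.2.7"))
print(select_route(routes, "8.8.8.8"))
```

The address 10.0.1.7 matches all three routes, but the /24 entry is chosen because it is the most specific.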

00001010.00001010.00000001.00011000 A /29 keeps the first 29 bits fixed, so only the last three bits vary, from 000 to 111 (10.10.1.16 through 10.10.1.23). This address changes the 29th bit, so it falls outside the block.

Given the IPv4 CIDR block 10.10.1.16/29, which of the following IPv4 addresses is out of range? 00001010.00001010.00000001.00010110 00001010.00001010.00000001.00010010 00001010.00001010.00000001.00010001 00001010.00001010.00000001.00011000
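
The answer can be verified mechanically. The following is a minimal sketch that converts each binary option to dotted-decimal form and tests membership with Python's standard `ipaddress` module; the helper name `from_binary` is hypothetical.

```python
import ipaddress

block = ipaddress.ip_network("10.10.1.16/29")

def from_binary(dotted_bits: str):
    """Convert '00001010.00001010...' into an ipaddress object."""
    octets = [str(int(octet, 2)) for octet in dotted_bits.split(".")]
    return ipaddress.ip_address(".".join(octets))

candidates = [
    "00001010.00001010.00000001.00010110",
    "00001010.00001010.00000001.00010010",
    "00001010.00001010.00000001.00010001",
    "00001010.00001010.00000001.00011000",
]

in_range = {str(from_binary(bits)): from_binary(bits) in block for bits in candidates}
print(block[0], block[-1])
print(in_range)
```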

You have to declare a route through Amazon API Gateway.

How can you utilize a lambda function as an RPC using a Websocket call? Using Amazon CloudWatch to "watch" for incoming RPC calls, and routing them to the right Lambda function. You have to declare a route through Amazon API Gateway. By deploying an instance of a webserver using an approved web-server AMI on an EC2 instance, and connecting it to your lambda functions on AWS console.

Through OAuth protocol

How does the Dropbox API handle authentication? Through OAuth protocol Dropbox API doesn't offer authentication service, and third party plugins should be used. By using persistent cookies in the HTTP session

True

In Amazon AWS Aurora, the log is the database. True False

many users may share the same physical computer and database

In the context of cloud computing, multi-tenancy means many users may share the same physical computer and database many computers may share the same rack many cloud providers may share the same data center many data centers may share the same electric plant

False

Infrastructure as a Service, by definition, provides load balancing automatically for you. True False

False This market segment does not accept access pattern limits, so cloud providers use statistics to extract average access patterns and set pricing. On average, they make a profit but may lose money on some users with high access patterns.

Internet-level Personal Filesystems (like Dropbox) have strict access pattern limits, ensuring that they make a profit on each customer. True False

Availability

It is impossible for a distributed system to have guaranteed consistency, availability, and partition tolerance simultaneously (following the CAP theorem). Which does Google Cloud Spanner sacrifice? Consistency Availability Partition Tolerance

increase the utility of clouds, because the IoT devices will generate a huge amount of data that will be processed by the cloud

It is likely that a move towards Internet of Things (IoT) in the near future will increase the utility of clouds, because the IoT devices will generate a huge amount of data that will be processed by the cloud decrease the utility of clouds, because IoT brings computational power closer to the user, while cloud computing is far from the user have no effect on clouds, since they are totally unrelated technologies

A Dataset organized into named Columns

In Apache Spark SQL, what is a DataFrame? Equivalent to a column of data in a relational database A Dataset organized into named Columns A Spark RDD automatically transformed for SQL queries

Events that are collected by the cloud backend and trigger your function

In almost all serverless function as a service offerings, how are the functions called? The functions are periodically executed, with frequency depending on the observed traffic Events that are collected by the cloud backend and trigger your function You have to deploy a pub-sub middleware such as Apache or similar engines to collect events and route them to the right function

True Logging needs to be fast, and the in-memory approach to data storage that Redis takes is well suited to keep up with intense logging demands.

Logging is a good use case for Redis. True False

Row key, column key, timestamp

Map in HBase is indexed by: Key Row key, column key, index Row key, column key Row key, column key, timestamp
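
The three-part index can be pictured as a map keyed by (row key, column key, timestamp), where each new write adds a version instead of overwriting. The following is a toy in-memory sketch of that idea, not the HBase client API; the function names and sample data are hypothetical.

```python
table = {}

def put(row: str, column: str, timestamp: int, value: str) -> None:
    """Store a new version of a cell; older versions stay in place."""
    table[(row, column, timestamp)] = value

def get_latest(row: str, column: str):
    """Return the value with the highest timestamp for this cell, or None."""
    versions = [(ts, value) for (r, c, ts), value in table.items() if r == row and c == column]
    if not versions:
        return None
    return max(versions, key=lambda version: version[0])[1]

put("user42", "info:email", 1, "old@example.com")
put("user42", "info:email", 2, "new@example.com")
latest = get_latest("user42", "info:email")
print(latest, len(table))
```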

True

Multiplexing demand in a cloud infrastructure leads to higher utilization. True False

False, there is no "official standard"

NIST developed an official standard for REST, because it recognized the importance of interoperability in cloud environments. True, although IBM later developed their own competing standard True, and the success of the standard directly led to the recent explosive growth of the cloud industry False, NIST is opposed to the goal of interoperability False, there is no "official standard"

Consumers Kafka maintains feeds of messages in categories called Topics. Processes that publish messages to a Kafka topic are Producers. Kafka is run as a cluster comprised of one or more servers each of which is called a Broker.

Processes that subscribe to topics and process the feed of published messages are: Consumers Producers Brokers Topics

True

Software as a Service, by definition, provides load balancing automatically for you. True False

CREATE ALTER INSERT UPDATE DELETE

The following are logged in a binary log CREATE ALTER INSERT UPDATE DELETE SELECT SHOW

Because load balancers must make quick decisions, empirically more complex algorithms are likely to be slower than simpler algorithms

The primary reason most commonly-used load-balancing algorithms are relatively simple is Because load balancers must make quick decisions, empirically more complex algorithms are likely to be slower than simpler algorithms Tradition. There are well-known complex load balancing algorithms that work better than simple algorithms (like round-robin) in almost all situations. Because an exhaustive theoretical analysis of the algorithms has determined that complex load balancing algorithms all have exponential runtimes To save the developer time writing and maintaining code, since load-balancing isn't very important

have three phases: opening handshake, data transfer, closing handshake

WebSockets requires the users to "poll" to receive data have three phases: opening handshake, data transfer, closing handshake have longer latency than RESTful approaches are built on top of UDP to minimize latency

Open, Close, Error, Message

What are the event types in WebSocket? onError, onMessage, onInit, onClose Open, Close, Error, Message Monitor, Connect, Disconnect, Send, Receive Open, Close, InitReceive, initSend

Isolation ACID: Atomicity Consistency Isolation Durability

What does the "I" in ACID stand for? Integrity Isolation Insistent Instance

Software that provides services to applications beyond those generally available at the operating system.

What is the function of Middleware? Software that provides services to applications beyond those generally available at the operating system. An architecture style where the state of the program is recorded in the operating system and not the user application, hence the term "Middle" in middleware. The set of Device Drivers that provide network support for the operating system. The kernel of an Operating System.

Cache aside

What policy does Memcached use? Cache aside Write through Write back Write front

Opening handshake, Data transfer, Closing Handshake

What three phases does WebSocket have? Opening handshake, Sending message, Closing Handshake Opening handshake, Data transfer, Closing Handshake Opening SSH, Sending message, Closing SSH

Multi-AZ deployments

What would you choose for high availability? Multi-AZ deployments Multi-Region Deployments Read replicas

is stored across multiple availability zones (AZs) rather than one

When compared to Amazon EBS PIOPS, Amazon EFS (Elastic File Store) is well suited for NoSQL databases is stored across multiple availability zones (AZs) rather than one generally has lower throughput generally has lower per-operation latency

PaaS

Which *aaS is best described by "The unit of compute is a full app"? MaaS IaaS FaaS PaaS

Infrastructure-as-a-Service

Which approach is feasible for the following scenario with the minimum efforts? "ACME company needs to be able to change the cloud provider frequently." Infrastructure-as-a-Service Packaged software Platform-as-a-Service Software-as-a-Service

Infrastructure-as-a-Service

Which approach is feasible for the following scenario with the minimum efforts? "ACME company needs to deploy a system with a modified OS." Platform-as-a-Service Infrastructure-as-a-Service Packaged software Software-as-a-Service

Software-as-a-Service

Which approach is feasible for the following scenario with the minimum efforts? "ACME company needs to provide a widely used application for its marketing team." Packaged software Software-as-a-Service Infrastructure-as-a-Service Platform-as-a-Service

Cloud computing

Which approach is more economical for the following scenario? "A long-running business needs 10,000 computers for one-time data processing." Hybrid approach In-house servers Cloud computing

Hybrid approach

Which approach is more economical for the following scenario? "A long-running business serves 1,000 daily but 1,000,000 during the holiday season." In-house servers Hybrid approach Cloud computing

In-house servers

Which approach is more economical for the following scenario? "An established, mature business serves 10,000 users during business hours (9am to 5pm) and 100 users outside of business hours each day." Cloud computing In-house servers Hybrid approach

Cloud computing Setting up the infrastructure to serve 1,000,000 customers is time-consuming. Cloud computing should allow the startup to start serving customers much more quickly, beating their competitors to the market.

Which approach is the most sensible for the following scenario? "A new startup needs to quickly scale their infrastructure to serve 1,000,000 customers, or risk losing market share to their competitors." Cloud computing In-house servers Hybrid approach

ALL of them

Which are the benefits of service-oriented architecture (SOA)? Scalability Interaction Reduce costs Reusable Code

Partitioning-aware

Which feature of Spark Scheduler avoids extra shuffles? Dryad-like DAG Cache-aware work reuse and locality Partitioning-aware Pipelining functions within a stage

When in a database one small write results in multiple physical data writes because of the way the storage subsystem is designed.

Which guarantee can be relaxed for the following use case? "Data can be served on a single server." A general problem in distributed SQL database engines where an UPDATE query generates intermediate temporary data writing operations. It is a cascade effect in some SQL queries, where a write in one table results in more writes in other joined tables. When in a database one small write results in multiple physical data writes because of the way the storage subsystem is designed.

HBase

Which has better support for incremental addition of small batches (e.g. record-level insertion)? HBase HDFS

Strong-read weak-write

Which is NOT one of the CosmosDB Consistency Models? Consistent Prefix Eventual Bounded-staleness Strong Session Strong-read weak-write

Redis

Which is NOT one of the building blocks of HBase? Apache ZooKeeper HFile HDFS Redis

Unstructured Data

Which is NOT one of the four NoSQL categories? BigTable Unstructured Data Key-Value Document Graph DB

Ephemeral file system

Which is NOT one of the three layers of a file system? Logical file system Virtual file system Physical file system Ephemeral file system

HBase

Which is better for fast record lookup? HBase HDFS

Bandwidth is infinite

Which is not an advantage of resilient distributed datasets? Retain the attractive properties of MapReduce Allow apps to keep working sets in memory for efficient reuse Support a wide range of applications Bandwidth is infinite

VPC network does not use ARP while Physical Ethernet Network implements ARP

Which is one of the differences between Physical Ethernet Network and VPC network? Physical Ethernet Network does not use ARP while VPC network implements ARP Both of them intercept ARP request VPC network does not use ARP while Physical Ethernet Network implements ARP

Enable communication between VPCs, whether the two VPCs belong to the same account or to different accounts.

What is the function of VPC peering? Enable communication between VPCs within the same account only. Enable communication between VPCs, whether the two VPCs belong to the same account or to different accounts. Enable communication only between VPCs in different accounts.

TCP

Which protocol does WebSocket run on top of? IP UDP TCP
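A WebSocket connection begins as an HTTP Upgrade request sent over an ordinary TCP connection. As a minimal sketch of that handshake, the snippet below computes the `Sec-WebSocket-Accept` value a server returns for a given client `Sec-WebSocket-Key`, using the fixed GUID and the sample key from RFC 6455. The function name `websocket_accept` is an illustrative choice, and no socket is actually opened here.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def websocket_accept(sec_websocket_key: str) -> str:
    """Return the Sec-WebSocket-Accept header value for a client key."""
    # Concatenate key and GUID, SHA-1 hash the result, then base64-encode it.
    digest = hashlib.sha1((sec_websocket_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


# Sample key taken from the RFC 6455 handshake example.
accept = websocket_accept("dGhlIHNhbXBsZSBub25jZQ==")
print(accept)
```

Once the server has replied with this header and a 101 status, both sides exchange WebSocket frames over the same TCP connection.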

DNS HTTP IMAP

Which of the following protocols lie in the Application Layer? DNS HTTP IMAP UDP

TCP UDP

Which of the following protocols lie in the Transport Layer? TCP HTTP UDP

ALL except APPEND

Which of the following HTTP verbs are used in REST APIs? DELETE APPEND GET POST
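To illustrate how a REST API maps HTTP verbs to operations on resources, here is a small in-memory sketch. The `handle` function, the dictionary store, and the choice of status codes are illustrative assumptions, not part of any particular framework. POST is simplified to create a resource directly at the given path.

```python
def handle(method, path, store, body=None):
    """Dispatch a request to a CRUD-style operation on an in-memory store."""
    if method == "GET":
        # Read a resource if it exists.
        return (200, store[path]) if path in store else (404, None)
    if method == "POST":
        # Create a resource (simplified: stored at the request path).
        store[path] = body
        return (201, body)
    if method == "DELETE":
        # Remove a resource if it exists.
        if path in store:
            del store[path]
            return (204, None)
        return (404, None)
    # APPEND is not an HTTP method, so it falls through to "not implemented".
    return (501, None)


store = {}
created = handle("POST", "/tweets/1", store, "hello")
fetched = handle("GET", "/tweets/1", store)
deleted = handle("DELETE", "/tweets/1", store)
unknown = handle("APPEND", "/tweets/1", store, "more")
```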

Function as a Service deployments should be stateless, and rely on an external storage service for state storage. (Think of a Lambda function.)

Which one of the following is correct? Function as a Service deployments should be stateless, and as such are severely limited in what they can accomplish. Function as a Service deployments support built-in state storage, accessible through a special state manipulation API. Function as a Service deployments should be stateless, and rely on an external storage service for state storage. Function as a Service deployments support state storage by launching an instance of a relational database in the same container, and can access it using SQL commands.
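A minimal sketch of the stateless pattern described above: the function keeps nothing between invocations and reads and writes all state through a storage object passed in. `ExternalStore` is a hypothetical in-memory stand-in for a real external storage service, and the `handler(event, store)` signature is an assumption made for illustration rather than any provider's actual API.

```python
class ExternalStore:
    """Hypothetical stand-in for an external storage service."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def put(self, key, value):
        self._data[key] = value


def handler(event, store):
    """Stateless function: every piece of state lives in the external store."""
    user = event["user"]
    visits = store.get(user, 0) + 1  # read the previous state
    store.put(user, visits)          # write the new state back
    return {"user": user, "visits": visits}


store = ExternalStore()
first = handler({"user": "alice"}, store)
second = handler({"user": "alice"}, store)
```

Because the handler holds no state of its own, any instance of the function could serve either call and the visit count would still be correct.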

Because of the relaxed consistency requirements of S3, building the distributed system to support it is much easier.

Why do object stores such as AWS S3 cost less than managed file systems such as AWS FSx for Lustre? Because S3 is not replicated and therefore less reliable, while Lustre has replication. Because of the relaxed consistency requirements of S3, building the distributed system to support it is much easier. Because storage in AWS S3 is ephemeral, while storage in Lustre is non-volatile.

IaaS

You are tasked with choosing between a PaaS and an IaaS approach. Flexibility is important in your company, and you must avoid being locked in by a vendor. Which should you choose? PaaS IaaS

Map(key = line, value = contents): for each word in value: emit intermediate (word, 1)

You want to build a word count program. Which of the following pseudo-code is the proper Map function for this program? Note that the indenting is not accurate. "Word Count Program: You have a huge text file that consists of many lines. The goal is to count the number of times each distinct word appears in the file." Map(key = line, value = contents): result = 0; for each word in value: result += value; emit(key, result) Map(key = line, values = uniq_counts): Sum all 1's in values list Emit result (word, sum) Map(key = line, value = contents): for each word in value: emit intermediate (word, 1) Map(key, values): for each value in intermediate values: value += 1; emit intermediate(key, values)

Reduce(key = word, values = uniq_counts): Sum all 1's in values list Emit result (word, sum)

You want to build a word count program. Which of the following pseudo-code is the proper Reduce function for this program? Note that the indenting is not accurate. "Word Count Program: You have a huge text file that consists of many lines. The goal is to count the number of times each distinct word appears in the file." Reduce(key = line, value = contents): result = 0; for each word in value: result += value; emit(key, result) Reduce(key = word, values = uniq_counts): Sum all 1's in values list Emit result (word, sum) Reduce(key, values): for each value in intermediate values: value += 1; emit intermediate(key, values) Reduce(key = line, value = contents): for each word in value: emit intermediate (word, 1)
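The Map and Reduce answers above can be sketched as runnable Python. This is a single-process illustration only: the `run_word_count` driver, which stands in for the framework's shuffle step, and the function names `map_fn` and `reduce_fn` are assumptions made for the example.

```python
from collections import defaultdict


def map_fn(key, value):
    """Map: key is a line identifier, value is the line's contents."""
    for word in value.split():
        yield (word, 1)  # emit intermediate (word, 1)


def reduce_fn(key, values):
    """Reduce: key is a word, values is the list of 1s emitted for it."""
    return (key, sum(values))  # emit result (word, sum)


def run_word_count(lines):
    # Shuffle step: group every intermediate value under its key.
    groups = defaultdict(list)
    for line_no, text in enumerate(lines):
        for word, one in map_fn(line_no, text):
            groups[word].append(one)
    return dict(reduce_fn(word, ones) for word, ones in groups.items())


counts = run_word_count(["the cat sat", "the cat ran"])
```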

Map(key = x,y, value = R,G,B) emit intermediate(key, value)

You want to build an image smoother program. Which of the following is the proper Map function for this program? Note that the indenting is not accurate. "Image Smoother Program: To smooth an image, use a sliding mask and replace the value of each pixel." Map(key = x,y value = list of R,G,B) compute average of R,G,B emit intermediate(key, average R,G,B) Map(key = x,y, value = R,G,B) emit intermediate(key, value)

Map(key = x,y, value = R,G,B) { emit intermediate(key, value) }

You want to build an image smoother program. Which of the following is the proper Map function for this program? Note that the indenting is not accurate. "Image Smoother Program: To smooth an image, use a sliding mask and replace the value of each pixel." Map(key = x,y value = list of R,G,B) { compute average of R,G,B emit intermediate(key, average R,G,B) } Map(key = x,y, value = R,G,B) { emit intermediate(key, value) }

Reduce(key = x,y value = list of R,G,B) compute average of R,G,B emit (key, average R,G,B)

You want to build an image smoother program. Which of the following is the proper Reduce function for this program? Note that the indenting is not accurate. "Image Smoother Program: To smooth an image, use a sliding mask and replace the value of each pixel." Reduce(key = x,y, value = R,G,B) emit (key, value) Reduce(key = x,y value = list of R,G,B) compute average of R,G,B emit (key, average R,G,B)

Reduce(key = x,y ; value = list of R,G,B) { compute average of R,G,B emit (key, average R,G,B) }

You want to build an image smoother program. Which of the following pseudocode is the proper Reduce function for this program? Note that the indenting is not accurate. "Image Smoother Program: To smooth an image, use a sliding mask and replace the value of each pixel." Reduce(key = x,y ; value = R,G,B) { emit (key, value) } Reduce(key = x,y ; value = list of R,G,B) { compute average of R,G,B emit (key, average R,G,B) }
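A rough single-process sketch of the image smoother in Python. The answers above do not say how the reducer for a pixel comes to receive a list of R,G,B values. This sketch assumes a 3x3 mask and has the map step emit each pixel under the key of every in-bounds position whose mask covers it. That fan-out, the `smooth` driver, and the function names are assumptions for illustration.

```python
from collections import defaultdict


def map_fn(key, value, width, height):
    """Map: key is (x, y), value is (R, G, B)."""
    x, y = key
    # Assumed 3x3 mask: emit the pixel to itself and its in-bounds neighbours.
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            nx, ny = x + dx, y + dy
            if 0 <= nx < width and 0 <= ny < height:
                yield (nx, ny), value


def reduce_fn(key, values):
    """Reduce: key is (x, y), values is a list of (R, G, B) tuples."""
    n = len(values)
    average = tuple(sum(pixel[c] for pixel in values) / n for c in range(3))
    return key, average


def smooth(image):
    # image is a list of rows, each row a list of (R, G, B) tuples.
    height, width = len(image), len(image[0])
    groups = defaultdict(list)
    for y, row in enumerate(image):
        for x, rgb in enumerate(row):
            for k, v in map_fn((x, y), rgb, width, height):
                groups[k].append(v)
    return dict(reduce_fn(k, vs) for k, vs in groups.items())


smoothed = smooth([[(0, 0, 0), (10, 20, 30)]])
```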

Microsoft Azure App Service. Since the application is already written, the best fit is the PaaS model, where the unit of compute is a whole app.

You have a whole application already written that you want to deploy to the cloud without much re-architecting. Which of the cloud models would best fit this scenario? Amazon EC2 IBM Cloudant Microsoft Azure App Service Oracle Functions

