Ch 1 Distributed systems

Three Scalability Dimensions

- Size: more users or resources can be added without a noticeable loss of performance
- Geographic: users and resources may lie far apart
- Administrative: the system spans many independent organizations

How can you bring down the utilization, and thus the response-to-service time ratio, if your system is grinding to a halt?

- decrease the arrival rate of requests or increase the processing capacity of the service; either one reduces the utilization of the service

What does middleware for distributed systems offer?

- it is a manager of resources that extends over multiple machines, offering each application the same interface
- it is a software layer placed between the operating systems and the distributed applications
- it also offers facilities for interapplication communication, security services, accounting services, and masking of and recovery from failures
- the main difference from an operating system is that middleware services are offered in a networked environment

Cluster computing

- a collection of similar PCs closely connected by a high-speed local area network
- used for parallel programming, in which one program runs on multiple machines

Grid computing

- a federation of computers where the systems fall under different administrative domains with different hardware, software, or network topology
- generally the computers are largely the same, but there are hybrid architectures

Open distributed system

- a system that offers components that can easily be used by or integrated into other systems
- its services are often defined through interfaces written in an Interface Definition Language (IDL)
- it should be interoperable, portable, and extensible

Uniform Resource Locator (URL)

- an address used for locating a document on the Web
- it gives no clue as to the physical location of the website's main web server

What are some things that make coordination between nodes challenging?

- computing elements are autonomous and need to coordinate
- there is no global clock
- authenticating a node to establish group membership can create scalability bottlenecks

Cloud computing

- an easily usable and accessible pool of virtualized resources that can be configured dynamically
- the infrastructure needed for a service is constructed dynamically from that pool

How would you hide communication latencies?

- this is a technique for achieving geographic scalability
- you can avoid waiting for responses to remote service requests by using asynchronous communication: when a reply comes in, the application is interrupted and a special handler is called to complete the request
- for interactive applications it is better to move part of the computation to the client, such as checking a form for errors on the client side before sending it to the server
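
The asynchronous approach can be sketched in Python with `asyncio`. This is a minimal illustration, not a real client: `remote_request` is a hypothetical stand-in whose sleep models network latency, and the server names are made up. The point is that three requests issued together take roughly one round trip instead of three.

```python
import asyncio
import time


async def remote_request(name: str, latency: float) -> str:
    """Hypothetical remote call; the sleep stands in for network latency."""
    await asyncio.sleep(latency)
    return f"reply from {name}"


async def fetch_all(latency: float) -> list:
    # Issue all requests at once instead of blocking on each reply in turn.
    calls = [remote_request(f"server-{i}", latency) for i in range(3)]
    return await asyncio.gather(*calls)


def timed_run(latency: float = 0.2):
    """Run the three requests and return (replies, elapsed seconds)."""
    start = time.perf_counter()
    replies = asyncio.run(fetch_all(latency))
    return replies, time.perf_counter() - start


if __name__ == "__main__":
    replies, elapsed = timed_run()
    print(replies, f"{elapsed:.2f}s")
```

With synchronous calls the same work would take about three times the latency, since the client would block on every reply.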

To hide replication it is necessary that all replicas _____________

- have the same name
- so the system also needs to support location transparency; otherwise it is impossible to refer to replicas in different locations

Response time

- how long it takes for the service to process a request, including the time spent in the queue
- the average number of requests in the system divided by the average throughput
- if the utilization is very small, the response-to-service time ratio is close to 1, so a request is processed almost immediately
- as the utilization approaches 1, the response-to-service time ratio increases to very high values, meaning the system is grinding to a halt
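
A small sketch of how the ratio behaves, assuming the same single-server queueing model as the pₙ formula elsewhere in this set. Under that assumption the response time R and service time S satisfy R/S = 1/(1 - U). The function name is illustrative only.

```python
def response_to_service_ratio(utilization: float) -> float:
    """Return R/S = 1 / (1 - U) for a single-server queue with 0 <= U < 1."""
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    return 1.0 / (1.0 - utilization)


if __name__ == "__main__":
    # The ratio stays near 1 for low utilization and blows up as U nears 1.
    for u in (0.1, 0.5, 0.9, 0.99):
        print(f"U={u:.2f}  R/S={response_to_service_ratio(u):.1f}")
```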

Extensible

- it should be easy to add or replace parts of a system without affecting those components that stay in place

4 Design goals of Distributed Systems

- make resources easily accessible and shareable
- hide the fact that resources are distributed
- be open
- be scalable

Why is it difficult to geographically scale distributed systems?

- many are based on synchronous communication, where the client blocks until a reply is received from the server
- in wide area networks, where communication can be slow, the client spends its time waiting for responses and cannot do anything else

What challenges arise when trying to present a distributed environment as a single coherent system with distribution transparency?

- partial failures are inevitable, and if the user cannot tell which node or which process has failed, problems are hard to debug
- distribution transparency has a performance price: for example, an application that repeatedly tries to contact a server before giving up and trying another one masks the failure but slows down the system
- there is also a trade-off with geographic scalability, since hiding latencies and bandwidth restrictions is difficult
- in some situations hiding distribution is not useful, as with location-based services on mobile phones, where you want to find the nearest store

What are some of the least and most problematic scalability problems?

- size scalability is the least problematic, since you can increase the capacity of a machine or add more machines
- geographic scalability is tougher, since network latencies are naturally bounded from below and you are forced to copy data close to the client, which leads to consistency issues
- administrative scalability is the most challenging, since it deals with politics across organizations

Three root causes of bottlenecks in size scalability

- computational capacity, limited by the CPUs
- storage capacity, including the I/O transfer rate
- the network between the user and the centralized service

Interoperability

- the extent to which two implementations of components from different developers can work together by relying on a common standard

Portability

- the extent to which an application developed for a distributed system A can be executed, without modification, on a different distributed system B that implements the same interfaces

Pervasive systems

- systems built from mobile and embedded computing devices that blur the line between users and system components
- many sensors pick up the user's behavior and many actuators steer that behavior
- they call for specific solutions to keep the system transparent and unobtrusive

False assumptions when designing a distributed system

- the network is reliable
- the network is secure
- the network is homogeneous
- the topology does not change
- latency is zero
- bandwidth is infinite
- transport cost is zero
- there is one administrator

Partitioning and distribution

- a scaling technique
- splits a component into smaller parts and spreads those parts across the system
- an example is DNS: each path name is the name of a host on the Internet, and the name space is divided into zones handled by different name servers, so no single server has to deal with all name-resolution requests
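
The DNS example can be sketched as a toy resolver. Everything here is hypothetical: the server names, the zone layout, and the address (taken from a documentation-only range). Each server knows only its own part of the name space and delegates the rest, so resolving one name touches several servers but none of them holds all the data.

```python
# Hypothetical name servers: each one handles only its own zone.
SERVERS = {
    "root": {"delegate": {"org": "org-ns"}, "hosts": {}},
    "org-ns": {"delegate": {"example": "example-ns"}, "hosts": {}},
    "example-ns": {"delegate": {}, "hosts": {"www.example.org": "192.0.2.10"}},
}


def resolve(name: str):
    """Resolve a name step by step; return (address, servers contacted)."""
    labels = name.split(".")[::-1]  # "www.example.org" -> ["org", "example", "www"]
    server, contacted = "root", []
    for label in labels:
        contacted.append(server)
        zone = SERVERS[server]
        if name in zone["hosts"]:
            return zone["hosts"][name], contacted
        server = zone["delegate"][label]  # raises KeyError for unknown names
    raise KeyError(name)


if __name__ == "__main__":
    print(resolve("www.example.org"))
```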

Overlay network

- used to organize a collection of nodes
- in this case a node is a software process with a list of other processes it can send messages to
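
A minimal sketch of that idea, assuming a made-up overlay of five processes. Each node only knows its direct neighbors; a message reaches other nodes by being forwarded along those neighbor lists.

```python
from collections import deque

# Hypothetical overlay: each node lists the peers it can message directly.
OVERLAY = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A"],
    "D": ["B"],
    "E": [],  # knows no peers, so it is cut off from the rest
}


def reachable(overlay, start):
    """Return the set of nodes a message from `start` can reach by forwarding."""
    seen, pending = {start}, deque([start])
    while pending:
        node = pending.popleft()
        for peer in overlay[node]:
            if peer not in seen:
                seen.add(peer)
                pending.append(peer)
    return seen


if __name__ == "__main__":
    print(sorted(reachable(OVERLAY, "A")))
```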

What are the two ways to organize the collection of nodes for identifying which nodes can communicate with one another?

1. Overlay network, where the node has a list of other processes it can send messages to
2. A node may need to first look up a neighbor

3 Types of pervasive systems

1. ubiquitous computing systems
2. mobile systems
3. sensor networks

ACID Properties

Atomic - indivisible: either all operations occur or none occur
Consistent - the system is in a consistent state when the transaction begins and when it ends
Isolated - concurrent transactions do not interfere with one another
Durable - changes are permanent once committed
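
Atomicity can be illustrated with Python's built-in `sqlite3` module. This is a sketch under stated assumptions: the `accounts` table, the account names, and the overdraft rule are invented for the example. The connection's context manager commits if the block succeeds and rolls back if it raises, so a failed transfer leaves no partial update behind.

```python
import sqlite3


def make_db():
    """Create a hypothetical in-memory accounts table with two rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as a single transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src))
        conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst))
        (left,) = conn.execute("SELECT balance FROM accounts WHERE name = ?", (src,)).fetchone()
        if left < 0:
            raise ValueError("insufficient funds")  # undoes both updates


def balances(conn):
    """Return all balances as a {name: balance} dictionary."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

A transfer of 30 from alice to bob moves the money; a transfer of 500 raises and leaves both balances exactly as they were.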

What is a common communication service offered by middleware?

Remote Procedure Calls (RPC), which allow an application to invoke a function that is implemented and executed on a remote computer as if it were locally available
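
The mechanism can be sketched without a real network. In this illustration, which is not any particular RPC library, a client stub marshals the call into bytes, a transport function delivers them, and a server skeleton unmarshals the request and runs the local procedure. The names `ClientStub`, `server_skeleton`, and `PROCEDURES`, and the JSON message format, are all assumptions made for the example.

```python
import json


# --- server side ---
def add(a, b):
    return a + b


PROCEDURES = {"add": add}  # procedures the hypothetical server exports


def server_skeleton(request: bytes) -> bytes:
    """Unmarshal the request, call the local procedure, marshal the result."""
    call = json.loads(request.decode())
    result = PROCEDURES[call["proc"]](*call["args"])
    return json.dumps({"result": result}).encode()


# --- client side ---
class ClientStub:
    """Makes remote procedures look like ordinary local methods."""

    def __init__(self, transport):
        self._transport = transport  # callable: request bytes -> reply bytes

    def __getattr__(self, proc):
        def call(*args):
            request = json.dumps({"proc": proc, "args": args}).encode()
            reply = self._transport(request)  # a real system would use the network
            return json.loads(reply.decode())["result"]
        return call


if __name__ == "__main__":
    stub = ClientStub(server_skeleton)
    print(stub.add(2, 3))
```

The caller writes `stub.add(2, 3)` exactly as it would for a local function; only the stub knows that the work happens elsewhere.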

Caching and replication can lead to ____________ problems

consistency - keeping the copies consistent often requires global synchronization mechanisms

What is a distributed system?

it is a collection of autonomous computing elements (either hardware devices or software processes) that appear to the user as a single coherent system

Mobile phone users can continue a conversation while they move. This is an example of ________________________

migration transparency

In distributed systems, transactions are often constructed as a ____________

nested transaction: a top-level transaction forks off a number of subtransactions (children) that run in parallel

The fraction of time pₙ that there are n requests in the system

pₙ = (1 - λ/µ)(λ/µ)ⁿ
λ = arrival rate (requests per second)
µ = processing capacity (requests per second)
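
A direct transcription of the formula into Python, as a sketch. The function and parameter names are illustrative, and the formula only makes sense when λ is below µ, which the code checks.

```python
def p_n(n: int, arrival_rate: float, capacity: float) -> float:
    """Fraction of time with exactly n requests in the system (needs λ < µ)."""
    if arrival_rate >= capacity:
        raise ValueError("arrival rate must be below processing capacity")
    rho = arrival_rate / capacity  # λ/µ
    return (1 - rho) * rho ** n


if __name__ == "__main__":
    # Example values chosen for illustration: λ = 5, µ = 10 requests per second.
    for n in range(4):
        print(n, p_n(n, 5, 10))
```

Summing pₙ over all n gives 1, as expected for fractions of time.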

Remote method invocations RMI

- similar to RPC, except that it operates on objects instead of functions
- the disadvantage of both is tight coupling: the caller and the callee both need to be up and running at the time of communication, and they need to know how to refer to each other
- an alternative is to have a messaging system carry the requests (message-oriented middleware, such as publish-subscribe systems)
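
The looser coupling of the messaging alternative can be sketched with an in-process broker. `Broker` is a hypothetical stand-in for message-oriented middleware, not a real product's API: publishers address a topic rather than a receiver, and messages wait in a mailbox until the subscriber gets around to collecting them.

```python
from collections import defaultdict, deque


class Broker:
    """Hypothetical in-process stand-in for publish-subscribe middleware."""

    def __init__(self):
        # topic -> {subscriber name: queue of undelivered messages}
        self._mailboxes = defaultdict(dict)

    def subscribe(self, topic, subscriber):
        self._mailboxes[topic].setdefault(subscriber, deque())

    def publish(self, topic, message):
        # The publisher neither names the receivers nor waits for them.
        for mailbox in self._mailboxes[topic].values():
            mailbox.append(message)

    def poll(self, topic, subscriber):
        """Return and clear the messages waiting for this subscriber."""
        mailbox = self._mailboxes[topic][subscriber]
        messages = list(mailbox)
        mailbox.clear()
        return messages


if __name__ == "__main__":
    broker = Broker()
    broker.subscribe("orders", "billing")
    broker.publish("orders", {"id": 1})
    print(broker.poll("orders", "billing"))
```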

Utilization of a service

the fraction of time that it is busy
U = ∑ pₙ (summed over n > 0) = 1 - p₀
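
Worked out with the pₙ formula from this set (λ the arrival rate, µ the processing capacity, λ < µ), the utilization reduces to the ratio of the two rates. The derivation below is a sketch that assumes that formula holds:

```latex
\begin{align*}
p_n &= \left(1 - \frac{\lambda}{\mu}\right)\left(\frac{\lambda}{\mu}\right)^{n}
  \quad\Rightarrow\quad p_0 = 1 - \frac{\lambda}{\mu} \\
U &= \sum_{n > 0} p_n = 1 - p_0
  = 1 - \left(1 - \frac{\lambda}{\mu}\right) = \frac{\lambda}{\mu}
\end{align*}
```

So halving the arrival rate or doubling the processing capacity halves the utilization.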

distribution transparency

the internal details of the distribution are hidden from the user, this includes where the data is stored, on which computer a process is executing, or how the data is replicated.

