Parallel Programming


Define the program order (PO) of the JMM.

(order in which statements are executed) Total order of intra-thread actions - not a total order across threads! It does not provide an ordering guarantee for memory accesses; it just provides the link between possible executions and the original program, e. g. the evaluation of an if-condition comes before the statements in the if-branch.

Define the synchronizes-with order (SW) of the JMM.

(order of observed synchronizing memory actions across threads) SW pairs the specific actions which "see" each other, e. g. a volatile write to x synchronizes with a subsequent read of x (subsequent in SO).

Define the synchronization order (SO) of the JMM.

(global order of the synchronizing memory actions of all threads) Synchronization actions (SA) are:
- read/write of a volatile variable
- (un)lock of a monitor
- first/last action of a thread
- actions which start a thread
- actions that determine whether a thread has terminated
The synchronization actions form the synchronization order. It is a total order and all threads see the SAs in the same order. The SAs within a thread are in PO, and the SO is consistent (all reads in SO see the last writes in SO).

Discuss pros and cons of the Filter lock.

+ satisfies mutual exclusion
+ deadlock free
+ starvation free
- unfair
- O(n) memory
- O(n) time to acquire the lock

Usually, the speedup Sᵖ < p. Why?

- Some parts of the program might be sequential.
- Overheads introduced by parallelization
- Architectural limitations, e. g. memory contention

Name 3 approaches on how to manage the state (data) in parallel programs.

- Immutability: data does not change; best option.
- Isolated mutability (thread-local): data can change, but only one thread/task can access it.
- Mutable/shared data: data can change and all tasks/threads can access it.

Assume we want to solve a problem that can be divided into subproblems, e. g. sum up the elements of an array. What is a common pattern to solve this kind of problem?

1) Creation: we create a new thread for each subproblem.
2) Waiting: the main thread implements a barrier and waits for all threads to reach this barrier.
3) Accumulate the results.
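A minimal Java sketch of this pattern, summing an array (the class name ArraySumDemo, the thread count and the chunking are illustrative choices, not prescribed by the source):

// Sketch: sum an array with one thread per chunk, then join and accumulate.
class ArraySumDemo {
    public static void main(String[] args) throws InterruptedException {
        int[] data = new int[1000];
        java.util.Arrays.fill(data, 1);

        int nThreads = 4;
        long[] partial = new long[nThreads];      // one result slot per thread, no sharing
        Thread[] threads = new Thread[nThreads];
        int chunk = data.length / nThreads;

        for (int t = 0; t < nThreads; t++) {
            final int id = t;
            final int from = id * chunk;
            final int to = (id == nThreads - 1) ? data.length : from + chunk;
            threads[t] = new Thread(() -> {       // 1) Creation
                long sum = 0;
                for (int i = from; i < to; i++) sum += data[i];
                partial[id] = sum;
            });
            threads[t].start();
        }

        for (Thread t : threads) t.join();        // 2) Waiting (acts as a barrier)

        long total = 0;                           // 3) Accumulate results
        for (long s : partial) total += s;
        System.out.println(total);                // prints 1000
    }
}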

What happens if an exception triggers in the middle of a synchronized block?

1) The lock is released. 2) The exception propagates to the caller (unless it is caught). The rest of the code in the synchronized block is not executed. All changes made before the exception take effect; they are not reverted.

Name and describe four different shared memory architectures. What is the difference to distributed memory?

1) Simultaneous Multithreading (SMT, Hyperthreading): a single core with multiple virtual cores on it - there is only one core but the OS thinks there are two.
2) Multicores: a single chip with multiple cores that might share part of the cache hierarchy.
3) Symmetric Multiprocessing (SMP): multiple chips (CPUs) on the same system. The CPUs share memory and their caches coordinate (cache coherence protocol).
4) Non-Uniform Memory Access (NUMA): each CPU has its own memory that it can access fast. There is a shared memory interface so that it can also access remote memory, but that takes longer.
In distributed memory architectures, each CPU can only access its own cluster of memory. The CPUs communicate via MPI (Message Passing Interface).

What points does one have to remember when using Divide-And-Conquer resp. Fork/Join?

1) Use a sequential cut-off for the number of threads (500-1000): it does not make sense to create more threads than there are OS threads available.
2) Do not create two recursive threads; create one (start), do the other half of the work in the current thread (run) and wait for the other thread to finish (join).
3) Make sure each thread has enough to do: implement a sequential threshold (∼100-5000 basic operations).
(See the sketch below.)
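A minimal sketch of these rules using Java's Fork/Join framework (the class name SumTask and the cutoff value are illustrative assumptions):

import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Sketch: array sum with a sequential cutoff and "fork one, compute the other yourself".
class SumTask extends RecursiveTask<Long> {
    static final int CUTOFF = 1000;               // sequential threshold
    final int[] a; final int lo, hi;
    SumTask(int[] a, int lo, int hi) { this.a = a; this.lo = lo; this.hi = hi; }

    protected Long compute() {
        if (hi - lo <= CUTOFF) {                  // small enough: solve sequentially
            long sum = 0;
            for (int i = lo; i < hi; i++) sum += a[i];
            return sum;
        }
        int mid = (lo + hi) / 2;
        SumTask left = new SumTask(a, lo, mid);
        SumTask right = new SumTask(a, mid, hi);
        left.fork();                              // start one half asynchronously
        long r = right.compute();                 // do the other half in the current thread
        long l = left.join();                     // wait for the forked half
        return l + r;
    }

    public static void main(String[] args) {
        int[] a = new int[1_000_000];
        java.util.Arrays.fill(a, 1);
        long sum = ForkJoinPool.commonPool().invoke(new SumTask(a, 0, a.length));
        System.out.println(sum);                  // 1000000
    }
}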

What are the two steps in parallelizing a program?

1) Work partitioning (task/thread decomposition): the work is split up into parallel tasks.
2) Scheduling: assign the tasks to the processors with the goal of full utilization.

Why can CPU architects no longer increase the sequential CPU performance?

1) Power (dissipation) wall: the more transistors per area, the harder it became to cool the CPU.
2) Memory wall: CPUs are now much faster than memory access.
3) ILP wall: it has become increasingly hard for compiler and hardware to extract more instruction-level parallelism from sequential programs.
=> multicore processors

What priority can a Java thread have and what is the priority used for?

A Java thread can have a priority between 1 and 10, default is 5. The JVM (usually) uses the priority of threads to schedule them.

What is a reduction?

A computation that produces a single answer from a collection of data via an associative operator. Examples are max, sum, product, ... Reductions can be solved via Fork/Join.

Define the term liveness property.

A liveness property informally says that "eventually, something good will happen".

Define the term "lock" and name its properties.

A lock is a shared object that satisfies the following interface

public interface Lock {
    public void lock();    // entering CS
    public void unlock();  // leaving CS
}

providing the following semantics:
- new Lock: make a new lock, initially "not held"
- acquire (lock): blocks (only) if this lock is already currently "held". Once it is "not held", makes the lock "held" (atomically!)
- release (unlock): makes the lock "not held". If ≥ 1 threads are blocked on it, exactly one will acquire it.
A lock thus implements a mutual exclusion algorithm. An implementation of a lock needs special hardware and OS support. A lock may be reentrant (recursive).
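A usage sketch with the real java.util.concurrent.locks API (the Counter class is illustrative); the try/finally pattern guarantees the lock is released even if the critical section throws:

import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

class Counter {
    private final Lock lock = new ReentrantLock();   // reentrant lock, initially "not held"
    private int count = 0;

    void increment() {
        lock.lock();               // acquire: blocks while the lock is held by another thread
        try {
            count++;               // critical section
        } finally {
            lock.unlock();         // release: always executed, even if an exception is thrown
        }
    }
}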

What does a map do and how can it be used in parallel programming?

A map operates on each element of a collection independently to create a new collection of the same size. For arrays, some hardware has direct support for this. Example: vector addition Many parallel algorithms can be written in terms of maps and reductions - the two most important and common patterns.
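A minimal sketch of a map (vector addition); the use of a parallel IntStream here is one possible realization, not the only one:

import java.util.stream.IntStream;

// Sketch: vector addition as a map - each output element depends only on one index.
class VectorAdd {
    static int[] add(int[] x, int[] y) {
        int[] z = new int[x.length];
        // every index is processed independently, so the loop may run in parallel
        IntStream.range(0, x.length).parallel().forEach(i -> z[i] = x[i] + y[i]);
        return z;
    }

    public static void main(String[] args) {
        int[] z = add(new int[]{1, 2, 3}, new int[]{4, 5, 6});
        System.out.println(java.util.Arrays.toString(z));   // [5, 7, 9]
    }
}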

What is a memory model?

A memory model provides (often minimal) guarantees for visibility of memory operations and thus restricts the possible outcomes of a program. It is a contract between programmer, compiler, and architecture about semantics.

What is a critical section?

A piece of code with the following conditions:
1. Mutual exclusion: statements from critical sections of two or more processes must not be interleaved.
2. Freedom from deadlock: if some processes are trying to enter a critical section, then one of them must eventually succeed.
3. Freedom from starvation: if any process tries to enter its critical section, then that process must eventually succeed.
(3 implies 2)

Define the term "race condition".

A race condition occurs when the computation result depends on the scheduling (how threads are interleaved). Examples of race-condition bugs: - data race - bad interleavings

Define the term safety property and how they are achieved.

A safety property informally says that "nothing bad ever happens" and is achieved by means of synchronization.

Define a thread.

A thread is an independent sequence of execution.

Define starvation.

A thread is constantly denied access to a resource and is thus unable to make progress.

Programming-in-the-small: What two parts does a program consist of?

Algorithms and Data Structures (covered last semester)

What does the Java keyword "synchronized" ensure and what are its effects on performance?

All objects in Java are associated with a monitor (implemented by an internal lock). "synchronized" locks the object (acquires the monitor) such that no other thread is able to lock the object. It ensures mutual exclusion (and atomicity) with regard to locks on the same object - other threads are still able to access the object! Code without "synchronized" is faster.
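A minimal sketch (the SyncCounter class is illustrative): because both threads synchronize on the same object, their increments cannot interleave.

// Sketch: both threads lock the same object, so the read-modify-write cannot interleave.
class SyncCounter {
    private int count = 0;

    synchronized void increment() {   // acquires this object's monitor
        count++;
    }

    synchronized int get() { return count; }

    public static void main(String[] args) throws InterruptedException {
        SyncCounter c = new SyncCounter();
        Runnable work = () -> { for (int i = 0; i < 100_000; i++) c.increment(); };
        Thread t1 = new Thread(work), t2 = new Thread(work);
        t1.start(); t2.start();
        t1.join(); t2.join();
        System.out.println(c.get());  // always 200000
    }
}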

Name the differences between Amdahl's and Gustafson's Law.

Amdahl's law assumes a constant problem size, while Gustafson's law assumes a constant runtime with a variable problem size. Amdahl further assumes that the sequential work grows proportionally with the total work; Gustafson assumes that the sequential work stays constant.

Define Mutual Exclusion.

An algorithm to implement a critical section:

acquire_mutex();
...              // critical section
release_mutex();

One thread/process is never inside its critical section at the same time as another concurrent thread/process is inside its own critical section.

What does atomicity mean?

An atomic action is one that effectively happens all at once, i. e. it appears indivisible. An atomic action cannot stop in the middle: it either happens completely, or it doesn't happen at all. No side effects of an atomic action are visible until the action is complete.

What is an atomic register?

An atomic register r is a basic memory object (for values of primitive type), shared or not, with the operations
- r.read()
- r.write(v)
and the following properties:
- An invocation J of r.read or r.write takes effect at a single point t(J) in time.
- t(J) always lies between the start and the end of the operation J.
- Two operations J and K on the same register always have different effect times, t(J) ≠ t(K).
- An invocation J of r.read returns the value v written by the invocation K of r.write(v) with the closest preceding effect time t(K).
=> We can treat operations on atomic registers as events taking place at a single point in time (sequentially consistent; real-time order is respected for non-overlapping operations).

What does the keyword "volatile" mean, what are its effects on performance and why should you avoid it?

Reads and writes of a volatile field are atomic (even for long and double) and visible to all other threads; compound updates such as x++ are still not atomic. Volatile fields are slower than regular fields but faster than locks. The keyword "volatile" should only be used by experts.
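A minimal sketch of the visibility guarantee (the class and field names are illustrative):

// Sketch: without "volatile" the worker might never observe the update to running.
class VolatileFlag {
    private static volatile boolean running = true;

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            while (running) { /* spin until the write below becomes visible */ }
            System.out.println("stopped");
        });
        worker.start();
        Thread.sleep(100);
        running = false;     // volatile write: guaranteed to become visible to the worker
        worker.join();
    }
}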

Is it better to use linked lists or trees when planning on using parallelism?

Balanced trees are generally better than lists because we can get to all the data exponentially faster: O(log n) vs. O(n)

What does CAS stand for? What does it do and what is it used for? What are its characteristics?

CAS = Compare-And-Swap / Compare-And-Set

CAS(location, expected, new) {
    r = valueAtLocation;
    if (r == expected) {
        valueAtLocation = new;
        return true;
    }
    return false;
}
(variants may also return the old value)

CAS is used to implement non-blocking operations, e. g. spin locking:
while (CAS(lock, 0, threadID) == false) { }
CAS is an operation at the assembly level. It is atomic, expensive and universal. It is possible to implement CAS wait-free in hardware.
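The same spin-locking idea with Java's real CAS primitive, AtomicInteger.compareAndSet (the class name CasSpinLock and the convention "0 = free" are illustrative):

import java.util.concurrent.atomic.AtomicInteger;

// Sketch: a CAS-based spinlock; 0 means "free", otherwise the value is the owner's id.
class CasSpinLock {
    private final AtomicInteger owner = new AtomicInteger(0);

    void lock(int threadId) {
        // keep trying to swap 0 -> threadId; the CAS fails while another thread holds the lock
        while (!owner.compareAndSet(0, threadId)) { /* spin */ }
    }

    void unlock() {
        owner.set(0);
    }
}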

Explain the term "cache" and name its characteristics and effects on parallel programming.

Caches are faster to access than memory, organized in multilevel hierarchies (see Important Figures) and smaller than memory (=> jumping around in a data structure can be expensive as only part of the structure can be loaded into the cache). If a processor processes data, the data is copied to its cache. This can lead to inconsistency and errors when several processors operate on the same data (solved by MESI protocol).

Describe the nested lockout problem and possible solutions.

Calling a blocking method (e. g. wait) within a synchronized method can lead to a deadlock. Solutions: No blocking calls in synchronized methods or provide a non-synchronized method of the blocking object.

Given two threads and a single core CPU. Why is it better to finish Thread1 before working on Thread2?

Changing thread comes at a cost: There is a thread scheduling overhead. The direct cost of a context switch consists of saving the local context (CPU registers) of Thread1, a call to the OS schedule and loading the local context of Thread2 to the CPU.

What is memory reordering, why does it exist and what problems can it cause?

Compiler and hardware are allowed to make changes in the order of statements that do not affect the semantics of a sequentially executed program. This provides a huge potential for optimization in performance like dead code elimination, register hoisting, locality optimizations etc. This can cause errors or false outputs in parallel programs.

Define a deadlock.

Cyclic total blockade, meaning multiple threads are blocking each other such that no thread can make progress (because each process waits for another of these processes to proceed).

Define monitor.

Data structure/synchronization construct that fulfills the following properties:
- mutual exclusion
- allows threads to wait for a certain condition
A monitor is a higher-level construct based on semaphores (makes programming easier).
"entering the monitor" = waiting; "acquiring the monitor" = the thread has access to the area protected by the monitor; "releasing the monitor" = leaving the protected area; "exiting the monitor" = no longer in the waiting area.

Name two implementations of a two-thread lock with atomic registers.

Dekker's algorithm, Peterson's algorithm

What is the goal of designing parallel algorithms (regarding performance model)?

Decreasing span without increasing work too much.

Define a "bad interleaving" (high level race condition).

Erroneous program behavior caused by an unfavorable execution order of a multithreaded algorithm that makes use of otherwise well synchronized resources. It does not always lead to an error.

Define the term "data race" (low level race condition).

Erroneous program behaviour caused by insufficiently synchronized accesses of a shared resource by multiple threads, e. g. simultaneous read/write or write/write to the same memory location. Under the Java Memory Model, it leads to undefined behaviour (even if it's a matter of two threads trying to write the same value to a variable) - ALWAYS an error.

Name two implementations of an n-thread lock with atomic registers.

Filter lock Bakery lock

Compare fine and coarse task/thread granularity.

Fine granularity is more portable between machines (machine with more cores can still achieve full utilization). Fine granularity is better for scheduling. If the granularity is too fine (scheduling overhead is comparable to a single task), the overhead dominates, i. e. we increase the workload.

What is a pack (non-standard terminology) and how does one solve it?

Given an array INPUT, produce an array OUTPUT containing only those elements e of INPUT for which f(e) is true.
1. Parallel map to compute a bit vector of the true elements
2. Parallel prefix sum on the bit vector
3. Parallel map to produce the output (size of output = bitvector[n-1])
One application is a parallelized QuickSort. (A sketch of the three steps follows.)
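A sketch of the three steps, shown here sequentially for clarity (in the parallel version, steps 1 and 3 are maps and step 2 is a parallel prefix sum); the predicate "greater than 10" is an arbitrary example:

// Sketch: pack out all elements greater than 10.
class Pack {
    static int[] pack(int[] input) {
        int n = input.length;
        int[] bits = new int[n];
        for (int i = 0; i < n; i++)                 // step 1: map -> bit vector
            bits[i] = input[i] > 10 ? 1 : 0;

        int[] prefix = new int[n];                  // step 2: prefix sum over the bit vector
        prefix[0] = bits[0];
        for (int i = 1; i < n; i++) prefix[i] = prefix[i - 1] + bits[i];

        int[] output = new int[prefix[n - 1]];      // output size = bitvector prefix sum
        for (int i = 0; i < n; i++)                 // step 3: map -> write each kept element
            if (bits[i] == 1) output[prefix[i] - 1] = input[i];
        return output;
    }

    public static void main(String[] args) {
        int[] out = pack(new int[]{17, 4, 6, 8, 11, 5, 13, 19, 0, 24});
        System.out.println(java.util.Arrays.toString(out));   // [17, 11, 13, 19, 24]
    }
}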

Name the guidelines for mutable shared data.

Guideline #0: No data races.
Guideline #1: Consistent locking. For each location needing synchronization, have a lock that is always held when reading or writing the location. Consistent locking partitions the shared-and-mutable locations into "which lock". It is neither sufficient nor required, but an excellent guideline.
Guideline #2: Lock granularity. Start with coarse-grained locks (simpler) and move to fine-grained locks (performance) only if contention on the coarser locks becomes an issue. Alas, this often leads to bugs.
Guideline #3: Critical-section granularity. Do not do expensive computations or I/O in critical sections, but also don't introduce race conditions.
Guideline #4: Atomicity. Think in terms of what operations need to be atomic, i. e. think about atomicity first (define the critical sections) and locks second (implement the critical sections correctly).

Name 3 approaches to apply parallelism in order to improve sequential processor performance.

Hardware level: vectorization, instruction-level parallelism (ILP)
Software level: pipelining

What is the idea behind the Bakery algorithm?

If a process wants to enter its critical section, it has to take a numbered ticket with value greater than all outstanding tickets. It is allowed to enter the CS when its ticket number becomes the lowest.

What is a spinlock?

If a thread tries to acquire a spinlock that is already held, it waits in a loop until it actually acquires the lock.
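A minimal sketch of a spinlock built on Java's AtomicBoolean, using getAndSet as an atomic test-and-set (the class name TasSpinLock is illustrative):

import java.util.concurrent.atomic.AtomicBoolean;

// Sketch: a test-and-set spinlock.
class TasSpinLock {
    private final AtomicBoolean held = new AtomicBoolean(false);

    void lock() {
        // getAndSet(true) returns the previous value: loop while the lock was already held
        while (held.getAndSet(true)) { /* spin */ }
    }

    void unlock() {
        held.set(false);
    }
}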

What is an interleaving?

If the second thread/process starts before the first one ends, we say they interleave.

Explain the difference between implicit and explicit parallelism.

Implicit parallelism occurs when the compiler or the operating system identifies possible parallelism and exploits it. Explicit parallelism occurs when the programmer explicitly specifies the parallelism in the code by creating different threads.

What is a task graph and how is it formed?

In Task Parallel Programming, there are three types of tasks: - execute code - spawn other tasks - wait for results from other tasks A task graph is a directed acyclic graph that is formed dynamically (as execution proceeds) based on spawning tasks.

Explain the concept of instruction-level parallelism (ILP).

In instruction-level parallelism, either the compiler or the processor determines which instructions are independent of each other, meaning that they can be executed in parallel. Methods to exploit ILP are:
1) Pipelining
2) Superscalar CPUs that can process multiple instructions per cycle
3) Out-of-order execution (a potential change in execution order, allowed as long as the result is the same as for the sequential program order)
4) Speculative execution: predict the results of conditions or calculations in order to continue execution before said conditions/calculations are actually processed.

Name different options to create Java Threads.

Instantiate a subclass of the java.lang.Thread class:
- override the run() method
- first create the object, then call its start() method (which invokes the run() method and assigns an OS thread)
- the thread terminates when run() returns
Or implement java.lang.Runnable and hand the Runnable to a Thread - similar steps to the first option.
(Both options are sketched below.)
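A minimal sketch of both options (class names are illustrative):

// Sketch: the two ways to create a Java thread.
class ThreadCreation {
    // option 1: subclass Thread and override run()
    static class MyThread extends Thread {
        public void run() { System.out.println("hello from the Thread subclass"); }
    }

    // option 2: implement Runnable and hand it to a Thread
    static class MyTask implements Runnable {
        public void run() { System.out.println("hello from the Runnable"); }
    }

    public static void main(String[] args) {
        new MyThread().start();             // start(), not run(): assigns an OS thread
        new Thread(new MyTask()).start();
    }
}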

Name the properties of the ExecutorService interface.

It can handle task objects of type: - "Runnable" (void run() ⟶ does not return result) - "Callable<T>" (T call() ⟶ returns result) The ExecutorService returns a Future object f with the result (f.get()).
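A minimal sketch of submitting tasks to an ExecutorService (pool size and task bodies are illustrative):

import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch: submit a Callable to a thread pool and read its result from the Future.
class ExecutorDemo {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);

        Callable<Integer> task = () -> 21 + 21;                      // returns a result
        Future<Integer> f = pool.submit(task);

        pool.submit(() -> System.out.println("side effect only"));  // Runnable: no result

        System.out.println(f.get());   // blocks until the result is available: 42
        pool.shutdown();
    }
}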

Define the happens-before order (HB) of the JMM.

It is formed by the transitive closure of the union of PO and SW. It is consistent insofar as, when reading a variable, we see either the last write or any other unordered write - this means races are allowed!

Name Gustafson's Law.

Let f be the sequential, non-parallelizable fraction of the program, p the number of processors and Tʷᵃˡˡ the time given to complete the work W.
W = p(1 - f)Tʷᵃˡˡ + fTʷᵃˡˡ
Sᵖ = f + p(1 - f) = p - f(p - 1)

Name the differences between native threading and green threading.

Native threading: One Java Thread per OS Thread. OS is in control of schedule. Green threading: Multiple Java Threads can be mapped to the same OS Thread. JVM manages schedule.

What is the difference between optimistic and pessimistic synchronization?

Optimistic synchronization avoids using locks assuming everything will be fine and reverses the effects in case of an error. Pessimistic synchronization uses preventative locks.

Explain the difference between parallel and distributed computing.

Parallel computing means concurrent processing of several different instructions on one system. Distributed computing happens on multiple systems, i. e. there is physical separation.

Name the key concerns of parallelism and concurrency.

Parallelism: How to use extra resources to solve a problem faster Concurrency: How to correctly and efficiently manage access to shared resources The terms parallel and concurrent are often used interchangeably.

Recall the Peterson Lock.

Process 1 (Process 2 is symmetric):

volatile boolean flag[1..2] = [false, false]
volatile int victim = 1

loop
    non-critical section
    flag[1] = true
    victim = 1
    while (flag[2] && victim == 1) { }   // spin
    critical section
    flag[1] = false

Recall Dekker's Algorithm.

Process A (Process B is symmetric):

volatile boolean wanta = false, wantb = false
integer turn = 1

loop
    non-critical section
    wanta = true
    while (wantb) {
        if (turn == 2) {
            wanta = false
            while (turn != 1) { }   // wait for our turn
            wanta = true
        }
    }
    critical section
    turn = 2
    wanta = false

Describe the difference between the memory of processes and the memory of threads.

Processes have different/disjoint memory, whereas multiple threads share the same address space.
+ threads can communicate more easily
+ switching between threads is efficient, as there is no saving/loading of OS process state
- sharing an address space increases vulnerability to programming mistakes

Programming-in-the-large: What does a system consist of?

Processes, Objects and Communication (first and last covered in this class)

What is a Safe SWMR Register?

SWMR stands for Single Writer Multiple Reader, i. e. only one concurrent writer but multiple concurrent readers are allowed. Safe means that any read not concurrent with a write returns the current value of r, and any read concurrent with a write can return any value of the domain of r. (If a read concurrent with a write can only return either the previous value or the new value, the register is called regular.) It is possible to construct mutual exclusion with this type of register.

What are the required properties of mutual exclusion?

Safety property: at most one process executes the critical section code. Liveness property: acquire_mutex must terminate in finite time when no process executes in the critical section.

Name the pros and cons of single-core CPUs versus many-core GPUs.

Single-core (complex control hardware):
+ flexibility and performance
- expensive in terms of power
Many-core (simpler control hardware):
+ potentially more power efficient
- more restrictive, needs complex programming models

Name Amdahl's Law.

Sᵖ ≤ (Wˢᵉʳ + Wᵖᵃʳ) / (Wˢᵉʳ + Wᵖᵃʳ/p)  =>  Sᵖ ≤ 1 / (f + (1-f)/p)
with
Wˢᵉʳ = fT₁ = time spent doing serial work
Wᵖᵃʳ = (1 - f)T₁ = time spent doing parallelizable work
p working units and non-parallelizable serial fraction f of the work.
S∞ ≤ 1/f
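A worked example with assumed numbers: for f = 0.1 and p = 10, Sᵖ ≤ 1 / (0.1 + 0.9/10) ≈ 5.3, and even with infinitely many processors S∞ ≤ 1/0.1 = 10.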

Given a problem where one cannot assess the amount of work it takes at the beginning. What kind of approach does one use and how is it implemented in Java?

Task Parallel Programming: Instead of creating the threads ourselves, we just create the tasks and submit them to an interface that handles a thread pool. The interface then assigns the tasks dynamically to the available threads. (see also: task graph) Java's implementation is the Fork/Join Framework (based on ExecutorService interface). The thread pool is a ForkJoinPool whose default size is the number of available processors.

Name the three different categories for parallel programming models.

Task parallelism, Data parallelism and implicit parallelism.

Name two typical atomic Read-Modify-Write operations.

Test-And-Set (TAS) and Compare-And-Set (CAS). They enable implementation of a mutex with O(1) space and are needed for lock-free programming.

Define the basics of the Java Memory Model (JMM).

The JMM defines "actions", e. g. read(x):1 = "read variable x, the value read is 1". "Executions" combine "actions" with "ordering". The orders are - program order - synchronizes-with - synchronization order - happens-before.

What is latency and how can it be calculated?

The latency is the time it takes to perform a computation, how long it takes to execute a single computation (e. g. CPU instruction) in the pipeline. The lower the latency, the better. The pipeline latency is only constant over time if the pipeline is balanced. Latency = number of stages * max(timePerStage)
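A small worked example with assumed numbers: a pipeline with 4 stages whose slowest stage takes 3 ns has a per-item latency of 4 · 3 ns = 12 ns.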

What is the lead-in and lead-out respectively?

The lead-in of a pipeline is the time until all stages are "busy", i. e. there is a bijection between the processing levels and the currently processed instructions. The lead-out of a pipeline is the time between full utilization of the pipeline until all inputs are processed.

What is the locality of reference/principle of locality and what are its effects on computer architecture?

The locality of reference is the tendency of a processor to access the same set of memory locations repetitively over a short period of time. This principle is one of the reasons for CPU caches.

What kind of objects can the methods join(), sleep() and isAlive() be called on and what do they do?

The methods can be called on threads. join() enables a thread to wait (with or without timeout) for another thread (the target) to terminate by calling the method on the target's thread object. sleep() causes the thread to stop execution for the specified amount of milliseconds. isAlive() allows a thread to determine if the target thread has terminated.
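A minimal sketch of join() and isAlive() (class and field names are illustrative); after join() returns, the worker's result is safely visible to the main thread:

// Sketch: the main thread waits for the worker to terminate before reading its result.
class JoinDemo {
    static long result;

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            long sum = 0;
            for (int i = 1; i <= 1000; i++) sum += i;
            result = sum;
        });
        worker.start();
        worker.join();                         // block until the worker has terminated
        System.out.println(worker.isAlive());  // false: the worker is done
        System.out.println(result);            // 500500
    }
}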

Explain the concept of pipelining.

The processor works on several pieces of work at the same time, but each piece is at a different stage of the pipeline.

Describe the Producer-Consumer pattern.

The producer thread computes X and passes it to the consumer thread. X does not need synchronization (=> lock-free) because at any point in time only one thread accesses X, we only need a synchronized mechanism to pass X from the producer to the consumer (queues).
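A minimal sketch using a BlockingQueue as the synchronized hand-off mechanism (queue capacity and item count are illustrative):

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Sketch: only the queue is synchronized; the items themselves need no locking,
// because at any point in time only one thread accesses a given item.
class ProducerConsumer {
    public static void main(String[] args) {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(10);

        Thread producer = new Thread(() -> {
            try {
                for (int x = 0; x < 5; x++) queue.put(x);   // blocks if the queue is full
            } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        });

        Thread consumer = new Thread(() -> {
            try {
                for (int i = 0; i < 5; i++) System.out.println(queue.take()); // blocks if empty
            } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        });

        producer.start();
        consumer.start();
    }
}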

What is the difference between absolute and relative speedup?

The relative speedup is the relative improvement from using p execution units: the baseline is the serialization of the parallel algorithm. The absolute speedup is the improvement from the best serial algorithm to the parallel algorithm: the baseline is the best serial algorithm.

Define determinism.

The same input always leads to the same output.

Define and state the formulas of the following terms of performance model: - Sequential execution time - Execution time Tᵖ - (parallel) Speedup Sᵖ - Efficiency

The sequential execution time of a program, also called work, is denoted by T₁. The execution time Tᵖ on p CPUs indicates perfection if Tᵖ = T₁/p and performance loss if Tᵖ > T₁/p. The speedup Sᵖ on p CPUs is calculated by Sᵖ = T₁/Tᵖ. It indicates linear speedup if Sᵖ = p and sub-linear speedup if Sᵖ < p. The efficiency is defined by Sᵖ/p.

Performance Model in Task Parallelism: What is the span, which bounds do we have for Tᵖ and how do you calculate parallelism?

The span, critical path length or computational depth T∞ is the time the algorithm takes on infinitely many processors. In a task graph, it is the longest path from root to sink.
Work law: Tᵖ ≥ T₁ / p
Span law: Tᵖ ≥ T∞
Tᵖ ≤ T₁ / p + T∞ (empirically; provable is O(T∞))
Parallelism = maximum possible speed-up = T₁ / T∞
(Note: T₁ / p and T∞ are fixed; Tᵖ depends on the scheduler)
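A worked example with assumed numbers: for T₁ = 1024 and T∞ = 16, the parallelism is 1024 / 16 = 64, so adding more than 64 processors cannot improve the speedup further.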

Explain the concept of vectorization.

In the standard way, the processor processes one element of a vector at a time. In the vectorized way, the processor processes N elements at a time: we load the operands of N operations, perform the N operations at once and store the N results. That way N elements are processed simultaneously. Vectorization is usually done by the compiler (JVM).

Example: element-wise vector addition Z = X op Y, where X0..X3 and Y0..Y3 are combined into Z0..Z3 in one step.

original code:
for (i = 0; i < 4; i++) { Z[i] = X[i] + Y[i]; }

vectorized code:
r1 = load X[i..i+3]
r2 = load Y[i..i+3]
r3 = r1 + r2          // one vector addition of 4 elements
store r3 -> Z[i..i+3]

What has to be ensured when dealing with mutable/shared state and how do you achieve it?

The state needs to be protected (synchronized), i. e. threads have exclusive access and intermediate inconsistent states should not be observed by other threads. Methods: - locks: mechanism to ensure exclusive access/atomicity. Ensuring good performance and correctness can be complicated. - transactional memory: the programmer describes a set of actions that need to be atomic. Easier but getting good performance might be a challenge.

What is throughput and how can it be (approx.) calculated?

The throughput is the amount of work that can be done by a system (pipeline) in a given period of time, e. g. in CPUs the number of instructions completed per second. The greater the throughput, the better. Approximate calculation for pipelines: 1 / max(computationtime(stages)) (There is a trade-off between throughput and latency.)

Why is mutual exclusion in practice not implemented with the Bakery algorithm or the Filter lock?

Theorem 5.1 in "Bounds on Shared Memory for Mutual Exclusion": If S is an atomic read/write system with at least two processes and S solves mutual exclusion with global progress (deadlock-freedom), then S must have at least as many variables as processes. If you have 10 million threads, that takes a lot of memory. Additionally, we need to prevent reordering to use the Bakery or Filter lock - memory barriers in hardware are expensive.

Assume the following declaration of an array A in Java: volatile boolean A[] = new boolean[2]; What might be a problem with this array and how do you fix it?

This declaration declares a volatile reference to an array, not an array of volatile elements: writes to individual elements do not get volatile semantics. Instead, use Java's AtomicIntegerArray, whose elements have the same semantics as volatile fields:
AtomicIntegerArray A = new AtomicIntegerArray(2);
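A minimal usage sketch (0/1 standing in for false/true; the variable name flags is illustrative):

import java.util.concurrent.atomic.AtomicIntegerArray;

// Sketch: each element now has volatile read/write semantics.
class FlagArray {
    public static void main(String[] args) {
        AtomicIntegerArray flags = new AtomicIntegerArray(2);  // all elements start at 0
        flags.set(0, 1);                    // volatile-style write of element 0
        System.out.println(flags.get(1));   // volatile-style read of element 1: prints 0
    }
}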

Explain the concept of thread safety.

Thread safety means that shared data is only manipulated in a manner such that all threads behave properly and fulfill their design specifications - irrespective of thread interleavings! Thread safety thus implies program correctness with respect to the specification.

Define Moore's law and name the issues it faces today.

Transistor count per area doubles every two years. => smaller transistors => more on a chip => computational power grows exponentially => programs get faster Not doable anymore due to heat and power issues: Instead, we just use several processors at once => parallelism

Recall the Filter lock (extension of Peterson's lock).

We assume we have n threads; me refers to the thread executing the function.

int[] level = new int[n]
int[] victim = new int[n]

lock(me) {
    for (int i = 1; i < n; i++) {
        level[me] = i;
        victim[i] = me;
        while (∃ k ≠ me: level[k] ≥ i && victim[i] == me) { }   // spin
    }
}
unlock(me) {
    level[me] = 0;
}

A thread waits in a level while other threads are at the same or a higher level and it is the victim of its current level. => I (as a thread) can make progress if (a) another thread enters my level or (b) no more threads are in front of me.

Explain the parallel programming style Fork/Join.

We create several threads and each thread solves the problem for a given portion of the input. Then we wait for each thread to finish (join) and combine the results. It is a parallel form of Divide-And-Conquer. (prof also calls it Cilk-style)

Define fairness.

We divide the acquisition of a lock into two parts:
- Doorway interval D: time to actually acquire the lock (finite)
- Waiting interval W: time until the lock is not held (unbounded)
A lock algorithm is first-come-first-served (fair) if for two processes A and B the condition Dᴬʲ ⟶ Dᴮᵏ => CSᴬʲ ⟶ CSᴮᵏ holds, i. e. if A completes its j-th doorway before B completes its k-th doorway, then A enters its j-th critical section before B enters its k-th critical section.

What does scalability mean (in parallel programming)?

What happens to the speedup when we increase the number of processors? (if it increases linearly, the program scales linearly) What happens if the number of processors ⟶ ∞?

When does a program end?

When all (non-daemon) threads finish (NOT when main thread returns)

When is a pipeline balanced?

When every stage takes the same amount of time. An unbalanced pipeline can be balanced by making each stage take as much time as the longest one or by splitting up stages (e. g. having two dryers, first one starting to dry the clothes, second one finishing the job). The latter adds time for overhead though.

Describe the semantics of TAS.

boolean TAS(memref s) {
    if (mem[s] == 0) {
        mem[s] = 1;
        return true;
    } else {
        return false;
    }
}
(a convention with mem[s] == 1 is also possible, as declared in the documentation)

Explain the pattern of Divide-And-Conquer/recursive splitting.

if cannot divide:
    return unitary solution
divide problem in two
solve first (recursively)
solve second (recursively)
combine solutions
return result

Name the possible states a thread can be in.

new, runnable, blocked, waiting, timed waiting, terminated

What are the methods wait, notify and notifyAll of a monitor in Java doing and what does one have to remember when using them?

wait() releases the monitor and the thread waits on the object's internal queue until it is notified. notify() wakes the highest-priority thread closest to the front of the object's internal queue. notifyAll() wakes up all waiting threads; the threads then compete for acquiring the monitor. The methods may only be called while the object is locked, i. e. within synchronized code. wait() has to be used within a while-loop, as a thread can return from wait for reasons other than being notified (e. g. spurious wakeups or interruption).
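A minimal sketch of the wait/notifyAll idiom, a one-slot buffer (the class name OneSlotBuffer is illustrative):

// Sketch: a one-slot buffer using wait/notifyAll inside synchronized methods.
class OneSlotBuffer {
    private Integer slot = null;

    synchronized void put(int value) throws InterruptedException {
        while (slot != null) wait();   // while-loop: re-check the condition after waking up
        slot = value;
        notifyAll();                   // wake threads waiting on this object's monitor
    }

    synchronized int take() throws InterruptedException {
        while (slot == null) wait();
        int value = slot;
        slot = null;
        notifyAll();
        return value;
    }
}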

