4-8 Quiz
thread cancellation
- Asynchronous cancellation terminates the target thread immediately
- Deferred cancellation allows the target thread to periodically check whether it should be cancelled
- Cancellation depends on the state of the thread: if cancellation is disabled, a cancellation request remains pending until the thread enables it
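A minimal Pthreads sketch of deferred cancellation; the worker function and its loop body are illustrative, not from the original notes:

```c
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

/* Worker loops forever; pthread_testcancel() marks a cancellation
   point where a pending deferred cancellation request is honored. */
void *worker(void *arg) {
    /* Deferred mode is the default; set explicitly for illustration. */
    pthread_setcanceltype(PTHREAD_CANCEL_DEFERRED, NULL);
    while (1) {
        /* ... do work ... */
        pthread_testcancel();   /* safe point at which to be cancelled */
    }
    return NULL;
}

int main(void) {
    pthread_t tid;
    pthread_create(&tid, NULL, worker, NULL);
    sleep(1);
    pthread_cancel(tid);        /* request cancellation */
    pthread_join(tid, NULL);    /* wait for it to take effect */
    puts("worker cancelled");
    return 0;
}
```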
Multiple-Processor Scheduling
- CPU scheduling is more complex when multiple CPUs are available
Scheduling Criteria
- CPU utilization - keep the CPU as busy as possible
- Throughput - # of processes that complete their execution per time unit
- Turnaround time - amount of time to execute a particular process
- Waiting time - amount of time a process has been waiting in the ready queue
- Response time - amount of time from when a request was submitted until the first response is produced, not the final output (for time-sharing environments)
Algorithm Evaluation
- Deterministic modeling - a type of analytic evaluation
- Takes a particular predetermined workload and defines the performance of each algorithm for that workload
Deterministic Evaluation
- For each algorithm, calculate the minimum average waiting time
- Simple and fast, but requires exact numbers for input and applies only to those inputs
Multiple-Processor Scheduling - Load Balancing
- If SMP, need to keep all CPUs loaded for efficiency
- Load balancing attempts to keep the workload evenly distributed
- Push migration - a periodic task checks the load on each processor and, if it finds an imbalance, pushes tasks from overloaded CPUs to other CPUs
- Pull migration - an idle processor pulls a waiting task from a busy processor
Simulations
- Queueing models are limited; simulations are more accurate
- A programmed model of the computer system; the clock is a variable
- Gather statistics indicating algorithm performance
- Data to drive the simulation is gathered via:
  - Random number generation according to probabilities
  - Distributions defined mathematically or empirically
  - Trace tapes that record sequences of real events in real systems
Signal Handling
- Signals are used in UNIX systems to notify a process that a particular event has occurred
- A signal handler is used to process signals:
  1. Signal is generated by a particular event
  2. Signal is delivered to a process
  3. Signal is handled by one of two signal handlers: default or user-defined
- Every signal has a default handler that the kernel runs when handling the signal; a user-defined signal handler can override the default
- For a single-threaded process, the signal is delivered to the process
- Where do you deliver a signal in a multithreaded program? Options: deliver it to every thread, deliver it to certain threads, deliver it to the thread to which the signal applies, or assign a specific thread to receive all signals
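A sketch of installing a user-defined handler with the POSIX sigaction() API; the handler name and message are illustrative:

```c
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

/* User-defined handler overriding the default action for SIGINT. */
void handler(int sig) {
    /* write() is async-signal-safe; printf() is not. */
    write(STDOUT_FILENO, "caught SIGINT\n", 14);
}

int main(void) {
    struct sigaction sa = {0};
    sa.sa_handler = handler;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGINT, &sa, NULL);   /* install user-defined handler */
    pause();                        /* wait until a signal arrives */
    return 0;
}
```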
Multithreaded Multicore System
- Two levels of scheduling:
  1. The operating system decides which software thread to run on a logical CPU
  2. Each core decides which hardware thread to run on the physical core
Deadlock
- two or more processes are waiting indefinitely for an event that can be caused by only one of the waiting processes
Pthreads
- A POSIX standard (IEEE 1003.1c) API for thread creation and synchronization
- A specification, not an implementation
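A minimal sketch of thread creation and synchronization with the Pthreads API; the summation task is illustrative:

```c
#include <pthread.h>
#include <stdio.h>

/* Each thread computes the sum 0..(n-1) of the value passed via arg. */
void *runner(void *arg) {
    int n = *(int *)arg;
    static int sum;              /* survives the thread; read by parent */
    sum = 0;
    for (int i = 0; i < n; i++)
        sum += i;
    return &sum;
}

int main(void) {
    pthread_t tid;
    int n = 5;
    int *result;
    pthread_create(&tid, NULL, runner, &n);    /* create the thread */
    pthread_join(tid, (void **)&result);       /* wait for it to finish */
    printf("sum = %d\n", *result);             /* prints 10 */
    return 0;
}
```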
Recovery from Deadlock: Process Termination
- Abort all deadlocked processes, OR
- Abort one process at a time until the deadlock cycle is eliminated
In which order should we choose to abort?
1. Priority of the process
2. How long the process has computed, and how much longer until completion
3. Resources the process has used
4. Resources the process needs to complete
5. How many processes will need to be terminated
6. Is the process interactive or batch?
Deadlock Detection
- Allow the system to enter a deadlock state
- Detection algorithm
- Recovery scheme
CPU Scheduling
- CPU-I/O Burst Cycle - process execution consists of a cycle of CPU execution and I/O wait
- CPU burst followed by I/O burst
- CPU burst distribution is of main concern
Resource-Allocation Graph Scheme
- Claim edge Pi → Rj indicates that process Pi may request resource Rj; represented by a dashed line
- A claim edge converts to a request edge when the process requests the resource
- A request edge converts to an assignment edge when the resource is allocated to the process
- When a resource is released by a process, the assignment edge reconverts to a claim edge
- Resources must be claimed a priori in the system
- Suppose that process Pi requests resource Rj: the request can be granted only if converting the request edge to an assignment edge does not result in the formation of a cycle in the resource-allocation graph
One to One
Commonly used on Windows and Linux; each user thread maps to one kernel thread, giving the best concurrency of the basic models.
Types of parallelism
- Data parallelism: distributing subsets of the same data across different cores
- Task parallelism: distributing tasks across different cores
Deadlock in Multithreaded Application
Deadlock is possible if thread 1 acquires first_mutex and thread 2 acquires second_mutex; thread 1 then waits for second_mutex while thread 2 waits for first_mutex. Deadlock can arise only if these four conditions hold simultaneously:
- Mutual exclusion: only one process at a time can use a resource
- Hold and wait: a process holding at least one resource is waiting to acquire additional resources held by other processes
- No preemption: a resource can be released only voluntarily by the process holding it, after that process has completed its task
- Circular wait: there exists a set {P0, P1, ..., Pn} of waiting processes such that P0 is waiting for a resource held by P1, P1 is waiting for a resource held by P2, ..., Pn-1 is waiting for a resource held by Pn, and Pn is waiting for a resource held by P0
A sketch of this scenario appears below.
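A Pthreads sketch of the two-mutex scenario described above; the worker function names are illustrative. Running it may hang, which is exactly the deadlock being demonstrated:

```c
#include <pthread.h>

pthread_mutex_t first_mutex  = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t second_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Thread one locks first_mutex then second_mutex ... */
void *do_work_one(void *arg) {
    pthread_mutex_lock(&first_mutex);
    pthread_mutex_lock(&second_mutex);
    /* critical section */
    pthread_mutex_unlock(&second_mutex);
    pthread_mutex_unlock(&first_mutex);
    pthread_exit(0);
}

/* ... while thread two locks them in the opposite order. If each
   acquires its first lock before the other attempts its second,
   all four conditions hold and both threads block forever. */
void *do_work_two(void *arg) {
    pthread_mutex_lock(&second_mutex);
    pthread_mutex_lock(&first_mutex);
    /* critical section */
    pthread_mutex_unlock(&first_mutex);
    pthread_mutex_unlock(&second_mutex);
    pthread_exit(0);
}

int main(void) {
    pthread_t one, two;
    pthread_create(&one, NULL, do_work_one, NULL);
    pthread_create(&two, NULL, do_work_two, NULL);
    pthread_join(one, NULL);   /* may never return if deadlock occurs */
    pthread_join(two, NULL);
    return 0;
}
```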
POSIX Real-Time Scheduling
Defines two functions for getting and setting the scheduling policy:
1. pthread_attr_getschedpolicy(pthread_attr_t *attr, int *policy)
2. pthread_attr_setschedpolicy(pthread_attr_t *attr, int policy)
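A minimal sketch using these two functions; it assumes a POSIX system, and the error messages are illustrative. Setting a real-time policy such as SCHED_RR may require elevated privileges:

```c
#include <pthread.h>
#include <sched.h>
#include <stdio.h>

int main(void) {
    pthread_attr_t attr;
    int policy;

    pthread_attr_init(&attr);
    /* Query the default policy for threads created with attr. */
    if (pthread_attr_getschedpolicy(&attr, &policy) != 0)
        fprintf(stderr, "unable to get policy\n");
    else
        printf("policy: %s\n",
               policy == SCHED_FIFO ? "SCHED_FIFO" :
               policy == SCHED_RR   ? "SCHED_RR"   : "SCHED_OTHER");

    /* Request round-robin real-time scheduling. */
    if (pthread_attr_setschedpolicy(&attr, SCHED_RR) != 0)
        fprintf(stderr, "unable to set policy\n");

    pthread_attr_destroy(&attr);
    return 0;
}
```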
Queueing Models
- Describes the arrival of processes, and CPU and I/O bursts, probabilistically; commonly exponential and described by mean
- Computes average throughput, utilization, waiting time, etc.
- The computer system is described as a network of servers, each with a queue of waiting processes
- Knowing arrival rates and service rates, we can compute utilization, average queue length, average wait time, etc.
Functional Programming Languages
Functional programming languages offer a different paradigm than procedural languages in that they do not maintain state. Variables are treated as immutable and cannot change state once they have been assigned a value. There is increasing interest in functional languages such as Erlang and Scala for their approach in handling data races.
Dispatcher
Gives control of the CPU to the process selected by the CPU scheduler; this involves:
- Switching context from one process to another
- Switching to user mode
- Jumping to the proper location in the user program to resume that program
The goal is to minimize dispatch latency, the time the dispatcher takes to stop one process and start another.
Thread Pools
Create a pool of waiting threads; when a request arrives, it is handed to a waiting thread, and if no thread is available the request is placed in a queue. Pros:
- Servicing a request with an existing thread is faster than creating a new thread
- Limits the number of threads active at any one time
- Separating task creation from task scheduling allows different scheduling strategies
Challenges multicore
- Identifying tasks - find areas of the application that can be divided up, ideally independent of each other
- Balance - making sure that each core does an equal amount of work
- Data splitting - the data must be divided just like the tasks
- Data dependency - if task B needs data from task A, execution of the tasks must be ordered accordingly
- Testing and debugging - very hard
Condition Variables Choices
If process P invokes x.signal() and process Q is suspended in x.wait(), what should happen next?
- Signal and wait - P waits until Q either leaves the monitor or waits for another condition
- Signal and continue - Q waits until P either leaves the monitor or waits for another condition
Minimizing latency
- Interrupt latency - the period of time from the arrival of an interrupt at the CPU to the start of the routine that services the interrupt
- Dispatch latency - the amount of time required for the scheduling dispatcher to stop one process and start another
- The conflict phase of dispatch latency has two components:
  1. Preemption of any process running in the kernel
  2. Release by low-priority processes of resources needed by a high-priority process
Deadlock Prevention
Invalidate one of the four necessary conditions for deadlock. Invalidating the circular-wait condition is most common: simply assign each resource (e.g., each mutex lock) a unique number, and require that resources be acquired in order, as sketched below.
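A sketch of ordered acquisition; the helper name is hypothetical, and it uses each lock's address as its unique number. Because every thread acquires the lower-numbered lock first, a circular wait cannot form:

```c
#include <pthread.h>
#include <stdint.h>

/* Hypothetical helper: impose a global order on lock acquisition by
   treating each lock's address as its unique number; acquiring in
   ascending order invalidates the circular-wait condition. */
void lock_pair_in_order(pthread_mutex_t *a, pthread_mutex_t *b) {
    if ((uintptr_t)a < (uintptr_t)b) {
        pthread_mutex_lock(a);
        pthread_mutex_lock(b);
    } else {
        pthread_mutex_lock(b);
        pthread_mutex_lock(a);
    }
}
```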
Liveness
Liveness refers to a set of properties that a system must satisfy to ensure processes make progress.
Single Instance of Each Resource Type
- Maintain a wait-for graph: nodes are processes, with an edge Pi → Pj if Pi is waiting for Pj
- Periodically invoke an algorithm that searches for a cycle in the graph; if there is a cycle, there exists a deadlock
- An algorithm to detect a cycle in a graph requires on the order of n^2 operations, where n is the number of vertices in the graph
Multithread vs Multicore
Multicore refers to a computer or processor that has more than one logical CPU core, and that can physically execute multiple instructions at the same time. A computer's "core count" is the total number of cores the computer has: computers may have multiple processors, each of which might have multiple cores; the core count is the total number of cores on all of the processors. Multithreading refers to a program that can take advantage of a multicore computer by running on more than one core at the same time. In general, twice as many cores equals twice as much computing power (for programs that support multithreading) though some problems are limited by factors other than CPU usage; these problems will not experience gains that are as dramatic from multithreading.
Semaphore Implementation
Must guarantee that no two processes can execute wait() and signal() on the same semaphore at the same time; the implementation itself thus becomes a critical-section problem, typically solved with busy waiting in the wait() and signal() code.
Priority-based scheduling
Note that providing a preemptive, priority-based scheduler only guarantees soft real-time functionality; hard real-time systems require additional scheduling techniques.
Thread scheduling
On systems implementing the many-to-one (Section 4.3.1) and many-to-many (Section 4.3.3) models, the thread library schedules user-level threads to run on an available LWP. This scheme is known as process-contention scope (PCS), since competition for the CPU takes place among threads belonging to the same process. To decide which kernel-level thread to schedule onto a CPU, the kernel uses system-contention scope (SCS).
periodic processes
Once a periodic process has acquired the CPU, it has a fixed processing time t, a deadline d by which it must be serviced by the CPU, and a period p. The relationship of the processing time, the deadline, and the period can be expressed as 0 ≤ t ≤ d ≤ p. The rate of a periodic task is 1∕p.
Monitors
Only one process may be active within the monitor at a time
Shortest Job first
Optimal, but cannot be implemented at the CPU level because there is no way to know the length of the next CPU burst. We can estimate it using an exponential average of the lengths of previous bursts: τ(n+1) = α · t(n) + (1 - α) · τ(n), where t(n) is the length of the nth burst and τ(n) is the previous estimate.
Rate-Monotonic Scheduling
The shorter the period, the higher the priority; the longer the period, the lower the priority. Rate-monotonic scheduling is optimal among algorithms that use static priorities.
Multithreading model
- User threads - supported above the kernel, without kernel support
- Kernel threads - managed directly by the operating system
Resource-Allocation Graph
V is partitioned into two types:
- P = {P1, P2, ..., Pn}, the set consisting of all the processes in the system
- R = {R1, R2, ..., Rm}, the set consisting of all resource types in the system
- Request edge - directed edge Pi → Rj
- Assignment edge - directed edge Rj → Pi
If the graph contains no cycles, there is no deadlock. If the graph contains a cycle: with only one instance per resource type, there is a deadlock; with several instances per resource type, there is the possibility of deadlock.
Safe State
When a process requests an available resource, the system must decide if immediate allocation leaves the system in a safe state (i.e., no deadlock). A state is safe if there exists a sequence of all processes such that, if Pi's resource needs are not immediately available, Pi can wait until all Pj (j < i) have finished; when Pj finishes, Pi can obtain its needed resources, execute, return its allocated resources, and terminate; when Pi terminates, Pi+1 can obtain its needed resources, and so on.
Multiple-Processor Scheduling - Processor Affinity
When a thread has been running on one processor, that processor's cache holds the thread's recent memory accesses; this is processor affinity.
- Soft affinity - the operating system attempts to keep a thread running on the same processor, but makes no guarantees
- Hard affinity - allows a process to specify the set of processors it may run on
thread
basic unit of CPU utilization; a thread has its own thread ID, program counter, register set, and stack, and shares the code section, data section, and other operating-system resources with the other threads of its process.
Implicit threading
creation and management of threads is handled by compilers and run-time libraries rather than by the programmer.
Round Robin
each process gets a small unit of CPU time (a time quantum q); after this time has elapsed, the process is preempted and sent to the back of the ready queue. If q is large, round robin is basically FIFO; if q is small, too much time is spent on context switching.
Bounded-Buffer Problem
int n;
semaphore mutex = 1;
semaphore empty = n;
semaphore full = 0;
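A runnable sketch of the bounded buffer using POSIX semaphores (sem_t) in place of the abstract semaphore type above; the function names and buffer size are illustrative:

```c
#include <semaphore.h>

#define N 10                 /* buffer capacity */

int buffer[N];
int in = 0, out = 0;

sem_t mutex;                 /* binary semaphore, initialized to 1 */
sem_t empty;                 /* counts empty slots, initialized to N */
sem_t full;                  /* counts full slots, initialized to 0 */

void init(void) {
    sem_init(&mutex, 0, 1);
    sem_init(&empty, 0, N);
    sem_init(&full, 0, 0);
}

void produce(int item) {
    sem_wait(&empty);        /* wait for a free slot */
    sem_wait(&mutex);        /* enter critical section */
    buffer[in] = item;
    in = (in + 1) % N;
    sem_post(&mutex);
    sem_post(&full);         /* signal one more full slot */
}

int consume(void) {
    sem_wait(&full);         /* wait for a full slot */
    sem_wait(&mutex);
    int item = buffer[out];
    out = (out + 1) % N;
    sem_post(&mutex);
    sem_post(&empty);        /* signal one more empty slot */
    return item;
}
```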
OpenMP
is a set of compiler directives and an API that support parallel programming, e.g. #pragma omp critical { critical section }
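A minimal compilable example of the critical-section directive (build with -fopenmp); the shared counter is illustrative:

```c
#include <omp.h>
#include <stdio.h>

int main(void) {
    int counter = 0;

    /* Create a team of threads; each thread executes the block. */
    #pragma omp parallel
    {
        /* Only one thread at a time may execute a critical section. */
        #pragma omp critical
        {
            counter++;
        }
    }
    printf("threads that ran: %d\n", counter);
    return 0;
}
```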
Many to One
many user threads map to one kernel thread; threads cannot run in parallel on multicore systems because only one thread may be in the kernel at a time.
Memory Barriers
Memory models:
- Strongly ordered - a memory modification on one processor is immediately visible to all other processors
- Weakly ordered - modifications may not be immediately visible
A memory barrier is an instruction that forces any change in memory to be propagated (made visible) to all other processors. A sketch using C11 atomics follows.
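A sketch of the classic flag-and-data pattern using C11 atomic_thread_fence as the memory barrier; the variable names are illustrative:

```c
#include <stdatomic.h>
#include <stdbool.h>

int data;                       /* plain shared data */
atomic_bool ready = false;      /* flag guarded by the barriers */

/* Producer: ensure the write to 'data' is visible before 'ready'. */
void producer(void) {
    data = 42;
    atomic_thread_fence(memory_order_release);   /* memory barrier */
    atomic_store_explicit(&ready, true, memory_order_relaxed);
}

/* Consumer: ensure 'ready' is observed before 'data' is loaded. */
int consumer(void) {
    while (!atomic_load_explicit(&ready, memory_order_relaxed))
        ;                                        /* spin */
    atomic_thread_fence(memory_order_acquire);   /* memory barrier */
    return data;                                 /* guaranteed to see 42 */
}
```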
POSIX API provides
- mutex locks
- semaphores
- condition variables
Little's Formula
- n = average queue length
- W = average waiting time in queue
- λ = average arrival rate into queue
- In steady state, processes leaving the queue must equal processes arriving, thus n = λ × W
- Valid for any scheduling algorithm and arrival distribution
- Example: if λ = 7 processes/sec arrive on average and n = 14 processes are normally in the queue, then the average waiting time is W = n/λ = 2 sec
Deadlock Avoidance
Needs a priori information:
- The simplest and most useful model requires that each process declare the maximum number of resources of each type that it may need
- The deadlock-avoidance algorithm dynamically examines the resource-allocation state to ensure that there can never be a circular-wait condition
- The resource-allocation state is defined by the number of available and allocated resources and the maximum demands of the processes
asynchronous threading
once the parent creates a child thread, the parent resumes its execution, so that the parent and child execute concurrently and independently of one another. Because the threads are independent, there is typically little data sharing between them
Synchronous threading
parent thread creates one or more children and then must wait for all of its children to terminate before it resumes. Here, the threads created by the parent perform work concurrently, but the parent cannot continue until this work has been completed. Once each thread has finished its work, it terminates and joins with its parent. Only after all of the children have joined can the parent resume execution.
Amdahl's Law
estimates the performance gain from adding cores to a process with both serial and parallel parts. If S is the serial fraction of the code and N is the number of cores, then speedup ≤ 1 / (S + (1 - S)/N). For example, if an application is 25% serial (S = 0.25) and we move from 1 to 2 cores, the speedup is at most 1 / (0.25 + 0.75/2) = 1.6.
CPU Scheduler
Picks which process in the ready queue to dispatch to a core. CPU scheduling decisions may take place when a process:
1. Switches from running to waiting state
2. Switches from running to ready state
3. Switches from waiting to ready
4. Terminates
- Scheduling under 1 and 4 is nonpreemptive; all other scheduling is preemptive
- Consider access to shared data
- Consider preemption while in kernel mode
- Consider interrupts occurring during crucial OS activities
Priority Scheduling
A priority number is associated with each process; the scheduler selects the lowest number (highest priority) first.
- Problem: Starvation - low-priority processes may never execute
- Solution: Aging - as time progresses, increase the priority of the process
admission-control algorithm
the process gives its deadline to the scheduler, which then does one of two things: it either admits the process, guaranteeing that the process will complete on time, or rejects the request as impossible.
Mutex locks
a process must acquire the lock before entering a critical section and release the lock when it exits. The acquire() function acquires the lock, and the release() function releases it. A mutex lock implemented with busy waiting is called a spinlock. A sketch follows.
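A minimal sketch of acquire/release around a critical section using a Pthreads mutex; the shared counter is illustrative:

```c
#include <pthread.h>

pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
int shared_counter = 0;             /* data shared between threads */

void *increment(void *arg) {
    pthread_mutex_lock(&lock);      /* acquire() */
    /* critical section */
    shared_counter++;
    pthread_mutex_unlock(&lock);    /* release() */
    return NULL;
}
```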
Synchronization background
processes may be interrupted at any point in their execution so another process can run; this can leave shared data in an inconsistent or partially updated state.
Atomic variables
provides atomic (uninterruptible) updates on basic data types such as integers and booleans.
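A minimal sketch using C11 atomics; the sequence-number use case is illustrative:

```c
#include <stdatomic.h>

atomic_int sequence = 0;

/* Each call returns a unique value: the read-modify-write is a
   single uninterruptible operation, so no lock is needed. */
int next_sequence(void) {
    return atomic_fetch_add(&sequence, 1);
}
```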
multicore
places multiple computing cores on one chip, each of which appears to the operating system as a separate CPU; allows for true parallelism.
Why use thread
- Responsiveness - one thread can respond while other threads are blocked
- Resource sharing - multiple threads share the same address space
- Economy - threads share resources, avoiding costly resource allocation
- Scalability - multithreaded applications can run threads on different processors at the same time
Proportional Share Scheduling
T shares are allocated among all processes in the system; an application receives N shares, where N < T, which ensures the application will receive N/T of the total processor time.
Linux threads
Linux uses the term task rather than thread; clone() allows a child task to share the address space of the parent task.
Multilevel Queue Scheduling
to avoid searching a single queue for the highest-priority process, we can place tasks in separate queues, one per priority level.
Earliest-Deadline-First Scheduling (EDF)
what it sounds like: the earlier the deadline, the higher the priority; priorities are assigned dynamically according to deadlines.
First-Come First-Served (FCFS)
what it sounds like: processes are dispatched in the order they arrive.
- Convoy effect - short processes stuck behind a long process; consider one CPU-bound and many I/O-bound processes: the I/O-bound processes must repeatedly wait for the CPU-bound process to release the CPU
race condition.
when several processes access and manipulate the same data at the same time, often leading to results that depend on the particular order in which the processes run.
fork() issues
if one thread in a program calls fork(), are all threads duplicated, or is the new process single-threaded? Some UNIX systems provide two versions of fork(): one that duplicates all threads and one that duplicates only the thread that invoked fork().
(SMP) Symmetric multiprocessing
where each processor is self-scheduling.
- All threads may be in a common ready queue, or
- Each processor may have its own private queue of threads
The Readers -Writers Problem
if a writer and some other process (either a reader or a writer) access the database simultaneously, chaos may ensue.
- First variation - no reader is kept waiting unless a writer has already obtained permission to use the shared object
- Second variation - once a writer is ready, it performs its write as soon as possible
Priority Inheritance Protocol
allows the priority of the highest-priority thread waiting to access a shared resource to be assigned to the thread currently using the resource. Thus, the current owner of the resource runs at the priority of the highest-priority thread wishing to acquire it.
Condition Variables
The only operations that can be invoked on a condition variable are wait() and signal(). The x.wait() operation means that the process invoking it is suspended until another process invokes x.signal(). The x.signal() operation resumes exactly one suspended process; if no process is suspended in x.wait(), it has no effect on the variable. A Pthreads analogue is sketched below.
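A sketch of x.wait()/x.signal() using Pthreads condition variables; monitors are not available in C, so the mutex plays the role of the monitor lock, and the names are illustrative:

```c
#include <pthread.h>
#include <stdbool.h>

pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t  x = PTHREAD_COND_INITIALIZER;
bool condition_holds = false;

/* Analogue of x.wait(): suspend until another thread signals.
   The while loop re-checks the condition after waking. */
void wait_for_condition(void) {
    pthread_mutex_lock(&m);
    while (!condition_holds)
        pthread_cond_wait(&x, &m);   /* releases m while suspended */
    pthread_mutex_unlock(&m);
}

/* Analogue of x.signal(): resume exactly one suspended thread
   (a no-op if no thread is waiting, as described above). */
void signal_condition(void) {
    pthread_mutex_lock(&m);
    condition_holds = true;
    pthread_cond_signal(&x);
    pthread_mutex_unlock(&m);
}
```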
Hardware Instructions
- Test-and-set instruction
- Compare-and-swap instruction
Both are atomic, meaning they cannot be interrupted, which is what solves the critical-section problem. Textbook-style definitions are sketched below.
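Definitions showing the semantics of the two instructions; on real hardware each executes as a single uninterruptible instruction, so these plain C functions are a sketch of the behavior, not an actual atomic implementation:

```c
#include <stdbool.h>

/* Semantics of test_and_set: the whole function is one atomic step. */
bool test_and_set(bool *target) {
    bool rv = *target;
    *target = true;
    return rv;
}

/* Semantics of compare_and_swap: also one atomic step. */
int compare_and_swap(int *value, int expected, int new_value) {
    int temp = *value;
    if (*value == expected)
        *value = new_value;
    return temp;
}

/* Mutual exclusion with test_and_set; 'lock' starts out false. */
bool lock = false;

void enter_critical(void) {
    while (test_and_set(&lock))
        ;   /* busy wait until the lock was previously false */
}

void leave_critical(void) {
    lock = false;
}
```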
Transactional Memory
A memory transaction is a sequence of read-write operations to memory that are performed atomically. A transaction can be expressed as atomic{S}, which ensures the statements in S are executed atomically.
Starvation
A process may never be removed from the semaphore queue in which it is suspended
Critical section problem
Each process has a segment of code, called a critical section, in which the process may be accessing and updating data that is shared with at least one other process. No two processes may be in their critical sections at the same time. How do we solve this? Each process must request permission to enter its critical section; the section of code implementing this request is the entry section. The critical section may be followed by an exit section, and the remaining code is the remainder section. A solution must satisfy three requirements:
1. Mutual exclusion - if process P is executing in its critical section, no other process can be executing in its critical section
2. Progress - if no process is executing in its critical section and some processes wish to enter their critical sections, then only those processes that are not executing in their remainder sections can participate in deciding which will enter next, and this selection cannot be postponed indefinitely
3. Bounded waiting - there exists a bound, or limit, on the number of times that other processes are allowed to enter their critical sections after a process has made a request to enter its critical section and before that request is granted
The problem affects the kernel as well.
System Model
- Resource types R1, R2, ..., Rm (CPU cycles, memory space, I/O devices)
- Each resource type Ri has Wi instances
- Each process utilizes a resource as follows: request, use, release
Preemptive vs Nonpreemptive
Scheduling decisions may take place under the following situations:
1. A process switches from running to waiting - nonpreemptive
2. A process switches from running to ready - preemptive
3. A process switches from waiting to ready - preemptive
4. A process terminates - nonpreemptive
Preemptive scheduling is the most common.
Priority Inversion
Scheduling problem when lower-priority process holds a lock needed by higher-priority process
Semaphore
Semaphore S - an integer variable that can only be accessed via two indivisible (atomic) operations, wait() and signal():
- wait(S) - wait until S > 0, then S = S - 1
- signal(S) - S = S + 1 (one waiting process may now be able to execute)
Busy-waiting definitions are sketched below.
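A busy-waiting sketch of the two operations; the names are prefixed to avoid clashing with POSIX wait() and signal(), and in a real implementation each must execute atomically:

```c
/* wait(): busy-wait until the semaphore is positive, then decrement. */
void semaphore_wait(int *S) {
    while (*S <= 0)
        ;           /* busy wait */
    (*S)--;
}

/* signal(): increment the semaphore. */
void semaphore_signal(int *S) {
    (*S)++;
}
```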
Linux Synchronization
- Semaphores
- Atomic integers
- Spinlocks
- Reader-writer versions of both
Two-level model
Similar to M:M, except that it allows a user thread to be bound to kernel thread
Real-time CPU
- Soft real-time - no guarantee as to when a critical process will be scheduled, only that it has higher priority than other processes
- Hard real-time - a task must be serviced by its deadline
Semaphore Implementation with no Busy waiting
With each semaphore there is an associated waiting queue. Each entry in the waiting queue has two data items: a value (of type integer) and a pointer to the next record in the list. Two operations are added to wait() and signal():
- block - place the process invoking the operation on the appropriate waiting queue
- wakeup - remove one of the processes in the waiting queue and place it in the ready queue
A runnable sketch follows.
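A user-level sketch in which the condition variable supplies the waiting queue: pthread_cond_wait plays the role of block and pthread_cond_signal the role of wakeup. This is an analogue under those assumptions, not the kernel's implementation:

```c
#include <pthread.h>

/* Semaphore with no busy waiting: blocked callers sleep on 'queue'. */
typedef struct {
    int value;
    pthread_mutex_t lock;
    pthread_cond_t  queue;      /* the associated waiting queue */
} semaphore;

void semaphore_init(semaphore *s, int value) {
    s->value = value;
    pthread_mutex_init(&s->lock, NULL);
    pthread_cond_init(&s->queue, NULL);
}

void semaphore_wait(semaphore *s) {
    pthread_mutex_lock(&s->lock);
    while (s->value <= 0)
        pthread_cond_wait(&s->queue, &s->lock);   /* block */
    s->value--;
    pthread_mutex_unlock(&s->lock);
}

void semaphore_signal(semaphore *s) {
    pthread_mutex_lock(&s->lock);
    s->value++;
    pthread_cond_signal(&s->queue);               /* wakeup */
    pthread_mutex_unlock(&s->lock);
}
```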
multilevel feedback queue
allows a process to move between queues. In general, a multilevel feedback queue scheduler is defined by the following parameters:
- The number of queues
- The scheduling algorithm for each queue
- The method used to determine when to upgrade a process to a higher-priority queue
- The method used to determine when to demote a process to a lower-priority queue
- The method used to determine which queue a process will enter when that process needs service
Thread-local storage (TLS)
allows each thread to have its own copy of data; useful when you do not have control over the thread creation process (e.g., when using a thread pool).
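A minimal sketch using the C11 _Thread_local storage class (GCC and Clang also accept __thread); the variable name and thread bodies are illustrative:

```c
#include <pthread.h>
#include <stdio.h>

/* Each thread gets its own copy of this variable. */
static _Thread_local int transaction_id = 0;

void *worker(void *arg) {
    transaction_id = (int)(long)arg;    /* private to this thread */
    printf("thread sees transaction_id = %d\n", transaction_id);
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, (void *)1L);
    pthread_create(&t2, NULL, worker, (void *)2L);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    /* main's copy is untouched by the workers */
    printf("main sees transaction_id = %d\n", transaction_id);
    return 0;
}
```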
Many to Many
allows many user threads to be multiplexed onto a smaller or equal number of kernel threads; the operating system can create a sufficient number of kernel threads.