OS Midterm 1
xv6 Interrupts
0-31 are reserved by Intel for different types of processor exceptions. 32-47 are used for device interrupts -- most can be configured. 64 (0x40) is used for system call interrupts -- Linux uses 128 (0x80) for its system call. xv6 interrupt handler entries: vectors.S (generated by vectors.pl).
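A minimal sketch of the vector ranges above. The function name and the "unused" label are made up for illustration; only the ranges come from the notes.

```python
T_IRQ0 = 32      # first device interrupt vector (32-47 are device interrupts)
T_SYSCALL = 64   # 0x40, the xv6 system call vector


def classify_vector(n: int) -> str:
    """Label an x86 interrupt vector number the way the notes describe xv6's use."""
    if not 0 <= n <= 255:
        raise ValueError("x86 has 256 interrupt vectors (0-255)")
    if n < T_IRQ0:
        return "processor exception"
    if n < T_IRQ0 + 16:
        return "device interrupt"
    if n == T_SYSCALL:
        return "system call"
    return "unused"  # hypothetical label for everything else
```

For example, classify_vector(14) falls in the exception range and classify_vector(0x40) is the syscall vector.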
Combination Hypervisor
A combination of native and hosted hypervisors. Processor virtualization and a dedicated host OS.
How does OS know which code to call?
A number is associated with each syscall -- the syscall interface maintains a table indexed according to these numbers, and the API converts the symbolic names of these services to these numbers. glibc (the C library) provides wrappers for the syscalls that eventually generate the system calls.
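A toy sketch of the table-indexed-by-number idea. The syscall numbers, the fake PID, and all function names here are hypothetical; real kernels define their own numbering.

```python
# Hypothetical syscall numbers, for illustration only.
SYS_getpid = 1
SYS_add = 2

FAKE_PID = 42  # stand-in for the calling process's PID


def sys_getpid():
    return FAKE_PID


def sys_add(a, b):
    return a + b


# The "kernel side": a table indexed by syscall number.
SYSCALL_TABLE = {SYS_getpid: sys_getpid, SYS_add: sys_add}


def syscall(number, *args):
    """Dispatch on the syscall number; return -1 for an unknown number."""
    handler = SYSCALL_TABLE.get(number)
    if handler is None:
        return -1
    return handler(*args)


def getpid():
    """The "library side": a wrapper that turns a symbolic name into a number."""
    return syscall(SYS_getpid)
```

The wrapper plays the role the notes give to glibc: user code calls getpid() by name and never sees the number.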
Kernel Mode
A privileged mode in which the operating system runs.
fork-exec Process
A process (pid = 1) is running and wants to start another. A call to fork copies the state (exact replica) of the current (parent) process to a new child process and returns the new pid to the parent and 0 to the child (it can retrieve its own pid with a syscall if needed). A call to exec now loads a new program with its own code, sets up its global data, stack, heap, etc. Exec only returns if an error has occurred. Parent can continue execution or block until the child process finishes by calling wait. The effect of a call to exec is a separate program image is loaded and begins execution. If the parent calls wait then it will block until the child process calls exit. Exit is generally called when the program finishes main. Exit deletes the current process and its resources (address space, etc.). PCB is not deallocated until parent or other entity specifically releases it. This allows the parent or other task to examine the child process' exit status.
Process's Double Life
A process is not just a user-mode-only concept. In general, a process has both user-mode and kernel-mode lives. The kernel is mapped into every process's address space. A process also has separate user-mode and kernel-mode contexts. User-mode context: register values when running in user mode + user-mode stack. Kernel-mode context: register values when running in kernel mode + kernel-mode stack. In most OSs, each process has two stacks. HW changes the stack pointer to point to the base of the kernel's stack on a trap or interrupt. Advantage: the kernel can execute even if the user stack is corrupted, and attacks that target the stack, such as buffer overflow attacks, will not affect the kernel. On a multiprocessor, each processor has its own interrupt stack.
Implementing Syscalls
A system call generates a trap -- in x86, an int instruction traps into the kernel on a syscall. In an xv6 user-level program, if you see int 0x40 it's a syscall. The executing process is suspended and control is transferred to the kernel, which implements each syscall. The process continues when the syscall completes. (User mode -trap-> kernel mode -> user mode upon return from the syscall.)
Address Space Abstraction
Abstraction of physical memory. Goals: to give each process a private memory area for code, data, and stack, to prevent one process from reading / writing outside of its address space, and to allow sharing when needed. Usually the implementation is split between the OS and HW. The OS manages address spaces and allocates physical memory (for creation, growth, and deletion). The HW performs address translation and protection and translates user addresses into physical addresses. The OS has its own address space.
What Apps Want from OS
Access to HW, Memory Management, and Data. Abstract HW for convenience and portability. Multiplex HW among many apps. Isolate apps to contain bugs. Share among apps.
Process Address Space
All memory a process can address. Consists of a stack and a heap, which are referred to as dynamic memory because each can change size at runtime (classically they grow towards each other to make the best use of the space between them). Also consists of data and text (code) -- these are static memory. All of these components are at virtual addresses. In xv6, KERNBASE is the upper limit of user virtual addresses, which allows a process to address up to 2 GB. The kernel in xv6 has its own text + data + device memory sections in the process address space. In xv6, the heap sits above the stack so it can expand with sbrk, and the stack is a single page (4 KB). User code can only access the memory it's permitted to (like a sandbox).
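A small sketch of the 2 GB limit and the 4 KB page size. KERNBASE = 0x80000000 and PGSIZE = 4096 match xv6's definitions; the helper names are made up for illustration.

```python
KERNBASE = 0x80000000  # user virtual addresses lie below this (2 GB)
PGSIZE = 4096          # one page, the size of the xv6 user stack


def is_user_address(va: int) -> bool:
    """True if a virtual address is in the user part of the address space."""
    return 0 <= va < KERNBASE


def page_round_up(size: int) -> int:
    """Round a byte count up to a whole number of pages."""
    return (size + PGSIZE - 1) & ~(PGSIZE - 1)
```

Growing the heap by one byte with sbrk can still cost a whole page, which is what page_round_up(1) == 4096 expresses.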
Interrupts
An event that diverts the current execution of a program to a predefined location in the kernel in order to respond to an event. SW Interrupts are called traps.
System Calls
An interface between a user app and a service provided by the OS. Used because we don't trust user software -- provides a level of separation. The API of the OS from the user program's point of view, accessible via a library of code, like libc. Most common APIs are Win32 for Windows, POSIX for Linux, Mac, etc., and the Java API for the JVM. Six major categories: process control (fork, exit, wait), device management (ioctl, read, write), file management (open, read, write, close), information maintenance (getpid, alarm, sleep), communications (pipe, shmget, mmap), and protection (chmod, umask, chown).
Maskable HW Interrupt
An interrupt that can be disabled or ignored through CPU instructions. While masked, interrupts arriving on the INTR line are not delivered. Useful when the CPU needs to do something more important. A request on the INTR line from the PIC is only recognized if the interrupt flag (IF) is set.
Non-maskable HW Interrupt
An interrupt that cannot be disabled by the CPU instructions. Invoked by NMI line from PIC (Programmable Interrupt Controller). No other interrupts can execute until the NMI is done. Never ignored (power failure, non-recoverable HW errors).
Native Hypervisor Communication
App is running, then makes a syscall which invokes the hypervisor -- not the OS (OS actually runs in user mode here). Then the hypervisor invokes the guest OS which delivers the result and uses virtual HW. Then the guest OS goes back to the hypervisor to restore the state of the app and then goes back to the app. Only 4 mode switches total.
Internet Network Layers
Apps (independent), transport (between apps), network (communication), data link (a wire, the air -- like WiFi), and physical (HW).
HW Interrupts
Asynchronous. Spawned by an event external to the CPU. Caused by any peripheral device by sending a signal through a specified pin to the processor. Unpredictable event. Generated by devices when a request is complete or an event that requires CPU attention occurs. Don't happen at predictable places in the user code. Can be maskable or non-maskable.
Security
Authorization. Can the system be compromised by an attacker? Data should only be accessible to authorized users.
Exec
Changes the program being run by the current process. Replaces the data and program (text section) executed by a process; control never returns to the original program unless there is an exec error. There is a whole family of exec calls. The child process typically calls exec once it has returned from fork and configured the execution environment for the new program. All open file descriptors remain open after calling exec unless explicitly set to close-on-exec (this flag causes the file descriptor to be automatically (and atomically) closed when any of the exec calls succeed). Resources like open files are stored in kernel space while text and data are in user space.
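A sketch of reading and toggling the close-on-exec flag on a descriptor, assuming a POSIX system. It uses Python's standard fcntl module; the two helper names are made up for illustration.

```python
import fcntl
import os


def is_close_on_exec(fd: int) -> bool:
    """True if the descriptor will be closed automatically on a successful exec."""
    return bool(fcntl.fcntl(fd, fcntl.F_GETFD) & fcntl.FD_CLOEXEC)


def set_close_on_exec(fd: int, enabled: bool) -> None:
    """Set or clear FD_CLOEXEC without touching other descriptor flags."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    if enabled:
        flags |= fcntl.FD_CLOEXEC
    else:
        flags &= ~fcntl.FD_CLOEXEC
    fcntl.fcntl(fd, fcntl.F_SETFD, flags)
```

A child that should inherit a pipe end across exec would clear the flag on that descriptor before calling exec.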
Policies
Configure the mechanisms. Ex: number of sockets per process, which data should be removed from main memory, which kind of memory to use for something. Should be flexible -- may need to adjust, and it's easier to update the config file just once.
User Space
Consists of the address spaces of each process. User programs are "non-urgent". Can't access kernel space directly -- have to use syscalls.
Kernel Space
Consists of the kernel process address space and process table. These processes are trusted.
Program Address Space
Contains code, data, stack and the heap in the user address space. Also contains the kernel stack, which is maintained in the kernel address space.
OS Kernel
Core component of the OS, has full access to all of the hardware. Responsible for protection (security, privacy, reliability, and fair resource allocation), process management, memory management, I/O management, and system calls. OS needs to provide many services to user-level processes -- where should those services go (user vs. kernel mode)? Tradeoffs: evaluate where the services should be located based on performance, reliability / security, and flexibility. Can be monolithic, layered, microkernel, or virtual machine.
Fork
Create copy of the current parent process and start it (child) running -- no arguments, returns a pid_t. Creates a new child process that's identical to the calling "parent" process, including all state (memory, registers, etc.). Returns 0 to the child process and returns the child's process ID (PID) to the parent process. On failure, -1 is returned in the parent and no child process is created. The child is almost identical to the parent -- child gets an identical (but separate) copy of the parent's virtual address space and the child has a different PID than the parent. Unique function, because it's called once but returns twice on success. FEASIBLE OUTPUTS COULD BE TESTED.
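A sketch of "called once, returns twice", assuming a POSIX system. The function name and the exit code 3 are arbitrary choices for illustration.

```python
import os


def fork_demo():
    """Fork once. In the parent, return (child_pid, child_exit_code)."""
    pid = os.fork()
    if pid == 0:
        # Child: fork returned 0. Exit immediately with a recognizable code.
        os._exit(3)
    # Parent: fork returned the child's PID. Block until the child exits.
    _, status = os.waitpid(pid, 0)
    return pid, os.WEXITSTATUS(status)
```

The child uses os._exit so that it terminates without running any of the parent's cleanup code a second time.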
Can interrupts be preempted?
Depends on both the processor architecture and the kernel architecture. On some processors, it's possible for an interrupt to be interrupted by another higher-priority interrupt. For example, ARM processors have two interrupt levels: IRQ (normal interrupts) and FIQ (fast interrupts) -- FIQ can be triggered in the middle of handling an IRQ. Linux supports nested interrupts: one interrupt can "preempt" another interrupt. Different priority interrupts: maskable vs non, different critical levels, etc. Sometimes to serve an interrupt you may need a resource from finishing a different interrupt (code does not exist, for example -- would need to handle that first).
exec Family Functions
Different variants of exec. Parameters: path - full path name of the file to be executed; file - a file name used to locate the new process image file (searched for along PATH if it contains no slash); arg - the arguments, passed as a list of pointers to null-terminated strings (const char *arg, ...); argv - the same arguments, passed as an array; envp - an array of pointers to the environment variables. Ex: execl, execlp, execle, execv, execvp, execvpe.
Dual Mode of Operation
Distinguishes between the execution of OS code and user code. The goal is protection. Kernel mode: special mode of the processor for executing trusted OS code (execution with full privileges of the HW) and no memory access limits. User mode: limited privileges (only those granted by the kernel) and limits on memory accesses. How do we handle switching from one mode to the other with safe control transfer? A mode bit and CPL. N hierarchical domains labeled 0 - N-1. HW isolates each layer to protect. Each ring supports one mode.
Reliability
Does the system do what it's supposed to do?
xv6 Process States
UNUSED, EMBRYO, RUNNABLE, RUNNING, SLEEPING, ZOMBIE (PCB in OS exists even though program is no longer executing -- the instructions are done but the process isn't released).
Stack
Easier to access because it uses simple push and pop operations. One pointer that increments or decrements. The heap is more chaotic and more time-consuming (used to store dynamic allocation -- allocate(), write(), offset()), exists until it's explicitly freed. Static vars are stored in data and can always be accessed during a process' runtime (never destroyed).
Microkernel
Ex: Mach, seL4. The kernel has low-level functions (like scheduling, virtual memory, IPC, protection) and very few lines of code. Other functionalities (drivers, I/O, apps, process management, file system) run in user mode. All on top of hardware. Bad performance because services can only talk to one another through IPC, and would have to make a lot of syscalls to get between the different services (file system may need a driver, for example). Good reliability / security because of the isolation. Good flexibility because you can replace pieces easily (not a lot of code).
Layered Kernel
Ex: Multics, Gemsos. Old, very secure approach. Groups components that serve similar purposes and gives them different privileges based on what they are. Apps run in user mode, and each group (file system, virtual memory, etc.) is isolated and runs in in-between modes, though process management itself is in kernel mode, all on top of hardware. Tradeoffs: Bad performance, because the groups are so isolated and would need to communicate through multiple different layers. Best security because apps are not trusted and each layer is distinct. Medium flexibility because syscalls exist for different parts. Hard to draw a hierarchy of the layers because they're all different.
Monolithic Kernel
Ex: Windows, Linux, Mac OS. All OS services (file system, virtual memory, I/O management, process management) run in kernel mode and apps run in user mode, all on top of hardware. Tradeoffs: Really good performance (#1 goal) because there's no communication between different layers or anything. Bad security, because there's no isolation in the kernel. Bad flexibility because have to recompile a ton of code and you may need to make changes in a bunch of different places.
Virtual Machine
Ex: Xen, VMWare. Organizes multiple groups of user modes, each with their own guest OS and apps. Those user modes run on top of a hypervisor, which is in kernel mode. The idea is to abstract the HW of a single computer into several different execution environments. This indirection layer creates a virtual system on which OSs or apps can run. Components: host (underlying HW or OS), Virtual Machine Manager (VMM) or hypervisor, and guest (software running within a VM -- guest OSs and guest apps, which all run in user mode, including the OS). Can be used to support legacy apps, and is helpful for cloud providers who want flexibility.
Interrupt Handling Steps
Executing a user process: the lower 2 bits of the CS (code segment) register store the CPL (current privilege level), where 0 = most privileged (kernel mode). An interrupt then occurs -- HW enters kernel mode and disables interrupts, temporary copies are made of the stack pointer, program counter (IP), and flags register, and using the task state segment (found through the task register, tr), the HW looks up the kernel stack location and points esp to that location. The HW then updates the stack and basic registers: it pushes the user process's $esp, $eip, and $eflags registers onto the stack and loads $eip with the handler's starting address by looking it up in the interrupt vector table. Then the handler saves the remaining state by pushing the other registers (eax, ebx, etc.) onto the stack and can now execute in kernel mode. When the handler is done, it restores the saved registers (eax, ebx, etc.) and executes IRET, which pops $eflags, $eip, and $esp back into the registers, and the mode is restored appropriately.
User Stub
Fills in code for the syscall, pushes arguments and the syscall number onto the stack, and executes the trap instruction.
How does the processor know what code to run?
For protection, code is only executed in kernel mode at specific well-defined locations / entry points. Each interrupt is identified by a number indicating its type (in x86, 14 is a page fault, 3 is a debug breakpoint, etc.). The number is the index into a processor-specific Interrupt Descriptor Table (a.k.a. Interrupt Vector) that holds the start address of the interrupt handlers or Interrupt Service Routines (ISR). The interrupt vector is stored in the kernel rather than user memory for protection.
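A toy sketch of the number-as-index lookup. The table here is an ordinary Python list and the handler names are made up; a real IDT holds gate descriptors, not functions.

```python
NUM_VECTORS = 256  # x86 has 256 interrupt vectors


def default_handler(vector):
    return f"unhandled vector {vector}"


# The table lives in "kernel" data; user code only supplies the number.
idt = [default_handler] * NUM_VECTORS


def set_gate(vector, handler):
    """Install a handler (ISR) at a fixed, well-defined entry point."""
    idt[vector] = handler


def dispatch(vector):
    """Index the table with the interrupt number and run the handler."""
    return idt[vector](vector)


set_gate(3, lambda v: "breakpoint")
set_gate(14, lambda v: "page fault")
```

Because the only way in is through an index into this table, user code cannot jump to an arbitrary kernel address.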
fork-exec Model
Fork() creates a copy of the current process, exec*() replaces the current process code and address space with the code for a different program. The return values of fork and exec* should be checked for errors. Fork causes the kernel to: create and initialize the PCB entry for a new process, create a new address space, initialize the new address space with a complete copy of the parent's address space, inherit the execution context of the parent (open files, etc.), inform the scheduler that the new process is ready to run. Exec causes the kernel to: load a new program image into the current address space, copy arguments into the address space, initialize the hardware context (PC + stack) to the "start" entry point of the program image, and exec does NOT create a new process.
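A sketch of the fork, exec, wait sequence, assuming a POSIX system. The function name and the 127 exit code for a failed exec are illustrative choices (127 follows a common shell convention).

```python
import os
import sys


def spawn_and_wait(argv):
    """Run argv in a child process and return its exit code."""
    pid = os.fork()
    if pid == 0:
        # Child: replace this process image with the new program.
        try:
            os.execvp(argv[0], argv)
        except OSError:
            # exec only comes back on failure (in Python, as an exception).
            os._exit(127)
    # Parent: block until the child exits, then read its status.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)


if __name__ == "__main__":
    # Run another Python interpreter that exits with code 7.
    print(spawn_and_wait([sys.executable, "-c", "raise SystemExit(7)"]))
```

Note that exec does not create a process: the PID returned by fork is the same PID the new program runs under.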
How do we run monolithic OS on different architectures?
General features can make an OS portable and extensible. Hardware Abstraction Layer (HAL) and dynamically loaded device drivers. Both run in kernel mode in x86.
Indirection
General principle for managing complexity through abstraction. Problems can arise if you have too many layers of abstraction, may have to duplicate functionality.
When you need OS
Getting input from the keyboard, allocating memory for an array, loading a program into memory, sometimes for calling a library function, exiting the current program. When you don't: high-level matrix multiplication, changing the layout of a document (text editor handles this itself), calling a function in the current program, sleeping for n seconds (threads can be implemented at the user level) -- but this would need the OS if the CPU was idling.
Dynamically Installed Device Drivers
Goal: accommodate a wide variety of physical I/O devices. Approach: Decouple the OS source code from the specifics of each device, support dynamically loadable device drivers (not compiled directly into the kernel but loaded as a shared library when needed), and the device manufacturer typically provides the driver code, which supports a standard interface supported by the kernel. Tradeoffs: good flexibility, bad reliability / security (drivers could be buggy, installed drivers become a part of the kernel, but can use signatures to ID different parts of drivers). Device drivers can cause lots of problems. Best approach: make device drivers at the user level with only a little of it running at the kernel level.
CPU Fetch-Execute Cycle
IP = instruction pointer, IH = interrupt handler, IRET = return to the interrupted process. You cannot stop in the middle of an instruction. Steps: fetch the instruction at the instruction pointer, decode the fetched instruction, execute the decoded instruction, and advance the IP to the next instruction -- at this point, check if there were any interrupts. If there weren't, go through the loop again. If there was an interrupt, save the context, get the interrupt ID, lookup the IH, execute the IH, then IRET and loop around.
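A toy simulation of the loop above. Instructions and handlers are just names, and the `pending` and `handlers` dictionaries are hypothetical stand-ins for the interrupt lines and the handler table.

```python
def run_cpu(program, pending, handlers):
    """Run `program` (a list of instruction names) and return the trace.

    pending maps an instruction index to the interrupt ID that arrives while
    that instruction executes; handlers maps interrupt IDs to handler names.
    """
    trace = []
    ip = 0
    while ip < len(program):
        instr = program[ip]              # fetch
        trace.append(f"exec {instr}")    # decode + execute (never stopped midway)
        executed = ip
        ip += 1                          # advance IP to the next instruction
        if executed in pending:          # check for interrupts between instructions
            saved_ip = ip                # save the context
            irq = pending[executed]      # get the interrupt ID
            trace.append(f"handler {handlers[irq]}")  # look up and execute the IH
            ip = saved_ip                # IRET: resume the interrupted program
    return trace
```

An interrupt that arrives during instruction b is only noticed after b finishes, so the handler runs between b and c.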
Current Privilege Level (CPL)
In x86, this indicates the current mode. A number in the range from 0 (the kernel mode (ring 0)) to 3 (the least privileged, in ring 3). Part of the %cs register, which specifies the code segment address. Four levels of privilege can be interpreted as rings of protection, where ring 0 can do anything, and then further rings' permissions are determined by their policies, where ring 3 can't do anything requiring privileged access. Usually apps are ring 3 and OS services are rings 1 and 2.
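A sketch of reading the CPL out of a %cs value. The two selector constants assume a layout like xv6's (kernel code segment at GDT index 1, user code segment at index 3) and are only illustrative.

```python
def cpl(cs: int) -> int:
    """The CPL is the low 2 bits of the code segment selector."""
    return cs & 0b11


# Illustrative selectors: (GDT index << 3) | privilege level.
KERNEL_CS = (1 << 3) | 0   # 0x08, ring 0
USER_CS = (3 << 3) | 3     # 0x1B, ring 3
```

With these values cpl(KERNEL_CS) is 0 and cpl(USER_CS) is 3.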
Privileged Instructions
Instructions that are only available in kernel mode. Ex: change which memory locations a user program can access, send commands to I/O devices, read data from / write data to I/O devices, disable interrupts, jump into kernel code. If a user program attempts to execute a privileged instruction, a processor exception transfers control to the kernel with a mode switch, whereas a programming exception is handled by user code.
Hardware Abstraction Layer (HAL)
Interface to machine configuration and processor-specific operations. Provides functions and abstract models of devices. To port the OS to new HW, create new implementations of the low-level HAL routines.
How do we take interrupts safely?
Interrupt vector has limited number of entry points into the kernel. Atomic transfer of control from user to kernel mode -- single instruction to change program counter, stack pointer, memory protection, and kernel / user mode. If you don't save that, you have to restart. Transparent restartable execution: restore state and return to application (or kill it if recovery is impossible).
Performance
Latency / response time: how long for one operation to complete (like, the mouse moving). Throughput: operations per unit of time. Overhead: how much extra work is done by the OS? Fairness: how equal is the performance for different users? Predictability: How consistent is the performance over time?
General libraries
Like a wrapper around the system call libraries.
Function Invocation
Local vars are stored on the stack. When you call another function, the caller's local vars stay on the stack and the vars of the called function are pushed on next.
Kernel Stub
Locates arguments (in registers or on user stack), copies arguments (from user memory into kernel memory -- protects the kernel from malicious code evading checks), validates arguments (protects kernel from errors in user code), and copies results back into user memory.
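A toy sketch of the copy-then-validate steps, with user memory modeled as a bytearray. All names here are hypothetical; this is not how any particular kernel spells it.

```python
def copy_from_user(user_mem: bytearray, addr: int, length: int) -> bytes:
    """Validate a user pointer, then return a private kernel copy of the data."""
    if addr < 0 or length < 0 or addr + length > len(user_mem):
        raise ValueError("bad user pointer")
    # bytes(...) makes an immutable copy, so later changes to user memory
    # cannot alter what the kernel already checked.
    return bytes(user_mem[addr:addr + length])


def sys_write_stub(user_mem: bytearray, buf_addr: int, length: int,
                   out: bytearray) -> int:
    """Kernel-side stub: copy in, validate, do the work, return a result."""
    try:
        data = copy_from_user(user_mem, buf_addr, length)
    except ValueError:
        return -1  # report the error to user code instead of crashing
    out.extend(data)
    return len(data)
```

Copying before checking is the point: if the kernel validated the data in place, the user program could change it after the check.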
xv6
MIT educational reimplementation of Unix v6. Runs on a multicore processor. Supports paging, processes, files. Runs on x86 (QEMU - Quick Emulator, a generic open-source machine emulator). Monolithic (everything is at the privileged level), preemptively multitasked, multi-processor-capable, 32-bit, UNIX-like OS. Maximum addressable memory: 2 GB. Few supported devices. No support for kernel-level multithreading. Limited set of user-level programs. No compiler / debugger / editor. Files described on slide deck 1, slide 30.
Hypervisor / VMM
Manages the resources of the underlying hardware and provides an abstraction of one or more VMs. Runs in the most privileged level (kernel mode).
x86 Interrupt
Mask interrupts. Save the current stack pointer, program counter, processor status (execution flags) in registers. Switch to the kernel stack and put the SP, PC, PS on the stack. Switch to kernel mode. Vector through interrupt table. Interrupt handler saves registers it might change.
Address Space
Memory space is divided into two distinct regions. Kernel space is where the kernel (the core of the OS) runs and provides its services -- can access all the memory and HW. User space is where user processes run (everything other than the kernel) -- cannot access kernel space directly (some kernel space can be accessed via syscalls). This protects processes from one another and protects the kernel from processes.
Process Management
Most OSs allow one user process to create another process through system calls, rather than the kernel initiating all process creations. Windows approach: perform process creation, environment setup, and program startup through a single system call, CreateProcess(). Unix approach: process creation, environment setup, and program startup have separate system calls: create a process with fork(), set up the environment with possible calls to open(), close(), and dup2(), and load the program image and start execution with the exec() syscall.
Hypervisor Types
Native, Hosted, or a combination.
Process States
New -(admitted)-> Ready -(schedule dispatch)-> Running -(I/O or event wait)-> Waiting -(I/O or event completion)->Ready. Also, Running can go to Ready after an interrupt. Then, Running can also go to Finished after the process is done.
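The transitions above can be sketched as a lookup table. The event names are shortened labels chosen for illustration.

```python
# (current state, event) -> next state, following the transitions listed above.
TRANSITIONS = {
    ("new", "admitted"): "ready",
    ("ready", "dispatch"): "running",
    ("running", "interrupt"): "ready",
    ("running", "wait"): "waiting",
    ("waiting", "completion"): "ready",
    ("running", "exit"): "finished",
}


def step(state: str, event: str) -> str:
    """Return the next state, or raise if the transition is not allowed."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"illegal transition: {state} on {event}") from None
```

Note that there is no direct edge from waiting to running: a process whose I/O completes goes back to ready and must be dispatched again.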
Process Control Block (PCB)
OS maintains information about every process in this data structure. Contains: unique process identifier, process state (running, ready, etc.), CPU state (program counter, registers, flags), memory management information (page tables, segment tables, base and bound registers), CPU scheduling and accounting information, the parent process, child processes, etc. Each process has one entry in the table. The OS maintains the PCB table, which has one entry for each process (the process ID and a pointer to that process' PCB) -- can be a hashmap, array, or linked list, though you would have to search for a free space in an array whenever a new process started. In xv6, this is maintained in the proc.h file. Can only access PCB information through syscalls.
What stack is used for syscalls?
On a syscall, the processor switches to the kernel stack, jumps to kernel mode, and starts executing the kernel instructions that implement the syscall. User stack holds info for a few functions, at first the kernel stack is empty. Then the process is ready to run and the kernel stack has the user CPU state. Then when waiting for I/O, the kernel stack has the user CPU state, syscall handler, and I/O driver top half which contains the context to be resumed when I/O completes. The user stack contains the context to be resumed when the syscall returns. Process can't access its own pid in the pcb -- it needs to make a syscall to do so.
Interrupt vs. Trap Handling
On an interrupt: HW interrupts CPU execution, CPU traps to interrupt vector, and CPU switches to kernel mode and jumps to interrupt handler in OS. On trap: CPU traps to interrupt vector, CPU switches to kernel mode and executes OS instructions (syscall handler).
Mechanisms
Operate on abstractions. Ex: create, open, write, allocate, context switch, etc.
Container
Packages code and dependencies together. Lightweight VM that can choose to reuse some underlying OS stuff and discard / ignore other parts -- isolates apps. Container images / dockers become containers at runtime. More portable and efficient. NOT TESTED.
Physical vs. Virtual Machine
Physical: resources usually underused, which cloud providers don't like. VM: Extra level of indirection with the hypervisor, enforcing isolation and multiplexing the physical HW across VMs.
State
Process Control Block (PCB). Contains registers, the program counter, list of open files, etc. This is held in kernel address space.
Process Abstraction
Process is an instance of a program, running with limited rights. A process is NOT a program. A program is static, while a process is dynamic. One program can be several processes. Processes consist of two parts: address space and state. OS address space / user address space + PCB block. The OS can be represented as processes.
OS Challenges
Reliability, availability, security, performance, portability, and adoption.
Process Creation Pictorially
Root is the parent of all other OS processes. Init is a child of root, responsible for all user processes (under init are the user nodes, user1, user2, etc., which each have their own processes). Under root, in addition to init, is also OS1, OS2, etc. If a process is destroyed, Unix makes init the new parent of deleted process's children.
Native Hypervisor (Type 1)
Runs on bare HW. Doesn't require any base OS. More secure because it won't be running on top of a potentially vulnerable OS. Less code = less potential for exploits. Very stable. Bad flexibility. Good performance because there's less communication. Used for cloud. Isolation. HW provides isolation for the hypervisor. Ex: VMWare vSphere, Microsoft Hyper-V, Citrix Hypervisor, KVM.
Hosted Hypervisor (Type 2)
Runs on top of another OS. More flexible than native, since it runs on different OSs with HALs. Bad performance (because lots of communication) and bad security. SW provides isolation, but host OS is huge, so it's vulnerable. The hypervisor isn't isolated from the HW / guest OS, so it's easier to exploit it. Ex: VMWare Workstation, VMWare Player, VirtualBox.
Traps
SW Interrupts. Synchronous. Run interrupt handler code. Can be generated with a special assembly instruction (like a system call). Can be generated by the CPU when there is a math error. Can be generated by the Memory Management Unit (MMU) if a page fault or seg violation occurs, for example when code dereferences a register such as EAX that holds an unmapped virtual address.
Unikernel
SW directly integrated with the kernel it's running on. Single-purpose, one process running. Code compiled along with only the required system calls and drivers into one executable program using only a single address space. Typically run in kernel mode -- rely on hypervisor for isolation. Would need to install two kernels for two apps. Stripped down code -- apps run in kernel mode, so not as secure. A bug can corrupt anything. NOT TESTED.
Context Switch Process
Save the current process state, load the state of the next process, and continue execution of the next process. Need to save current process registers without changing them, which isn't easy -- saving state needs to execute code, which will modify the registers. The solution: use hardware + software -- architecture dependent. Process: Process P1 is happening, an interrupt comes (could be a timer), save P1's state in PCB1 and load P2's state from PCB2, then P2 runs, interrupt comes, etc.
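A toy sketch of the save / load steps, with the CPU registers modeled as a dictionary. The PCB fields and register names are simplified assumptions; a real switch is done with architecture-specific assembly.

```python
from dataclasses import dataclass, field


@dataclass
class PCB:
    pid: int
    state: str = "ready"
    registers: dict = field(default_factory=dict)  # saved CPU state


def context_switch(cpu_regs: dict, old: PCB, new: PCB) -> None:
    """Save the running process's registers, then load the next process's."""
    old.registers = dict(cpu_regs)   # save P1's state in PCB1
    old.state = "ready"
    cpu_regs.clear()
    cpu_regs.update(new.registers)   # load P2's state from PCB2
    new.state = "running"
```

Switching back later restores exactly the values that were saved, which is what lets a process be paused and resumed.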
OS Design Principles
Separate policy from mechanism, since policies can change. Optimize for the common case, but make the corner cases work too (tradeoffs): look at what happens most often and optimize it. For example, deadlocks don't happen often, but they do happen. Indirection -- general principle for managing complexity through abstraction. Need to keep security in mind.
Portability and Adoption
Should be flexible across different hardware.
Operating System
Software, not hardware. Consists of: protection, scheduling, and resource management (memory, disk drive, network device, display device). We interact with stuff through applications. Complex mess -- Windows has 14 million lines of code. We can understand it through abstractions, design principles, and tradeoffs.
Handling Interrupts
Steps: save the current processor state, load the state for interrupt handling, invoke the corresponding Interrupt Service Routine, and then resume the program execution.
Event Driven
The OS is event driven. An event causes a break in normal execution flow. For example, error conditions / exceptions (illegal instructions, invalid memory address, divide by zero), HW interrupts raised by devices to get OS attention (handling a keyboard press, mouse moving, USB data transfer, etc), and system calls (issued by user processes to request system services). Events automatically cause kernel mode to be entered. When events occur while we're in user mode, we want to automatically call some subroutine ("handler") to handle the issue in kernel mode, then resume normal programming in user mode.
Hosted Hypervisor Communication
The app makes a syscall to the host OS, which is in kernel mode. The host OS then invokes the hypervisor (VMM), which goes back to the host OS -> guest OS -> host OS -> hypervisor -> host OS -> app. 4 mode switches generally, with 8 in total.
Executable and Linkable Format (ELF)
The executable image file loaded by exec must be in a particular format. Specifies which part of the file holds instructions, which part is data, at which instruction to start, etc. xv6 uses this format, a common standard file format for executable files, object code, shared libraries, and core dumps. ELF files not only contain the binary code and static data the code operates on, but also metadata the runtime or linker needs in order to load the file into memory and/or resolve any dependencies on other ELF files such as libraries. Contains ELF header, program headers, code, data, and section headers. [WILL NOT BE TESTED].
Disabling Interrupts
The interrupt handler runs with interrupts off, and interrupts are re-enabled when the handler completes; interrupts that arrive in the meantime are buffered (deferred). The IF interrupt flag is a system flag bit in the x86 flags register that determines whether or not the CPU will handle maskable HW interrupts. The OS kernel can also turn interrupts off (for example, when determining the next process / thread to run -- scheduling decision). On Intel x86, CLI disables interrupts (they are only deferred) and STI enables interrupts -- this only applies to the current CPU on a multicore. Disabling interrupts doesn't provide mutual exclusion on multiprocessors, because it only halts context switches on one CPU and doesn't stop other CPUs from entering the critical section.
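A toy model of the IF bit and deferred delivery on a single CPU. FL_IF = 0x200 (bit 9 of the flags register) is the real x86 bit position; the class and its methods are made up for illustration.

```python
FL_IF = 0x200  # interrupt-enable flag, bit 9 of the x86 flags register


class ToyCpu:
    def __init__(self):
        self.eflags = FL_IF   # start with interrupts enabled
        self.pending = []     # interrupts raised while IF is clear
        self.handled = []     # interrupts that have been delivered

    def cli(self):
        """Clear IF: maskable interrupts are deferred, not lost."""
        self.eflags &= ~FL_IF

    def sti(self):
        """Set IF and deliver anything that was deferred."""
        self.eflags |= FL_IF
        self._deliver()

    def raise_irq(self, irq):
        self.pending.append(irq)
        self._deliver()

    def _deliver(self):
        while self.pending and self.eflags & FL_IF:
            self.handled.append(self.pending.pop(0))
```

Each ToyCpu has its own flag, which mirrors why CLI on one core does nothing to stop another core.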
Context Switch in xv6
The kernel (scheduler) is not a separate process, but rather runs as part of a user process. Steps: save P1 user-mode CPU context and switch from user to kernel mode, save P1 kernel CPU context and switch to scheduler context, scheduler selects another process P2, switch to P2 address space, save scheduler CPU context and switch to P2 kernel context, then switch from kernel to user mode and load P2 user-mode context. Unrestricted kernel-mode code switches the CPU into restricted user mode; the CPU goes back to kernel mode only through an interrupt / trap.
Protection
The kernel cannot trust user processes because they may be malicious or buggy. The kernel has to protect user processes from one another and the kernel from user processes. HW mechanisms for this: memory protection (segmentation and paging, where the kernel manages a segment / page table), timer interrupt (kernel periodically gets back control), and dual mode of operation (privileged and non-privileged operations in kernel mode, only non-privileged operations in user mode).
User mode
The mode in which applications typically run.
Context Switch
The process of switching the CPU from one process to another, performed by the OS. The CPU has a state, which is the current contents of all its registers; each process has a copy of the processor state, which allows processes to be paused and resumed at a later time. Want to minimize these because they involve a lot of overhead -- it's expensive and takes time (loading states is a waste of CPU time). State is only saved during a context switch.
xv6 syscall parameters
These are pushed onto the user stack. xv6 has its own built-in functions for fetching the arguments into a kernel function (syscall.c). For example, argint() is used to retrieve an integer argument.
Horizontal Approach to OS Development
This is looking at the OS design principles as being dependent on a scale of environment. Compares specific techniques (like those for file systems) across different environments (flash memory, disk, etc.).
Vertical Approach to OS Development
This is looking at the OS design principles as being dependent on a scale of history. Looks at the history of design structure based on time.
System Call Motivation
This mechanism provides a safe and controlled method to request specific kernel operations. The OS will define all available syscalls, each typically identified by a number. Syscalls are exceptions that always cause a transition to kernel mode. They are the interface between a user app and a service provided by the OS (the API of the OS from the user program's pov), accessible via a library of code (like glibc). A pair of stubs (user stub and kernel stub) mediates between the user-level caller and the kernel's implementation of system calls (different privilege levels in the ring layout of the syscall layer) -- a trap separates the kernel stub from the user stub (user space starts here).
Event Notification
Two methods. Polling "busy" loop (responsibility on the processor): CPU periodically checks each device to see if it needs service (waiting for a key press by continuously reading a status flag bit in the interface register). Interrupts (responsibility on I/O device): I/O device notifies processor only when it needs attention (the keyboard controller "interrupts" the processor when it's done). An interrupt can occur at any time. For polling, CPU services the device and polls at regular intervals. Polling becomes inefficient when CPU rarely finds a device ready for service. Interrupts become inefficient when devices keep on interrupting the CPU repeatedly.
OS kernel
Typically provides abstractions for: memory, directories and files, protection, users, IPC, network, and time. These are accessible through system calls (open a file, get I/O input, etc.). Accesses resources safely and effectively. Handles communication and priorities.
Mode Bit
Used in dual-mode operation in HW to indicate the current mode. Stored in the processor status register. Not directly accessible to user code. HW automatically saves the status register to memory on interrupt.
Abstractions
Used to hide implementation details. Examples: process, thread, file, window (an abstract screen), memory page, socket.
Kernel Space Stack
Used when executing kernel code (like during syscalls). In xv6, p->kstack member of proc points to the kernel stack.
User Space Stack
Used when executing user code.
Layers of Abstraction
User interacts with apps and system software. These can interact with the hardware and call non-privileged instructions, or they can use general library calls to interact with general libraries which can then call non-privileged HW instructions, or they can use system calls, which general libraries can also use, to call system call libraries which can then call non-privileged HW instructions. Or system call libraries and apps / SW can use supervisor calls to access the OS kernel itself. The OS kernel, and every other category, can then use machine instructions to call privileged or non-privileged CPU instructions. Privileged instructions can access information from devices.
Syscall Implementation
User process -> user stub (syscall invoker) -> hardware trap to kernel stub (syscall handler) -> syscall implementation -> kernel stub -> trap return back to user stub in user mode -> user process with return info.
Wait
Wait for a process to finish. Blocks the caller (parent process) until one of its child processes terminates. If the parent doesn't have any child processes, wait() returns -1 immediately without blocking the parent. Parent does not wait if it forks a background process. Using wait, the parent can obtain the exit status of the terminated child (wait takes an int* child_status and returns a pid_t). On success, the return value is the PID of the child process that terminated, and the child is reaped. If child_status is not null, it will indicate why the child terminated. If the parent process has multiple children, wait returns when any one of the children terminates -- to wait on a specific child use waitpid(), which is parameterized with a child PID.
Availability
What portion of time is the system working? Does not imply reliability.