Operating Systems - Chapter 2

ELF

** An executable built on one system is not likely to be executable on another system. ** Each operating system has a binary format for applications that dictates the layout of the header, instructions, and variables. The standard format for executable and relocatable files in Linux is ELF, short for Executable and Linkable Format. There are two different formats, one for files that are executable and one for files that can be linked into executables such as library files. Windows systems use the Portable Executable (PE) format, and macOS uses the Mach-O format. Applications are built against APIs, which specify how the source code should be written. But applications built on a given machine make system calls to that machine's kernel, and contain machine code output by the compiler for the given architecture. The ELF does not specify that the code must be for a specific computer architecture. This means that the compiler and linker can produce an executable that complies with the ELF but not that the code can run on different hardware platforms. Portability is the extent to which an application is able to run on multiple platforms.

System Call Parameter Passing

There are three general methods for passing the arguments of a system call to the kernel.
1. Register Method: The simplest method is to put the arguments into known registers in a specific order. This does not work when the number of parameters exceeds the number of available registers, or when a parameter is longer than a register.
2. Block Method: The parameters are stored in a block of consecutive bytes in memory, and the address of the block is passed in a register.
3. Stack Method: The parameters are pushed onto the stack by the program and popped off the stack by the kernel. The problem with this is that there are usually two separate stacks - one for the user program and one for the kernel - and moving parameters from one to the other is slow.
Neither the block method nor the stack method limits the number of parameters.
** Linux uses a combination of the register method and the block method. If the parameters fit into the registers, it uses them; otherwise the block method is used. In the current version, Linux does not allow more than six parameters to a system call. **

Categories of (most) System Calls

- Process control (creating and terminating processes, loading and executing processes, getting and setting process attributes, waiting for events, signaling events, registering signal actions, setting signal handling properties, allocating and freeing memory)
- File management (creating and deleting files and directories, linking and renaming files, opening and closing files, duplication of file descriptors, reading and writing files, repositioning read/write pointers, getting and setting file attributes)
- Device management (requesting and releasing devices, reading and writing devices, getting and setting device attributes, logically attaching and detaching devices)
- Information maintenance/management (getting and setting time or date, getting and setting system timers, getting and setting system data such as host information, network information, etc., getting accounting and performance information such as CPU usage and memory usage)
- User and group management (creating and removing users, creating, modifying, and deleting groups, getting and setting user information (ids, privilege levels, etc.), changing passwords)
- Communication and synchronization (creating and deleting communication connections between processes, sending and receiving messages, setting up inter-process communication, setting up inter-process synchronization resources, transferring status information, attaching and detaching remote devices)
- Protection (protection of files and directories, protection of memory, protection of devices and other resources)

Services Provided by the Operating System

- Program execution
- I/O operations
- File systems
- Communication
- Error detection and recovery
- Protection and security
- Accounting
- Managing Resources (sometimes?)
Users obtain services either directly through a user interface, such as by typing a command in a command-line interpreter, or indirectly, by running an application that issues a request. In the first case, the command is executed by a system program. In the second case, it might be either a system program or a system call that provides the service.

Source-to-Executable steps:

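- Source files are compiled by the compiler into relocatable object files.
- Object files (and any statically linked libraries) are combined by the linker into an executable file. If the program uses dynamically linked libraries, this file is not yet fully resolved.
- The executable file (and any dynamically linked libraries) is loaded into memory by the loader.
- The loader produces the runnable program image in memory.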

Solutions to Creating Portable Executables

1. Compile and link according to an ABI. This is not done very much.
2. Write the code in an interpreted language such as Python. Such code typically runs more slowly than compiled code.
3. Write code in a language that uses a virtual machine, such as the Java Runtime Environment. Again, these programs are slower.
4. Port the application manually to each operating system on which it will run. This is time consuming and must be redone for each new version of the application.

System Booting Steps:

1. When a machine is powered on, execution starts at a small boot loader at a fixed location in nonvolatile firmware, either the BIOS in older systems, or UEFI (Unified Extensible Firmware Interface) in newer systems.
2. The boot loader usually runs diagnostic tests such as inspecting memory and the CPU and discovering devices. If all tests pass, it loads a second boot loader, which is located at a fixed disk location called the boot block.
3. The program stored in the boot block loads the kernel into memory. After the kernel is loaded, the bootstrap program transfers control to it.
- What happens next is kernel-dependent. Most systems will load drivers, start services, and eventually display a user interface to allow users to log in.
- The Linux kernel loads necessary drivers and switches the root file system from a temporary RAM location to the actual root file system location.
- It then creates the initial process in the system, systemd, which starts servers such as web servers, ssh servers, print services, and so on.
- Eventually, it displays a login prompt.

Loadable Kernel Modules

An alternative that offers a compromise between microkernels and monolithic systems. The idea is to keep the kernel small, but not as small as a microkernel. The kernel provides core services, and the other services are loadable dynamically as needed, while the kernel is running. The other services are provided in kernel modules, which are executable files that can be loaded and linked into the kernel at run-time. Linux uses loadable kernel modules for a variety of services, ranging from device drivers to file systems to cryptographic services. They can be inserted into and removed from the kernel from the command line.

Application Programming Interface (API)

A definition or collection of definitions that specify how a program is to obtain the services provided by some other piece of software. In the context of operating systems and the kernel, an API is the set of specifications of functions for obtaining the services of the kernel. A programming language needs to provide a means for programs written in that language to obtain kernel services. Therefore, programming language libraries usually provide wrapper functions in that language for system calls. For example, the GNU C library, glibc, contains wrapper functions for almost all of the system calls in GNU/Linux. ** This is not to say that every function in that library is a wrapper function. It has other functions too. And it might have functions that make several system calls. But it is true that almost every system call in Linux has a wrapper function in glibc. **
The three most common APIs are the
- WinAPI for Windows (Win64 for 64-bit Windows),
- POSIX API for POSIX-based systems, which includes virtually all versions of UNIX, Linux, and macOS, and
- Java API for the Java virtual machine (JVM).

Software Library (Program Library)

A file containing compiled code and possibly data that can be used by other programs. **Most programs are built from multiple source code files and also contain code that references one or more libraries. ** ** A library is not a stand-alone executable - you can not "run" a library. It typically contains functions, type definitions, and constants that other programs can use. **

Wrapper Function

A function whose only purpose is to make a call to another function. It might perform some small amount of work before calling that function and then a bit more work after the call. - In the context of system calls, a wrapper function is a function whose only purpose is to make the system call. - The wrapper function is part of a user level library and runs in user mode, whereas the actual system call is inside the kernel and runs in kernel mode.

Mechanism

A process, technique, or system for achieving a result. Policies state what things must be done, whereas mechanisms describe how to do them. In any software engineering endeavor, separation of policy from mechanism is a very important principle. In operating system design, this principle plays an important role. It is best if mechanisms are flexible enough to work across a range of policies. If so... - When a policy changes, the mechanism may only need adjustments. - Mechanisms can be changed without needing to change policies. ** Microsoft did not make the separation in Windows: mechanism and policy are tied together in the system to enforce their global "look and feel". In contrast, Linux is an open source operating system; anyone can make changes to how various features are implemented, i.e., change mechanisms, and can propose changes to what parts of the system should do. **

Loader

A program that loads a binary executable file into memory. The loader places the executable in memory and assigns final addresses to all program code and data so that it can run correctly in the location at which it was loaded. This is possible because its code is relocatable code. ** In older systems, the default was that the libraries were combined into the program executable by the linker when it created the executable. This is called static linking. Modern systems usually default to linking libraries at run-time, which is called dynamic linking. **

Policy

A set of rules that, under prescribed conditions, guides and determines present and future decisions. Policies state what things must be done, whereas mechanisms describe how to do them.

Command-Line Interpreter

A special type of command line interface-- it is a stand-alone program that can be used to interact with the operating system. ** A command line interface is more general-- application programs such as emacs, MatLab and GNU Octave have command line interfaces, but they are not command line interpreters. ** ** In UNIX, a command line interpreter is called a shell. A shell is not just a command line interpreter, but a full-fledged programming language. Examples of shells are the Bourne-again Shell (bash), the C Shell (csh), and the original Bourne shell (sh). ** ** A command-line interpreter can be run in a text terminal, which is a terminal that transmits data one character at a time, and can only display characters. **

The two copies of a library are...

A statically-linkable and a dynamically-linkable copy. In UNIX, static libraries end in ".a" and dynamic libraries, called shared objects, end in ".so". E.g., libm.a and libm.so. In older systems, the default was that the libraries were combined into the program executable by the linker when it created the executable. This is called static linking. Modern systems, including modern Linux, usually default to linking libraries at run-time, which is called dynamic linking. Some advantages of dynamic linking are faster load time, better use of memory (only one copy of the library is needed!), and the fact that when libraries change, the programs do not need recompilation or rebinding. However, executables whose libraries are all statically linked are more portable - they can be run on any machine with the same architecture - because they do not require the presence in memory of the exact library that they were compiled against. Statically-linked programs are also faster. There is a performance cost in references to shared library routines of about eight machine cycles per reference.

Command-Line Interface (CLI)

A type of user interface in which a user enters a command by entering plain text followed by a newline character. - Usually, there is a prompt of some kind to indicate that the user is supposed to enter the command. - The text must conform to the syntax expected by the interface. - When the newline is entered, the interface tries to run the command.

POSIX

An acronym for the Portable Operating System Interface. POSIX is a family of standards specified by the IEEE Computer Society for maintaining compatibility between operating systems. The IEEE official description of it states that... "POSIX defines a standard operating system interface and environment, including a command interpreter (or "shell"), and common utility programs to support applications portability at the source code level." It was intended originally to define standards for software compatibility with variants of Unix and other operating systems. Because the portability is at the source code level, code that uses only the POSIX API can be compiled and run on any POSIX-compliant system.

The Mach Microkernel

The most well-known example of a microkernel-based operating system today is the Mach kernel, which was developed at Carnegie Mellon University and is used in several operating systems, the most well known being macOS. The Mach kernel is like the earlier microkernels, in that the kernel contains an IPC mechanism, memory management, and a process scheduler, and almost everything else is in user-space programs that interact with the kernel through message-passing. All communication between user modules also takes place using message passing. Other microkernels: Redox, seL4.

Graphical User Interface (GUI)

An alternative user interface to the operating system. It requires a graphical terminal, which can display not just characters, but images. The GUI was invented at Xerox PARC, appearing in the Xerox Alto computer in 1973. Earlier, in 1963, Ivan Sutherland had developed a graphical program called Sketchpad. Apple Computer introduced the first GUI as we know it in the Apple Lisa in 1983 and the Apple Macintosh in 1984.
- A GUI emulates a desktop using a cursor-based window-and-menu system. The contents of the screen are graphical elements such as an active cursor, icons, images, and windows, which can represent files, directories, and applications.
- The cursor is moved by the user with a pointing device such as a mouse or by voice commands. The cursor is used to execute commands, select elements, draw, etc.
- Most systems provide a GUI, such as Microsoft's Windows, Apple's macOS, and GNOME or KDE on Linux.

Structure

By the structure of an operating system, we mean the arrangement and interrelationships among its separate executable modules. The question is not how its source code files are written and organized, but about how the binary files are interrelated. - The system could be in a single file, using a single address space. - It could be in multiple files, designed in such a way that they form a sequence of layers or rings. - It could be designed using very small files, each interacting with each other using message-passing.

The purpose of linking is to...

Combine relocatable object (.o) files into a single binary executable file, replacing all unresolved references by actual addresses. ** This will usually require including the library files that contain the definitions needed by the program, such as the math library file. **
Example:
    gcc -c main.c
    gcc -c utilities.c
    gcc -c mathstuff.c
    gcc -c io.c
    gcc -o myprog main.o utilities.o mathstuff.o io.o -lm
- The last part (-lm) is the instruction to link to the math library.
- We do not need to tell gcc to link to the C standard library, because it will do this by default.
- gcc combines and links all files, resolving the references.

Under the Hood: System Calls

Different operating systems implement system calls in different ways. We describe how they are implemented in Linux, which is roughly how they are implemented in macOS. The Windows implementation is different. The wrapper function for a system call in Linux usually does little more than the following:
- It copies the arguments and the unique system call number to the registers where the kernel expects them.
- It traps to kernel mode, at which point the system call handler in the kernel does some setup and then invokes the system call itself.
- After the kernel returns the CPU to user mode, it puts an error value in a special variable named errno if the system call returned an error number.
Depending on the call, it might do more than this, such as pre-processing the arguments or post-processing results. This shows why the wrapper function exists: these actions could not be performed in the kernel itself. The trap to kernel mode must take place outside the kernel!
Complete flow: system call invoked in main <-> wrapper function of same name in library <-> system call handler (we are now in kernel mode) <-> system call service routine of same name in kernel.

Programming Languages for Operating Systems

Early operating systems were written in assembly language. That changed over time. The first operating system to be written exclusively in a high-level language (ESPOL) was the Burroughs MCP, in 1961. UNIX was the first operating system to be implemented in C. Now, most operating systems are written in a mix of assembly and C and/or C++. The kernel is usually written in assembly and C. (Android and Linux are examples.) The libraries are written in C and/or C++. Some systems programs are also written in scripting languages such as PERL, Python, and shell languages such as bash. The drawback of implementing an operating system in a higher level language is reduced speed, although two facts ameliorate this: - Compilers have gotten very good at optimizing code, and - Modern processors use hyper-threading to reduce the effects of memory delays caused by dependencies in the code.

System Call

A call to code inside the kernel of the operating system, made via a software interrupt or trap. It is the means by which a user process explicitly requests a service from the kernel, for example when a user runs an application that issues a request (although such a request is sometimes served by a system program instead). **Kernel processes do not make system calls. ** Usually, system calls are not invoked directly from a user program. Instead, the program calls a wrapper function that invokes that system call. Technically, it is the wrapper function defined in the API that the program calls to obtain services from the kernel. Suppose that foo() is a system call. The library will have a wrapper with the same name. Your program will call foo(), but this will invoke the wrapper, not the actual system call, which is invoked within the wrapper itself. ** When we say that a program calls some system call, we mean that it calls the corresponding wrapper function in the user-level library. **

User Interfaces

For a user to access any of the operating system's services, some type of user interface to the operating system is required. It can be any of: - Command-Line Interface (CLI) - Graphical User Interface (GUI) - Touch Screen - Batch Monitor

Microkernel

In 1969, Danish/American computer scientist Per Brinch Hansen and his team of developers created the RC 4000 Multiprogramming System using principles that a decade later would be called a microkernel design. The idea was to make the kernel as small as possible, containing only the absolutely essential functionality that needed to run in privileged mode. All other functionality was put into separate programs that ran in user mode, communicating through message-passing.
Benefits of Microkernels:
- It is easier to extend a microkernel-based operating system because much of the extension would be in separate programs rather than inside the kernel.
- It is easier to port the operating system to new architectures because there is less code to modify.
- It is more reliable because the kernel is smaller and easier to debug and maintain.
- It is more secure because the smaller kernel is easier to analyze for security flaws, and satisfies the principle of least privilege.
The earliest microkernel had an inter-process communication system, process scheduling, and memory management. All other services, such as file systems, device drivers, and application interface, were in user-space processes. One example of an early microkernel operating system was MINIX, designed and built by Andrew S. Tanenbaum, a professor of computer science at the Vrije Universiteit Amsterdam in the Netherlands. MINIX 1.0 was a UNIX-like operating system designed to run on the Intel 8088 processor, and was system-call compatible with Seventh Edition Unix.

Layered/Ring Structure

In a layered structure, the operating system is divided into a number of layers, with successively higher layers built upon and dependent upon lower layers. Equivalently, this structure can be viewed as a set of concentric rings, with outer rings built upon and dependent upon inner rings. The innermost ring, called ring 0, is the hardware; the outermost ring is the set of user interfaces. In between are the drivers, kernel, and so on. There are no commercial operating systems that adhere to this model strictly, and there are no commercial operating systems in which each ring is in its own address space. But the idea of rings has been used as a design principle, often only with respect to principles of protection. Multics was such a system, for example. Early UNIX can be viewed as having a ring structure.

Design Principles

- Information hiding: Separating the methods that implement a particular component from the design of that component itself makes system maintenance and modification easier.
- Levels or layers of abstraction: System software can be simplified and verified by organizing the functions as a hierarchy that can make only downward calls and upward returns.
- Virtual machines: A set of related functions can be implemented as a simulation of a machine whose interface is an "instruction set" and whose internal structure and data are hidden.
- Principle of least privilege: In a particular abstraction layer of a computing environment, every entity (process, user, program, etc.) must have only the privilege needed to access the information and resources that are necessary for its legitimate purpose.
- Principle of locality: Processes use small subsets of their address spaces for extended periods.

The readelf -s Command

Is used to display the symbol table of an ELF file (the symbols it defines and references).

The ltrace command

Is used to intercept and record the dynamic library calls made by a program as it runs. It is useful because it helps in determining which library calls are failing. It also reports the signals that the program receives, such as those for segmentation faults and other exceptions. Usage:
    ltrace [options] <command-to-run> <command-args>
For example, ltrace date generates a lot of output, but ltrace -l with a library name limits the trace to calls into that library.

The ldd Command

Is used to see what libraries will be dynamically linked to a program.

The nm -D Command

Is used to see the dynamic symbols (functions, constants, etc.) of an object or an executable, in particular those that are external, meaning referenced in it but not defined in it (nm marks these undefined symbols with U). (You can also use objdump -T, which provides much more output for each symbol.)

Categorizing System Programs

It is difficult to categorize all system programs, but generally speaking, we can lump them into some large groups as follows. Some examples in Linux are given.
- File and directory manipulation: creation, deletion, modification, printing, renaming, copying, checking and changing attributes.
- Status information: obtaining system information such as time, date (date), resource utilization (memusage), process status (ps), device status.
- Software development: compilers (gcc), linkers (ld), assemblers (as), interpreters and shells (bash), debuggers (gdb), tracers, profilers, version control, etc.
- Program loading and execution: absolute loaders, relocatable loaders, dynamic linker/loaders (ld-linux), object dumpers (objdump), system call tracers (strace), library call tracers (ltrace), etc.
- Communication and network services: remote logins (ssh), communicating with servers (curl), remote file transfer (sftp, scp), sending messages to a screen (write). Browsers, email clients, and such are applications, not system software.
- Background services: services that run in the background include ssh daemons (sshd), printer daemons (lpd, cupsd), various other network daemons, and so on. They are started up at boot time and usually remain running until shutdown.

Touchscreen Interface

Keyboards and mice are not feasible input devices for mobile computing devices such as smartphones and tablets, making CLIs and GUIs impractical for them. These devices typically have touchscreens, which are both input and output devices. - Actions and selections are based on gestures, which are not single point clicks, but motions, which are represented by location, velocity, and acceleration data. - Keyboards are on-screen for text entry - Voice commands are supported

Hybrid Operating Systems

Most operating systems are hybrids of two or more models. For example, Linux is a monolithic system but it also uses loadable kernel modules. Windows operating systems are mostly monolithic but provide support for separate subsystems (which they call operating-system personalities) that run as user-mode processes. Some operating systems were created as hybrids, the most notable examples being macOS and iOS. macOS is a complex, partially layered system, built on top of the Darwin kernel. Darwin is the core of macOS, consisting of an open source Mach microkernel and elements of FreeBSD UNIX. Darwin is an open-source, mostly POSIX-compliant, Unix-like operating system.

System Program

One of the two ways that a command can be implemented, in UNIX and UNIX-like systems, is in a separate, executable file-- in which case, it is a system program. ** There are a handful of commands that are implemented in both ways, such as the echo, kill, and pwd commands. ** For example, in Linux systems, the grep and date commands are in the files /bin/grep and /bin/date respectively. System programs are typically programs shipped or downloaded when the operating system is installed. Almost all such programs can only be invoked as commands on the command line in a terminal window. But not every program that can be run from the command line is a system program. For example, we can run the Firefox web browser from the command line by typing "firefox" at the prompt, and it is an application, not a system program.

Built-In Command

One of the two ways that a command can be implemented, in UNIX and UNIX-like systems, is within the shell itself-- in which case, it is a built-in command. ** There are a handful of commands that are implemented in both ways, such as the echo, kill, and pwd commands. ** For example, in bash, the cd command is a built-in command, as is the read command.

Application Binary Interface (ABI)

The architecture equivalent of an API - it defines how the components of the binary code can interface with a given kernel and CPU architecture.

Portability

The extent to which an application is able to run on multiple platforms.

The read System Call in UNIX

The read() system call in UNIX can be used to read from anything that can be mapped to a file, which is to say all devices and files. We use it to demonstrate how you can learn about the Linux API from the manual pages. Type "man 2 read" in a terminal to see the manual page for it (section 2 holds system calls; plain "man read" may show a shell command instead). The page provides its API and also shows that the unistd.h header file must be included to use it:
    #include <unistd.h>
    ssize_t read(int fd, void *buf, size_t count);
- int fd : an integer file descriptor that identifies the file from which to read
- void *buf : the address of the first byte of memory into which to store the data
- size_t count : the maximum number of bytes to be read into the buffer
- It returns a ssize_t result: the number of bytes actually read, which may be fewer than count; 0 at end of file; or -1 on error, with errno set. ssize_t is a signed integer type.

Monolithic Structure

The simplest structure of an operating system is a monolithic one: - All of the functionality of the kernel is in a single, static binary file that runs in a single address space. - Early versions of UNIX were monolithic, as were early versions of Linux. The three levels in the kernel correspond roughly to what calls what: calls go from higher components to lower ones.

System Call Dispatch Table

When a trap occurs as a result of a system call, the mode bit is switched to kernel mode and a system call handler runs (this was said in chapter 1). This handler handles all system calls. The system call handler knows which system call to invoke because each system call has a number associated with it, which is read from a register. (On 32-bit x86 Linux it is in the eax register; on x86-64 it is in rax.) The kernel maintains a system call dispatch table indexed according to these numbers. The table contains the address of the start of each system call in the kernel. The system-call handler invokes the actual kernel function, which runs in kernel mode, and when it is finished, it returns control back to the handler, which returns the exit status of the system call and any return values.

The objdump Command

objdump -d is used to look at the assembly code in an object file (it disassembles the machine code).
objdump -T displays the dynamic symbol table, which lists the external symbols (functions, constants, etc.) used by an object or executable.
Alternatively, you can use the readelf -s command, which displays the symbol table of an ELF file.

