CSIS 342 - Computer Architecture and Organization

Floating-point numbers come in two principal formats: single-precision ______ values, corresponding to C data type float, and double-precision ______ values, corresponding to C data type double.

4 byte, 8 byte

When an assembler generates an object module, it has already determined where the code and data will ultimately be stored in memory.

False

When programming in a high-level language such as C, we are developing in a detailed machine-level implementation of our program.

False

Random access memory (RAM) comes in two varieties— _____ and ______. ______ RAM is used for cache memories, both on and off the CPU chip. ______ RAM is used for the main memory plus the frame buffer of a graphics system.

Static, Dynamic, Static, Dynamic

Random access memory (RAM) comes in two varieties: _______ and _______. Typically, a desktop system will have no more than a few tens of megabytes of _______ RAM, but hundreds or thousands of megabytes of _______ RAM.

Static, Dynamic, Static, Dynamic

In the memory hierarchy if the data your program needs are stored in a CPU register, then they can be accessed in:

0 cycles

Chapter 1

1.1 Information Is Bits + Context; 1.2 Programs Are Translated by Other Programs into Different Forms; 1.3 It Pays to Understand How Compilation Systems Work; 1.4 Processors Read and Interpret Instructions Stored in Memory; 1.5 Caches Matter; 1.6 Storage Devices Form a Hierarchy; 1.7 The Operating System Manages the Hardware; 1.8 Systems Communicate with Other Systems Using Networks; 1.9 Important Themes

In the memory hierarchy, if the data your program needs are stored on disk, then they can be accessed in ______________ during the execution of the instruction.

10,000,000 cycles

Chapter 2

2.1 Information Storage; 2.2 Integer Representations; 2.3 Integer Arithmetic; 2.4 Floating Point

Chapter 3

3.1 A Historical Perspective; 3.2 Program Encodings; 3.3 Data Formats; 3.4 Accessing Information; 3.5 Arithmetic and Logical Operations; 3.6 Control; 3.7 Procedures; 3.8 Array Allocation and Access; 3.9 Heterogeneous Data Structures; 3.10 Combining Control and Data in Machine-Level Programs; 3.11 Floating-Point Code

Chapter 4

4.1 The Y86-64 Instruction Set Architecture; 4.2 Logic Design and the Hardware Control Language HCL; 4.3 Sequential Y86-64 Implementations; 4.4 General Principles of Pipelining; 4.5 Pipelined Y86-64 Implementations

Chapter 5

5.1 Capabilities and Limitations of Optimizing Compilers; 5.2 Expressing Program Performance; 5.3 Program Example; 5.4 Eliminating Loop Inefficiencies; 5.5 Reducing Procedure Calls; 5.6 Eliminating Unneeded Memory References; 5.7 Understanding Modern Processors; 5.8 Loop Unrolling; 5.9 Enhancing Parallelism; 5.10 Summary of Results for Optimizing Combining Code; 5.11 Some Limiting Factors; 5.12 Understanding Memory Performance; 5.13 Life in the Real World: Performance Improvement Techniques; 5.14 Identifying and Eliminating Performance Bottlenecks

Chapter 6

6.1 Storage Technologies; 6.2 Locality; 6.3 The Memory Hierarchy; 6.4 Cache Memories; 6.5 Writing Cache-Friendly Code; 6.6 Putting It Together: The Impact of Caches on Program Performance

Chapter 7

7.1 Compiler Drivers; 7.2 Static Linking; 7.3 Object Files; 7.4 Relocatable Object Files; 7.5 Symbols and Symbol Tables; 7.6 Symbol Resolution; 7.7 Relocation; 7.8 Executable Object Files; 7.9 Loading Executable Object Files; 7.10 Dynamic Linking with Shared Libraries; 7.11 Loading and Linking Shared Libraries from Applications; 7.12 Position-Independent Code (PIC); 7.13 Library Interpositioning; 7.14 Tools for Manipulating Object Files

Chapter 8

8.1 Exceptions; 8.2 Processes; 8.3 System Call Error Handling; 8.4 Process Control; 8.5 Signals; 8.6 Nonlocal Jumps; 8.7 Tools for Manipulating Processes

Chapter 9

9.1 Physical and Virtual Addressing; 9.2 Address Spaces; 9.3 VM as a Tool for Caching; 9.4 VM as a Tool for Memory Management; 9.5 VM as a Tool for Memory Protection; 9.6 Address Translation; 9.7 Case Study: The Intel Core i7/Linux Memory System; 9.8 Memory Mapping; 9.9 Dynamic Memory Allocation; 9.10 Garbage Collection; 9.11 Common Memory-Related Bugs in C Programs

The bitwise operation of [0110 1111] & [1010 1100] yields

[0010 1100] (see Chapter 2)
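
The AND on this card can be checked mechanically. The sketch below is only an illustration: `and_bytes` and `check_card_example` are hypothetical helper names, and the operands are the card's bit patterns written in hexadecimal (0x6F for [0110 1111], 0xAC for [1010 1100]).

```c
#include <assert.h>
#include <stdint.h>

/* Bitwise AND of two bytes: a result bit is 1 only where both inputs are 1. */
uint8_t and_bytes(uint8_t a, uint8_t b) {
    return (uint8_t)(a & b);
}

/* Checks the card's operands: 0x6F is [0110 1111], 0xAC is [1010 1100]. */
void check_card_example(void) {
    assert(and_bytes(0x6F, 0xAC) == 0x2C);  /* 0x2C is [0010 1100] */
}
```

Working column by column, the high nibble 0110 & 1010 gives 0010 and the low nibble 1111 & 1100 gives 1100.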

"Exceptions can be divided into four classes: interrupts, traps, faults, and aborts. _______________ result from unrecoverable fatal errors, typically hardware errors such as parity errors that occur when DRAM or SRAM bits are corrupted."

Aborts

Match each component to its function (task). (1) Running throughout the system is a collection of electrical conduits that carry bytes of information back and forth between the components. (2) Devices that are the system's connection to the external world. (3) A temporary storage device that holds both a program and the data it manipulates while the processor is executing the program. (4) The engine that interprets (or executes) stored instructions.

(1) Buses; (2) Input/Output (I/O) devices; (3) Main memory; (4) Central processing unit (CPU)

At any point in time, the set of virtual pages is partitioned into three disjoint subsets. Allocated pages that are currently saved in physical memory.

Cached

"Three major components are required to implement a digital system: ______to compute functions on the bits, ______ to store bits, and _____ to regulate the updating of the memory elements."

Combinational logic, Memory elements, Clock signals

Point and click on the process that translates the text file "hello.i" into the text file "hello.s"

Compiler

The machine code for x86-64 differs greatly from the original C code. Parts of the processor state are visible that normally are hidden from the C programmer. Which of these holds status information about the most recently executed arithmetic or logical instruction?

Condition code registers

Discuss the term cycles per element as it relates to program performance.

Cycles per element, abbreviated CPE, is used to express program performance in a way that can guide us in improving the code. CPE measurements help us understand the loop performance of an iterative program at a detailed level. It is appropriate for programs that perform a repetitive computation, such as processing the pixels in an image or computing the elements in a matrix product.
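
One way to make CPE concrete: if the run time of a routine over n elements is roughly a constant overhead plus CPE × n cycles, then CPE is the slope of cycles versus n. The sketch below estimates that slope from just two measurements. `estimate_cpe` and its parameters are hypothetical names, and a least-squares fit over many measurements would be the more robust choice.

```c
#include <assert.h>

/* Estimate cycles per element as the slope between two (n, cycles)
 * measurements. Assumes run time is approximately linear in n. */
double estimate_cpe(long n1, double cycles1, long n2, double cycles2) {
    assert(n1 != n2);  /* two distinct problem sizes are required */
    return (cycles2 - cycles1) / (double)(n2 - n1);
}
```

For example, measurements of 1,268 cycles at n = 100 and 2,168 cycles at n = 200 give a slope of 900 / 100 = 9.0 cycles per element; the constant overhead cancels out of the difference.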

According to the steps outlined in organizing processing into stages, the stage that reads up to two operands from the register file, giving values valA and/or valB, is called:

Decode stage

According to the steps outlined in organizing processing into stages, the stage that, using the arithmetic/logic unit (ALU), either performs the operation specified by the instruction, computes the effective address of a memory reference, or increments or decrements the stack pointer is called:

Execute Stage

"A classic example of a fault is the page fault exception, which occurs when an instruction references a virtual address whose corresponding page is not resident in memory and must therefore be retrieved from the associate website."

False

"As programmers, in order to make good coding decisions in our C programs, we do not need a basic understanding of machine-level code and or how the compiler translates different C statements into machine code."

False

"By assembling a number of logic gates into a network, we can construct computational blocks known as combinational circuits. Several restrictions are placed on how the networks are constructed. One of these is: Every logic gate input can be connected to one or more of the following: (1) one of the system inputs (known as a primary input), (2) the output connection of some memory element, or (3) the output of some logic gate."

False

"Most contemporary circuit technology represents different bit values as high or low voltages on signal wires. In current technology, logic value 0 is represented by a high voltage of around 1.0 volt, while logic value 0 is represented by a low voltage of around 0.0 volts."

False

64 bit machines cannot run programs compiled for 32 bit machines.

False

A page table is an array of page table entries (PTEs). Each page in the virtual address space has a PTE at variable offsets in the page table.

False

A single machine instruction can perform either very simple operations or many very complex operations.

False

Among compilers, GCC is considered exceptional, in terms of its optimization capabilities.

False

An assembly language program is considered a binary file, not text.

False

An erasable programmable ROM (EPROM) has a transparent quartz window that permits light to reach the storage cells. The EPROM cells are cleared to zeros by shining ultraviolet light through the window. Programming an EPROM is done by using a special device to write ones into the EPROM. An EPROM can be erased and reprogrammed an indefinite number of times.

False

Compilers and assemblers generate executable object files (including shared object files). Linkers generate relocatable object files.

False

Due to the large number of transistors that can be integrated onto a single chip, modern microprocessors employ complex hardware that attempts to maximize program performance. As a result the actual operation is identical to that which is perceived by looking at machine-level programs.

False

In our current high-tech environment, the primary objective in writing a program must be for it to run fast.

False

Linux linkers support a powerful technique, called library interpositioning, that allows you to intercept calls to shared library functions and execute your own code instead.

True

One way Loop unrolling can improve performance is to reduce the number of operations that do not contribute directly to the program result, such as loop indexing and conditional branching.

True

Static RAM chips are packaged in memory modules that plug into expansion slots on the main system board (motherboard).

False

Suppose we have a hexadecimal value of 0x76543210 stored in address locations 0x100 through 0x103. The ordering below represents little-endian byte order: 65743210

False
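
For comparison, a little-endian machine stores the least significant byte at the lowest address, so 0x76543210 at 0x100 through 0x103 would appear as 10 32 54 76. The sketch below builds that layout with shifts, so it does not depend on the byte order of the machine running it; `store_little_endian` is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write a 32-bit value into out[0..3] in little-endian order:
 * out[0] receives the least significant byte. */
void store_little_endian(uint32_t value, uint8_t out[4]) {
    assert(out != NULL);
    for (size_t i = 0; i < 4; i++) {
        out[i] = (uint8_t)(value >> (8 * i));
    }
}
```

Here `out[0]` plays the role of address 0x100 and `out[3]` the role of 0x103.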

The different mathematical properties of integer versus floating-point arithmetic stem from the difference in how they handle the finiteness of their representations: integer representations can encode a comparatively large range of values, but do so precisely, while floating-point representations can encode a small range of values, but only approximately.

False

The storage devices in every computer system are organized as a memory hierarchy similar to Figure 1.9. As we move from the top of the hierarchy to the bottom, the devices become faster, smaller, and less costly per byte.

False

The terms "double words," when referring to 32-bit words, and "quad words," when referring to 64-bit words, were first coined by AMD.

False

"Exceptions can be divided into four classes: interrupts, traps, faults, and aborts. ______________ result from error conditions that a handler might be able to correct."

Faults

"A _______________occurs for many reasons, usually because a program references an undefined area of virtual memory or because the program attempts to write to a read-only text segment."

General protection fault

Allocators come in two basic styles. Both styles require the application to explicitly allocate blocks. They differ about which entity is responsible for freeing allocated blocks. _________ require the allocator to detect when an allocated block is no longer being used by the program and then free the block. Implicit allocators are also known as garbage collectors, and the process of automatically freeing unused allocated blocks is known as garbage collection.

Implicit allocators

"Exceptions can be divided into four classes: interrupts, traps, faults, and aborts. _______________ occur asynchronously as a result of signals from I/O devices that are external to the processor."

Interrupts

Explain the use and function of linking in programming.

Linking is the process of collecting and combining various pieces of code and data into a single file that can be loaded (copied) into memory and executed. Linking can be performed at compile time, when the source code is translated into machine code; at load time, when the program is loaded into memory and executed by the loader; and even at run time, by application programs. On early computer systems, linking was performed manually. On modern systems, linking is performed automatically by programs called linkers. Linkers play a crucial role in software development because they enable separate compilation. Instead of organizing a large application as one monolithic source file, we can decompose it into smaller, more manageable modules that can be modified and compiled separately. When we change one of these modules, we simply recompile it and relink the application, without having to recompile the other files.

Describe the process of Logical Control Flow.

Logical control flow provides each program with the illusion that it has exclusive use of the processor, even though many other programs are typically running concurrently on the system. If we were to use a debugger to single-step the execution of our program, we would observe a series of program counter (PC) values that corresponded exclusively to instructions contained in our program's executable object file or in shared objects linked into our program dynamically at run time.

A __________________ occurs as a result of a fatal hardware error that is detected during the execution of the faulting instruction.

Machine Check

According to the steps outlined in organizing processing into stages, the stage that may write data to memory or read data from memory is called:

Memory Stage

In 1965, Gordon Moore, a founder of Intel Corporation made a prediction that quickly became known as Moore's Law. Describe Moore's Law and address how valid the law has proven to be.

Moore's Law is Gordon Moore's 1965 prediction that the number of transistors per chip would double every year for the next 10 years. As it turns out, his prediction was just a little bit optimistic, but also too short-sighted. Over more than 50 years, the semiconductor industry has been able to double transistor counts on average every 18 months. Similar exponential growth rates have occurred for other aspects of computer technology, including the storage capacities of magnetic disks and semiconductor memories. These remarkable growth rates have been the major driving forces of the computer revolution.

One of the remarkable feats of modern microprocessors is that they employ complex and exotic microarchitectures, in which _____ can be executed in parallel, while presenting an operational view of simple _____ execution.

Multiple Instructions, Sequential Instructions

Explain the process of writing an output to a display as shown in figure 1.7.

Once the code and data in the hello object file are loaded into memory, the processor begins executing the machine-language instructions in the hello program's main routine. These instructions copy the bytes in the "hello, world\n" string from memory to the register file, and from there to the display device, where they are displayed on the screen.

Which of these processes handles the C directives that begin with the # character?

Preprocessor

The machine code for x86-64 differs greatly from the original C code. Parts of the processor state are visible that normally are hidden from the C programmer. Which of these indicates the address in memory of the next instruction to be executed?

Program Counter

Explain the operation of Static RAM.

SRAM stores each bit in a bistable memory cell. Each cell is implemented with a six-transistor circuit. This circuit has the property that it can stay indefinitely in either of two different voltage configurations, or states. Any other state will be unstable: starting from there, the circuit will quickly move toward one of the stable states. Such a memory cell is analogous to an inverted pendulum. The pendulum is stable when it is tilted either all the way to the left or all the way to the right. From any other position, the pendulum will fall to one side or the other. In principle, the pendulum could also remain balanced in a vertical position indefinitely, but this state is metastable: the smallest disturbance would make it start to fall, and once it fell it would never return to the vertical position. Due to its bistable nature, an SRAM memory cell will retain its value indefinitely, as long as it is kept powered. Even when a disturbance, such as electrical noise, perturbs the voltages, the circuit will return to the stable value when the disturbance is removed.

"Match the HCL expressions for the AND, OR and NOT logic gates for the operators in C" AND gate OR gate OR gate

AND gate: a && b; OR gate: a || b; NOT gate: !a (see Chapter 4)

The ______ operand designates a value that is immediate, stored in a register, or stored in memory. The ________ operand designates a location that is either a register or a memory address. x86-64 imposes the restriction that a move instruction cannot have both operands refer to memory locations.

Source, Destination

Floating-point numbers come in two principal formats: single-precision (4-byte) values, corresponding to C data type float, and double-precision (8-byte) values, corresponding to C data type double.

True

The text makes the case that binary digits are the basis of the digital revolution. In an essay response, explain why this is true.

This is true because, although decimal (base-10) notation is natural for humans, who have 10 fingers, binary values work better when building machines that store and process information. Two-valued signals can readily be represented, stored, and transmitted, for example as the presence or absence of a hole in a punched card, as a high or low voltage on a wire, or as a magnetic domain oriented clockwise or counterclockwise. The electronic circuitry for storing and computing on two-valued signals is simple and reliable, which lets manufacturers integrate millions or even billions of such circuits on a single silicon chip. A single bit is not very useful in isolation, but when bits are grouped together and given an interpretation, the different bit patterns can represent the elements of any finite set, such as numbers or the characters in a document.

Disks read and write data in sector-size blocks. The access time for a sector has three main components: seek time, rotational latency, and transfer time. Match the term to the description. Seek Time Rotational Latency Transfer Time

Seek time: To read the contents of some target sector, the arm first positions the head over the track that contains the target sector; the seek time is the time required to move the arm. Rotational latency: Once the head is in position over the track, the drive waits for the first bit of the target sector to pass under the head. Transfer time: When the first bit of the target sector is under the head, the drive can begin to read or write the contents of the sector.
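
The three components add up to the access time for a sector. The sketch below estimates the average access time, taking the average rotational latency as half of one full rotation and the transfer time as one rotation divided by the number of sectors per track. The function names and the drive parameters in the usage note are hypothetical, chosen only for illustration.

```c
#include <assert.h>

/* Average rotational latency in ms: half of one full rotation. */
double avg_rotational_latency_ms(double rpm) {
    assert(rpm > 0.0);
    return 0.5 * (60.0 / rpm) * 1000.0;
}

/* Average transfer time in ms for one sector:
 * one rotation divided by the average number of sectors per track. */
double avg_transfer_ms(double rpm, double sectors_per_track) {
    assert(rpm > 0.0 && sectors_per_track > 0.0);
    return (60.0 / rpm) / sectors_per_track * 1000.0;
}

/* Average access time = seek time + rotational latency + transfer time. */
double avg_access_ms(double avg_seek_ms, double rpm, double sectors_per_track) {
    return avg_seek_ms
         + avg_rotational_latency_ms(rpm)
         + avg_transfer_ms(rpm, sectors_per_track);
}
```

For an assumed drive with a 9 ms average seek, 7,200 RPM, and 400 sectors per track, the estimate is about 9 + 4.17 + 0.02 ms. Seek time and rotational latency dominate, and the transfer time for a single sector is nearly negligible.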

"Exceptions can be divided into four classes: interrupts, traps, faults, and aborts. ______________ are intentional exceptions that occur as a result of executing an instruction."

Traps

"As programmers, we do not need to know the inner workings of the compiler in order to write efficient code."

True

"By assembling a number of logic gates into a network, we can construct computational blocks known as combinational circuits. Several restrictions are placed on how the networks are constructed. One of these is: The outputs of two or more logic gates cannot be connected together. Otherwise, the two could try to drive the wire toward different voltages, possibly causing an invalid voltage or a circuit malfunction."

True

"By assembling a number of logic gates into a network, we can construct computational blocks known as combinational circuits. Several restrictions are placed on how the networks are constructed. One of these is: there cannot be a path through a series of gates that forms a loop in the network. Such loops can cause ambiguity in the function computed by the network."

True

A Mark & Sweep garbage collector consists of a mark phase, which marks all reachable and allocated descendants of the root nodes, followed by a sweep phase, which frees each unmarked allocated block.

True
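
The two phases can be sketched on a toy heap. Everything here is an assumption made for illustration: the `Block` layout, the fixed limit of two child pointers, and the function names are hypothetical, and a real collector would keep the mark and allocated bits in block headers and would have to discover pointers inside the blocks itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CHILDREN 2

/* Hypothetical heap block with explicit mark and allocated flags. */
typedef struct Block {
    bool allocated;
    bool marked;
    struct Block *children[MAX_CHILDREN];  /* blocks this block points to */
} Block;

/* Mark phase: depth-first traversal that marks every allocated block
 * reachable from b. Already-marked blocks stop the recursion. */
static void mark(Block *b) {
    if (b == NULL || !b->allocated || b->marked) {
        return;
    }
    b->marked = true;
    for (size_t i = 0; i < MAX_CHILDREN; i++) {
        mark(b->children[i]);
    }
}

/* Sweep phase: free each unmarked allocated block and clear the marks
 * on the survivors. Returns the number of blocks freed. */
static size_t sweep(Block *heap, size_t n) {
    size_t freed = 0;
    for (size_t i = 0; i < n; i++) {
        if (!heap[i].allocated) {
            continue;
        }
        if (heap[i].marked) {
            heap[i].marked = false;    /* reset for the next collection */
        } else {
            heap[i].allocated = false; /* unreachable: reclaim it */
            freed++;
        }
    }
    return freed;
}

/* One full collection: mark from every root, then sweep the heap. */
size_t collect(Block *heap, size_t n, Block **roots, size_t nroots) {
    assert(heap != NULL);
    for (size_t i = 0; i < nroots; i++) {
        mark(roots[i]);
    }
    return sweep(heap, n);
}
```

Blocks that point only to each other but are not reachable from any root stay unmarked, so the sweep reclaims them.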

An exception is an abrupt change in the control flow in response to some change in the processor's state.

True

Computers execute machine code, which encodes the low-level operations that manipulate data, manage memory, read and write data on storage devices, and communicate over networks.

True

An issue with interpositioning is that it must occur at run time as the program is being executed.

False

It is equally important for programmers to write clear and concise code, not only so that they can make sense of it, but also so that others can read and understand the code during code reviews and when modifications are required later.

True

Linking is the process of collecting and combining various pieces of code and data into a single file that can be loaded (copied) into memory and executed.

True

Loop unrolling is a program transformation that reduces the number of iterations for a loop by increasing the number of elements computed on each iteration.

True
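
A minimal sketch of unrolling by a factor of two, assuming a simple array sum as the loop being transformed (`sum_unrolled2` is a hypothetical name): each iteration of the main loop handles two elements, and a short cleanup loop handles the leftover element when the length is odd.

```c
#include <assert.h>
#include <stddef.h>

/* Sum an array with the loop unrolled by a factor of two. */
long sum_unrolled2(const long *a, size_t n) {
    assert(a != NULL || n == 0);
    long sum = 0;
    size_t i = 0;

    /* Main loop: two elements per iteration, so about n/2 iterations. */
    for (; i + 1 < n; i += 2) {
        sum = (sum + a[i]) + a[i + 1];
    }

    /* Cleanup loop: at most one remaining element when n is odd. */
    for (; i < n; i++) {
        sum += a[i];
    }
    return sum;
}
```

The loop index update and the loop test now run about half as often. Whether this is actually faster depends on the compiler and processor, and optimizing compilers may already apply this transformation on their own.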

Modern computers store and process information represented as two-valued signals.

True

One of the most important lessons in this book is that application programmers who are aware of cache memories can exploit them to improve the performance of their programs by an order of magnitude.

True

Rather than accessing individual bits in memory, most computers use blocks of 8 bits, or bytes, as the smallest addressable unit of memory.

True

A solid state drive (SSD) package consists of one or more flash memory chips, which replace the mechanical drive in a conventional rotating disk, and a flash translation layer, which is a hardware/firmware device that plays the same role as a disk controller, translating requests for logical blocks into accesses of the underlying physical device.

True

The compiler translates the text file computer program into a text file which contains an assembly-language program.

True

The dynamic linker loads and links shared libraries when an application is loaded, just before it executes. However, it is also possible for an application to request the dynamic linker to load and link arbitrary shared libraries while the application is running, without having linked the application against those libraries at compile time.

True

The following is considered a disadvantage of static libraries: Almost every C program uses standard I/O functions such as printf and scanf. At run time, the code for these functions is duplicated in the text segment of each running process. On a typical system that is running hundreds of processes, this can be a significant waste of scarce memory system resources.

True

The linker resolves symbol references by associating each reference with exactly one symbol definition from the symbol tables of its input relocatable object files.

True

The term safe optimizations means that, within the limits of the guarantees provided by the C language standards, the optimized program will have the same behavior as an unoptimized version would.

True

The state that a program needs to run correctly includes the program's code and data stored in memory, its stack, the contents of its general purpose registers, its program counter, environment variables, and the set of open file descriptors.

True

DRAMs and SRAMs are volatile in the sense that they lose their information if the supply voltage is turned off. Nonvolatile memories, on the other hand, retain their information even when they are powered off.

True

Whenever the assembler encounters a reference to an object whose ultimate location is unknown, it generates a relocation entry that tells the linker how to modify the reference when it merges the object file into an executable.

True

In the context of a linker, there are three different kinds of symbols. Identify which of the statements below are true and which are false. (1) _______ Global symbols that are defined by module m and that can be referenced by other modules. Global linker symbols correspond to nonstatic C functions and global variables. (2) _______ Global symbols that are referenced by module m but defined by some other module. Such symbols are called externals and correspond to nonstatic C functions and global variables that are defined in other modules. (3) _______ Local symbols that are defined and referenced exclusively by module m. These correspond to static C functions and global variables that are defined with the static attribute. These symbols are visible anywhere within module m, but cannot be referenced by other modules.

True for all

Computers implement arithmetic operations, such as addition and multiplication, with these different representations, similar to the corresponding operations on integers and real numbers. We consider the three most important representations of numbers. Those based on a base-2 version of scientific notation for representing real numbers, are referred to as:

Floating-point encodings

Computers implement arithmetic operations, such as addition and multiplication, with these different representations, similar to the corresponding operations on integers and real numbers. We consider the three most important representations of numbers. Those based on traditional binary notation, representing numbers greater than or equal to 0 are referred to as:

Unsigned encodings

Discuss the reason programmers should know about virtual memory.

Virtual memory is one of the great ideas in computer systems. A major reason for its success is that it works silently and automatically, without any intervention from the application programmer. Since virtual memory works so well behind the scenes, why would a programmer need to understand it? There are several reasons. Virtual memory is central. Virtual memory pervades all levels of computer systems, playing key roles in the design of hardware exceptions, assemblers, linkers, loaders, shared objects, files, and processes. Understanding virtual memory will help you better understand how systems work in general. Virtual memory is powerful. Virtual memory gives applications powerful capabilities to create and destroy chunks of memory, map chunks of memory to portions of disk files, and share memory with other processes. For example, did you know that you can read or modify the contents of a disk file by reading and writing memory locations? Or that you can load the contents of a file into memory without doing any explicit copying? Understanding virtual memory will help you harness its powerful capabilities in your applications. Virtual memory is dangerous. Applications interact with virtual memory every time they reference a variable, dereference a pointer, or make a call to a dynamic allocation package such as malloc. If virtual memory is used improperly, applications can suffer from perplexing and insidious memory-related bugs. For example, a program with a bad pointer can crash immediately with a "segmentation fault" or a "protection fault," run silently for hours before crashing, or scariest of all, run to completion with incorrect results. Understanding virtual memory, and the allocation packages such as malloc that manage it, can help you avoid these errors.

Describe the general principles of computational pipelining.

With computational pipelines, the "customers" are instructions and the stages perform some portion of the instruction execution. A simple nonpipelined hardware system consists of some logic that performs a computation, followed by a register to hold the results of this computation. A clock signal controls the loading of the register at some regular time interval. An example of such a system is the decoder in a compact disk (CD) player. The incoming signals are the bits read from the surface of the CD, and the logic decodes these to generate audio signals. The computational block is implemented as combinational logic, meaning that the signals will pass through a series of logic gates, with the outputs becoming some function of the inputs after some time delay.
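
The payoff of splitting the logic into stages can be worked out numerically. The sketch below assumes, as a simplification, that the combinational logic divides evenly among the stages and that each stage ends in a register with a fixed delay. The function names and the delay figures in the usage note are illustrative assumptions, not measurements.

```c
#include <assert.h>

/* Clock period in picoseconds: one stage's share of the logic delay
 * plus the delay of loading the pipeline register. */
double cycle_time_ps(double logic_ps, double register_ps, int stages) {
    assert(stages > 0);
    return logic_ps / stages + register_ps;
}

/* Throughput in giga-instructions per second (GIPS):
 * one instruction completes per clock period, and 1000 ps = 1 ns. */
double throughput_gips(double logic_ps, double register_ps, int stages) {
    return 1000.0 / cycle_time_ps(logic_ps, register_ps, stages);
}

/* Latency in picoseconds: one instruction passes through every stage. */
double latency_ps(double logic_ps, double register_ps, int stages) {
    return stages * cycle_time_ps(logic_ps, register_ps, stages);
}
```

With 300 ps of logic and a 20 ps register, the nonpipelined system has a 320 ps cycle (about 3.12 GIPS). Splitting the same logic into three stages gives a 120 ps cycle (about 8.33 GIPS), while latency grows from 320 ps to 360 ps because of the two extra registers.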

According to the steps outlined in organizing processing into stages, the stage that writes up to two results to the register file is called:

Write Back Stage

[0110 1010] << 4 (logical) equals

[1010 0000]
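
The same shift in C, as a small sketch (`shl8` is a hypothetical helper name). The operand is promoted to `int` before the shift, so the cast back to `uint8_t` is what discards the bits shifted past bit 7.

```c
#include <assert.h>
#include <stdint.h>

/* Shift an 8-bit value left by k positions, filling with zeros on the
 * right and dropping any bits that move past the most significant bit. */
uint8_t shl8(uint8_t x, unsigned k) {
    assert(k < 8);            /* keep the shift within the byte width */
    return (uint8_t)(x << k); /* the cast truncates back to 8 bits */
}
```

With x = 0x6A ([0110 1010]) and k = 4, the high nibble 0110 is shifted out and the low nibble 1010 moves up, giving 0xA0 ([1010 0000]).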

Throughout the history of digital computers, two demands have been constant forces in driving improvements: we want them to do more, and we want them to run faster. Both of these factors improve when the processor does more things at once. We use the term ___ to refer to the general concept of a system with multiple, simultaneous activities, and the term ____ to refer to the use of ____ to make a system run faster. ____ can be exploited at multiple levels of abstraction in a computer system

concurrency, parallelism, concurrency, parallelism

In the memory hierarchy, if the data your program needs are stored in main memory, then they can be accessed in ______________ during the execution of the instruction.

hundreds of cycles.

The classic definition of a process is a(n) _________ of a program in execution. Each program in the system runs in the _________ of some process. The _________ consists of the state that the program needs to run correctly.

instance, context, context

The ______ is encountered when a series of operations must be performed in strict sequence, because the result of one operation is required before the next one can begin. This bound can limit program performance when the data dependencies in the code limit the ability of the processor to exploit instruction-level parallelism. The ______ characterizes the raw computing capacity of the processor's functional units. This bound becomes the ultimate limit on program performance.

latency bound, throughput bound
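
One common way to move a loop away from the latency bound is to break its single chain of dependent operations into independent chains. The sketch below assumes an array sum and uses two accumulators (`sum_two_accumulators` is a hypothetical name). The additions into `acc0` do not depend on those into `acc1`, so a processor with enough functional units may overlap them. Any actual speedup depends on the hardware and the compiler.

```c
#include <assert.h>
#include <stddef.h>

/* Sum an array using two independent accumulators. */
long sum_two_accumulators(const long *a, size_t n) {
    assert(a != NULL || n == 0);
    long acc0 = 0;  /* accumulates elements at even indices */
    long acc1 = 0;  /* accumulates elements at odd indices */
    size_t i = 0;

    /* Two independent dependency chains per iteration. */
    for (; i + 1 < n; i += 2) {
        acc0 += a[i];
        acc1 += a[i + 1];
    }

    /* Cleanup: at most one remaining element when n is odd. */
    for (; i < n; i++) {
        acc0 += a[i];
    }
    return acc0 + acc1;
}
```

For integer addition the reordering leaves the result unchanged. For floating-point data, reassociating the operations this way can change rounding, which is one reason compilers are cautious about doing it automatically.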

A compiler generates machine code through a series of stages which include all of the following except:

the preferences of the programmer

