Computer Architecture and Assembly Language - Topic 4: The Processor

Logic Elements

-Combinational Elements
-State Elements
-Datapath Elements

Signals

-Control Signals
-Data Signals

Steps in instruction execution

-Fetch the instruction from memory.
-Decode the instruction.
-Execute the instruction.
-Data memory access.
-Write the result back to memory.
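
The five steps can be sketched as a toy interpreter loop. This is only an illustration: the instruction names (LOAD, ADD, STORE, HALT), the tuple encoding, and the four-register file are invented for the sketch and do not correspond to any real instruction set.

```python
# Toy machine: every instruction is a hypothetical (opcode, a, b, c) tuple.
PROGRAM = [
    ("LOAD", 0, 0, None),     # r0 <- memory[0]
    ("LOAD", 1, 1, None),     # r1 <- memory[1]
    ("ADD", 2, 0, 1),         # r2 <- r0 + r1
    ("STORE", 2, 2, None),    # memory[2] <- r2
    ("HALT", None, None, None),
]


def run(program, memory):
    regs = [0] * 4            # small register file (size is an assumption)
    pc = 0                    # program counter: index of the next instruction
    while pc < len(program):
        instr = program[pc]               # fetch the instruction
        op, a, b, c = instr               # decode opcode and operand fields
        pc += 1                           # default next-instruction address
        if op == "ADD":
            regs[a] = regs[b] + regs[c]   # execute in the "ALU"
        elif op == "LOAD":
            regs[a] = memory[b]           # data memory access (read)
        elif op == "STORE":
            memory[b] = regs[a]           # write the result back to memory
        elif op == "HALT":
            break
        else:
            raise ValueError(f"unknown opcode: {op}")
    return regs, memory


regs, memory = run(PROGRAM, [2, 3, 0])
print(memory)
```

A real processor performs these steps in hardware, one per pipeline stage, rather than in a software loop.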

Approaches to remediating data hazards

-Forwarding or bypassing
-Scheduling
-Stalling
-NOP

Pipeline Hazards

-Structural hazards
-Data hazards
-Control hazards (branch hazards)

Designing for Pipelines

1. Make every instruction in the instruction set the same length.
2. Have very few differing instruction formats.
3. Restrict memory operands to load and store instructions.
4. Ensure that instruction operands reside in the same page of storage as the instruction itself.

Stalling

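A pipeline stall (also called a bubble) is a delay inserted into the pipeline to resolve a hazard. The affected instruction, and the instructions behind it, wait one or more clock cycles until the hazard has cleared.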

Clocking

Clocking is the rhythmic ticking of the computer's internal timepiece. Clock frequency is measured in hertz (cycles per second) and is currently on the order of billions of cycles per second (gigahertz). The clock defines when signals can be read and when they can be written, as well as the interval in which a unit of work must be completed. Synchronization is important so that all work units execute in lockstep. If operations were not synchronized within a tight tolerance, a signal could be read at the same time it is being written, causing unpredictable results and loss of data integrity. A combinational element has the interval from one clock edge to the next to complete a unit of work; finishing late would be catastrophic. Any value updated in a sequential logic element (a register or a location in memory) changes at a clock edge and is held for the entire clock interval.
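
As a small worked example of the timing constraint, the sketch below converts a clock frequency into the length of one clock interval and checks whether a combinational delay fits inside it. The function names and the delay figures are made up for illustration.

```python
def clock_period_ps(frequency_hz):
    """Length of one clock interval in picoseconds."""
    return 1e12 / frequency_hz


def meets_timing(combinational_delay_ps, frequency_hz):
    """True if the combinational work finishes before the next clock edge."""
    return combinational_delay_ps <= clock_period_ps(frequency_hz)


print(clock_period_ps(2e9))       # 2 GHz clock -> 500.0 ps per cycle
print(meets_timing(450, 2e9))     # True: work finishes inside the interval
print(meets_timing(550, 2e9))     # False: work would finish late
```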

Data Memory Access

Ensure that the operands' memory addresses can be reached, that is, that they are addressable by the architecture. Ensure that the address computed from the contents of the registers and displacements actually exists. Ensure access to any register(s) needed. Set the condition code in the PSW.

Fetch the instruction from memory

Examine the address of the next instruction in the Program Counter (PC) or the Program Status Word (PSW). Go to the memory location containing the next instruction to be executed. Bring the instruction and operands from main memory into the CPU hardware buffer. The amount fetched is generally the maximum amount of data needed to populate the longest instruction format, starting at the memory address to which the PC or PSW points.

Exceptions

Exceptions were initially designed to handle unexpected events from within the processor, such as arithmetic overflow or division by zero. An exception can be any unexpected change in the control flow that arises from internal causes. When an exception occurs, the processor must save the address of the offending instruction and then transfer control to the proper operating system interrupt handler. The operating system can then take the appropriate action in response to the exception, likely stopping the execution of the program and posting an error. Note that any unfinished instructions in the pipeline ahead of the instruction causing the exception must be allowed to run to completion. Furthermore, any instructions in the pipeline following the instruction causing the exception must be flushed from the system as though they never happened. From the programmer's point of view, the computer architecture must guarantee that the instructions were processed as though executed serially, one after another, in time. The pipelined implementation treats exceptions as another form of control hazard. The hardware stops the offending instruction in midstream, lets all prior instructions run to completion, and flushes all the instructions following it.
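
The rule for which instructions complete and which are flushed can be sketched as a simple partition of the instructions currently in the pipeline. The addresses, the instruction text, and the function name `handle_exception` are hypothetical; the sketch only models the bookkeeping, not the hardware.

```python
def handle_exception(in_flight, faulting_index):
    """Split the in-flight instructions (listed in program order) around
    the instruction that raised the exception.

    in_flight is a list of (address, text) pairs, oldest instruction first.
    """
    offending_address, _ = in_flight[faulting_index]
    return {
        "complete": in_flight[:faulting_index],       # older: run to completion
        "saved_address": offending_address,           # handed to the OS handler
        "flushed": in_flight[faulting_index + 1:],    # younger: discarded
    }


pipeline = [
    (0x100, "LOAD r1"),
    (0x104, "ADD r2"),
    (0x108, "DIV r3"),    # assume this one divides by zero
    (0x10C, "SUB r4"),
    (0x110, "STORE r5"),
]
result = handle_exception(pipeline, 2)
print(hex(result["saved_address"]))    # 0x108
```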

Pipeline Considerations

Given a large number of instructions and ideal conditions, the speedup is about equal to the number of pipe stages. Pipelining improves performance by increasing instruction throughput; it does not decrease the execution time of any single instruction.

Pipeline (ideal) Performance

If the stages are perfectly balanced and processing conditions are ideal, then: Time between instructions (pipelined) = Time between instructions (non-pipelined) / Number of pipe stages
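
A short numeric sketch of this formula, using made-up figures (1000 ps per non-pipelined instruction, five stages). It also shows why the speedup only approaches the number of stages for a large instruction count: the pipeline needs a few cycles to fill.

```python
def ideal_pipelined_interval(non_pipelined_interval, stages):
    """Time between instruction completions with perfectly balanced stages."""
    return non_pipelined_interval / stages


def speedup(num_instructions, non_pipelined_interval, stages):
    """Total non-pipelined time divided by total ideal pipelined time."""
    stage_time = ideal_pipelined_interval(non_pipelined_interval, stages)
    unpiped_total = num_instructions * non_pipelined_interval
    # The first instruction needs `stages` cycles; each later one adds a cycle.
    piped_total = (stages + num_instructions - 1) * stage_time
    return unpiped_total / piped_total


print(ideal_pipelined_interval(1000, 5))    # 200.0
print(speedup(1, 1000, 5))                  # 1.0: no gain for one instruction
print(speedup(1_000_000, 1000, 5))          # just under 5
```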

Pipeline vs. Single Cycle instruction performance

Just as the clock cycle of a single-cycle design must be long enough for the slowest complete instruction, the clock cycle of a pipelined design need only be long enough for the slowest stage in the pipeline. Outside of the memory system, the effective operation of the pipeline is usually the most important factor in determining the clock cycles per instruction (CPI) of the processor, and hence its performance.
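
The comparison can be made concrete with a set of stage latencies. The figures below are illustrative assumptions, not measurements of any particular processor. Because the stages are not perfectly balanced, the resulting speedup is less than the number of stages.

```python
# Hypothetical latency of each stage, in picoseconds.
STAGE_LATENCY_PS = {
    "fetch": 200,
    "decode": 100,
    "execute": 200,
    "memory": 200,
    "writeback": 100,
}

# Single-cycle design: one clock cycle must cover every stage of an instruction.
single_cycle_clock_ps = sum(STAGE_LATENCY_PS.values())

# Pipelined design: the clock cycle only has to cover the slowest stage.
pipelined_clock_ps = max(STAGE_LATENCY_PS.values())

print(single_cycle_clock_ps)                         # 800
print(pipelined_clock_ps)                            # 200
print(single_cycle_clock_ps / pipelined_clock_ps)    # 4.0, not 5.0
```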

Write the result back to memory

Place the results of the computation into the targeted memory location or into the result register, and/or set the condition code in the PSW.

Scheduling

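The programmer (or compiler) orders the instructions so that sequences which could create data hazards are avoided, for example by placing independent instructions between an instruction that produces a value and the instruction that consumes it.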

The Von Neumann Bottleneck

The shared bus between the program memory and data memory leads to the von Neumann bottleneck: the limited throughput (data transfer rate) between the CPU and memory compared to the amount of memory. Because instructions and data must travel between the memory and the CPU across a bus, throughput is lower than the rate at which the CPU can work. This seriously limits the effective processing speed when the CPU is required to perform minimal processing on large amounts of data. The CPU is continually forced to wait for needed data to be transferred to or from memory. Since CPU speed and memory size have increased much faster than the throughput between them, the bottleneck has become more of a problem, one whose severity increases with every newer generation of CPU.
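
A rough back-of-the-envelope model of the bottleneck: if transfers and computation overlapped perfectly (an optimistic assumption), a job would take as long as the slower of the two. All rates and sizes below are invented for illustration.

```python
def job_time_seconds(bytes_moved, operations, bus_bytes_per_s, cpu_ops_per_s):
    """Return (elapsed time, memory_bound) under a perfect-overlap assumption."""
    transfer_time = bytes_moved / bus_bytes_per_s
    compute_time = operations / cpu_ops_per_s
    return max(transfer_time, compute_time), transfer_time > compute_time


# Minimal processing on a large amount of data: one operation per 8 bytes moved.
elapsed, memory_bound = job_time_seconds(
    bytes_moved=8e9,
    operations=1e9,
    bus_bytes_per_s=10e9,
    cpu_ops_per_s=4e9,
)
print(elapsed, memory_bound)
```

In this example the CPU could finish its work in 0.25 s but the job takes about 0.8 s, because the bus is the limiting factor.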

Execute the instruction

Execution takes place in the Arithmetic Logic Unit (ALU). It uses the control and data signals set by the Decode stage to determine the work to perform and the data to act on. It may calculate the next address in memory to be branched (jumped) to. If the next inline, adjacent address does not contain the next instruction to be fetched, override the next instruction address in the PSW that was calculated in the Decode stage.

Decode the Instruction

Using the instruction set list and the instruction formats, determine the operation code (opcode) and read the contents of the register(s). The opcode field denotes the operation to be performed and the format of the instruction. Set the control and data signals based on the format of the particular instruction; each format has its own processing requirements. These signals effectively reconfigure the CPU on the fly. Compute the address of the next instruction to be fetched and update the Program Counter (PC) or PSW.

How important are Signals?

Without signals we would need a separate special-purpose CPU to handle each type of instruction format. Signals allow a generalized CPU to be customized "on the fly" to accommodate the different instruction formats, with their various inputs and outputs. Special-purpose CPUs would not need the full implementation of the DECODE step, which would make instruction execution faster, but only a single CPU could operate at a time, and the multiple CPUs would be much more expensive to implement. A generic CPU employs the DECODE stage, which consumes more clock cycles per instruction, but the compromise makes the design relatively inexpensive.

Control Signals

are used to select or direct the operation of a functional unit. These signals are set up on the basis of the format of the executing instruction. Examples include whether the instruction reads or writes a register or memory, which string of logic gates is used to process the data, and whether to load the address of a calculation or its contents. The setting of the control lines is completely determined by the operation code (opcode) fields of the instruction. The CPU is configured for that instruction format, somewhat like a car assembly line being reconfigured for a particular model with each car produced.
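
One way to picture this is a lookup table from opcode to control-line settings. The opcodes and signal names below (reg_write, mem_read, mem_write, alu_op) are hypothetical labels chosen for the sketch, not the control lines of a specific processor.

```python
# Hypothetical control-line settings, fully determined by the opcode.
CONTROL_TABLE = {
    "ADD":       {"reg_write": True,  "mem_read": False, "mem_write": False, "alu_op": "add"},
    "LOAD":      {"reg_write": True,  "mem_read": True,  "mem_write": False, "alu_op": "add"},
    "STORE":     {"reg_write": False, "mem_read": False, "mem_write": True,  "alu_op": "add"},
    "BRANCH_EQ": {"reg_write": False, "mem_read": False, "mem_write": False, "alu_op": "sub"},
}


def control_signals(opcode):
    """Return the control-line settings that configure the datapath."""
    try:
        return CONTROL_TABLE[opcode]
    except KeyError:
        raise ValueError(f"unknown opcode: {opcode}") from None


print(control_signals("LOAD"))
```

LOAD and STORE use the ALU's add operation in this sketch because the memory address is assumed to be a base register plus a displacement.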

State Element

is a contiguous set of bits at a memory location or in a register that holds the data representing the contents or results of a logic operation. State elements are what must be saved and restored when context-switching between two tasks (interrupting, saving, restoring and re-dispatching). A state element is somewhat like a data variable in a high-level language: it holds the state, or current value, of an operand used by the instruction.

Branch Prediction

is a method of resolving a branch hazard that assumes a given outcome for the branch and proceeds from that assumption rather than waiting to ascertain the actual outcome. Prediction does not slow down the pipeline when it is correct. When it is wrong, however, the instructions that entered the pipeline after the branch must be flushed, and processing must restart from the instruction that should have followed the branch.
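
The cost of wrong predictions can be estimated with a very simple model: a static predictor that always guesses the same outcome, and a fixed number of cycles lost per flush. Both the always-same-guess policy and the two-cycle penalty are assumptions made for this sketch.

```python
def misprediction_cost(outcomes, predict_taken=False, flush_penalty=2):
    """Count wrong guesses and the cycles lost to flushing.

    outcomes: list of booleans, True where the branch was actually taken.
    """
    wrong = sum(1 for taken in outcomes if taken != predict_taken)
    return wrong, wrong * flush_penalty


# A loop branch that is taken three times and then falls through.
history = [True, True, True, False]
print(misprediction_cost(history, predict_taken=False))    # (3, 6)
print(misprediction_cost(history, predict_taken=True))     # (1, 2)
```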

Forwarding or bypassing

is a method of resolving a data hazard by retrieving the missing data element from the CPU's internal buffers rather than waiting for it to arrive in the programmer-visible registers or its expected place of storage in memory. Forwarding is valid only if the destination stage of the consuming instruction ('Y') occurs later in time than the source stage of the producing instruction ('X'). The name "bypassing" comes from passing the result around the register file to the desired destination unit.
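
A minimal sketch of the selection a forwarding unit makes for each source operand. It assumes the classic five-stage arrangement with internal buffers named EX/MEM and MEM/WB; those names, and the function itself, are illustrative rather than taken from the text above.

```python
def forward_sources(consumer_sources, ex_mem_dest, mem_wb_dest):
    """For each source register, say where its value should come from.

    ex_mem_dest / mem_wb_dest: destination register of the instruction whose
    result sits in that internal buffer, or None if it writes no register.
    """
    choice = {}
    for src in consumer_sources:
        if ex_mem_dest is not None and src == ex_mem_dest:
            choice[src] = "EX/MEM"          # newest result wins
        elif mem_wb_dest is not None and src == mem_wb_dest:
            choice[src] = "MEM/WB"
        else:
            choice[src] = "register file"   # no hazard: normal read
    return choice


# SUB r4, r1, r5 directly follows ADD r1, r2, r3: r1 is still in EX/MEM.
print(forward_sources(["r1", "r5"], ex_mem_dest="r1", mem_wb_dest=None))
```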

Pipeline Hazard

is a situation where the next instruction cannot execute in the following clock cycle because of conditions set up by instructions that are ahead of it in the pipeline.

Pipelining

is a technique that exploits parallelism among multiple instructions in a sequential instruction stream. The instructions are overlapped in execution. Dividing instruction processing into five stages gives a five-stage pipeline, which in turn means that up to five instructions (theoretically) will be in execution simultaneously during any single clock cycle. Pipelining increases the number of simultaneously processing instructions and the rate at which instructions are started and completed. Pipelining does not reduce the time it takes to complete an individual instruction. In each clock cycle, the next instruction is fetched, the (next-1) instruction is decoded, the (next-2) instruction is executed, the (next-3) instruction accesses data memory, and the (next-4) instruction writes its result. Pipelining does NOT change the order in which instructions are executed. The programmer can consider the instruction order as "set in stone".
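
The overlap is easiest to see as a timing diagram. The sketch below builds one row per instruction and one column per clock cycle for an ideal pipeline with no hazards; the stage abbreviations IF, ID, EX, MEM, WB are the conventional short names, assumed here.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]    # fetch, decode, execute, memory, write


def pipeline_diagram(num_instructions):
    """Return a grid: rows are instructions, columns are clock cycles."""
    total_cycles = num_instructions + len(STAGES) - 1
    rows = []
    for i in range(num_instructions):
        row = ["."] * total_cycles          # "." marks an idle slot
        for offset, stage in enumerate(STAGES):
            row[i + offset] = stage         # instruction i starts in cycle i
        rows.append(row)
    return rows


def in_flight(rows, cycle):
    """Number of instructions occupying some stage during the given cycle."""
    return sum(1 for row in rows if row[cycle] != ".")


diagram = pipeline_diagram(6)
for row in diagram:
    print(" ".join(f"{cell:>3}" for cell in row))
print(in_flight(diagram, 4))    # 5 instructions overlap once the pipe is full
```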

Datapath Element

is a unit used to operate on or hold transient data within a processor. Datapath elements include the registers, the ALU, the adders and the data memory, along with the intermediate results of instructions that pass between them.

NOP (no operation)

is an instruction used as a programmed stall: it consumes clock cycles but performs no computational work. The NOP adds elasticity to the pipeline without loss of information or computational results.
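
A sketch of how NOPs might be inserted as programmed stalls. It assumes a pipeline without forwarding in which a consumer must sit at least three slots after the instruction that produces its input; that distance, the tuple encoding (name, destination, sources), and the function name are all assumptions of this example.

```python
NOP = ("NOP", None, ())    # no destination, no sources, no work


def insert_nops(program, min_distance=3):
    """Pad the program so no instruction reads a register written by one of
    the previous (min_distance - 1) instructions."""
    out = []
    for instr in program:
        _, _, sources = instr
        while True:
            start = max(0, len(out) - (min_distance - 1))
            recent = out[start:]
            if any(dest is not None and dest in sources for _, dest, _ in recent):
                out.append(NOP)             # stall one slot and check again
            else:
                break
        out.append(instr)
    return out


program = [
    ("ADD", "r1", ("r2", "r3")),
    ("SUB", "r4", ("r1", "r5")),    # needs r1 from the ADD
]
for name, _, _ in insert_nops(program):
    print(name)
```

With the assumed distance of three, the result is ADD, NOP, NOP, SUB.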

Combinational Element

is an operational element, such as an AND or XOR gate, in the Arithmetic Logic Unit (ALU). Its output depends only on its current inputs. This element determines how the information in the two data elements (the operands) is processed or combined by the operation.

Structural Hazard

is one in which the planned instruction cannot execute in the proper clock cycle because the hardware does not support the combination of instructions that are set to execute. There is a conflict, such as a common resource that is needed by two operations and cannot be shared. Examples: An instruction that executes a compare operation has not yet set the condition code in the PSW, and the following instruction needs to take action on the basis of the condition code. Two instructions need access to the same memory location; the first wants to write (store), the second to read (load). A single memory is shared by instruction fetch and data access, so a fetch and a load cannot both use it in the same clock cycle.

Data Signals

are used to provide content information (the operands) to the functional unit that operates on the instruction, such as decode. That is, they identify to the later pipe stages the source of the inputs (register, condition code, memory location, immediate data) and the target of the outputs. This information is encoded in the instruction format.

Data Hazard

occurs when a planned instruction cannot execute in the proper clock cycle because the input data needed to execute the instruction is not yet available; it is still being computed, or its computation has not yet begun. This hazard can lead to a "stall". Example: a SUBTRACT instruction needs the result of a prior ADD instruction which is still in the pipeline.
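
Detecting this kind of dependency amounts to checking whether an instruction reads a register that a recent predecessor writes. The sketch below assumes instructions are encoded as (name, destination, sources) tuples and that only the previous two instructions can still be in flight; both are assumptions of the example.

```python
def find_data_hazards(program, window=2):
    """Return (producer_index, consumer_index, register) for every read of a
    register written by one of the previous `window` instructions."""
    hazards = []
    for i, (_, _, sources) in enumerate(program):
        for j in range(max(0, i - window), i):
            dest = program[j][1]
            if dest is not None and dest in sources:
                hazards.append((j, i, dest))
    return hazards


program_with_hazard = [
    ("ADD", "r1", ("r2", "r3")),
    ("SUBTRACT", "r4", ("r1", "r5")),    # reads r1 before the ADD has written it
    ("OR", "r6", ("r7", "r8")),          # independent of both
]
print(find_data_hazards(program_with_hazard))    # [(0, 1, 'r1')]
```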

Control Hazard (Branch Hazard)

occurs when the proper instruction cannot execute in the proper pipeline clock cycle because the instruction that was fetched is not the instruction that is next needed; that is, the flow of instruction accesses has "jumped" from the address of the next expected instruction. The pipeline cannot possibly have future knowledge of what the next instruction to be executed should be if it is currently processing a branch type instruction. A solution would be to stall until the next effective instruction address was computed, but this would cause a slowdown in processing.

