CS61: Introduction to Computing Systems


2.7 Other Representations

*2.7.1 The Bit Vector* = It is often useful to describe a complex system made up of several units, each of which is individually and independently busy or available. - Suppose we have eight machines that we want to monitor with respect to their availability. We can keep track of them with an eight-bit BUSYNESS bit vector, where a bit is 1 if the unit is free and 0 if the unit is busy. - The bits are labeled, from right to left, from 0 to 7. The BUSYNESS bit vector 11000010 corresponds to the situation where only units 7, 6, and 1 are free, and therefore available for work assignment. ex: Suppose work is assigned to unit 7. We update our BUSYNESS bit vector by performing the logical AND, where our two sources are the current bit vector 11000010 and the bit mask 01111111. The purpose of the bit mask is to clear bit 7 of the BUSYNESS bit vector. The result is the bit vector 01000010. *Bit Mask* = enables one to interact with some bits of a binary pattern while ignoring the rest - In the above case, the bit mask clears bit 7 and leaves unchanged (ignores) bits 6 through 0. *2.7.2 Floating Point Data Type* = allows very large and very tiny numbers to be expressed at the expense of reducing the number of binary digits of precision - the LC-3 uses the 16-bit, 2's complement data type, which provides, in addition to one bit to identify positive or negative, 15 bits to represent the magnitude of the value. - With 16 bits used in this way, we can express values between -32,768 and +32,767, that is, between -2^15 and +2^15 - 1. - Instead of using all the bits (except the sign bit) to represent the precision of a value, the floating point data type allocates some of the bits to the range of values (i.e., how big or small) that can be expressed. The rest of the bits (except for the sign bit) are used for precision. - Recall that the floating point data type is very much like the scientific notation you learned in high school, for example 6.023 • 10^23. This representation has three parts: the sign, which is positive, the significant digits 6.023, and the exponent 23. We call the significant digits the fraction. 1) The actual exponent being represented is the unsigned number in the exponent field minus 127. For example, if the actual exponent is +8, the exponent field contains 10000111, which is the unsigned number 135. Note that 135 - 127 = +8. ex: If the actual exponent is -125, the exponent field contains 00000010, which is the unsigned number 2. Note that 2 - 127 = -125. 2) The sign bit: 0 for positive numbers, 1 for negative numbers. The formula contains the factor (-1)^s, which evaluates to +1 if s = 0, and -1 if s = 1. *2.7.3 ASCII Codes* - American Standard Code for Information Interchange (ASCII) = Another representation of information is the standard code that almost all computer equipment manufacturers have agreed to use for transferring character codes between the main computer processing unit and the input and output devices. - It (ASCII) greatly simplifies the interface between a keyboard manufactured by one company, a computer made by another company, and a monitor made by a third company. - Each key on the keyboard is identified by its unique ASCII code. - So, for example, the digit 3 expanded to 8 bits with a leading 0 is 00110011, the digit 2 is 00110010, the lowercase e is 01100101, and the carriage return is 00001101. - The ASCII code for the letter E is 01000101, and the ASCII code for the letter e is 01100101.
Both are associated with the same key, although in one case the Shift key is also depressed while in the other case, it is not. *2.7.4 Hexadecimal Notation* - a representation that is used more as a convenience for humans than as a data type to support operations being performed by the computer - it evolves nicely from the positional binary notation and is useful for dealing with long strings of binary digits without making errors. - It will be particularly useful in dealing with the LC-3, where 16-bit binary strings will be encountered often. - Since the symbols must represent values from 0 to 15, we assign symbols to these values as follows: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F. - If we first break a 16-bit string such as 0011110101101110 at four-bit boundaries (0011 1101 0110 1110) and then convert each four-bit string to its equivalent hex digit, we get 3D6E. - Hexadecimal can be used to represent binary strings that are integers or floating point numbers or sequences of ASCII codes, or bit vectors. - It simply reduces the number of digits by a factor of 4, where each digit is in hex (0, 1, 2, ..., F) instead of binary (0, 1).
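
A minimal C sketch of the BUSYNESS update from 2.7.1, shown only as an illustration (the variable names busyness and mask are mine, not from the text): it clears bit 7 with a logical AND against the bit mask 01111111 and prints the result in both binary and hexadecimal, tying 2.7.1 to the hex notation of 2.7.4.

    #include <stdio.h>

    /* Print the low 8 bits of v as a binary string, e.g. 11000010. */
    static void print_bits8(unsigned v) {
        for (int i = 7; i >= 0; i--)
            putchar((v >> i) & 1 ? '1' : '0');
    }

    int main(void) {
        unsigned busyness = 0xC2;           /* 11000010: units 7, 6, and 1 are free   */
        unsigned mask     = 0x7F;           /* 01111111: clears bit 7, ignores 6..0   */

        unsigned updated = busyness & mask; /* assign work to unit 7 -> mark it busy  */

        printf("before: "); print_bits8(busyness); printf(" (x%02X)\n", busyness);
        printf("after:  "); print_bits8(updated);  printf(" (x%02X)\n", updated);
        return 0;
    }

Compiled and run, it prints 01000010 (x42) for the updated vector, matching the example in the text.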

8.2 Input from the Keyboard

*8.2.1 Basic Input Registers (the KBDR and the KBSR)* - We have already noted that in order to handle character input from the keyboard, we need two things: 1) KBDR = a data register that contains the character to be input - is assigned to xFE02 (16 bits), even though a character needs only 8 bits 2) KBSR = a synchronization mechanism to let the processor know that input has occurred (the synchronization mechanism is contained in the status register associated with the keyboard) - KBSR is assigned to xFE00 (16 bits), even though the synchronization mechanism needs only the MSB - They are assigned addresses from the memory address space. *8.2.2 The Basic Input Service Routine* - KBSR[15] controls the synchronization of the slow keyboard and the fast processor - When a key on the keyboard is struck 1) the ASCII code for that key is loaded into KBDR[7:0] 2) the electronic circuits associated with the keyboard automatically set KBSR[15] to 1 3) when the LC-3 reads KBDR, the electronic circuits associated with the keyboard automatically clear KBSR[15], allowing another key to be struck - If KBSR[15] = 1, the ASCII code corresponding to the last key struck has not yet been read, and so the keyboard is disabled. Polling = if input/output is controlled by the processor (i.e., via polling), then a program can repeatedly test KBSR[15] until it notes that the bit is set - At that point, the processor can load the ASCII code contained in KBDR into one of the LC-3 registers - Since the processor only loads the ASCII code if KBSR[15] is 1, there is no danger of reading a single typed character multiple times - Furthermore, since the keyboard is disabled until the previous code is read, there is no danger of the processor missing characters that were typed - In this way, KBSR[15] provides the mechanism to guarantee that each key typed will be loaded exactly once. The following input routine loads R0 with the ASCII code that has been entered through the keyboard and then moves on to the NEXT_TASK in the program:

    START   LDI   R1, A         ; Test for
            BRzp  START         ; character input
            LDI   R0, B
            BRnzp NEXT_TASK     ; Go to the next task
    A       .FILL xFE00         ; Address of KBSR
    B       .FILL xFE02         ; Address of KBDR

- Note the use of the LDI instruction, which loads R1 with the contents of xFE00, the memory-mapped address of KBSR - If the Ready bit, bit [15], is clear, BRzp branches to START and another iteration of the loop takes place - When someone strikes a key, KBDR will be loaded with the ASCII code of that key and the Ready bit of KBSR will be set. This causes the branch to fall through and the third instruction (LDI R0, B) to be executed, because a 1 in the MSB makes the value loaded from KBSR negative, so the zero/positive branch is not taken - Again, note the use of the LDI instruction, which this time loads R0 with the contents of xFE02, the memory-mapped address of KBDR. The input routine is now done, so the program branches unconditionally to its NEXT_TASK. *8.2.3 Implementation of Memory-Mapped Input* - Memory-mapped input carries out the EXECUTE phase of the load instructions. Essentially three steps are required: 1. The MAR is loaded with the address of the memory location to be read. 2. Memory is read, resulting in MDR being loaded with the contents at the specified memory location. 3. The destination register (DR) is loaded with the contents of MDR.
In the case of memory-mapped input, the same set of steps are carried out, except 1) instead of MAR being loaded with the address of a memory location, MAR is loaded with the address of a device register 2) instead of the address control logic enabling memory to read, the address control logic selects the corresponding device register to provide input to the MDR
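
If the keyboard device registers were visible to a C program at their memory-mapped addresses, the polling loop above could be sketched as follows. This is only an illustration under that assumption: the macro names KBSR and KBDR mirror the text, the addresses xFE00 and xFE02 come from Section 8.2.1, and get_char_polled is a hypothetical helper, not a real LC-3 facility.

    #include <stdint.h>

    /* Hypothetical memory-mapped device registers (LC-3 addresses from the text). */
    #define KBSR (*(volatile uint16_t *)0xFE00)  /* keyboard status register */
    #define KBDR (*(volatile uint16_t *)0xFE02)  /* keyboard data register   */

    /* Poll until the Ready bit (bit 15) is set, then read one ASCII character. */
    uint16_t get_char_polled(void) {
        while ((KBSR & 0x8000) == 0)   /* spin while KBSR[15] is clear,        */
            ;                          /* i.e., no new key has been struck     */
        return KBDR & 0x00FF;          /* reading KBDR clears KBSR[15] in hardware */
    }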

8.3 Output to the Monitor

*8.3.1 Basic Output Registers (the DDR and the DSR)* - Output works in a way very similar to input, with DDR and DSR replacing the roles of KBDR and KBSR, respectively 1) DDR = stands for Display Data Register, which drives the monitor display - is assigned address xFE06 2) DSR = stands for Display Status Register - is assigned address xFE04 As is the case with input, even though an output character needs only eight bits and the synchronization mechanism needs only one bit, it is easier to assign 16 bits (like all memory addresses in the LC-3) to each output device register. - In the case of DDR, bits [7:0] are used for data, and bits [15:8] contain x00 - In the case of DSR, bit [15] contains the synchronization mechanism, that is, the Ready bit. *8.3.2 The Basic Output Service Routine* - DSR[15] controls the synchronization of the fast processor and the slow monitor display. 1) When the LC-3 transfers an ASCII code to DDR[7:0] for outputting, the electronics of the monitor automatically clear DSR[15] as the processing of the contents of DDR[7:0] begins. 2) When the monitor finishes processing the character on the screen, it (the monitor) automatically sets DSR[15] 3) This is a signal to the processor that it (the processor) can transfer another ASCII code to DDR for outputting. - As long as DSR[15] is clear, the monitor is still processing the previous character, so the monitor is disabled as far as additional output from the processor is concerned. - If input/output is controlled by the processor (i.e., via polling), then a program can repeatedly test DSR[15] until it notes that the bit is set, indicating that it is OK to write a character to the screen. The following routine causes the ASCII code contained in R0 to be displayed on the monitor:

    START   LDI   R1, A         ; Test to see if the
            BRzp  START         ; output register is ready
            STI   R0, B
            BRnzp NEXT_TASK
    A       .FILL xFE04         ; Address of DSR
    B       .FILL xFE06         ; Address of DDR

- The first two lines (LDI and BRzp) repeatedly poll DSR[15] to see if the monitor electronics is finished yet with the last character shipped by the processor - Note the use of LDI and the indirect access to xFE04, the memory-mapped address of DSR - As long as DSR[15] is clear, the monitor electronics is still processing this character, and BRzp branches to START for another iteration of the loop. - When the monitor electronics finishes with the last character shipped by the processor, it automatically sets DSR[15] to 1, which causes the branch to fall through and the STI instruction to be executed. - Note the use of the STI instruction, which stores R0 into xFE06, the memory-mapped address of DDR. - The write to DDR also clears DSR[15], disabling for the moment DDR from further output. - The monitor electronics takes over and writes the character to the screen. Since the output routine is now done, the program unconditionally branches (the final BRnzp) to its NEXT_TASK. *8.3.3 Implementation of Memory-Mapped Output* - In Chapter 5, you became familiar with the process of carrying out the EXECUTE phase of the store instructions. 1. The MAR is loaded with the address of the memory location to be written. 2. The MDR is loaded with the data to be written to memory. 3. Memory is written, resulting in the contents of MDR being stored in the specified memory location.
- In the case of memory-mapped output, the same steps are carried out, except 1) instead of MAR being loaded with the address of a memory location, MAR is loaded with the address of a device register 2) instead of the address control logic enabling memory to write, the address control logic asserts the load enable signal of DDR. Memory-mapped output also requires the ability to read output device registers. - Before the DDR could be loaded, the Ready bit had to be in state 1, indicating that the previous character had already been written to the screen. The LDI and BRzp instructions at the top of the output routine perform that test. - To do this the LDI reads the output device register DSR, and BRzp tests bit [15]. If the MAR is loaded with xFE04 (the memory-mapped address of the DSR), the address control logic selects DSR as the input to the MDR, where it is subsequently loaded into R1 and the condition codes are set. *8.3.4 Example: Keyboard Echo* - When we type at the keyboard, it is helpful to know exactly what characters we have typed. We can get this echo capability easily (without any sophisticated electronics) by simply combining the two routines we have discussed. The key typed at the keyboard is displayed on the monitor.

    START   LDI   R1, KBSR
            BRzp  START
            LDI   R0, KBDR
    ECHO    LDI   R1, DSR
            BRzp  ECHO
            STI   R0, DDR
            BRnzp NEXT_TASK
    KBSR    .FILL xFE00
    KBDR    .FILL xFE02
    DSR     .FILL xFE04
    DDR     .FILL xFE06
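
Under the same memory-mapped assumption used earlier, the echo of 8.3.4 can be sketched compactly in C: poll the keyboard, read the character, poll the display, write the character. The register macros and echo_one_char are hypothetical names used only for this illustration.

    #include <stdint.h>

    /* Hypothetical memory-mapped device registers (addresses from the text). */
    #define KBSR (*(volatile uint16_t *)0xFE00)
    #define KBDR (*(volatile uint16_t *)0xFE02)
    #define DSR  (*(volatile uint16_t *)0xFE04)
    #define DDR  (*(volatile uint16_t *)0xFE06)

    /* Keyboard echo: wait for a key, then wait for the display and echo it. */
    void echo_one_char(void) {
        while ((KBSR & 0x8000) == 0)   /* poll KBSR[15]: has a key been struck?   */
            ;
        uint16_t ch = KBDR & 0x00FF;   /* read the character (clears KBSR[15])    */

        while ((DSR & 0x8000) == 0)    /* poll DSR[15]: is the display ready?     */
            ;
        DDR = ch;                      /* write to DDR (hardware clears DSR[15])  */
    }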

5.6 The Data Path Revisited

- Filled-in arrowheads designate information that is processed. - Unfilled-in arrowheads designate control signals. Control signals emanate from the block labeled "Control." *5.6.1 Basic Components of the Data Path* 1) The Global Bus = consists of 16 wires and associated electronics - It allows one structure to transfer up to 16 bits of information to another structure by making the necessary electronic connections on the bus - Exactly one value can be transferred on the bus at one time. - Note that each structure that supplies values to the bus has a triangle just behind its input arrow to the bus - This triangle (called a tri-state device) allows the computer's control logic to enable exactly one supplier to provide information to the bus at any one time. - The structure wishing to obtain the value being supplied can do so by asserting its LD.x (load enable) signal - Not all computers have a single global bus 2) Memory = contains both instructions and data - Memory is accessed by loading the memory address register (MAR) with the address of the location to be accessed - If a load is being performed, control signals then read the memory, and the result of that read is delivered by the memory to the memory data register (MDR) - If a store is being performed, the data to be stored is first loaded into the MDR. Then the control signals specify that WE is asserted in order to store into that memory location. 3) The ALU and the Register File - The ALU is the processing element - It has two inputs, source 1 from a register and source 2 from either a register or the sign-extended immediate value provided by the instruction - The registers (R0 through R7) can provide two values, source 1, which is controlled by the 3-bit register number SR1, and source 2, which is controlled by the 3-bit register number SR2. - SR1 and SR2 are fields in the LC-3 operate instruction. - The results of an ALU operation are a value that is stored in one of the registers and the three single-bit condition codes - Also, note that the 16 bits supplied to the bus are also input to logic that determines whether that 16-bit quantity is negative, zero, or positive, and sets the three registers N, Z, and P accordingly 4) The PC and the PCMUX - The PC supplies via the global bus to the MAR the address of the instruction to be fetched at the start of the instruction cycle. The PC, in turn, is loaded via the three-to-one PCMUX, with the source depending on the instruction being executed.
The Three Inputs to the PCMUX 1) The rightmost input to the PCMUX is the incremented PC: during the FETCH phase of the instruction cycle, the PC is incremented and written back into the PC - If the current instruction is a conditional branch and the branch is taken, then the PC is loaded with the incremented PC + PCoffset (the 16-bit value obtained by sign-extending IR[8:0]) - Note that this addition takes place in the special adder and not in the ALU 2) The output of that adder is the middle input to PCMUX 3) The third input to PCMUX is obtained from the global bus 5) The MARMUX = controls which of two sources will supply the MAR with the appropriate address during the execution of a load, a store, or a TRAP instruction - memory is accessed by supplying the address to the MAR - The right input to the MARMUX is obtained by adding either the incremented PC or a base register to a literal value or zero supplied by the IR - The left input to MARMUX provides the zero-extended trapvector, which is needed to invoke service calls Two Control Signals 1) ADDR1MUX specifies the PC or base register 2) ADDR2MUX specifies which of four values is to be added *5.6.2 The Instruction Cycle* We complete our tour of the LC-3 data path by following the flow through an instruction cycle. Suppose the content of the PC is x3456 and the content of location x3456 is 0110011010000100. And suppose the LC-3 has just completed processing the instruction at x3455, which happened to be an ADD instruction. 1) Fetch = the instruction is obtained by accessing memory with the address contained in the PC - In the first cycle, the contents of the PC are loaded via the global bus into the MAR, and the PC is incremented and loaded into the PC - At the end of this cycle, the PC contains x3457. - In the next cycle (if memory can provide information in one cycle), the memory is read, and the instruction 0110011010000100 is loaded into the MDR - In the next cycle, the contents of the MDR are loaded into the instruction register (IR), completing the FETCH phase. 2) Decode = the contents of the IR are decoded, resulting in the control logic providing the correct control signals (unfilled arrowheads) to control the processing of the rest of this instruction - The opcode is 0110, identifying the LDR instruction. This means that the Base+offset addressing mode is to be used to determine the address of data to be loaded into the destination register R3 3) Evaluate Address - In the next cycle, the contents of R2 (the base register) and the sign-extended bits [5:0] of the IR are added and supplied via the MARMUX to the MAR - The SR1 field specifies 010, the register to be read to obtain the base address - ADDR1MUX selects the SR1 output, and ADDR2MUX selects the second from the right source (SEXT[5:0]) 4) Operand Fetch - In the next cycle (or more than one, if memory access takes more than one cycle), the data at that address is loaded into the MDR. 5) Execute - The LDR instruction does not require an EXECUTE phase, so this phase takes zero cycles. 6) Store Result - In the last cycle, the contents of the MDR are loaded into R3. The DR control field specifies 011, the register to be loaded.
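
The DECODE step above can be checked with a small C sketch that extracts the same fields the control logic does from the instruction word 0110011010000100 (x6684). The helper sext is my own name for the sign-extension step; the field boundaries are the ones given in the text.

    #include <stdio.h>

    /* Sign-extend the low 'bits' bits of value (2's complement). */
    static int sext(unsigned value, int bits) {
        unsigned m = 1u << (bits - 1);
        value &= (1u << bits) - 1u;
        return (int)(value ^ m) - (int)m;
    }

    int main(void) {
        unsigned ir     = 0x6684;            /* 0110 011 010 000100 from the text */
        unsigned opcode = (ir >> 12) & 0xF;  /* 0110 -> LDR                       */
        unsigned dr     = (ir >> 9) & 0x7;   /* 011  -> destination R3            */
        unsigned baser  = (ir >> 6) & 0x7;   /* 010  -> base register R2          */
        int      off6   = sext(ir, 6);       /* 000100 -> +4                      */

        printf("opcode=x%X  DR=R%u  BaseR=R%u  offset6=%+d\n", opcode, dr, baser, off6);
        return 0;
    }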

4.1 Basic Components

- To get a task done by a computer, we need two things: a computer program that specifies what the computer must do to complete the task, and the computer itself that is to carry out the task. 1) Computer Program = consists of a set of instructions, each specifying a well-defined piece of work for the computer to carry out 2) Instruction = is the smallest piece of work specified in a computer program - That is, the computer either carries out the work specified by an instruction or it does not. - The computer does not have the luxury of carrying out a piece of an instruction. *Von Neumann Model* 1) Memory - The computer program is contained in the computer's memory 2) Processing Unit 3) Input 4) Output 5) Control Unit - The control of the order in which the instructions are carried out is performed by the control unit *4.1.1 Memory* - Recall that in Chapter 3 we examined a simple 2^2-by-3-bit memory that was constructed out of gates and latches. A more realistic memory for one of today's computer systems is 2^28 by 8 bits. That is, a typical memory in today's world of computers consists of 2^28 distinct memory locations, each of which is capable of storing 8 bits of information. - *We say that such a memory has an address space of 2^28 uniquely identifiable locations, and an addressability of 8 bits.* - We refer to such a memory as a 256-megabyte memory (abbreviated, 256MB). The "256 mega" refers to the 2^28 locations, and the "byte" refers to the 8 bits stored in each location. - We will see that the memory address space of the LC-3 is 2^16, and the addressability is 16 bits. - Two Characteristics of a Memory Location 1) its address (accessed via the MAR) 2) what is stored at that address (accessed via the MDR) Memory's Address Register (MAR) = To read the contents of a memory location, we first place the address of that location in the memory's address register (MAR), and then interrogate the computer's memory Memory's Data Register (MDR) = the information stored in the location having that address will be placed in the memory's data register To write (or store) a value in a memory location, we: 1) write the address of the memory location in the MAR 2) write the value to be stored in the MDR 3) then interrogate the computer's memory with the Write Enable signal asserted 4) The information contained in the MDR will be written into the memory location whose address is in the MAR. ex: the value 6 is stored in the memory location whose address is 4, and the value 4 is stored in the memory location whose address is 6. (The value stored in that location can be changed, but the location's memory address remains unchanged.) *4.1.2 Processing Unit* = the actual processing of information in the computer - consists of many sophisticated complex functional units, each performing one particular operation (divide, square root, etc.)
Arithmetic and Logic Unit (ALU) = the simplest processing unit, and the one normally thought of when discussing the basic von Neumann model - it is usually capable of performing basic arithmetic functions (like ADD and SUBTRACT) and basic logic operations (like bit-wise AND, OR, and NOT) - The size of the quantities normally processed by the ALU is often referred to as the word length of the computer, and each element of that size is referred to as a word - in the LC-3, the ALU processes 16-bit quantities - we say the LC-3 has a word length of 16 bits - The most common form of temporary storage is a set of registers - Typically, the size of each register is identical to the size of values processed by the ALU, that is, they each contain one word. - The LC-3 has eight registers (R0, R1, ..., R7), each containing 16 bits *4.1.3 Input and Output (Peripherals)* - In order for a computer to process information, the information must get into the computer. In order to use the results of that processing, those results must be displayed in some fashion outside the computer. - Many devices exist for the purposes of input and output. The Two Most Basic Input and Output Devices - Input: keyboard (others include: mouse, digital scanners, and floppy disks) - Output: monitor (others include: printers, LED displays, and disks) *4.1.4 Control Unit* = is like the conductor of an orchestra; it is in charge of making all the other parts play together - it is the control unit that keeps track of both where we are within the process of executing the program and where we are in the process of executing each instruction. 1) Instruction Register (IR) = contains the instruction being executed; it is used to keep track of which instruction is being executed 2) Program Counter (PC)/Instruction Pointer (the register that holds the next instruction's address) = keeps track of which instruction is to be processed next
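
A toy C sketch of the MAR/MDR read and write sequences described above, using the example values from this section (6 stored at address 4, 4 stored at address 6). The array memory and the functions mem_read/mem_write are illustrative names, not part of any real machine interface.

    #include <stdint.h>
    #include <stdio.h>

    /* A tiny 8-location, 16-bit memory driven by a MAR and an MDR. */
    static uint16_t memory[8];
    static uint16_t mar, mdr;

    static void mem_read(void)  { mdr = memory[mar]; }   /* interrogate memory           */
    static void mem_write(void) { memory[mar] = mdr; }   /* with Write Enable asserted   */

    int main(void) {
        mar = 4; mdr = 6; mem_write();   /* store the value 6 at address 4 */
        mar = 6; mdr = 4; mem_write();   /* store the value 4 at address 6 */

        mar = 4; mem_read();             /* read address 4                 */
        printf("mem[4] = %u\n", mdr);    /* prints 6                       */
        return 0;
    }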

5.3 Data Movement Instructions

Data movement instructions move information between the general purpose registers and memory, and between the registers and the input/output devices - In this chapter, we will confine ourselves to moving information between memory and the general purpose registers. 1) Load = process of moving information from memory to a register 2) Store = the process of moving information from a register to memory - In both cases, the information in the location containing the source operand remains unchanged - In both cases, the location of the destination operand is overwritten with the source operand, destroying the prior value in the destination location in the process. - Data movement instructions require two operands, a source and a destination. - The source is the data to be moved; the destination is the location where it is moved to. One of these locations is a register, the second is a memory location or an input/output device The LC-3 Contains 7 Instructions That Move Information 1) LD (0010) [15:12] = opcode, [11:9] = DR, [8:0] = address generation bits - DR refers to the destination register that will contain the value after it is read from memory - Address Generation Bits = bits [8:0] encode information that is used to compute the 16-bit address of the second operand 2) LDR 3) LDI 4) LEA 5) ST (0011) [15:12] = opcode, [11:9] = SR, [8:0] = address generation bits - SR refers to the register that contains the value that will be written to memory - Address Generation Bits = bits [8:0] encode information that is used to compute the 16-bit address of the second operand 6) STR 7) STI *4 Addressing Modes = 4 ways to interpret bits [8:0] (Stated Below)* *5.3.1 PC-Relative Mode* - LD (opcode = 0010) and ST (opcode = 0011) specify the *PC-relative addressing mode* - The memory address is computed by sign-extending bits [8:0] of the instruction to 16 bits and adding the result to the incremented PC - The address of the memory operand can only be within +256 or -255 locations of the LD or ST instruction since the PC is incremented before the offset is added. This is the range provided by the sign-extended value contained in bits [8:0] of the instruction. Data Path Relevant to Execution of LD (0010) => loads mem[address] ex: if the instruction is located at x4018, execution of this instruction results in the contents of x3FC8, which is 5, being loaded into the destination register 1) the incremented PC (x4019) is added to the sign-extended value contained in IR[8:0] (xFFAF, the sign-extension of the 9-bit value x1AF), and the result (x3FC8) is loaded into the MAR 2) memory is read and the contents of x3FC8 are loaded into the MDR 3) Suppose the value stored in x3FC8 is 5. In step 3, the value 5 is loaded into the destination register, completing the instruction cycle. *5.3.2 Indirect Mode* - LDI (opcode = 1010) and STI (opcode = 1011) specify the *indirect addressing mode* - An address is first formed exactly the same way as with LD and ST. However, instead of this address *being* the address of the operand to be loaded or stored, it *contains* the address of the operand to be loaded or stored - *Note that the address of the operand can be anywhere in the computer's memory, not just within the range provided by bits [8:0] of the instruction as is the case for LD and ST* - The destination register for the LDI and the source register for STI, like all the other loads and stores, are specified in bits [11:9] of the instruction.
Data Path Relevant to the Execution of LDI => loads mem[mem[address]] If the instruction is in x4A1B, and the contents of x49E8 is x2110, execution of this instruction results in the contents of x2110 being loaded into the destination register 1) adding the incremented PC (x4A1C) to the sign-extended value contained in IR[8:0] (xFFCC), and the result (x49E8) is loaded into the MAR 2) memory is read and the contents of x49E8 (x2110) is loaded into the MDR 3) since x2110 is not the operand, but the address of the operand, it is loaded into the MAR 4) memory is again read, and the MDR again loaded. This time the MDR is loaded with the contents of x2110. 5) Suppose the value -1 is stored in memory location x2110. In step 5, the contents of the MDR (i.e., -1) are loaded into R3, completing the instruction cycle. *5.3.3 Base+offset Mode* - LDR (opcode = 0110) and STR (opcode = 0111) specify the *Base+offset addressing mode* - The Base+offset mode is so named because the address of the operand is obtained by adding a sign-extended 6-bit offset to a base register - The 6-bit offset is literally taken from the instruction, bits [5:0]. - The base register is specified by bits [8:6] of the instruction. - The Base+offset addressing uses the 6-bit value as a 2's complement integer between -32 and +31. Thus it must first be sign-extended to 16 bits before it is added to the base register. Data Path Relevant to the Execution of LDR => loads mem[BaseR + offset] If the base register IR[8:6] contains the 16-bit quantity x2345, the instruction loads the destination register IR[11:9] with the contents of x2362 1) the contents of the base register IR[8:6] (x2345) are added to the sign-extended value contained in IR[5:0] (x001D), and the result (x2362) is loaded into the MAR 2) memory is read, and the contents of x2362 are loaded into the MDR 3) Suppose the value stored in memory location x2362 is x0F0F. Finally, the contents of the MDR (in this case, x0F0F) are loaded into the destination register - *Note that the Base+offset addressing mode also allows the address of the operand to be anywhere in the computer's memory* *5.3.4 Immediate Mode* - LEA (opcode = 1110) loads the register specified by bits [11:9] of the instruction with the value formed by adding the incremented program counter to the sign-extended bits [8:0] of the instruction. - The fourth and last addressing mode used by the data movement instructions is the immediate (or, literal) addressing mode - It is used only with the load effective address (LEA) instruction - The immediate addressing mode is so named because the operand to be loaded into the destination register is obtained immediately, that is, without requiring any access of memory - The LEA instruction is useful to initialize a register with an address that is very close to the address of the instruction doing the initializing - *Note that no access to memory is required to obtain the value to be loaded.* Data Path Relevant to the Execution of LEA If memory location x4018 contains the instruction LEA R5, #-3, and the PC contains x4018, R5 will contain x4016 after the instruction at x4018 is executed - LEA is the only load instruction that does not access memory to obtain the information it will load into the DR. 1) It loads into the DR the address formed from the incremented PC and the address generation bits of the instruction.
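
The three effective-address calculations worked out above can be reproduced with a short C sketch; the sext and wrap16 helpers are my own, and the constants are the example values from the text (LD at x4018 with offset x1AF, LDI at x4A1B with offset x1CC, LDR with base x2345 and offset x1D).

    #include <stdio.h>

    /* Sign-extend the low 'bits' bits of value (2's complement). */
    static int sext(unsigned value, int bits) {
        unsigned m = 1u << (bits - 1);
        value &= (1u << bits) - 1u;
        return (int)(value ^ m) - (int)m;
    }

    /* Keep addresses within the LC-3's 16-bit address space. */
    static unsigned wrap16(int x) { return (unsigned)x & 0xFFFF; }

    int main(void) {
        /* LD at x4018, PCoffset9 = x1AF: address = incremented PC + SEXT(offset) */
        unsigned ld_addr  = wrap16(0x4018 + 1 + sext(0x1AF, 9));   /* x3FC8 */

        /* LDI at x4A1B, PCoffset9 = x1CC: this location holds the operand's address */
        unsigned ldi_addr = wrap16(0x4A1B + 1 + sext(0x1CC, 9));   /* x49E8 */

        /* LDR with BaseR = x2345 and offset6 = x1D: address = BaseR + SEXT(offset) */
        unsigned ldr_addr = wrap16(0x2345 + sext(0x1D, 6));        /* x2362 */

        printf("LD: x%04X  LDI: x%04X  LDR: x%04X\n", ld_addr, ldi_addr, ldr_addr);
        return 0;
    }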

5.1 The ISA: Overview

The ISA specifies all the information about the computer that the software has to be aware of. - In other words, the ISA specifies everything in the computer that is available to a programmer when he/she writes programs in the computer's own machine language - the ISA also specifies everything in the computer that is available to someone who wishes to translate programs written in a high-level language like C or Pascal or Fortran or COBOL into the machine language of the computer. - The ISA specifies the memory organization, register set, and instruction set, including opcodes, data types, and addressing modes *5.1.1 Memory Organization* - The LC-3 memory has an address space of 2^16 (i.e., 65,536) locations, and an addressability of 16 bits (not all 65,536 addresses are actually used for memory locations) - Since the normal unit of data that is processed in the LC-3 is 16 bits, we refer to 16 bits as one word, and we say the LC-3 is word-addressable. *5.1.2 Registers* - Since it usually takes far more than one machine cycle to obtain data from memory, the LC-3 provides (like almost all computers) additional temporary storage locations that can be accessed in a single machine cycle - Most Common Type of Temporary Storage Location = the general purpose register set (this is the type used in the LC-3) - Each register in the set is called a general purpose register (GPR) - Registers have the same property as memory locations in that they are used to store information that can be retrieved later - The number of bits stored in each register is usually one word - ex: the eight values 1, 3, 5, 7, -2, -4, -6, and -8 could be stored in R0, ..., R7, respectively *5.1.3 The Instruction Set* - An instruction is made up of two things, its opcode (what the instruction is asking the computer to do) and its operands (who the computer is expected to do it to) - The instruction set of an ISA is defined by its set of opcodes, data types, and addressing modes. The addressing modes determine where the operands are located. - ex: for an ADD instruction, the operation the instruction is asking the computer to perform is 2's complement integer addition, and the locations where the computer is expected to find the operands are the general purpose registers *5.1.4 Opcodes* - Some ISAs have a very large set of opcodes, one for each of a large number of tasks that a program may wish to carry out - Other ISAs have a very small set of opcodes. Some ISAs have specific opcodes to help with processing scientific calculations. For example, the Hewlett Packard Precision Architecture has an instruction that performs a multiply, followed by an add (A * B) + C on three source operands. - Still other ISAs have specific opcodes to help with handling the tasks of the operating system. - The LC-3 ISA has 15 instructions, each identified by its unique opcode. The opcode is specified by bits [15:12] of the instruction.
Since four bits are used to specify the opcode, 16 distinct opcodes are possible - The code 1101 has been left unspecified, reserved for some future need that we are not able to anticipate today Three Different Types of Opcodes 1) Operates = process information 2) Data Movement = move information between memory and the registers and between registers/memory and input/output devices 3) Control = change the sequence of instructions that will be executed *5.1.5 Data Types* = a representation of information such that the ISA has opcodes that operate on that representation - If the ISA has an opcode that operates on information represented by a data type, then we say the ISA supports that data type - the only data type supported by the ISA of the LC-3: 2's complement integers *5.1.6 Addressing Modes* = is a mechanism for specifying where the operand is located - Three Places Operands Can Be Found 1) In Memory 2) In a Register 3) As a Part of the Instruction (literal or immediate operand) - LC-3 Supports Five Addressing Modes 1) Immediate (or literal) 2) Register 3) Three Memory Addressing Modes (PC-relative, indirect, and Base+offset) Operate Instructions Use Two Addressing Modes 1) Immediate (or literal) 2) Register Data Movement Instructions Use All Five Addressing Modes 1) Immediate (or literal) 2) Register 3) Three Memory Addressing Modes (PC-relative, indirect, and Base+offset) *5.1.7 Condition Codes (NZP)* - The LC-3 has three single-bit registers that are set (set to 1) or cleared (set to 0) each time one of the eight general purpose registers is written - The three single-bit registers are called N, Z, and P, corresponding to their meaning: negative, zero, and positive. - Each time a GPR is written, the N, Z, and P registers are individually set to 0 or 1, corresponding to whether the result written to the GPR is negative, zero, or positive - Each of the three single-bit registers is referred to as a condition code because the condition of that bit can be used by one of the control instructions to change the execution sequence
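
A small C sketch of the condition-code rule in 5.1.7, assuming a 16-bit 2's complement result; setcc is a hypothetical helper name, not an LC-3 facility.

    #include <stdint.h>
    #include <stdio.h>

    /* Condition codes: set every time a general purpose register is written (5.1.7). */
    static void setcc(uint16_t result, int *n, int *z, int *p) {
        int16_t v = (int16_t)result;  /* interpret the 16-bit result as 2's complement */
        *n = (v < 0);
        *z = (v == 0);
        *p = (v > 0);                 /* exactly one of N, Z, P ends up 1 */
    }

    int main(void) {
        int n, z, p;
        setcc(0xFFFF, &n, &z, &p);              /* writing -1 to a GPR   */
        printf("N=%d Z=%d P=%d\n", n, z, p);    /* prints N=1 Z=0 P=0    */
        return 0;
    }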

1.5 Two Very Important Ideas

Idea 1 = all computers (the biggest and the smallest, the fastest and the slowest, the most expensive and the cheapest) are capable of computing exactly the same things if they are given enough time and enough memory - anything a fast computer can do, a slow computer can do also - a more expensive computer cannot figure out something that a cheaper computer is unable to figure out as long as the cheap computer can access enough memory Idea 2 = we describe our problems in English or some other language spoken by people. Yet the problems are solved by electrons running around inside the computer. It is necessary to transform our problem from the language of humans to the voltages that influence the flow of electrons. This transformation is really a sequence of systematic transformations, developed and improved over the last 50 years, which combine to give the computer the ability to carry out what appears to be some very complicated tasks. In reality, these tasks are simple and straightforward.

5.4 Control Instructions

LC-3 Has Five Opcodes That Enable Sequential Flow to Be Broken 1) Conditional Branch 2) Unconditional Jump 3) Subroutine (sometimes called function) Call 4) TRAP = it allows a programmer to get information into and out of the computer without fully understanding the intricacies of the input and output devices 5) Return from Interrupt *5.4.1 Conditional Branches (0000)* - Bits [11], [10], and [9] correspond to the three condition codes - all instructions that write values into the general purpose registers set the three condition codes in accordance with whether the value written is negative, zero, or positive (These instructions are ADD, AND, NOT, LD, LDI, LDR, and LEA.) - The condition codes are used by the conditional branch instruction to determine whether to change the instruction flow The instruction cycle is as follows 1) FETCH and DECODE are the same for all instructions. The PC is incremented during FETCH. 2) The EVALUATE ADDRESS phase is the same as that for LD and ST: the address is computed by adding the incremented PC to the 16-bit value formed by sign-extending bits [8:0] of the instruction. 3) During the EXECUTE phase, the processor examines the condition codes whose corresponding bits in the instruction are 1 - That is, if bit [11] is 1, condition code N is examined. If bit [10] is 1, condition code Z is examined. If bit [9] is 1, condition code P is examined - If any of bits [11:9] are 0, the corresponding condition codes are not examined - If any of the condition codes that are examined are in state 1, then the PC is loaded with the address obtained in the EVALUATE ADDRESS phase. - If none of the condition codes that are examined are in state 1, the PC is left unchanged Data Path Relevant to the Execution of BR - If the last value loaded into a general purpose register was 0 (so Z = 1), then the conditional branch in this example (located at x4027) would load the PC with x4101, and the next instruction executed would be the one at x4101, rather than the instruction at x4028 - If all three bits [11:9] are 1, then all three condition codes are examined - Since all three are examined, the PC is loaded with the address obtained in the EVALUATE ADDRESS phase. - We call this an unconditional branch since the instruction flow is changed unconditionally, that is, independent of the data that is being processed. *5.4.3 Two Methods for Loop Control* - Loop = describes a sequence of instructions that get executed again and again under some controlling mechanism Two Common Methods for Controlling the Number of Iterations of a Loop 1) the use of a counter 2) the use of a sentinel (this method is particularly effective if we do not know ahead of time how many iterations we will want to perform) ex: NEWLINE is a common sentinel for when to stop iterating - a sentinel could be a # or a *, that is, something that is not a number. Our loop test is simply a test for the occurrence of the sentinel. When we find it, we know we are done. *5.4.5 The JMP Instruction* - Since bits [8:0] specify a 2's complement integer, the next instruction executed after the conditional branch can be at most +256 or -255 locations from the branch instruction itself. What if we would like to execute next an instruction that is 1,000 locations from the current instruction? We cannot fit the value 1,000 into the 9-bit field; ergo, the conditional branch instruction does not work. - The LC-3 ISA does provide an instruction JMP (opcode = 1100) that can do the job.
- The JMP instruction loads the PC with the contents of the register specified by bits [8:6] of the instruction. ex: If this JMP instruction is located at address x4000, R2 contains the value x6600, and the PC contains x4000, then the instruction at x4000 (the JMP instruction) will be executed, followed by the instruction located at x6600. - Since registers contain 16 bits, the full address space of memory, the JMP instruction has no limitation on where the next instruction to be executed must reside. *5.4.6 The TRAP Instruction* - useful to get data into and out of the computer - TRAP (1111) = changes the PC to a memory address that is part of the operating system so that the operating system will perform some task on behalf of the program that is executing - the TRAP instruction invokes an operating system SERVICE CALL - Bits [7:0] of the TRAP instruction form the trapvector, which identifies the service call that the program wishes the operating system to perform. - Once the operating system is finished performing the service call, the program counter is set to the address of the instruction following the TRAP instruction, and the program continues - In this way, a program can, during its execution, request services from the operating system and continue processing after each such service is performed. Services Required for Now: 1) Input a character from the keyboard (trapvector = x23) 2) Output a character to the monitor (trapvector = x21) 3) Halt the program (trapvector = x25)
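
The range argument in 5.4.5 can be captured in a short C sketch: a conditional branch can reach its target only if the offset from the incremented PC fits in the 9-bit PCoffset field (-256 to +255). The function name fits_in_pcoffset9 is mine, used only for illustration.

    #include <stdio.h>

    /* Can a branch at 'br_addr' reach 'target' with a 9-bit PCoffset?          */
    /* The offset is added to the incremented PC, so it must lie in -256..+255. */
    static int fits_in_pcoffset9(unsigned br_addr, unsigned target) {
        int offset = (int)target - (int)(br_addr + 1);
        return offset >= -256 && offset <= 255;
    }

    int main(void) {
        printf("%d\n", fits_in_pcoffset9(0x4027, 0x4101));       /* 1: offset +217 fits    */
        printf("%d\n", fits_in_pcoffset9(0x4000, 0x4000 + 1000)); /* 0: 1,000 locations away; */
        return 0;                                                 /* use JMP instead          */
    }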

8.6 Implementation of Memory-Mapped I/O Revisited

We have also learned that in order to support interrupt-driven I/O, the two status registers must be writeable as well as readable. - The Address Control Logic block controls the input or output operation. Note that there are three inputs to this block. MIO.EN indicates whether a data movement from/to memory or I/O is to take place this clock cycle - MAR contains the address of the memory location or the memory-mapped address of an I/O device register. - R.W indicates whether a load or a store is to take place. - Depending on the values of these three inputs, the Address Control Logic does nothing (MIO.EN = 0), or provides the control signals to direct the transfer of data between the MDR and the memory or I/O registers. 1) If R.W indicates a load, the transfer is from memory or an I/O device register to the MDR: the Address Control Logic block provides the select lines to INMUX to source the appropriate I/O device register or memory (depending on MAR) and also enables the memory if MAR contains the address of a memory location. 2) If R.W indicates a store, the contents of the MDR are written either to memory or to one of the device registers: the Address Control Logic either enables a write to memory or it asserts the load enable line of the device register specified by the contents of the MAR.
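
A C sketch of the decision table implied by the three inputs described above (MIO.EN, R.W, and MAR). The function address_control and the returned strings are illustrative only; the xFE00-xFFFF range is the portion of the LC-3 address space reserved for device registers.

    #include <stdint.h>
    #include <stdio.h>

    /* Inputs: MIO.EN, R.W (1 = read/load, 0 = write/store), and the MAR contents. */
    static const char *address_control(int mio_en, int read, uint16_t mar) {
        if (!mio_en)
            return "do nothing";
        int is_device = (mar >= 0xFE00);   /* xFE00-xFFFF: device registers */
        if (read)
            return is_device ? "select device register onto INMUX into MDR"
                             : "enable memory read into MDR";
        else
            return is_device ? "assert load enable of the addressed device register"
                             : "enable memory write from MDR";
    }

    int main(void) {
        printf("%s\n", address_control(1, 1, 0xFE00));  /* load from KBSR       */
        printf("%s\n", address_control(1, 0, 0xFE06));  /* store to DDR         */
        printf("%s\n", address_control(1, 1, 0x3000));  /* ordinary memory read */
        return 0;
    }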

1.1 What We Will Try to Do

*CHAPTER 1* Like a house, we will start at the bottom, construct the foundation first, and then go on to add layers and layers, as we get closer and closer to what most people know as a full-blown computer - each time we add a layer, we will explain what we are doing. By the end, you will be able to write programs in a computer language such as C, using the sophisticated features of that language, and understand what is going on underneath, inside the computer

1.6 Computers as Universal Computational Devices

*Universal Computational Device* (Alan Turing) - When you think of a new kind of computation, you do not have to buy or design a new computer. You just give the old computer a new set of instructions (or program) to carry out the computation. This is why we say the computer is a universal computational device. *Analog Machines* = machines that produced an answer by measuring some physical quantity such as distance or voltage - problem: very hard to increase their accuracy *Digital Machines* = machines that perform computations by manipulating a fixed finite set of digits or letters - came to dominate computing - increase accuracy by adding more digits - when talking about computers in this book, we will always mean digital machines *Alan Turing* = proposed in 1937 that all computations could be carried out by a particular kind of machine, which is now called a Turing machine - Turing proposed that every computation can be performed by some Turing machine (Turing's thesis) - computers and universal Turing machines can compute anything that can be computed because they are programmable (this is the reason that a big or expensive computer cannot do more than a small, cheap computer)

CHAPTER 1

Welcome Aboard

CHAPTER 3: DIGITAL LOGIC STRUCTURES

In this chapter, we will explain how the MOS transistor works (as a logic element), show how these transistors are connected to form logic gates, and then show how logic gates are interconnected to form larger units that are needed to construct a computer. In Chapter 4, we will connect those larger units into a computer. But first, the transistor.

CHAPTER 5: THE LC-3

We are now ready to introduce a "real" computer, the LC-3. To be more nearly exact, we are ready to introduce the instruction set architecture (ISA) of the LC-3. - the ISA is the interface between what the software commands and what the hardware actually carries out

CHAPTER 8: INPUT/OUTPUT

We have chosen to study the keyboard as our input device and the monitor display as our output device. Not only are they the simplest I/O devices and the ones most familiar to us, but they have characteristics that allow us to study important concepts about I/O without getting bogged down in unnecessary detail.

8.5 Interrupt-Driven I/O

- interaction between the processor and an I/O device can be controlled by the processor (i.e., polling) or it can be controlled by the I/O device (i.e., interrupt driven) - In Sections 8.2, 8.3, and 8.4, we have studied several examples of polling. In each case, the processor tested the Ready bit of the status register, again and again, and when it was finally 1, the processor branched to the instruction that did the input or output operation. *8.5.1 What is Interrupt-Driven I/O?* The essence of interrupt-driven I/O is the notion that an I/O device that may or may not have anything to do with the program that is running can 1) force that program to stop 2) have the processor carry out the needs of the I/O device, and then 3) have the stopped program resume execution as if nothing had happened *Instruction execution flow for interrupt-driven I/O* - Program A executes instructions n, n+1, n+2. 1: The interrupt signal is detected and Program A is put into suspended animation. 2: The needs of the I/O device are carried out. 3: Program A is brought back to life and continues with instructions n+3, n+4, and so on. - As far as Program A is concerned, the work carried out and the results computed are no different from what would have been the case if the interrupt had never happened *8.5.2 Why Have Interrupt-Driven I/O?* - As is undoubtedly clear, polling requires the processor to waste a lot of time spinning its wheels, re-executing again and again the LDI and BR instructions until the Ready bit is set - With interrupt-driven I/O, none of that testing and branching has to go on - Interrupt-driven I/O allows the processor to spend its time doing what is hopefully useful work, executing some other program perhaps, until it is notified that some I/O device needs attention *8.5.3 Generation of the Interrupt Signal* - There are two parts to interrupt-driven I/O: 1) the enabling mechanism that allows an I/O device to interrupt the processor when it has input to deliver or is ready to accept output 2) the mechanism that manages the transfer of the I/O data. The two parts can be briefly described as: 1) generating the interrupt signal, which stops the currently executing process, and 2) handling the request demanded by this signal - to handle interrupt requests (part 2), the LC-3 uses a stack, and we will not get to stacks until Chapter 10. Part One: Several things must be true for an I/O device to actually interrupt the processor: 1) The I/O device must want service. 2) The device must have the right to request the service. 3) The device request must be more urgent than what the processor is currently doing. The Interrupt Signal From the Device - For an I/O device to generate an interrupt request, the first two elements in the previous list must be true: The device must want service, and it must have the right to request that service. 1) The first element we have discussed at length in the study of polling. It is the Ready bit of the KBSR or the DSR. - That is, if the I/O device is the keyboard, it wants service if someone has typed a character - If the I/O device is the monitor, it wants service (i.e., the next character to output) if the associated electronic circuits have successfully completed the display of the last character.
- In both cases, the I/O device wants service when the corresponding Ready bit is set. 2) The second element is an interrupt enable bit, which can be set or cleared by the processor, depending on whether or not the processor wants to give the I/O device the right to request service - In most I/O devices, this interrupt enable (IE) bit is part of the device status register. In the KBSR and DSR shown in Figure 8.7, the IE bit is bit [14]. *Interrupt Request From the I/O Device* = is the logical AND of the IE bit and the Ready bit - If the interrupt enable bit (bit [14]) is clear, it does not matter whether the Ready bit is set; the I/O device will not be able to interrupt the processor. In that case, the program will have to poll the I/O device to determine if it is ready. - If bit [14] is set, then interrupt-driven I/O is enabled. In that case, as soon as someone types a key (or as soon as the monitor has finished processing the last character), bit [15] is set. *The Importance of Priority* 3) The third element in the list of things that must be true for an I/O device to actually interrupt the processor is whether the request is sufficiently urgent - Every instruction that the processor executes, it does with a stated level of urgency. The term we give for the urgency of execution is priority. - The LC-3 has eight priority levels, PL0, ..., PL7. The higher the number, the more urgent the program. - If a program is running at one PL, and a higher-level PL request seeks access to the computer, the lower-priority program suspends processing until the higher-PL program executes and satisfies that more urgent request. - For our I/O device to successfully stop the processor and start an interrupt-driven I/O request, the priority of the request must be higher than the priority of the program it wishes to interrupt - We will see momentarily that the processor will stop executing its current program and service an interrupt request if the INT signal is asserted. *The Test for INT* - The final step in the first part of interrupt-driven I/O is the test to see if the processor should stop and handle an interrupt. - Recall from Chapter 4 that the instruction cycle sequences through the six phases of FETCH, DECODE, EVALUATE ADDRESS, FETCH OPERAND, EXECUTE, and STORE RESULT. - The additional logic to test for the interrupt signal is to replace that last sequential step of always going from STORE RESULT back to FETCH, as follows: The STORE RESULT phase is instead accompanied by a test for the interrupt signal INT. 1) If INT is not asserted, then it is business as usual, with the control unit returning to the FETCH phase to start processing the next instruction. 2) If INT is asserted, then the control unit does two things before returning to the FETCH phase a) First it saves enough state information to be able to return to the interrupted program where it left off. b) Second it loads the PC with the starting address of the program that is to carry out the requirements of the I/O device.
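
A C sketch of the interrupt-request logic in 8.5.3: the request is the AND of the Ready bit (bit [15]) and the interrupt enable bit (bit [14]), and INT is asserted only if the request's priority level exceeds that of the running program. The function names and the priority-level values used in main are illustrative assumptions, not part of a real interface.

    #include <stdio.h>

    /* Interrupt request from a device: the AND of its Ready bit (bit 15) and */
    /* its interrupt enable bit (bit 14), per the KBSR/DSR layout above.      */
    static int device_requests_interrupt(unsigned status_reg) {
        return ((status_reg >> 15) & 1) && ((status_reg >> 14) & 1);
    }

    /* INT is asserted only if the request's priority exceeds the running program's. */
    static int int_asserted(unsigned status_reg, int device_pl, int program_pl) {
        return device_requests_interrupt(status_reg) && (device_pl > program_pl);
    }

    int main(void) {
        /* Ready and IE both set (bits 15 and 14), device at PL4, program at PL0 */
        printf("%d\n", int_asserted(0xC000, 4, 0));  /* 1: interrupt taken         */
        printf("%d\n", int_asserted(0x8000, 4, 0));  /* 0: IE clear; must poll     */
        return 0;
    }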

2.1 Bits and Data Types

*2.1.1 The Bit as the Unit of Information* - it is much easier simply to detect whether or not a voltage exists between a pair of points in a circuit than it is to measure exactly what that voltage is - we represent the presence of a voltage as "1" and the absence of a voltage as "0" - we refer to each 0 and each 1 as a bit (a shortened form of binary digit) - the electronic circuits in the computer differentiate voltages close to 0 from voltages far from 0 (ex: voltage of 2.9 volts or a voltage of 0 volts => 2.9 volts signifies 1 and 0 volts signifies 0 => therefore, 2.6 volts will be taken as 1 and 0.2 volts will be taken as 0) - if we are limited to eight bits, we can differentiate at most only 256 (that is, 2^8) different values - In general, with k bits, we can distinguish at most 2^k distinct items (each pattern of these k bits is a code; that is, it corresponds to a particular value) *2.1.2 Data Types* - the number five can be written as a 5. This is the standard *decimal notation* that you are used to. - another notation for five is the *binary representation* 00000101. - *Data Type* = a particular representation is a data type if there are operations in the computer that can operate on information that is encoded in that representation (Each ISA has its own set of data types and its own set of instructions that can operate on those data types) *Two Main Data Types:* - 2's Complement Integers (for representing positive and negative integers that we wish to perform arithmetic on) - ASCII Codes (for representing characters on the keyboard that we wish to input to a computer or display on the computer's monitor)

2.2 Integer Data Types

*2.2.1 Unsigned Integers* - If we wish to perform a task some specific number of times, unsigned integers enable us to keep track of this number easily by simply counting how many times we have performed the task "so far." - Unsigned integers also provide a means for identifying different memory locations in the computer, in the same way that house numbers differentiate 129 Main Street from 131 Main Street. - We can represent unsigned integers as strings of binary digits. - You are familiar with the decimal number 329, which also uses positional notation. The 3 is worth much more than the 9, even though the absolute value of 3 standing alone is only worth 1/3 the value of 9 standing alone. This is because, as you know, the 3 stands for 300 (3 • 10^2) due to its position in the decimal string 329, while the 9 stands for 9 • 10^0. - *Unsigned Binary Representation* = works the same way as above except that the digits used are the binary digits 0 and 1, and the base is 2, rather than 10 ex: if we have five bits available to represent our values, the number 6 is represented as 00110, corresponding to (0 • 2^4) + (0 • 2^3) + (1 • 2^2) + (1 • 2^1) + (0 • 2^0) - With k bits, we can represent in this positional notation exactly 2^k integers, ranging from 0 to 2^k - 1. *2.2.2 Signed Integers, 1's Complement (Including Negative Quantities)* - it is often (although not always) necessary to be able to deal with negative quantities as well as positive - We could take our 2^k distinct patterns of k bits and separate them in half, half for positive numbers, and half for negative numbers ex: with five-bit codes, instead of representing integers from 0 to +31, we could choose to represent positive integers from +1 to +15 and negative integers from -1 to -15. - Since there are k bits, and we wish to use exactly half of the 2^k codes to represent the integers from 0 to 2^(k-1) - 1, all positive integers will have a leading 0 in their representation ex: (with k = 5), the largest positive integer +15 is represented as 01111. *1's Complement* = a negative number is represented by taking the representation of the positive number having the same magnitude (5 vs -5), and flipping all the bits *ex: +5 is represented as 00101; therefore we designate -5 as 11010*

2.3 2's Complement Integers

*2.3 2's Complement Integers* - with five bits, the 2's complement data type represents the integers from -16 to +15 - The choice of representations for the negative integers was based, as we said previously, on the wish to keep the logic circuits as simple as possible. - The addition of two binary strings is performed in the same way addition of two decimal strings is performed, from right to left, column by column. - If the addition in a column generates a carry, the carry is added to the column immediately to its left. - the binary ALU (arithmetic and logic unit) simply ADDs its two inputs; it does not know (or care) what the bit patterns represent - it would be nice if, when the ALU adds the representation for an arbitrary integer to the integer of the same magnitude and opposite sign, the sum is 0 ex: if the inputs to the ALU are the representations of non-zero integers A and -A, the output of the ALU should be 00000 *2's Complement* = a negative number is represented by taking the representation of the positive number having the same magnitude (5 vs -5), flipping all the bits, and adding 00001 (if it's 5 bits) *ex: since 00101 is the representation of +5, 11011 is chosen as the representation of -5* - This is sufficient to guarantee (as long as we do not get a result larger than +15 or smaller than -16) that the binary ALU will perform addition correctly. - because the carry obtained by adding 00001 to 11111 is ignored, the carry can always be ignored when dealing with 2's complement arithmetic (adding 00001 to 11111 = 00000; therefore, the carry is ignored -> it is NOT 100000) - With k = 5, we can uniquely identify 32 distinct quantities, and we have accounted for only 31 (15 + 15 + 1)
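
A C sketch of the 5-bit examples above: negate by flipping the bits and adding 1, then confirm that +5 plus -5 wraps to 00000 once the carry out of the leftmost bit is discarded. The MASK5 constant and neg5 are my own names for this illustration.

    #include <stdio.h>

    #define MASK5 0x1F   /* keep only 5 bits, as in the examples in the text */

    /* 2's complement negation in 5 bits: flip all the bits, then add 1. */
    static unsigned neg5(unsigned a) {
        return (~a + 1u) & MASK5;
    }

    int main(void) {
        unsigned five = 0x05;                 /* 00101 = +5           */
        unsigned neg  = neg5(five);           /* 11011 = -5           */
        unsigned sum  = (five + neg) & MASK5; /* carry out is ignored */

        printf("-5 is %02X, (+5) + (-5) = %02X\n", neg, sum);  /* prints 1B and 00 */
        return 0;
    }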

CHAPTER 4: THE VON NEUMANN MODEL

- We will build on the logic structures that we studied in Chapter 3, both decision elements and storage elements, to construct the basic computer model first proposed by John von Neumann in 1946.

2.5 Operations on Bits - Part 1: Arithmetic

*2.5.1 Addition and Subtraction* - Instead of generating a carry after 9 (since 9 is the largest decimal digit), we generate a carry after 1 (since 1 is the largest binary digit). - adding a number to itself (provided there are enough bits to represent the result) is equivalent to shifting the representation one bit position to the left. *2.5.2 Sign-Extension* = performed in order to be able to operate on bit patterns of different lengths. It does not affect the values of the numbers being represented. - It is often useful to represent a small number with fewer bits ex: rather than represent the value 5 as 0000000000000101, there are times when it is useful to allocate only six bits to represent the value 5 as 000101. - We obtained the negative representation from its positive counterpart by complementing the positive representation and adding 1. ex: the representation for -5, given that 5 is represented as 000101, is 111011. If 5 is represented as 0000000000000101, then the representation for -5 is 1111111111111011. - *In the same way that leading 0s do not affect the value of a positive number, leading 1s do not affect the value of a negative number.* - In order to add representations of different lengths, it is first necessary to represent them with the same number of bits. ex: suppose we wish to add the number 13 to -5, where 13 is represented as 0000000000001101 and -5 is represented as 111011. If we do not represent the two values with the same number of bits, the addition 0000000000001101 + 111011 is not well defined - if we understand that a 6-bit -5 and a 16-bit -5 differ only in the number of meaningless leading 1s, then we first extend the value of -5 to 16 bits before we perform the addition. Thus, we have 0000000000001101 + 1111111111111011 = 0000000000001000 and the result is +8, as we should expect. *2.5.3 Overflow* EX1) Suppose we wish to add +9 and +11: 01001 + 01011 = 10100 - The fact that the number is too large means that the number is larger than 01111, the largest positive number we can represent with a five-bit 2's complement data type. Note that because our positive result was larger than +15, it generated a carry into the leading bit position. But this bit position is used to indicate the sign of a value. Since we are adding two positive numbers, the result must be positive. Since the ALU has produced a result with a leading 1 (a negative value), something must be wrong. The thing that is wrong is that the sum of the two positive numbers is too large to be represented with the available bits. We say that the result has overflowed the capacity of the representation. EX2) Suppose we wish to add -12 and -6: 10100 + 11010 = 01110 - the leading bit position (0) is wrong since it indicates that the answer is positive; the correct result, -18, is "more negative" than -16 (the negative number with the largest allowable magnitude), so it cannot be represented in five bits
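The sign-extension and overflow ideas can be sketched in C as follows; the 6-bit and 5-bit widths match the examples above, and the helper names are assumptions:

    #include <stdio.h>
    #include <stdint.h>

    /* Sign-extend a 6-bit 2's complement value to 16 bits by copying bit 5 leftward. */
    uint16_t sext6to16(uint16_t x) {
        if (x & 0x0020)                 /* bit 5 set: the value is negative */
            return (uint16_t)(x | 0xFFC0);   /* fill bits 15..6 with 1s */
        return (uint16_t)(x & 0x003F);       /* positive: fill with 0s  */
    }

    /* Overflow: two addends with the same sign produce a sum with the opposite sign. */
    int overflow5(unsigned a, unsigned b) {
        unsigned sum = (a + b) & 0x1Fu;
        unsigned sa = a & 0x10u, sb = b & 0x10u, ss = sum & 0x10u;
        return (sa == sb) && (ss != sa);
    }

    int main(void) {
        printf("0x%04X\n", (unsigned)sext6to16(0x3B));   /* 111011 (-5) -> 0xFFFB        */
        printf("%d\n", overflow5(0x09u, 0x0Bu));         /* +9  + +11 -> 1 (overflow)    */
        printf("%d\n", overflow5(0x14u, 0x1Au));         /* -12 + -6  -> 1 (overflow)    */
        printf("%d\n", overflow5(0x0Du, 0x1Bu));         /* +13 + -5  -> 0 (no overflow) */
        return 0;
    }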

2.6 Operations on Bits - Part II: Logical Operations

*2.6.1 The AND Function* - AND is a binary logical function. This means it requires two pieces of input data. - AND requires two source operands. - The output of AND is 1 only if both sources have the value 1. Otherwise, the output is 0. We can think of the AND operation as the ALL operation; that is, the output is 1 only if ALL inputs are 1. Otherwise, the output is 0. *Truth Table* = consists of n+1 columns and 2^n rows (the first n columns correspond to the n source operands). - Since each source operand is a logical variable and can have one of two values, there are 2^n unique combinations of values that these source operands can take. - Each such set of values (sometimes called an input combination) is represented as one row of the truth table. The final column in the truth table shows the output for each input combination. - *Bit-Wise AND* = We can apply the logical operation AND to two bit patterns of m bits each. This involves applying the operation individually to each pair of bits in the two source operands. For example, if a and b in Example 2.6 are 16-bit patterns, then c is the bit-wise AND of a and b. - *Bit Mask* = a binary pattern that enables the bits of A to be separated into two parts: generally the part you care about and the part you wish to ignore. The mask 00000011 is said to mask out the values in bit positions 7 through 2. ex: If A is 01010110, the AND of A and the bit mask 00000011 is 00000010. If A is 11111100, the AND of A and the bit mask 00000011 is 00000000. *2.6.2 The OR Function* - OR is also a binary logical function. It requires two source operands, both of which are logical variables - The output of OR is 1 if any source has the value 1. Only if both sources are 0 is the output 0 - We can think of the OR operation as the ANY operation; that is, the output is 1 if ANY of the two inputs is 1. *2.6.3 The NOT Function* - NOT is a unary logical function. This means it operates on only one source operand. It is also known as the complement operation. The output is formed by complementing the input. - A 1 input results in a 0 output. A 0 input results in a 1 output. *2.6.4 The Exclusive-OR Function* - Exclusive-OR, often abbreviated XOR, is a binary logical function. It, too, requires two source operands, both of which are logical variables. The output of XOR is 1 if the two sources are different. The output is 0 if the two sources are the same.
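A short C illustration of the four operations applied bit-wise, using the A = 01010110 and mask = 00000011 values from the example above:

    #include <stdio.h>

    int main(void) {
        unsigned char a    = 0x56;   /* 01010110 */
        unsigned char mask = 0x03;   /* 00000011: keep bits 1..0, mask out bits 7..2 */

        printf("AND: 0x%02X\n", (unsigned)(a & mask));          /* 0x02 = 00000010 */
        printf("OR : 0x%02X\n", (unsigned)(a | mask));          /* 0x57 = 01010111 */
        printf("NOT: 0x%02X\n", (unsigned)(unsigned char)~a);   /* 0xA9 = 10101001 */
        printf("XOR: 0x%02X\n", (unsigned)(a ^ mask));          /* 0x55: 1 where exactly one input is 1 */
        return 0;
    }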

3.4 Basic Storage Elements

*3.4.1 The R-S Latch* - It can store one bit of information - Two 2-input NAND gates are connected such that the output of each is connected to one of the inputs of the other. - As long as the inputs S and R remain 1, the state of the circuit will not change (we say the R-S latch stores the value 1, the value of the output a) - We should also note that in order for the R-S latch to work properly, one must take care that it is never the case that both S and R are allowed to be set to 0 at the same time. If that does happen, the outputs a and b are both 1, and the final state of the latch depends on the electrical properties of the transistors making up the gates and not on the logic being performed. *3.4.2 The Gated D Latch* - it is necessary to control when a latch is set and when it is cleared - It consists of the R-S latch of Figure 3.18, plus two additional gates that allow the latch to be set to the value of D, but only when WE (write enable) is asserted. - When WE is not asserted (i.e., when WE equals 0), the outputs S and R are both equal to 1. (Since S and R are also inputs to the R-S latch, if they are kept at 1, the value stored in the latch remains unchanged, as we explained in Section 3.4.1) - if S is set to 0, the R-S latch is set to 1 - If R is set to 0, the R-S latch is set to 0 - Thus, the R-S latch is set to 1 or 0 according to whether D is 1 or 0. When WE returns to 0, S and R return to 1, and the value stored in the R-S latch persists. *3.4.3 Register* - The register is a structure that stores a number of bits, taken together as a unit (That number can be as large as is useful or as small as 1)
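A functional (not gate-level) C sketch of the gated D latch's behavior, assuming the simple rule "the stored bit follows D only while WE is 1"; the names are illustrative:

    #include <stdio.h>

    /* Functional model of a gated D latch: the stored bit changes only while WE is 1. */
    typedef struct { int q; } DLatch;

    void latch_tick(DLatch *l, int d, int we) {
        if (we)
            l->q = d;   /* write enabled: the latch follows D                        */
        /* else: S and R both stay 1 inside, so the stored value simply persists */
    }

    int main(void) {
        DLatch l = { 0 };
        latch_tick(&l, 1, 1);  printf("%d\n", l.q);   /* 1: WE asserted, D captured     */
        latch_tick(&l, 0, 0);  printf("%d\n", l.q);   /* 1: WE low, old value persists  */
        latch_tick(&l, 0, 1);  printf("%d\n", l.q);   /* 0: WE asserted again           */
        return 0;
    }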

7.3 The Assembly Process

*7.3.1 Introduction* - Before an LC-3 assembly language program can be executed, it must first be translated into a machine language program, that is, one in which each instruction is in the LC-3 ISA. It is the job of the LC-3 assembler to perform that translation. - If you have available an LC-3 assembler, you can cause it to translate your assembly language program into a machine language program by executing an appropriate command. *7.3.2 A Two-Pass Process* (two passes are needed so the assembler is not stuck when an instruction refers to a label that has not yet been encountered) 1) First Pass = identify the actual binary addresses corresponding to the symbolic names (or labels); this set of correspondences is known as the *symbol table* - we construct the symbol table 2) Second Pass = we translate the individual assembly language instructions into their corresponding machine language instructions. *7.3.3 The First Pass: Creating the Symbol Table* - the symbol table is simply a correspondence of symbolic names with their 16-bit memory addresses - We obtain these correspondences by passing through the assembly language program once, noting which instruction is assigned to which address, and identifying each label with the address of its assigned entry. - We keep track of the location assigned to each instruction by means of a location counter (LC). - The assembler examines each instruction in sequence and increments the LC once for each assembly language instruction. - If the instruction examined contains a label, a symbol table entry is made for that label, specifying the current contents of LC as its address. - The first pass terminates when the .END instruction is encountered. - At the conclusion of the first pass, the symbol table has the following entries: TEST = x3004, GETCHAR = x300B, OUTPUT = x300E, ASCII = x3012, PTR = x3013 *7.3.4 The Second Pass: Generating the Machine Language Program* - Second Pass = consists of going through the assembly language program a second time, line by line, this time with the help of the symbol table. - At each line, the assembly language instruction is translated into an LC-3 machine language instruction. - This time, when the assembler gets to line 0C, it can completely assemble the instruction since it knows that PTR corresponds to x3013. The instruction is LD, which has an opcode encoding of 0010. The destination register (DR) is R3, that is, 011. - At each step, the LC is incremented and the location specified by LC is assigned the translated LC-3 instruction or, in the case of .FILL, the value specified. When the second pass encounters the .END instruction, assembly terminates.
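A much-simplified first pass sketched in C, assuming one word per instruction and ignoring .ORIG/.BLKW/.STRINGZ sizing; the data layout and names are assumptions, not the LC-3 assembler's actual implementation:

    #include <stdio.h>
    #include <string.h>

    /* One symbol-table entry: a label and the address the LC assigned to it. */
    struct symbol { char name[21]; unsigned short address; };

    /* Minimal first pass: labels[i] is the label on line i ("" if none).  The
       location counter starts at the .ORIG operand and advances one word per line. */
    int first_pass(const char *labels[], int nlines, unsigned short orig,
                   struct symbol table[]) {
        unsigned short lc = orig;
        int nsyms = 0;
        for (int i = 0; i < nlines; i++) {
            if (labels[i][0] != '\0') {               /* this line carries a label   */
                strcpy(table[nsyms].name, labels[i]);
                table[nsyms].address = lc;            /* record the current LC       */
                nsyms++;
            }
            lc++;                                     /* every instruction is 1 word */
        }
        return nsyms;
    }

    int main(void) {
        /* A label on the fifth line of a program that .ORIGs at x3000. */
        const char *labels[] = { "", "", "", "", "TEST", "", "" };
        struct symbol table[8];
        int n = first_pass(labels, 7, 0x3000, table);
        for (int i = 0; i < n; i++)
            printf("%-10s x%04X\n", table[i].name, (unsigned)table[i].address);  /* TEST x3004 */
        return 0;
    }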

1.3 Two Recurring Themes

*Two Themes* 1) the notion of abstraction 2) the importance of not separating in your mind the notions of hardware and software *1.3.1 The Notion of Abstraction* - productivity enhancer - allows us to deal with a situation at a higher level, focusing on the essential aspects, while keeping the component ideas in the background - allows us to be more efficient in our use of time and brain activity - allows us to not get bogged down in the detail when everything about the detail is working just fine - our ability to abstract must be combined with our ability to un-abstract (ability to go from the abstraction back to its component parts) - if there is a problem with getting the logic circuit to work, it is often helpful to look at the internal structure of the gate and see if something about its functioning is causing the problem - goal: keep the level of abstraction as high as possible, consistent with getting everything to work effectively; when something breaks, be prepared to lower the level of abstraction and examine carefully the details of each component in order to uncover the problem *The Bottom Line* - abstractions allow us to be much more efficient in dealing with all kinds of situations - one can be effective without understanding what is below the abstraction as long as everything behaves nicely - if we have to combine multiple components into a larger system, we should be careful not to allow their abstractions to be the deepest level of our understanding. If we don't know the components below the level of their abstractions, then we are at the mercy of them working together without our intervention. If they don't work together, and we are unable to go below the level of abstraction, we are stuck. And that is the state we should take care not to find ourselves in. *1.3.2 Hardware Versus Software* - hardware = the physical computer and all the specifications associated with it - software = the programs, whether operating systems like UNIX or Windows, or database systems like Oracle or DB-terrific, or application programs like Excel or Word - there is an implication that it is OK to be an expert at one of these (hardware or software) and clueless about the other - hardware and software are names for components of two parts of a computing system that work best when they are designed by someone who took into account the capabilities and limitations of both *The Bottom Line* - whether your inclinations are in the direction of a computer hardware career or a computer software career, you will be much more capable if you master both

8.1 I/O Basics

*8.1.1 Device Registers* - The simplest I/O devices usually have at least two device registers: 1) to hold the data being transferred between the device and the computer 2) to indicate status information about the device. - An example of status information is whether the device is available or is still busy processing the most recent I/O task. *8.1.2 Memory-Mapped I/O versus Special Input/Output Instructions* - Most computers prefer to use the same data movement instructions that are used to move data in and out of memory. - Most computer designers prefer not to specify an additional set of instructions for dealing with input and output. - They use the same data movement instructions that are used for loading and storing data between memory and the general purpose registers. - Since programmers use the same data movement instructions that are used for memory, every input device register and every output device register must be uniquely identified in the same way that memory locations are uniquely identified. Therefore, each device register is assigned an address from the memory address space of the ISA. => Memory-Mapped I/O = the I/O device registers are mapped to a set of addresses that are allocated to I/O device registers rather than to memory locations Memory Location Addresses = x0000 to xFDFF Input/Output Device Register Addresses = xFE00 to xFFFF *8.1.3 Asynchronous versus Synchronous* - Most I/O is carried out at speeds very much slower than the speed of the processor. - A typist, typing on a keyboard, loads an input device register with one ASCII code every time he/she types a character. A computer can read the contents of that device register every time it executes a load instruction, where the operand address is the memory-mapped address of that input device register. - Most interaction between a processor and I/O is asynchronous (the two are not synchronized to a common clock; the keyboard and the processor each proceed at their own rate) To control processing in an asynchronous world requires some protocol or handshaking mechanism: 1) In the case of the keyboard, we will need a 1-bit status register, called a flag, to indicate if someone has or has not typed a character. 2) In the case of the monitor, we will need a 1-bit status register to indicate whether or not the most recent character sent to the monitor has been displayed. - these flags are the simplest form of synchronization. - A single flag, called the Ready bit, is enough to synchronize the output of the typist with the input to the processor - Each time the typist types a character, the Ready bit is set - Each time the computer reads a character, it clears the Ready bit - By examining the Ready bit before reading a character, the computer can tell whether it has already read the last character typed. (If the Ready bit is clear, no characters have been typed since the last time the computer read a character, and so no additional read would take place.) *The single Ready bit provides enough handshaking to ensure that the asynchronous transfer of information between the typist and the microprocessor can be carried out accurately.* *8.1.4 Interrupt-Driven versus Polling* (Who Controls the Interaction Between Processor and Typist?) - The processor, which is computing, and the typist, who is typing, are two separate entities. Each is doing its own thing. Still, they need to interact, that is, the data that is typed has to get into the computer.
Interrupt-Driven I/O = where the keyboard controls the interaction Polling = the processor controls the interaction by interrogating (usually, again and again) the Ready bit until it (the processor) detects that the Ready bit is set. - named polling since the Ready bit is polled by the processor, asking if any key has been struck.
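A C sketch of the polling handshake, with the keyboard status and data registers simulated as ordinary variables rather than the real memory-mapped locations in xFE00-xFFFF; the names KBSR/KBDR follow the usual LC-3 register names and are assumptions here:

    #include <stdio.h>
    #include <stdint.h>

    /* Simulated device registers (on a real LC-3 these would be memory-mapped). */
    static volatile uint16_t kbsr = 0;   /* bit 15 = Ready bit                        */
    static volatile uint16_t kbdr = 0;   /* low 8 bits = ASCII code of the key struck */

    /* The "typist": sets the data register and the Ready bit. */
    static void type_key(char c) {
        kbdr = (uint16_t)c;
        kbsr |= 0x8000;
    }

    /* Polling: the processor interrogates the Ready bit again and again until it is
       set, then reads the character and clears the Ready bit. */
    static char poll_keyboard(void) {
        while ((kbsr & 0x8000) == 0)
            ;                            /* Ready bit clear: no new character yet */
        kbsr &= (uint16_t)~0x8000u;      /* clear the Ready bit                   */
        return (char)(kbdr & 0x00FF);
    }

    int main(void) {
        type_key('a');                   /* simulate a keystroke */
        printf("read '%c'\n", poll_keyboard());
        return 0;
    }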

1.2 How We Will Get There

*CHAPTER 2* = the computer is a piece of electronic equipment and consists of electronic parts interconnected by wires - every wire in the computer, at every moment in time, is either at a high voltage or a low voltage - only care whether there is a large voltage relative to 0 volts - the absence or presence of a large voltage relative to 0 is represented as 0 or 1 - we will encode all information as sequences of 0s and 1s (ex: letter a = 01100001 or number 35 = 00100011) *CHAPTER 3* = see how the transistors that make up today's microprocessors work *CHAPTER 5* 1) Understand LC-3 (Simple Computer) *LC-3 OR LITTLE COMPUTER 3* (we started with LC-1 but needed two more shots at it before we got it right) = has all the important characteristics of the microprocessors that you may have already heard of, such as - *Intel 8088* = used in the first IBM PCs back in 1981 - *Motorola 68000* = used in the Macintosh, vintage 1984 - *Pentium IV* = one of the high-performance microprocessors of choice in the PC of the year 2003 *CHAPTER 6 & CHAPTER 7* 2) Program the LC-3 - in its own language (Chapter 6) - in assembly language, which is easier for humans to work with (Chapter 7) *CHAPTER 8* = deals with the problem of getting information into (input) and out of (output) the LC-3 *CHAPTER 9* = covers two sophisticated LC-3 mechanisms 1) TRAPs 2) subroutines *CHAPTER 10* = introduces two important concepts (stacks and data conversion) and then shows a sophisticated example: an LC-3 program that carries out the work of a handheld calculator *TURN OUR ATTENTION TO THE HIGH-LEVEL PROGRAMMING LANGUAGE C FOR THE LAST 10 CHAPTERS*

1.4 A Computer System

*Computer* = a mechanism that does two things 1) directs the processing of information (figuring out which task should get carried out next) 2) performs the actual processing of information (doing the actual additions, multiplications, and so forth that are necessary to get the job done) (does both in response to a computer program) *Central Processing Unit (CPU) or simply a processor* = a more precise term for a computer - this textbook is primarily about the processor and the programs that are executed by the processor - today, a processor usually consists of a single microprocessor chip, built on a piece of silicon material, measuring less than an inch square, and containing many millions of transistors *Computer System* = usually includes 1) the processor 2) a keyboard for typing commands 3) a mouse for clicking on menu entries 4) a monitor for displaying information that the computer system has produced 5) a printer for obtaining paper copies of that information 6) memory for temporarily storing information 7) disks and CD-ROMs of one sort or another for storing information for a very long time 8) the collection of programs (the software) that the user wishes to execute

1.7 How Do We Get the Electrons to Do the Work?

*Levels of Transformation* 1) Problems 2) Algorithms 3) Language 4) Machine (ISA) Architecture 5) Microarchitecture 6) Circuits 7) Devices - at each level we have choices - if we ignore any of the levels, our ability to make the best use of our computing system can be very adversely affected *1.7.1 The Statement of the Problem* - we describe the problems we wish to solve with a computer in a "natural language" - natural languages are languages that people speak, like English, French, Japanese, Italian, and so on - natural languages are fraught with attributes that make them unacceptable for providing instructions to a computer; the most important of these unacceptable attributes is ambiguity, and natural language is filled with ambiguity - to tell a computer to do something where there are multiple interpretations would cause the computer to NOT know which interpretation to follow. *1.7.2 The Algorithm* - the first step in the sequence of transformations is to transform the natural language description of the problem to an algorithm, and in so doing, get rid of the objectionable characteristics. - *An Algorithm* = a step-by-step procedure that is guaranteed to terminate, such that each step is precisely stated and can be carried out by the computer 1) *Definiteness* = describes the notion that each step is precisely stated (ex: A recipe for excellent pancakes that instructs the preparer to "stir until lumpy" lacks definiteness, since the notion of lumpiness is not precise) 2) *Effective Computability* = describes the notion that each step can be carried out by a computer (ex: A procedure that instructs the computer to "take the largest prime number" lacks effective computability, since there is no largest prime number) 3) *Finiteness* = describes the notion that the procedure terminates - For every problem there are usually many different algorithms for solving that problem *1.7.3 The Program* - The next step is to transform the algorithm into a computer program, in one of the programming languages that are available. Programming languages are "mechanical languages." - *Mechanical Languages* = did not evolve through human discourse; rather, they were invented for use in specifying a sequence of instructions to a computer (they do not suffer from failings such as ambiguity that would make them unacceptable for specifying a computer program) - there are more than 1,000 programming languages (some designed for use with particular applications) - *Two Kinds of Programming Languages* 1) high-level languages (machine independent) = are at a distance (a high level) from the underlying computer 2) low-level languages = are tied to the computer on which the programs will execute (the assembly language for that computer) *1.7.4 The ISA (Instruction Set Architecture)* - The next step is to translate the program into the instruction set of the particular computer that will be used to carry out the work of the program.
- ISA = the complete specification of the interface between programs that have been written and the underlying computer hardware that must carry out the work of those programs 1) specifies the set of instructions the computer can carry out, that is, what operations the computer can perform and what data is needed by each operation 2) specifies the mechanisms (addressing modes) that the computer can use to figure out where the operands are located 3) specifies the number of unique locations that comprise the computer's memory and the number of individual 0s and 1s that are contained in each location *Operand* = used to describe individual data values *Data Types* = a legitimate representation for an operand such that the computer can perform operations on that representation - The number of operations, data types, and addressing modes specified by an ISA varies among the different ISAs. *Compiler* = a translating program that translates a high-level language (such as C) to the ISA of the computer on which the program will execute (such as x86) *Assembler* = translates the assembly language unique to a computer into its ISA *ISA Examples* 1) x86, introduced by Intel Corporation in 1979 and currently also manufactured by AMD and other companies 2) PowerPC (IBM and Motorola) 3) PA-RISC (Hewlett-Packard) 4) SPARC (Sun Microsystems) *1.7.5 The Microarchitecture* - The next step is to transform the ISA into an implementation. *Microarchitecture* = the detailed organization of an implementation ex: x86 has been implemented by several different microprocessors over the years, each having its own unique microarchitecture. - Each implementation is an opportunity for computer designers to make different trade-offs between the cost of the microprocessor and the performance that microprocessor will provide - by analogy, the "microarchitecture" of a specific automobile is a result of the automobile designers' decisions regarding cost and performance. *1.7.6 The Logic Circuit* - The next step is to implement each element of the microarchitecture out of simple logic circuits. - there are choices, as the logic designer decides how to best make the trade-offs between cost and performance. ex: even for the simple operation of addition, there are several choices of logic circuits to perform this operation at differing speeds and corresponding costs *1.7.7 The Devices* - Finally, each basic logic circuit is implemented in accordance with the requirements of the particular device technology used. So, CMOS circuits are different from NMOS circuits, which are different, in turn, from gallium arsenide circuits. *1.7.8 Putting It Together* - Since we can't speak electron and they can't speak English, the best we can do is this systematic sequence of transformations. - At each level of transformation, there are choices as to how to proceed. Our handling of those choices determines the resulting cost and performance of our computer. - We show how *transistors* combine to form logic circuits, how *logic circuits* combine to form the microarchitecture, and how the *microarchitecture* implements a particular *ISA*, in our case, the *LC-3* - We complete the process by going from the English-language description of a problem to a C program that solves the problem, and we show how that C program is translated (i.e., compiled) to the ISA of the LC-3.

4.2 The LC-3: An Example von Neumann Machine

*Two Kinds of Arrowheads* 1) Filled In = denote data elements that flow along the corresponding paths - ex: the box labeled ALU in the processing unit processes two 16-bit values and produces a 16-bit result. The two sources and the result are all data, and are designated by filled-in arrowheads. 2) Not Filled In = denote control signals that control the processing of the data elements - ex: The operation performed on those two 16-bit data elements (it is labeled ALUK) is part of the control; therefore, it gets a not-filled-in arrowhead. *Memory* = consists of the storage elements, along with the MAR for addressing individual locations and the MDR for holding the contents of a memory location on its way to/from the storage. - MAR contains 16 bits (reflecting the fact that the memory address space of the LC-3 is 2^16 memory locations) - MDR contains 16 bits (reflecting the fact that each memory location contains 16 bits, i.e., the addressability of the LC-3 memory is 16 bits) *Input/Output* = consists of a keyboard and a monitor - Simplest Keyboard Requires Two Registers: 1) Keyboard Data Register (KBDR) = for holding the ASCII codes of keys struck 2) Keyboard Status Register (KBSR) = for maintaining status information about the keys struck - Simplest Monitor Requires Two Registers: 1) Display Data Register (DDR) = holding the ASCII code of something to be displayed on the screen 2) Display Status Register (DSR) = maintaining associated status information *The Processing Unit* = consists of a functional unit that can perform arithmetic and logic operations (ALU) and eight registers (R0, ..., R7) for storing temporary values that will be needed in the near future as operands for subsequent instructions - The LC-3 ALU can perform one arithmetic operation (addition) and two logical operations (bitwise AND and bitwise complement) *The Control Unit* = consists of all the structures needed to manage the processing that is carried out by the computer - Program Counter = a part of the control unit; it keeps track of the next instruction to be executed after the current instruction finishes Finite State Machine (most important structure of the Control Unit) - Inputs (filled-in arrowheads because they carry data) 1) CLK Input = processing is carried out step by step, or rather, clock cycle by clock cycle (the CLK input to the finite state machine specifies how long each clock cycle lasts) 2) Instruction Register (IR) = an input to the finite state machine since what LC-3 instruction is being processed determines what activities must be carried out - Outputs (not-filled-in arrowheads because these outputs control the processing) 1) ALUK (two bits) = controls the operation performed in the ALU (ADD, AND, or NOT) during the current clock cycle 2) GateALU = determines whether or not the output of the ALU is provided to the processor bus during the current clock cycle

3.5 The Concept of Memory

- *Address* = the unique identifier associated with each memory location - *Addressability* = number of bits of information stored in each location For example, an advertisement for a personal computer might say, "This computer comes with 16 megabytes of memory." Actually, most ads generally use the abbreviation 16 MB. This statement means, as we will explain momentarily, that the computer system includes 16 million memory locations, each containing 1 byte of information. *3.5.1 Address Space* - *Address Space* = the total number of uniquely identifiable locations - With n bits of address, we can uniquely identify 2^n locations *3.5.2 Addressability* - The number of bits stored in each memory location is the memory's addressability - A 16 megabyte memory is a memory consisting of 16,777,216 memory locations, each containing 1 byte (8 bits) of storage *3.5.3 A 2^2-by-3-Bit Memory* - the memory has an address space of four locations, and an addressability of 3 bits per location
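The 2^2-by-3-bit memory can be modeled with a few lines of C; this is an illustrative, array-backed sketch, with the masking standing in for the 2 address bits and the 3 bits of addressability:

    #include <stdio.h>

    /* A 2^2-by-3-bit memory: 2 address bits -> 4 locations, 3 bits stored in each. */
    #define ADDR_BITS 2
    #define NLOCS     (1u << ADDR_BITS)        /* address space = 2^n = 4 locations */

    static unsigned char mem[NLOCS];           /* only the low 3 bits of each entry are used */

    void mem_write(unsigned addr, unsigned value) {
        mem[addr & (NLOCS - 1)] = (unsigned char)(value & 0x7);   /* addressability = 3 bits */
    }

    unsigned mem_read(unsigned addr) {
        return mem[addr & (NLOCS - 1)];
    }

    int main(void) {
        mem_write(2, 0x5);                     /* store 101 at location 10 */
        printf("location 2 holds %u\n", mem_read(2));
        printf("address space = %u locations\n", NLOCS);
        return 0;
    }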

3.3 Combinational Logic Circuits (Decision Elements)

- 2 Kinds of Logic Circuits 1) Those that include storage of information 2) Those that DO NOT include storage of information (combinational logic circuits) - The outputs of combinational logic circuits are not at all dependent on any past history of information that is stored internally, since no information can be stored internally in a combinational logic circuit *3.3.1 Decoder* (no storage) - FUNCTION: useful in determining how to interpret a bit pattern - A decoder has the property that exactly one of its outputs is 1 and all the rest are 0s - n inputs - 2^n outputs - a 4-to-16 decoder is a simple combinational logic structure for identifying what work is to be performed by each instruction *3.3.2 MUX* - FUNCTION: to select one of the inputs and connect it to the output - the select signal/control line determines which input is connected to the output *3.3.3 Full Adder* - FUNCTION: find sum and carry - There are two results, the sum bit (S_i) and the carry over to the next column, carry_(i+1) *3.3.5 The Programmable Logic Array (PLA)* - FUNCTION: appropriately connecting AND gate outputs to OR gate inputs - consists of an array of AND gates (called an AND array) followed by an array of OR gates (called an OR array) - any logic function we wished to implement could be accomplished with a PLA
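The full adder and the 2-to-1 mux are easy to express in C at the bit level; a sketch with illustrative names:

    #include <stdio.h>

    /* One-bit full adder: inputs a, b, and carry-in; outputs the sum bit S_i
       and the carry into the next column, carry_(i+1). */
    void full_adder(int a, int b, int cin, int *sum, int *cout) {
        *sum  = a ^ b ^ cin;
        *cout = (a & b) | (a & cin) | (b & cin);
    }

    /* 2-to-1 mux: the select line determines which input is connected to the output. */
    int mux2(int in0, int in1, int select) {
        return select ? in1 : in0;
    }

    int main(void) {
        int s, c;
        full_adder(1, 1, 0, &s, &c);
        printf("1 + 1 + 0 -> sum=%d carry=%d\n", s, c);   /* sum=0 carry=1  */
        printf("mux2(7, 9, 1) = %d\n", mux2(7, 9, 1));    /* selects in1: 9 */
        return 0;
    }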

7.4 Beyond the Assembly of a Single Assembly Language Program

- Although it is still quite a large step from C or C++, assembly language does, in fact, save us a good deal of pain - We have also shown how a rudimentary two-pass assembler actually works to translate an assembly language program into the machine language of the LC-3 ISA. - our reason for teaching assembly language is not to deal with its sophistication, but rather to show its innate simplicity. *7.4.1 The Executable Image* = the entity being executed when a computer begins execution of a program - The executable image is created from modules often created independently by several different programmers. - The final step is to link all the object modules together into one executable image. During execution of the program, the FETCH, DECODE, ... instruction cycle is applied to instructions in the executable image.

3.6 Sequential Logic Circuits

- In this section, we discuss digital logic structures that can both process information (i.e., make decisions) and store information. - That is, these structures base their decisions not only on the input values now present, but also (and this is very important) on what has happened before - they contain storage elements that allow them to keep track of prior history information (feedback) - Sequential logic circuits are used to implement a very important class of mechanisms called *finite state machines.* - Finite State Machines are used as controllers of: 1) electrical systems 2) mechanical systems 3) aeronautical systems, and so forth ex: A traffic light controller sets the traffic light to red, yellow, or green depending on the light that is currently on (history information) and on input information from sensors such as trip wires on the road and optical devices that are monitoring traffic. *3.6.1 A Simple Example: The Combination Lock* - the lock stores the previous rotations and makes its decision (open or don't open) on the basis of the current input value (R3) and the history of the past operations. *3.6.2 The Concept of State* - The problem is that, at any one time, the only external input to the lock is the current rotation (looking back at the previous example) A. The lock is not open, and NO relevant operations have been performed. B. The lock is not open, but the user has just completed the R13 operation. C. The lock is not open, but the user has just completed R13, followed by L22. D. The lock is open. - We have labeled these four situations A, B, C, and D. We refer to each of these situations as the state of the lock *State* = The state of a system is a snapshot of all the relevant elements of the system at the moment the snapshot is taken. *Another example of the concept of state: Tic-Tac-Toe* - The game of tic-tac-toe can also be described in accordance with the notion of state. - Recall that the game is played by two people (or, in our case, a person and the computer). The state is a snapshot of the game in progress each time the computer asks the person to make a move *3.6.3 Finite State Machines* - We have seen that a state is a snapshot of all relevant parts of a system at a particular point in time. At other times, that system can be in other states. The behavior of a system can often be best understood by describing it as a finite state machine. - A finite state machine consists of five elements: 1. a finite number of states 2. a finite number of external inputs 3. a finite number of external outputs 4. an explicit specification of all state transitions 5. an explicit specification of what determines each external output value. - The set of states represents all possible situations (or snapshots) that the system can be in. Each state transition describes what it takes to get from one state to another. *The State Diagram* *The Clock* = a signal whose value alternates between 0 volts and some specified fixed voltage - the mechanism that triggers the transition from one state to the next. - In the case of the "sequential" combination lock, the mechanism is the completion of rotating the dial in one direction, and the start of rotating the dial in the opposite direction - in an electronic circuit implementation, the mechanism that triggers the transition from one state to the next is a clock circuit.
- In digital logic terms, a clock is a signal whose value alternates between 0 and 1 - A clock cycle is one interval of the repeated sequence of intervals - In electronic circuit implementations of a finite state machine, the transition from one state to another occurs at the start of each clock cycle
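A C sketch of the combination-lock finite state machine with the four states A-D described above; the rule that any wrong rotation sends the lock back to state A is an assumption made for the sketch, and the input strings are illustrative:

    #include <stdio.h>
    #include <string.h>

    /* States of the lock: A = nothing done, B = R13 done, C = R13 then L22 done, D = open. */
    typedef enum { A, B, C, D } State;

    /* One state transition per completed rotation. */
    State next_state(State s, const char *rotation) {
        switch (s) {
        case A: return (strcmp(rotation, "R13") == 0) ? B : A;
        case B: return (strcmp(rotation, "L22") == 0) ? C : A;
        case C: return (strcmp(rotation, "R3")  == 0) ? D : A;
        case D: return D;                 /* already open */
        }
        return A;
    }

    int main(void) {
        const char *inputs[] = { "R13", "L22", "R3" };
        State s = A;
        for (int i = 0; i < 3; i++)
            s = next_state(s, inputs[i]);   /* each completed rotation triggers a transition */
        printf(s == D ? "open\n" : "not open\n");
        return 0;
    }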

5.2 Operate Instructions

- Operate instructions process data. Arithmetic operations (like ADD, SUB, MUL, and DIV) and logical operations (like AND, OR, NOT, XOR) are common examples. - The LC-3 has three operate instructions: ADD, AND, and NOT. 1) NOT (opcode = 1001) is the only operate instruction that performs a unary operation, that is, the operation requires one source operand (EX: NOT R2, R4) - Bits [8:6] specify the source register and bits [11:9] specify the destination register. Bits [5:0] must contain all 1s - The control signal to the ALU directs the ALU to perform the bit-wise complement operation. The output of the ALU (the result of the operation) is stored into the destination register 2) ADD (opcode = 0001) - binary operation - requires two 16-bit source operands - performs a 2's complement addition of its two source operands - Bits [8:6] specify the first source register and bits [11:9] specify the destination register (where the result will be written). - Bit [5] is 0 when the second source operand is another register; bits [4:3] are then 00 and bits [2:0] name that register - Bit [5] is 1 when the second source operand is an immediate value; bits [4:0] then hold the 5-bit 2's complement immediate 3) AND (opcode = 0101) - binary operation - requires two 16-bit source operands - performs a bit-wise AND of each pair of bits in its two 16-bit operands - Bits [8:6] specify the first source register and bits [11:9] specify the destination register (where the result will be written). - The second source operand is specified the same way as for ADD: bit [5] = 0 for a register (bits [2:0]), bit [5] = 1 for a 5-bit immediate (bits [4:0])
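A C sketch that packs these fields into 16-bit instruction words, using the bit positions given above (register form only for ADD and AND); the helper names are assumptions, not part of any LC-3 tool:

    #include <stdio.h>
    #include <stdint.h>

    /* ADD/AND register form: opcode[15:12] DR[11:9] SR1[8:6] 000 SR2[2:0]. */
    uint16_t encode_reg(unsigned opcode, unsigned dr, unsigned sr1, unsigned sr2) {
        return (uint16_t)((opcode << 12) | (dr << 9) | (sr1 << 6) | sr2);
    }

    /* NOT: opcode 1001, DR[11:9], SR[8:6], bits [5:0] all 1s. */
    uint16_t encode_not(unsigned dr, unsigned sr) {
        return (uint16_t)((0x9u << 12) | (dr << 9) | (sr << 6) | 0x3Fu);
    }

    int main(void) {
        printf("ADD R1,R2,R3 = x%04X\n", (unsigned)encode_reg(0x1, 1, 2, 3));  /* x1283 */
        printf("AND R1,R2,R3 = x%04X\n", (unsigned)encode_reg(0x5, 1, 2, 3));  /* x5283 */
        printf("NOT R2,R4    = x%04X\n", (unsigned)encode_not(2, 4));          /* x953F */
        return 0;
    }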

7.1 Assembly Language Programming - Moving Up a Level

- We generally partition mechanical languages into two classes, high-level and low-level. High-Level Languages - Of the two, high-level languages are much more user-friendly. ex: C, C++, Java, Fortran, COBOL, Pascal, plus more than a thousand others - Instructions in a high-level language almost (but not quite) resemble statements in a natural language such as English. - High-level languages tend to be ISA independent. That is, once you learn how to program in C (or Fortran or Pascal) for one ISA, it is a small step to write programs in C (or Fortran or Pascal) for another ISA. - Before a program written in a high-level language can be executed, it must be translated into a program in the ISA of the computer on which it is expected to execute Low-Level Languages - Assembly language is a low-level language. - Each assembly language instruction usually specifies a single instruction in the ISA. - low-level languages are very much ISA dependent. - Assembly languages let us use mnemonic devices for opcodes, such as ADD and NOT, and they let us give meaningful symbolic names to memory locations, such as SUM or PRODUCT, rather than use their 16-bit addresses. - This makes it easier to differentiate which memory location is keeping track of a SUM and which memory location is keeping track of a PRODUCT. We call these names symbolic addresses.

4.5 Stopping the Computer

Ad Nauseam = it appears that the computer will continue processing instructions, carrying out the instruction cycle again and again - Usually, user programs execute under the control of an operating system. UNIX, DOS, MacOS, and Windows NT are all examples of operating systems. Operating systems are just computer programs themselves. - So as far as the computer is concerned, the instruction cycle continues whether a user program is being processed or the operating system is being processed. This is fine as far as user programs are concerned since each user program terminates with a control instruction that changes the PC to again start processing the operating system, often to initiate the execution of another user program. Clock = inside the computer, a component that corresponds very closely to the conductor's baton, and it defines the machine cycle. - Stopping the instruction cycle requires stopping the clock. - Every machine cycle, the voltage rises to 2.9 volts and then drops back to 0 volts - If the RUN latch is in the 1 state (i.e., Q = 1), the output of the clock circuit is the same as the output of the clock generator. If the RUN latch is in the 0 state (i.e., Q = 0), the output of the clock circuit is 0. - Thus, stopping the instruction cycle requires only clearing the RUN latch. Every computer has some mechanism for doing that. In some older machines, it is done by executing a HALT instruction.

4.4 Changing the Sequence of Execution

Everything we have said thus far suggests that a computer program is executed in sequence. That is, the first instruction is executed, then the second instruction is executed, followed by the third instruction, and so on. Three Types of Instructions 1) Operate Instruction = processes data (ADD Instruction) 2) Data Movement Instruction = moves data from one place to another (LDR Instruction) 3) Control Instruction = an instruction whose purpose is to change the sequence of instruction execution ex: there are times, as we shall see, when it is desirable to first execute the first instruction, then the second, then the third, then the first again, the second again, then the third again, then the first for the third time, the second for the third time, and so on As we know, each instruction cycle starts with loading the MAR with the PC. Thus, if we wish to change the sequence of instructions executed, we must change the PC between the time it is incremented (during the FETCH phase of one instruction) and the start of the FETCH phase of the next - Control instructions perform that function by loading the PC during the EXECUTE phase, which wipes out the incremented PC that was loaded during the FETCH phase. - The result is that, at the start of the next instruction cycle, when the computer accesses the PC to obtain the address of an instruction to fetch, it will get the address loaded during the previous EXECUTE phase, rather than the next sequential instruction in the computer's program. JMP Example - The 4-bit opcode for JMP is 1100. Bits [8:6] specify the register that contains the address of the next instruction to be processed. Thus, the instruction encoded here is interpreted, "Load the PC (during the EXECUTE phase) with the contents of R3 so that the next instruction processed will be the one at the address obtained from R3." - Processing will go on as follows. Let's start at the beginning of the instruction cycle, with PC = x36A2. The FETCH phase results in the IR being loaded with the JMP instruction and the PC updated to contain the address x36A3. Suppose the content of R3 at the start of this instruction is x5446. During the EXECUTE phase, the PC is loaded with x5446. Therefore, in the next instruction cycle, the instruction processed will be the one at address x5446, rather than the one at address x36A3 *4.4.1 Control of the Instruction Cycle* - FETCH requires the three sequential steps of loading the MAR with the contents of the PC, reading memory, and loading the IR with the contents of the MDR - Each step of the FETCH phase, and indeed, each step of every operation in the computer, is controlled by the finite state machine in the control unit. - The FETCH phase takes three clock cycles. - The DECODE phase takes one cycle. - Instructions that change the flow of instruction processing in this way are called control instructions. This can be done very easily by loading the PC during the EXECUTE phase of the control instruction

8.4 A More Sophisticated Input Routine

How does the person sitting at the keyboard know when to type a character? Sitting there, the person may wonder whether or not the program is actually running, or if perhaps the computer is busy doing something else. - To let the person sitting at the keyboard know that the program is waiting for input from the keyboard, the computer typically prints a message on the monitor. - Such a message is often referred to as a prompt. 1) DSR[15] is tested (line 6) to see if DDR can accept a character 2) If DSR[15] is clear, the monitor is busy, and the loop (lines 06 and 07) is repeated. 3) When DSR[15] is 1, the conditional branch (line 7) is not taken, and x0A is written to DDR for outputting (line 8)

2.4 Binary-Decimal Conversion

It is often useful to convert integers between the 2's complement data type and the decimal representation that you have used all your life. *2.4.1 Binary to Decimal Conversion* - Recall that an eight-bit 2's complement number takes the form a7 a6 a5 a4 a3 a2 a1 a0, where each of the bits ai is either 0 or 1. 1) Examine the leading bit a7. - If it is a 0, the integer is positive, and we can begin evaluating its magnitude. - If it is a 1, the integer is negative. In that case, we need to first obtain the 2's complement representation of the positive number having the same magnitude. 2) The magnitude is simply (a6 • 2^6) + (a5 • 2^5) + (a4 • 2^4) + (a3 • 2^3) + (a2 • 2^2) + (a1 • 2^1) + (a0 • 2^0), which we obtain by simply adding the powers of 2 that have coefficients of 1. 3) Finally, if the original number is negative, we affix a minus sign in front. Done! *2.4.2 Decimal to Binary Conversion* - One method relies on the fact that a POSITIVE binary number is odd if its rightmost digit is 1 and even if its rightmost digit is 0. An easier method for small numbers works with powers of 2: 1) Determine the largest power of 2 that will fit into the decimal number 2) Subtract that value from the number, and apply the same step to the remainder 3) Repeat until the remainder is 0 4) Construct the binary number by placing a 1 in each bit position whose power of 2 was used and a 0 in each bit position whose power of 2 was not used 5) Add leading 0s to reach the desired bit width (zero extension) when the value is an unsigned binary number
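The largest-power-of-2 method for positive numbers, sketched in C; the function name and the fixed output width are assumptions:

    #include <stdio.h>

    /* Decimal-to-binary by the method above: find the largest power of 2 that fits,
       subtract it, and repeat; used powers get a 1, skipped powers a 0.
       width is the number of bits to produce (zero-extended on the left). */
    void to_binary(unsigned value, int width, char *out) {
        for (int bit = width - 1; bit >= 0; bit--) {
            unsigned power = 1u << bit;
            if (value >= power) {          /* this power of 2 fits */
                *out++ = '1';
                value -= power;
            } else {
                *out++ = '0';
            }
        }
        *out = '\0';
    }

    int main(void) {
        char buf[17];
        to_binary(13, 8, buf);
        printf("13 = %s\n", buf);          /* 00001101 */
        return 0;
    }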

3.1 The Transistor

Most computers today, or rather most microprocessors (which form the core of the computer) are constructed out of MOS transistors. - *MOS* stands for metal-oxide semiconductor - if somehow transistors start misbehaving, we are at their mercy - it is unlikely that we will have any problems from the transistors - Two Types of MOS Transistors: 1) P-Type 2) N-Type - both operate "logically," very similar to the way wall switches work *Wall Switch* - the most basic of electrical circuits: a power supply (in this case, the 120 volts that come into your house), a wall switch, and a lamp (plugged into an outlet in the wall). - In order for the lamp to glow, electrons must flow; in order for electrons to flow, there must be a closed circuit from the power supply to the lamp and back to the power supply. The lamp can be turned on and off by simply manipulating the wall switch to make or break the closed circuit. *N-Type MOS Transistor* - Instead of the wall switch, we could use an n-type or a p-type MOS transistor to make or break the closed circuit. - The transistor has three terminals. They are called the gate, the source, and the drain. The reasons for the names source and drain are not of interest to us in this course. - *Closed Circuit* = the gate of the n-type transistor is supplied with 2.9 volts, the connection from source to drain acts like a piece of wire and the connection is known as a closed circuit - *Open Circuit* = If the gate of the n-type transistor is supplied with 0 volts, the connection between the source and drain is broken. We say that between the source and drain we have an open circuit. - When the gate is supplied with 2.9 volts, the transistor acts like a piece of wire completing the circuit and causing the bulb to glow. When the gate is supplied with 0 volts, the transistor acts like an open circuit, breaking the circuit, and causing the bulb not to glow. *P-Type MOS Transistor* - The p-type transistor works in exactly the opposite fashion from the n-type transistor - *Closed Circuit* = When the gate is supplied with 0 volts, the p-type transistor acts (more or less) like a piece of wire, closing the circuit. - *Open Circuit* = When the gate is supplied with 2.9 volts, the p-type transistor acts like an open circuit - Because the p-type and n-type transistors act in this complementary way, we refer to circuits that contain both p-type and n-type transistors as CMOS circuits, for *complementary metal-oxide semiconductor*

3.2 Logic Gates

One step up from the transistor is the logic gate. That is, we construct basic logic structures out of individual MOS transistors. In Chapter 2, we studied the behavior of the AND, the OR, and the NOT functions. In this chapter we construct transistor circuits that implement each of these functions. The corresponding circuits are called AND, OR, and NOT gates. *3.2.1 The NOT Gate (Inverter)* - the simplest logic structure that exists in a computer is constructed from two MOS transistors, one p-type and one n-type - Figure 3.4b shows the behavior of the circuit if the input is supplied with 0 volts. Note that the p-type transistor conducts and the n-type transistor does not conduct. The output is, therefore, connected to 2.9 volts. On the other hand, if the input is supplied with 2.9 volts, the p-type transistor does not conduct, but the n-type transistor does conduct. The output in this case is connected to ground (i.e., 0 volts). - If we replace 0 volts by the symbol 0 and 2.9 volts by the symbol 1, we have the truth table (Figure 3.4d) for the complement or NOT function, which we studied in Chapter 2. *3.2.2 OR and NOR Gates* - It contains two p-type and two n-type transistors ex: the behavior of the circuit if A is supplied with 0 volts and B is supplied with 2.9 volts. In this case, the lower of the two p-type transistors produces an open circuit, and the output C is disconnected from the 2.9-volt power supply. However, the leftmost n-type transistor acts like a piece of wire, connecting the output C to 0 volts. - Note that if both A and B are supplied with 0 volts, the two p-type transistors conduct, and the output C is connected to 2.9 volts. - Note further that there is no ambiguity here, since both n-type transistors act as open circuits, and so C is disconnected from ground. *3.2.3 AND and NAND Gates* - Note that if either A or B is supplied with 0 volts, there is a direct connection from C to the 2.9-volt power supply - The fact that C is at 2.9 volts means the n-type transistor whose gate is connected to C provides a path from D to ground - Therefore, if either A or B is supplied with 0 volts, the output D of the circuit of Figure 3.7 is 0 volts. - On the other hand, if both A and B are supplied with 2.9 volts, then both of their corresponding p-type transistors are open. However, their corresponding n-type transistors act like pieces of wire, providing a direct connection from C to ground. Because C is at ground, the rightmost p-type transistor acts like a closed circuit, forcing D to 2.9 volts. *Inverter* = NOT gate *3.2.4 DeMorgan's Law* NOT(NOT A AND NOT B) = A OR B
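DeMorgan's law can be checked exhaustively on 1-bit values with a few lines of C:

    #include <stdio.h>

    int main(void) {
        /* Check NOT(NOT A AND NOT B) == A OR B for all 1-bit inputs. */
        for (int a = 0; a <= 1; a++)
            for (int b = 0; b <= 1; b++) {
                int lhs = !((!a) && (!b));
                int rhs = a || b;
                printf("A=%d B=%d  NOT(NOT A AND NOT B)=%d  A OR B=%d\n",
                       a, b, lhs, rhs);
            }
        return 0;
    }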

4.3 Instruction Processing

The central idea in the von Neumann model of computer processing is that the program and data are both stored as sequences of bits in the computer's memory, and the program is executed one instruction at a time under the direction of the control unit. *4.3.1 The Instruction* - Two Parts 1) Opcode (what the instruction does) => Bits [15:12] 2) Operands (who it is to do this instruction to) => Bits [11:0] - each LC-3 instruction consists of 16 bits (one word), numbered from left to right, bit [15] to bit [0] - 2^4 = 16 distinct opcodes ex: ADD Instruction and LDR Instruction *4.3.2 The Instruction Cycle (Six Phases)* - Instructions are processed under the direction of the control unit in a very systematic, step-by-step manner - Instruction cycle = the sequence of steps - Phase = each step Six Phases of the Instruction Cycle A) Fetch = obtains the next instruction from memory and loads it into the instruction register (IR) of the control unit - In order to carry out the work of the next instruction, we must first identify where it is. The program counter (PC) contains the address of the next instruction. Thus, the FETCH phase takes multiple steps: 1) Load the MAR with the contents of the PC, and simultaneously increment the PC 2) Interrogate memory, resulting in the instruction being placed by the memory in the MDR 3) Load the IR with the contents of the MDR - each of these steps takes one machine cycle; at a clock rate of 3.3 GHz, one machine cycle (or clock cycle) takes 0.303 billionths of a second (0.303 nanoseconds) B) Decode = examines the instruction in order to figure out what the microarchitecture is being asked to do - In the LC-3, a 4-to-16 decoder identifies which of the 16 opcodes is to be processed - Input is the four-bit opcode IR[15:12] - The output line asserted is the one corresponding to the opcode at the input. Depending on which output of the decoder is asserted, the remaining 12 bits identify what else is needed to process that instruction. C) Evaluate Address = computes the address of the memory location that is needed to process the instruction - Recall the example of the LDR instruction: The LDR instruction causes a value stored in memory to be loaded into a register. In that example, the address was obtained by adding the value 6 to the contents of R3. This calculation was performed during the EVALUATE ADDRESS phase. D) Fetch Operands = obtains the source operands needed to process the instruction - In the LDR example, this phase took two steps: loading MAR with the address calculated in the EVALUATE ADDRESS phase, and reading memory, which resulted in the source operand being placed in MDR. - In the ADD example, this phase consisted of obtaining the source operands from R2 and R6. (In most current microprocessors, this phase [for the ADD instruction] can be done at the same time the instruction is being decoded.) E) Execute = carries out the execution of the instruction - In the ADD example, this phase consisted of the single step of performing the addition in the ALU. F) Store Result = the final phase of an instruction's execution; the result is written to its designated destination Repeat - Since the PC was updated during the previous instruction cycle, it contains at this point the address of the instruction stored in the next sequential memory location. - Thus the next sequential instruction is fetched next.
Processing continues in this way until something breaks this sequential flow. *The LC-3 ADD and LDR instructions do not require all six phases. In particular, the ADD instruction does not require an EVALUATE ADDRESS phase. The LDR instruction does not require an EXECUTE phase.*
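A toy fetch-decode-execute loop in C, just to make the phase ordering concrete; the two-opcode machine (ADD in register form plus a made-up HALT opcode) is an illustration, not the LC-3's actual control logic or opcode assignment:

    #include <stdio.h>
    #include <stdint.h>

    /* Toy machine: 16-bit words, opcode in bits [15:12].
       0x1 = ADD R[dr] <- R[sr1] + R[sr2] (register form), 0xF = HALT (assumed). */
    int main(void) {
        uint16_t memory[8] = { 0x1283 /* ADD R1,R2,R3 */, 0xF000 /* HALT */ };
        uint16_t reg[8] = { 0, 0, 7, 8, 0, 0, 0, 0 };
        uint16_t pc = 0, ir;
        int running = 1;

        while (running) {
            ir = memory[pc++];                    /* FETCH: load IR, increment PC    */
            uint16_t opcode = ir >> 12;           /* DECODE: examine bits [15:12]    */
            switch (opcode) {
            case 0x1: {                           /* FETCH OPERANDS, EXECUTE, STORE RESULT */
                uint16_t dr  = (ir >> 9) & 0x7;
                uint16_t sr1 = (ir >> 6) & 0x7;
                uint16_t sr2 = ir & 0x7;
                reg[dr] = (uint16_t)(reg[sr1] + reg[sr2]);
                break;
            }
            case 0xF:
                running = 0;                      /* clear the RUN latch, in effect  */
                break;
            }
        }
        printf("R1 = %u\n", (unsigned)reg[1]);    /* 15 */
        return 0;
    }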

7.2 An Assembly Language Program

The program in Figure 7.1 multiplies the integer initially stored in NUMBER by 6 by adding the integer to itself six times. For example, if the integer is 123, the program computes the product by adding 123 + 123 + 123 + 123 + 123 + 123. Pseudo-ops = messages from the programmer to the translation program to help in the translation process. Assembler = the translation program Assembly = the translation process *7.2.1 Instructions* Instead of an instruction being 16 0s and 1s, as is the case in the LC-3 ISA, an instruction in assembly language consists of four parts, as follows: | LABEL | OPCODE | OPERANDS | COMMENTS | Opcodes and Operands - Opcode = the thing the instruction is to do - Operand = the things it is supposed to do it to Labels = symbolic names that are used to identify memory locations that are referred to explicitly in the program - In LC-3 assembly language, a label consists of from one to 20 alphanumeric characters (i.e., a capital or lowercase letter of the alphabet, or a decimal digit), starting with a letter of the alphabet. Two Reasons for Explicitly Referring to a Memory Location 1) The location contains the target of a branch instruction (for example, AGAIN in line 0C). 2) The location contains a value that is loaded or stored (for example, NUMBER, line 12, and SIX, line 13). - If a location in the program is not explicitly referenced, then there is no need to give it a label. *7.2.2 Pseudo-ops (Assembler Directives)* - the pseudo-op is strictly a message to the assembler to help the assembler in the assembly process. Once the assembler handles the message, the pseudo-op is discarded. Five Different Pseudo-ops: .ORIG = tells the assembler where in memory to place the LC-3 program. .FILL = tells the assembler to set aside the next location in the program and initialize it with the value of the operand. .BLKW = tells the assembler to set aside some number of sequential memory locations (i.e., a BLocK of Words) in the program .STRINGZ = tells the assembler to initialize a sequence of n+1 memory locations: the ASCII codes of the n characters in the string, followed by x0000 to terminate it .END = tells the assembler where the program ends. - .END does not stop the program during execution. In fact, .END does not even exist at the time of execution. It is simply a delimiter; it marks the end of the source program.

