Computer Architecture: ISA part 1


Stack Code:

(Case 1) The data and the operations are pushed on the stack
• We want to evaluate this expression: C = A + B
• In this approach, the stack is initialized as shown below
• Then the two operands are popped, the operation is popped, the operands are added, and the result is stored on the stack
Stack contents, starting from the top of the stack: A, B, +

What are the instruction categories?

- Arithmetic and logic: Integer arithmetic and logical operations
- Data transfer: Load-store
- Control: Branch, jump, procedure call and return, traps
- System: Operating system call, virtual memory management instructions
- Floating point: Floating-point operations: add, divide, compare
- Decimal: Decimal add, multiply, decimal-to-character conversion
- Strings: String move, string compare, string search
- Graphics: Pixel and vertex operations, compression/decompression operations

Why does the code run?

- The PC‐relative mode uses few bits to specify the offset between the PC and the target address - Rather than using a full address (32‐bit or 64‐bit) to indicate the branch target

Recap on the experiment setup

- The program is a fair program (benchmark)
- The computer supports a lot of addressing modes
- The compiler is unbiased; it only aims at high performance
- Given this setup, the study is considered a best attempt at unbiased measurements of which addressing modes are the most popular

What is Register-memory Architecture?

1) A register-memory architecture can operate directly on data that's in the memory
2) The ALU can take one operand from a register (top) and the other operand from the memory (bottom)
Therefore, there's no need to load the data from memory into a register beforehand
Some instructions use two registers
• However, it's usually not possible for the ALU to use two memory locations (only one)

Who deals with ISA?

1) An assembly language programmer should know the assembly instructions (and their use) to write a program
2) The CPU hardware engineer should know the ISA since the CPU hardware implements all the assembly instructions
3) A compiler writer should know the ISA because a compiler translates high-level code (e.g. C or C++) into assembly code

Personal mobile devices and embedded applications

1) Energy consumption should be minimized
2) In some embedded applications, the cost should be reduced since the product itself is inexpensive
3) Code size should be reduced because less memory means less power consumption and a lower cost for the computer
4) A dedicated floating-point unit is optional; leaving it out reduces the chip cost

Layered Instruction Set Complications:

1) x86 is a CISC architecture, which requires a complex hardware design
2) RISC architectures, on the other hand, use simple instructions that keep the hardware simple

How can Intel keep its x86 architecture to maintain backward compatibility and, at the same time, benefit from RISC's simple hardware design?

1) In the usual way, the instruction set is implemented directly by the hardware
2) However, Intel came up with a solution where the x86 instructions go through a hardware layer and are transformed into RISC-like instructions
3) Potentially, a complex x86 instruction becomes one or more of the RISC-like instructions
4) The CPU hardware then runs the simple RISC-like instructions
This approach was used by Intel (e.g. in the Pentium 4 CPU)

Addressing Mode Scaled 2:

1) In this addressing mode, the address of the data in memory is: Displacement + R2 + (R3 * Scale); for example, 100 + R2 + (R3 * Scale) when the displacement is 100
2) The scale is the size of the array element
- If the array element is 4 bytes, then scale = 4
3) Let's consider this instruction: Add R1, 0(R2)[R3]
We initialize R2 = 400 (the start address of the array) and R3 = 3 (since we want to access Array[3])
The address is: 0 + R2 + (4 * R3) = 400 + 4*3 = 412
- This is the address of Array[3] in the memory
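The address arithmetic in this card can be checked with a few lines of Python. This is only a sketch: the function name `scaled_address` and its parameter names are made up for illustration.

```python
def scaled_address(displacement: int, base: int, index: int, scale: int) -> int:
    """Effective address for the scaled mode: displacement + base + index * scale."""
    return displacement + base + index * scale


# Add R1, 0(R2)[R3] with R2 = 400, R3 = 3 and 4-byte array elements
print(scaled_address(displacement=0, base=400, index=3, scale=4))  # 412
```

With a displacement of 100 instead of 0, the same registers would give 512.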

What does ISA consist of?

1) The instruction set (list of all instructions in assembly language)
2) The addressing modes, which are part of the instruction set
3) The format of the instructions

Layered Instruction Set History:

1) Intel's x86 instruction set provides backwards compatibility to earlier architectures - This is important commercially since a computer owner can run the program they've bought for their older computers

What is ISA's Impact on the computer?

1) It affects the overall performance of the computer
2) Various computer markets have their own specific requirements; a good ISA meets the requirements of the computer in which it's used

What is a server?

1) Servers nowadays are mainly used for databases, file servers and web applications
2) Floating-point performance is not important in these applications
3) But integer and character-string performance is important

What is the advantage of Register‐Memory Architecture?

1) The advantage of a register-memory architecture is that data can be accessed in the memory directly (without having to do a separate instruction to copy the data into a register)

What is ISA's Impact on computers?

1) In practice, a given instruction set can be used in all three computer types: desktop, server, mobile & embedded
2) The MIPS instruction set is used in all three of these computer types

How many bits should the displacement mode have?

A 16-bit field is well utilized and is not an oversized choice (see slides 31-32)

What is a byte?

A byte is a small data type (it can only represent 0 to 255 unsigned, or -128 to +127 signed)

What computer should we use for the study?

A computer that has a lot of addressing modes, since we want to see which ones the compiler will choose; the more addressing modes are available, the better we can filter out the good ones

What is the disadvantage of Register‐Memory Architecture?

One disadvantage of register-memory architecture is that instructions vary in execution time: instructions that don't access the memory take a few clock cycles, while instructions that access the memory take many more clock cycles
Another disadvantage is that the instruction encoding is not uniform among all instructions
Instructions that don't have a memory address take a small number of bits, as below:
Op    Reg 1   Reg 2
___   _____   _____
On the other hand, instructions that have a memory address take a larger number of bits, as below:
Op    Reg 1   Memory Address
___   _____   __________________

Equations:

A modulo n = 0

Conclusion on Addressing Modes:

- A newly designed CPU should support at least the 'immediate', 'displacement' and 'register indirect' addressing modes due to their popularity
- The size of the 'displacement' field in the displacement addressing mode should be at least 12 to 16 bits
- The size of the 'immediate' field should be at least 8 to 16 bits

What kind of program should we use?

A program that is neutral and doesn't necessarily lead the compiler to favor one particular addressing mode (such as a benchmark)

Why is this an advantage?

A register takes only a few bits to encode since the number of registers in the CPU is small
Op    Reg 1   Reg 2   Reg 3
___   _____   _____   _____
Also, since all the instructions take (more or less) the same number of cycles, they're better suited to pipelining
Register-memory instructions, however, are not uniform in their clock cycles, which makes it hard to build an efficient pipeline

What sizes do memory addressing come in?

A typical memory chip is "byte addressable". This means every byte has an address.

What is Load Store Architecture?

Also known as register-register architecture. The two operands of the ALU are registers (an operand can't be a memory location).

Register‐Memory Architecture Code Example:

Below is code that evaluates the expression (A + B) * C/D on a register-memory architecture
• The variables A, B, C and D are located in the memory initially
Load R1, A    // copy A from memory into R1
Add R1, B     // add B from the memory to R1
Mul R1, C     // multiply R1 by C from the memory
Div R1, D     // divide R1 by D from the memory

What are the pros and cons to simple and complex address modes?

Complex: Advantages: 1) Reduce the instruction count since the instruction is 'powerful' in its ability to access data from the memory; this reduces the memory use

What is control flow?

Control Flow instructions are the branch and the jump instruction in the code. (Such instructions can be 'conditional' or 'unconditional' )

What are the pros and cons to simple and complex address modes?

Simple: Disadvantage: More instructions will be used because there's less flexibility in accessing data from the memory (additional instructions are used to compute memory addresses)

What are the pros and cons to simple and complex address modes?

Complex: Disadvantages:
1) The hardware is complex since it implements the instructions' complex ways of accessing the memory
2) There will be great variation in the number of clock cycles used by the instructions (instructions that use simple modes need few clock cycles; others that use complex modes take more clock cycles); this variation makes it difficult to apply pipelining

Autoincrement

Ex. Add R1, (R2)+
1) This mode is used to access array elements in the memory
2) R2 is used as the address of the data in memory
3) When the data is fetched, R2 is incremented automatically so that it holds the address of the next array element
4) Therefore, there is no need to increment R2 manually (through another 'add' instruction)
R2 is incremented by 'n', where 'n' is the size (in bytes) of one array element (see slide 25 for more clarity)
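To make the autoincrement behaviour concrete, here is a small Python sketch of Add R1, (R2)+ applied three times to an array. The toy memory (a dictionary keyed by address), the array contents and the helper name `add_autoincrement` are all assumptions made for illustration.

```python
def add_autoincrement(regs, memory, dst, addr_reg, element_size):
    """Simulate 'Add dst, (addr_reg)+' on a toy memory keyed by address."""
    regs[dst] += memory[regs[addr_reg]]  # addr_reg holds the address of the operand
    regs[addr_reg] += element_size       # then it moves on to the next array element


# Hypothetical array of three 4-byte elements starting at address 400
memory = {400: 10, 404: 20, 408: 30}
regs = {"R1": 0, "R2": 400}

for _ in range(3):
    add_autoincrement(regs, memory, "R1", "R2", element_size=4)

print(regs)  # {'R1': 60, 'R2': 412}
```

No separate instruction was needed to advance R2; each call leaves it pointing at the next element.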

Addressing Modes Scaled:

Ex. Add R1, 100(R2)[R3]
1) This addressing mode is used to access data (array elements or data structures) from the memory
2) First, let's look at the address of an array element in the memory
3) The address of an element A[y] is: Start address + (Element size in bytes * y)
4) For example, with a start address of 200 and 4-byte elements, the address of A[3] = 200 + (4*3) = 212

Memory indirect mode:

Ex. Add R1, @(R3) Register R3 is the address of the pointer; once the pointer is read, another memory access fetches the data
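The two memory accesses of this mode can be sketched in Python. The addresses, values and the helper name `add_memory_indirect` below are hypothetical and only serve the illustration.

```python
def add_memory_indirect(regs, memory, dst, ptr_reg):
    """Simulate 'Add dst, @(ptr_reg)' on a toy memory keyed by address."""
    pointer = memory[regs[ptr_reg]]  # first memory access: read the pointer
    regs[dst] += memory[pointer]     # second memory access: read the data


# Hypothetical layout: address 100 holds a pointer to address 500, which holds 7
memory = {100: 500, 500: 7}
regs = {"R1": 1, "R3": 100}

add_memory_indirect(regs, memory, "R1", "R3")
print(regs["R1"])  # 8
```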

What is immediate?

Ex. addi t0, zero, 4. It is useful since it is used to load constants into a register

What is displacement?

Ex. lw t0, 12(s0). It is useful because it allows an offset from the base register, so it can be used to access array elements. With 4-byte elements, offset 12 is element 3: A[3] = 0; -> sw zero, 12(s0)

If we're designing a new CPU, which addressing mode should we include?

First, measure the frequency of addressing modes in typical programs; then adopt the modes with the highest frequency

What are the common data types today?

For desktops and servers, the common data sizes are:
- Character (8-bit)
- Half word (16-bit)
- Word (32-bit)
- Single-precision floating-point (32-bit)
- Double-precision floating-point (64-bit)
The data type and size are often related
- For example, a character is usually either 8-bit (ASCII) or 16-bit (Unicode)

Which data types are the most used in a program?

For integer operation, the 64-bit data is the most popular For the floating-point operations, the 64-bit data type is also the most popular Otherwise, the 32-bit data (integer and floating-point) also have 26-29% frequency of use

Which instructions are most widely executed?

Generally, across all architectures, the most widely executed instructions are the simple operations in the instruction set

What types of Instruction Set Architectures are there?

ISAs can be classified based on the type of internal storage that they use
The internal storage can be: stack, accumulator, or registers
The various types of ISAs are:
- Accumulator architecture
- Stack architecture
- Memory-memory architecture
- Register-memory architecture
- Load-store architecture

When is memory always aligned?

A 1-byte data type is always aligned. A 2-byte data type is aligned when it's stored at bytes [0-1], [2-3], [4-5]... If a 2-byte data type is stored at [1-2] or [3-4], spanning the natural boundary, then it's not aligned
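A common way to state the alignment rule is that the address must be a multiple of the data size. The Python sketch below checks the examples from this card under that assumption; the function name `is_aligned` is made up for illustration.

```python
def is_aligned(address: int, size: int) -> bool:
    """An access of `size` bytes is aligned when the address is a multiple of size."""
    return address % size == 0


print(is_aligned(5, 1))  # True: a 1-byte access is always aligned
print(is_aligned(2, 2))  # True: bytes [2-3]
print(is_aligned(1, 2))  # False: bytes [1-2] span the natural boundary
print(is_aligned(8, 4))  # True: a 4-byte word at address 8
```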

Is it better to have the memory aligned? If so, why?

If the memory is aligned, the computer uses only one cycle to access the data, as opposed to two for unaligned data. It is more efficient to keep it aligned.

Which were the most used?

Immediate and displacement addressing modes

What is a word?

In a computer, a unit of data (word) is 16 bits, 32 bits, or 64 bits. Therefore, a 'word' occupies multiple memory addresses in the memory

How does Load-Store Architecture work?

In a load-store architecture, the data is first 'loaded' from the memory into a register The data is then used in ALU computations Finally, the data is 'stored' back in the memory

How many operands does Load-Store Architecture have?

In load-store architecture, there are usually three operands per instruction This is the case in the MIPS architecture Example: add $t0, $t1, $t2 • The instruction above adds registers $t1 and $t2 and stores the result in register $t0 (registers $t1 and $t2 are not modified)

Little Endian

In the little-endian representation, the data type ends at the 'little address': the least significant byte is stored first, at the lowest address
Ex.: The 32-bit hex number 01 2E AC 34 is represented in the memory as shown below:
Address:   0    1    2    3
           34   AC   2E   01
The string "ABCD" is represented as shown below:
Address:   0    1    2    3
           D    C    B    A
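The byte order of the 32-bit example can be reproduced with Python's built-in `int.to_bytes`. This is just a quick check of the number from the card; the big-endian order is printed alongside for comparison.

```python
value = 0x012EAC34

little = value.to_bytes(4, byteorder="little")  # least significant byte at address 0
big = value.to_bytes(4, byteorder="big")        # most significant byte at address 0

print(little.hex(" "))  # 34 ac 2e 01
print(big.hex(" "))     # 01 2e ac 34
```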

What are the common number types today?

Integer = 64-bit number
A signed 32-bit integer can represent numbers in the range of roughly [-2 billion, +2 billion]; while this is suitable for many applications, some implementations prefer the 64-bit integer to reduce the possibility of overflow
Floating point = double data type; the 64-bit data type showed up as the most frequent
32-bit numbers, whether integer or floating-point, are still useful data types and showed a 26-29% use
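The "2 billion" figure follows from the two's complement range of an n-bit signed integer, which runs from -2^(n-1) to 2^(n-1) - 1. A short Python sketch (the helper name `signed_range` is an assumption) makes the numbers explicit:

```python
def signed_range(bits: int):
    """Smallest and largest values of a two's complement integer with `bits` bits."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


print(signed_range(32))  # (-2147483648, 2147483647)
print(signed_range(64))  # (-9223372036854775808, 9223372036854775807)
```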

What are other benefits of having aligned memory?

It is also beneficial to other aspects of the architecture.
Ex. In MIPS, the memory is aligned and, therefore, the valid word (word = 4 bytes) addresses are: 0, 4, 8, 12...
These addresses are multiples of 4 and end in '00' when they're written in binary
The encoding of the 'beq' instruction uses this knowledge, and the '00' bits are not encoded in the instruction
They're added automatically when the instruction is decoded
This increases the branch range by a factor of 4!
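The factor of 4 can be verified numerically. The sketch below assumes a 16-bit signed offset field, as in the MIPS 'beq' encoding, and compares the reach with and without the two implied '00' bits; the function name `branch_reach_bytes` is hypothetical.

```python
def branch_reach_bytes(offset_bits: int, implied_zero_bits: int) -> int:
    """Largest forward distance, in bytes, that a signed offset field can encode."""
    max_offset = 2 ** (offset_bits - 1) - 1  # largest positive signed value
    return max_offset << implied_zero_bits   # the implied zeros are appended on decode


byte_offsets = branch_reach_bytes(16, implied_zero_bits=0)
word_offsets = branch_reach_bytes(16, implied_zero_bits=2)

print(byte_offsets)                  # 32767
print(word_offsets)                  # 131068
print(word_offsets // byte_offsets)  # 4
```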

What are addressing modes?

The addressing mode refers to how an instruction specifies the address of an operand. Ex.:
1) A register address, which usually consists of a few bits - with 8 registers on the CPU, the register address is 3 bits, etc.
2) A memory address, which usually has more bits than a register address
3) An immediate number - there's no real address here; the constant value is encoded in the instruction
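The width of a register address follows from the number of registers: n bits can name 2^n registers. A minimal Python sketch, with a made-up helper name `register_field_bits`:

```python
def register_field_bits(register_count: int) -> int:
    """Number of bits needed to give each of `register_count` registers its own address."""
    if register_count < 1:
        raise ValueError("a CPU needs at least one register")
    return (register_count - 1).bit_length()


print(register_field_bits(8))   # 3
print(register_field_bits(32))  # 5
```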

Simple vs. Complex

Look at slide 56

What is Memory-Memory Architecture?

Memory-memory architecture keeps all the data in the memory (no data is stored in the registers)
It is not used nowadays
• In the past this approach was advocated since it makes the compiler simple
• The variables don't have to be allocated to registers since they always reside in the memory

How are integers represented?

Nowadays, integers are represented as two's complement binary numbers on all computers (this wasn't the case in some of the earliest computers)

The benefits of Stack Architecture:

One benefit of stack architecture is that the compiler doesn't have to do variable-to-register allocation
This used to be a difficult problem and the stack architecture tries to circumvent it
There are two variants of the stack architecture:
1) The data and the operations are pushed on the stack
2) Only the data is pushed on the stack; the operations come from the code

What is the benefit of Accumulator Architecture?

One benefit to the accumulator architecture is that the instruction can be encoded on a small number of bits (like 8‐bit instruction)

Conclusion on size of operand:

One conclusion we can draw from this result is that it makes sense to have a 64-bit path to the memory since most of the data fetched is 64 bits wide (this allows fetching the data from the memory in one clock cycle, versus 2 cycles if the path were 32-bit)

What is a disadvantage of Load‐Store Architecture?

One disadvantage of Load‐Store Architecture is the number of instructions in the code is large This is because all the variables in the memory should be loaded first before they can be used However, the register‐memory can access memory data without loading them

What is the difference between Register memory and MIPS?

Register-memory architectures usually use two operands in an instruction (e.g. add eax, ebx), as opposed to MIPS, which uses three operands in its instructions

What types of instruction set Architecture are used today?

- Register-memory architecture
- Load-store architecture

What are the pros and cons to simple and complex address modes?

Simple: Advantages:
1) They keep the hardware simple because the hardware implements the instructions
2) They keep the CPI (clock cycles per instruction) small since each instruction does a small task

Why are the small and large values more useful than the intermediate ones?

Small immediate values:
• Small immediate values are used in computations in a program
• For example, loading a small value as a loop counter bound, or incrementing a counter by 1
Large immediate values:
• Large immediate values occur when an address or a mask is loaded into a register (e.g. load the address 8000 into R1)
• Another use of large immediate values is to load a 32-bit address into a register

What types of instruction set Architecture are not used today?

- Stack architecture
- Accumulator architecture
- Memory-memory architecture

Which vary?

System instructions vary widely among CPUs

What is Accumulator Architecture?

The accumulator architecture was the main approach among the earliest CPUs
These CPUs didn't have a lot of storage space (it was not possible to include multiple registers)
Therefore, the accumulator register was the default register used in all the operations
1) The accumulator is always an operand of the ALU (called the implicit operand)
2) The other operand is data that's in the memory (called the explicit operand)

Is Accumulator Architecture still useful nowadays?

The accumulator architecture is not used anymore in modern CPU design
The architectures that are popular today have general-purpose registers (GPR)
A general-purpose register can be used to store any variable

Types and Size of Operands

The CPU should be able to do operations on multiple types and sizes of operands Types: Character, integer, floating point. Size: 8-bit, 16-bit, 32-bit, 64-bit

What is PC-Relative Addressing?

The PC-relative addressing mode is used in 'branch' instructions
Example of syntax: branch R1, R2, Label
This is a possible encoding:
Op     R1     R2     Offset
____   ____   ____   _______
In PC-relative addressing mode, the 'offset' is added to the PC (Program Counter)
Branch address is: PC + Offset

How does the code run?

The PC-relative mode allows the code to run independently of where it is loaded in the memory - This is referred to as 'position independence' (this eliminates some work when the program is linked)
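Position independence can be illustrated with the rule "branch address = PC + offset". In the Python sketch below, the load addresses, the position of the branch inside the program and the offset value are all hypothetical.

```python
def branch_target(pc: int, offset: int) -> int:
    """PC-relative branch: the encoded offset is added to the program counter."""
    return pc + offset


OFFSET = 24            # hypothetical offset encoded in the branch instruction
BRANCH_POSITION = 0x40 # hypothetical position of the branch inside the program

# Load the same program at two different base addresses
for load_base in (0x1000, 0x8000):
    pc = load_base + BRANCH_POSITION
    target = branch_target(pc, OFFSET)
    print(hex(target), hex(target - load_base))
```

The absolute target differs between the two runs, but the target's position within the program (0x58) is the same, so the encoded offset never has to change.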

What is Stack Architecture?

The stack architecture was used up to 1980
The two operands (the top 2 words on the stack) are popped from the stack, the ALU does the operation, and the result is pushed on the stack
Data from memory can be loaded onto the stack (at the top position)
Similarly, the top position of the stack can be stored in the memory

Where is the stack located?

The stack is usually located in memory. However, since the stack is used extensively in the stack architecture, the top few words of the stack can be kept in registers to make the operations fast; the remaining stack locations are in the memory

What is the advantage of Load‐Store Architecture?

The advantage of Load‐Store Architecture is that the instructions have a simple encoding and are usually fixed-length

How does the CPU know which type and size the operand is?

The approach used nowadays is through the opcode (for every data type and size, a different opcode is used) Another approach used in earlier computers is called 'tagging'

Stack Architecture Code Example 3:

The code below evaluates the expression: (A+B) * C/D
How can we rewrite this code without using the 'Temp' variable?
Push A
Push B
Add
Push C
Multiply
Pop Temp
Push D
Push Temp
Divide

Accumulator Architecture Code

The code below evaluates the expression: C = A+B
• The variables A, B and C are in the memory
1) The first instruction loads the variable A into the accumulator (the label 'A' serves as a memory address)
2) The second instruction grabs the variable B from the memory and adds it to the accumulator; the result of the addition goes in the accumulator
3) The third instruction stores the accumulator in the memory at the variable C
Load A     // load A in the accumulator
Add B      // add B to the accumulator
Store C    // store the accumulator in C
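The three accumulator instructions can be traced with a tiny interpreter. This Python sketch is only an illustration: the function `run_accumulator_program`, the tuple encoding of the instructions and the values of A and B are assumptions.

```python
def run_accumulator_program(program, memory):
    """Run Load/Add/Store instructions against a single accumulator register."""
    acc = 0
    for op, var in program:
        if op == "Load":
            acc = memory[var]    # memory -> accumulator
        elif op == "Add":
            acc += memory[var]   # the accumulator is the implicit operand
        elif op == "Store":
            memory[var] = acc    # accumulator -> memory
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory


memory = {"A": 2, "B": 3, "C": 0}  # hypothetical values
program = [("Load", "A"), ("Add", "B"), ("Store", "C")]

run_accumulator_program(program, memory)
print(memory["C"])  # 5
```

Each instruction names only one memory operand, which is why accumulator instructions can be encoded in so few bits.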

In this experiment, who will pick the addressing mode?

The compiler (assuming that the compiler is in excellent condition and it's selecting the addressing modes that maximize the performance)

Why does the scaled addressing mode contain a constant number? (The number '100' in the instruction above)

The constant number is used to skip a 'record' in data structure

Big Endian

The data type ends at the 'big address'
Ex. The 32-bit hex number 01 2E AC 34 is represented in the memory as shown below:
Address:   0    1    2    3
           01   2E   AC   34
The string "ABCD" is represented as shown below:
Address:   0    1    2    3
           A    B    C    D
Big Endian is preferred in this case, because the string reads in order

What is Memory Alignment?

The memory is aligned when access to data types that are larger than 1 byte happen at the natural boundaries

How should we specify the destination?

The most common way to specify the destination is to use a displacement that is added to the PC The PC-relative addressing is a good idea since the target address is often close to the PC - The PC‐relative mode uses few bits to specify the offset between the PC and the target address - Rather than using a full address (32‐bit or 64‐bit) to indicate the branch target

Where does the procedure return jump to?

The procedure returns to various addresses depending on where it is called from
Therefore, the return address cannot be encoded in the jump instruction for procedure returns (the return address is usually saved in a register, and procedure returns use a jump-to-register instruction)

Is Stack Architecture used in modern CPU architecture? Why or why not?

The stack architecture is not used in modern CPU architecture
One advantage of the stack architecture is that it's simple to write the compiler
• The problem of allocating the variables to registers doesn't arise since all the variables are eventually passed to the top of the stack

What complexity of language should we use?

The study will use VAX architecture since it supports a larger number of addressing modes

How many control flows are there?

There are four. These are the following: - Conditional branches - Jumps - Procedure calls (links the return address) - Procedure return (jumps back to the calling code)

What does it mean to increment?

When the address increments by 1, we are referring to the next byte in the memory
Address:   0        1        2        3
           1 byte   1 byte   1 byte   1 byte

Program linking:

When the program is linked, library functions are added to the code written by the programmer
The jump addresses between the user code and library functions are resolved
With PC-relative addressing, no addresses need to be resolved since the target address is specified relative to the PC
The PC-relative mode is also suitable for programs that are linked dynamically at run-time

How does tagging work?

With the tagging approach, the opcode is the same (e.g. ADD) whether the data is integer, floating-point, character...
The tag indicates the size and type of the operand
The tag approach is not used anymore; now we have clearly defined data types that are used across all computers

Which are well supported in the CPU?

Arithmetic & logic, data transfer, control

Which depend on the CPU model whether they are supported or not?

Floating-point, decimal, string, graphics

Which is the most popular?

look at slides 47-49

Why can't Load-Store Architecture access memory directly?

The ALU can't access a memory location directly
To do an ALU operation on a memory location, the data is first copied from the memory into a register
Then, the ALU can access the data

How many bits should the 'immediate' field be?

The immediate field supported is 16-bit
Figure A.10 shows that, for integer instructions, small immediate values (that use 6 bits or less) and large immediate values (that use 13 bits or more) are both quite useful
- The values in the middle, which use between 7 and 12 bits, are not as frequent

Little Endian Vs. Big Endian

There are two ways to store a data type larger than one byte in memory: Little Endian and Big Endian

Unpacked Decimal

• In the architectures that use packed decimals, a regular numeric string is called an 'unpacked decimal' since each digit takes one byte
• For example, the string "47" takes two bytes, which are two ASCII codes
• "47" = "ASCII of 4", "ASCII of 7" = 00110100 00110111
• Some CPUs support packing and unpacking to convert between the two formats

Accumulator Architecture Code Example 2:

• The accumulator code below evaluates this expression: (A+B) * C/D
• The variables A, B, C and D are in the memory
Load A        // load A in the accumulator
Add B         // add B to the accumulator
Multiply C    // multiply the accumulator by C
Div D         // divide the accumulator by D
• What is the accumulator code for this expression? (A+B) / (C+D)

Packed decimal ('binary-coded decimal')

• The digits are: 0 to 9
• Each digit is represented as a 4-bit number
• In addition, two digits are stored side-by-side in an 8-bit number
• This operation is called 'packing'
• For example: the number 47 is written as 4: '0100' and 7: '0111'
• The packed representation puts these two digits in one byte: '01000111'
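Packing two digits into one byte is a shift and an OR. The Python sketch below reproduces the "47" example; the helper names `pack_bcd` and `unpack_bcd` are made up for illustration.

```python
def pack_bcd(tens: int, ones: int) -> int:
    """Pack two decimal digits into one byte: high nibble = tens, low nibble = ones."""
    if not (0 <= tens <= 9 and 0 <= ones <= 9):
        raise ValueError("each digit must be between 0 and 9")
    return (tens << 4) | ones


def unpack_bcd(byte: int):
    """Split a packed byte back into its two decimal digits."""
    return byte >> 4, byte & 0x0F


packed = pack_bcd(4, 7)
print(format(packed, "08b"))  # 01000111
print(unpack_bcd(packed))     # (4, 7)

# For comparison, the unpacked form is simply the two ASCII codes of "47"
print(" ".join(format(code, "08b") for code in "47".encode("ascii")))  # 00110100 00110111
```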

Load‐Store Architecture code:

• The expression (A + B) * C/D is evaluated with the following code in a load-store architecture
• The variables A, B, C and D are initially in the memory
Load R1, A        // copy A from memory into R1
Load R2, B        // copy B from memory into R2
Add R3, R1, R2    // R3 contains A+B
Load R1, C        // copy C from memory into R1
Mul R3, R3, R1    // R3 now contains (A+B)*C
Load R1, D        // copy D from memory into R1
Div R3, R3, R1    // R3 contains the final result
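The load-store sequence can be traced with a small register-machine sketch in Python. Everything here is illustrative: the function `run_load_store`, the tuple encoding, the sample values, and the choice of integer division for 'Div' are assumptions.

```python
import operator

# ALU operations work on registers only; Div is assumed to be integer division
ALU_OPS = {"Add": operator.add, "Mul": operator.mul, "Div": operator.floordiv}


def run_load_store(program, memory):
    """Run Load and three-operand ALU instructions; return the register file."""
    regs = {}
    for op, *operands in program:
        if op == "Load":
            dst, var = operands
            regs[dst] = memory[var]  # the only instruction that touches memory
        else:
            dst, src1, src2 = operands
            regs[dst] = ALU_OPS[op](regs[src1], regs[src2])
    return regs


program = [
    ("Load", "R1", "A"),
    ("Load", "R2", "B"),
    ("Add", "R3", "R1", "R2"),
    ("Load", "R1", "C"),
    ("Mul", "R3", "R3", "R1"),
    ("Load", "R1", "D"),
    ("Div", "R3", "R3", "R1"),
]
memory = {"A": 2, "B": 4, "C": 5, "D": 3}  # hypothetical values

regs = run_load_store(program, memory)
print(regs["R3"])    # 10
print(len(program))  # 7
```

Four of the seven instructions are loads, which shows why load-store code tends to contain more instructions than the register-memory version of the same expression.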

Jal MIPS instruction:

• The instruction 'jal' (jump and link) jumps to the label and saves the return address automatically in register $ra
jal Procedure1
- The register $ra is always used; therefore, it's not encoded in the instruction
- Such a register is generally called the 'link register'

Stack Architecture Code Example 2:

• This is the expression we're evaluating: C = A + B
• In this approach, the code pushes A and B from the memory onto the stack
• The 'Add' operation adds them
• Finally, the 'Pop' operation grabs the top of the stack (the result of the addition) and stores it in the variable C in the memory
Push A
Push B
Add
Pop C
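The four stack instructions can be traced with a minimal stack-machine sketch. The Python below is only an illustration: the function `run_stack_program`, the tuple encoding of the instructions and the values of A and B are assumptions.

```python
def run_stack_program(program, memory):
    """Run Push/Add/Pop instructions; operations take their operands from the stack."""
    stack = []
    for op, *args in program:
        if op == "Push":
            stack.append(memory[args[0]])  # memory -> top of stack
        elif op == "Pop":
            memory[args[0]] = stack.pop()  # top of stack -> memory
        elif op == "Add":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)     # the result goes back on the stack
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory


memory = {"A": 2, "B": 3, "C": 0}  # hypothetical values
program = [("Push", "A"), ("Push", "B"), ("Add",), ("Pop", "C")]

run_stack_program(program, memory)
print(memory["C"])  # 5
```

Note that 'Add' names no operands at all; no registers have to be allocated by the compiler.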

