CS 271 - Computer Architecture and Assembly Language
large value prefixes for decimals and binary
[prefix]: [decimal multiplier] or [binary multiplier]; kilo: 10^3 (k) or 2^10 (Ki); mega: 10^6 (M) or 2^20 (Mi); giga: 10^9 (G) or 2^30 (Gi); tera: 10^12 (T) or 2^40 (Ti); peta: 10^15 (P) or 2^50 (Pi); exa: 10^18 (E) or 2^60 (Ei); zetta: 10^21 (Z) or 2^70 (Zi); yotta: 10^24 (Y) or 2^80 (Yi); binary prefixes often omit the 'i' and are followed by b/B for bits/bytes
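The gap between the decimal and binary multipliers widens with each prefix. A minimal Python sketch of that drift (the names PREFIXES, decimal_multiplier, and binary_multiplier are hypothetical, chosen only for this illustration):

```python
# Prefixes in increasing order; the k-th prefix uses 10^(3k) or 2^(10k).
PREFIXES = ["kilo", "mega", "giga", "tera", "peta", "exa", "zetta", "yotta"]

def decimal_multiplier(prefix):
    """Return the decimal multiplier, e.g. kilo -> 10**3."""
    return 10 ** (3 * (PREFIXES.index(prefix) + 1))

def binary_multiplier(prefix):
    """Return the binary multiplier, e.g. kilo -> 2**10."""
    return 2 ** (10 * (PREFIXES.index(prefix) + 1))

for p in PREFIXES:
    # Ratio of binary to decimal multiplier grows with each step.
    ratio = binary_multiplier(p) / decimal_multiplier(p)
    print(p, round(ratio, 3))
```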
stored program
a program is an organized list of instructions and data, both of which reside in the same memory rather than being separated and managed independently; key concept of Von Neumann computer architecture
backwards compatibility
ability of new devices to support old systems
array address calculation
address of list[n] = address of list + n x TYPE (zero-based index); equivalently, the nth element counting from 1 is at address of list + (n - 1) x TYPE
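A small Python sketch of this calculation, assuming a zero-based index (the function name and the base address 0x1000 are made up for illustration):

```python
def element_address(base_address, index, type_size):
    """Address of list[index], assuming a zero-based index."""
    return base_address + index * type_size

# Hypothetical DWORD array (TYPE = 4) whose first element lives at 0x1000.
base = 0x1000
print(hex(element_address(base, 0, 4)))  # first element sits at the base address
print(hex(element_address(base, 3, 4)))  # fourth element is 3 * 4 bytes further on
```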
passing parameter by reference
address of memory location passed using OFFSET, change will be visible outside the procedure
endianness with strings and arrays
affects byte (not bit) ordering, so it only impacts multi-byte data types; in arrays, index ordering is not impacted, but the bytes within each multi-byte element can be; in strings of single-byte characters, endianness does not have any impact
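One way to see this is with Python's struct module, where "<" requests little-endian and ">" big-endian packing. The values below are arbitrary examples, not taken from the course material:

```python
import struct

value = 0x12345678

little = struct.pack("<I", value)  # least significant byte first
big = struct.pack(">I", value)     # most significant byte first
print(little.hex(" "))  # 78 56 34 12
print(big.hex(" "))     # 12 34 56 78

# Array of two 16-bit words: element order is identical either way,
# only the two bytes inside each element are swapped.
words_le = struct.pack("<2H", 0x0102, 0x0304)
words_be = struct.pack(">2H", 0x0102, 0x0304)

# A string of single-byte characters has no multi-byte units to reorder.
text = "abc".encode("ascii")
```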
unconditional branching
always causes branching
indirect operand
array access method uses any 4-byte multi-purpose register surrounded by brackets ex: [ESI]
indexed operand
array access method uses data label of array as placeholder for array's base address ex: myArray[8], many other forms
base+offset
array access method - type of indirect operand uses register (base pointer) + immediate or register (byte offset) surrounded by brackets ex: [EBP + 4] or [EDI + EAX]
register indirect
array access method - type of indirect operand uses register surrounded by brackets which is directly incremented/decremented to reference elements ideal for iterating through arrays ex: [ESI] and ADD ESI, TYPE myArray
TYPE
array operator returns number of bytes of each element in the array
LENGTHOF
array operator returns number of elements in the array (only reads first line of multi-line declaration)
SIZEOF
array operator returns size of memory assigned in declaration of array (only reads first line of multi-line declaration) LENGTHOF x TYPE = SIZEOF
decimal
base 10 digits: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
hexadecimal
base 16 digits: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F
binary
base 2 digits: 0, 1
octal
base 8 digits: 0, 1, 2, 3, 4, 5, 6, 7
post-value number representation
base indicated after last digit: binary (b), octal (o), decimal (d), hexadecimal (h) leading 0 added if first digit is non-numeric ex: 1011b, 173o, 432d, 0A1Bh
pre-value number representation
base indicated prior to value: binary (b), octal (t), decimal (n), hexadecimal (x) leading 0 required ex: 0b1011, 0t173, 0n432, 0xA1B
big vs. little endian benefits
big endian - simpler to read, easier to sign check little endian - simpler arithmetic operations, typecasting, and some types of parity checking (odd vs. even)
integer literal radix indicators
binary (b, y) octal (o, q) decimal (d, t, none - default) hexadecimal (h)
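A rough Python sketch of how a trailing radix letter selects the base. It assumes the default radix is decimal, so a final b or d is always read as a radix letter; parse_literal and RADIX are hypothetical names:

```python
# Suffix letter -> base, following the radix indicators listed above.
RADIX = {"b": 2, "y": 2, "o": 8, "q": 8, "d": 10, "t": 10, "h": 16}

def parse_literal(text):
    """Parse an integer literal with an optional trailing radix letter."""
    text = text.strip().lower()
    suffix = text[-1]
    if suffix in RADIX:
        return int(text[:-1], RADIX[suffix])
    return int(text, 10)  # no suffix: decimal by default

for literal in ["1011b", "173o", "432d", "432", "0A1Bh"]:
    print(literal, "=", parse_literal(literal))
```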
fundamental data types
bit nibble (4 bits) byte (8 bits) word (2 bytes) etc...
central processing unit (CPU)
carries the bulk of the computational load; subcomponents: system clock, registers, buses, ALU, CU, memory I/O, on-chip memory
literal
character or numeric value
components of an instruction
code label (optional) instruction mnemonic (required) operands (often required) comments (optional)
return address
code segment address where the program will return to after the called procedure finishes running
linker
combines machine language objects into a single executable file
system clock
computer's metronome guarantees synchronization of operations
arithmetic logic unit (ALU)
computes basic arithmetic (addition, subtraction, multiplication, division) and logical (and, or, not) operations
floating point unit (FPU)
computes floating point arithmetic operations
preconditions
conditions that must be satisfied before a procedure is called, a subset of which are inputs
postconditions
conditions that will exist after a procedure is called, a subset of which are outputs
register
container for storing data fastest access memory on the chip
passing parameter by value
contents of memory location passed, changes will not be visible outside the procedure
runtime stack
contiguous series of memory locations used to pass parameters to/from procedures and for general purpose temporary storage
direction flag (DF/UP)
control flag (bit 10 of EFLAGS) controls direction of automatic memory traversal for string instructions, set using STD for auto-decrement of ESI/EDI (moving backwards in memory), clear using CLD for auto-increment of ESI/EDI (moving forward in memory)
interrupt enable flag (EI/I)
control flag (bit 9 of EFLAGS) set to issue hardware-generated interrupts, clear to mask them
control flags
controls toggles for the program bits 9, 10 of EFLAGS
input/output (I/O) unit
controls transfer of information between memory/CPU and attached onboard/peripheral devices
cross assembler
assembler that runs on one type of machine but produces machine language for a different type of machine
control register
coordinates execution of instruction by running a micro-program
ways to save data
copy data to a memory variable vs. push data to the stack
listing file (.lst)
copy of the program's source code, with memory addresses (in hex) of data/code labels and instructions, along with hex opcodes
$
current location counter gives address of instruction or data label of the current line only can be used to find the length of strings or arrays
types of instructions
data movement arithmetic/logical operations comparison branching procedure flow control
when to preserve register contents
data saved by the calling procedure vs. data saved by the called procedure
stack
data structure with a LIFO (Last-In, First-Out) property operations reference the top of the stack, manipulate ESP, and update the value of ESP
instruction decoder
decodes current instruction and passes details to control register
directives
direct instructions to the assembler
INVOKE
directive calls a procedure at the address given by the expression w/ comma-delimited arguments ex: INVOKE ExitProcess, 0
LOCAL
directive creates unique data or code label local to a particular procedure or macro creates and terminates stack frame, so do not include EBP set-up/take-down instructions if using LOCAL
=
directive defines numerical constants
EQU
directive defines numerical or text constants can be used w <> to prevent expressions from being evaluated before the program is assembled
TEXTEQU
directive defines text macros which can be a literal string, the contents of an existing text macro, or an expression (if preceded by %)
ENDM
directive defines the end of a macro; written on its own line without the macro name ex: ENDM
ENDP
directive defines the end of a procedure ex: main ENDP
MACRO
directive defines the start of a macro ex: mWriteStr MACRO
PROC
directive defines the start of a procedure ex: main PROC
INCLUDE
directive inserts source code from specified file into current file during assembly ex: INCLUDE Irvine32.inc
.code
directive marks beginning of the code segment ex: .code
.data
directive marks beginning of the data segment ex: .data
END
directive marks the end of the module, optionally set program entry point (otherwise execution starts on first byte of generated executable file) ex: END main
.stack
directive sets the size of the stack ex: .STACK [size] defaults to 1024
USES
directive used in conjunction with a procedure header to specify which registers to save ex: someProc PROC USES EAX EBX
components of MASM programs
directives identifiers memory locations literals data types instructions
DWORD, WORD, BYTE...
directives specify memory space to be allocated for variable storage ex: valA DWORD 40
REAL8
double precision floating point data type 64 bits - 1 sign bit, 11 exponent bits, 52 mantissa bits
even parity
each bit grouping must have an even number of 1 bits
odd parity
each bit grouping must have an odd number of 1 bits
multicomputer parallelism
each processor has its own memory, completes its own subtask, then communicates w other processors through an interconnection network ex: internet, cloud computing
protected mode
each running program is reserved a linear address space in memory of 4GB, and programs aren't allowed to access each other's allocated space, prevents direct access to EIP native operating mode
FPU components
eight 10-byte registers named R0-R7, organized as a pushdown stack TOP indicates top register ST(0)-ST(7) correspond to the top through bottom of stack
signal noise
electrical interference
electronic representation of bits
energy states (voltage levels) within circuits, with values over X assigned to 1 and values under X assigned to 0
post-test loop
equivalent to a do-while loop will execute 1 or more times
counted loop
equivalent to a for loop will execute until ECX is 0, ECX is decremented after every loop
procedure
equivalent to a function/method in high-level programming languages
pre-test loop
equivalent to a while loop will execute 0 or more times
conditional branching
equivalent to if/elif/else statements
parallelism
execution of multiple parts of code at the same time
REAL10
extended precision floating point data type 80 bits - 1 sign bit, 15 exponent bits, 64 mantissa bits
parity bit
extra bit added to each bit grouping, set to either 0 or 1 to make the grouping match the system parity
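A minimal Python sketch of choosing the parity bit for a grouping, under even or odd system parity (parity_bit is a hypothetical helper name):

```python
def parity_bit(data, even=True):
    """Return the bit to append so the total count of 1 bits is even (or odd)."""
    ones = bin(data).count("1")
    bit_for_even = ones % 2  # 1 only when the data has an odd count of 1 bits
    return bit_for_even if even else bit_for_even ^ 1

# 0b1011001 has four 1 bits: even parity adds 0, odd parity adds 1.
print(parity_bit(0b1011001, even=True))
print(parity_bit(0b1011001, even=False))
```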
stack and heap shared space
finite space in stack segment stack starts at end and grows toward the beginning heap starts at beginning and grows toward the end
FADD
floating point instruction add source to destination
FABS
floating point instruction clear sign of ST(0), so gives absolute value
FST / FIST
floating point instruction copy top of FPU stack to operand, does not pop value so ST(0) remains unchanged
FDIV
floating point instruction divide destination by source
FDIVR
floating point instruction divide source by destination
FCHS
floating point instruction invert sign of ST(0)
FMUL
floating point instruction multiply source by destination
FSTP / FISTP
floating point instruction pop top of FPU stack to operand, shifts ST(n)
FLD / FILD
floating point instruction pushes value from operand to FPU stack
FSUBR
floating point instruction subtract destination from source
FSUB
floating point instruction subtract source from destination
FSQRT
floating point instruction take square root of destination
base number systems
for base n, each digit represents a multiple of a power of n positions are numbered right to left, starting at 0, corresponding to their power of n
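The positional rule can be sketched in Python as a sum of digit x base^position. The function name is hypothetical and digits are given as a list of integers, most significant first:

```python
def positional_value(digits, base):
    """Sum digit * base**position, with position 0 at the rightmost digit."""
    total = 0
    for position, digit in enumerate(reversed(digits)):
        total += digit * base ** position
    return total

print(positional_value([1, 0, 1, 1], 2))   # binary 1011
print(positional_value([10, 1, 11], 16))   # hexadecimal A1B
```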
EBP
general purpose register special purpose: base (frame) pointer, points to the base of a stack frame (should not be used for anything else)
ESP
general purpose register special purpose: stack pointer, points to the top of the runtime stack aka last value pushed or added to the stack (should not be used for anything else)
EDI
general purpose register special purpose: used by memory transfer functions as the destination address (should not be used for anything else)
ESI
general purpose register special purpose: used by memory transfer functions as the source address (should not be used for anything else)
general purpose registers
general storage of values, pointers, operands, etc. most have additional special purposes, except for EBX list: EAX, EBX, ECX, EDX, ESI, EDI, ESP, EBP
complex instruction set computer (CISC)
has ISA that creates a micro-program for each instruction which is then executed
reduced instruction set computer (RISC)
has ISA that directly executes instructions and has a smaller set of instructions
instruction register (IR)
holds code describing the current instruction being executed
instruction pointer (IP)
holds memory address of the next instruction to be executed
memory address register (MAR)
holds memory address where the read/write will occur this value goes to the address bus
memory data register (MDR)
holds value to be written to / read from memory this value goes to / comes from the data bus
code label
identifiers that refer to memory locations in the code segment, allowing us to use the code label instead of the address offset when referring to a specific instruction or block of instructions
data label
identifiers that refer to memory locations in the data segment, allowing us to use the data label instead of the address offset when referring to a specific variable
status flags
indicate results of arithmetic instructions bits 0, 2, 4, 6, 7, 11 of EFLAGS
initializer
initial value put into the allocated memory space for the data label ? indicates leaving the memory location uninitialized
CLD
instruction clears direction flag, primitives will automatically increment their pointer
POP
instruction copies value at top of stack into operand then increments ESP value is not removed, still there until overwritten cannot POP to immediate
LOOP
instruction decrements ECX, then jumps to the address of the operand if ECX is not 0; destination address must be within -128 to +127 bytes of the address of the next instruction
PUSH
instruction decrements ESP then copies operand onto the stack change in ESP memory address equal to size of operand (in bytes)
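A toy Python model of PUSH and POP with 4-byte operands, assuming a 64-byte memory region and little-endian storage (RuntimeStack and its fields are hypothetical names, not a real API):

```python
import struct

class RuntimeStack:
    """Toy model: a byte array for memory and an ESP that grows downward."""

    def __init__(self, size=64):
        self.memory = bytearray(size)
        self.esp = size  # empty stack: ESP sits just past the end of memory

    def push(self, value):
        self.esp -= 4  # decrement ESP by the operand size first
        struct.pack_into("<I", self.memory, self.esp, value)

    def pop(self):
        (value,) = struct.unpack_from("<I", self.memory, self.esp)
        self.esp += 4  # the old bytes stay in memory until overwritten
        return value

stack = RuntimeStack()
stack.push(1)
stack.push(2)
print(stack.pop(), stack.pop())  # last in, first out
```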
FINIT
instruction initializes FPU stack, must be executed before any floating point instructions
CMP
instruction performs implied subtraction of second operand from first operand; updates EFLAGS but does not overwrite either operand; used in conditional branching
RET
instruction pops top of stack into EIP (when given no operands); w operand n, adds n bytes to ESP after EIP is assigned, which removes any parameters pushed onto the stack before the CALL
CALL
instruction pushes EIP corresponding to return address onto the stack then jumps to beginning of named procedure
JMP
instruction sets EIP to address specified by the operand used in unconditional branching
STD
instruction sets direction flag, primitives will automatically decrement their pointer
pipelining
instruction broken down into parts and hardware provides separate units that each perform a specific part
EIP
instruction pointer register that contains the offset in the current code segment of the next instruction to be executed cannot be directly accessed
types of hardware parallelism
instruction-level parallelism (pipelining, instruction caching) processor-level parallelism (multiprocessors, multicomputers)
Jcond
instructions JMP to address of operand if condition is satisfied used in conditional branching
PUSHA / POPA
instructions push/pop all of the 16-bit versions of the general purpose registers onto/from the stack order of push: AX, CX, DX, BX, SP, BP, SI, DI
PUSHAD / POPAD
instructions push/pop all of the 32-bit general purpose registers onto/from the stack order of push: EAX, ECX, EDX, EBX, ESP, EBP, ESI, EDI
PUSHFD / POPFD
instructions push/pop the 32-bit EFLAGS register onto/from the stack
on-chip memory (L1 cache)
largest memory that resides on the CPU very fast but very expensive, so generally small
little endian
least significant byte stored first in memory ex: x86-64 systems
block starting symbol (bss) segment
location in memory of additional static data for variables or constants
code segment
location in memory of current program
data segment
location in memory of data from current program
stack segment
location in memory of runtime stack and heap for current program
language hierarchy
lvl 4: natural languages lvl 3: programming languages lvl 2: assembly languages lvl 1: machine language lvl 0: machine hardware one-to-one correspondence of assembly languages to machine languages
control unit (CU)
manages flow of execution subcomponents: IP, IR, instruction decoder, control register, status register
big endian
most significant byte stored first in memory ex: networks
multiprocessor parallelism
multiple processors each complete their own subtask and communicate through shared memory
multiprocessor vs. multicomputer parallelism
multiprocessor system is hard to build but relatively easy to program multicomputer system is relatively easy to build but hard to program
input-output parameter
must be passed by reference data will be used and memory containing data will be modified by procedure
output parameter
must be passed by reference memory containing data will be modified by procedure
identifiers
names of variables, functions, labels, etc.
identifier restrictions
no spaces not case sensitive can start w and include letters, underscore, @, ?, $ can include numerical digits after the first character cannot be a reserved word
real-address mode
only one program can run at a time, any memory address within its 1 MB memory space is accessible including those directly linked to hardware provides compatibility with legacy programs
implied operand
operand that is used implicitly without being defined usually in single-operand instructions
OFFSET
operator returns pointer to a memory variable
PTR
operator used when destination operand is de-referenced pointer bc it does not have an implicit type ex: MOV DWORD PTR [EDI], 10 instead of MOV [EDI], 10
DUP
operator that replaces the initializer for a data label and allocates space for the storage of multiple items usually used when defining empty strings or arrays n DUP(m) for n contiguous elements initialized to m
software parallelism
parallelizability of algorithms depends on: the number of processors available; the tradeoff between increased efficiency and the costly network overhead of parallel activities; the proportion of sequential vs. parallel code
potentially parallelizable part
part of program that can be parallelized, either partially or completely
inherently sequential part
part of program that cannot be parallelized
bus
physical pipelines for transfer of data transfer rates correspond to bus width
RISC benefits
physically smaller, so same performance for cheaper lower complexity of CPU circuitry, so long-term reliability lower power consumption, so more environmentally friendly lower operating temperature, so reduce costs of heat dissipation components
segment registers
pointers to specific segments of memory, whose usage depends on the mode of operation
ways to push data to the stack
preserving/restoring all registers using PUSHAD/POPAD vs. selectively preserving/restoring used registers using PUSH/POP (*preferred*)
ECX
primary general purpose register special purpose: counter
EDX
primary general purpose register special purpose: implied extension for EAX for ALU operations that extend beyond 32-bits
EAX
primary general purpose register special purpose: implied operand in many ALU operations
EBX
primary general purpose register special purpose: none
SCASB / SCASW / SCASD
primitive instruction compare accumulator to memory addressed by EDI, then updates pointer
CMPSB / CMPSW / CMPSD
primitive instruction compare memory addressed by ESI to memory addressed by EDI, then updates pointers
MOVSB / MOVSW / MOVSD
primitive instruction copy data from memory addressed by ESI into memory addressed by EDI, then updates pointers
LODSB / LODSW / LODSD
primitive instruction load memory address by ESI into accumulator, then updates pointer
STOSB / STOSW / STOSD
primitive instruction store accumulator contents into memory addressed by EDI, then updates pointer
RISC design principles
prioritize single-cycle instruction execution instructions executed directly by hardware use of instruction cache only two memory access instructions - LOAD and STORE simplification of instructions few addressing modes
called procedure
procedure that is the operand of the CALL instruction
control module
procedure that provides an outline for the program and calls other procedures
calling procedure
procedure where the CALL instruction is used
system management mode
processor will switch to a separate address space so vital memory isn't affected, program execution halted until end enables special configurations to be implemented
IA-32 architecture modes of operation
protected mode real-address mode system management mode
loader
reads an executable file and stores contents into memory for execution
memory I/O
reads/writes information in main memory using MAR and MDR
types of operands
reg - register; mem - memory location; imm - immediate; accum - AL, AX, or EAX; segreg - segment register; shortlabel - location in the code segment w/i -128 to +127 bytes of current location; nearlabel - location in the current code segment; farlabel - location in an external code segment; instruction - instruction
instruction set architecture (ISA)
registers, instructions, and operands which define an assembly language
REP
repeat prefix; while ECX > 0, runs instruction then decrements ECX (the instruction is skipped entirely if ECX starts at 0)
REPNZ / REPNE
repeat prefix runs instruction then decrements ECX, repeat as long as ECX > 0 and zero flag is clear only SCAS* and CMPS* modify zero flag
REPZ / REPE
repeat prefix runs instruction then decrements ECX, repeat as long as ECX > 0 and zero flag is set only SCAS* and CMPS* modify zero flag
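As a rough Python model of a repeat prefix combined with a primitive, the sketch below mimics REPNE SCASB searching memory for the byte in AL. The function name, its parameters, and the returned tuple are assumptions made for illustration only:

```python
def repne_scasb(memory, edi, ecx, al, df=0):
    """Scan bytes at EDI for AL; stop on a match or when ECX reaches 0."""
    zf = False
    step = -1 if df else 1      # direction flag set -> move backward
    while ecx != 0:
        zf = memory[edi] == al  # SCASB compares AL with the byte at [EDI]
        edi += step             # pointer is updated after every comparison
        ecx -= 1
        if zf:                  # REPNE repeats only while ZF is clear
            break
    return edi, ecx, zf

data = b"hello"
print(repne_scasb(data, 0, len(data), ord("l")))
```

On a match, EDI ends up one element past the matching byte, which is why code that uses the result typically steps the pointer back by one.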
stack trace
report of the active stack frames at a certain point during the execution of the program
CS
segment register points to beginning of code segment
DS
segment register points to beginning of data segment
ES
segment register points to data storage
FS
segment register points to data storage
GS
segment register points to data storage
SS
segment register points to stack segment
call stack
series of activation records from nested procedures stacked on top of each other
status register
set of flags corresponding to processed data or indicating ways in which data will be processed
instruction mnemonic
short, descriptive phrase that indicates which instruction is being executed
virtual machine
simulates another computer's architecture
REAL4
single precision floating point data type 32 bits - 1 sign bit, 8 exponent bits, 23 mantissa bits
sub-register access method
special property of general purpose registers that allows direct access of a subset of their bits for the 8 general purpose registers, access the lower 16 bits by removing E from register name (AX, BX, CX, DX, SI, DI, SP, BP) for the 4 primary general purpose registers, access the lowest (AL, BL, CL, DL) or highest (AH, BH, CH, DH) bytes of the 16-bit versions
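The sub-register views amount to masking and shifting the 32-bit value. A minimal Python sketch (sub_registers is a hypothetical name, and 0x12345678 is an arbitrary example value):

```python
def sub_registers(eax):
    """Split a 32-bit value the way AX, AH, and AL view EAX."""
    ax = eax & 0xFFFF        # lower 16 bits
    al = ax & 0xFF           # lowest byte of AX
    ah = (ax >> 8) & 0xFF    # highest byte of AX
    return ax, ah, al

ax, ah, al = sub_registers(0x12345678)
print(hex(ax), hex(ah), hex(al))
```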
instruction caching
speeds up rate of fetching instructions and reduces idle time by fetching multiple instructions at a time efficient for sequential areas of programming, but anywhere w branching/repetition/etc causes work to be done caching instructions that will never be executed
pushdown stack
stack with limited capacity that overwrites the bottom element if exceeded, any of the registers can be accessed directly using ST(n)
single-line comment
starts with ; and includes all following text on that line
multi-line comment
starts with COMMENT followed by a special character and includes all text until the special character is encountered again ex: COMMENT !
instruction
statement that makes processor process information
control structure
statements that control the flow of execution of the program
carry flag (CF/CY)
status flag (bit 0 of EFLAGS) set if unsigned arithmetic operation generates a carry/borrow of the most-significant bit of the result
overflow flag (OF/OV)
status flag (bit 11 of EFLAGS) set if signed arithmetic operation generates a result that is too large a positive number or too small a negative number to fit in the destination operand
parity flag (PF/PE)
status flag (bit 2 of EFLAGS) set if the least-significant byte of the result contains an even number of 1 bits
auxiliary carry flag (AF/AC)
status flag (bit 4 of EFLAGS) set if binary-coded decimal arithmetic operation generates a carry/borrow of bit 3 of the result
zero flag (ZF/ZR)
status flag (bit 6 of EFLAGS) set if result is zero
sign flag (SF/PL)
status flag (bit 7 of EFLAGS) set equal to the sign bit of the result
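The status flag definitions above can be sketched for an 8-bit addition as follows. This is a simplified Python model under the assumption of 8-bit operands, not a description of the actual hardware; add8_flags is a hypothetical name:

```python
def add8_flags(a, b):
    """Add two 8-bit values and return (result, flags) like an 8-bit ADD."""
    a &= 0xFF
    b &= 0xFF
    raw = a + b
    result = raw & 0xFF
    flags = {
        "CF": raw > 0xFF,                                 # unsigned carry out of bit 7
        "OF": ((a ^ result) & (b ^ result) & 0x80) != 0,  # signed result does not fit
        "PF": bin(result).count("1") % 2 == 0,            # even number of 1 bits
        "AF": (a & 0xF) + (b & 0xF) > 0xF,                # carry out of bit 3
        "ZF": result == 0,
        "SF": (result & 0x80) != 0,                       # copy of the sign bit
    }
    return result, flags

print(add8_flags(0x7F, 0x01))  # signed overflow without an unsigned carry
print(add8_flags(0xFF, 0x01))  # unsigned carry without a signed overflow
```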
EFLAGS
status/control register used to control/report on the status of the processor
[reg]
syntax to dereference a pointer and give the value at that address instead
activation record / stack frame
the return address, arguments, local variables, and preserved registers of a procedure, which are all pushed onto the stack
computer memory model
to access data: 1. place memory address on address bus; 2. trigger memory read; 3. wait a clock cycle for the information to be copied onto the data bus; 4. move data into its destination so it can be processed by another component. data is copied, so a memory write must be triggered if changes are to be saved. bottleneck at step 3 b/c the data bus is used for both instructions and data, so additional tiers of cache memory exist
parity
total number of 1 bits (including the added parity bit) in a grouping
precision vs. range in floating point numbers
tradeoff between precision (values as small as 1.4 x 10^-45) and range (up to +/- 3.4 x 10^38) for single precision (REAL4); biased exponents in [0, 126] indicate more precision, a biased exponent of 127 is perfectly centered (true exponent of 0), and biased exponents in [128, 255] indicate more range
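To see the exponent ranges concretely, a single-precision value can be split into its sign, exponent, and mantissa fields. A minimal Python sketch, assuming the 1/8/23-bit single-precision layout (float32_fields is a hypothetical helper):

```python
import struct

def float32_fields(x):
    """Return (sign, biased_exponent, mantissa) of x as a single-precision value."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF   # 8 exponent bits, biased by 127
    mantissa = bits & 0x7FFFFF       # 23 stored mantissa bits
    return sign, exponent, mantissa

for x in [0.5, 1.0, -2.0]:
    print(x, float32_fields(x))
```

Values with magnitude below 1 land in the lower exponent range, 1.0 sits at the centered exponent of 127, and larger magnitudes land in the upper range.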
caching
transferring information from a farther, slower memory location to a closer, faster one
input/output (I/O) bus
transfers data between CPU and attached onboard/peripheral devices
data bus
transfers instructions/data between RAM and CPU
address bus
transfers memory addresses between RAM and CPU
assembler
translates assembly language code to machine language objects
compiler
translates programming language code to assembly language code
branching structure
type of control structure moves execution of the program to a specified point in the code segment
repetition structure
type of control structure repeats execution of a certain segment of the program
control bus
uses signals to coordinate all devices attached to the system
input parameter
usually passed by value, but may be passed by reference (ex: arrays) data will be used by procedure
return value
value determined by the called procedure and communicated back to the calling procedure
argument (actual parameter)
value/reference that is passed to a procedure
parameter (formal parameter)
value/reference that is received by a procedure
bit error
variation of energy state that causes the converter to incorrectly assign a bit value surprisingly common, especially in high data-rate or long distance signal transmission
source operand
where data is copied from usually the second operand in two-operand instructions
destination operand
where result is stored usually the first operand in two-operand instructions
range of unsigned integers w/ n bits
0 to 2^n - 1
RISC instruction execution cycle
1. FETCH instruction 2. EXECUTE instruction
CISC instruction execution cycle
1. FETCH next instruction to be executed (IP incremented immediately after) 2. DECODE instruction (and fetch operands if required) 3. EXECUTE instruction by feeding it and any operands into the control unit (and store result into output operand)
comparison of when/how to preserve/restore registers
1. calling procedure uses memory; 2. calling procedure uses the stack (both require the calling procedure to track used registers, which does not promote good procedure modularity); 3. called procedure uses memory (good modularity, but requires additional memory allocation and usage of global variables); 4. called procedure uses runtime stack (*preferred*: good modularity, no global variables, efficient in resource usage)
ways to pass data to/from procedures
1. shared memory: usage of global variables has a high risk of side effects; 2. pass parameters in registers (ex: Irvine Library): violates the "register states preserved" property of good modularization, and only a very limited number of registers can be used; 3. pass parameters on the stack (*preferred*): reduces code clutter, good modularity, can be used by high-level programming languages
range of signed integers w/ n bits
-2^(n-1) to 2^(n-1) - 1
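Both range formulas (unsigned and signed) are easy to check numerically. A short Python sketch with hypothetical helper names:

```python
def unsigned_range(n):
    """Range of an n-bit unsigned integer: 0 to 2^n - 1."""
    return 0, 2 ** n - 1

def signed_range(n):
    """Range of an n-bit signed integer: -2^(n-1) to 2^(n-1) - 1."""
    return -(2 ** (n - 1)), 2 ** (n - 1) - 1

for bits in (8, 16, 32):
    print(bits, unsigned_range(bits), signed_range(bits))
```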
number, size, and type of registers
8x 32-bit general purpose registers 6x 16-bit segment registers 1x 32-bit program status and control register 1x 32-bit instruction pointer register
ways to define constants
= EQU TEXTEQU
CISC primary components
CPU RAM I/O Unit
CISC vs. RISC
RISC programs are longer bc there are fewer built-in instructions (costing memory), but execute quicker bc there is no micro-program overhead (gaining performance)