CS 354 - Final Exam Prep


1] What is unistd.h?

Header file which contains a collection of system call wrappers. A system call asks the OS to do something (ie open a file, write to a file); a wrapper bundles the details of making that call behind an ordinary C function.

1] What is the POSIX API?

Portable Operating System Interface - a standard for maintaining compatibility. Allows a standardized way for software to interact with the operating system across different platforms (different computing devices running different operating systems: Linux/Windows/etc)

1] stdlib.h contains a collection of 25 common c functions, name most common types

conversion functions: atoi --> converts string to int, strtol --> converts string to long integer (base is specified as a param: 10 for decimal, 2 for binary, 16 for hex)
execution flow: abort, exit
math: abs
searching functions: bsearch (binary search)
sorting functions: qsort (quick sort)
random number: random, srandom (difference is srandom allows you to set the seed, which gives a specified initial value for the sequence random then produces)

1] In String I/O (Input & Output), what do sscanf and sprintf do?

int sscanf(const char *str, const char *format_string, &v1, &v2, ...) → reads formatted input from str, stores the converted values at the given variable addresses, returns the number of items successfully matched & assigned (negative/EOF on failure) int sprintf(char *str, const char *format_string, v1, v2, ...) → writes formatted output to the specified str (the buffer must already exist and be big enough - writing past it is undefined behavior at runtime, not a compiler error), returns the number of chars written (not counting the terminating '\0'), or negative on error

[insert questions about operand specifiers - to assist with memorizing them]

something

1] __ to __ exception numbers are defined by processor, __ to __ exception numbers are defined by OS (operating system). 2] List all the exception numbers we need to know for processor and what they do. List all the exception numbers we need to know for OS and what they do. 3] What are the steps for making system calls in assembly?

1] 0-31 defined by processor, 32-255 defined by O.S. 2] Processor: 0 divide by 0, 13 general protection fault (ie SEG FAULT), 14 page fault, 18 machine check (hardware error). O.S: 128 ($0x80) trap - system call 3] Step 1] put the service number in %eax Step 2] put the system call arguments in registers: %ebx, %ecx, %edx, %esi, %edi Step 3] int $0x80 (128) - trap into the kernel
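The 3 steps above, sketched as IA-32 assembly for the write system call (on 32-bit Linux, write is service number 4; the msg label is made up):

```asm
.data
msg:    .string "hi\n"
.text
        movl  $4, %eax        # Step 1: service number for write in %eax
        movl  $1, %ebx        # Step 2: args in %ebx, %ecx, %edx - fd 1 (stdout)
        movl  $msg, %ecx      #          buffer address
        movl  $3, %edx        #          byte count
        int   $0x80           # Step 3: trap (exception 128) into the kernel
```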

1] Consider this code: int a[ROWS][COLS]; for (int c = 1; c < COLS; c++) { for (int r = 1; r < ROWS; r++) { a[r][c] = r * c; } } a.) What does the memory layout look like for this code? b.) What is the stride of the code? c.) Is it good or bad spatial locality? 2] Explain good or bad locality for a.) instruction flow: sequencing, selection, repetition? b.) searching algorithms: linear search, binary search? Warning: this question is talking about instruction flow (so the instructions/operands, not the actual data itself)

1] a.) It's a SAA, so it's in row-major order (multiple rows laid out contiguously) b.) Stride of the code = # COLS = poor locality c.) Bad locality: the inner loop goes row to row (down a column) and the outer loop col to col, so consecutive accesses are # COLS elements apart, not adjacent.
2] a.) Sequencing (top-to-bottom execution: complete the instruction on line 1, then line 2, etc): spatial locality is good (adjacent instructions), no temporal locality (a line is not revisited, unless there is repetition)
Selection (conditionals): spatial locality is poor (causes jumps in the code when the conditional isn't satisfied), no temporal locality
Repetition (loops): spatial locality is good if the stride is small (instructions are adjacent), temporal locality is good (revisiting the same set of instructions)
b.) Linear Search - ARRAY_LIST: spatial locality good because arrays store elements contiguously (step size/stride = small). Temporal locality: only on the match var (compared against each element found while traversing).
Linear Search - LINKED_LIST: spatial locality poor because you need to trace pointers (nodes are scattered throughout memory and linked by ptrs, not contiguous), so stride/step size = large. Temporal locality: only on the match var.
Binary Search - ARRAY_BASED: poor spatial locality (bin search means cutting the arr in half, picking left or right, etc, jumping over many elements). Temporal locality: a temp var stores the midpoint you are splitting at, compared against the var you're looking for.
Binary Search - never do it on a linked list: reaching each midpoint requires pointer-chasing traversal, which destroys the O(log N) advantage.

1] What do the following unsigned comparisons do? t = a - b: if a < b (the subtraction borrows) => CF = 1, if a == b => ZF = 1 a.) setb/setnae D - b for below, nae for not above or equal b.) setbe/setna D - be for below or equal, na for not above c.) seta/setnbe D - a for above, nbe for not below or equal d.) setae/setnb D - ae for above or equal, nb for not below 2] What do the following signed (2's complement) comparisons do? a.) setl/setnge D - l for less, nge for not greater or equal b.) setle/setng D - le for less or equal, ng for not greater c.) setg/setnle D - g for greater, nle for not less than or equal d.) setge/setnl D - ge for greater or equal, nl for not less than

1] a.) D <-- CF; if below (<) - sets destination to 1 if CF = 1 (a < b) b.) D <-- CF | ZF; if below or equal (<=) - sets destination to 1 if CF = 1 or ZF = 1 (a <= b) c.) D <-- ~CF & ~ZF; if above (>) - sets destination to 1 if CF = 0 & ZF = 0 (a > b) d.) D <-- ~CF; if above or equal (>=) - sets destination to 1 if CF = 0 (a >= b) 2] a.) D <-- SF ^ OF; if less (<) - sets destination to 1 if exactly one of SF, OF is 1 (SF XOR OF = 1) b.) D <-- (SF ^ OF) | ZF; if less or equal (<=) - sets destination to 1 if SF XOR OF = 1 or ZF = 1 c.) D <-- ~(SF ^ OF) & ~ZF; if greater (>) - sets destination to 1 if SF XOR OF = 0 AND ZF = 0 d.) D <-- ~(SF ^ OF); if greater or equal (>=) - sets destination to 1 if SF XOR OF = 0 (SF and OF agree)

1] What is the formula for Stride Misses (% misses)? 2] Temporal Locality impacts ___? 3] Spatial Locality impacts ___? 4] Memory access speed is not characterized by a single value; instead it's what?

1] % misses = min(1, (word size * K)/B) * 100, where K = stride length & B = block size in bytes, and the min function picks the smaller of 1 & (word size * K)/B 2] size 3] stride 4] It's a landscape that can be explained by temporal & spatial locality (2 things).

1] How do I access a 2D array in assembly? &A[i][j] 2] Given array A as declared above, if xa in %eax, i in %ecx, j in %edx then write A[i][j] in assembly? 3] For compiler optimization, what does the compiler do if only accessing part of array? If taking a fixed stride through the array?

1] &A[i][j] = start addr + offset to row i + offset to col j = xa + (L * C * i) + (L * j), where L is bytes/element & C is # columns (elements per row). 2] For L = 4 and C = 3:
leal (%ecx, %ecx, 2), %ecx   ; %ecx = 3i
sall $2, %edx                ; %edx = 4j - left shift by 2 bits multiplies by 2^2 = 4
addl %eax, %edx              ; %edx = xa + 4j
movl (%edx, %ecx, 4), %eax   ; %eax = M[xa + 4*(3i) + 4j] = A[i][j]
3] If only accessing part of the array, the compiler makes a pointer (address) to that part of the array. If taking a fixed stride through the array, the compiler uses stride * element size as the offset.

1] Convert a[i] into address arithmetic. Explain it. 2] What is equivalent to a[0] = 77;? 3] Is int *p = &a; valid? 4] What is call stack tracing? 5] Explain pass-by-value for scalars, pointers & arrays. 6] Does changing a callee's param change the caller's arg too?

1] *(a+i). Starts at a's beginning address & then adds a byte offset (element index i); the compiler auto-scales by element size: int = 4 bytes, char = 1 byte, double = 8 bytes 2] *(a+0) = 77; or *a = 77; (note: a = 77; is NOT equivalent - you can't assign to an array name) 3] No, produces a warning; supposed to be int *p = a; In C pointers & arrays are closely related but are not the same. 4] Manually tracing function calls (callers to callees); each function gets a box (stack frame), and the "top" box is the currently running function 5] Scalars: the param is a scalar var that gets a copy of its scalar arg. Pointers: the param is a pointer var that gets a copy of a memory address. Arrays: the param is a pointer var that gets a copy of the address of the array's beginning (element [0]) 6] No - the callee's param is a copy, so changing it does not change the caller's arg. But passing an address requires the caller to trust the callee, because the callee can modify what the pointer points to (and the caller doesn't know).

1] How Much address space does a 32-bit processor have? 2] What is address space? A process? Kernel? User Process? 3] Explain C's abstract memory model, all parts (Code Segment, Data Segment, Heap, Stack)

1] A 32-bit processor has 32-bit addresses, 2^32 bytes = 4 GB address space; the highest address in binary is: 1111 1111 1111 1111 1111 1111 1111 1111 = 0xFFFFFFFF
2] Address space refers to the range of valid memory addresses for a process. A process is a running program, the kernel is the memory-resident portion of the O.S (stays in memory throughout operation), & a user process is a process that is not the kernel.
3] Code Segment - contains program instructions, has 2 parts: a. .text section - binary machine code b. .rodata section - "string literals". Lifetime: entire program's execution. Initialization: from the executable object file, EOF (ie a.out), by the loader. Note: EOF means executable object file when talking about a type of file, & end of file when talking about an end char. Access: read-only
Data Segment - contains global & static local vars. Lifetime: entire program's execution. Initialization: from the E.O.F by the loader, has 2 parts: a. .data section - initialized to non-zero values b. .bss section - not initialized/initialized to 0. Access: read/write
Heap (AKA Free Storage) - contains dynamically alloc'd memory (mem alloc'd & free'd by the programmer while the program is running - during execution/runtime). Lifetime: managed by programmer. Initialization: none by default. Access: read/write
Stack (AKA Auto Storage) - contains memory in stack frames, auto alloc'd & free'd by function calls & returns. A stack frame (AKA activation record) contains non-static local vars, params, temp vars, etc. Lifetime: from declaration to end of scope. Initialization: none by default. Access: read/write

1] What is the size of int? int *? 2] What is a pointer? A pointee? 3] What is the address-of operator? The dereference operator? 4] In a Lin Mem Diag how do you get the memory location of an int? 5] What is the pointer value of: int *p2; or int *p2 = NULL; 6] What happens if you dereference p2? ex: printf("%i\n", *p2); //applies for null & for not initializing p2 to anything 7] What does this code do: j = 5; int *p1 = &j; int **q = &p1;

1] 4 bytes for both 2] A pointer contains an address & does the pointing; a pointee is what is pointed to 3] The address-of operator (&) gives the address of a value, and the dereference operator (*) gives the value at an address. - * is also used to declare a pointer type (mem address) and to multiply, so be careful how you use *. 4] Convert the int to binary, then to hex 5] Undefined (AKA garbage or indeterminate); int *p2 = NULL; initializes it to point to nothing 6] Undefined behavior - it doesn't point to anything, resulting in runtime errors 7] q is a pointer to a pointer: q holds the address of p1, so *q is p1 and **q is j

1] What is a context switch? 2] When does context switch happen? Why does it happen? 3] How does a context switch happen? 4] What is the impact of a context switch on the cache?

1] A context switch is when the OS switches from 1 process to another. Requires preservation of the process's context so it can restart. Context includes: - CPU state - User's stack (save %esp, %ebp) - Kernel's stack (%esp, %ebp) - Kernel's data structures: a. page table (maps virtual pages to physical pages) b. process table (which processes are running; each process has an id - pid) c. file table (which files have been opened & given a file descriptor - ie stdin/stdout/stderr) 2] Happens as a result of an exception, when the kernel executes another process (ie the scheduler runs after a timer interrupt). Context switching enables exceptions to be processed. 3] Step 1] SAVE context of current process, Step 2] RESTORE context of some other process, Step 3] TRANSFER control to the restored process *Context switches are very expensive (a lot of work/time for what needs to be saved) compared to just executing the next instruction* 4] Negative - cache pollution: if you have to swap everything out you can't count on what's loaded (creates misses - because you load lots of new things in)

1] What is a function pointer? 2] What do function pointers enable/allow us to do? 3] int func(int x) { ... }. Declare a function pointer, assign the function to the function pointer, use the function pointer as a function.

1] A pointer to code, which stores the address of the 1st instruction of a function (recall: functions are stored in the code segment, code is stored in memory, memory has addresses, thus a function pointer holds the address of a function) 2] Enables functions to be passed to & returned from other functions, and stored in arrays (storing a function pointer in an array = storing the address of the first instruction of the function) 3] int (*fptr)(int); //declares a function pointer with int return/param type fptr = func; //assigns the function to the function pointer - can do &func, but func itself already evaluates to the address of the start of the function int x = fptr(11); //uses the function pointer as a function Note: this function pointer can only point to functions that take an int param and return int

1] What is a dangling pointer? 2] What error do you get when you dereference uninitialized or null pointers? 3] Consider this code: int *p = malloc(sizeof(int)); int *q = malloc(sizeof(int)); ... p = q; //q was pointing to another int before What error does this give? 4] Why shouldn't functions return the address of local vars?

1] A pointer var holding an address to heap memory that has been freed 2] Undefined behavior - often a seg fault, sometimes an intermittent error 3] Memory leak, since the stuff that q was previously pointing to did not get free'd and has nothing pointing to it now. 4] A local var's memory is only available INSIDE THE FUNCTION, thus returning its address is useless because the memory is not valid outside the function. Note: same applies for SAAs, since SAAs are always local to functions.

1] What is a "C" String? Can you modify a C String? "C" String ex: -> char str[9] = "CS 354"; 2] What is a string literal? How is it different from C Strings? 3] In most cases, when using a string literal as a source operand, what address does it provide? 4] Consider code: char *sptr = "CS 354"; & char str[9] = "CS 354"; are the following valid: a. sptr = "mumpsimus"; b. str = "folderol"; 5] Is a "C" String mutable or immutable? A string literal?

1] A sequence of characters terminated with the null character '\0', created as a 1D array of chars of size string length + 1. Yes you can modify a C String, as long as you stay within the bounds of the original array. 2] A constant source-code string, allocated prior to execution (in the .rodata section of the Code Segment) 3] A string literal is a string array alloc'd in the Code Segment, and when accessing it via a pointer, ie: char *sptr = "CS 354"; //note: ptr in stack, points to code seg you get its start address (just as you would for a SAA) 4] a. is OK, because we're reassigning what the pointer points to, which is fine b. str is associated with a char array; assignment is OK in initialization (the compiler does it when building the array) but can't be done after initialization - compiler error 5] A C String is mutable but can't exceed the bounds of its array; a string literal is immutable. *Applies when dealing with string functions, not assignment*

1] What general register stores the return value? 2] What is the frame base pointer - ie how does the callee use the base pointer (%ebp)? 3] How does the caller use %esp (stack pointer)? How does the callee use %esp?

1] Accumulator --> %eax 2] Callee uses %ebp to access its args: offset(%ebp) with offset > 0 reaches into the caller's frame where the args were stored. Callee uses %ebp to access its local vars: offset(%ebp) with offset < 0. Note: offset = 0 is the base pointer itself; the offset specifies distance from the base pointer. 3] Caller uses %esp to set up args for a function call --> movl ARG, offset(%esp) where offset >= 0, and to save the return address --> call. Callee uses %esp to restore the return address --> ret, and to save/restore the frame base pointer --> pushl/popl %ebp.

Explain C's Heap Allocator Functions: 1] void *malloc (size_t size) 2] void *calloc (size_t nItems, size_t size) 3] void *realloc (void *ptr, size_t size) 4] void free (void *ptr) *remember if malloc/calloc/realloc returns NULL the cs 354 standard is to print (error msg); exit(1);

1] Allocates a block of memory of the specified size bytes (doesn't initialize the contents) & returns a pointer to the alloc'd memory; returns NULL if allocation fails. Ex: malloc(sizeof(int) * 4) 2] Allocates a block of memory of nItems x size bytes (does the math itself) & initializes the entire block to 0; returns NULL if alloc fails. Ex: calloc(4, sizeof(int)) 3] Takes an alloc'd block (pointed to by ptr) & realloc's it with size bytes; returns NULL if alloc fails. Ex: realloc(p, 4 * sizeof(int)). realloc's edge cases: if ptr == NULL it behaves like malloc(size); if size == 0 it frees ptr and returns NULL; otherwise it attempts reallocation 4] Frees the memory pointed to by ptr. If ptr is NULL, does nothing (can't communicate an error because of the void return type)

1] What is an exception? 2] What's the difference between an asynchronous and a synchronous exception? 3] What is the general exceptional control flow?

1] An event that side-steps the usual logical flow; it can originate from hardware or software, and is an indirect function call that abruptly changes the flow of execution 2] Asynchronous exception - event unrelated to the current instruction Synchronous exception - event related to the current instruction 3] a. an exceptional event occurs b. control transfers to the appropriate exception handler c. the exception handler runs d. control returns to: - the current instruction (eg page fault) - the next instruction (eg file input/output) - the OS - or abort! (eg seg fault)

1] What is a process? 2] Why do we have a process VAS (Virtual Address Space)? What are the key illusions for process VAS? Who is the illusionist? 3] What is concurrency? 4] What is a scheduler, interleaved execution, time slice, parallel execution? 5] What are processor modes? 6] What is a mode bit? What is kernel mode? What is user mode? 7] How do you flip mode (user to kernel, vice versa)? 8] How does sharing the kernel work in physical memory (for multiple processes - multiple operating systems)?

1] An instance of an executable program (running); has "context" info needed to restart the process 2] Because it's easier to treat a process as a single entity running by itself. Key illusions - the process exclusively uses: CPU, memory & devices (printer, terminal, etc). Illusionist: the Operating System (O.S). 3] Concurrency is the combined execution of 2 or more processes. 4] Inside the operating system is the Scheduler, which is code to switch between processes (illusion = full control). Reality: interleaved execution - 1 CPU shared among processes that take turns running; a time slice is the amount of time a process runs. Parallel execution uses multiple CPUs, where each process gets its own core, so processes truly run at the same time. 5] Processor modes are different "privilege" levels that a process can run at. 6] The mode bit indicates the current mode: 1 = Kernel Mode, 0 = User Mode. - Kernel Mode: can execute any instruction, can access any memory location, can access any device. *Note: the Kernel needs to do it right, but there are NO restrictions* - User Mode: can execute only some instructions, limited memory (only memory alloc'd to our process VAS), accesses devices via the O.S. 7] Start in user mode; only an exception can switch from user to kernel mode, and the Kernel's exception handler can switch back to user mode (whenever it wants to). 8] The operating system is shared by all user processes and is memory resident (once a page is loaded it doesn't move). Addresses in the VAS don't correspond to the same physical memory addresses, so the sequence in which things are placed in physical memory doesn't matter.

1] What are some array caveats? 2] Which dimension is the only one not required to be specified for a SAA when accessing a SAA element? Why is this?

1] Arrays have no bounds checking (buffer overflow can occur), and arrays can't be return types: int[] smg (int size) {...} is NOT valid (COMPILER ERROR); must do: int * smg (int size) {...} - Note: not all 2D arrays are alike (depending on whether they are on the stack or the heap; even for 2D arrays on the heap, mem diagrams can differ) 2] Only the first dimension is NOT required to be specified/initialized; the other dimensions have to be correct for accessing. This is because the compiler only needs the other dimensions to generate the address arithmetic (it doesn't need the first dimension).

1] How are types represented by the machine? Think lower-level view of data 2] At the low level can bits distinguish instructions from data or from pointers? 3] How does a machine know what it's getting from memory? 4] Given this table, fill in the blanks:
C            IA-32                Assembly Suffix   Size in Bytes
char         byte                 b                 _a_
short        word                 w                 _b_
int          double word          l                 _c_
long int     double word          l                 _d_
char *       double word          l                 _e_
float        single precision     s                 _f_
double       double precision     l                 _g_
long double  extended precision   t                 _h_
5] In IA-32 a word is __ bytes?

1] As an array of bytes, where each element is a byte 2] No, bits alone can't distinguish instructions from data or from pointers 3] By how it's accessed: an instruction fetch vs an operand load, or by the instruction itself (which has operations & operands) 4] a. 1, b. 2, c. 4, d. 4, e. 4, f. 4, g. 8, h. 10 (typically padded to 12 for better alignment) 5] 2

1] What is double word alignment? 2] What is external fragmentation? 3] What is false fragmentation? 4] What is internal fragmentation? 5] Why doesn't Java allow primitives on heap?

1] Block sizes must be a multiple of 8, and the payload address must be a multiple of 8 2] External fragmentation is caused by the order of mallocs & frees: there's enough heap mem (for the alloc request) but it's broken into too many small blocks. Note: external frag means free blocks separated by alloc'd blocks 3] False fragmentation is multiple free blocks that are next to each other (neighbors) - it looks like fragmentation, but the blocks can be coalesced into one 4] Internal fragmentation is wasted space inside alloc'd blocks due to overhead/padding 5] Because primitives are small, so lots of padding = wastes lots of space = bad mem util

1] What is a function? 2] What is caller function? 3] What is callee function? 4] What is argument? What is parameter? 5] What is pass-by-value (passing in)? What is return-by-value? 6] Is main(void) same as main()

1] A function is like a method, but not linked to an instance of a class 2] The caller starts a new function's execution (ie a new activation record in the stack) 3] The callee is the function that gets executed (doesn't initiate the exchange, is the recipient) 4] An argument is the data value that's actually passed; a parameter is the variable (location) where that value is stored 5] Pass-by-value means a copy of each arg is passed into the params, and return-by-value means a copy of the returned value is returned 6] Yes (for this course's purposes; strictly, () means unspecified params while (void) means no params)

1] Does C use a compiler or an interpreter? What is C's control flow? 2] What is true/false in selection statements in C? 3] What are the 3 types of loops? 4] Explain the difference between post-increment and pre-increment. Does it make an impact on a for loop?

1] C uses a compiler, and has sequential control flow: execution starts in main(), flows top to bottom (going from caller to callee in the stack, for example), and does 1 statement after another. 2] Any non-zero integer is true, and only 0 is false. 0 wrapped into a char - '0' - is true (its bit pattern is character code 48, non-zero) BUT '\0' is false because its bit sequence is 0. true & false are not reserved words in C. 3] For loop (use when you know how many times to do something), do-while (do at least once), while (don't know how many times to do something) 4] Post-increment (j++) means you do the stuff inside the loop first, then increment; pre-increment (++j) means you increment before doing the stuff. It does not make a difference in the update clause of a for loop.

1] What does CMP S2, S1 do? 2] What does TEST S2, S1 do? 3] What does testl %eax, %eax do? 4] List all condition codes (CC) and what they represent 5] What condition codes would CMP set? 6] What condition codes would TEST set?

1] CC <-- S1 - S2; does the subtraction but only sets condition codes 2] CC <-- S1 & S2; like AND but only sets condition codes 3] Sets condition codes based on the value in %eax 4] ZF - Zero Flag: if result is 0, ZF = 1 CF - Carry Flag: if unsigned overflow (result requires another bit), CF = 1 SF - Sign Flag: if sign bit is 1 (negative), SF = 1 OF - Overflow Flag: if 2's complement overflow occurs, OF = 1 - 2's complement overflow occurs when the sign bit of the result is wrong for the operation, ie pos + pos = neg or neg + neg = pos (it can't occur when adding a neg and a pos) 5] CMP sets all of them: ZF, CF, SF & OF (based on S1 - S2) 6] TEST sets ZF & SF (based on S1 & S2) and clears CF & OF

1] What are Command Line Arguments? What are program arguments? 2] What are argc and argv?

1] CLA is the whitespace-separated list of input entered after the terminal's command prompt; program arguments are the args that follow the command. $gcc myprog.c -Wall -m32 -std=gnu99 -o myprog → the entire thing is the CLA; everything after gcc is a program argument. Note: -o defines the output file name: myprog 2] argc is the argument count (# of CLAs), argv is the argument vector (array of char* → each element points to the char array for one command/argument)

1] What is cache miss? What is cold miss, capacity miss & conflict miss? 2] What is cache hit? What are the placement policies? What is a victim block? Working set? 3] How is memory laid out in Main memory?

1] A cache miss occurs when a block is not found in the cache. A cold miss occurs when the cache location is empty, a capacity miss occurs when the cache is too small for the working set, and a conflict miss occurs when 2 or more blocks map to the same cache location 2] A cache hit is when a block is found in the cache; placement policies - L1: unrestricted, L2: restricted (mod 16); replacement policies - L1: choose any block, L2: restricted (mod 16); a victim block is the cache block chosen to be replaced; the working set is the set of blocks used during some interval of time 3] As 4 KB of memory (1 page) divided into 32-byte blocks

1] What is an array? 2] Why do we use an array? 3] void someFunction() { int a[5]; //__AA? Where is the array allocated? 4] Can a SAA (Stack allocated array) be used as source operand? destination operand?

1] A compound unit of storage of elements of the same type, accessed via identifier and index, allocated as a contiguous FIXED-SIZE block of memory 2] Fast access to elements, and easier to declare than individual vars. 3] Stack, because it is localized inside a function (and the array name is NOT a variable). 4] It can be used as a source operand, where it provides the array's address ex: printf("%p\n", a); If used as a destination operand it results in a compiler error

1] What is control transfer? What is control flow? 2] What control structure results in a smooth flow of execution? 3] What control structures result in abrupt changes in the flow of execution? 4] What is logical control flow? What is exceptional control flow? What is event? What is processor state? 5] How does the process use exceptions? How does O.S (operating system) use exceptions? How does hardware use exceptions?

1] Control transfer is when you transition from one instruction to the next, control flow is a sequence of control transitions/transfers. 2] Sequential (one instruction after another, no jump/skips) 3] Selection, repetition, function calls, returns 4] Logical control flow is normal execution that follows program logic. - Exceptional control flow is special execution that enables a system to react to unusual/urgent/anomalous events. - Event is a change in processor state that may or may not be related to current instruction - Processor state is processors internal memory elements (registers, flags, signals) 5] Process uses exceptions to: ask for Kernel services, share info with other processes, send & receive signals - OS: communicate with processes & hardware, switch execution among processes, deal with memory pages - "page faults" - Hardware: indicate device status: ready, error, timer ended

1] Instructions - MOV, PUSH, POP. What do these instructions all do in common? 2] What does MOV S, D do? MOVB/MOVW/MOVL S, D? 3] What does MOVZ S, D & MOVS S, D do? 4] What does pushl S do? 5] What does popl D do? 6] for mov you have immediate values, registers, memory, for source and destination what combination is not allowed?

1] Copy data from S (source) to D (destination). They enable info/data to move around in memory & registers 2] Moves/copies data from source to destination. The suffix specifies the number of bytes copied (b - 1 byte, w - word so 2 bytes, l - double word so 4 bytes). 3] MOVZ zero-extends the source, then moves it to the destination. MOVS sign-extends the source, then moves it to the destination. 4] Makes room in the stack (subtracts 4 from %esp, because the l suffix means double word: 4 bytes/32 bits), then copies the value to the top of the stack: R[%esp] <- R[%esp] - 4 M[R[%esp]] <- S 5] Removes the value from the top of the stack, then updates %esp: D <- M[R[%esp]] R[%esp] <- R[%esp] + 4 6] memory-to-memory is not allowed in IA-32 (must go through a register)

1] What does LEAL S, D do? 2] How is LEAL different than MOV? To show this, convert int y = points[i].y; into ASM with mov & convert int *py = &points[i].y; into ASM with leal. 3] How are these code fragments different & similar: subl $3, %ebx //PART A movl %ebx, %eax vs. leal -3(%ebx), %eax //PART B 4] If the address in the base pointer register is 0x1000, after leal -3(%ebx), %eax, what is stored inside of %eax?

1] D <-- &S (S must be a memory operand; memory is not accessed, and condition codes are not set) 2] int y = points[i].y; => movl 4(%ebx, %ecx, 8), %eax ; %ebx is the base register, %ecx is the index register, 8 bytes (2 ints) is the scale factor, & 4 is the offset to the y member. *Calculates the effective address, retrieves the value at that address & stores it in %eax* int *py = &points[i].y; => leal 4(%ebx, %ecx, 8), %eax *Loads the address itself into %eax* 3] Both compute the effective address (value in the base pointer register - 3) & load it into %eax, but PART A's subl sets condition codes while leal does not 4] 0x1000 - 3 = 0xFFD (0x1000 - 1 = 0xFFF, then 0xFFF - 2 = 0xFFD)

1] What is the difference between reading data & writing data in a cache? 2] What are Write Hits? 3] When should a block be updated in lower memory levels? 4] How does the write-through method flush the memory buffer?

1] Reading data brings a block from main memory into the cache; writing data pushes changed data from the CPU/higher cache levels back toward main memory for persistent storage. Remember: data transfers a block at a time between adjacent levels (L1 to L2, L2 to L3, L3 to Main Memory). 2] Write Hits occur when writing (updating the data in a cache block) to a block that is already in the cache (at any cache level) 3] a. Write-through: write to the current AND the lower cache level. (Con) have to wait for the lower level to write (Con) more bus traffic - bus traffic is just the amount of data being transferred between cache levels (Pro) the data is written in both levels b. Write-back: write to the lower level only when the block in the higher level is changed & replaced (ie when a line is "evicted"). (Pro) faster, no wait (Con) must track whether the block changed: the dirty bit gets set when the block is modified, and if a dirty block is replaced it must be written to the next lower level. 4] Write-through flushes the buffer because data in a higher cache level is more likely to get evicted, so trickling it down to lower levels that are larger & closer to main memory makes it "safer" by adding redundancy

1] What are the 3 types of cache designs? 2] a.) Explain direct-mapped cache. For a direct-mapped cache: b.) What is the address breakdown (b-bits, t-bits, s-bits) if blocks are 32 bytes & there are 1024 sets? c.) Is the cache operation fast O(1), or slow O(S) where S is the number of sets? d.) What happens when 2 different memory blocks map to the same set? 3] a.) Explain fully associative cache. For a fully associative cache: b.) What is the address breakdown if blocks are 32 bytes & there is 1 set? c.) Is the cache operation fast O(1), or slow O(E) where E is the number of lines? d.) What happens when 2 different memory blocks map to the same set? e.) How many lines should a fully associative cache have?

1] Direct Mapped, Fully Associative, Set Associative Caches 2] a.) Direct Mapped Cache - a cache that has S sets & 1 line per set (each memory block maps to exactly 1 set). *Appropriate for larger caches (L3)* - (Pro): simple circuitry to check if the t-bits match the tag b.) 32 bits for the address breakdown: 32 = 2^5, so 5 b-bits; 1024 = 2^10, so 10 s-bits; 32 - 5 - 10 = 17 t-bits for the tag. c.) Fast: O(1) set selection + O(1) line matching, since each set has only 1 line (no searching through lines) d.) Conflict miss (2+ blocks map to the same set) - since each set stores only 1 block; thrashing - continuously exchanging blocks in the same set (Note: conflict misses & thrashing often occur together) 3] a.) Fully Associative Cache - a cache that has 1 set & E lines per set; a memory block can be stored on any line. *Appropriate for smaller caches (L1)* - (Con): requires more t-bits - (Con): complex circuitry to match the t-bits against the tag in every line b.) there are no s-bits since there's only 1 set; b-bits = 5, thus 32 - 5 = 27 t-bits for the tag. c.) O(1) for set selection + O(E) for line matching d.) 1 set BUT multiple lines, so choose a free line (this architecture reduces conflict misses) e.) C = S (sets) x E (lines) x B (bytes/block) = 1 x E x 32, so E = C/B lines in the one set

1] What is direct recursion? What is a recursive case? What is a base case? What is infinite recursion? What does infinite recursion cause (___ error)? 2] When tracing functions in assembly code?

1] Direct recursion is when a function calls itself; the recursive case calls the recursive function, the base case stops the recursion, & "infinite" recursion is like an infinite loop - recursion that never stops. It causes a stack overflow error because each recursive call requires stack memory & you eventually run out. 2] ???

1] What is encoding targets? What is absolute encoding? 2] What is the problem with absolute encoding? 3] What is the solution? 4] When doing a jump instruction, __GET BACK TO__

1] Encoding targets is the technique direct jumps use to specify targets (encoding the address of the instruction to jump to). Absolute encoding: the target is specified as a full 32-bit address (the address of the instruction you're jumping to). 2] The code is not compact, because each target requires 32 bits, and the code cannot be moved without changing the targets (there's a dependency on absolute addresses) 3] Relative encoding - the target is specified as a distance from the instruction immediately after the jmp instruction. In IA-32, the distance is specified in 1, 2 or 4 bytes in 2's complement. 4]

1] Divide by zero Exception ___ interrupts to kernel handler. Kernel signals user process with ___. 2] Illegal memory reference Exception __ interrupts to kernel handler. Kernel signals user process with ___. 3] Keyboard interrupt - irq #11 - ctrl-c interrupts to kernel handler which ___. This does what? - ctrl-z interrupts to kernel handler which ___. This does what?

1] Exception _0_ interrupts to the kernel handler. Kernel signals the user process with _SIGFPE 8_ (fpe = floating point error). 2] Exception _13_ interrupts to the kernel handler. Kernel signals the user process with _SIGSEGV 11_ (Seg Fault). 3] ctrl-c interrupts to the kernel handler, which signals _SIGINT 2_. This terminates the foreground process by default. - ctrl-z interrupts to the kernel handler, which signals _SIGTSTP 20_. This suspends (temporarily stops/pauses) the foreground process by default.

1] For opening and closing a file, what are the 2 methods that are used? 2] What is buffering? 3] What is Flushing?

1] FILE *fopen(const char *filename, const char *mode) → the mode is e.g. "r" read or "w" write. Opens filename in the specified mode, returns a file pointer for the opened file, or NULL if there's an access problem (at the O.S. level this means reading the first page of the file & setting up the file pointer) - error check: if (fopen( _ , _ ) == NULL) { printf(...); exit(); } int fclose(FILE *stream) → flushes the output buffer, then closes the stream; returns 0 on success or EOF on error - error check: if (fclose(fp) != 0) { printf(...); exit(); } 2] Buffering: when you write data to a file, it's not directly written to the file; it's first stored in an output buffer in memory (a temporary location) for efficiency. *Learn more in caching* 3] Flushing: forcibly sending the output buffer's data to its destination/writing it to the associated file (rather than waiting for the buffer to do it).

1] Most computers restrict the addresses where primitive data can be stored. Why? 2] Assume cpu reads 8 byte words, f is misaligned float. Why is this slower than if it were aligned? 3] What are the alignment restrictions for IA-32? Linux? Windows? 4] What is the implication of these restrictions (think space complexity - memory utilization) 5] The total size of a struct is a multiple of ____?

1] For better memory performance 2] Because it requires 2 reads, 1 for each word where part of the float is stored; the CPU extracts the bits from each read & combines them into a float. 3] IA-32: no restrictions. Linux: - short = address must be a multiple of 2 (least significant bit = 0) - int, float, pointer, double = address must be a multiple of 4 (least significant 2 bits = 0) Windows: - same as Linux except double = address must be a multiple of 8 (least significant 3 bits = 0) 4] To maintain these restrictions, padding is added by the compiler (so some memory is wasted) 5] Total size of a struct is a multiple of the _largest data member's size_

1] What is a global var? Static local var? Why use either? 2] In general ___ vars should not be used, instead use ___ that are passed to callee functions? 3] What is shadowing? 4] What are all the possible places Arrays, structs & vars can be stored in memory? 5] Can a ptr var store address to any part of memory? What happens if you dereference an address outside of process's memory segment?

1] A global var is declared outside of any function & accessible to all functions. A static local var is declared in a function with the static modifier & accessible only within that function (after its declaration). Both retain their storage for the entire program run. 2] Global vars shouldn't be used; instead use local vars that are passed to callee functions 3] Shadowing is when a local var blocks access to a global var with the same name. To avoid shadowing, don't reuse identifiers 4] Stack, heap & data segment. 5] A ptr CAN store any address, but if you dereference an address outside the process's memory segments you get a SEG FAULT.

1] What is string.h? 2] What do the strlen, strcmp, strcpy & strcat functions do? 3] What is buffer overflow? 4] Given: char *sptr = "CS 354"; & char str[9] = "CS 354"; are the following valid: a.) strcpy(str, "folderol"); b.) strcpy(str, "formication"); c.) strcpy(sptr, "vomitory"); 5] What functions are used in place of assignment to copy a "C" string from one array to another?

1] Header file with prototypes for many string functions 2] size_t strlen(const char *str) → returns the length of str up to *but not including the null character* int strcmp(const char *str1, const char *str2) → compares the string pointed to by str1 to the string pointed to by str2; returns a negative value if str1 comes before str2, 0 if they're the same, positive if str1 comes after str2 char *strcpy(char *dest, const char *src) → copies the src string into dest & null-terminates it. Caveat: the compiler doesn't tell you if dest is big enough & you risk overwriting other parts of memory. Solution: char *strncpy(char *dest, const char *src, size_t n) → safer version that copies at most n chars (note: dest is not null-terminated if src has n or more chars). char *strcat(char *dest, const char *src) → appends the src string onto the dest string & null-terminates it. Caveat: same risk - if dest isn't big enough you risk overwriting other memory - *main takeaway*: ensure the destination char array is large enough for the result, including the null terminating char. 3] Exceeding the bounds of an array 4] a. OK - str is a mutable array on the stack; you can't change its address or length, but you can change the chars in it, & "folderol" (8 chars + null) fits in 9 bytes b. No - "formication" needs more than 9 bytes, exceeding the bounds of the array = buffer overflow c. No - sptr points to a string literal, which is immutable, so writing gives a SEG FAULT (no permission to write to the read-only/code segment) 5] strcpy (or strncpy - the safer version)

1] What are the 3 metrics for cache performance? Explain each of them. 2] Using these metrics how do the following impact cache performance? Larger Blocks (S & E unchanged - Sets & # Lines/Set) 3] More Sets (B & E unchanged - # Bytes/Block & # Lines/Set) 4] More Lines E Per Set (B & S unchanged - # Bytes/block & # sets)

1] Hit rate - # of hits / # of memory accesses (want more hits, fewer misses) Hit time - time to determine a cache hit (shorter = better); can be measured by complexity rather than seconds Miss penalty - additional time to process a miss (this metric exists because a miss requires looking further down the hierarchy than a direct hit) 2] a.) Hit Rate - Pro: larger blocks = more spatial locality per block (bigger blocks copy more nearby data from main memory, so fewer misses) b.) Hit Time - Same: E is unchanged, so line matching stays the same c.) Miss Penalty - Con: larger blocks take longer to transfer on a miss *Therefore block sizes tend to be small (32/64 bytes)* 3] a.) Hit Rate - Pro: more sets = fewer conflict misses = fewer misses = more hits b.) Hit Time - Con: worse, because more sets = more s-bits to decode = slower set selection c.) Miss Penalty - Same: block size remains constant, so the transfer cost doesn't change *Therefore faster caches (L1) have fewer sets, & slower caches are larger with more sets* 4] a.) Hit Rate - Pro: more lines = more blocks in a set = fewer conflict misses = more hits b.) Hit Time - Con: worse, because more lines must be checked during line matching c.) Miss Penalty - Con: worse, because choosing a victim line to replace takes more work *Therefore faster caches have fewer lines per set*

1] Unary operations apply to memory & registers. List them all & explain what they do. 2] List all binary operations & explain what they do. 3] What do shift operations do? 4] List the 2 kinds of shift operations, and what they do. What do << & >> do in C?

1] INC D: D <-- D + 1, DEC D: D <-- D - 1, NEG D: D <-- -D (2's complement: flip bits, add 1), NOT D: D <-- ~D (~ means not = flip all bits) 2] ADD S, D: D <-- D + S SUB S, D: D <-- D - S IMUL S, D: D <-- D * S XOR S, D: D <-- D ^ S (exclusive or - 1 only when exactly one of the bits is 1) OR S, D: D <-- D | S (bit-wise or) AND S, D: D <-- D & S (bit-wise and) - Note: XOR, OR & AND are bit-wise operations (1 = true, 0 = false); memory values are shown in hex, so convert to binary, then compare each bit of source & destination individually 3] Move bits left or right by k positions, for fast multiplication & division by powers of 2 (0110 = 6; left shift 1 → 1100 = 12; right shift 1 → 0011 = 3) 4] Logical shift (ZERO-FILL): SHL k, D: D <-- D << k (shifts left k bits) SHR k, D: D <-- D >> k (shifts right k bits) Arithmetic shift (SIGN-FILL on right shifts): SAL k, D: D <-- D << k [Shift Arithmetic Left, same as SHL] SAR k, D: D <-- D >> k [Shift Arithmetic Right, fills with the sign bit]

1] List all the classes of exceptions. 2] What is an interrupt? Is it synchronous/asynchronous. What does it return to? What does it do. 3] What is a Trap? Is it synchronous/asynchronous. What does it return to? What does it do. 4] What is a Fault? Is it synchronous/asynchronous. What does it return to? What does it do. 5] What is an Abort? Is it synchronous/asynchronous. What does it return to? What does it do.

1] Interrupt, Trap, Fault, Abort 2] An interrupt enables a device to signal that it needs attention (a signal from an external device). Asynchronous - unrelated to the current instruction. Returns to the next instruction. Step 1] Device signals interrupt Step 2] Finish current instruction Step 3] Transfer control to the appropriate exception handler Step 4] Transfer control back to the interrupted process's next instruction 3] A trap enables a process to interact with the OS; it's an intentional exception (caused by the process) - i.e. a function call into the kernel. Synchronous - related to the current instruction. Returns to the next instruction. Step 1] Process indicates it needs an OS service - int (interrupt instruction) ~ requests the system call from the OS Step 2] Transfer control to the OS system call handler Step 3] Transfer control back to the process's next instruction 4] A fault handles a "problem" with the current instruction (potentially recoverable error - page faults & technically seg faults - default is crash). Synchronous - related to the current instruction. Might return to the current (faulting) instruction & re-execute it (otherwise aborts). 5] An abort cleanly ends a process (non-recoverable fatal errors). Synchronous - related to the current instruction. Doesn't return - aborts!

1] Why don't consecutive heap allocations result in contiguous payloads? 2] Why shouldn't you assume heap memory is initialized to 0? 3] Why are all memory leaks bad? 4] Do memory leaks persist after the program ends? 5] What errors result from reading/writing data in freed heap blocks? (Applies if your pointer points to memory that has been freed.) 6] How do you check if your program has run out of memory? 7] Why shouldn't we change heap memory outside of our payload?

1] Payloads are separated by headers, padding, etc. 2] While the O.S. does clear fresh heap pages for security, memory recycled by the allocator can still contain old data (unless you use calloc) 3] They kill performance = clutter the heap with garbage blocks 4] No, heap pages are reclaimed by the O.S. 5] Intermittent errors (behavior depends on whether the freed memory has been reused) 6] Check the return value of malloc, calloc & realloc 7] It trashes the internal structure of the heap (by modifying/overwriting other payloads or the allocator's overhead)

1] What does a function call (instruction call) do in assembly? How do you do a function call in x86 assembly? 2] In assembly what does the call instruction do under the hood (for both forms of call, caller & callee)? *Hint: Broken into 2 instructions* 3] What does function return (instruction ret) do in assembly? 4] What does ret instruction do under the hood? *Hint: There is an equivalent instruction, explain what that instruction does* 5] Does the callee's stack frame store the callers ebp (base pointer)? Explain.

1] It transfers control to another part of the program (jmp to a different part of the code). 2 ways: a. call *Operand - indirect call (the target address is stored somewhere else, e.g. a register) b. call Label - direct call 2] First, push the return address onto the stack - pushl %eip - because once the called function finishes, we need to know where to return to. Second, jump to the start of the callee function - jmp *Operand/Label 3] ret transfers control back to the caller; execution resumes at the return address (the return value is left in %eax for the caller to use) 4] Jump to the return address that is popped off the stack → popl %eip 5] Yes - the callee's prologue (pushl %ebp) saves the caller's ebp in the callee's stack frame so it can be restored (popl %ebp) before returning to the caller.

1] What is job control? 2] What is Linux in the context of process & job control? 3] What do the following commands do: ps, jobs, & (goes in front of a linux command), ctrl+z, bg, fg, ctrl+c, top 4] What are the background and foreground? 5] What does it mean when a process gets "suspended"? 6] What is the command to get program size? 7] How does Linux enable Virtual Address Space Maps? 8] What does /proc do?

1] Job control is the process of managing many processes 2] Linux is a multi-tasking operating system, where you run multiple processes concurrently (e.g. using multiple CPU cores) 3] ps - lists a snapshot of user processes, jobs - lists processes started from the command line, & in front of a command puts the process in the background, ctrl+z - suspends the running foreground process, bg - resumes a suspended process in the background, fg - resumes a suspended process in the foreground, ctrl+c - stops the running foreground process, top - table of resource usage 4] The background is where processes go if they can run asynchronously/independently; the foreground is for processes that require user input. The user can enter commands while a background process is running, but has to wait for a foreground process to finish (because the foreground depends on user input/commands). 5] A "suspended" process is halted/temporarily stopped. 6] size <executable_object_file> 7] Linux lets you see the V.A.S. (memory map) of each process (use ps to get the process ID). You can kill a process using its process ID - this relates to the VAS because each process gets its own memory, & killing a process frees only that memory back to the kernel. 8] /proc is a virtual filesystem that reveals kernel data in ASCII (shows all processes)

1] What does the jump instruction do? Why does it exist? What is a target? 2] What is an indirect jump? What is the difference between the jmp *%eax and jmp *(%eax) instructions? 3] What is a direct jump? 4] What is the syntax for an assembly language label? 5] What are conditional jumps? 6] How do conditional jumps help implement if statements & repetition (control flow)?

1] Jump instructions enable selection (if/switch) & repetition (loops) by jumping to another instruction. The target is the location you're jumping to. 2] Indirect jump: the target is taken from a register or memory - jmp *Operand loads %eip (the instruction pointer) with the operand's value. jmp *%eax - the register's value is the target (jump to the address held in the register) jmp *(%eax) - the register's value is a memory address where the target is stored (dereference, then jump to the address found there) - Note: called indirect because the target is looked up rather than encoded in the instruction; the operand can use any operand mode: register, memory location, etc. 3] Direct jump: the target address is in the instruction (the address to jump to is encoded in the instruction) ex: jmp Label; %eip gets Label's address: jmp .L1 4] .__Label Name__: (code on next line) 5] Jumps taken only if the CC (condition codes) set by a previous instruction meet a condition; can ONLY be direct jumps. *Note: the endings have the same meaning as set* both: je/jne/js/jns Label - equal, not equal, signed (neg), not signed (non-neg) unsigned: jb/jbe/ja/jae Label - below, below or equal, above, above or equal signed: jl/jle/jg/jge Label - less, less or equal, greater, greater or equal 6] An unconditional jump always jumps. A conditional jump jumps based on the CC (condition code registers), which lets us use selection to decide whether to jump. So if a condition becomes false we jump out of a code segment (the logic behind loops).

1] What do exceptions transfer control to? 2] How do you transfer control to an Exception Handler? Give all the steps. 3] What is an exception table? 4] What is an exception number?

1] The kernel 2] Step 1] Push the return address of the next/current instruction onto the stack (stores where to return to), Step 2] push the interrupted process's state (so it can be restarted) - Note: the kernel's stack is used for these pushes; once the kernel takes control it does: Step 3] an indirect function call that runs the appropriate exception handler: EHA = M[R[ETBR] + ENUM] - ETBR is the exception table base register, ENUM is the exception number, EHA is the exception handler's address 3] The exception table is a jump table for exceptions, allocated in memory by the operating system at boot time [each process doesn't have its own exception table; the O.S. as a whole has one]. - The Exception Table Base Register points to the exception table (entries 0 to N-1 for N exception types, all allocated contiguously); each entry points to a kernel exception handler (code that handles some exception) 4] An exception number is a unique non-negative integer associated with each exception type - the "index" (entry number) in the exception table.

1] How does the kernel use signals? What is a signal? 2] What does $kill -l do? What does signal(7) do? 3] Why do we have signals?

1] The kernel uses signals to notify user processes of exceptional events. A signal is a small message sent to a process via the kernel. 2] $kill -l lists signal names & numbers. signal(7) refers to the manual page (man 7 signal), which documents the 30+ standard signal types on Linux, each with a unique non-negative ID. 3] So the kernel can notify processes (of low-level hardware exceptions, high-level software events in the kernel, or events from other user processes), to enable user processes to communicate with each other, & to implement a high-level, software form of Exceptional Control Flow (ECF)/exception handling.

1] Another way of writing address arithmetic: &A[i] = A + i = xa + L * i, where xa is the starting address, L is the element size & i is the index. What is L for int I[11]? Write &I[i] in address arithmetic. 2] Is the start of an array at the lowest or highest address? Is the stack TOP at the lowest or highest address (remember the stack grows downward)? Where do A[0] & A[n-1] go in the stack: stack TOP or BOTTOM? 3] Assume the array's start address is in %edx & the index is in %ecx. Given: movl (%edx, %ecx, 4), %eax - write it in address arithmetic.

1] L = 4. &I[i] = xa + (4 * i) 2] The start of an array is the lowest address (similar to little endian). The stack TOP is at the lowest address (drawn at the bottom of the diagram). The stack BOTTOM is at the highest address (drawn at the top). A[0] goes toward the stack TOP & A[n-1] toward the stack BOTTOM. 3] Let xa = %edx, L = 4, i = %ecx: %eax <-- M[xa + 4*i], which in C is *(A+i) = A[i]

1] What is a union? 2] Why do we have unions? What does it allow?

1] Like a struct, but the fields share the same memory (bypassing C's type checking). Allocates only enough memory for the largest field. 2] Allows the same data to be accessed as different types; used to access hardware & to implement a low-level form of "polymorphism".

1] What parts of the machine does C hide from us? 2] What is assembly? Hows it different than C? 3] What does assembly remove from C source? 4] What Instruction Set Architecture are we studying? 5] How many bytes long are IA-32 instructions? 6] Why learn assembly?

1] Machine instructions, addressing modes, registers, condition codes 2] A human-readable representation of machine code; it is machine dependent. C is a higher-level language & is not machine dependent. 3] High-level language constructs - logical control (if, switch, for, while), local vars + data types, composite structures (arrays, structs, etc.) 4] IA-32 running on an x86-64 machine 5] 1 to 15 bytes long 6] To understand the stack, identify inefficiencies, & understand compiler optimization

1] What happens in assembly when you allocate a stack frame? 2] In assembly we say the stack is dynamically growing and shrinking. But we know that heap dynamically grows & shrinks based off programmer alloc'd & free'ing memory. How is the stack dynamically growing/shrinking? 3] What happens in assembly when you free a stack frame? What is the instruction that does this, explain it, and explain the steps the instruction does.

1] No special instruction: subl $N, %esp, where N is the number of bytes the frame needs (this moves the stack pointer down - remember the stack grows downward). 2] It grows/shrinks based on the code - specifically the call sequence of functions (each call allocates a frame, each return frees one). 3] leave - frees the callee's stack frame. Step 1] Remove all of the callee's stack frame except the saved caller's ebp, because we still need it to restore the caller's base pointer → movl %ebp, %esp (esp now points at the saved ebp, 1 word from the caller's frame) Step 2] Restore the caller's frame → popl %ebp removes the saved ebp from the callee's frame & restores ebp to the base/bottom of the caller's frame; esp moves 1 word, to the top of the caller's stack frame.

1] What are the 2 goals for Allocator design? 2] What are all the requirements of a heap allocator?

1] Note: trade-off - typically increasing one decreases the other a. Maximize throughput: the number of mallocs + frees that can be handled per unit time (higher is better; keep in mind free() = O(1) & malloc() = O(N), so maximize efficiency) - consider fewer, larger blocks: traversing fewer blocks improves efficiency during malloc() b. Maximize memory utilization: % of memory used for payload (memory requested / heap allocated) - BUT larger blocks = more padding when allocating = worse memory utilization 2] Allocators use the heap, provide immediate response (minimize delay = maximize efficiency = throughput), must handle arbitrary sequences (any order of allocs & frees), CANNOT move or change previously allocated blocks, & must follow memory alignment requirements

1] What are the problems with multiple functions using registers? How are they solved? 2] Explain the solution (i.e. Caller Save, Callee Save)?

1] There are only a few shared registers, so conflicts can result; values must be saved out of registers (into main memory, e.g. the stack). The caller & callee must have a consistent approach to using registers or data gets overwritten. 2] *If you use these registers, save them first* Caller-save: %eax, %ecx, %edx - registers the caller saves before calling a function, because the callee is free to use %eax/accumulator, %ecx/counter, %edx/data Callee-save: %ebx, %esi, %edi - registers the callee saves (& restores) if it uses them

1] What are the 3 replacement policies? Explain each one of them 2] Using/exploiting replacement policies helps improve ___? If it does, then why don't programmers use it all the time?

1] Random replacement, LRU - least recently used, LFU - least frequently used. Random replacement - randomly chooses a line's block to evict (rules: never leave a line empty & never keep duplicates - otherwise why replace at all). LRU - load x blocks into x lines; track when each line was last used with an LRU queue: when a line is used, move it to the front (a status bit tracks use); evict the line least recently used. LFU - track how often each line is used: each line has a counter that is reset to 0 when the line gets a new block & incremented when the line is accessed; evict the line with the lowest count, & if there's a tie, choose randomly. 2] Cache performance - but programmers don't exploit it all the time because it's hard to do in practice.

1] Describe a linear memory diagram: how many bytes/bits is each rectangle, & what are the most/least significant bytes? 2] What is byte addressability? 3] What is endianness? Little endian? Big endian? 4] A pointer variable is a scalar var whose value is a ___? 5] Why use a pointer instead of an address?

1] Each rectangle on a linear memory diagram is 1 byte (8 bits). The most significant byte comes first, e.g. in 0x1100FF28, 11 is the most significant byte & 28 is the least significant byte (comes last) 2] Byte addressability means each address identifies 1 byte 3] Endianness is the byte ordering for values larger than 1 byte. Little endian means the least significant byte goes in the lowest address; big endian means the most significant byte goes in the lowest address 4] A pointer variable is a scalar var whose value is a memory address. Similar to a Java reference, except in Java you never see where references are stored. 5] Gives indirect access to memory (the address is stored somewhere, & the function goes there to access the data), indirect access to functions, common in C libraries, & used for access to memory-mapped hardware

1] What does reading data do in cache? What does writing data require? 2] What is reading data into cache? What is writing data into cache?

1] Reading data copies a block of memory into the cache; writing data requires that all copies stay consistent (maintain up-to-date copies of the data at every level). 2] Reading data into cache: the data you're trying to access is not in the cache, resulting in a miss, so you fetch that block from main memory & add it to the cache. Writing data into cache: you're updating data already in the cache; to keep copies consistent, the changes made in cache must also reach main memory - which is why write-through & write-back exist to trickle data down to main memory.

1] What are registers? 2] Name all the general registers and what they do 3] All extended registers have non-extended (16-bit) versions. But which registers don't have "high bit" & "low bit" counterpart registers (in other words, have only their extended & non-extended versions)? 4] What is the register for the program counter? 5] What are condition code registers? List them all.

1] Registers are the fastest memory (directly accessed by ALU), can store 1,2 or 4 bytes of data or 32-bit addresses (4 bytes) 2] remember: source index, destination index, stack pointer, base pointer don't have low/high 8-bit registers "e" means extended registers (32-bits): %eax - accumulator, %ecx - count, %edx - data, %ebx - base, %esi - source index, %edi - destination index, %esp - stack pointer, %ebp - base pointer 16-bit registers: %ax, %cx, %dx, %bx, %si, %di, %sp, %bp 8-bit (high bit - 8 to 15) registers: %ah, %ch, %dh, %bh 8-bit (low bit - 0 to 7) registers: %al, %cl, %dl, %bl 3] a. Registers with only extended & non-extended bit versions include - source index, destination index, stack pointer & base pointer (all related to addresses, which before could only be stored as 16 bits, now require 32 bits). b. Registers with "high bit" & "low bit" include - accumulator, count, data & base 4] %eip - instruction pointer to next instruction (stores address of next instruction) 5] 1-bit registers that store status of most recent operation ZF (Zero Flag), SF (Sign Flag), OF (Overflow Flag), CF (Carry Flag) --> used by instructions for conditional execution

1] What are operand specifiers? Why do you need them? 2] List all the operand specifiers for IMMED, REGISTER, MEMORY, etc

1] S - source, specifies the location of the source (read) - the value to be used. D - destination, specifies the location of the destination (write) - where to store the result. They enable instructions to specify constants (immediate values), registers & memory locations - in other words, they tell each instruction where its data lives (whether in a constant, a register, or memory at some address) 2] Note: operand value refers to the value the operand supplies a.) IMMED specifies a constant operand value; $ is the specifier: $Imm - gives Imm as the operand value b.) REGISTER specifies an operand value in a register; % is the specifier; operand value: R[%Ea] c.) MEMORY specifies an operand value in memory at an effective address; each mode below is listed as: specifier, operand value, effective address, addressing mode name:
- Imm, M[EffAddr], Imm, absolute
- (%Ea), M[EffAddr], R[%Ea], indirect
- Imm(%Eb), M[EffAddr], Imm + R[%Eb], base + offset (displacement)
- (%Eb, %Ei), M[EffAddr], R[%Eb] + R[%Ei], indexed (base + index)
- Imm(%Eb, %Ei), M[EffAddr], Imm + R[%Eb] + R[%Ei], indexed + offset
- Imm(%Eb, %Ei, s), M[EffAddr], Imm + R[%Eb] + R[%Ei]*s, scaled index (offset + base + index*S.F. - Scale Factor)
- (%Eb, %Ei, s), M[EffAddr], R[%Eb] + R[%Ei]*s, no offset (base + index*S.F.)
- Imm(,%Ei, s), M[EffAddr], Imm + R[%Ei]*s, no base (offset + index*S.F.)
- (,%Ei, s), M[EffAddr], R[%Ei]*s, no base, no offset (index*S.F.)
s - scale factor - can only be 1, 2, 4 or 8. Eb - base register (starting address), Ei - index register, Imm - offset value

1] What is scalar variable? 2] What does a basic memory diagram consist of? 3] What is identifier, value, type, address, & size? 4] What is a scalar variable used as source operand? Give example 5] What is a scalar variable used as a destination operand? Give example

1] A scalar variable is a primitive unit of storage whose contents can change. *A variable that can change* 2] Var type, label/identifier, & the value itself (inside a box) 3] identifier = the name; value = the data stored; type = how the bit pattern is interpreted (distinguishes an int from a double, prevents the compiler from misreading the bits); address = the memory address (the starting location in memory); size = the # of bytes. 4] Means you read the value (read the var), ex: printf("%i\n", i); 5] Means you write a value to its storage, ex: i = 11;

1] What is the heap? 2] What is a block? payload? overhead? allocator? 3] What are the two allocator approaches, and explain them.

1] A segment of the process/part of the VAS (Virtual Address Space) for dynamically allocated memory (memory requested while the program is running). - It is also a collection of various-sized blocks of memory managed by the ALLOCATOR 2] block - a contiguous chunk of memory that contains a payload & overhead; payload - the part requested by & available to the user process; overhead - the part of the block used by the allocator to manage the block (unusable by the process); allocator - CODE that allocates & frees blocks (manages dynamic memory allocation) 3] Implicit & explicit. Implicit (Java & Python): the compiler/runtime implicitly determines the number of bytes needed (via the "new" operator); also has a garbage collector (automatically frees unused memory). Explicit (C): you explicitly state the # of bytes needed (malloc, calloc, realloc) & explicitly call free to release memory

1] What are the 3 phases of signaling? 2] What is sending a signal? 3] What is delivering a signal? What is a pending signal? What is bit vectors? 4] What is receiving a signal? 5] What is blocking?

1] Sending, delivering, receiving 2] When the kernel's exception handler runs in response to an exception event. The signal is directed to a destination (process). 3] When the kernel records a sent signal for its destination process. -> Pending signal - delivered but not received. -> Bit vector - a kernel data structure where each bit has a distinct meaning. Bit K in the pending vector is set to 1 when signal K is delivered; each process has a bit vector recording its pending signals. 4] When the kernel causes the destination process to react to a pending signal. Happens when the kernel transfers control back to the process; multiple pending signals are received in order from low to high signal number. 5] Blocking prevents a signal from being received. Enables a process to control which signals it pays attention to. Each process has a second bit vector for blocking signals.

1] What does the SET instruction family do? 2] What do the following instructions do? Note: sete & setz refer to the same thing but with different mnemonics a.) sete/setz D - e for equal, z for zero b.) setne/setnz D - ne for not equal, nz for not zero c.) sets D - s for signed d.) setns D - ns for not signed

1] Set a byte register to 1 if a condition is true, 0 if false (condition determined by CC) 2] a.) D <-- ZF set if == equal (ZF = 1) - sets destination to 1 if ZF = 1 (is "set") b.) D <-- ~ZF set if != not equal (ZF = 0) - set destination to 1 if ZF = 0 (not "set") c.) D <-- SF set if < 0 signed (negative) SF = 1 - set destination to 1 if "negative" so sign flag is "set" (SF = 1) d.) D <-- ~SF set if >= not signed (nonnegative) SF = 0 - set destination to 1 if "positive" so sign flag is "not set" (SF = 0)

1] What are the two key memory segments used by a program? Explain each. 2] Why use heap memory? 3] What are the 3 MAIN functions/operators utilized for heap allocation? 4] Write code to dynamically allocate an integer array named a with 5 int elements. 5] Consider this code: *a = 0; //dereferences a[1] = 11; //indexing *(a + 2) = 22; //address arithmetic int *p = a; *(p + 3) = 33; Write code that frees array a's heap memory.

1] Stack - static (fixed in size) allocations, allocation size known at compile time. Heap - dynamic allocations at run-time (find out how much memory & where to store while running) 2] More memory than at compile-time; blocks of memory can be alloc'd & free'd while the program is running. 3] #include <stdlib.h> - standard library. function: void *malloc (size_in_bytes) → reserves a block of heap memory of the specified size, returns a generic pointer that can be assigned to any pointer type (NULL on failure). function: void free (void *ptr) → frees the heap block the pointer points to. operator: sizeof(operand) → returns size in bytes of operand 4] int *a = malloc(sizeof(int) * 5); 5] free(a); // a & p are now dangling pointers so we: a = NULL; p = NULL;

1] What is Standard I/O? What is Standard Input, what is Standard Output? 2] What do the getchar, gets, fgets, scanf functions do? 3] What is a format string, list all the format identifiers, & white space (input separators)? 4] What do the putchar, puts, printf functions do? 5] Standard Error: explain what perror does.

1] Standard I/O is Standard Input & Output (keyboard input, and terminal output). Standard input is keyboard input, standard output is terminal output. 2] getchar reads 1 char; gets reads 1 string ending in the newline char (\n) - the buffer might overflow; fgets is the safer version of gets (takes a buffer size). int scanf(const char *format_string, &v1, &v2, ...) → reads formatted input from the console keyboard, returns number of inputs stored, or EOF (End Of File ~ pre-defined value, occurs when you're expecting more input) if end-of-file occurs before any inputs. 3] A format string includes format identifiers and characters to skip/print. Format identifiers: %d %i - for int (the difference is %i also accepts values in hex & octal), %c - for char, %s - for string, %p - pointer (printed in hex because memory addresses are written in hex). Whitespace input separators: " " - 1 space, "\t" - tab, "\n" - new line. Note: can't read past a new line for standard input from the keyboard, but can for a file. 4] putchar writes 1 char, puts writes 1 string (both are alternatives to printf - which prints formatted output). int printf(const char *format_string, v1, v2, ...) → writes formatted output to the console terminal window. Returns number of chars written, or a negative value on error 5] Standard error is a separate output stream to the terminal; void perror (const char *str) → writes formatted error output to the console terminal window (lets the programmer choose which output stream)

1] For caching: An address identifies which byte in the VAS (Virtual Address Space) to access. It does this by being divided into parts that access memory in steps. What are these steps? 2] When is a cache level searched? What is the problem with this? How do you improve this? 3] What is a set? 4] The block number bits of an address are divided into 2 parts (what are these 2 parts)? 5] You are given the # of sets in a cache. How do you calculate s-bits? What are s-bits? 6] What is the problem with using the most significant bits for s-bits? 7] How many blocks map to each set for a 32-bit AS and a cache with 1024 sets? 8192 sets?

1] Step 1 - Identify which block in VAS, Step 2 - Identify which word in block (3 bits of the address breakdown given to the word offset), Step 3 - Identify which byte in word (2 bits of the address breakdown given for this/the rest of the b-bits) 2] Every location must be searched only under an unrestricted placement policy (a restricted policy, e.g. block number mod # sets, limits where a block can be). - Problem: slow, O(N), where N is # of locations in the cache (L1). Improvement: limit/restrict where each block can be stored 3] The set is where a block is uniquely mapped to in a cache 4] The 2 parts are set - maps blocks to a specific set in the cache & tag - uniquely identifies the block within the set 5] where S = # sets in cache, s is s-bits: S = 2^s. S-bits are the bits that identify which set the block maps to. 6] Contiguous blocks would share the same high-order bits and all map to the same set, so a program with good spatial locality would suffer conflict misses; using the middle bits spreads neighboring blocks across different sets. 7] 2^32 / 2^5 = 2^27 (# blocks) / 2^10 (1024 sets) = 2^17 blocks/set; 2^27 (# blocks) / 2^13 (8192 sets) = 2^14 blocks/set

1] How is a struct laid out on stack (ie where does the first "data member" of struct go, where does "last member" of struct go). struct ICell { int x; //first data member .... } 2] How does the compiler associate data members of a struct? how does it access data members of a struct?

1] A struct lays out its data members contiguously: the first data member is at the lowest address, toward the stack TOP (%esp), and the last data member is at the highest address, toward the stack BOTTOM (%ebp). 2] It associates each data member with its offset from the start of the struct, and uses that offset with address arithmetic to access a specific data member. - Assembly code that accesses a struct does not have data member names or types

1] Are structs passed-by-value to a function? Are arrays passed-by-value to a function? What's the difference? 2] Why use pointers to structs/structures? 3] Declare a pointer to a Pokemon and dynamically allocate its structure. 4] Assign a weight to the Pokemon 5] Assign a name and type to the Pokemon 6] Assign a caught date to the Pokemon 7] Deallocate the Pokemon's memory 8] Consider functions: void printPm (Pokemon pm) {...} & void printPm (Pokemon *pm) {...}. Which one is better & why? 9] Consider: int main (void) { Pokemon pm = {"Abra", "Psychic", 30, {1, 21, 2017} }; printPm(&pm); // OPTION A printPm(*pm); // OPTION B } Consider the optimal implementation of printPm in Q8. Which option is correct?

1] Structs are passed-by-value to a function, which copies the entire struct, SLOW! Arrays are also passed-by-value, but only the starting address is passed, FAST! 2] Avoids the copying overhead of pass-by-value (doesn't copy every member individually), allows the function to change the struct's data members (because you're accessing the memory location), allows heap allocation of struct vars, enables linked structs 3] Pokemon *pmptr; pmptr = malloc(sizeof(Pokemon)); if (pmptr == NULL) { printf(...); exit(1); } //error check 4] (*pmptr).weight = 43; //parentheses + dot operator, OR pmptr -> weight = 43; //uses the points-to operator, which dereferences then selects the data member 5] strcpy(pmptr -> name, "Abra"); strcpy(pmptr -> type, "Psychic"); 6] pmptr -> caught.month = 9; pmptr -> caught.day = 22; pmptr -> caught.year = 2023; 7] free(pmptr); //Pokemon is a struct whose data members hold no pointers to heap memory, so nothing inside the struct needs freeing. A struct on the stack is automatically dealloc'd when it goes out of scope (when the function it lives in returns) 8] void printPm (Pokemon *pm) {...} because using a pointer avoids the overhead of pass-by-value (copying the entire struct, all its data members one-by-one) 9] Option A. pm is not a pointer, so it can't be dereferenced (and the function's pm param type is a pointer, i.e., a memory address/location), so we pass its address using the address-of operator. If pm were already a pointer, we would just pass pm (the pointer itself contains the address)

1] Can you nest a struct with another struct or array? 2] Can you have array of structs? 3] Consider: typedef struct { ... } Date; typedef struct { char name[12]; char type[12]; float weight; Date caught; } Pokemon; a. Statically allocate array named Pokedex with 2 pokemon. b. Write code to change the weight to 22.2 for Pokemon at index 1 c. Write code to change the month to 11 for Pokemon at index 0 4] Complete function below so it prints/displays Date structure: void printDate (Date date) { //mm/dd/yy //insert code }

1] Structs can contain other structs & arrays, nested as deeply as you want 2] Arrays can have structs as elements 3] a.) Pokemon pokedex[2] = { {"Abra", "Psychic", 43.0, {1, 21, 2023}}, {"Oddish", "Grass", 33.2, {9, 22, 2023}} }; b.) pokedex[1].weight = 22.2; c.) pokedex[0].caught.month = 11; 4] printf("%2i/%02i/%i \n", date.month, date.day, date.year); format identifiers: %2i - width 2, pad with leading spaces / %02i - width 2, pad with leading 0's / %i - no width specifier

1] What is temporal locality? 2] What is spatial locality? 3] What is stride? 4] Consider: int sumArray(int a[], int size, int step) { int sum = 0; for (int i = 0; i < size; i += step) sum += a[i]; return sum; } a.) List the variables that clearly demonstrate temporal locality. b.) List the variables that clearly demonstrate spatial locality. 5] What does the CPU use locality for? How does it do it? 6] Programs with good locality run faster since they work better with cache system. Why?

1] Temporal locality is when a recently accessed memory location is repeatedly accessed in the "near future" (refers to a scope - a loop, or the entire program, defined by the programmer) 2] Spatial locality is when a recently accessed memory location has nearby memory locations accessed in the near future 3] Stride is the step size in words (4 bytes) between sequential accesses within a block (32 bytes) 4] a.) i, sum, step, size (because we repeatedly access them in the for loop) b.) a, a[i] because we repeatedly access nearby locations (neighboring elements). Note: "good spatial locality" depends on step size 5] The caching system uses locality to predict what the CPU will need in the near future. - For temporal, it anticipates data will be reused so it copies it into cache memory; for spatial, it anticipates nearby data will be used so it copies a BLOCK into cache memory. 6] Because programs with good locality maximize use of data at the TOP of the memory hierarchy (closer to the CPU)

Designing a Cache: Blocks 1] What are the bits of an address used for? 2] How do you figure out how many bytes are in an address space? 3] How big should a cache block be? Vaguely explain, considering optimal efficiency. 4] If given b-bits, how do you find how many bytes are in a block; or if given the number of bytes per block, how do you find b-bits? 5] What are b-bits? What is word offset? What is byte offset? 6] What happens if you use the most significant bits (left side) for the b-bits in an address breakdown? 7] How many 32-byte blocks of memory are in a 32-bit address space? 8] Other than b-bits, the remaining bits of an address are for?

1] The bits of an address are used to determine if the block containing that address is in the cache 2] If given the number of bits in an address (m), e.g. IA-32 is 32, then M (# bytes) is 2^m; OR if given the number of bytes in an address space, m = log2(M), i.e., solve for m in M = 2^m. 3] Cache blocks must be big enough to capture spatial locality but small enough to minimize latency (delay) 4] Given bytes per block (B) and b-bits (b): B = 2^b; vice versa, given B, solve for b in B = 2^b - Ex: 32-byte block = 5 b-bits 5] B-bits are the address bits that determine which byte in the block; the word offset identifies which word in the block, and the byte offset identifies which byte in the word 6] Think of the cache design: a cache level contains sets (which contain lines), and lines contain blocks. Using the most significant bits to select the byte within a block would place neighboring bytes far apart in memory (ie *lose/bad spatial locality*) 7] 2^32 (address space) / 2^5 (block size) = 2^27 = 2^20 (1M) x 2^7 (128) = 128M blocks, or 134,217,728 blocks 8] The remaining bits encode the block number

1] What bit is used to differentiate different blocks that map to the same set? 2] When a block is copied into a cache its t-bits are stored as its ___? 3] What is a line? 4] How do you know if a line in the cache is used or not? 5] How big is basic cache given S sets with blocks having B bytes? 6] How does a cache process a request for a word at a particular address?

1] The t-bits, which are the remaining bits in the address breakdown (those that are not b-bits & s-bits). The t-bits of an address identify which block is in the set. 2] Tag, which is used by the cache as an identifier: when the stored tag is compared to the address's t-bits and matches, it's a hit; if it doesn't match, it's a miss, so the block must be fetched from lower memory (main memory) 3] A line is a location in the cache that stores 1 block of memory, composed of storage for the block's data + info needed for cache operations (ie block of memory + tag/t-bits + v-bit) 4] Use a status bit, the v-bit: if it's 1, the line holds a valid copy of a block 5] C = S (# sets) x B (bytes/block); note: tag & v-bit are not included in the cache size 6] Does set selection (identifies the set - extract s-bits & use as index), then line matching (extract t-bits, compare t-bits with the stored tag in each line of the set) & IF: a. no match or valid bit (v-bit) = 0, CACHE MISS b. match & valid bit (v-bit) = 1, CACHE HIT *For the L1 cache, on a hit, also extract the word from the block using the word offset*

1] An allocator using EFL keeps a list of free blocks, this can be integrated into the heap using a special layout. Describe this layout (ie explain the 2 free-block links) 2] Is footer still useful in this case, since you have direct links to next & previous blocks? 3] Does the order of free blocks in free list need to be same order as found in address space?

1] There is a predecessor link - address of the previous free block, and a successor link - address of the next free block. - The layout is: 4 bytes for header (block size + p/a status bits) + 4 bytes for predecessor free-block link + 4 bytes for successor free-block link + 4 bytes for footer (block size only) = 16-byte minimum block. Common implementation: treat the free list as a doubly-linked list of nodes 2] Yes, the footer still allows faster coalescing with the previous free block 3] No, the order doesn't matter because of the links to predecessor (previous) & successor (next) free blocks.

1] Make a 2D array pointer named m 2] Assign m to an "array of arrays" (allocate to 1D array of integer pointers of size 2 = # of rows) 3] Assign each element in the "array of arrays" its own row of integers (each row has size 4) 4] Write code to free m 5] What is address arithmetic for m[i][j]? 6] What's address arithmetic for m[0][0]?

1] int **m; 2] m = malloc(sizeof(int*) * 2); error check: if (m == NULL) { printf("mem alloc failure"); exit(1); } 3] *(m+0) = malloc(sizeof(int) * 4); //error check: if (m[0] == NULL) {...} m[1] = malloc(sizeof(int) * 4); //error check: if (m[1] == NULL) {...} 4] free(m[0]); free(*(m+1)); free(m); - the order in which you free the rows doesn't matter - free the arrays each element of m points to before freeing m (the pointer array) - to avoid memory leaks, free the components of a heap 2D array in reverse order of allocation 5] *(m[i] + j), (*(m+i))[j], *(*(m+i) + j) - most commonly used 6] *(*(m+0) + 0) = *(*(m)) = **m

1] What is abstraction? 2] What are the 3 faces of memory? 3] What is the goal of Virtual Memory? What is the goal of Hardware Memory? 4] What is VAS? Virtual Address? PAS? Physical Address? 5] What is goal of system view - illusionist? What are pages? What is page table?

1] A tool that allows us to manage complexity by focusing on relevant details 2] Process View = Virtual Memory, System View = Illusionist, Hardware View = Physical Memory 3] The goal of virtual memory is to provide a simple view to programmers, while the goal of physical memory is to keep the CPU busy (leverage memory closer to the CPU - ie registers & cache) 4] VAS or Virtual Address Space is the illusion that each process has its own contiguous memory space. A virtual address is a memory address generated by a process that gives a location in the VAS but not in actual computer hardware. PAS or Physical Address Space is a multi-level hierarchy (L0: Registers -> L1-L3: Cache -> L4: Main Memory -> L5: Local Secondary Storage -> L6: Network Storage) which keeps frequently accessed data next to the CPU. A physical address is used to access machine memory. 5] The illusionist's goal is to make memory shareable but also safe (ie stop you from overwriting some other process). A page is a 4 KB unit = 4096 bytes. A page table is a data structure in the operating system that maps virtual pages to physical pages (with 2 goals: ensure processes can't interfere with each other, & map only the virtual pages in use to physical pages)

1] Visualize the stack (callers & callee functions). What does the compiler need to do to make function calls work? 2] What is a stack frame at the low-level? In IA-32 what must it be a multiple of? What 2 registers are used for stack frames? *Remember when you visualize the stack it grows bottom to top where bottom is caller, but in memory its different* 3] A callee's args are in ____ stack frame? 4] What is the offset from the %ebp to get to a callee's first argument? 5] When are local vars alloc'd on the stack? (After a function call, why use the stack instead of register?)

1] Transfer control to the callee & remember the next address, handle passing arguments, allocate & free stack frames, allocate & free params & local vars in those stack frames, handle return values, overhead ("other stuff") 2] A stack frame (aka activation record/frame) is a block of memory used for a function call; in IA-32 it must be a multiple of 16 bytes, and it uses the %ebp & %esp registers. - %ebp is the base pointer (points to the bottom of the frame - "technically the higher address, since the stack grows downward") - %esp is the stack pointer (points to the top of the stack - the lowest address) 3] The caller's, because the caller calls the callee function with those args 4] +4 to skip the saved base pointer (%ebp itself) & +4 to skip the return address = +8 in total 5] a.) registers are too small (32 bits - 4 bytes, can fit only 1 integer), so int arrays, structs, and unions go to the stack b.) not enough registers (for all local vars of a function) c.) when the address-of (&) operator is used (the var needs to be DIRECTLY in memory, not INDIRECTLY in a register)

1] pointer type - __ *. What is the pointee type used for in assembly? 2] What is the pointer value used for in assembly: 0x____, 0x000 (NULL) 3] Address of: &i, is what instruction in assembly? 4] Dereferencing: *i, is what instruction in assembly?

1] The pointee type determines the scale factor - e.g., to access a memory location with scaled-index addressing you need the element size, so for an array the element type's size is the S.F. 2] The pointer value is an address used with an addressing mode to specify an effective address in assembly 3] leal - calculates & stores the effective address 4] mov - gets the memory/data at the effective address & stores it

1] What is a struct? 2] Why use a struct? 3] Write out the 2 ways to define a struct. 4] What is dot operator? 5] Consider code: struct Date { int month; int day; int year; }; Create a Date variable containing today's date. 6] Consider code: typedef struct { int month; int day; int year; } Date; Create a Date variable containing today's date. 7] A struct's data members are initialized/uninitialized by default? 8] A struct's identifier used as source operand reads entire struct/part of struct? 9] A struct's identifier used as destination operand write entire struct/part of struct?

1] User-defined type (like a class), a compound unit of storage (like an array) but with data members of different types, accessed via identifier & data member name, allocated as a contiguous fixed-size block of memory. 2] Enables organizing complex data as a single entity 3] struct <typename> { <data-member-declarations>; }; typedef struct { <data-member-declarations>; } <typename>; 4] Used for member selection of a struct 5] struct Date today; today.month = 12; today.day = 10; today.year = 2023; 6] Date today = {9, 21, 2023}; 7] Uninitialized 8] entire struct 9] entire struct, ex: struct Date tomorrow; tomorrow = today; //copies each data member of today to tomorrow, SLOW!

1] What is the simple view of the heap? 2] What is free block organization? 3] Explain what an EFL (explicit free list) is, and its space & time complexity. 4] Explain what an IFL (implicit free list) is, and its space & time complexity. 5] Why do we use EFL or IFL?

1] Uses double-word alignment, so when you allocate blocks you are alloc'ing payload + padding ONLY 2] In simple view of heap, have no way of knowing size (# bytes in block, payload + overhead) or status (is block free'd - 0, or alloc'd - 1) for each block. 3] EFL is when allocator uses separate data structure with free blocks only (list). - Code: only need to track size, because status implied - (Con) Space: potentially more space required - (Pro) Time: a bit faster, only search free blocks 4] IFL uses heap block to track size & status - Code: must track size & status, and check each block - (Pro) Space: potentially less mem required (don't need separate data structure) - (Con) Time: more time required to skip alloc'd blocks as we're looking in heap & not EFL 5] To get free block of right size (maximize mem util & minimize wasted mem/padding)

1] Larger caches want better hit rate or hit time? What about smaller caches?

1] Hit rate is # hits / # memory accesses (to get fewer misses we want more stuff in the cache). Hit time is the amount of time it takes to service a hit (more stuff in the cache means more to search). Smaller caches have to be fast, so they prioritize hit time; larger caches prioritize hit rate.

Explain each term/function: 1. brk 2. int brk (void *addr) 3. void *sbrk(intptr_t incr) 4. errno 5. What is the caveat with using both malloc/calloc/realloc and break functions (int brk & void *sbrk)?

1] Pointer to the end of the program (end of code seg + data), at the top of the heap. 2] Sets the top of the heap to the specified addr; returns 0 if successful, else -1 and sets errno 3] intptr_t is a signed integer. Changes the program's top of heap by incr # of bytes; returns the old brk if successful, else -1 and sets errno. A safer option than the int brk() function: instead of directly setting the brk point, wherever it is, you just make the heap bigger/smaller. Remember: positive increment to expand/grow, negative increment to shrink 4] A value set by the operating system to communicate errors. In code: #include <errno.h> #include <string.h> printf("Error: %s\n", strerror(errno)); 5] It's undefined (causes unpredictable behavior)

1] What is Buffer Overflow? 2] What happens to the stack if lets say you add a stack frame for an array (you make a SAA) and then exceed its buffer. 3] What is stack smashing? 4] Did this happen? What did it do? 5] Can this still happen?

1] When we exceed the bounds of an array; more dangerous for SAAs: stack allocated arrays 2] Once it overruns the data in the callee's stack frame at the stack TOP (lowest address), it writes into data in the caller's stack frames and keeps going until it hits the stack BOTTOM - the entire stack is now corrupted. - This overwrites the state of execution (because it overwrites the callers' stack frames) 3] Step 1] Get "exploit code" in - enter input crafted to be machine instructions (a buffer overflow normally overwrites some other part of memory unintentionally, but here you do it intentionally) Step 2] Get "exploit code" to run - overwrite the return address with the address of the buffer Step 3] Cover your tracks - restore the stack so execution continues as expected. Note: basically hijacking the program's execution with your own code 4] Yes - in 1988 the Morris Worm brought down much of the Internet. 5] Much harder now: loaders/operating systems added protections (e.g., sandboxing and non-executable stacks) that stop injected code from simply running.

1] What is a write miss? 2] Should space be allocated in this cache for the block being changed? 3] For typical designs, Write Through is paired with what? Consider the most optimal design. 4] Write Back is paired with what? Consider the most optimal design. 5] Out of Q3 & Q4, which pair better exploits locality?

1] A write miss occurs when writing to a block not in the cache system (not at any cache level) 2] a.) No Write Allocate: write directly to the next lower cache level, bypassing the current cache level - (Con): must wait for the lower-level write - (Pro): less bus traffic (less memory transferred in the cache system, because you're not writing it to more than 1 cache level) b.) Write Allocate: read the block into the cache first (miss), then write to it (hit) - (Con): must wait to read from the lower level - (Con): more bus traffic (because the block moves between levels) 3] Write through is paired with no-write-allocate (because write through writes to the current & lower cache level & no-write-allocate writes directly to the lower cache level; we minimize redundancy in higher cache levels/keep fewer blocks there) 4] Write back is paired with write allocate (because write back writes to the lower level only when the current level evicts & write allocate reads the block in first, so writes accumulate in the cache and only move down on eviction). This is a good pair because it keeps memory higher in the memory hierarchy (closer to the CPU - therefore better locality). 5] Write back + write allocate, because activity is concentrated at the top (only use it when you're not worried about immediately getting data to main memory).

1] Is this a SAA 2D array: void someFunction() { int m[2][4] = { {0, 1, 2, 3} , {4, 5, 6, 7} }; } 2] How are SAA 2D arrays laid out in memory? 3] Given the m array above, what is the type & scale factor (S.F) for the following: a.) **m = *(*(m)) = *(*(m+0)+0) = m[0][0] b.) *m = *(m+i) c.) m[0]? m[i]? d.) m? 4] In 2D stack arrays ONLY, what are m & *m? 5] What is the address arithmetic for SAA: m[i][j]

1] Yes, because a) it's local to the function, and b) its dimensions are in the declaration (array size declared at compile time, not dynamic) 2] 2D arrays allocated on the stack are laid out in row-major order as a single contiguous block 3] a. type: int, S.F: none (not jumping between anything, no byte offset) b. type: int *, S.F: stack - 16 bytes = 4 elements * 4 bytes/element (laid out in row-major order, so stepping to the next row means stepping over a whole row). Heap - 4 bytes = size of an int * (the next row pointer) c. same as above (b), just using indexing instead of address arithmetic d. type: int **, S.F: stack - element size * #columns, heap - sizeof(int *) 4] Both are pointers to the first element in the array (m points to the first row, *m to its first element) 5] m[i][j] = *(*(m+i) + j) = *(*m + cols * i + j)

1] What is the cache? 2] What are the 3 memory units for cache, explain each of them? 3] What are CPU Cycles? 4] What is latency?

1] A smaller but faster memory (pallet) that acts as a staging area for data stored in a larger, slower memory (warehouse) 2] word: 4 bytes, size used by the CPU, transferred between L1 cache & CPU. block: 32 bytes, size used by the cache, transferred between cache levels (C) & main memory (MM). *A contiguous run of bytes* (AKA cache block). page: 4 KB = 4096 bytes, size used by main memory, transferred between main memory (MM) & secondary storage (SS) 3] CPU cycles are used to measure time 4] Latency is memory access time (delay)

1] The last word of each free block is a __a__ which contains __b__. Why do they exist? Why don't alloc'd blocks need footers? 2] If only free blocks have footers, how do we know if previous block will have a footer? 3] Free & alloc'd block headers encode _____? 4] What integer value will header have for alloc'd block that's: a.) 8 bytes in size + prev block free b.) 8 bytes in size + prev block alloc'd 5] Given pointer to payload, how do you get to header of previous block if it's free? 6] When you allocate a block you need to update the next blocks __ bit?

1] a) footer b) the free block's size. They exist to assist with coalescing; alloc'd blocks don't need to be coalesced (they aren't free), so they don't have footers 2] Need a p-bit (previous bit) to track whether the previous block is alloc'd/free (remember: 1 means alloc'd & 0 means free) 3] size + p-bit + a-bit 4] a) 8 + 0 + 1 = 9 b) 8 + 2 (binary 10 = 2 in decimal) + 1 = 11 5] Step 1] ptr - sizeof(HDR) //gets the current block's size_status Step 2] check the p-bit of size_status; if it's 0, the previous block is free & we can check its footer. Do ptr - 8 to get to the footer, which gives the previous block's size. Step 3] ptr - 4 - prevBlockSize //final equation, points at the previous block's header 6] p-bit (previous bit)

1] Explain the following parts of an Implicit Free List implementation: - The first word of each block is a __a__ - The header stores __b__ 2] Given a pointer to the first block in the heap, how is the next block found? 3] Explain all 3 placement policies (First Fit - FF, Best Fit - BF, Next Fit - NF) and the memory utilization & throughput of each. Assume the heap is divided into blocks, going from small to large. 4] To maintain double-word alignment, the payload address has to be a multiple of 8, but the first header is only 4 bytes. What is done to keep this convention?

1] a) header, b) both size & status as a single integer 2] next = (void *)((char *)ptr + current block size) - mask the status bits out of the header to get the size, and cast so the pointer arithmetic is in bytes 3] First Fit (FF): starts from the beginning of the heap & stops at the 1st block big enough; fails if it reaches END_MARK (end of heap) - Mem Util (Pro): likely to choose a block of about the right size (because it starts from the small blocks) - Throughput (Con): since it starts at the beginning, it may step through many blocks to find larger sizes. Next Fit (NF): starts from the block most recently alloc'd, stops at the 1st block big enough, wraps around at END_MARK, and fails if it gets back to the first block checked - Mem Util (Con): may pick a block too large (creates fragmentation) because it doesn't start at the beginning of the heap - Throughput (Pro): faster because it doesn't start from the beginning of the heap. Best Fit (BF): starts from the beginning of the heap, scans to END_MARK & chooses the best fit (closest to the requested size), stops early if an exact match is found, fails if no block is big enough - Mem Util (Pro): closest in size - Throughput (Con): slowest in general, must search the entire heap 4] Skip the first 4 bytes of the heap

Explicit Free List Improvements 1] What are the 2 types of free list orderings (explain each one of them)? 2] What is Free List Segregation? What is simple segregation? Fitted segregation?

1] a.) Address order: order the free list from lowest to highest address (Con) free is O(N), where N is # of free blocks in the EFL - it's a linked list, so we must traverse the links from the beginning to wherever the block is being added (Pro) malloc with FF (First Fit) has better memory utilization than last-in order b.) Last-in order: place the most recently freed block at the start of the list (Pro) malloc with FF (First Fit) works well for programs that reuse same-size memory blocks (Pro) free is O(1) - just link the free block at the start of the EFL 2] An array of free lists, where each element of the array covers a size range and blocks in that size range go in that element's list - Simple segregation: 1 EFL for each block size. Heap block structure: simple - a block needs only a successor ptr. malloc: fast O(1) - choose the 1st free block in the linked list of free blocks; if the free list is empty, get more heap memory from the OS and divide it into correctly sized blocks for that list. free: fast O(1) - link the free block onto the list of the appropriate size. Problem: fragmentation - Fitted segregation: one free list for each size range: small, medium, large. (Pro) memory utilization can be as good as best-fit. (Pro) throughput, since with an EFL there's no searching through alloc'd blocks. Fitting: do an FF search on the appropriate list; if it FAILS, look in the next larger list. Splitting: put the new free block in the appropriate list after the split (it becomes smaller). Coalescing: put the new free block in the appropriate list (it becomes larger)

1] a.) Explain set associative cache. b.) What is the address breakdown if blocks are 32 bytes & there are 1024 sets? c.) Let E be the # of lines per set. Name the cache if E=4 & if E=1. d.) Let C = (S, E, B, m), with cache size (in bytes) C = S x E x B. How big is a cache given (1024, 4, 32, 32)? e.) What happens when E+1 different memory blocks map to the same set?

1] a.) Has S sets with E lines per set; a memory block maps to exactly 1 set & can be placed in any line of that set.
- (Pro): Fast - O(1) set selection, O(E) line matching
- (Pro): Reduces conflict misses
- (Con): More circuitry (though not as much as fully associative)
b.) B = 32 = 2^5, so b = 5 bits; S = 1024 = 2^10, so s = 10 bits; the remaining 32 - 10 - 5 = 17 bits are the t-bits (tag).
c.) E=4 → 4-way set associative cache; E=1 → direct-mapped cache.
d.) C = 1024 x 4 x 32 = 2^10 x 2^2 x 2^5 = 2^17 bytes = 128 KB.
e.) You cannot add a new line to the set, so a replacement policy picks which line to evict: the old line's data is replaced with the new block's data, its tag bits are updated to the new block's tag, and the v-bit stays set.
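The arithmetic in parts (b) and (d) can be sanity-checked with a short helper; the struct and function names here are illustrative, not from the course:

```c
/* Derive the address breakdown (t, s, b) and total cache size in bytes
   from the cache parameters C = (S, E, B, m). */
typedef struct { int t, s, b; long bytes; } cache_params;

static int log2i(long x) {              /* x assumed to be a power of two */
    int n = 0;
    while (x > 1) { x >>= 1; n++; }
    return n;
}

cache_params describe(long S, long E, long B, int m) {
    cache_params c;
    c.b = log2i(B);                     /* block-offset bits: B = 2^b   */
    c.s = log2i(S);                     /* set-index bits:    S = 2^s   */
    c.t = m - c.s - c.b;                /* remaining bits are the tag   */
    c.bytes = S * E * B;                /* cache size C = S x E x B     */
    return c;
}
```

Plugging in (1024, 4, 32, 32) reproduces the answer above: b = 5, s = 10, t = 17, and 128 KB total.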

Linux: 1] Explain the commands cd, cd .., ls, ls -al, man <command>, cp <file> <destination>, rm <file>, mkdir <directory>, rmdir <directory>, mv <file> <destination> 2] What are the 4 steps of the build process for C files? 3] Explain what each switch in this command does: gcc -E decode.c -Wall -m32 -std=gnu99 -o decode.i 4] What are the 2 ways to debug?

1] cd changes directory; cd .. changes to the parent directory; ls lists the files in a directory; ls -al lists all files (including hidden ones) with their details; man gives the manual page for a command; cp copies a file to a destination; rm removes a file (a directory needs rm -r, or must be emptied first); mkdir makes a directory; rmdir removes an (empty) directory; mv moves a file to a destination 2] Pre-processing Phase (expands #includes & macros, preparing for compilation), Compilation Phase (C → assembly), Assembling Phase (assembly → binary object code) & Linking Phase (links object files, libraries, etc. into an executable) 3] gcc is used to build a C executable. -E stops after the pre-processing phase (producing decode.i), -Wall enables all warnings, -m32 generates code for a 32-bit environment, -std=gnu99 selects the C99 language dialect with GNU extensions, -o names the output file 4] Print statements and using GDB

1] What happens if the free block chosen is bigger than requested? 2] What happens if there isn't a large enough free block to satisfy a request? 3] What are the types of coalescing mechanisms? 4] Is coalescing constant or linear (does it depend on the # of heap blocks)? 5] What is the scale factor of void *? 6] Given a pointer to a payload, how do you find its block header? The next block's header? The previous block's header? 7] Why do we use type casting?

1] Either:
A) Use the entire block. (Pro) Throughput: fast (bigger blocks = fewer blocks to traverse) & simple to code. (Con) Memory utilization: more internal fragmentation (larger block = more padding). (Con) Must pre-divide the heap into various sized blocks (__)??
B) Split into 2 blocks: the 1st becomes the alloc'd block, the 2nd a free block. (Pro) Memory utilization: less internal fragmentation. (Con) Throughput: creates more blocks, so searching all blocks is slower. (Pro) Splitting itself is fast, ~O(1).
2] A) Coalesce (merge) adjacent free blocks, B) ask the kernel for more heap memory, C) if both fail, the allocation fails (return NULL).
3] Immediate: coalesce with the next & previous (right & left) neighbors on every free operation. Delayed: coalesce only when needed by an alloc operation (i.e. a helper function coalesce() gets called to see if there's enough space).
4] Constant - O(1)
5] 1
6] HDR ptr = (void *)ptr - 4, i.e. ptr - sizeof(blockHDR). Next block's header: (ptr - 4) + current block's size. Previous block's header: (ptr - 4) - previous block's size.
7] To set the correct scale factor for pointer arithmetic, e.g. (void *) and (char *) both have scale factor 1.
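The pointer arithmetic in answers 5-7 can be sketched as code, assuming 4-byte headers whose low bits hold status flags (as in the notes above); the helper names are illustrative. Casting to char * gives the scale factor of 1 that answer 7 refers to:

```c
#include <stddef.h>

typedef struct { int size_status; } blockHeader;

/* payload -> its header: step back sizeof(blockHeader) bytes.
   char * has scale factor 1, so the subtraction is in raw bytes. */
blockHeader *hdr(void *payload) {
    return (blockHeader *)((char *)payload - sizeof(blockHeader));
}

/* current header -> next block's header: add this block's size
   (low status bits stripped out of size_status first). */
blockHeader *next_hdr(blockHeader *h) {
    return (blockHeader *)((char *)h + (h->size_status & ~7));
}
```

Finding the previous block's header is the mirror image: subtract the previous block's size from the current header's address (which is why some designs keep a footer or a p-bit recording that size).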

1] In linux for standard input, output redirection, explain what the following do: a.out < in_file, a.out > out_file, a.out >> append_file 2] What is File I/O? What is File Input (explain functions: fgetc, fgets, fscanf)? 3] What is File Output (explain functions: fputc, fputs, fprintf)? 4] What are the 3 predefined File Pointers?

1] input redirection: a.out < in_file; output redirection: a.out > out_file (overwrites an existing file); a.out >> append_file (doesn't overwrite an existing file, appends to it instead)
2] File Input: fgetc reads 1 char at a time; fgets reads 1 string, stopping at a newline char or EOF; int fscanf(FILE *stream, const char *format_string, &v1, &v2, ...) → reads formatted input from stream & returns the number of inputs stored, or EOF if an error/end-of-file occurs before any inputs. Note: FILE * is a ptr to a structure tracking where you are in the file.
3] File Output: fputc writes 1 char at a time; fputs writes 1 string; int fprintf(FILE *stream, const char *format_string, v1, v2, ...) → writes formatted output to stream, returns the number of chars written or negative if error (same as printf but with an explicit file stream)
4] stdin - console keyboard; stdout - console terminal window; stderr - console terminal window, a second stream for errors. Note: printf("Hello\n") = fprintf(stdout, "Hello\n") - printf just writes to the stdout stream.
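A minimal round trip showing fprintf writing formatted output to a stream and fscanf reading it back. tmpfile() (a standard C anonymous read/write stream) stands in for a real fopen'd file; the function name is illustrative:

```c
#include <stdio.h>

/* Write `in` to a temporary file with fprintf, rewind, and read it back
   with fscanf. Returns fscanf's result: the # of inputs stored (1 here),
   or EOF/negative on failure. */
int roundtrip(int in, int *out) {
    FILE *fp = tmpfile();            /* anonymous read/write stream */
    if (!fp)
        return -1;
    fprintf(fp, "%d\n", in);         /* formatted write to the stream */
    rewind(fp);                      /* seek back to the start of the file */
    int n = fscanf(fp, "%d", out);   /* formatted read from the stream */
    fclose(fp);
    return n;
}
```

The same calls work with stdin/stdout directly, since they are just predefined FILE * streams: fscanf(stdin, ...) is scanf, and fprintf(stdout, ...) is printf.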

1] Identify which C loop statement (for, while, do-while) corresponds to each code fragment below:
//A:
loop1:
  loop_body
  t = loop_condition
  if (t) goto loop1
//B:
t = loop_condition
if (!t) goto done
loop2:
  loop_body
  t = loop_condition
  if (t) goto loop2
done:
//C:
loop_init
t = loop_condition
if (!t) goto done
loop3:
  loop_body
  loop_update
  t = loop_condition
  if (t) goto loop3
done:
2] Most compilers (gcc included) base their loop assembly code on the ___ loop form (as shown above)

A: do-while
do { loop_body } while (loop_condition); //runs the body 1 or more times
B: while
while (loop_condition) { loop_body } //runs the body 0 or more times
C: for
for (loop_init; loop_cond; loop_update) { loop_body; } //typically runs the body a fixed # of times
2] Do-while
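Fragment A's goto skeleton can be written directly in C to see why it is a do-while: the body always executes once before the condition is tested. The summing body here is just an illustrative stand-in for loop_body:

```c
/* do-while in its goto "base form" (fragment A): body first, test after.
   Sums 1..n; note that even for n <= 0 the body still runs once. */
int sum_to(int n) {
    int i = 1, sum = 0;
loop1:                        /* loop_body */
    sum += i;
    i++;
    if (i <= n) goto loop1;   /* t = loop_condition; if (t) goto loop1 */
    return sum;
}
```

This is why compilers favor the do-while form: the other loops reduce to it by adding one guarded test before the label (fragment B), plus an init and update (fragment C).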

What is m in terms of s, b, t, S, B, E? (E = # lines/set, S = # sets, B = # bytes/block, t = # tag bits, s = # set-index bits, b = # block-offset bits)

m is the total number of address bits, i.e. the full address breakdown: m = t + s + b (tag bits + set-index bits + block-offset bits)

