CS 354 Exam 2

How many blocks map to each set for a 32-bit AS and a cache with 1024 sets?

(2^(32-5)) / 2^10 = 2^27 / 2^10 = 2^17 blocks/set. This is 2^t, where the t tag bits of the address identify which block is in the set.

given pointer to the first block in heap, how is next block found

(void *)ptr + current block size (byte arithmetic; strictly, cast to char * first, since arithmetic on void * is a GCC extension)
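
A minimal sketch of that pointer arithmetic in C, assuming an implicit-list heap where each block begins with a 4-byte header holding the block size plus status bits in the low 3 bits (the type and field names are illustrative, not the course's exact code):

    /* Step from one heap block to the next using the size in its header. */
    typedef struct {
        unsigned int size_status;   /* block size + status bits in the low 3 bits */
    } block_header;

    void *next_block(void *ptr) {
        block_header *hdr = (block_header *)ptr;
        unsigned int size = hdr->size_status & ~0x7;   /* mask off status bits */
        return (char *)ptr + size;                     /* byte arithmetic      */
    }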

2^10 bytes

1 kilobyte (KB)

2^20 bytes

1 megabyte (MB)

The caching system uses locality to predict what the CPU will need in the near future: 1. temporal 2. spatial

1. anticipates the same data will be reused, so it copies values into the cache 2. anticipates nearby data will be used, so it copies a whole block

1. Eb 2. Imm 3. s

1. base register (starting address) 2. immediate offset value 3. scale factor (1, 2, 4, or 8)

1. word offset 2. byte offset

1. identifies which word in block 2. identifies which byte in word

Rethinking Addressing: -An address identifies which byte in VAS to access -An address is divided into parts to access memory in steps Step 1: ? Step 2: ? Step 3: ?

1. Identify which block in VAS 2. Identify which word in block (3 bits) 3. Identify which byte in word (2 bits)

The block number bits of an address are divided into 2 parts 1. set 2. tag

1. maps block to specific set in cache 2. uniquely identifies block in the set
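
A minimal sketch of pulling those fields out of an address in C, assuming the IA-32 numbers used on these cards (32-byte blocks, so b = 5 offset bits; 1024 sets, so s = 10 set bits; leaving t = 32 - 10 - 5 = 17 tag bits); the macro and function names are illustrative:

    #define B_BITS 5    /* 32-byte block -> 5 block-offset bits */
    #define S_BITS 10   /* 1024 sets     -> 10 set-index bits   */

    unsigned int block_offset(unsigned int addr) { return addr & ((1u << B_BITS) - 1); }
    unsigned int set_index(unsigned int addr)    { return (addr >> B_BITS) & ((1u << S_BITS) - 1); }
    unsigned int tag(unsigned int addr)          { return addr >> (B_BITS + S_BITS); }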

Goals of Allocator design: 1.maximize throughput 2.maximize memory utilization

1. number of mallocs and frees handled per unit time 2. memory requested / heap memory allocated

Free List Ordering 1. address order 2. last-in order

1. Address order: keep the free list ordered from low to high address (+) malloc with first fit has better memory utilization (-) free is slower, O(N) where N is the # of free blocks 2. Last-in order: place the most recently freed block at the front of the EFL (+) good for malloc with first fit when programs request the same size (+) free is O(1), just link at the head of the EFL (+) coalescing stays O(1) with footers

Memory Units 1. word 2. block 3. page

1. word: size used by the CPU (registers, L1 <-> CPU) 2. block: size used by the cache (transfers between cache levels & MM) 3. page: size used by MM (transfers between MM & SS)

%ax

2 bytes

How many 32-byte blocks are in a 32-bit address space?

2^32 / 2^5 = 2^27 = 2^7 * 2^20 = 128M blocks

word double word

4 bytes 8 bytes

Cache Block: How big is a block? Let B be the number of bytes per block; IA-32 uses 32 B/block

B = 2^b, so 32 = 2^5 gives b = 5; b is the number of address bits that determine which byte in the block

C = (S,E,B,m)

C (size of the cache in bytes) = S x E x B, where S = number of sets, E = number of lines per set, B = bytes per block, and m = number of bits in an address (bits required to address all memory locations)
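
Worked example using numbers that appear on these cards: for a direct mapped cache with S = 2^10 sets, E = 1 line/set, and B = 32 bytes/block, C = 2^10 x 1 x 2^5 = 2^15 = 32 KB.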

Explicit Free List Layout

Header = block size + status bits (low bits 0/p/a; a = 0 since the block is free), pred = addr of the free block before it in the list, succ = addr of the free block after it in the list, possibly more free words, Footer = block size only
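
A minimal sketch of that layout as a C struct, assuming IA-32 4-byte words and that the pred/succ pointers occupy the first two words of the (unused) payload; field names are illustrative:

    /* One free block in an explicit free list.  The footer (size only)
     * sits in the last word of the block, not in this struct.          */
    typedef struct free_block {
        unsigned int       header;   /* block size + p/a status bits (a = 0: free) */
        struct free_block *pred;     /* previous free block in the list            */
        struct free_block *succ;     /* next free block in the list                */
        /* ... possibly more free words, then the footer ...                       */
    } free_block;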

placement policies

L1 - unrestricted L2 - restricted (block % 16)

Cache Block: How many bytes in an address space? The bits of an address are used to determine if block containing that addr is in cache. Let M be # of bytes in AS, IA-32 is 4GB

M = 2^m, where m is the number of bits in an address; 4 GB = 2^32 bytes

void *realloc(void *ptr, size_t size)

Reallocates a previously allocated block of heap memory pointed to by ptr to size bytes, or returns NULL if reallocation fails.

Cache size =

S (number of sets) x E (number of lines per set) x B (number of bytes per block)

void *sbrk(intptr_t incr)

SAFER than brk: attempts to change the program's top of heap by incr (+/-) bytes. Returns the old brk (previous top of heap) if successful, else (void *)-1 and sets errno

Free List Segregation

Use an array of free lists segregated by size; malloc chooses the appropriate free list based on the required size. Simple: one free list per size. Fitted: one free list per size range.
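
A minimal sketch of the "fitted" variant in C, reusing the free_block type sketched above and assuming size classes whose ranges double; the class boundaries and names are illustrative:

    #define NUM_CLASSES 10
    static free_block *seg_lists[NUM_CLASSES];  /* seg_lists[i]: free blocks of size
                                                   16<<i .. (16<<(i+1))-1; the last
                                                   class holds everything larger    */

    /* Pick which free list malloc should search for a request of 'size' bytes. */
    int size_class(unsigned int size) {
        int i = 0;
        while (i < NUM_CLASSES - 1 && size >= (16u << (i + 1)))
            i++;
        return i;
    }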

fully associative cache

a cache having one set with E lines, where a mem block can be stored in any line -good for small caches

miss penalty

additional time to process a miss

which of the following contribute to external fragmentation A. block padding B. block headers C. adjacent free blocks D. adjacent allocated blocks E. block payloads F. non-adjacent free blocks

C and F: adjacent free blocks, non-adjacent free blocks

void *calloc(size_t nItems, size_t size)

Allocates, clears to 0, and returns a block of heap memory of nItems * size bytes, or returns NULL if allocation fails

cache hit

block is found in cache

cache miss

block not found in cache

victim block

cache block chosen to be replaced

direct mapped cache

a cache having S sets with one line per set, where each mem block maps to exactly one set -good for big caches

an allocator _____ reorder allocation requests to improve heap memory utilization

cannot

user-called coalescing

coalesce only if the user calls a coalesce function

delayed coalescing

coalesce only when needed by an alloc operation

immediate coalescing

coalesce with next and previous on free operation
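
A minimal sketch of the "coalesce with next" half of immediate coalescing in C, assuming the header/footer layout sketched above with the allocated bit in bit 0 of the header; pointer and helper names are illustrative:

    /* Merge a newly freed block with the block after it, if that block is free. */
    void coalesce_with_next(unsigned int *hdr) {
        unsigned int size   = *hdr & ~0x7;
        unsigned int status = *hdr & 0x7;                 /* keep this block's bits */
        unsigned int *next_hdr = (unsigned int *)((char *)hdr + size);
        if ((*next_hdr & 0x1) == 0) {                     /* next block is free     */
            unsigned int combined = size + (*next_hdr & ~0x7);
            *hdr = combined | status;                     /* enlarged header        */
            *(unsigned int *)((char *)hdr + combined - 4) = combined;  /* new footer,
                                                             size only              */
        }
    }

Coalescing with the previous block works the same way, except the previous block's footer is used to find its header.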

unistd.h

collection of system call wrappers

Explicit Free List

data structure with a list of free blocks -the list pointers hold payload addresses (-) space (+) time

%eax

e = extended; 4 bytes

External Fragmentation

enough heap memory but divided into blocks that are too small

register

fastest memory, directly accessed by the ALU; can store 1, 2, or 4 bytes

Memory Hierarchy: CPU L0 -> registers L1 - L3 -> cache L4 -> main memory L5 -> Local Secondary Storage (SS) L6 -> Network Storage

gives illusion of having lots of fast memory

false fragmentation

contiguous free memory is large enough, but it is divided into adjacent free blocks that are individually too small (fixed by coalescing)

set associative cache

having S sets with E lines per set, mem blocks map to 1 set and can be in any line within that set.

Internal Fragmentation

heap memory inside an allocated block that is used for overhead (header, footer, padding) rather than payload

Cache Performance: More Lines

hit rate: better, more temporal locality; hit time: worse, slower line matching; miss penalty: worse, harder/slower to detect a miss -therefore faster caches have fewer lines per set

Cache Performance: Larger Blocks

hit rate: better, more spatial locality per block; hit time: same; miss penalty: worse, more time to transfer a larger block -therefore block sizes are small, 32 bytes or 64 bytes

Cache Performance: More Sets

hit rate: better, more temporal locality; hit time: worse, more sets slow down set selection; miss penalty: same -therefore faster caches have fewer sets

Associativity of a set

how many lines per set (E)

%eip

instruction pointer to next instruction

DIY Heap via Posix Calls

int brk(void *addr) void *sbrk(intptr_t incr) errno
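
A minimal sketch of asking the OS for heap space with sbrk, using the return/errno conventions from the sbrk and errno cards (this is just a bump-allocation example, not the course's allocator; the function name is illustrative):

    #include <stdio.h>
    #include <string.h>   /* strerror */
    #include <errno.h>
    #include <stdint.h>   /* intptr_t */
    #include <unistd.h>   /* sbrk     */

    /* Grab 'size' bytes from the OS by raising the program break. */
    void *bump_alloc(intptr_t size) {
        void *old_brk = sbrk(size);              /* old top of heap on success  */
        if (old_brk == (void *)-1) {             /* failure: -1, errno set      */
            fprintf(stderr, "sbrk failed: %s\n", strerror(errno));
            return NULL;
        }
        return old_brk;                          /* new space starts at old brk */
    }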

cache line

a location in the cache that can store one block of memory; composed of storage for the block of data plus info needed for cache operation (e.g., valid bit and tag)

latency

memory access time (delay)

Stride Misses

miss rate = min(1, (wordsize * k) / B) * 100%, where k is the stride length in words and B is the block size in bytes
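
Worked example with the IA-32 sizes from these cards (4-byte words, 32-byte blocks): stride k = 2 words gives min(1, (4*2)/32) * 100 = 25% misses; stride k = 8 or more gives min(1, 1) * 100 = 100%, since every access lands in a new block.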

Least Frequently Used Replacement

must track how often a line is used; each line has a counter -zeroed when the line gets a new block -incremented when the line is accessed -if there is a tie, choose randomly for replacement

hit rate

number of hits / number of memory accesses

Write Hits

occur when writing to a block that is in this cache

Write Misses

occur when writing to a block that is not in this cache

Memory operand specifier: Imm

operand value: M[EffAddr] effective address: Imm Addressing mode: Absolute

Memory operand specifier: Imm(%Eb)

operand value: M[EffAddr] effective address: Imm + R[%Eb] addressing mode: base + offset

Memory operand specifier: Imm(%Eb,%Ei)

operand value: M[EffAddr] effective address: Imm + R[%Eb] + R[%Ei] addressing mode: indexed + offset

Memory operand specifier: Imm(%Eb,%Ei,s)

operand value: M[EffAddr] effective address: Imm + R[%Eb] + R[%Ei]*s addressing mode: scaled index
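
Worked example (register values chosen for illustration): if R[%edx] = 0xf000 and R[%ecx] = 0x100, then the specifier 8(%edx,%ecx,4) has effective address 0xf000 + 0x100*4 + 8 = 0xf408, so the operand value is M[0xf408].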

Memory operand specifier: (%Ea)

operand value: M[EffAddr] effective address: R[%Ea] Addressing mode: Indirect

Memory operand specifier: (%Eb,%Ei)

operand value: M[EffAddr] effective address: R[%Eb] + R[%Ei] addressing mode: indexed (base + index)

Memory operand specifier: (%Eb,%Ei,s)

operand value: M[EffAddr] effective address: R[%Eb] + R[%Ei]*s addressing mode: scaled index (no offset)

Write Allocate

read block into cache first then write to it (-)must wait to read from lower level

word offset

the b - 2 bits that remain after the 2 least significant bits are taken as the byte offset within the word

cold miss

there is room in the cache, but the block has not been brought in yet

errno

set by an OS function to communicate the error; #include <errno.h> (and <string.h> for strerror); printf("Error. %s\n", strerror(errno));

working set

set of blocks used during some interval of time

int brk(void *addr)

sets the top of heap to the specified address addr. Returns 0 if successful, else -1 and sets errno

temporal locality impacts:

size

cache

smaller, faster memory that acts as a staging area for data stored in a larger, slower memory

Memory Mountain

smaller working-set size and smaller stride => higher throughput (MB/s)

immediate operand

specifies an operand value that's a constant Specifier = $Imm Operand Value = Imm

Register operand

specifies an operand value that's in a register Specifier = %Ea Operand Value = R[%Ea]

Memory operand

specifies an operand value that's in memory, at the effective address; Operand Value = M[EffAddr]

destination (D)

specify location for destination (write)

source (S)

specify location of source (read)

Best-Fit

start from the beginning; scan until END_MARK, choosing the best fit (free block closest to the required size), or stop early on an exact match; fail if no block is big enough

First-Fit

start from the beginning of the heap; stop at the first free block that is big enough; fail if END_MARK is reached
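
A minimal sketch of first fit over an implicit free list in C, assuming 4-byte headers with the allocated bit in bit 0 and an END_MARK header whose size field is 0; names are illustrative:

    /* Return a pointer to the header of the first free block that is big
     * enough for 'needed' bytes, or NULL if END_MARK is reached first.   */
    void *first_fit(unsigned int *heap_start, unsigned int needed) {
        unsigned int *hdr = heap_start;
        while ((*hdr & ~0x7) != 0) {                      /* size 0 => END_MARK   */
            unsigned int size = *hdr & ~0x7;
            if ((*hdr & 0x1) == 0 && size >= needed)      /* free and big enough  */
                return hdr;
            hdr = (unsigned int *)((char *)hdr + size);   /* step to next block   */
        }
        return NULL;
    }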

Next-Fit

start from the most recently allocated block; stop at the first free block that is big enough; fail if we wrap around to the first block we checked

stride

step size, measured in words (4 bytes); good spatial locality when the stride is about 1 word

spatial locality impacts:

stride

hit time

time to determine cache hit

Least Recently Used Replacement

track when each line was last used; use an LRU queue (when a line is used, move it to the front); use status bits to track this

conflict miss

two or more blocks map to the same location

cache block

unit of memory transferred between main memory and cache level ex. 32 bytes/block in IA-32

%ah, %al

upper and lower 8-bit halves of the 16-bit %ax register, respectively.

Implicit Free List

use the heap blocks themselves (headers) to track size and status (+) space (-) time

how do you know if a line in the cache is used or not?

use a status bit, the v-bit: if v = 1, a memory block has been copied into the cache line

CPU cycles

used to measure time

temporal locality

when a recently accessed memory location is repeatedly accessed in the near future

spatial locality

when a recently accessed memory location is followed by nearby memory locations being accessed in the near future

capacity miss

when cache is too small for working set

set

the location in a cache to which a block is uniquely mapped

No Write Allocate

write directly to next lower level bypassing this cache

Write Back

write to this cache now; write the changed (dirty) block to the next lower level only when it is evicted

Write Through

write to both this cache and the next lower level immediately

