ECE 30 -- Computer Engineering -- Memory Hierarchy

Miss Penalty (general)

The time required to fetch a block into a level of the memory hierarchy from the lower level: (Memory Access Time from cache to memory) + (Address Transfer Time) + (Word Transfer Time from memory back to cache)

In a fully associative cache, the position of a memory block is found by...

Searching the tags of all the blocks in the cache, because the block can go anywhere.

Fully Associative Cache: Tag

The entire block address (there is no index field, so nothing is taken modulo the number of blocks). # of tag bits = number of block-address bits

Index for two-way set associative cache w/ 16 words, two-word blocks

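(block address) mod (# sets), where block address = floor(word address / 2) and # sets = 8 blocks / 2 ways = 4. Equivalently, floor((word address mod 8) / 2).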

Tag for two-way set associative cache w/ 16 words, two-word blocks

floor((block address) / (# sets)) = floor(word address / 8), since there are 2 words per block and 4 sets
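
The split into tag, set index, and word offset for this configuration can be sketched as below. This is a minimal illustration, assuming word addressing, 16 words of data, two-word blocks, and two ways (so 8 blocks and 4 sets); the function name `split_word_address` is hypothetical.

```python
WORDS_PER_BLOCK = 2
NUM_BLOCKS = 16 // WORDS_PER_BLOCK   # 16 words of data -> 8 blocks
NUM_WAYS = 2
NUM_SETS = NUM_BLOCKS // NUM_WAYS    # 8 blocks / 2 ways -> 4 sets


def split_word_address(word_address):
    """Return (tag, set_index, word_offset) for a word address."""
    word_offset = word_address % WORDS_PER_BLOCK
    block_address = word_address // WORDS_PER_BLOCK
    set_index = block_address % NUM_SETS
    tag = block_address // NUM_SETS
    return tag, set_index, word_offset


# Word address 13 is word 1 of memory block 6, which maps to set 2 with tag 1.
print(split_word_address(13))
```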

Miss Rate

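The fraction of memory accesses not found in a level of the memory hierarchy (misses per memory access, not misses per instruction)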

disk access transfer time

the time to transfer a block of bits

Cache Block Number (direct-mapped)

(Block address) mod (# blocks in cache)

Set index for two-way set associative cache

(Block address) mod (# sets)

Cache Set

(Block address) mod (# sets in the cache)

Fully Associative Cache: Block Position

Not computed from the address. There is no index, so the block can be in any of the cache's blocks and every tag must be searched.

In a direct-mapped cache, the position of a memory block is given by...

(Block address/number) modulo (Number of blocks in the cache)

In a set-associative cache, the set containing a memory block is given by...

(Block address/number) modulo (Number of sets in the cache)

valid bit

A field in the tables of a memory hierarchy. [...] indicates that the associated block in the hierarchy contains true data.

reference bit/use bit

A field that is set whenever a page is accessed and that is used to implement LRU or other replacement schemes.

Write Buffer

A queue that holds data while the data is waiting to be written to memory.

Least Recently Used (LRU)

A replacement scheme in which the block replaced is the one that has been unused for the longest time.
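
LRU replacement within a single cache set can be sketched as follows. This is an illustrative model only, assuming the set is tracked as an ordered collection of block addresses; the class name `LRUSet` and its `access` method are hypothetical, and real hardware typically approximates LRU with a few status bits rather than a full ordering.

```python
from collections import OrderedDict


class LRUSet:
    """One cache set with `num_ways` blocks and LRU replacement."""

    def __init__(self, num_ways):
        self.num_ways = num_ways
        # Keys are block addresses, ordered from least to most recently used.
        self.blocks = OrderedDict()

    def access(self, block_address):
        """Return True on a hit, False on a miss (the block is then loaded)."""
        if block_address in self.blocks:
            # Hit: mark the block as most recently used.
            self.blocks.move_to_end(block_address)
            return True
        if len(self.blocks) >= self.num_ways:
            # Set is full: evict the least recently used block.
            self.blocks.popitem(last=False)
        self.blocks[block_address] = None
        return False


two_way_set = LRUSet(num_ways=2)
results = [two_way_set.access(b) for b in (0, 8, 0, 6, 8)]
print(results)
```

In this trace the access to block 6 evicts block 8 (block 0 was used more recently), so the final access to block 8 misses again.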

Cache Miss

A request for data from the cache that cannot be met because the data is not currently in the cache.

Split Cache

A scheme in which a level of the memory hierarchy is composed of two independent caches that operate in parallel with each other, with one handling instructions and one handling data.

Write-through

A scheme in which writes always update both the cache and the next lower level of the memory hierarchy, ensuring that data is always consistent between the two.

Write-back

A scheme that handles writes by updating values only to the block in the cache, then writing the modified block to the lower level of the hierarchy when the block is replaced.

Describe a direct-mapped cache: Hits and Misses

A block can go in exactly one location, and there is a separate tag for every block (for every word, when blocks are one word wide).

Physical Address

The address in main memory

Address Translation/Mapping

The process by which a virtual address is mapped to an address used to access memory.

Hit Time

The time required to access a level of the memory hierarchy, including the time needed to determine whether the access is a hit or a miss.

tag bit

A field in a table of the memory hierarchy that contains the address information required to identify whether the associated block in the hierarchy corresponds to a requested word.

write-back pros

Copying back an entire page is much more efficient than writing individual words back to the disk.

write-back cons

More costly and complex to implement than write-through. Writes introduce complications into caches that are not present for reads, notably the policy on write misses and the efficient implementation of writes in write-back caches.

Set Entry

...

Example: Mapping an Address to a Multiword Cache Block Consider a cache with 64 blocks and a block size of 16 bytes. To what block number does byte address 1200 map?

11
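
The arithmetic behind this answer can be sketched as below: divide the byte address by the block size to get the block address, then take it modulo the number of cache blocks. The function name `cache_block_for_byte_address` is hypothetical, and a direct-mapped cache is assumed.

```python
def cache_block_for_byte_address(byte_address, bytes_per_block, num_blocks):
    """Return (block_address, cache_block) for a direct-mapped cache."""
    block_address = byte_address // bytes_per_block   # floor(1200 / 16) = 75
    cache_block = block_address % num_blocks          # 75 mod 64 = 11
    return block_address, cache_block


print(cache_block_for_byte_address(1200, 16, 64))  # (75, 11)
```

Block address 75 covers byte addresses 1200 through 1215, so all of those bytes map to cache block 11.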

Example: Bits in a Cache How many total bits are required for a direct-mapped cache with 16 KB of data and 4-word blocks, assuming a 32-bit address?

16 KB is 4K (2^12) words. With a block size of 4 words (2^2), there are 1024 (2^10) blocks. Each block has: (4 × 32 = 128 (2^7) bits of data) + (a tag, which is 32 - 10 - 2 - 2 = 18 bits) + (a valid bit = 1 bit). Thus, the total cache size is 2^10 × (4 × 32 + (32 - 10 - 2 - 2) + 1) = 2^10 × 147 = 147 Kbits, or about 18.4 KB, for a 16 KB cache. For this cache, the total number of bits in the cache is about 1.15 times as many as needed just for the storage of the data.
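
The same count can be reproduced with a short sketch. It assumes 4-byte words, power-of-two sizes, and one valid bit per block; the function name `direct_mapped_total_bits` and its parameters are hypothetical.

```python
def direct_mapped_total_bits(data_bytes, words_per_block,
                             address_bits=32, bytes_per_word=4):
    """Total storage bits (data + tag + valid) for a direct-mapped cache."""
    num_blocks = data_bytes // (bytes_per_word * words_per_block)
    # log2 of each power-of-two quantity, computed with integer arithmetic.
    index_bits = num_blocks.bit_length() - 1
    word_offset_bits = words_per_block.bit_length() - 1
    byte_offset_bits = bytes_per_word.bit_length() - 1
    tag_bits = address_bits - index_bits - word_offset_bits - byte_offset_bits
    data_bits = words_per_block * bytes_per_word * 8
    valid_bits = 1
    return num_blocks * (data_bits + tag_bits + valid_bits)


total_bits = direct_mapped_total_bits(16 * 1024, words_per_block=4)
print(total_bits)              # 150528 bits = 2^10 x 147
print(total_bits / 1024)       # 147.0 Kbits
print(total_bits / 8 / 1024)   # 18.375 KB
```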

Write-stall cycles

= ((Writes / Program) × Write miss rate × Write miss penalty) + Write buffer stalls

CPU Time

= (CPU execution clock cycles + Memory-stall clock cycles) × Clock cycle time

Memory-stall clock cycles

= (Memory accesses / Program) × Miss rate × Miss penalty = (Instructions / Program) × (Misses / Instruction) × Miss penalty

Average Memory Access Time (AMAT)

= Time for a hit + (Miss rate × Miss penalty)
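
A small sketch of the AMAT formula follows. The numbers (1-cycle hit time, 5% miss rate, 20-cycle miss penalty) are made up for illustration and do not come from the card set; the function name `amat` is hypothetical.

```python
def amat(hit_time, miss_rate, miss_penalty):
    """Average memory access time, in the same units as hit_time."""
    return hit_time + miss_rate * miss_penalty


# Hypothetical values: 1-cycle hit, 5% miss rate, 20-cycle miss penalty.
example_amat = amat(hit_time=1, miss_rate=0.05, miss_penalty=20)
print(example_amat)  # about 2 cycles per access
```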

Read-stall cycles

= (Reads / Program) × Read miss rate × Read miss penalty

dirty bit

= 1 when any word in a page is written. If the OS chooses to replace the page, it indicates whether the page needs to be written out before its location in memory can be given to another page. A modified page is often called a [...] page.

Memory-stall clock cycles

= Read-stall cycles + Write-stall cycles
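
The stall-cycle and CPU-time formulas above combine as in the sketch below. All input values are hypothetical and chosen only to make the arithmetic easy to follow; the function names are illustrative.

```python
def read_stall_cycles(reads, read_miss_rate, read_miss_penalty):
    return reads * read_miss_rate * read_miss_penalty


def write_stall_cycles(writes, write_miss_rate, write_miss_penalty,
                       write_buffer_stalls=0):
    return writes * write_miss_rate * write_miss_penalty + write_buffer_stalls


def cpu_time(execution_cycles, memory_stall_cycles, clock_cycle_time):
    return (execution_cycles + memory_stall_cycles) * clock_cycle_time


# Hypothetical program: 1000 reads and 400 writes, 100-cycle miss penalties.
reads_stall = read_stall_cycles(1000, 0.25, 100)      # 25000 cycles
writes_stall = write_stall_cycles(400, 0.5, 100)      # 20000 cycles
memory_stall = reads_stall + writes_stall             # 45000 cycles

# Hypothetical 100000 execution cycles at 2 ns per clock cycle.
print(cpu_time(100000, memory_stall, 2))              # total time in ns
```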

Fully Associative Cache

A cache structure in which a block can be placed in any location in the cache.

Set-associative Cache

A cache that has a fixed number of locations (at least two) where each block can be placed. A [...] is the middle range of designs between direct mapped and fully associative.

Virtual Address

An address that corresponds to a location in virtual space and is translated by address mapping to a physical address when memory is accessed.

Page fault

An event that occurs when an accessed page is not present in main memory.

Byte Address

Block address × Bytes per block (gives the byte address of the first byte in the block)

Block Address

floor(Byte address / Bytes per block)

write allocate/no write allocate

Consider a miss in a write-through cache. [...]: to allocate a block in the cache (most common). The block is fetched from memory and then the appropriate portion of the block is overwritten. no [...]: update the portion of the block in memory but not put it in the cache.

Describe a direct-mapped cache: Cache/Memory Consistency Alternatives

Direct-mapped cache: A write-through scheme can be used, so that every write into the cache also causes memory to be updated. The alternative to write-through is a write-back scheme that copies a block back to memory when it is replaced.

Describe a direct-mapped cache: Bandwidth

Direct-mapped cache: To avoid performance loss, the [...] of main memory is increased to transfer cache blocks more efficiently. Common methods for increasing [...] external to the DRAM are making the memory wider and interleaving. DRAM designers have steadily improved the interface between the processor and memory to increase the [...] of burst mode transfers to reduce the cost of larger cache block sizes.

Describe a direct-mapped cache: Spatial Locality

Improves the efficiency of the cache. To take advantage of [...], a cache must have a block size larger than one word. The use of a larger block does the following: 1. decreases the miss rate 2. reduces the amount of tag storage relative to the amount of data storage in the cache Drawbacks: Although a larger block size decreases the miss rate, it can also increase the miss penalty. If the miss penalty increased linearly with the block size, larger blocks could easily lead to lower performance.

The location of a memory block whose address is 12 in a cache with eight blocks for fully associative placement is...

In a fully associative placement, the memory block for block address 12 can appear in any of the eight cache blocks.

The location of a memory block whose address is 12 in a cache with eight blocks for set-associative is...

In a two-way set-associative cache, there would be four sets, and memory block 12 must be in set (12 mod 4) = 0; the memory block could be in either element of the set.

The location of a memory block whose address is 12 in a cache with eight blocks for direct mapped is...

In direct-mapped placement, there is only one cache block where memory block 12 can be found, and that block is given by (12 modulo 8) = 4.
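
The three placements of memory block 12 can be compared with one sketch that treats direct-mapped as 1-way and fully associative as 8-way. The function name `placement` is hypothetical, and the cache is assumed to have a power-of-two number of blocks divisible by the associativity.

```python
def placement(block_address, num_blocks, num_ways):
    """Return (set_index, num_ways): the set a block maps to and how many
    locations within that set could hold it."""
    num_sets = num_blocks // num_ways
    set_index = block_address % num_sets
    return set_index, num_ways


print(placement(12, num_blocks=8, num_ways=1))  # direct mapped: (4, 1)
print(placement(12, num_blocks=8, num_ways=2))  # two-way: (0, 2)
print(placement(12, num_blocks=8, num_ways=8))  # fully associative: (0, 8)
```

With 8 ways there is a single set, so the "set index" is always 0 and the block may occupy any of the eight cache blocks.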

Interleaved Memory Organization

Instead of making the entire path between the memory and cache wider, the memory chips can be organized in banks to read or write multiple words in one access time rather than reading or writing a single word each time. Each bank could be one word wide so that the width of the bus and the cache need not change, but sending an address to several banks permits them all to read simultaneously. This scheme retains the advantage of incurring the full memory latency only once.

Fully Associative Cache: Index

No Index. Must search every block for match.

block (or line)

The minimum unit of information that can be either present or not present in a cache.

write-back operation

Values are written only to the block in the cache; the lower-level memory is updated when the modified block is replaced.

