Chapter 5: Cache Memory
First In First Out (FIFO) cache replacement algorithm
Replace the block that has been in the cache the longest.
Least Recently Used (LRU) cache replacement algorithm
Replace the block that has been in the cache the longest with no reference to it.
Least Frequently Used (LFU) cache replacement algorithm
Replace the block with the fewest references to it.
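A minimal sketch in C of how each policy chooses its victim; the per-line bookkeeping fields (loaded_at, last_used, ref_count) are illustrative assumptions, not from the text:

```c
#include <stdio.h>

/* Hypothetical per-line bookkeeping for illustrating victim selection. */
struct line {
    int loaded_at;   /* time the block entered the cache (FIFO) */
    int last_used;   /* time of most recent reference (LRU)     */
    int ref_count;   /* number of references so far (LFU)       */
};

int fifo_victim(struct line *set, int n) {
    int v = 0;
    for (int i = 1; i < n; i++)
        if (set[i].loaded_at < set[v].loaded_at) v = i;  /* oldest arrival */
    return v;
}

int lru_victim(struct line *set, int n) {
    int v = 0;
    for (int i = 1; i < n; i++)
        if (set[i].last_used < set[v].last_used) v = i;  /* longest unreferenced */
    return v;
}

int lfu_victim(struct line *set, int n) {
    int v = 0;
    for (int i = 1; i < n; i++)
        if (set[i].ref_count < set[v].ref_count) v = i;  /* fewest references */
    return v;
}

int main(void) {
    struct line set[4] = {
        { .loaded_at = 1, .last_used = 9, .ref_count = 5 },
        { .loaded_at = 4, .last_used = 2, .ref_count = 1 },
        { .loaded_at = 6, .last_used = 7, .ref_count = 8 },
        { .loaded_at = 3, .last_used = 8, .ref_count = 2 },
    };
    printf("FIFO evicts line %d\n", fifo_victim(set, 4)); /* line 0 */
    printf("LRU  evicts line %d\n", lru_victim(set, 4));  /* line 1 */
    printf("LFU  evicts line %d\n", lfu_victim(set, 4));  /* line 1 */
    return 0;
}
```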
Smaller caches are _________ than larger ones
Smaller caches are faster than larger ones because fewer gates are involved in addressing the cache, so lookups take less time.
Set Associative Mapping
The cache consists of a number of sets, each of which consists of a number of lines. Each block of memory is mapped to a specific set. However, within each set, associative mapping is used to place a block within a specific place in the set
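A sketch of how a set-associative cache splits a memory address; the geometry (64-byte blocks, 128 sets) is an illustrative assumption:

```c
#include <stdio.h>
#include <stdint.h>

#define BLOCK_SIZE 64   /* assumed block size in bytes */
#define NUM_SETS   128  /* assumed number of sets      */

int main(void) {
    uint32_t addr = 0x12345678;

    uint32_t offset = addr % BLOCK_SIZE;  /* byte within the block */
    uint32_t block  = addr / BLOCK_SIZE;  /* main-memory block number */
    uint32_t set    = block % NUM_SETS;   /* the one set this block maps to */
    uint32_t tag    = block / NUM_SETS;   /* distinguishes blocks sharing the set */

    printf("addr=0x%08x -> tag=0x%x set=%u offset=%u\n", addr, tag, set, offset);
    return 0;
}
```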
Write back
Updates are made only to the cache. When a write occurs, a dirty bit (aka use bit) associated with the line is set; when the block is later replaced, it is written back to main memory only if the dirty bit is set.
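A toy write-back model; the 8-byte line and all names are assumptions for illustration:

```c
#include <stdio.h>

#define LINE_BYTES 8

unsigned char main_memory[LINE_BYTES];   /* the block's home in main memory */

struct cache_line {
    unsigned char data[LINE_BYTES];
    int dirty;                           /* set on write; checked on eviction */
};

void cache_write(struct cache_line *l, int offset, unsigned char value) {
    l->data[offset] = value;             /* update the cache only... */
    l->dirty = 1;                        /* ...and mark the line dirty */
}

void evict(struct cache_line *l) {
    if (l->dirty) {                      /* write back only if modified */
        for (int i = 0; i < LINE_BYTES; i++)
            main_memory[i] = l->data[i];
        l->dirty = 0;
    }
}

int main(void) {
    struct cache_line l = { {0}, 0 };
    cache_write(&l, 3, 0xAB);            /* main memory not yet updated */
    evict(&l);                           /* now main memory sees the write */
    printf("main_memory[3] = 0x%02X\n", main_memory[3]);
    return 0;
}
```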
Unified vs split cache
Usually a unified cache, holding both data and instructions, is preferable because it balances the load between instruction and data fetches automatically. Split caches are preferred when superscalar execution and pipelining are used, because they eliminate contention between the instruction fetch unit and the execution unit.
If a write is attempted on a block that is not in the cache, two strategies are possible (see the sketch below):
-Write allocate: the block is loaded into the cache and the write is then performed in the cache
-No write allocate: the modification is made directly in main memory, bypassing the cache
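A minimal sketch of the two write-miss strategies; the enum and function names are hypothetical:

```c
#include <stdio.h>

enum policy { WRITE_ALLOCATE, NO_WRITE_ALLOCATE };

void handle_write_miss(enum policy p) {
    if (p == WRITE_ALLOCATE) {
        /* fetch the block into the cache, then perform the write there */
        printf("load block into cache, then write to cache\n");
    } else {
        /* bypass the cache entirely */
        printf("write directly to main memory\n");
    }
}

int main(void) {
    handle_write_miss(WRITE_ALLOCATE);
    handle_write_miss(NO_WRITE_ALLOCATE);
    return 0;
}
```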
Physical cache
a cache between the MMU and main memory. Physical addresses are used.
Logical cache aka virtual cache
a cache between the MMU and processor. The processor accesses the cache directly using virtual addresses without going through the MMU.
Exclusive policy
a piece of data in one cache is guaranteed not to be found in lower levels of cache (e.g., a block in L1 will not also be in L2)
Inclusive policy
a piece of data in one cache is guaranteed to also be found in lower levels of cache (e.g., a block in L1 will also be in L2).
non-inclusive policy
a piece of data in one cache may or may not be found in lower levels of cache
Tag
a portion of a cache line used for addressing purposes
Write through
all write operations are made to main memory as well as cache, ensuring that main memory is always valid.
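A toy write-through sketch, assuming an illustrative 8-byte line held in two global arrays:

```c
#include <stdio.h>

unsigned char cache_data[8];
unsigned char main_memory[8];

void write_through(int offset, unsigned char value) {
    cache_data[offset]  = value;   /* update the cache... */
    main_memory[offset] = value;   /* ...and main memory at the same time */
}

int main(void) {
    write_through(2, 0x5C);
    /* main memory is always valid; no dirty bit is needed */
    printf("cache=0x%02X memory=0x%02X\n", cache_data[2], main_memory[2]);
    return 0;
}
```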
Line/block size is usually
between 8 and 64 bytes. A larger block exploits more spatial locality, but it also reduces the number of blocks that fit in the cache, so blocks may be replaced shortly after being fetched.
Frame
can be used to refer to the slot in the cache that holds a block, distinguishing the physical location from the block of data itself
High-performance computing (HPC)
deals with supercomputers and their software, especially for scientific applications that involve large amounts of data.
Direct mapping
each block of main memory is mapped into only one possible cache line. The most significant bits of the memory address become the tag
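A sketch of the direct-mapping calculation, assuming an illustrative geometry of 64-byte blocks and 1024 lines:

```c
#include <stdio.h>
#include <stdint.h>

#define BLOCK_SIZE 64    /* assumed block size in bytes */
#define NUM_LINES  1024  /* assumed number of cache lines */

int main(void) {
    uint32_t addr  = 0xDEADBEEF;
    uint32_t block = addr / BLOCK_SIZE;
    uint32_t line  = block % NUM_LINES;   /* the only line this block can occupy */
    uint32_t tag   = block / NUM_LINES;   /* most significant bits of the address */

    printf("addr=0x%08x -> tag=0x%x line=%u\n", addr, tag, line);
    return 0;
}
```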
line size
the number of bytes a line can hold, equal to the block size
Block
the minimum unit of transfer between cache and main memory. A block refers to both the unit of data and the physical location in main memory or cache
Hardware memory management unit (MMU)
translates each virtual address into a physical address in main memory
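A minimal page-table sketch of the translation an MMU performs, assuming 4 KiB pages and a toy four-entry table:

```c
#include <stdio.h>
#include <stdint.h>

#define PAGE_SHIFT 12     /* 4 KiB pages assumed */
#define PAGE_MASK  0xFFFu

/* toy page table: virtual page number -> physical frame number */
static const uint32_t page_table[4] = { 7, 3, 9, 1 };

uint32_t translate(uint32_t vaddr) {
    uint32_t vpn    = vaddr >> PAGE_SHIFT;   /* virtual page number */
    uint32_t offset = vaddr & PAGE_MASK;     /* unchanged by translation */
    return (page_table[vpn] << PAGE_SHIFT) | offset;
}

int main(void) {
    uint32_t va = 0x2ABC;                    /* page 2, offset 0xABC */
    printf("virtual 0x%04x -> physical 0x%04x\n", va, translate(va));
    return 0;
}
```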
Location of L3 cache
usually off-chip
Content Addressable Memory (CAM) aka associative storage
when a bit string is supplied, the CAM searches its entire memory in parallel for a match in only one clock cycle and returns the result.
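A software model of a CAM lookup; the sequential loop below only stands in for hardware comparators that actually examine every entry in parallel:

```c
#include <stdio.h>
#include <stdint.h>

#define ENTRIES 8  /* assumed CAM size for illustration */

int cam_search(const uint32_t *mem, uint32_t key) {
    for (int i = 0; i < ENTRIES; i++)   /* conceptually simultaneous */
        if (mem[i] == key)
            return i;                   /* index of the matching entry */
    return -1;                          /* no match */
}

int main(void) {
    uint32_t mem[ENTRIES] = {5, 12, 99, 7, 42, 0, 31, 8};
    printf("key 42 found at entry %d\n", cam_search(mem, 42));
    return 0;
}
```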
critical word first technique
when there is a cache miss, the missed (critical) word is fetched first and forwarded to the processor as soon as it arrives, so execution can resume while the rest of the block is still being loaded into the cache.
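A sketch of a critical-word-first block fill using wrap-around order; the block size and names are illustrative assumptions:

```c
#include <stdio.h>

#define WORDS_PER_BLOCK 8  /* assumed block size in words */

int main(void) {
    int requested = 5;     /* word within the block that caused the miss */

    for (int i = 0; i < WORDS_PER_BLOCK; i++) {
        int word = (requested + i) % WORDS_PER_BLOCK;  /* wrap-around order */
        if (i == 0)
            printf("word %d -> forwarded to processor first\n", word);
        else
            printf("word %d -> fills the cache line\n", word);
    }
    return 0;
}
```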
Methods of maintaining coherence between multiple processors' caches
-Bus watching with write through: all caches watch the bus and update or invalidate their copies when one cache writes
-Hardware transparency: additional hardware keeps all caches consistent
-Noncacheable memory: a portion of main memory shared by the processors cannot be cached
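A toy model of the first method, bus watching with write through, where each cache snoops writes on the shared bus and invalidates its stale copy; the structure and names are assumptions:

```c
#include <stdio.h>

#define CACHES 2

struct snooper { int block; int valid; };

/* both caches start with a valid copy of block 42 */
struct snooper caches[CACHES] = { {42, 1}, {42, 1} };

/* a write-through from cache `writer` appears on the bus; others snoop it */
void bus_write(int writer, int block) {
    for (int i = 0; i < CACHES; i++)
        if (i != writer && caches[i].valid && caches[i].block == block)
            caches[i].valid = 0;   /* stale copy invalidated */
}

int main(void) {
    bus_write(0, 42);              /* cache 0 writes block 42 through */
    printf("cache 1 copy valid? %d\n", caches[1].valid);  /* 0: must re-fetch */
    return 0;
}
```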
Line (cache)
A portion of cache memory capable of holding one block of main memory.
Associative mapping
Each cache line's tag field uniquely identifies a block of main memory. To determine whether a block is in the cache, the cache control logic must examine every line's tag for a match. The advantage is flexibility: any block of main memory can be placed in any line of the cache, so a replacement algorithm can choose the best victim. The disadvantage is the complex circuitry required to compare all the tags in parallel.
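A sketch of a fully associative lookup, where every valid line's tag is checked (simultaneously in hardware, sequentially in this software model); the geometry is illustrative:

```c
#include <stdio.h>
#include <stdint.h>

#define LINES 4  /* assumed cache size for illustration */

struct line { uint32_t tag; int valid; };

int lookup(const struct line *cache, uint32_t tag) {
    for (int i = 0; i < LINES; i++)   /* models parallel tag comparators */
        if (cache[i].valid && cache[i].tag == tag)
            return i;                 /* hit */
    return -1;                        /* miss: the block may go in any line */
}

int main(void) {
    struct line cache[LINES] = { {0x1A, 1}, {0x2B, 1}, {0x3C, 0}, {0x4D, 1} };
    printf("tag 0x2B -> line %d\n", lookup(cache, 0x2B));
    printf("tag 0x3C -> line %d\n", lookup(cache, 0x3C)); /* invalid line: miss */
    return 0;
}
```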
Three levels of cache
L1, L2, and L3. L1 is the smallest and fastest; L3 is the largest and slowest.
Cache replacement algorithms
Least Recently Used (LRU), First In First Out (FIFO), and Least Frequently Used (LFU)