Ch 4: Cache
true
multilevel cache complicates all of the design issues related to cache
replacement algorithms
must be implemented in hardware to achieve high speed
magnetic surface memories
nonvolatile
cache
not usually visible to the programmer or processor
associative mapping
number of lines in cache is not determined by the address format
capacity
obvious characteristic of memory
secondary memory
optical and magneto-optical
victim cache
originally proposed as an approach to reduce the conflict misses of direct-mapped caches without affecting their fast access time
true
originally, the typical system had a single cache
associative mapping
permits each main memory block to be loaded into any line of the cache; disadvantage: complex circuitry is required to examine the tags of all cache lines in parallel
organization
physical arrangement of bits to form words
main memory
a portion of this can be used as a buffer to temporarily hold data that is to be read out to disk
associative access
cache memories may employ this accessing method
lines
the m blocks that make up a cache are called these
split cache
cache organization that supports pipelined execution
noncacheable memory
identified by chip-select logic or high-address bits
memory cycle time
concerned with the system bus, not the processor
external memory
consists of peripheral storage devices, such as disk and tape, that are accessible to the processor via I/O controllers
capacity and performance
from a user's point of view, the two most important characteristics of memory are
victim cache
fully associative cache
two lines per set
most common set-associative organization; significantly improves the hit ratio over direct mapping
external mass storage devices
most common: hard disk and removable media
true
most contemporary cache designs include both on-chip and external caches
LRU
most effective and most popular replacement algorithm
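A minimal C sketch of LRU tracking for a two-way set (illustrative only; it assumes the common one-USE-bit-per-set shortcut for two-way associativity, not any particular hardware):

    /* For a 2-way set, LRU needs a single bit per set: it names the
       way that was used least recently and is therefore the victim. */
    typedef struct {
        int lru_way;               /* 0 or 1: least recently used way */
    } set_state_t;

    /* Called on every hit or fill: 'way' becomes most recently used. */
    void touch(set_state_t *s, int way) {
        s->lru_way = 1 - way;      /* the other way is now the LRU one */
    }

    /* Called on a miss when both ways are valid: choose the victim. */
    int pick_victim(const set_state_t *s) {
        return s->lru_way;
    }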
unified cache
Which has a higher hit rate? Unified or split cache?
sequential access
access must be made in a specific linear sequence; memory is organized into records
performance parameters
access time (latency), memory cycle time, and transfer rate
hardware transparency
additional hardware is used to ensure that all updates to main memory via cache are reflected in all caches
true
under the noncacheable-memory approach, all accesses to shared memory are cache misses
true
all write operations are made to main memory as well as to the cache in write through operation
write through
all write operations are made to main memory as well as to the cache, ensuring main memory is always valid
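A C sketch of the write-through store path (an illustrative direct-mapped layout with hypothetical memory/cache arrays; the tag check is omitted for brevity):

    #include <stdint.h>
    #include <stdbool.h>

    #define LINES 64
    static uint32_t memory[1024];       /* hypothetical main memory, word-addressed */
    static uint32_t cache_data[LINES];  /* hypothetical cached words */
    static bool     cache_valid[LINES];

    /* Write through: every store goes to main memory; if the word is
       also cached, the cached copy is updated, so memory stays valid. */
    void store(uint32_t word_addr, uint32_t data) {
        uint32_t line = word_addr % LINES;   /* direct-mapped index */
        if (cache_valid[line])               /* tag check omitted */
            cache_data[line] = data;
        memory[word_addr % 1024] = data;     /* unconditional memory write */
    }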
cache
another form of internal memory
32 KB
beyond this size, increases in cache size bring no significant increase in performance
thrashing
blocks will be continually swapped in the cache and the hit ratio will be low
approaches to cache coherency
bus watching with write through; hardware transparency; noncacheable memory
problems with write policy
1) more than one device may have access to main memory, so if a word is altered only in the cache, the corresponding memory word becomes invalid; 2) if a word is altered in one cache, it could conceivably invalidate a word in other caches
access time
for random access memory, time it takes to perform a read or write operation
MMU
for reads and writes to main memory, this hardware unit translates each virtual address into a physical address in main memory
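A C sketch of the translation step (a hypothetical one-level page table with 4 KB pages; real MMUs add multi-level tables, permission checks, and a TLB):

    #include <stdint.h>

    #define PAGE_BITS 12                 /* illustrative 4 KB pages */
    #define NPAGES    1024               /* illustrative table size */

    static uint32_t page_table[NPAGES];  /* virtual page -> physical frame */

    /* Core of what the MMU does in hardware: split the virtual address
       into page number and offset, then swap in the physical frame. */
    uint32_t translate(uint32_t vaddr) {
        uint32_t vpage  = vaddr >> PAGE_BITS;              /* assume vpage < NPAGES */
        uint32_t offset = vaddr & ((1u << PAGE_BITS) - 1);
        return (page_table[vpage] << PAGE_BITS) | offset;
    }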
true
more than one device may have access to main memory
cache hit
data and address buffers are disabled and communication is only between processor and cache, with no system bus traffic
bus organization in which more than one device has a cache and main memory is shared
if data in one cache is altered, this invalidates not only the corresponding word in main memory but also the same word in other caches
blocks
data are often transferred in units much larger than a word, called
memory hierarchy
going down the hierarchy: decreasing cost per bit; increasing capacity; increasing access time; decreasing frequency of access of the memory by the processor
bus watching with write through
depends on the use of a write-through policy by all cache controllers
cache memory
designed to combine the access time of expensive, high-speed memory with the large memory size of less expensive, lower-speed memory
cache miss
desired word is first read into the cache and then transferred from cache to processor
cache
device for staging the movement of data between main memory and processor registers to improve performance
mapping function
dictates how the cache is organized
disk
direct access
Mapping Techniques
direct, associative, and set-associative
write back
disadvantages are that portions of main memory are invalid and that it requires more complex circuitry, creating a potential bottleneck
virtual memory
disk
way
each direct mapped cache is referred to as this, consisting of v lines
control bits
each line has these, used to indicate whether the line has been modified since being loaded into the cache
set-associative mapping
each word maps into all the cache lines in a specific set, so that main memory block B0 maps to set 0 and so on; used for small degrees of associativity
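A C sketch of how the set number is derived from an address (illustrative parameters: 16-byte blocks, 512 sets of two lines each):

    #include <stdio.h>
    #include <stdint.h>

    #define BLOCK_BITS 4   /* log2 of the 16-byte block size */
    #define SET_BITS   9   /* log2 of the 512 sets           */

    int main(void) {
        uint32_t addr  = 0x0001A2C8;                      /* example address */
        uint32_t block = addr >> BLOCK_BITS;              /* main memory block number */
        uint32_t set   = block & ((1u << SET_BITS) - 1);  /* block B maps to set B mod 512 */
        uint32_t tag   = block >> SET_BITS;               /* compared against both lines in the set */
        printf("block=%u set=%u tag=0x%x\n", block, set, tag);
        return 0;
    }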
split cache design
eliminates contention for the cache between the instruction fetch/decode unit and the execution unit
size of word
equal to the number of bits used to represent an integer and to the instruction length
unit of transfer
equal to the number of electrical lines into and out of the memory module
L2 cache
external cache
secondary storage or auxiliary memory
external, nonvolatile memory
registers internal to the processor
fastest, smallest, and most expensive type of memory
unit of transfer
for external memory, data are often transferred in blocks
unit of transfer
for main memory, the number of bits read out of or written into memory at a time
access time
for non random access memory, time it takes to position the read-write mechanism at the desired location
transfer rate
for random access memory, equal to 1/(cycle time)
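A worked example with illustrative numbers: if the memory cycle time is 100 ns, the transfer rate is 1/(100 ns) = 10 million words per second.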
organization
for random access memory, this is a key design issue
write through
generates substantial memory traffic and may create a bottleneck
faster access time
greater cost per bit
unified cache
has a higher hit rate than split caches because it balances the load between instruction and data fetches automatically
noncacheable memory
identified using chip-select logic or high-address bits; only a portion of main memory is shared by more than one processor
tag
identifies which particular block is currently being stored
write policy
if the oldest block in the cache has not been altered, then it may be overwritten with a new block without first writing out the old block
addressable units
in some systems, this is the word; the number of these, N, is related to the address length in bits, A, by 2^A = N
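A worked example with illustrative numbers: a machine with A = 24 address bits has N = 2^24 = 16,777,216 (16M) addressable units.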
volatile memory
information is lost when power is turned off
nonvolatile memory
recorded information remains until deliberately changed; examples: ROM, some semiconductor memory
L1 cache
internal cache
direct access
involves a shared read-write mechanism; however, individual blocks/records have a unique address based on physical location
true
it is impossible to arrive at an optimum cache size
the larger the cache
the larger the number of gates involved in addressing the cache
cache size
we would like the cache to be small enough that the overall average cost per bit is close to that of main memory alone, and large enough that the overall average access time is close to that of the cache alone
characteristics of memory
location, capacity, unit of transfer, method of accessing, performance, physical type, physical characteristics, and organization
creates memory traffic and may cause a bottleneck
main disadvantage of write through
random access
main memory and cache sometimes use this accessing method; each addressable location in memory has a unique, physically wired-in addressing mechanism
volatile and employ semiconductor technology
main memory and processor registers are usually of these forms
random access
main memory and some cache systems are this accessing method
records
memory organized into units of data called
write back
minimizes memory writes; updates are made only in the cache, and a dirty bit is used to allow write back to main memory
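A C sketch of the dirty-bit check performed on replacement in a write-back cache (a hypothetical line layout and a stubbed write-back helper):

    #include <stdint.h>
    #include <stdbool.h>

    typedef struct {
        bool     valid;
        bool     dirty;     /* set when the cached copy is modified */
        uint32_t tag;
        uint32_t data[4];   /* one 16-byte block as four words */
    } line_t;

    /* Hypothetical stub: copy a block back to main memory. */
    static void write_block_to_memory(uint32_t tag, const uint32_t *data) {
        (void)tag; (void)data;
    }

    /* Write back: on replacement, only a dirty line is written to
       main memory; a clean line can simply be overwritten. */
    void evict(line_t *line) {
        if (line->valid && line->dirty)
            write_block_to_memory(line->tag, line->data);
        line->valid = false;
        line->dirty = false;
    }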
memory cycle time
primarily applied to random access memory; consists of the access time plus any additional time required before a second access can commence
main memory
principal internal memory system of the computer
locality of reference
during program execution, memory references by the processor tend to cluster
associative access
a random-access type of memory that enables one to compare desired bit locations within a word for a specified match
transfer rate
rate at which data can be transferred into or out of the memory unit
on-chip cache
reduces the processor's external bus activity and therefore speeds up execution times and increases overall system performance
approach to lower the miss penalty
remember what was discarded in case it is needed again
random replacement
replacement algorithm that provides only slightly inferior performance to an algorithm based on usage
true
same virtual address in two different applications refers to two different physical addresses
either volatile or nonvolatile
semiconductor memory
expanded storage
semiconductor memory that is slower and less expensive than main memory
physical types
semiconductor memory, magnetic surface memory, optical, and magneto-optical
tape
sequential access
direct mapping
simple and inexpensive to implement; disadvantage: there is a fixed cache location for any given block, which may lead to thrashing
direct mapping
simplest mapping technique
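A C sketch of the direct-mapped address breakdown into tag | line | offset (illustrative parameters: 32-bit byte addresses, 16-byte blocks, 1024 lines):

    #include <stdio.h>
    #include <stdint.h>

    #define BLOCK_BITS 4    /* log2(16)  : byte offset within a block */
    #define LINE_BITS  10   /* log2(1024): which cache line           */

    int main(void) {
        uint32_t addr   = 0x12345678;
        uint32_t offset = addr & ((1u << BLOCK_BITS) - 1);
        uint32_t line   = (addr >> BLOCK_BITS) & ((1u << LINE_BITS) - 1);
        uint32_t tag    = addr >> (BLOCK_BITS + LINE_BITS);
        printf("tag=0x%x line=%u offset=%u\n", tag, line, offset);
        return 0;
    }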
large cache
slightly slower than small cache
greater capacity
slower access time
greater capacity
smaller cost per bit
logical cache aka virtual cache
stores data using virtual addresses; the processor accesses the cache directly, without going through the MMU
word
the "natural" unit of organization of memory
true
the address field of machine instructions contains virtual addresses
zero-wait state transaction
the fastest type of bus transfer
destroying the storage unit
the only way to alter nonerasable memory
associative mapping
to determine whether a block is in the cache, the cache control logic must simultaneously examine every line's tag for a match; used for higher degrees of associativity
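A C sketch of the fully associative lookup; hardware compares all tags in parallel, which this software model approximates with a loop (array size is illustrative):

    #include <stdint.h>
    #include <stdbool.h>

    #define LINES 128
    static uint32_t tags[LINES];
    static bool     valid[LINES];

    /* Returns the line holding the block with this tag, or -1 on a miss.
       Real hardware performs all LINES comparisons simultaneously. */
    int lookup(uint32_t tag) {
        for (int i = 0; i < LINES; i++)
            if (valid[i] && tags[i] == tag)
                return i;
        return -1;
    }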
capacity, access time, and cost
tradeoffs among three characteristics of memory
true
unit of transfer need not be equal to a word or an addressable unit
disk
used to provide an extension to main memory (virtual memory)
main memory
usually extended with higher speed, smaller cache
true
when a processor attempts to read a word of memory, a check is made to determine if the word is in the cache
location
whether memory is internal or external to computer
associative access
a word is retrieved based on a portion of its contents rather than its address; cache memories may employ this access method