Cache Coherence
What types of caches are there?
Registers: a cache for the first-level cache. First-level cache: a cache for the second-level cache. Second-level cache: a cache for main memory. Main memory: a cache for the hard disk (virtual memory).
What is a cache?
A small, fast memory used to improve memory system performance.
Describe Exclusive caches
The blocks in different cache levels are mutually exclusive: a block is held in at most one level at a time. This gives more efficient utilization of cache space.
What does a cache exploit?
The spatial and temporal locality of typical programs.
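A rough illustration of why locality matters, assuming a hypothetical direct-mapped cache with 4 lines of 4 words each (the sizes and the function name are made up for this sketch). Sequential accesses benefit from spatial locality, repeated accesses from temporal locality, and a large stride defeats both.

```python
def hit_rate(addresses, num_lines=4, words_per_line=4):
    """Hit rate of a tiny direct-mapped cache (illustrative sizes only)."""
    tags = [None] * num_lines          # which block each line currently holds
    hits = 0
    for addr in addresses:
        block = addr // words_per_line  # neighbouring words share a block
        index = block % num_lines       # direct-mapped placement
        if tags[index] == block:
            hits += 1
        else:
            tags[index] = block         # miss: fetch the whole block
    return hits / len(addresses)

sequential = hit_rate(range(64))             # spatial locality
repeated = hit_rate([0, 1, 2, 3] * 16)       # temporal locality
strided = hit_rate(range(0, 64 * 16, 16))    # every access maps to line 0
```

With these parameters the sequential scan misses only on the first word of each block, while the strided scan never hits.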
What does the MESI Protocol stand for?
The state of every cache line is marked as Modified, Exclusive, Shared, or Invalid.
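The four states can be sketched as a small transition function for one cache line. This is a simplified illustration, not a full protocol: the event names (`local_read`, `local_write`, `bus_read`, `bus_write`) and the `others_have_copy` flag are assumptions made for this example, and write-backs and bus signalling are left out.

```python
def next_state(state, event, others_have_copy=False):
    """Next MESI state of a single cache line (simplified sketch)."""
    if event == "local_read":
        if state == "I":
            # Read miss: Shared if another cache holds the line, else Exclusive.
            return "S" if others_have_copy else "E"
        return state                  # read hit in M, E or S: no change
    if event == "local_write":
        return "M"                    # other copies get invalidated as a side effect
    if event == "bus_read":
        # Another cache reads the line: a valid copy drops to Shared
        # (a Modified copy would be written back first).
        return "I" if state == "I" else "S"
    if event == "bus_write":
        return "I"                    # another cache writes: our copy is stale
    raise ValueError(f"unknown event: {event}")
```

For example, `next_state("E", "local_write")` gives `"M"`, and a `bus_write` observed in any state gives `"I"`.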
Which protocol may generate many unnecessary cache updates: invalidate or update?
The update protocol.
How does the Write-Through Policy work for multiprocessors?
There is additional inconsistency due to other cache copies of the same memory location. The processors must therefore monitor main memory traffic to keep their local caches up to date.
What is a directory used for?
To collect and maintain information about copies of data in caches.
How does Write Update SP work?
Updated word is distributed to all other processors.
What's the Write-Back Policy?
Updates are initially made in the cache only, which leaves cache and main memory inconsistent. The update bit for the cache slot is set when an update occurs. If a block is to be replaced, it is written back to main memory if its update bit is set.
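A minimal sketch of the update (dirty) bit, assuming a dictionary stands in for main memory and ignoring capacity and replacement policy. The class and method names are hypothetical.

```python
class WriteBackCache:
    def __init__(self, memory):
        self.memory = memory   # dict: address -> value (stand-in for main memory)
        self.lines = {}        # address -> [value, update_bit]

    def read(self, addr):
        if addr not in self.lines:
            self.lines[addr] = [self.memory[addr], False]
        return self.lines[addr][0]

    def write(self, addr, value):
        # Update the cache only and set the update bit.
        self.lines[addr] = [value, True]

    def evict(self, addr):
        value, update_bit = self.lines.pop(addr)
        if update_bit:
            self.memory[addr] = value   # write back only modified blocks

memory = {0x10: 1}
cache = WriteBackCache(memory)
cache.write(0x10, 99)
stale = memory[0x10]    # main memory still holds the old value
cache.evict(0x10)
fresh = memory[0x10]    # written back on replacement
```

Between the write and the eviction, main memory and the cache disagree, which is the inconsistency the card describes.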
What are the two approaches of Snoopy Protocols?
Write invalidate and write update.
Are snoopy caches easy to implement, and how do you do that?
Yes, by implementing them on a bus-based system.
Snoopy caches: Each coherence operation is sent to ... which ... and that is an ...
all other processors; generates large traffic; inherent limitation.
Write Invalidate SP: When one of the caches wants to write to the line, it first ...
issues a notice to invalidate the line in all other caches. (The writing processor then has exclusive access until the line is required by another processor.)
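A toy model of write invalidate on a shared bus. The `Bus` and `SnoopyCache` classes are hypothetical, and two simplifications are assumed to keep it short: the writer also writes through to memory, and it broadcasts on every write rather than tracking exclusive access.

```python
class Bus:
    def __init__(self):
        self.caches = []
        self.messages = 0

    def broadcast_invalidate(self, sender, addr):
        self.messages += 1
        for cache in self.caches:
            if cache is not sender:
                cache.lines.pop(addr, None)   # snooping caches drop their copy

class SnoopyCache:
    def __init__(self, bus, memory):
        self.bus, self.memory, self.lines = bus, memory, {}
        bus.caches.append(self)

    def read(self, addr):
        if addr not in self.lines:            # miss: fetch from memory
            self.lines[addr] = self.memory[addr]
        return self.lines[addr]

    def write(self, addr, value):
        self.bus.broadcast_invalidate(self, addr)   # invalidate first
        self.lines[addr] = value
        self.memory[addr] = value             # write-through (assumption)

memory = {0: 7}
bus = Bus()
a, b = SnoopyCache(bus, memory), SnoopyCache(bus, memory)
a.read(0)
b.read(0)
a.write(0, 8)
b_was_invalidated = 0 not in b.lines
b_rereads = b.read(0)    # miss, then fetches the new value
```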
Snoopy caches are not feasible for machines with ...
memory distributed across a large number of sub-systems.
Additional hardware is required to coordinate access to data that might have ...
multiple cache copies.
Write Update SP works well with ...
multiple readers and writers.
Write Invalidate SP is suitable for ...
multiple readers, but one writer at a time.
Memory references are checked against ...
the directory.
Directory caches: What replaces the need for a broadcast medium?
A directory.
Directory caches: What must the underlying network also be able to carry?
All the coherence requests.
What's the Write-Through Policy?
All writes go to main memory as well as to the cache, which keeps them consistent. The extra memory traffic slows down main memory access.
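A minimal sketch of write-through, assuming a dictionary stands in for main memory. The class name and the `memory_writes` counter are hypothetical and only there to make the cost visible.

```python
class WriteThroughCache:
    def __init__(self, memory):
        self.memory = memory       # dict: address -> value
        self.lines = {}            # address -> cached value
        self.memory_writes = 0     # counts traffic to main memory

    def write(self, addr, value):
        self.lines[addr] = value
        self.memory[addr] = value  # every write also goes to main memory
        self.memory_writes += 1

memory = {0x20: 0}
cache = WriteThroughCache(memory)
for value in range(1, 6):
    cache.write(0x20, value)
```

After the loop the cache and memory agree, but five writes to one location cost five main memory accesses.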
How can data integrity be ensured in a multiprocessor?
An update by a processor at time t should be available for other processors at time t+1.
If two processors make interleaved reads and updates to a variable, which protocol is better, and why not the other?
An update protocol is better, since an invalidate protocol forces a miss after every write and may lead to many memory accesses.
Where do snoopy cache coherence techniques apply?
They apply only to caches connected to a bus or other interconnection mechanism, typically L2 caches. However, a processor often has an L1 cache that is not connected to a bus, so no snoopy protocol can be used for it.
Which protocols suffer from false sharing overheads?
Both Invalidate and Update Protocols.
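False sharing arises because coherence works on whole cache lines, so two unrelated variables in one line interfere with each other. A small sketch, assuming a 64-byte line (an assumed size) and made-up addresses:

```python
LINE_SIZE = 64  # bytes; an assumed line size for illustration

def same_line(addr_a, addr_b, line_size=LINE_SIZE):
    """True if both byte addresses fall into the same cache line."""
    return addr_a // line_size == addr_b // line_size

counter_a = 0x1000          # written by processor A (hypothetical address)
counter_b = 0x1008          # written by processor B, 8 bytes further on
padded_b = 0x1040           # same variable moved to the next line

falsely_shared = same_line(counter_a, counter_b)
after_padding = same_line(counter_a, padded_b)
```

In the first layout every write by one processor triggers coherence traffic for the other, although the two never touch the same variable.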
Different processors may access values at the same memory location. How do caches handle that?
By having multiple copies of the same data in different caches.
How does the Write-Back Policy work for multiprocessors?
Data in the other caches will also be inconsistent (a cache coherence mechanism is required). I/O must also access main memory through the cache.
Describe the Snoopy Protocols.
Distribute the cache coherence task among all cache controllers. If the cache controller making an update recognizes that the line is shared, the update is announced to all other caches. Each cache controller "snoops" on the network to observe these broadcast notifications and reacts accordingly. Ideally suited to a bus-based multiprocessor system.
Directory caches: What is often used when the directory becomes a point of contention?
Distributed directory schemes.
Describe Inclusive caches
Each block existing in the first level also exists in the next level. When fetching a memory block, place it in all cache levels.
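The inclusive fill rule can be contrasted with an exclusive one in a few lines. This sketch models each level as a set of block numbers and ignores capacity and eviction; the function names are hypothetical.

```python
def fetch_inclusive(block, l1, l2):
    """Place the fetched block in every level."""
    l2.add(block)
    l1.add(block)

def fetch_exclusive(block, l1, l2):
    """Place the block in L1 only; it leaves L2 if it was there."""
    l2.discard(block)
    l1.add(block)

inc_l1, inc_l2 = set(), {5}
for block in (1, 2, 5):
    fetch_inclusive(block, inc_l1, inc_l2)

exc_l1, exc_l2 = set(), {5}
for block in (1, 2, 5):
    fetch_exclusive(block, exc_l1, exc_l2)
```

Afterwards `inc_l1` is a subset of `inc_l2`, while `exc_l1` and `exc_l2` have no block in common.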
How can Write Update SP generate many unnecessary updates?
If a processor reads a value just once and does not need it again, or if a processor updates a value many times before it is read by the other processors (bad programming).
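A rough message count for one shared variable makes the trade-off concrete. The cost model is an assumption for illustration: one bus message per miss and one per broadcast (update or invalidate), with all listed processors starting with a valid copy.

```python
def count_bus_messages(ops, protocol, sharers):
    """ops: list of ("r" | "w", cpu) accesses to ONE shared variable."""
    holders = set(sharers)        # CPUs that currently hold a valid copy
    messages = 0
    for kind, cpu in ops:
        if cpu not in holders:    # miss: fetch the line over the bus
            messages += 1
            holders.add(cpu)
        if kind == "w" and len(holders) > 1:
            messages += 1         # one broadcast to the other holders
            if protocol == "invalidate":
                holders = {cpu}   # the other copies are dropped
    return messages

# CPU 0 writes ten times before CPU 1 reads once.
burst = [("w", 0)] * 10 + [("r", 1)]
update_burst = count_bus_messages(burst, "update", {0, 1})
invalidate_burst = count_bus_messages(burst, "invalidate", {0, 1})

# CPU 0 writes and CPU 1 reads, strictly interleaved.
interleaved = [("w", 0), ("r", 1)] * 5
update_interleaved = count_bus_messages(interleaved, "update", {0, 1})
invalidate_interleaved = count_bus_messages(interleaved, "invalidate", {0, 1})
```

Under this model the write burst costs 10 messages with update and 2 with invalidate, while the interleaved pattern costs 5 with update and 10 with invalidate.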
Explain some cache coherence operations for the case where the node 1 directory notes that node 2 has a copy of the data.
If the data is modified in the cache, the change is broadcast to other nodes. Local directories monitor and purge (invalidate) local caches if needed. The local directory that owns the address monitors changes in remote caches, marks the memory location as invalid until it is written back, and forces a write-back if the memory location is requested by another processor.
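The bookkeeping can be sketched as a directory that records, per address, which nodes hold a copy and which node (if any) holds a modified one. The class, method and message names are hypothetical, and each method returns the point-to-point messages the directory would send instead of performing them.

```python
class Directory:
    def __init__(self):
        self.sharers = {}       # address -> set of node ids holding a copy
        self.dirty_owner = {}   # address -> node id holding a modified copy

    def read(self, node, addr):
        msgs = []
        owner = self.dirty_owner.get(addr)
        if owner is not None and owner != node:
            # Memory is stale: force the owner to write the line back.
            msgs.append(("write_back", owner))
            del self.dirty_owner[addr]
        self.sharers.setdefault(addr, set()).add(node)
        return msgs

    def write(self, node, addr):
        others = self.sharers.get(addr, set()) - {node}
        msgs = [("invalidate", n) for n in sorted(others)]
        self.sharers[addr] = {node}
        self.dirty_owner[addr] = node   # memory copy invalid until written back
        return msgs

directory = Directory()
directory.read(1, 0x40)
directory.read(2, 0x40)
on_write = directory.write(2, 0x40)   # node 1's copy must be purged
on_read = directory.read(1, 0x40)     # node 2 must write back first
```

Only the nodes listed in the directory receive messages, which is what removes the need for a broadcast.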
When are directory protocols effective?
In large scale systems with complex interconnection schemes, such as NUMA.
Where is the directory stored?
In the memory system.
What protocol do most modern computers use, and why?
Invalidate protocols, since the usual situation is one writer with many readers.
What's the solution for L1-L2 Cache Consistency?
Each L1 line should keep track of the state of the corresponding L2 line, and L1 should write through to L2.
What does the L1-L2 Cache Consistency Solution require?
L1 must be a subset of L2. The associativity of the L2 cache should be equal to or greater than that of the L1 cache. For example, if L2 is 2-way set-associative while L1 is 4-way set-associative, it doesn't work.
Directory caches: What's the drawback of the additional information stored in the directory?
May add significant overhead.
L1-L2 Cache Consistency Solution: How is the interaction between L1 and L2 affected if L1 has a write-back policy?
It becomes more complex.
Describe Non-inclusive caches
No guarantee of inclusion or exclusion. Simpler design, e.g. most Intel processors.
Why is cache coherence in multiprocessor systems an important issue to consider?
Otherwise, performance will suffer.
How can the central bottleneck in directory protocols be solved?
Partially by having multiple directories.
