Operating Systems: Memory Management
First in First Out (FIFO)
Evict the page that was brought into memory the longest time ago; pages are replaced in the order they arrived (first-come, first-served). Simple to implement, but ignores how recently or how often pages are used.
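A minimal sketch of FIFO replacement in Python (the reference trace and frame counts below are made up for illustration; this trace also happens to exhibit Belady's anomaly, where 4 frames cause more faults than 3):

```python
from collections import deque

def fifo_faults(refs, num_frames):
    """Count page faults under FIFO replacement."""
    frames = deque()          # oldest page at the left
    resident = set()
    faults = 0
    for page in refs:
        if page in resident:
            continue          # hit
        faults += 1
        if len(frames) == num_frames:
            evicted = frames.popleft()   # evict the page brought in longest ago
            resident.remove(evicted)
        frames.append(page)
        resident.add(page)
    return faults

trace = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(fifo_faults(trace, 3))   # 9 faults
print(fifo_faults(trace, 4))   # 10 faults: more memory, more faults
```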
What is a Conflict Miss?
Caches limit placement of blocks from level k + 1 to a small subset of the positions at level k. Conflict misses occur when the level k cache is large enough to hold the data, but multiple data objects all map to the same level k block.
Physical Memory (DRAM Cache)
Contents of the virtual memory array on disk are cached here; these cache blocks are called pages (size is P = 2^p bytes).
T or F: Segmentation can lead to internal fragmentation.
False.
Capacity Miss
Occurs when all the lines of cache are filled. Data is larger than cache capacity and the cache needs to evict some blocks. Set of active cache blocks (working set) is larger than the cache.
Conflict Miss
Occurs when an associativity set is filled. In a direct-mapped cache, once a set is filled, a conflict miss occurs if you try accessing that same set with a different tag.
Cache Hit
Occurs when the data we want is already inside the cache.
What happens if the size of the TLB is smaller than the size of a process working set?
We keep getting TLB misses: no matter which PTEs we keep in the TLB, we will keep accessing others that are not there, since our working set is larger than the TLB.
Belady's Algorithm
We will replace the page that will not be used for the longest time in the future. It is optimal, so it serves as a baseline: comparing a page replacement implementation against it gauges the room for improvement.
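A sketch of Belady's algorithm in Python; it needs the entire future reference trace, which is exactly why it is only a baseline and not implementable in a real OS (trace and frame count are made up):

```python
def opt_faults(refs, num_frames):
    """Belady's optimal: evict the resident page whose next use is farthest away."""
    resident = set()
    faults = 0
    for i, page in enumerate(refs):
        if page in resident:
            continue  # hit
        faults += 1
        if len(resident) == num_frames:
            future = refs[i + 1:]
            # next-use distance; pages never referenced again evict first
            def next_use(p):
                return future.index(p) if p in future else float("inf")
            resident.remove(max(resident, key=next_use))
        resident.add(page)
    return faults

# The same trace costs FIFO 9 faults with 3 frames; the optimum is 7.
print(opt_faults([1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5], 3))   # 7
```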
Memory
Larger, slower, cheaper memory viewed as partitioned into "blocks".
Fast Translation
Lookups need to be fast.
Virtual Address
Makes it easier to manage the memory of processes running in the system. Independent of physical location of the data references. OS determines location of data in physical memory. Instructions issue virtual addresses. Translated by hardware into physical addresses.
Variable Partitions Advantages
No internal fragmentation. - Allocate just enough for the process.
Working Set Size
Number of pages in a working set, measured over the interval (t - w, t). Poor locality means you reference more pages, so the WSS is larger.
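A toy sketch of measuring a working set over a window w of a reference trace (the trace values below are made up):

```python
def working_set(refs, t, w):
    """Pages referenced in the window (t - w, t] of the reference trace."""
    return set(refs[max(0, t - w):t])

trace = [1, 2, 1, 3, 4, 4, 4, 4]
print(working_set(trace, 4, 4))   # {1, 2, 3}: three distinct pages
print(working_set(trace, 8, 4))   # {4}: better locality, smaller WSS
```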
Paging Disadvantages
You can still have internal fragmentation. - Process may not use memory in multiples of a page. Memory reference overhead. - 2 references per address lookup (page table, then memory). Memory required to hold page table can be significant. - Need one PTE per page. - 32-bit address space w/ 4KB pages = 2^20 PTEs. - 4 bytes per PTE = 4 MB per page table. - 25 processes = 100 MB just for page tables!
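The page-table size arithmetic above can be checked directly with the numbers from the card:

```python
address_bits = 32          # 32-bit virtual address space
page_bits = 12             # 4 KB pages = 2^12 bytes
pte_bytes = 4              # 4 bytes per PTE

num_ptes = 2 ** (address_bits - page_bits)   # one PTE per virtual page
table_bytes = num_ptes * pte_bytes           # size of one page table

print(num_ptes)                         # 1048576 = 2^20 PTEs
print(table_bytes // 2**20)             # 4 MB per page table
print(25 * table_bytes // 2**20)        # 100 MB for 25 processes
```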
Linear Virtual Address Space
The OS translates it to a physical address space, which does not have to be linear. The VA space can be much larger than the physical address space it maps to, and different processes can use different virtual addresses that map to the same shared physical memory.
Page Fault Frequency (PFF)
Variable-space algorithm that uses a more ad-hoc approach. If the fault rate is above a high threshold, give the process more memory (this assumes more memory lowers the fault rate, which FIFO can violate due to Belady's anomaly). If the fault rate is below a low threshold, take memory away.
Paging
Divides virtual memory into fixed-size pages that can be placed in any physical page frames, so a process's memory need not be contiguous in physical memory.
Variable Partitions: Limit Register
Why do we need a limit register? For protection: if (VA > limit) then fault.
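A minimal sketch of the base-and-limit check, using the card's test `if (VA > limit) then fault` (the function name and addresses are made up):

```python
def check_access(va, base, limit):
    """Base-and-limit relocation with a protection check (variable partitions)."""
    if va > limit:
        raise MemoryError("protection fault: address outside partition")
    return base + va   # relocate the virtual address into physical memory

print(hex(check_access(0x10, 0x8000, 0x100)))   # 0x8010
```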
Spatial Locality
Accessed items tend to be close in space: items with nearby addresses tend to be referenced close together in time. (iterating through arrays)
Physical Address
An address of a location in physical memory (DRAM). Programs issue virtual addresses; hardware translates them into physical addresses, which the memory system uses directly.
Virtual Memory for Caching
An array of N contiguous bytes stored on disk.
Page Table
An array of page table entries (PTE) that maps virtual pages to physical pages.
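A sketch of translation through a flat, single-level page table; the VPN-to-PPN mapping below is made up for illustration:

```python
PAGE_SIZE = 4096   # 4 KB pages

def translate(page_table, va):
    """Translate a virtual address via a flat page table; None means page fault."""
    vpn, offset = divmod(va, PAGE_SIZE)
    ppn = page_table.get(vpn)
    if ppn is None:
        return None                    # page fault: OS must bring the page in
    return ppn * PAGE_SIZE + offset    # same offset within the physical frame

pt = {0xff0b: 0x12}                    # hypothetical mapping: VPN 0xff0b -> PPN 0x12
print(hex(translate(pt, 0xff0beef)))  # 0x12eef
print(translate(pt, 0x0000123))       # None: page fault
```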
Loading
execve() allocates virtual pages for .text and .data sections and creates PTEs marked as invalid. The .text and .data sections are copied, page by page, on demand by the virtual memory system.
Sharing
Can map shared memory at the same or different virtual addresses in each process's address space. - Different: the 10th virtual page in P1 and the 7th virtual page in P2 correspond to the 2nd physical page. Flexible (no address space conflicts), but pointers inside the shared memory segment are invalid. - Same: the 2nd physical page corresponds to the 10th virtual page in both P1 and P2. Less flexible, but shared pointers are valid.
Variable Partitions
Checking to see if the VA is trying to access data within bounds of its allocated memory. Done by comparing address against a limit register. Prevents internal fragmentation, since we allocate just enough for our process. Causes external fragmentation, which refers to unallocated holes in main memory that result from process loading and unloading.
Consider a memory system with a virtual address space of 32 bits, a physical address space of size 30 bits, and a page size of 4Kbytes. Given two pointers (i.e., virtual addresses) VA and VB with values 0xff00423 and 0xff01321 respectively, which of the following is true about their physical addresses? Assume that both pages are in memory. A. VA's physical address is > than VB's physical address. B. VB's physical address is > than VA's physical address. C. It depends on whether we have a TLB hit or miss. D. It's impossible to tell.
D. It's impossible to tell.
DRAM Cache Organization
DRAM is about 10x slower than SRAM; disk is about 10,000x slower than DRAM. - Large page (block) sizes: typically 4-8 KB. - Fully associative: any virtual page can be placed in any frame. - Write-back rather than write-through.
Page Replacement Policy
Determine which page to remove when we need a victim. Policies differ from mechanisms.
Address Spaces
Distinction between data (bytes) and their addresses. Each object can have multiple addresses: every byte in main memory has one physical address and possibly one or more virtual addresses.
Fixed Space Algorithm
Each process is given a limit of pages it can use. When it reaches this limit, it replaces from its own pages.
Linking
Each program has similar virtual address space. Code, stack and shared libraries always start at the same address.
Paging Advantages
Easy to allocate memory. - Memory comes from a free list of fixed size chunks. - Allocating a page is just removing it from the list. - External fragmentation is no longer a problem, as it was with Variable Partitions. Simplifies Protection. - All chunks are the same size. - Like fixed partitions, don't need a limit register. Simplifies Virtual Memory.
TLB Hit
Eliminates a memory access.
Memory Management
Every process has its own virtual address space. - Can be viewed as a simple linear array. - Mapping function scatters addresses through physical memory. -> Well chosen mappings simplify memory allocation and management.
Least Recently Used (LRU)
Evict the page that has not been used for the longest time in the past. It uses reference information to make a replacement decision. Costly because maintaining exact recency order (e.g., a stack) on every reference is expensive.
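A sketch of exact LRU using Python's OrderedDict to keep pages in recency order; the per-reference bookkeeping in the hit path is exactly the cost the card mentions (trace and frame count are made up):

```python
from collections import OrderedDict

def lru_faults(refs, num_frames):
    """Count page faults under exact LRU replacement."""
    frames = OrderedDict()   # least recently used page at the front
    faults = 0
    for page in refs:
        if page in frames:
            frames.move_to_end(page)        # record the reference (the costly part)
        else:
            faults += 1
            if len(frames) == num_frames:
                frames.popitem(last=False)  # evict the least recently used page
            frames[page] = True
    return faults

print(lru_faults([1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5], 3))   # 10 faults
```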
Variable Partitions Disadvantages
External fragmentation. - Job loading and unloading produces empty holes scattered throughout memory.
T or F: Journaling FS improve the performance of FFS
False.
T or F: Multi-level page tables improve the performance of address translation.
False.
T or F: Page faults are handled by hardware.
False.
T or F: Swap can be smaller than physical memory.
False.
T or F: The working set of a process includes all its data pages.
False.
Page Replacement
The OS brings a faulted page from disk into a page frame of memory. When all frames are in use, the OS must replace (evict) a page for each page faulted in.
What is making the common case fast?
Focus your optimizations and investments on the part of the system that is needed often and/or takes a lot of time. Example: Memory accesses occur very frequently, so translation has to be very efficient.
Page Table Entry (PTE)
Holds the mapping between a virtual address of a page and the address of a physical frame.
Reloading the TLB
If TLB does not have mapping: - 1.) MMU loads PTE from page table. -> HW-managed TLB, OS not involved. -> OS already set up page tables so HW can access them directly. - 2.) Trap to the OS. -> Software-managed TLB, OS intervenes. -> OS does lookup in page table, loads PTE into TLB. -> OS returns from the exception and the access retries. A machine supports only one method or the other. At this point, there is a PTE for the address in the TLB.
Demand Paging (OS)
If memory is full, evict pages. Pages are loaded from disk when referenced, and page faults trigger paging operations.
Cache Miss
Occurs when the data we want is not inside the cache.
Virtual memory (VM)
Improves memory management efficiency by giving each process a linear virtual address space that need not be stored contiguously in physical memory. Storing processes linearly in physical memory causes efficiency and performance problems because space is limited.
TLB Miss
Incurs an additional memory access. Rare.
Disadvantages of Fixed Partitions
Internal Fragmentation - Memory in a partition not used by a process is not available to other processes. Partition Size - One size does not fit all (what if you had a very large process?)
Requirements of Virtual Addressing
Virtual addressing needs protection, fast translation, and fast change (of translations on context switch).
Benefits of Cache Hits
It saves time and increases performance.
T or F: Swap resides on disk.
True.
What are lazy versus aggressive policies?
A lazy policy delays work until it is absolutely necessary. For aggressive policies, work is done as soon as possible. Lazy policies could save the system some work if the delayed work is never needed or if it can combine with other work that arrives later. Example: Write back (lazy) versus write through (aggressive) in caches. Pre-paging (aggressive) versus demand paging (lazy).
What causes a page fault?
A page miss causes a page fault (an exception). The page fault handler selects a victim to evict and brings in the missing page. The offending instruction is then restarted: page hit!
Cache
A small but fast memory device that stores copies of data from larger and slower memory.
Memory Allocation
A virtual page can be mapped to a physical page. A virtual page can be stored in different physical pages at different times. Sharing code and data among processes - Map virtual pages to the same physical page.
Which of the following are true about contiguous allocation for files? A. It leads to better random access performance than linked allocation. B. It minimizes external fragmentation. C. The index size increases with the size of the file.
A. It leads to better random access performance than linked allocation.
Consider a memory system with a virtual address space of 32 bits, a physical address space of size 30 bits, and a page size of 4Kbytes. Consider now the following two virtual addresses VA and VB with values 0xff00423 and 0xff00321 respectively. Which of the following is true about their physical addresses? A. VA's physical address is > VB's physical address. B. VB's physical address is > VA's physical address. C. It depends on whether we have a TLB hit or miss. D. It's impossible to tell.
A. VA's physical address is > VB's physical address.
Should something be implemented in software or hardware?
Any functionality can generally be implemented in hardware or software. However, the decision depends on whether the case is common enough and time-critical enough to warrant hardware investment. We usually invest in hardware for the most common cases. Example: translation from virtual to physical memory has to be in hardware because it occurs very often; we typically put mechanisms rather than policies in hardware. On the other hand, page replacement policies run very infrequently (only when there is a page fault) and are not time-critical, since the disk access will take millions of cycles anyway - so they are implemented in software. In the middle, things like TLB misses are in a gray area, and some systems implement them in software and others in hardware.
What hit ratio on the TLB is needed to bring the translation time to 10% of its value without a TLB, assuming TLB accesses are free?
Average translation time is h * 0 + (1 - h) * m, where m is the memory access time (a TLB miss costs one page-table access). We want the average to be 0.1m, so 0.1m = (1 - h)m and h = 0.9.
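The arithmetic can be checked with a made-up memory access time:

```python
m = 100.0                            # assumed memory access time (arbitrary units)
h = 0.9                              # hit ratio from the derivation above
avg_translation = h * 0 + (1 - h) * m   # a hit is free, a miss costs one access
print(avg_translation)               # 10.0, i.e. 10% of the no-TLB cost m
```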
Which of the following are true about mechanisms and policies? A. A good design customizes (rather than separates) the mechanisms to the policies. B. Page faults are mechanisms. C. Clock LRU is a mechanism.
B. Page faults are mechanisms.
Which of the following are true about RAID file systems? A. The hard drives used in a RAID system must be special drives. B. RAID can provide higher performance than normal drives. C. RAID can offer higher reliability than normal drives.
B. RAID can provide higher performance than normal drives.
Given a system where the page size is 4KBytes and given two variables x and y who are placed in memory at addresses 0x1230 and 0x2340 respectively, which statement(s) are true? A. The physical address of y is larger than the physical address of x. B. We cannot tell which physical address is larger until we translate the addresses. C. It depends on whether we have a TLB hit or not.
B. We cannot tell which physical address is larger until we translate the addresses.
Suppose that we know that the i-nodes are stored on a few tracks near the beginning of the disk. We know that for our file system, about half the requests go to the i-nodes. What suggestions can you make to improve the performance of the disk?
One option is to change the file system layout so that the i-nodes are collocated with the data blocks such as the solution with FFS. Another option is to use disk scheduling. Something like SCAN or C-SCAN should work, but perhaps we can be more intentional and schedule a period for i-nodes followed by a period for data. Other ideas may also work. This is a design question.
Global Replacement
One process can ruin it for the rest.
Linear Address Space
Ordered set of contiguous non-negative integer addresses: {0, 1, 2, 3, ... }
Page Faults
PTE indicates a protection fault. - R/W/E - operation not permitted. - Invalid - VP not allocated, or page not in physical memory. The fault traps to the OS (software takes over). - R/W/E - OS usually sends the fault back to the process, or might be playing games (CoW, mapped files). - Invalid -> VP not allocated in address space --> segmentation fault. -> Page not in physical memory --> OS allocates a frame, reads the page from disk, maps the PTE to the physical frame.
Thrashing
Page replacements algorithms want to avoid this. Occurs when most of the time is spent by the OS in paging data back and forth from disk. Not making progress.
Segmentation
Partitions memory into logically related units (e.g., code, data, stack). A natural extension of variable-sized partitions: a fixed number of variable-sized segments. Requires hardware support: multiple base and limit register pairs, one per segment. Avoids internal fragmentation, but can still suffer external fragmentation.
Demand Paging (Process)
When a process is created, the OS creates a new page table with all valid bits off. Instructions fault on code and data pages; faulting stops once the necessary code and data pages are in memory.
Variable Space Algorithms
Processes set of pages grows and shrinks dynamically.
Working Set
Programs tend to access a set of active virtual pages called this; programs with better temporal locality have smaller working sets. If WS size < MM size: good performance for one process after compulsory misses. If SUM(WS sizes) > MM size: thrashing, a performance meltdown where pages are swapped (copied) in and out continuously.
Temporal Locality
Recently accessed items might be accessed again soon. (while loops)
Page Hit
Refers to VM word that is in physical memory (DRAM cache hit).
Page Fault
Refers to VM word that is not in physical memory (DRAM cache miss).
CPU-Memory Gap
Refers to the inability of DRAM memory speed to keep up with CPU or microprocessor speed.
Locality
Repeated instructions to the same or nearby addresses. Programs tend to use data and instructions with addresses near or equal to those they have used recently.
Protection
Restrict which addresses jobs can use.
Physical Address Space
Set of M = 2^m physical addresses. {0, 1, 2, 3, ..., M - 1}
Virtual Address Space
Set of N = 2^n virtual addresses. {0, 1, 2, 3, ..., N - 1}
Translation Lookaside Buffer (TLB)
Small hardware cache in MMU. Maps virtual page numbers to physical page numbers. Contains complete page table entries for a small number of pages.
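A toy fully associative TLB with LRU eviction, sketched in software (the capacity and mappings below are made up; a real TLB is hardware inside the MMU):

```python
from collections import OrderedDict

class TLB:
    """Tiny fully associative TLB with LRU eviction (a sketch)."""
    def __init__(self, capacity=4):
        self.capacity = capacity
        self.entries = OrderedDict()     # VPN -> PPN, LRU entry at the front

    def lookup(self, vpn):
        if vpn in self.entries:
            self.entries.move_to_end(vpn)
            return self.entries[vpn]     # TLB hit: no page-table access needed
        return None                      # TLB miss: must walk the page table

    def insert(self, vpn, ppn):
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)   # evict the LRU translation
        self.entries[vpn] = ppn

tlb = TLB(capacity=2)
tlb.insert(1, 10)
tlb.insert(2, 20)
print(tlb.lookup(1))   # 10: hit
tlb.insert(3, 30)      # capacity exceeded: evicts VPN 2 (least recently used)
print(tlb.lookup(2))   # None: miss
```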
Local Replacement
Some processes may do well while some others might suffer.
Fixed Partitions
Splitting main memory into partitions. Easy to implement and gives us a fast context switch, but we still have internal fragmentation. Unused memory of an individual process takes up space that could have been used for another active process. Size of partition might not be intuitive, since processes vary in size and might be too big for the partition.
Consider a memory system with a virtual address space of 32 bits, a physical address space of size 30 bits, and a page size of 4Kbytes. Can the virtual address of a global array change over the lifetime of a program (without the program copying/moving it)? Can the physical address?
The VA of a data structure won't change unless you move it. The PA can change after a page is moved to swap and then back in.
Belady's Anomaly
The fault rate might increase when an algorithm is given more memory.
Consider a memory system with a virtual address space of 32 bits, a physical address space of size 30 bits, and a page size of 4Kbytes. Someone argues that the clock algorithm is supposed to be an approximation of LRU - Take a position for or against this statement, and briefly justify it.
The reference bit, set whenever a page is accessed, allows a page to survive replacement for a full sweep of the clock hand. So the page that gets replaced has not been accessed for a full sweep. This is approximately LRU in the sense that the removed page was not recently accessed, but it may not be the exact least recently used page. Also, in situations where most pages are getting referenced, clock becomes almost FIFO.
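A sketch of the clock (second-chance) algorithm with one reference bit per frame (the trace and frame count are made up):

```python
def clock_faults(refs, num_frames):
    """Clock replacement: approximate LRU with a reference bit per frame."""
    frames = [None] * num_frames   # page resident in each frame
    ref_bit = [0] * num_frames
    hand = 0
    faults = 0
    for page in refs:
        if page in frames:
            ref_bit[frames.index(page)] = 1   # mark recently used
            continue
        faults += 1
        # sweep: clear reference bits until a frame without one is found
        while ref_bit[hand]:
            ref_bit[hand] = 0
            hand = (hand + 1) % num_frames
        frames[hand] = page
        ref_bit[hand] = 1
        hand = (hand + 1) % num_frames
    return faults

print(clock_faults([1, 2, 3, 1, 4], 3))   # 4 faults
```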
Consider a memory system with a virtual address space of 32 bits, a physical address space of size 30 bits, and a page size of 4Kbytes. Given the virtual address 0xff0beef, what is the value of the VPN and the offset?
The VPN is 0xff0b and the offset (VPO) is 0xeef. The page size is 4Kbytes = 2^12 bytes, so the offset is the last 12 bits of the address.
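The split can be checked with a couple of bit operations:

```python
PAGE_SIZE = 4096                           # 4 KB pages
OFFSET_BITS = PAGE_SIZE.bit_length() - 1   # 12 offset bits

def split_va(va):
    """Split a virtual address into (VPN, offset) for 4 KB pages."""
    return va >> OFFSET_BITS, va & (PAGE_SIZE - 1)

vpn, offset = split_va(0xff0beef)
print(hex(vpn), hex(offset))   # 0xff0b 0xeef
```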
Functionality of PTEs
They are cached in L1 like any other memory word. - PTEs may be evicted by other data references. - PTE hit still requires a small L1 delay.
Advantages of Fixed Partitions
They are easy to implement. - You need a base register. - You need to verify that the offset is less than the fixed partition size. Fast context switch.
Cold Miss
This occurs when we access data that hasn't been brought into the cache yet. It's "compulsory" since you will always miss the first time you access any address; the cache starts empty.
What is the primary purpose of a TLB?
To speed up the translation by caching page table entries.
T or F: Explicit free list is better than implicit free list for heap management.
True.
T or F: Page Tables can be bigger than physical memory.
True.
T or F: Page replacement can require invalidating a TLB entry.
True.
T or F: free() does not need a system call.
True.
Fast Change
Updating memory hardware on context switch.
Why should we use Virtual Memory (VM)?
VM is taking the page concept to a whole new level. - Allow pages to be on a disk. Motivation? - Uses main memory efficiently. - Use DRAM as a cache for the parts of a virtual address space. Simplifies memory management. - Each process gets the same uniform linear address space. - With VM, this can be huge!
What is locality?
VM works because of locality. Programs tend to access a set of active virtual pages called a working set.
