7. Paging
- page table - can be very large, why?
  - e.g. 32-bit address space (2^32 bytes), page size of 4 KB (2^12 bytes)
  - the number of pages is 2^32 / 2^12 = 2^20
  - assume a page table entry (PTE) is 4 bytes
  - the size of a page table is 2^20 x 4 bytes = 4 MB per process
  - hundreds of processes -> hundreds of MB for page tables
- could we increase the page size to reduce the size of the page table?
  - 32-bit address, increase the page size from 4 KB to 16 KB
  - we will have 2^18 entries in a page table when a PTE is 4 bytes
  - what is the size of the page table? 2^18 x 4 bytes = 1 MB
  - what is the problem when we increase the page size? (internal fragmentation: more wasted space within each page)
  - what are the advantages of a large page size?
    - may be more efficient for disk access (matches the block size of the disk)
    - TLB entries capture more addresses per entry -> fewer misses (each entry covers a larger range of the address space)
- multi-level page tables
  - turn the page table into a tree structure
  - only allocate page-table space that is actually in use
  - compact, and supports sparse address spaces
  - what is inside a multi-level page table? (see slide 8)
    - chop the page table into page-sized units
    - a page directory tells where each page of the page table is
    - the directory holds a number of page directory entries (PDEs), each with a page frame number (PFN) and a valid bit
  - important formulas for the page table
    - virtual address space size = 2^n bytes
    - number of pages = (virtual address space size) / (page size)
    - size of a page table = number of PTEs x size of a PTE
  - question: the physical memory is 8 GB, the page size is 8 KB, virtual addresses are 46 bits, and a PTE is 4 bytes; how many levels of page tables would be required?
  - answer:
    - page size: 8 KB = 2^13 bytes
    - virtual address space size: 2^46 bytes
    - PTE: 4 bytes = 2^2 bytes
    - number of pages: 2^46 / 2^13 = 2^33
    - size of the flat page table: 2^33 x 2^2 bytes = 2^35 bytes
    - keep adding levels while the page table at the current level is larger than one page
    - 1st level: 2^35 bytes > 2^13 bytes -> 2^35 / 2^13 = 2^22 pages are needed to hold it
    - 2nd level: 2^22 x 2^2 bytes = 2^24 bytes > 2^13 bytes -> 2^24 / 2^13 = 2^11 pages
    - 3rd level: 2^11 x 2^2 bytes = 2^13 bytes = exactly one page -> done, 3 levels
- demand paging
  - reduces the loading of unnecessary pages
  - pages are loaded from disk to RAM only when needed
  - how does demand paging work?
    - uses the present bit in the process page table
    - the present bit indicates whether the page is in RAM or not
    - on access to a non-present page, the OS loads the page into RAM and sets the present bit to 1
    - evicts another page from RAM if no frames are free
  - which pages need to be loaded from disk?
    - a page fault is slow if we always load data from disk to RAM at the time of use
    - pre-paging -> predict which pages will be used and swap them into RAM ahead of time
    - needs lots of locality -> otherwise the process runs at disk speed
    - if most accesses are to data already in DRAM -> good
- page fault
  - steps of handling a page fault (see slide 18)
  - effective access time (EAT)
    - example: L1 cache: 2 cycles, L2 cache: 10 cycles, main memory: 150 cycles, disk: 30,000,000 cycles on a 3.0 GHz processor; 98% of accesses are handled by the L1 cache, 1% by the L2 cache, 0.99% by DRAM, and 0.01% cause a page fault
    - what is the average access latency?
      - 0.98 x 2 + 0.01 x 10 + 0.0099 x 150 + 0.0001 x 30,000,000 = 1.96 + 0.1 + 1.485 + 3,000 = about 3,000 cycles per access
    - need LOW page fault rates to sustain performance
- page selection policy
  - when do we need to load a page?
  - prefetch pages in advance of access
    - hard to predict accurately
    - mispredictions can cause useful pages to be replaced
- Belady's anomaly
  - more frames in main memory can lead to more page faults
  - e.g. FIFO replacement with reference string A B C D A B E A B C D E
    - 9 faults with 3 frames
    - 10 faults with 4 frames
  - adding more memory might not reduce page faults under some replacement algorithms
- thrashing
  - if all working sets do not fit in memory, one hot page always replaces another, increasing the number of page faults
  - how to resolve thrashing?
    - swap out entire processes
    - invoked when the page fault rate exceeds some bound
    - Linux invokes the out-of-memory (OOM) killer
- inverted page tables
  - aim to reduce the memory overhead of (multi-level) page tables
  - observation: the physical memory is much smaller than the virtual address space
  - the number of page table entries (PTEs) = the number of physical frames
  - each PTE records which process and virtual page currently occupy that frame
  - lookup must scan through the entire table to find a match -> slow
  - hashed inverted page table
    - accelerates the lookup using a hash function (see slide 32)
- sharing memory
  - mmap(void *start, size_t length, int prot, int flags, int fd, off_t offset)
    - used to save memory: two processes read shared read-only data from the same pages in memory
    - bypasses expensive read, write, or ioctl calls
    - pass MAP_SHARED in the flags
  - shmget(key, size, flags) - create a shared memory segment
  - shmat(shmid, addr, flags) - attach the shared memory segment identified by shmid to the address space of the calling process
  - shmdt(addr) - detach the shared memory segment attached at addr