Title: Memory Management with respect to Page replacement in the Linux Kernel
1 Memory Management with respect to Page Replacement in the Linux Kernel
- Kernel reference tree: 2.4.20
- By A.R.Karthick (karthick_r_at_infosys.com)
2 Memory Hierarchies
- L1 cache
- L2 cache
- RAM
- Hard Disk
- (Figure: access time increases from L1 cache down to Hard Disk)
3 Page Tables
- Define the virtual to physical mapping
- The page directory (PGD), page mid-level directory (PMD) and page table entry (PTE) define the course of translation
- Example
- PGD 10 bits, PTE 10 bits (PMD folded on 32-bit)
- 0000 0000 00 | 00 1000 0000 | 1100 0000 1111 -> (0x00080c0f)
- pgd index = 0, pte index = 1 << 7; pmd is folded into pgd
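The example address can be decoded with a few shifts and masks. The following is a minimal sketch assuming the 10/10/12 split described on this slide; the helper names (pgd_index_of, pte_index_of, page_offset_of) are hypothetical and are not the kernel's own macros.

```c
#include <stdint.h>

/* Assumed i386 non-PAE layout: 10-bit PGD index, 10-bit PTE index,
 * 12-bit page offset; the PMD level is folded into the PGD. */
#define PAGE_SHIFT   12
#define PGDIR_SHIFT  22
#define PTRS_PER_PTE 1024u

/* Top 10 bits select the page directory entry. */
unsigned pgd_index_of(uint32_t vaddr)
{
    return vaddr >> PGDIR_SHIFT;
}

/* Middle 10 bits select the entry inside the page table. */
unsigned pte_index_of(uint32_t vaddr)
{
    return (vaddr >> PAGE_SHIFT) & (PTRS_PER_PTE - 1);
}

/* Low 12 bits are the byte offset inside the 4 KB page. */
unsigned page_offset_of(uint32_t vaddr)
{
    return vaddr & ((1u << PAGE_SHIFT) - 1);
}
```

For 0x00080c0f these helpers give PGD index 0, PTE index 128 (1 << 7) and page offset 0xc0f.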
4 Page Table Entry Status Bits (PTE Entry)
- Bit positions: 11 10 9 8 7 6 5 4 3 2 1 0
- Flags, starting from bit 0: PAGE_PRESENT, PAGE_RW, PAGE_USER, PAGE_RESERVED, PAGE_ACCESSED, PAGE_DIRTY, INTERNAL_STATUS
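As an illustration of how these status bits are tested, here is a small sketch. It assumes the i386 hardware bit positions (present = bit 0, read/write = bit 1, user = bit 2, accessed = bit 5, dirty = bit 6) and reuses the flag names from the slide; the helper functions are hypothetical.

```c
#include <stdint.h>

/* Flag masks, assuming the i386 hardware bit positions. */
#define PAGE_PRESENT    0x001u
#define PAGE_RW         0x002u
#define PAGE_USER       0x004u
#define PAGE_ACCESSED   0x020u
#define PAGE_DIRTY      0x040u
#define PAGE_FRAME_MASK 0xfffff000u  /* upper 20 bits: physical frame address */

int pte_is_present(uint32_t pte) { return (pte & PAGE_PRESENT) != 0; }
int pte_is_writable(uint32_t pte) { return (pte & PAGE_RW) != 0; }
int pte_is_young(uint32_t pte) { return (pte & PAGE_ACCESSED) != 0; }
int pte_is_dirty(uint32_t pte) { return (pte & PAGE_DIRTY) != 0; }

/* Physical address of the page frame the entry points to. */
uint32_t pte_frame(uint32_t pte) { return pte & PAGE_FRAME_MASK; }

/* Clear the accessed bit, as a replacement scan does after sampling it. */
uint32_t pte_clear_accessed(uint32_t pte) { return pte & ~PAGE_ACCESSED; }
```

The accessed and dirty bits are the ones page replacement cares about: the first feeds the reference tracking, the second decides whether a page must be written back before it can be freed.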
5 Page Fault
- Processor exception raised when there is a problem mapping a virtual address to a physical address.
- Handled by do_page_fault in arch/i386/mm/fault.c.
- Write protection faults or COW faults map to do_wp_page.
- For pages in swap, do_swap_page is called.
- For pages not found, do_no_page is called, which either faults in an anonymous zero page or an existing page.
- Page faults populate the LRU cache.
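The dispatch described above can be modelled in user space. This is only a sketch of the decision, not the kernel code: struct pte_state, enum fault_handler and classify_fault are hypothetical names, and only the handler names do_no_page, do_swap_page and do_wp_page come from the slide.

```c
#include <stdbool.h>

/* Hypothetical, simplified view of a PTE at fault time. */
struct pte_state {
    bool present;   /* PAGE_PRESENT is set */
    bool writable;  /* PAGE_RW is set */
    bool in_swap;   /* not present, but the entry records a swap location */
};

/* Which handler the fault would be routed to. */
enum fault_handler { HANDLER_NONE, DO_NO_PAGE, DO_SWAP_PAGE, DO_WP_PAGE };

enum fault_handler classify_fault(struct pte_state pte, bool write_access)
{
    if (!pte.present) {
        if (pte.in_swap)
            return DO_SWAP_PAGE;  /* page was swapped out earlier */
        return DO_NO_PAGE;        /* page was never mapped */
    }
    if (write_access && !pte.writable)
        return DO_WP_PAGE;        /* write to a read-only (e.g. COW) page */
    return HANDLER_NONE;          /* mapping is fine, nothing to fault in */
}
```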
6 Page Replacement Algorithms
- Optimal Replacement: not implementable, since it needs knowledge of future references
- Not Recently Used (NRU): crude hack
- FIFO: inefficient
- Second Chance: better than the above
- Clock Replacement: more efficient than the above
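A minimal sketch of clock replacement follows, assuming a fixed array of four frames and a software referenced bit per frame. All names here (struct frame, clock_init, clock_access) are illustrative and not taken from the kernel.

```c
#include <stdbool.h>
#include <stddef.h>

#define NFRAMES 4  /* assumed small frame count, for illustration only */

struct frame {
    int  page;        /* page number held, -1 if the frame is free */
    bool referenced;  /* software copy of the accessed bit */
};

struct clock_state {
    struct frame frames[NFRAMES];
    size_t hand;      /* next frame the clock hand will inspect */
};

void clock_init(struct clock_state *c)
{
    for (size_t i = 0; i < NFRAMES; i++) {
        c->frames[i].page = -1;
        c->frames[i].referenced = false;
    }
    c->hand = 0;
}

/* Touch a page. Returns the evicted page number, or -1 if nothing was evicted. */
int clock_access(struct clock_state *c, int page)
{
    /* Hit: only set the referenced bit. */
    for (size_t i = 0; i < NFRAMES; i++) {
        if (c->frames[i].page == page) {
            c->frames[i].referenced = true;
            return -1;
        }
    }
    /* Miss: sweep the hand, clearing referenced bits, until a victim turns up. */
    for (;;) {
        struct frame *f = &c->frames[c->hand];
        c->hand = (c->hand + 1) % NFRAMES;
        if (f->page != -1 && f->referenced) {
            f->referenced = false;  /* second chance */
            continue;
        }
        int victim = f->page;       /* -1 when the frame was free */
        f->page = page;
        f->referenced = true;
        return victim;
    }
}
```

Unlike Second Chance on a linked list, the clock hand only advances an index, so no page has to be moved to the back of a queue.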
7 Page Replacement Algorithms
- LRU: Least Recently Used replacement
- NFU: Not Frequently Used replacement
- Page ageing based replacement
- Working Set algorithm, based on locality of references per process
- Working Set based clock algorithms
- LRU with ageing and Working Set algorithms are efficient and commonly used
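Page ageing can be sketched with one small counter per page. The version below is the textbook form (shift right, then fold the referenced bit into the top bit), assuming an 8-bit counter; it is not the Linux implementation, and the names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

struct aged_page {
    uint8_t age;        /* ageing counter, larger means more recently used */
    int     referenced; /* set when the page is touched, cleared on each tick */
};

/* One tick: shift every counter right and put the referenced bit on top. */
void age_tick(struct aged_page *pages, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        pages[i].age = (uint8_t)((pages[i].age >> 1) |
                                 (pages[i].referenced ? 0x80 : 0));
        pages[i].referenced = 0;
    }
}

/* The victim is the page with the smallest counter (lowest index wins ties). */
size_t age_pick_victim(const struct aged_page *pages, size_t n)
{
    size_t victim = 0;
    for (size_t i = 1; i < n; i++)
        if (pages[i].age < pages[victim].age)
            victim = i;
    return victim;
}
```

A page referenced in the latest tick always outranks one referenced only in earlier ticks, which is what makes the counter an approximation of LRU.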
8 Page replacement handling in Linux Kernel
- Page Cache
- Pages are added to the page cache for fast lookup.
- Page cache pages are hashed based on their address space and page index.
- Inode or disk block pages, shared pages and anonymous pages form the page cache.
- Swap-cache pages, which represent swapped pages, are also part of the page cache.
- Anonymous pages enter the swap cache at swap-out time, and shared pages enter when they become dirty.
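The lookup by (address space, page index) can be sketched as a chained hash table. This is an illustrative model only: the hash function, the table size and the names cache_insert and cache_lookup are assumptions, and struct address_space here is a stand-in rather than the kernel structure.

```c
#include <stddef.h>
#include <stdint.h>

#define HASH_BITS 4
#define HASH_SIZE (1u << HASH_BITS)

struct address_space { int id; };  /* hypothetical stand-in */

struct cached_page {
    struct address_space *mapping;  /* which object the page belongs to */
    unsigned long index;            /* page offset within that object */
    struct cached_page *next_hash;  /* chain of pages in the same bucket */
};

static struct cached_page *hash_table[HASH_SIZE];

/* Mix the mapping pointer and the page index into a bucket number. */
unsigned page_hash(const struct address_space *mapping, unsigned long index)
{
    uintptr_t key = (uintptr_t)mapping / sizeof(struct address_space) + index;
    return (unsigned)((key + (key >> HASH_BITS)) & (HASH_SIZE - 1));
}

/* Add a page to the head of its bucket chain. */
void cache_insert(struct cached_page *page, struct address_space *mapping,
                  unsigned long index)
{
    unsigned h = page_hash(mapping, index);
    page->mapping = mapping;
    page->index = index;
    page->next_hash = hash_table[h];
    hash_table[h] = page;
}

/* Return the cached page for (mapping, index), or NULL on a miss. */
struct cached_page *cache_lookup(struct address_space *mapping,
                                 unsigned long index)
{
    struct cached_page *p = hash_table[page_hash(mapping, index)];
    for (; p != NULL; p = p->next_hash)
        if (p->mapping == mapping && p->index == index)
            return p;
    return NULL;
}
```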
9 LRU Cache
- The LRU cache is made up of active and inactive lists.
- These lists are populated during page faults and when page-cache pages are accessed or referenced.
- kswapd is the page-out kernel thread that balances the LRU cache and trickles out pages based on an approximation to the LRU algorithm.
- The active list contains referenced pages. This list is monitored for page references through refill_inactive.
- Referenced pages are given a chance to age through move-to-front, and unreferenced pages are moved to the inactive list.
- The inactive list contains the set of inactive clean and inactive dirty pages.
- This set is scanned periodically when the per-zone pages_high threshold for free pages is crossed.
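The active-list scan can be modelled as follows. This is a user-space sketch of the idea behind refill_inactive, not the kernel routine: the array-backed list, its MAX_PAGES limit (callers are assumed to stay below it) and all function names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define MAX_PAGES 16  /* assumed capacity; callers must not exceed it */

struct lru_page { int id; int referenced; };

struct lru_list {
    struct lru_page *pages[MAX_PAGES];  /* pages[0] is the head (most recent) */
    size_t count;
};

/* Put a page at the head of a list (move-to-front). */
void list_add_head(struct lru_list *l, struct lru_page *p)
{
    memmove(&l->pages[1], &l->pages[0], l->count * sizeof(l->pages[0]));
    l->pages[0] = p;
    l->count++;
}

/* Take the oldest page off the tail, or NULL if the list is empty. */
struct lru_page *list_del_tail(struct lru_list *l)
{
    return l->count ? l->pages[--l->count] : NULL;
}

/* Scan up to nr_to_scan pages from the tail of the active list.
 * Returns how many pages were moved to the inactive list. */
size_t refill_inactive_sketch(struct lru_list *active,
                              struct lru_list *inactive, size_t nr_to_scan)
{
    size_t moved = 0;
    if (nr_to_scan > active->count)
        nr_to_scan = active->count;  /* visit each page at most once */
    while (nr_to_scan--) {
        struct lru_page *p = list_del_tail(active);
        if (p->referenced) {
            p->referenced = 0;           /* give it another round to age */
            list_add_head(active, p);
        } else {
            list_add_head(inactive, p);  /* candidate for replacement */
            moved++;
        }
    }
    return moved;
}
```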
10 Kswapd or Kernel Page-out Daemon
- kswapd performs zone balancing based on pages_high, pages_low and pages_min.
- The page replacement policy is an LRU approximation that is empirical in nature.
- It does not follow strict page ageing based on page reference _times_.
- The shrink_cache routine is the principal page replacement routine.
- It launders dirty inode or disk block pages by scheduling write-backs.
- It triggers swap-out based on the anonymous pages found in the inactive list.
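Zone balancing against the watermarks can be sketched as below. The sketch assumes one common reading of the thresholds: balancing starts once free pages drop below pages_low and continues until pages_high is reached, with pages_min left as an emergency floor. The struct and function names are hypothetical, and the reclaim step is a stand-in that simply credits a batch of pages.

```c
/* Hypothetical, reduced view of a memory zone. */
struct zone_sketch {
    unsigned long free_pages;
    unsigned long pages_min;   /* emergency floor, not used by this sketch */
    unsigned long pages_low;   /* assumed wake-up point for balancing */
    unsigned long pages_high;  /* assumed target at which balancing stops */
};

int zone_needs_balance(const struct zone_sketch *z)
{
    return z->free_pages < z->pages_low;
}

int zone_is_balanced(const struct zone_sketch *z)
{
    return z->free_pages >= z->pages_high;
}

/* Reclaim in batches until the zone is balanced.
 * Returns the number of pages reclaimed. */
unsigned long kswapd_balance_sketch(struct zone_sketch *z, unsigned long batch)
{
    unsigned long reclaimed = 0;
    if (!zone_needs_balance(z) || batch == 0)
        return 0;
    while (!zone_is_balanced(z)) {
        z->free_pages += batch;  /* stand-in for one reclaim pass */
        reclaimed += batch;
    }
    return reclaimed;
}
```

The gap between pages_low and pages_high gives hysteresis, so the daemon does not wake up and go back to sleep on every single allocation.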
11 Page replacement code snippet
    if (unlikely(TryLockPage(page))) {
        /* Page is locked: if it is being laundered, wait for the write-back */
        if (PageLaunder(page) && (gfp_mask & __GFP_FS)) {
            page_cache_get(page);
            spin_unlock(&pagemap_lru_lock);
            wait_on_page(page);
            page_cache_release(page);
            spin_lock(&pagemap_lru_lock);
        }
        continue;
    }
    if (PageDirty(page) && is_page_cache_freeable(page) && page->mapping) {
        int (*writepage)(struct page *);

        writepage = page->mapping->a_ops->writepage;
        if ((gfp_mask & __GFP_FS) && writepage) {
            /* Launder the dirty page: schedule its write-back */
            ClearPageDirty(page);
            SetPageLaunder(page);
            page_cache_get(page);
            spin_unlock(&pagemap_lru_lock);
            writepage(page);
            page_cache_release(page);
            spin_lock(&pagemap_lru_lock);
            continue;
        }
    }
12 Page replacement code snippet
    if (!page->mapping || !is_page_cache_freeable(page)) {
        spin_unlock(&pagecache_lock);
        UnlockPage(page);
    page_mapped:
        if (--max_mapped >= 0)
            continue;
        /*
         * Alert! We've found too many mapped pages on the
         * inactive list, so we start swapping out now!
         */
        spin_unlock(&pagemap_lru_lock);
        swap_out(priority, gfp_mask, classzone);
        return nr_pages;
    }
13 MM for MIR Kernel
- MIR: an open-source kernel by Indians
- Support for two-level page tables
- Bitmap allocator for PAGE_SIZE-sized allocations
- Slab cache allocator for allocations smaller than PAGE_SIZE
- vmalloc allocations for virtually contiguous, physically discontiguous allocations
- Basic support for per-process virtual address space management
- Todo
- Support COW, mmap functionality and full-fledged virtual address space management
- Implement an LRU-based page replacement policy
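For illustration, a bitmap allocator for page-sized allocations can be as small as the sketch below. This is not MIR's implementation: the 64-page pool, the first-fit scan and the function names are all assumptions made for the example.

```c
#include <stdint.h>

#define NR_PAGES 64u  /* assumed pool size, for illustration only */

/* One bit per page frame: 1 = in use, 0 = free. */
static uint32_t page_bitmap[NR_PAGES / 32];

/* First-fit allocation. Returns a page frame number, or -1 if none is free. */
long bitmap_alloc_page(void)
{
    for (unsigned i = 0; i < NR_PAGES; i++) {
        uint32_t bit = 1u << (i % 32);
        if (!(page_bitmap[i / 32] & bit)) {
            page_bitmap[i / 32] |= bit;
            return (long)i;
        }
    }
    return -1;
}

/* Mark a page frame free again; out-of-range numbers are ignored. */
void bitmap_free_page(unsigned pfn)
{
    if (pfn < NR_PAGES)
        page_bitmap[pfn / 32] &= ~(1u << (pfn % 32));
}
```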
14 References
- Linux Kernel Source Code
- www.surriel.com
- www.google.com