LRU & MGLRU
Current kernels maintain two LRU lists to track pages of memory, called the "active" and "inactive" lists.
The former contains pages thought to be in active use, while the latter holds pages that are thought to be unused and available to be reclaimed for other uses. Pages are taken off the tail of the active list if they have not been accessed recently and placed at the head of the inactive list. If some process accesses a page on the inactive list, it will be promoted back to the active list.
A fair amount of effort goes into deciding when to move pages between the two lists.
It's worth noting that there are actually two pairs of lists, one for anonymous pages and one for file-backed pages. If memory control groups are in use, there is a whole set of LRU lists for each active group.
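The movement of pages between the two lists can be sketched as a small toy model. This is a simplified illustration, not kernel code: the class `TwoListLRU` and its method names are hypothetical, new pages are assumed to start on the inactive list, and an access to an already-active page is assumed to simply refresh its position.

```python
from collections import OrderedDict


class TwoListLRU:
    """Toy model of one active/inactive list pair (hypothetical, not kernel code)."""

    def __init__(self):
        # In each OrderedDict the first item is the tail (oldest entry)
        # and the last item is the head (most recently added entry).
        self.active = OrderedDict()
        self.inactive = OrderedDict()

    def add(self, page):
        # Assumption: a new page starts at the head of the inactive list.
        self.inactive[page] = True

    def access(self, page):
        if page in self.inactive:
            # An access promotes an inactive page back to the active list.
            del self.inactive[page]
            self.active[page] = True
        elif page in self.active:
            # Assumption: an access refreshes the page's position.
            self.active.move_to_end(page)

    def deactivate(self, count):
        # Take pages off the tail of the active list and place them
        # at the head of the inactive list.
        for _ in range(min(count, len(self.active))):
            page, _flag = self.active.popitem(last=False)
            self.inactive[page] = True

    def reclaim(self, count):
        # Reclaim pages from the tail of the inactive list.
        freed = []
        for _ in range(min(count, len(self.inactive))):
            page, _flag = self.inactive.popitem(last=False)
            freed.append(page)
        return freed
```

In a fuller model there would be one such pair for anonymous pages and one for file-backed pages, and one whole set per memory control group.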
The multi-generational LRU generalizes that concept into multiple generations, allowing pages to be in a state between "likely to be active" and "likely to be unused".
Pages move from older to newer generations when they are accessed; when memory is needed, pages are reclaimed from the oldest generation.
Generations age over time, with new generations being created as the oldest ones are fully reclaimed.
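The generation idea above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the class `MultiGenLRU`, the default of four generations, placing new and accessed pages in the youngest generation, and representing each generation as an unordered set. The real implementation tracks generations with sequence numbers rather than by moving sets around.

```python
from collections import deque


class MultiGenLRU:
    """Toy model of generations (hypothetical, not kernel code)."""

    def __init__(self, nr_gens=4):
        # generations[0] is the oldest generation, generations[-1] the youngest.
        self.generations = deque(set() for _ in range(nr_gens))

    def add(self, page):
        # Assumption: new pages enter the youngest generation.
        self.generations[-1].add(page)

    def access(self, page):
        # An accessed page moves from its current generation to the youngest.
        for gen in self.generations:
            if page in gen:
                gen.discard(page)
                self.generations[-1].add(page)
                return

    def age(self):
        # Once the oldest generation is fully reclaimed (empty), retire it
        # and create a new youngest one; every other generation gets older.
        if self.generations[0]:
            return False
        self.generations.popleft()
        self.generations.append(set())
        return True

    def reclaim(self, count):
        # Only the oldest generation is considered for reclaim.
        freed = []
        while len(freed) < count:
            oldest = self.generations[0]
            if oldest:
                freed.append(oldest.pop())
            elif any(self.generations):
                self.age()
            else:
                break  # nothing left to reclaim
        return freed
```

The point of the sketch is that a page touched once between two aging steps ends up in an intermediate generation rather than being forced into either "active" or "inactive".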
MGLRU
Zhao's patch set identifies a number of problems with the current state of affairs.
- The active/inactive sorting is too coarse for accurate decision making, and pages often end up on the wrong lists anyway.
- The use of independent lists in control groups makes it hard for the kernel to compare the relative age of pages across groups.
- The kernel has a longstanding bias toward evicting file-backed pages for a number of reasons, which can cause useful file-backed pages to be tossed while idle anonymous pages remain in memory. This problem has gotten worse in cloud-computing environments, where clients have relatively little local storage and, thus, relatively few file-backed pages in the first place.
- Meanwhile, the scanning of anonymous pages is expensive, partly because it uses a complex reverse-mapping mechanism that does not perform well when a lot of scanning must be done.
The multi-generational LRU patches try to address these problems with two fundamental changes:
- Add more LRU lists to cover a range of page ages between the current active and inactive lists; these lists are called "generations".
- Change the way page scanning is done to reduce its overhead.
Any pages that have remained idle are moved to the next older generation.
Question: this looks very similar to the two-generation (active/inactive) case. With two lists, a page's position within its list already shows which pages are relatively older, whereas MGLRU tells pages apart by generation. So could the relative age of pages be judged without introducing MGLRU at all?
When the time comes to reclaim pages, only the oldest generation need be considered. The "oldest generation" can be different for anonymous and file-backed pages; anonymous pages can be harder to reclaim in general (they must always be written to swap) and the new code retains some of the bias toward reclaiming file-backed pages more aggressively. So file-backed pages may not escape reclaim for as many generations as anonymous pages do.
The multi-generational LRU [LWN.net]
Multi-generational LRU: the next generation [LWN.net]
Lruvec
What does "aging" mean in LRU/MGLRU?
The aging produces young generations. Given an lruvec, the aging scans page tables for referenced pages of this lruvec. Upon finding one, the aging updates its generation number to max_seq. After each round of scan, the aging increments max_seq. The aging maintains either a system-wide mm_struct list or per-memcg mm_struct lists and tracks whether an mm_struct is being used on any CPUs or has been used since the last scan. Multiple threads can concurrently work on the same mm_struct list, and each of them will be given a different mm_struct belonging to a process that has been scheduled since the last scan.
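One round of aging, as described above, might look like the following sketch. It is a single-threaded toy under stated assumptions: `Lruvec` here is a hypothetical stand-in holding only `max_seq` and a page-to-generation map, each page table is modelled as a plain dict of page to referenced bit, and clearing the bit after the scan is an assumption rather than something the text states. The mm_struct list handling and concurrency are left out.

```python
class Lruvec:
    """Hypothetical stand-in for an lruvec (not the kernel structure)."""

    def __init__(self, max_seq=0):
        self.max_seq = max_seq  # sequence number of the youngest generation
        self.gen_of = {}        # page -> generation sequence number


def run_aging(lruvec, page_tables):
    """Run one aging round over page_tables.

    page_tables is a list of dicts mapping page -> referenced bit,
    one dict per (hypothetical) address space.
    """
    for table in page_tables:
        for page, referenced in list(table.items()):
            if referenced and page in lruvec.gen_of:
                # A referenced page of this lruvec gets generation max_seq.
                lruvec.gen_of[page] = lruvec.max_seq
                # Assumption: clear the bit so the next round only
                # sees accesses that happened after this scan.
                table[page] = False
    # After each round of scanning, max_seq is incremented.
    lruvec.max_seq += 1
```

After a round, pages that were not referenced keep their old generation number, so they fall further behind max_seq with every scan; that growing gap is what "aging" means in this model.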