page reclaim 02: Implementation 2017-02-07

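This article simplifies the description: it ignores compound pages and HUGEPAGE, and assumes CONFIG_MEMCG and CONFIG_SWAP are disabled. Unless otherwise noted, it discusses only the ARM architecture.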

1. Data structures

per zone: active list, inactive list

struct zone {
    struct lruvec       lruvec;
};

struct lruvec {
    struct list_head lists[NR_LRU_LISTS];
    ...
};
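
A minimal user-space sketch of how one lruvec holds several LRU lists and how a page is routed to one of them. The enum values follow the layout I recall from the 3.10 tree (anon/file times inactive/active, plus unevictable), so treat them as an assumption; toy_page, toy_page_lru and toy_add_page are hypothetical names used only for illustration, not kernel APIs.

```c
#include <stddef.h>

/* Doubly linked list head, modeled on the kernel's struct list_head. */
struct list_head {
    struct list_head *next, *prev;
};

static void init_list_head(struct list_head *h) { h->next = h; h->prev = h; }
static int list_empty(const struct list_head *h) { return h->next == h; }

/* Insert n right after head (the "hot" end of the list). */
static void list_add(struct list_head *n, struct list_head *head)
{
    n->next = head->next;
    n->prev = head;
    head->next->prev = n;
    head->next = n;
}

/* Assumed layout: index = base + (file ? 2 : 0) + (active ? 1 : 0). */
enum lru_list {
    LRU_INACTIVE_ANON,
    LRU_ACTIVE_ANON,
    LRU_INACTIVE_FILE,
    LRU_ACTIVE_FILE,
    LRU_UNEVICTABLE,
    NR_LRU_LISTS
};

struct lruvec {
    struct list_head lists[NR_LRU_LISTS];
};

/* Hypothetical stand-in for struct page: only what the sketch needs. */
struct toy_page {
    struct list_head lru;
    int active;      /* stand-in for PageActive() */
    int file_backed; /* stand-in for page_is_file_cache() */
};

static void lruvec_init(struct lruvec *v)
{
    for (int i = 0; i < NR_LRU_LISTS; i++)
        init_list_head(&v->lists[i]);
}

/* Pick the list a (non-unevictable) page belongs on. */
static enum lru_list toy_page_lru(const struct toy_page *p)
{
    int idx = p->file_backed ? LRU_INACTIVE_FILE : LRU_INACTIVE_ANON;
    return (enum lru_list)(idx + (p->active ? 1 : 0));
}

static void toy_add_page(struct lruvec *v, struct toy_page *p)
{
    list_add(&p->lru, &v->lists[toy_page_lru(p)]);
}
```

With CONFIG_SWAP disabled, as this article assumes, the anon lists still exist in the array; they are simply not scanned for swap-out.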

per node: kswapd

kswapd_init -> for_each_node_state(nid, N_MEMORY) kswapd_run(nid)
kswapd_run -> pgdat->kswapd = kthread_run(kswapd, pgdat, "kswapd%d", nid);

page reclaim 05: page count 2017-02-07

1. The question

linux-3.10.86/mm/vmscan.c

static inline int is_page_cache_freeable(struct page *page)
{
    /*
     * A freeable page cache page is referenced only by the caller
     * that isolated the page, the page cache radix tree and
     * optional buffer heads at page->private.
     */
    return page_count(page) - page_has_private(page) == 2;
}

Why == 2?

2. Answer
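
One reading, taken directly from the comment in is_page_cache_freeable(): a freeable page holds one reference from the caller that isolated it and one from the page cache radix tree, which gives 2. Buffer heads at page->private, if present, hold one more reference, and subtracting page_has_private(page) cancels it out. Any count above that means someone else still uses the page. The toy model below only illustrates this arithmetic; toy_page and its fields are hypothetical stand-ins, and it assumes page_has_private() evaluates to 0 or 1.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two values the kernel check reads. */
struct toy_page {
    int count;        /* stand-in for page_count(page) */
    bool has_private; /* stand-in for page_has_private(page) */
};

/*
 * Same arithmetic as is_page_cache_freeable():
 *   1 ref from the isolating caller
 * + 1 ref from the page cache radix tree
 * + 1 ref from buffer heads, only if has_private
 */
static bool toy_is_page_cache_freeable(const struct toy_page *page)
{
    return page->count - (page->has_private ? 1 : 0) == 2;
}
```

For example, a page with count 3 and no buffer heads is not freeable under this check, because the third reference belongs to some other user.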

page reclaim 01: Overview 2017-02-07

1. Background / the question

This article does not discuss swapping (swap out to disk).

If a seldom-used page is backed by a block device (e.g., memory mappings of files) then the modified pages need not be swapped out, but can be directly synchronized with the block device. The page frame can be reused, and if the data are required again, it can be reconstructed from the source. If a page is backed by a file but cannot be modified in memory (e.g., binary executable data), then it can be discarded if it is currently not required. In other words, writing back the cached data is enough to free these pages.

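The goal is to reclaim memory that is temporarily unused or rarely used, so that it can be given to later users. So how do we decide what counts as "temporarily unused" or "rarely used"? And where is all this already-allocated memory scattered?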

2. Where the pages are scattered

How do we find this memory? Is there some list?
Answer: see add_to_page_cache_lru(), page_add_new_anon_rmap()

Method 1: add_to_page_cache_lru adds the page to both the page cache and the LRU cache. Most importantly, it is used by mpage_readpages and do_generic_mapping_read, the standard functions in which the block layer ends up when reading data from a file or mapping. In practice, the page is first added to a per-CPU struct pagevec, and is only moved to the global LRU once the pagevec is full.
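
The per-CPU batching idea can be sketched in user space as follows. This is a toy under stated assumptions, not the kernel code: toy_pagevec, toy_lru, toy_pagevec_drain and toy_lru_cache_add are hypothetical names, the batch size of 14 is what I recall PAGEVEC_SIZE being in kernels of this era, and the real code also takes the LRU lock and a page reference, which are omitted here.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PAGEVEC_SIZE 14 /* assumption: mirrors PAGEVEC_SIZE */
#define TOY_LRU_CAPACITY 64 /* arbitrary bound for the sketch */

struct toy_page { int id; };

/* Small per-CPU staging buffer. */
struct toy_pagevec {
    unsigned int nr;
    struct toy_page *pages[TOY_PAGEVEC_SIZE];
};

/* Stand-in for the global LRU list (a flat array here). */
struct toy_lru {
    size_t nr;
    struct toy_page *pages[TOY_LRU_CAPACITY];
};

/* Move every staged page to the global LRU in one go. */
static void toy_pagevec_drain(struct toy_pagevec *pvec, struct toy_lru *lru)
{
    for (unsigned int i = 0; i < pvec->nr; i++) {
        assert(lru->nr < TOY_LRU_CAPACITY);
        lru->pages[lru->nr++] = pvec->pages[i];
    }
    pvec->nr = 0;
}

/* Stage a page; touch the global LRU only when the batch is full. */
static void toy_lru_cache_add(struct toy_pagevec *pvec, struct toy_lru *lru,
                              struct toy_page *page)
{
    pvec->pages[pvec->nr++] = page;
    if (pvec->nr == TOY_PAGEVEC_SIZE)
        toy_pagevec_drain(pvec, lru);
}
```

The point of the batch is that the shared LRU, and the lock protecting it in the real kernel, is touched once per batch rather than once per page.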

page reclaim 00: References 2017-02-07

  1. [Professional Linux Kernel Architecture]

  2. Reducing Memory Access Latency by Satoru Moriya http://events.linuxfoundation.org/sites/events/files/lcjp13_moriya.pdf

  3. https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/

  4. [Understanding the Linux Kernel, 3rd Edition]

page reclaim 06: ARM and L_PTE_YOUNG 2017-02-07

1. Effect of page_referenced_one on the hardware page table

linux-3.10.86/include/asm-generic/pgtable.h

page_referenced_one -> ptep_clear_flush_young_notify -> ptep_clear_flush_young -> ptep_test_and_clear_young
{
// pte_mkold is implemented as: PTE_BIT_FUNC(mkold,     &= ~L_PTE_YOUNG);
set_pte_at(vma->vm_mm, address, ptep, pte_mkold(pte));
}

linux-3.10.86/arch/arm/include/asm/pgtable.h

static inline void set_pte_at(struct mm_struct *mm, unsigned long addr,
                  pte_t *ptep, pte_t pteval)
{
    unsigned long ext = 0;

    if (addr < TASK_SIZE && pte_present_user(pteval)) {
        __sync_icache_dcache(pteval);
        ext |= PTE_EXT_NG;
    }
    /*
     * @pteval is the Linux version of the pte.
     * The hardware version is generated based on ext.
     */
    set_pte_ext(ptep, pteval, ext);
}
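
The test-and-clear step at the end of the page_referenced_one call chain can be sketched in user space like this. It is a toy under stated assumptions: toy_pte_t and the toy_ functions are hypothetical, the bit positions are illustrative and not taken from the ARM headers, and the real code writes the entry back through set_pte_at() rather than a plain store.

```c
#include <stdint.h>

typedef uint32_t toy_pte_t; /* hypothetical Linux-view pte */

/* Illustrative bit positions; the real values live in the ARM pgtable headers. */
#define TOY_L_PTE_PRESENT ((toy_pte_t)1 << 0)
#define TOY_L_PTE_YOUNG   ((toy_pte_t)1 << 1)

static int toy_pte_young(toy_pte_t pte)
{
    return (pte & TOY_L_PTE_YOUNG) != 0;
}

/* Same idea as PTE_BIT_FUNC(mkold, &= ~L_PTE_YOUNG). */
static toy_pte_t toy_pte_mkold(toy_pte_t pte)
{
    return pte & ~TOY_L_PTE_YOUNG;
}

/*
 * Return 1 if the entry was young and clear the young bit,
 * otherwise return 0 and leave the entry untouched.
 */
static int toy_ptep_test_and_clear_young(toy_pte_t *ptep)
{
    toy_pte_t pte = *ptep;

    if (!toy_pte_young(pte))
        return 0;
    *ptep = toy_pte_mkold(pte); /* real code: set_pte_at(..., pte_mkold(pte)) */
    return 1;
}
```

A second call on the same entry returns 0 until something marks the entry young again, which is how the reclaim code tells recently referenced pages from idle ones.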