2017-02-07

1. page_referenced

linux-3.10.86/mm/rmap.c

 * Quick test_and_clear_referenced for all mappings to a page,
 * returns the number of ptes which referenced the page.
 This comment is outdated: the function does not actually return the number of ptes that map the page.

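 Question: what is this function used for?
 Answer: it reflects how active a page on the inactive list is.
 A return value of 1 and a return value of 2 are treated differently, see page_check_references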
 {
     referenced_ptes = page_referenced(...);
     if (... || referenced_ptes > 1)
        return PAGEREF_ACTIVATE;

 }
 */
int page_referenced(struct page *page,
        int is_locked,
        struct mem_cgroup *memcg,
        unsigned long *vm_flags)

2. L_PTE_YOUNG or mark_page_accessed: which one to use?

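Some pages are first mapped with kmap, then passed to __copy_to_user, and finally released with kunmap. Once used this way the page is unmapped again, so for a highmem page there is no point in recording activity through the pte young bit; mark_page_accessed is used to raise its activity instead. A page of this kind is not referenced by any process, and reclaiming it does not require touching page tables. Put differently, the page is not associated with a process, so its activity is not recorded through page tables.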

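For a page that is associated with a process, an access while L_PTE_YOUNG is not set raises an exception, and the fault handler sets L_PTE_YOUNG, which records the activity. When the page is scanned, L_PTE_YOUNG is always cleared (if it was set) once the activity information has been evaluated.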
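The set-on-fault, clear-on-scan cycle described above can be sketched in user space. This is only a minimal model, not kernel code: `struct model_pte`, `model_access` and `model_test_and_clear_young` are hypothetical names introduced for illustration, and the `young` field merely stands in for L_PTE_YOUNG.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for one pte; not a kernel structure. */
struct model_pte {
    bool present;
    bool young;   /* plays the role of L_PTE_YOUNG */
};

/*
 * Models an access through the pte. If young is clear, the access
 * "faults" and the handler sets young. Returns true when a fault
 * was taken, false when the access went straight through.
 */
static bool model_access(struct model_pte *pte)
{
    assert(pte != NULL && pte->present);
    if (pte->young)
        return false;
    pte->young = true;
    return true;
}

/*
 * Models what the reclaim scan does to one pte: report whether young
 * was set, and clear it unconditionally.
 */
static bool model_test_and_clear_young(struct model_pte *pte)
{
    bool was_young;

    assert(pte != NULL);
    was_young = pte->young;
    pte->young = false;
    return was_young;
}
```

In this model a page that is touched between two scans reports young on the second scan, while a page that is left alone does not, which is the activity signal the scan relies on.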

3. page_check_references

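Activity shows up in two places. One is that the page may be referenced by several processes/threads, which is reflected in the young bit of the pte for that page in each of those processes.
The other is the struct page flags PG_referenced and PG_active.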

shrink_inactive_list -> shrink_page_list -> page_check_references

static enum page_references page_check_references(struct page *page,
                      struct scan_control *sc)
{
    int referenced_ptes, referenced_page;
    unsigned long vm_flags;

    referenced_ptes = page_referenced(page, 1, sc->target_mem_cgroup,
                      &vm_flags);
    referenced_page = TestClearPageReferenced(page);


    if (referenced_ptes) {
        SetPageReferenced(page);

        if (referenced_page || referenced_ptes > 1)
            return PAGEREF_ACTIVATE;
        ...
        return PAGEREF_KEEP;
    }

}

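Question: how should the SetPageReferenced here be understood?
Answer: page_referenced() always clears the young bit of the ptes. If this path (shrink_inactive_list -> .. -> page_check_references) did not call SetPageReferenced, the activity of a fairly active page would be wiped out in one go, which does not seem appropriate. SetPageReferenced is called here so that the page keeps a little of its activity.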
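The effect of that SetPageReferenced can be seen in a small user-space model of the decision. This is a sketch under assumptions, not the kernel function: the `model_*` names are hypothetical, the branches elided with `...` in the excerpt are ignored, the no-reference case is reduced to a single `MODEL_RECLAIM` result, and the `set_referenced` switch exists only so the two behaviours can be compared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mirror of the three outcomes discussed in the text. */
enum model_pageref { MODEL_RECLAIM, MODEL_KEEP, MODEL_ACTIVATE };

/* Hypothetical stand-in for struct page; only PG_referenced is modeled. */
struct model_page {
    bool referenced;
};

/*
 * referenced_ptes: what page_referenced() would have returned.
 * set_referenced:  true models the real code path; false models the
 *                  "what if SetPageReferenced were missing" case.
 */
static enum model_pageref
model_check_references(struct model_page *page, int referenced_ptes,
                       bool set_referenced)
{
    bool referenced_page;

    assert(page != NULL);

    /* TestClearPageReferenced(): read the old flag, then clear it. */
    referenced_page = page->referenced;
    page->referenced = false;

    if (referenced_ptes) {
        if (set_referenced)
            page->referenced = true;   /* SetPageReferenced() */

        if (referenced_page || referenced_ptes > 1)
            return MODEL_ACTIVATE;
        return MODEL_KEEP;
    }
    return MODEL_RECLAIM;   /* simplification made by this sketch */
}
```

With `set_referenced` true, a page seen with one young pte on two consecutive scans gets `MODEL_KEEP` and then `MODEL_ACTIVATE`. With it false, the same page gets `MODEL_KEEP` on every scan, because each scan forgets what the previous one observed.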

4. mm: don't mark_page_accessed in fault path

linux-3.10.86/mm/filemap.c

filemap_fault() is invoked via the vma operations vector for a mapped memory region to read in file data during a page fault.

filemap_fault  
|--count_vm_event(PGMAJFAULT);  
|--no_cached_page: page_cache_read(file, offset);  
|   |--page_cache_alloc_cold  
|   |--add_to_page_cache_lru -> lru_cache_add_file  
|   |   |--__lru_cache_add(page, LRU_INACTIVE_FILE)  
|   |--mapping->a_ops->readpage  
|--retry_find:  page = find_get_page(mapping, offset)  
|--mark_page_accessed  

mm: don't mark_page_accessed in fault path https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=bf3f3bc5e734706730c12a323f9b2068052aa1f0

Doing a mark_page_accessed at fault-time, then doing SetPageReferenced at unmap-time if the pte is young has a number of problems.

Before this patch, the approach was to call mark_page_accessed at fault time and then, at unmap time, call SetPageReferenced if the pte was young. That approach causes several problems.

So calling mark_page_accessed not only adds extra lru or PG_referenced manipulations for pages that are already going to have pte_young ptes anyway, but it also adds these references which are difficult to work with from the context of vma specific references (eg. MADV_SEQUENTIAL pte_young may not wish to contribute to the page being referenced).

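The pte is going to have its young bit set anyway, so there is no need to maintain an extra page flag on top of that. The extra reference also gets in the way of vma-specific handling such as MADV_SEQUENTIAL.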

Then, simply doing SetPageReferenced when zapping a pte and finding it is young, is not a really good solution either. SetPageReferenced does not correctly promote the page to the active list for example. So after removing mark_page_accessed from the fault path, several mmap()+touch+munmap() would have a very different result from several read(2) calls for example, which is not really desirable.

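With the previous handling, the second munmap only sets the same flag again, so the information left by the first round is lost, and the resulting activity differs from what repeated read calls produce.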
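The difference between the two calls can be illustrated with a user-space model of the page flags. This is a sketch, not the kernel implementation: `struct model_lru_page` and the `model_*` functions are hypothetical, and the model assumes the page is on the LRU and evictable, so only the active/referenced two-step promotion is represented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; two flags only. */
struct model_lru_page {
    bool active;       /* plays the role of PG_active */
    bool referenced;   /* plays the role of PG_referenced */
};

/*
 * Two-step promotion: the first call on an inactive page sets
 * referenced, the second call moves it to active and clears
 * referenced again.
 */
static void model_mark_page_accessed(struct model_lru_page *p)
{
    assert(p != NULL);
    if (!p->active && p->referenced) {
        p->active = true;
        p->referenced = false;
    } else if (!p->referenced) {
        p->referenced = true;
    }
}

/* Only sets the flag; calling it again changes nothing. */
static void model_set_page_referenced(struct model_lru_page *p)
{
    assert(p != NULL);
    p->referenced = true;
}
```

Two rounds through `model_mark_page_accessed` leave the page active, while two rounds through `model_set_page_referenced` leave it inactive with the flag set, which is the asymmetry between munmap and read(2) that the commit message objects to.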

4.1 Handling at unmap time

linux-3.10.86/mm/memory.c

zap_pte_range
{
    if (PageAnon(page))
        ...
    else{
        if (pte_present(ptent)) {
                if (pte_young(ptent) &&
                    likely(!VM_SequentialReadHint(vma)))
                    /* Question: we are unmapping, so why mark accessed? */
                    mark_page_accessed(page);
        }
    }
}

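Question: we are unmapping, so why mark the page accessed?
Answer: because of the unmap, the activity information that was recorded in the young bit is about to disappear, so mark_page_accessed is used to carry that information over.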

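This is the branch where PageAnon(page) is false. Even after the page is detached from the pte, it can still be found through the radix tree and its existing contents reused, which is why recording its activity is meaningful.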
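The condition under which the excerpt carries activity over at unmap time can be written as a single predicate. This is only an illustrative user-space sketch: `struct model_unmap_ctx` and `model_should_mark_accessed` are hypothetical names, and each field stands in for the corresponding kernel check named in its comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical snapshot of the checks made while zapping one pte. */
struct model_unmap_ctx {
    bool page_anon;         /* PageAnon(page) */
    bool pte_present;       /* pte_present(ptent) */
    bool pte_young;         /* pte_young(ptent) */
    bool sequential_hint;   /* VM_SequentialReadHint(vma) */
};

/*
 * True when the young bit should be converted into a
 * mark_page_accessed() call: a present, young pte of a page that is
 * not anonymous, in a vma without the sequential-read hint.
 */
static bool model_should_mark_accessed(const struct model_unmap_ctx *c)
{
    assert(c != NULL);
    if (c->page_anon)
        return false;
    return c->pte_present && c->pte_young && !c->sequential_hint;
}
```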

Article URL: https://awakening-fong.github.io/posts/mm/reclaim_03_activity

Please credit the source when reposting: https://awakening-fong.github.io

