SLUB 02:frozen 2017-01-14

This post assumes that CONFIG_SLUB_DEBUG is not enabled and CONFIG_NUMA is not configured.

1. What is frozen

linux-3.10.86/mm/slub.c

 *   If a slab is frozen then it is exempt from list management. It is not
 *   on any list. The processor that froze the slab is the one who can
 *   perform list operations on the page. Other processors may put objects
 *   onto the freelist but the processor that froze the slab is the only
 *   one that can retrieve the objects from the page's freelist.

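If c->page of cpu01 is frozen, cpu01 may both take objects from that page and return objects to it, while cpu02 may not take objects from that page and can only return objects to it.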
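The rule above can be sketched as a small user-space model. This is only an illustration, not kernel code: struct frozen_slab, owner_cpu, frozen_slab_alloc and frozen_slab_free are hypothetical names, and the real kernel does not store an owner in struct page (ownership follows from c->page of one CPU pointing at the page). The model also merges c->freelist and page->freelist into one list for brevity.

```c
#include <assert.h>
#include <stddef.h>

/* A free object only needs a link to the next free object. */
struct toy_obj {
    struct toy_obj *next;
};

/* Hypothetical stand-in for a slab page; owner_cpu is an assumption
 * made for this sketch and does not exist in struct page. */
struct frozen_slab {
    struct toy_obj *freelist;
    int frozen;
    int owner_cpu;
};

/* Only the CPU that froze the slab may retrieve objects from it. */
struct toy_obj *frozen_slab_alloc(struct frozen_slab *slab, int cpu)
{
    struct toy_obj *obj;

    if (slab->frozen && slab->owner_cpu != cpu)
        return NULL;            /* other CPUs cannot take objects */

    obj = slab->freelist;
    if (obj)
        slab->freelist = obj->next;
    return obj;
}

/* Any CPU may put an object back onto the freelist. */
void frozen_slab_free(struct frozen_slab *slab, struct toy_obj *obj)
{
    assert(obj != NULL);
    obj->next = slab->freelist;
    slab->freelist = obj;
}
```

With a slab frozen by cpu 1, frozen_slab_alloc(&slab, 2) returns NULL, while frozen_slab_alloc(&slab, 1) and frozen_slab_free() from any CPU both succeed.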

2. Pages on the cpu partial list are all frozen

linux-3.10.86/include/linux/slub_def.h

struct kmem_cache_cpu {
...
    struct page *partial;   /* Partially allocated frozen slabs */
    ...
};

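SLUB 03: partial and cpu partial 2017-01-14

This post mainly covers how partial and cpu partial slabs come into being.
The kernel is built without CONFIG_NUMA.

When "partial" is used without saying whether it means node partial or cpu partial, it means node partial.

1. How node partial slabs are produced

new_slab_objects -> new_slab runs on cpu0. Because new_slab may sleep, the task may afterwards be running on cpu1. If c->page on cpu1 is non-NULL at that point, it may, depending on its state, be put onto the node partial list.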

linux-3.10.86/mm/slub.c

__slab_alloc
|--local_irq_save(flags);
|--freelist =new_slab_objects // first object of the page obtained from the buddy allocator
|    |--page = new_slab(s, flags, node); // on cpu0
|    |--c = __this_cpu_ptr(s->cpu_slab) // the task may have migrated to cpu1
|    |--if (c->page) flush_slab(s, c);

flush_slab(struct kmem_cache *s, struct kmem_cache_cpu *c)
|-- deactivate_slab(s, c->page, c->freelist)


static void deactivate_slab(struct kmem_cache *s, struct page *page, void *freelist)
{

    /*
     Before the while loop:
     obj_t2 -> obj_t1 -> NULL
     ^
     @page->freelist

     obj_03 -> obj_02 -> obj_01 -> NULL
     ^
     @freelist


     After one iteration:
     obj_03   ->  obj_t2  -> obj_t1 -> NULL
     ^
     page->freelist 


     obj_02 -> obj_01 -> NULL
     ^
     freelist

     After one more iteration:
     obj_02 -> obj_03   ->  obj_t2  -> obj_t1 -> NULL
     ^
     page->freelist


     obj_01 -> NULL
     ^
     freelist

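     Question: why not simply rewrite one pointer to splice the two lists together, instead of moving the objects onto the list one at a time?
     Answer: because counters has to be updated, so the objects must be counted one by one.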
    */
    while (freelist && (nextfree = get_freepointer(s, freelist))) {
        ...
    }

    ....

    /*
    If freelist is NULL here, @freelist was already NULL when it was passed in.
    */
    /* Attach the last object as well. */
    if (freelist) {
        new.inuse--;
        /* 
        new.freelist
        v
        obj_01 -> obj_02 -> obj_03   ->  obj_t2  -> obj_t1 -> NULL
        ^
        freelist
        */
        set_freepointer(s, freelist, old.freelist);
        new.freelist = freelist;
    } else
        ...

    new.frozen = 0;

    // There are several cases such as M_FREE and M_PARTIAL; only M_PARTIAL is covered here.
    ...

    if (m == M_PARTIAL) {
        /* Question: why is this not put_cpu_partial()? */
        add_partial(n, page, tail);

    }

}
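The draining order shown in the diagrams above can be reproduced with a small user-space sketch. It is a simplified model under stated assumptions: struct toy_obj, struct toy_page and toy_drain_cpu_freelist are hypothetical names, the cmpxchg retry loops of the real deactivate_slab are left out, and inuse is a plain int rather than a bit-field inside counters.

```c
#include <assert.h>
#include <stddef.h>

struct toy_obj {
    struct toy_obj *next;       /* stands in for the in-object free pointer */
};

/* Hypothetical stand-in for the fields of struct page used here. */
struct toy_page {
    struct toy_obj *freelist;
    int inuse;
};

/* Move every object of the per-cpu freelist onto page->freelist,
 * one object at a time, decrementing inuse for each of them. */
void toy_drain_cpu_freelist(struct toy_page *page, struct toy_obj *freelist)
{
    struct toy_obj *nextfree;

    /* Stage one: everything except the last object. */
    while (freelist && (nextfree = freelist->next)) {
        freelist->next = page->freelist;
        page->freelist = freelist;
        page->inuse--;
        freelist = nextfree;
    }

    /* Stage two: attach the last object, if there was any object at all. */
    if (freelist) {
        freelist->next = page->freelist;
        page->freelist = freelist;
        page->inuse--;
    }
}
```

Starting from page->freelist = obj_t2 -> obj_t1 and a per-cpu freelist obj_03 -> obj_02 -> obj_01, the result is obj_01 -> obj_02 -> obj_03 -> obj_t2 -> obj_t1, and inuse drops by three, once per moved object.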

SLUB 01: data structures and main flow of the SLUB allocator 2017-01-12

This post is from my old blog: blog.163.com/awaken_ing/blog/static/1206131972016316114114855

Platform: ARM Versatile Express for Cortex-A9 SMP
Kernel version: 3.10.86 (CONFIG_NUMA not defined)

1. Overview

Compared with SLAB, the SLUB allocator tries to remove the metadata overhead inside slabs, reduce the number of caches, and so on. The only metadata present in the SLUB allocator is the in-object “next-free-object” pointer, which allows us to link free objects together. The int offset member of struct kmem_cache gives the offset of this pointer within an object; the pointer points to the next free object. How does the allocator manage to find the first free object? The answer lies in the approach of saving a pointer to such an object inside each page struct associated with the slab page. Unlike SLAB, the SLUB allocator has no full list and no empty list.

Image from http://events.linuxfoundation.org/sites/events/files/slides/slaballocators.pdf

linux-3.10.86/mm/slub.c

slab_alloc() -> slab_alloc_node()
slab_alloc_node(struct kmem_cache *s, ...
{
    struct kmem_cache_cpu *c=__this_cpu_ptr(s->cpu_slab);
    object = c->freelist;
    if fastpath {
        void *next_object = get_freepointer_safe(s, object);
        c->freelist=next_object; //via  this_cpu_cmpxchg_double()
    }else
        ...
    return object;
}
static inline void *get_freepointer(struct kmem_cache *s, void *object)
{
    return *(void **)(object + s->offset);
}
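To make the role of offset concrete, here is a minimal user-space sketch of an in-object free pointer. The names struct toy_cache, toy_init_slab and toy_fastpath_alloc are hypothetical, the slab is an ordinary byte array, and the per-cpu cmpxchg used by the real fast path is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced version of struct kmem_cache. */
struct toy_cache {
    size_t size;    /* size of one object in bytes */
    size_t offset;  /* where the next-free pointer lives inside a free object */
};

/* Read the next-free pointer stored inside a free object. */
static void *toy_get_freepointer(const struct toy_cache *s, void *object)
{
    void *next;

    memcpy(&next, (char *)object + s->offset, sizeof(next));
    return next;
}

/* Store the next-free pointer inside a free object. */
static void toy_set_freepointer(const struct toy_cache *s, void *object, void *fp)
{
    memcpy((char *)object + s->offset, &fp, sizeof(fp));
}

/* Link all objects of a fresh slab together; returns the freelist head. */
void *toy_init_slab(const struct toy_cache *s, void *slab, size_t nr_objects)
{
    char *base = slab;
    size_t i;

    assert(s->offset + sizeof(void *) <= s->size);
    for (i = 0; i < nr_objects; i++) {
        void *next = (i + 1 < nr_objects) ? base + (i + 1) * s->size : NULL;

        toy_set_freepointer(s, base + i * s->size, next);
    }
    return nr_objects ? slab : NULL;
}

/* Fast path: pop the first free object and advance the freelist. */
void *toy_fastpath_alloc(const struct toy_cache *s, void **freelist)
{
    void *object = *freelist;

    if (object)
        *freelist = toy_get_freepointer(s, object);
    return object;
}
```

No memory outside the objects is needed for bookkeeping: once an object is handed out, the bytes that held the pointer simply become part of the caller's data.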

switch_to on ARM Linux 2017-01-11

This post is from my old blog: blog.163.com/awaken_ing/blog/static/12061319720158310574442/

Page 105 of << Professional Linux Kernel Architecture >> has a section titled "Intricacies of switch_to", which goes on at length about switch_to.

Let's look at how this works on ARM Linux.

context_switch(struct rq *rq, struct task_struct *prev,
           struct task_struct *next)
{
...
switch_to(prev, next, prev);
barrier();
 /*
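Suppose task A is switched out and task B runs, i.e. switch_to(A, B, A).
Then this point is what task A, the task that was just scheduled away, executes once it resumes.
To verify this, look at the last instruction that modifies lr before task A gives up the CPU.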
*/
 finish_task_switch(this_rq(), prev);
}
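The point that the code after switch_to runs only when the switched-out task resumes can be illustrated with a user-space analogy. This sketch uses the POSIX ucontext calls (getcontext, makecontext, swapcontext) and assumes a glibc-style Linux environment; it is not how the kernel implements switch_to, and run_switch_demo, task_b and the trace buffer are hypothetical names.

```c
#include <assert.h>
#include <string.h>
#include <ucontext.h>

static ucontext_t ctx_a, ctx_b;
static char trace[64];
static char stack_b[64 * 1024];

/* Append one step to the trace, guarding against overflow. */
static void log_step(const char *step)
{
    assert(strlen(trace) + strlen(step) < sizeof(trace));
    strcat(trace, step);
}

/* "Task B": runs once, then switches back to A. */
static void task_b(void)
{
    log_step("B:run;");
    swapcontext(&ctx_b, &ctx_a);    /* analogous to switch_to(B, A, B) */
}

/* "Task A": switches to B; the line after swapcontext() is reached
 * only when A is resumed, just like the code after switch_to(). */
const char *run_switch_demo(void)
{
    trace[0] = '\0';

    getcontext(&ctx_b);
    ctx_b.uc_stack.ss_sp = stack_b;
    ctx_b.uc_stack.ss_size = sizeof(stack_b);
    ctx_b.uc_link = &ctx_a;
    makecontext(&ctx_b, task_b, 0);

    log_step("A:before;");
    swapcontext(&ctx_a, &ctx_b);    /* analogous to switch_to(A, B, A) */
    log_step("A:after;");           /* runs after A has been resumed */
    return trace;
}
```

The recorded trace is "A:before;B:run;A:after;": the statement following the switch in A does not run until B has switched back.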

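The scheduler, starting from the lost wake-up problem 2017-01-11

This post is from my old blog: blog.163.com/awaken_ing/blog/static/1206131972016124113539444/

0. Introduction

For the lost wake-up problem, see http://www.linuxjournal.com/article/8144
This post mainly explains why the modified code is correct. The modified code is: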

1  set_current_state(TASK_INTERRUPTIBLE);
2  spin_lock(&list_lock);
3  if(list_empty(&list_head)) {
4         spin_unlock(&list_lock); // what if the CPU is given up here?
// what happens if the producer wakes us up at this point?
5         schedule();
6         spin_lock(&list_lock);
7  }
8  set_current_state(TASK_RUNNING);
9
10 /* Rest of the code ... */
11 spin_unlock(&list_lock);
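The question in the comment, what happens if the producer wakes the consumer between the check and schedule(), can be explored with a deterministic single-threaded model. This is a sketch under assumptions: struct toy_task, toy_wake_up, toy_schedule and toy_wakeup_is_lost are hypothetical names, locking is left out, and the "check first, set the state afterwards" ordering is assumed here to be the variant that the fix replaces.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_state { TOY_RUNNING, TOY_INTERRUPTIBLE };

struct toy_task {
    enum toy_state state;
    bool on_runqueue;
};

/* Producer side: a wake-up makes the task runnable again. */
static void toy_wake_up(struct toy_task *t)
{
    t->state = TOY_RUNNING;
    t->on_runqueue = true;
}

/* Consumer side: the task is taken off the runqueue only if its
 * state is still non-running when it calls schedule(). */
static void toy_schedule(struct toy_task *t)
{
    if (t->state != TOY_RUNNING)
        t->on_runqueue = false;
}

/* Replays one interleaving: the producer runs right after the consumer
 * has seen an empty list. Returns true if the consumer ends up asleep
 * although an item is pending, i.e. the wake-up was lost. */
bool toy_wakeup_is_lost(bool set_state_before_check)
{
    struct toy_task consumer = { TOY_RUNNING, true };
    bool list_is_empty = true;
    bool saw_empty;

    if (set_state_before_check)
        consumer.state = TOY_INTERRUPTIBLE;     /* line 1 of the listing */

    saw_empty = list_is_empty;                  /* line 3 of the listing */

    /* The producer adds an item and wakes the consumer right here. */
    list_is_empty = false;
    toy_wake_up(&consumer);

    if (saw_empty) {
        if (!set_state_before_check)
            consumer.state = TOY_INTERRUPTIBLE; /* assumed pre-fix ordering */
        toy_schedule(&consumer);                /* line 5 of the listing */
    }
    return !consumer.on_runqueue && !list_is_empty;
}
```

In this model toy_wakeup_is_lost(true) returns false, because the wake-up resets the state to running before schedule() looks at it, while toy_wakeup_is_lost(false) returns true.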