Common RCU Pitfalls
2018-03-22
1. Missing lock protection
Given the following structure:
struct your_obj {
    struct hlist_node obj_node_hlist;
    struct rcu_head rcu_head;
    atomic_t refcnt;
    int id;
};
Wrong:
hlist_del_init_rcu(&your_obj->obj_node_hlist);
Correct:
spin_lock(&obj_hash_lock[hash]);
hlist_del_init_rcu(&your_obj->obj_node_hlist);
spin_unlock(&obj_hash_lock[hash]);
Why it is wrong:
rcu_read_lock();
hash = xxx;
/* Four-argument form used throughout this post; kernels since 3.9 drop the 'pos' argument. */
hlist_for_each_entry_rcu(tpos, pos, &obj_hlist[hash], obj_node_hlist) {
    xxx
}
rcu_read_unlock();
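Suppose readers walk the list with hlist_for_each_entry_rcu. RCU only makes readers safe against an updater; it does not serialize updaters against each other. Without the lock, two hlist_del_init_rcu calls running in parallel on the same bucket can corrupt the list, and the traversal then goes wrong.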
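The race described above can be replayed deterministically in userspace. The following is a minimal sketch, not kernel code: the `hlist_node` and `hlist_head` types are simplified stand-ins, and `del_load`, `del_store`, and `b_still_linked` are hypothetical helpers that split one unlocked delete into its load phase and its store phase so the two "CPUs" can be interleaved by hand.

```c
#include <stddef.h>

/* Simplified userspace stand-ins for the kernel's hlist types. */
struct hlist_node { struct hlist_node *next, **pprev; };
struct hlist_head { struct hlist_node *first; };

static void add_head(struct hlist_node *n, struct hlist_head *h)
{
    n->next = h->first;
    if (h->first)
        h->first->pprev = &n->next;
    h->first = n;
    n->pprev = &h->first;
}

/* Values a delete reads before it starts writing. */
struct del_step { struct hlist_node *next, **pprev; };

static struct del_step del_load(struct hlist_node *n)
{
    struct del_step s = { n->next, n->pprev };
    return s;
}

/* The writes of a delete, based on the (possibly stale) loaded values. */
static void del_store(struct hlist_node *n, struct del_step s)
{
    *s.pprev = s.next;
    if (s.next)
        s.next->pprev = s.pprev;
    n->pprev = NULL; /* the "init" part: mark the node unhashed */
}

/*
 * Build head -> a -> b -> c, delete a and b, and report whether b is
 * still reachable from head. interleaved != 0 models two unlocked
 * deleters; interleaved == 0 models deleters serialized by a lock.
 */
static int b_still_linked(int interleaved)
{
    struct hlist_head head = { NULL };
    struct hlist_node a, b, c, *n;

    add_head(&c, &head);
    add_head(&b, &head);
    add_head(&a, &head);

    if (interleaved) {
        struct del_step sa = del_load(&a); /* CPU0 reads */
        struct del_step sb = del_load(&b); /* CPU1 reads */
        del_store(&a, sa);                 /* head.first = b */
        del_store(&b, sb);                 /* writes into the unlinked a */
    } else {
        del_store(&a, del_load(&a));
        del_store(&b, del_load(&b));
    }

    for (n = head.first; n; n = n->next)
        if (n == &b)
            return 1;
    return 0;
}
```

In the interleaved case the second deleter updates `a.next`, but `a` has already left the list, so `b` stays reachable even though its owner believes it was removed. A later free of `b` would then leave readers walking freed memory.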
2. Freeing memory, case 01
Wrong:
void your_obj_unregister(struct your_obj *your_obj)
{
    unsigned int hash = 0;
    hash = your_hash(your_obj->id);
    spin_lock(&obj_hash_lock[hash]);
    hlist_del_init_rcu(&your_obj->obj_node_hlist);
    spin_unlock(&obj_hash_lock[hash]);
    your_obj_put(your_obj);
    return;
}
int your_obj_get_if_registered(struct your_obj *your_obj)
{
    unsigned int hash = 0;
    struct hlist_node *pos;
    struct your_obj *tpos;
    int is_registered = 0;
    rcu_read_lock();
    hash = your_hash(your_obj->id);
    hlist_for_each_entry_rcu(tpos, pos, &obj_hlist[hash], obj_node_hlist) {
        if (tpos == your_obj) {
            is_registered = 1;
            your_obj_get(your_obj);
            break;
        }
    }
    rcu_read_unlock();
    return is_registered ? 0 : 1;
}
void your_obj_put(struct your_obj *t)
{
    if (atomic_dec_and_test(&t->refcnt)) {
        kfree(t);   /* freed immediately, readers may still hold a pointer */
    }
}
Correct:
void your_obj_free_rcu_callback(struct rcu_head *h)
{
    struct your_obj *t = container_of(h, struct your_obj, rcu_head);
    if (!atomic_read(&t->refcnt)) {
        t->obj_node_hlist.next = NULL;
        kfree(t);
    }
}
void your_obj_put(struct your_obj *t)
{
    if (atomic_dec_and_test(&t->refcnt)) {
        /* defer the free until all current readers have finished */
        call_rcu(&t->rcu_head, your_obj_free_rcu_callback);
    }
}
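Why it is wrong:
Without call_rcu(), a node can be unlinked by hlist_del_init_rcu and its memory freed while a reader is still traversing the list. call_rcu() instead postpones the free until every read-side critical section that was already running has ended.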
int your_obj_get_if_registered(struct your_obj *your_obj)
{
    ...
    if (tpos == your_obj) {
        /* suppose your_obj_put -> kfree happens right here */
        is_registered = 1;
        your_obj_get(your_obj); /* writes to freed memory, triggers a BUG */
        break;
    }
}
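The effect of deferring the free can be modelled in a few lines of userspace C. This is only a toy, single-threaded sketch under loud assumptions: `toy_call_rcu`, `toy_grace_period_end`, `toy_put`, and `reader_sees_valid_object` are hypothetical names, and the "grace period" is simply a function the caller invokes once the reader is done. It shows the ordering that call_rcu() provides, not how RCU is implemented.

```c
#include <stdlib.h>

struct toy_obj {
    int refcnt;
    struct toy_obj *next_pending; /* plays the role of struct rcu_head */
};

static struct toy_obj *pending; /* objects waiting for the grace period */
static int freed_count;

/* Model of call_rcu(): queue the object instead of freeing it now. */
static void toy_call_rcu(struct toy_obj *o)
{
    o->next_pending = pending;
    pending = o;
}

/* Model of "all earlier readers have finished": run the deferred frees. */
static void toy_grace_period_end(void)
{
    while (pending) {
        struct toy_obj *o = pending;
        pending = o->next_pending;
        free(o);
        freed_count++;
    }
}

static void toy_put(struct toy_obj *o)
{
    if (--o->refcnt == 0)
        toy_call_rcu(o);
}

/*
 * A reader holds a pointer while the updater drops the last reference.
 * Returns the refcnt the reader observed after the put, or -1 on
 * allocation failure.
 */
static int reader_sees_valid_object(void)
{
    struct toy_obj *o = calloc(1, sizeof(*o));
    int seen;

    if (!o)
        return -1;
    o->refcnt = 1;
    /* reader is inside its read-side critical section, pointing at o */
    toy_put(o);             /* updater drops the last reference */
    seen = o->refcnt;       /* still valid: the free was only queued */
    toy_grace_period_end(); /* reader has left, memory is released now */
    return seen;
}
```

With a direct `free()` in `toy_put`, the read of `o->refcnt` would touch freed memory, which is the bug in the wrong version above. Deferring the free does not make it safe to take a new reference on an object whose count has already reached zero.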
3. Freeing memory, case 02
Consider the following interleaving:
int your_obj_get_if_registered()          |  your_obj_unregister
{                                         |  {
    rcu_read_lock();                      |
                                          |
    hlist_for_each_entry_rcu( ... ) {     |
        if (tpos == your_obj) {           |
            is_registered = 1;            |
                                          |
                                          |      hlist_del_init_rcu
                                          |      your_obj_put
                                          |  }
            /* This get comes after the put on the right. It later causes the
               RCU callback to be registered twice; the second callback frees
               memory that was already freed, hence BUG. */
            your_obj_get(your_obj);
            break;
        }
    }
    rcu_read_unlock();
}
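When an object found under RCU is reference counted, a plain get is most likely wrong: the count may already have dropped to zero. The get has to become kref_get_unless_zero() (or atomic_inc_not_zero() for a bare atomic_t such as refcnt here), and the lookup must treat a failed get as "not found".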
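The "get unless zero" rule can be sketched in userspace with C11 atomics. This is an illustrative model only: `get_unless_zero` and `put_and_test` are hypothetical helpers that mimic the behaviour of the kernel's atomic_inc_not_zero() and atomic_dec_and_test(), not their implementations.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Take a reference only if the count is still non-zero.
 * Returns false when the object is already on its way to being freed.
 */
static bool get_unless_zero(atomic_int *refcnt)
{
    int old = atomic_load(refcnt);

    while (old != 0) {
        /* On failure, 'old' is refreshed with the current value. */
        if (atomic_compare_exchange_weak(refcnt, &old, old + 1))
            return true;
    }
    return false;
}

/* Drop a reference; returns true if this was the last one. */
static bool put_and_test(atomic_int *refcnt)
{
    return atomic_fetch_sub(refcnt, 1) == 1;
}
```

In the lookup from section 3, the reader would then set is_registered only when the conditional get succeeds. Once the unregister path has dropped the count to zero, the reader backs off instead of reviving the object, so the callback is queued once.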
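4. Modules
If a module contains
if (atomic_dec_and_test(&t->refcnt)) {
    call_rcu(&t->rcu_head, your_obj_free_rcu_callback);
}
then rcu_barrier() must be called before the module is unloaded. Otherwise a callback may still be pending when the module goes away; the callback is a function inside the module, whose code has already been unloaded, hence BUG.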
5. References
RCU Usage In the Linux Kernel: One Decade Later https://pdos.csail.mit.edu/6.828/2017/readings/rcu-decade-later.pdf
https://www.kernel.org/doc/Documentation/RCU/checklist.txt
Original post: https://awakening-fong.github.io/posts/other/rcu
Please credit the source when reposting: https://awakening-fong.github.io