2023-11-21

Excerpted from "Rust Atomics and Locks".

1 release-acquire and fence

1.1 Splitting

The store of a release-acquire relationship,

a.store(1, Release);

can be substituted by a release fence followed by a relaxed store:

fence(Release);
a.store(1, Relaxed);

Similarly, the load of a release-acquire relationship,

a.load(Acquire);

can be substituted by a relaxed load followed by an acquire fence:

a.load(Relaxed);
fence(Acquire);
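
To see both substitutions working together, here is a minimal sketch of a producer and a consumer that synchronize only through fences. The names DATA, READY and run are made up for this illustration and do not come from the book.

```rust
use std::sync::atomic::Ordering::{Acquire, Relaxed, Release};
use std::sync::atomic::{fence, AtomicBool, AtomicU64};
use std::thread;

// Hypothetical names for this sketch: DATA is the payload, READY is the flag.
static DATA: AtomicU64 = AtomicU64::new(0);
static READY: AtomicBool = AtomicBool::new(false);

fn run() -> u64 {
    let producer = thread::spawn(|| {
        DATA.store(123, Relaxed);
        // Release fence + relaxed store, in place of READY.store(true, Release).
        fence(Release);
        READY.store(true, Relaxed);
    });

    // Relaxed load + acquire fence, in place of READY.load(Acquire).
    while !READY.load(Relaxed) {
        std::hint::spin_loop();
    }
    fence(Acquire);
    // The two fences synchronize, so the store to DATA is visible here.
    let value = DATA.load(Relaxed);

    producer.join().unwrap();
    value
}

fn main() {
    println!("data = {}", run());
}
```

The acquire fence sits after the loop, so it runs once rather than on every iteration of the spin.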

For example, if we load a pointer from an atomic variable using acquire memory ordering, we could use a fence to apply the acquire ordering only when the pointer is not null:

Using an acquire-load:

let p = PTR.load(Acquire);
if p.is_null() {
    println!("no data");
} else {
    println!("data = {}", unsafe { *p });
}

Rewritten using a conditional acquire fence:

let p = PTR.load(Relaxed);
if p.is_null() {
    println!("no data");
} else {
    fence(Acquire);
    println!("data = {}", unsafe { *p });
}

This can be beneficial if the pointer is often expected to be null, to avoid acquire memory ordering when not necessary.
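
The snippets above leave out the declaration of PTR and the publishing side. A self-contained sketch could look like the following; the pointee type u64 and the helper names publish and read are assumptions for illustration, and the allocation is leaked on purpose so the pointer stays valid.

```rust
use std::ptr;
use std::sync::atomic::Ordering::{Acquire, Relaxed, Release};
use std::sync::atomic::{fence, AtomicPtr};

// Hypothetical global pointer; null until something is published.
static PTR: AtomicPtr<u64> = AtomicPtr::new(ptr::null_mut());

// Publish a value. The Box is intentionally leaked so that readers can
// dereference the pointer for the rest of the program.
fn publish(value: u64) {
    let p = Box::into_raw(Box::new(value));
    PTR.store(p, Release);
}

// Read the value, paying for acquire ordering only on the non-null path.
fn read() -> Option<u64> {
    let p = PTR.load(Relaxed);
    if p.is_null() {
        None
    } else {
        fence(Acquire);
        // Safety: p is non-null, was stored with Release after the value
        // was initialized, and is never freed in this sketch.
        Some(unsafe { *p })
    }
}

fn main() {
    println!("{:?}", read());
    publish(42);
    println!("{:?}", read());
}
```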

awakening-fong's note: a fence does not affect the values read by the instructions before it.

1.2 Corresponding assembly

pub fn a() {
    fence(Acquire);
}

Compiled x86-64:

a:
    ret

Compiled ARM64:

a:
    dmb ishld
    ret

Unsurprisingly, release and acquire fences on x86-64 do not result in any instruction: in the output above, fence(Acquire) compiles to nothing but ret. We get release and acquire semantics “for free” on this architecture. Only a SeqCst fence results in an mfence (memory fence) instruction. This instruction makes sure that all memory operations before it have been completed before continuing.

dmb ish stands for data memory barrier, inner shared domain. The ld suffix in dmb ishld, as generated above, restricts the barrier to loads.

2 cas

Remember that stxr is allowed to have false negatives; it might fail here even if the five wasn’t overwritten. That’s okay, because we’re using compare_exchange_weak, which is allowed to have false negatives too.

awakening-fong's note: x.compare_exchange_weak(5, 6, Relaxed, Relaxed) may fail spuriously, behaving as if someone else had modified the old value in the meantime when in fact nobody did.
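
Because of these false negatives, compare_exchange_weak is normally called in a retry loop. A minimal sketch, where the function name increment is made up for this example:

```rust
use std::sync::atomic::AtomicU32;
use std::sync::atomic::Ordering::Relaxed;

// Atomically add one and return the previous value.
fn increment(x: &AtomicU32) -> u32 {
    let mut current = x.load(Relaxed);
    loop {
        let new = current.wrapping_add(1);
        match x.compare_exchange_weak(current, new, Relaxed, Relaxed) {
            Ok(previous) => return previous,
            // Either another thread really changed the value, or the failure
            // was spurious. Both cases are handled the same way: retry with
            // the value that was actually observed.
            Err(actual) => current = actual,
        }
    }
}

fn main() {
    let x = AtomicU32::new(5);
    let previous = increment(&x);
    println!("previous = {}, now = {}", previous, x.load(Relaxed));
}
```

The loop does not need to tell a spurious failure apart from a real one, which is why the weak variant is sufficient here.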

On most processor architectures, the instruction(s) of compare_exchange will claim exclusive access of the relevant cache line regardless of whether the comparison succeeds or not. This means that it can be beneficial to not use compare_exchange (or swap) in a spin loop like we did for our SpinLock in Chapter 4, but instead use a load operation first to check if the lock has been unlocked. That way, we avoid unnecessarily claiming exclusive access to the relevant cache line.
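
A sketch of what such a load-first spin lock might look like. This is not the book's Chapter 4 SpinLock; the type below and the count_with_lock helper are simplified stand-ins written for this note.

```rust
use std::sync::atomic::Ordering::{Acquire, Relaxed, Release};
use std::sync::atomic::{AtomicBool, AtomicU32};
use std::thread;

pub struct SpinLock {
    locked: AtomicBool,
}

impl SpinLock {
    pub const fn new() -> Self {
        Self { locked: AtomicBool::new(false) }
    }

    pub fn lock(&self) {
        loop {
            // Wait with plain loads, which do not need exclusive access
            // to the cache line.
            while self.locked.load(Relaxed) {
                std::hint::spin_loop();
            }
            // The lock looked free: only now attempt the compare-exchange.
            if self
                .locked
                .compare_exchange_weak(false, true, Acquire, Relaxed)
                .is_ok()
            {
                return;
            }
        }
    }

    pub fn unlock(&self) {
        self.locked.store(false, Release);
    }
}

// Hypothetical helper: several threads bump a counter under the lock.
fn count_with_lock(threads: u32, iterations: u32) -> u32 {
    let lock = SpinLock::new();
    let counter = AtomicU32::new(0);
    thread::scope(|s| {
        for _ in 0..threads {
            s.spawn(|| {
                for _ in 0..iterations {
                    lock.lock();
                    // Deliberately a separate load and store: only the lock
                    // keeps this read-modify-write from losing updates.
                    let v = counter.load(Relaxed);
                    counter.store(v + 1, Relaxed);
                    lock.unlock();
                }
            });
        }
    });
    counter.load(Relaxed)
}

fn main() {
    println!("count = {}", count_with_lock(4, 1000));
}
```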

3 Miscellaneous

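Pipelining: Before an instruction finishes executing, the processor might already start executing the next one. awakening-fong's note: in a pipeline, a stage naturally has to finish before the work is handed to the next stage, so the execution of a single instruction is split into several steps/stages. Modern processors can often start the execution of quite a few instructions in series while the first one is still in progress. awakening-fong's note: what is described here has nothing to do with multiple cores.

In a mutex implementation, one can spin for a short while first and, if the lock still has not been acquired, switch to waiting.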
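
A rough sketch of the spin-then-wait idea, built on top of std::sync::Mutex rather than on a hand-written futex-based mutex. The function name lock_spin_then_wait and the SPIN_LIMIT value are assumptions chosen for illustration, and the standard library mutex may already do something similar internally.

```rust
use std::sync::{Mutex, MutexGuard, TryLockError};

// Hypothetical spin budget; a real implementation would tune this.
const SPIN_LIMIT: u32 = 100;

fn lock_spin_then_wait<T>(m: &Mutex<T>) -> MutexGuard<'_, T> {
    // Phase 1: spin briefly, hoping the lock is released soon.
    for _ in 0..SPIN_LIMIT {
        match m.try_lock() {
            Ok(guard) => return guard,
            Err(TryLockError::WouldBlock) => std::hint::spin_loop(),
            // This sketch ignores poisoning and just takes the guard.
            Err(TryLockError::Poisoned(e)) => return e.into_inner(),
        }
    }
    // Phase 2: still locked, so block until the lock becomes available.
    m.lock().unwrap_or_else(|e| e.into_inner())
}

fn main() {
    let m = Mutex::new(1);
    {
        let mut guard = lock_spin_then_wait(&m);
        *guard += 1;
    }
    println!("value = {}", *lock_spin_then_wait(&m));
}
```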

The most interesting part about the RCU pattern is the last step, which does not have a letter in the acronym: deallocating the old data. awakening-fong's note: the letters of the RCU acronym say nothing about deallocation, yet deallocation is the most interesting part. After a successful update, other threads might still be reading the old copy, if they read the pointer before the update. You’ll have to wait for all those threads to be done before the old copy can be deallocated.

Article URL: https://awakening-fong.github.io/posts/other/rust_atomics_and_locks

Please credit the source when reposting: https://awakening-fong.github.io

