2019-09-13

Contents:

  • 1. Avoid sudden bursts of compaction that cause excessive latency
  • 2. seek and compaction
  • 2.1 Quantification
  • 3. BloomFilter

  • 1. Avoid sudden bursts of compaction that cause excessive latency

    DBImpl::MakeRoomForWrite
    {
      ...
      if (allow_delay &&
          versions_->NumLevelFiles(0) >= config::kL0_SlowdownWritesTrigger) {
        // We are getting close to hitting a hard limit on the number of
        // L0 files.  Rather than delaying a single write by several
        // seconds when we hit the hard limit, start delaying each
        // individual write by 1ms to reduce latency variance.  Also,
        // this delay hands over some CPU to the compaction thread in
        // case it is sharing the same core as the writer.
        env_->SleepForMicroseconds(1000);
        ...
      }
      ...
    }
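The soft-limit/hard-limit behavior above can be sketched as a standalone decision function. `ClassifyWrite` is a hypothetical helper for illustration; the two trigger constants mirror the defaults in LevelDB's `dbformat.h`:

```cpp
// Stand-ins for LevelDB's config constants (default values).
constexpr int kL0_SlowdownWritesTrigger = 8;   // soft limit
constexpr int kL0_StopWritesTrigger = 12;      // hard limit

// Hypothetical sketch of the throttling decision in MakeRoomForWrite:
// below the soft limit, writes proceed at full speed; between the soft
// and hard limits, each individual write is delayed by 1ms; at the hard
// limit, the writer blocks until compaction catches up.
enum class WritePolicy { kProceed, kDelay1ms, kStop };

WritePolicy ClassifyWrite(int num_level0_files) {
  if (num_level0_files >= kL0_StopWritesTrigger) return WritePolicy::kStop;
  if (num_level0_files >= kL0_SlowdownWritesTrigger) return WritePolicy::kDelay1ms;
  return WritePolicy::kProceed;
}
```

Spreading many 1ms delays across writes keeps latency variance low, instead of one multi-second stall at the hard limit.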
    

    2. seek and compaction

    Code: Version::Get()

    if (last_file_read != nullptr && stats->seek_file == nullptr) {
      // We have had more than one seek for this read.  Charge the 1st file.
    
      stats->seek_file = last_file_read;
      stats->seek_file_level = last_file_read_level;
    }
    

    The FileMetaData of last_file_read gives a key range that contains the target key (smallest < target key < largest), yet the file holds no matching key. So we go on to check other files' FileMetaData.

    DBImpl::Get
    |--current->Get(options, lkey, value, &stats); // Version::Get
    |--current->UpdateStats(stats)
    | |--FileMetaData* f = stats.seek_file;
    | |--f->allowed_seeks--;
    |--if (f->allowed_seeks <= 0)
    | |--file_to_compact_ = f;
    | |--MaybeScheduleCompaction();
    | | |--env_->Schedule(&DBImpl::BGWork, this)
    | | | |--DBImpl::BackgroundCall() -> BackgroundCompaction
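The bookkeeping in the call chain above can be reduced to a small sketch. `FileMetaData` and `ChargeSeek` here are hypothetical stand-ins carrying only the seek budget, not LevelDB's real structs:

```cpp
// Minimal sketch of the seek-triggered compaction bookkeeping done by
// Version::UpdateStats in the call chain above (hypothetical names).
struct FileMetaData {
  int allowed_seeks;  // budget of unproductive seeks before compaction
};

// Charge one seek against the file's budget.  Returns true when the
// budget is exhausted and the file should be scheduled for compaction
// (the point where DBImpl calls MaybeScheduleCompaction).
bool ChargeSeek(FileMetaData* f) {
  f->allowed_seeks--;
  return f->allowed_seeks <= 0;
}
```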
    

    When a file at some level accumulates too many unproductive seeks, it needs compaction.

    2.1 Quantification

    leveldb/db/version_set.cc:

    // We arrange to automatically compact this file after
    // a certain number of seeks.
    

    Question: why is automatic compaction triggered by seeks? (Normally, compaction is triggered by flushing the memtable to disk.)

    Answer: a file recorded in the seek stats can be regarded as having suffered a false hit. If false hits keep happening, we merge this file with others to avoid the wasted accesses. The root cause of such seeks is overlap between level n and level n + 1.
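The false-hit situation can be illustrated with a minimal sketch; `FileRange` and `RangeContains` are hypothetical helpers, not LevelDB's API. A level-n file whose key range contains the target key forces a seek even when the key actually lives in level n + 1:

```cpp
#include <string>

// Hypothetical sketch: a file's [smallest, largest] key range can
// contain the target key even though the file does not hold that key.
// With overlap between level n and level n+1, the lookup first seeks
// into the level-n file (a false hit), then finds the key one level
// down -- this is the seek that gets charged to the level-n file.
struct FileRange {
  std::string smallest, largest;
};

bool RangeContains(const FileRange& f, const std::string& key) {
  return f.smallest <= key && key <= f.largest;
}
```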


    leveldb/db/version_set.cc

    // We arrange to automatically compact this file after
    // a certain number of seeks.  Let's assume:
    //   (1) One seek costs 10ms
    //   (2) Writing or reading 1MB costs 10ms (100MB/s)
    //   (3) A compaction of 1MB does 25MB of IO:
    //         1MB read from this level
    //         10-12MB read from next level (boundaries may be misaligned)
    //         10-12MB written to next level
    // This implies that 25 seeks cost the same as the compaction
    // of 1MB of data (25 seeks = 25 * 10ms = reading or writing 25MB).
    // I.e., one seek costs approximately the
    // same as the compaction of 40KB of data.  We are a little
    // conservative and allow approximately one seek for every 16KB
    // of data before triggering a compaction.
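The comment boils down to one rule, sketched here as a free function (in LevelDB the computation lives inside VersionSet's builder when a file is added): one allowed seek per 16KB of file size, with a floor of 100 so tiny files are not compacted after just a handful of seeks.

```cpp
#include <cstdint>

// One allowed seek per 16KB of data, per the cost model in the comment
// above; the floor of 100 keeps small files from being compacted too
// eagerly.
int AllowedSeeks(uint64_t file_size) {
  int allowed = static_cast<int>(file_size / 16384U);
  return allowed < 100 ? 100 : allowed;
}
```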
    

    3. BloomFilter

    Tests whether an element belongs to a set. It never produces false negatives (false positives are possible).
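A minimal Bloom filter sketch (not LevelDB's actual FilterPolicy, which hashes keys once and derives the remaining probes by rotation) shows why false negatives cannot happen: every bit set by Add is checked by MayContain, so an inserted key always passes.

```cpp
#include <bitset>
#include <cstddef>
#include <functional>
#include <string>

// Toy Bloom filter: k hash positions per key over a fixed bit array.
// Inserted keys are always reported present (no false negatives);
// absent keys may occasionally collide on all k bits (false positive).
class BloomFilter {
 public:
  void Add(const std::string& key) {
    for (std::size_t i = 0; i < kHashes; ++i) bits_.set(Hash(key, i) % kBits);
  }
  bool MayContain(const std::string& key) const {
    for (std::size_t i = 0; i < kHashes; ++i)
      if (!bits_.test(Hash(key, i) % kBits)) return false;  // definitely absent
    return true;  // probably present
  }

 private:
  static constexpr std::size_t kBits = 1024;
  static constexpr std::size_t kHashes = 3;
  // Derive kHashes distinct probe positions from one base hash
  // (a simple stand-in for a real family of hash functions).
  static std::size_t Hash(const std::string& key, std::size_t seed) {
    return std::hash<std::string>{}(key) * (2 * seed + 1) + seed;
  }
  std::bitset<kBits> bits_;
};
```

In LevelDB the filter lets a read skip an SSTable block entirely when the key is definitely absent, trading a small amount of memory for fewer disk seeks.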

    Permalink: https://awakening-fong.github.io/posts/database/leveldb_05_perf_latency

    Please credit the source when reposting: https://awakening-fong.github.io



