Outline
0. Overview
1. Amplification
2. Size-tiered Compaction
3. Leveled Compaction
4. Summary
5. Lucene Merge Policy
6. Reference
Overview
Compaction operations are expensive in terms of CPU, memory, and Disk I/O,而由于immutable特质,该操作在LSM架构上有必不可少。
data过来之后会写到memory table (MemTable),当mem满了之后,会flush到disk形成不可变的immutable Sorted String Table (SSTable)。当SSTable太多,os所打开的文件句柄也会过多,所以此时需要将多个同质的SSTable合并
成一个SSTable。
Amplification
- 写放大:一份数据被顺序写几次(还是那一份数据,只是需要一直往下流)。第一次写到L0,之后由于compaction可能在L1~LN各写一次
- 读放大:一个query会在多个sstable,而这些sstable分布在L0~LN
- 空间放大:一份原始数据被copy/overwritten(一份用于服务query,一份用于snapshot之后用于compaction)
Size-tiered Compaction
- Triggered when the system has enough similarly sized SSTables, merged together to form one larger sstable. A disadvantage of this strategy is that very large SSTable will stay behind for a long time and utilize a lot of disk space (recommended for write-intensive workloads)
- 每一个tiers的单片大小逐渐变大,但是每一个tiers的sstables数量一致
- 如果某一个tier满了(即sstables数量达到阈值)就会进行compaction,从而将该tier的所有数据merge为一个然后丢给下一个tier作为下一个tier的一个sstable。而在这个merge的过程,会copy一份原数据snapshot用于merge,merge之后再删除
Leveled Compaction
- Triggered when unleveled SSTables (newly flushed SSTable files in Level 0) exceeds 4 (recommended for read-intensive workloads)
- 每一个tier里面的 sstable大小都是一致的,区别是每一个tier的sstable数量是逐渐变大的(一个数量级)
- tier1里面的sstables会跟tier2的sstables一起进行merge操作,最终在tier2(量大者)上形成一个有序的sstable
Summary
- data in one SSTable which is later modified or deleted in another SSTable wastes space as both tables are present in the system
- when data is split across many SSTables, read requests are processed slower as many SSTables need to be read
Lucene Merge Policy
这里同样有三个放大问题,
- 写放大(doc在segment之间的迁移)
- 读放大(doc不同版本在不同segment,而打开一个segment需要一个indexReader)
- 空间放大(segments tmp空间)
lucene write-once vs. random write