LSM-Tree（57）

5. Cost-Performance Comparisons with Other Access Methods（3）

**Definition 5.1.** We say that the index structure of a disk-based access method has the property of being a Continuum Structure if the indexing scheme provides for immediate placement of a newly inserted index entry in its ultimate collation order, based on key-value, with all other entries already present.
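In other words, an index structure is a Continuum Structure when every newly inserted entry is placed at once in its final position, in key-value order, among the entries already present.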
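The definition can be illustrated with a minimal in-memory sketch. It is not taken from the paper: the class names `ContinuumIndex` and `DeferredIndex` are hypothetical, and a Python list stands in for the disk pages. The first class places each entry at its final sorted position on insert; the second buffers entries and sorts them later, so by the definition it would not qualify.

```python
import bisect


class ContinuumIndex:
    """Toy stand-in for a continuum structure (hypothetical name).

    Every insert puts the key at its final position in key order.
    """

    def __init__(self):
        self.entries = []  # always kept sorted

    def insert(self, key):
        # bisect_right places the key after any equal keys already present.
        pos = bisect.bisect_right(self.entries, key)
        self.entries.insert(pos, key)
        return pos


class DeferredIndex:
    """Toy counter-example (hypothetical name).

    Inserts go to an unsorted buffer and reach key order only at merge time.
    """

    def __init__(self):
        self.sorted_entries = []
        self.buffer = []

    def insert(self, key):
        self.buffer.append(key)  # not yet in its ultimate collation order

    def merge(self):
        self.sorted_entries = sorted(self.sorted_entries + self.buffer)
        self.buffer = []
```

With keys modeled as `(acct_id, timestamp)` tuples, `ContinuumIndex().insert((5, 1))` returns the position the entry will keep relative to the keys already present, whereas `DeferredIndex` only reaches that order after `merge()`.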

Recall that successive transactions in the TPC benchmark application have Acct-ID values generated at random from each of 100,000,000 possible values. By Definition 5.1, each new entry insert of an Acct-ID||Timestamp index will be placed in a pretty much random position on one of 2.3 million pages of entries that already exist. In a B-tree, for example, the 576,000,000 accumulated entries will contain on the average 5.76 entries for each Acct-ID; presumably each entry with the same Acct-ID has a distinct Timestamp. Each new entry insert will therefore be placed on the right of all entries with the same Acct-ID. But this still leaves 100,000,000 points of insert randomly chosen, which certainly implies that each new insert will be on a random one of the 2.3 million pages of existing entries. In an extendible hashing scheme [9], by contrast, new entries have a collation order calculated as a hash value from the Acct-ID||Timestamp key-value, and clearly any placement of a new entry in sequence with all entries already present is equally likely.
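To restate: each transaction draws its Acct-ID at random from 100,000,000 possible values, so by the definition above every new Acct-ID||Timestamp entry lands at an essentially random position among the 2.3 million pages of existing entries. In a B-tree, the 576,000,000 accumulated entries average 5.76 per Acct-ID, and a new entry goes to the right of the existing entries with the same Acct-ID (assuming distinct Timestamps), which still leaves 100,000,000 randomly chosen insert points. With extendible hashing [9], the collation order is a hash of the Acct-ID||Timestamp key-value, so every position among the existing entries is equally likely.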
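The arithmetic behind this paragraph can be checked with a short sketch. The three totals come from the text; everything else is an assumption made for illustration: the function name `simulate_page_hits` is hypothetical, and the model assumes the Acct-ID key space is spread evenly over the pages in key order, which the source does not state.

```python
import random
from collections import Counter

# Figures quoted in the text.
TOTAL_ENTRIES = 576_000_000
ACCT_IDS = 100_000_000
PAGES = 2_300_000

entries_per_acct = TOTAL_ENTRIES / ACCT_IDS  # average entries per Acct-ID
entries_per_page = TOTAL_ENTRIES / PAGES     # average entries per page


def simulate_page_hits(num_inserts, acct_ids, pages, seed=0):
    """Count how many inserts land on each page in a scaled-down model.

    Assumption: Acct-IDs are divided evenly over the pages in key order,
    so the page is determined by the Acct-ID alone.
    """
    rng = random.Random(seed)
    hits = Counter()
    for _ in range(num_inserts):
        acct = rng.randrange(acct_ids)   # random Acct-ID, as in the benchmark
        page = acct * pages // acct_ids  # page holding that Acct-ID's entries
        hits[page] += 1
    return hits
```

Here `entries_per_acct` reproduces the 5.76 figure from the text, and `entries_per_page` comes out at roughly 250. Calling `simulate_page_hits(10_000, 1_000_000, 100)` spreads the inserts over the pages with no page favored, which is the sense in which each new insert falls on a random existing page.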

TODO: translate this myself.
