MMR是什么
merkle tree一种二叉树也是区块链中一种常见的数据结构,其特性就是树的根及中间节点主要是由其左右子树的Hash构成。Parent = H(0,1),其以密码学保证其安全性,以相同顺序插入才能计算出最终一致的树根。
而mmr(Merkle Mountain Range)是Peter Todd提出的一种Merkle tree,长相类似一组连续的山峰组成,其被设计为节点插入后就不能被修改,支持动态插入。
对于普通Merkle树对于每个新加入节点都需要重新计算merkle root,如果节点数量很大的话这个计算量会非常巨大,而mmr支持动态加入新节点并计算root。
MMR的组成
上图mmr包含了3个叶子节点,叶子节点的插入顺序就是叶子节点的标识(节点的存储位置),插入
节点1
后由于出现相同高度的树,则将节点0,1合并(2 = merge(0,1))生成父节点2
,然后再插入节点3
由于高度不一一致,无需树的合并,最后生成如图所示的MMR。
由上图可以发现,以存储索引位置作为其坐标的二叉树,都有左子树与父节点的距离(offset)为offset=2^Height
,兄弟节点之间的距离为offset=2^Height - 1
,这样就可以计算出任意节点的兄弟节点与父节点的坐标。
另外如果我们能够计算出任意节点的高度,我们就能计算出任意节点的父节点及兄弟节点的坐标了,将节点坐标从1
开始并以二进制
来表示。如图:
任意高度的
最左侧节点
的1
的数量就是其高度,由于最左侧子树每次合并右子树生成父节点增加高度,实际上是左节点坐标n计算出父节点坐标n<<1 + 1
,这样计算任意节点的高度就可以表示为将节点向最左侧移动
,即当其坐标全为1
时,其数量就时其高度。如下代码获取任意坐标的所在高度:
func countZore(num uint64) int {
return bits.UintSize - bits.OnesCount64(num)
}
func leadingZeros(num uint64) int {
return bits.LeadingZeros64(num)
}
func allOnes(num uint64) bool {
return num != 0 && countZore(num) == leadingZeros(num)
}
func jumpLeft(pos uint64) uint64 {
bitLength := 64 - leadingZeros(pos)
mostSignificantBits := uint64(1) << uint64(bitLength-1)
return pos - (mostSignificantBits - 1)
}
func pos_height_in_tree(pos uint64) int {
pos += 1
for !allOnes(pos) {
pos = jumpLeft(pos)
}
return 64 - leadingZeros(pos) - 1
}
现在我们可以顺序追加节点了,我们只需要判断下一个节点的高度,如果大于当前高度则需要合并左右子树,方法如下:
func (m *mmr) append(n *Node) *Node {
height, pos := 0, m.cur_size
n.index = pos
m.values = append(m.values, n)
for pos_height_in_tree(pos+1) > height {
pos++
// calculate pos of left child and right child
left_pos := pos - parent_offset(height)
right_pos := left_pos + sibling_offset(height)
left, right := m.values[left_pos], m.values[right_pos]
parent := &Node{index: pos}
merge(parent, left, right)
m.values = append(m.values, parent)
height++
}
m.cur_size = pos + 1
return n
}
MMR的证明
由图2可以知道,MMR可能会有多个山峰
,而MMR的root是由最右侧的山峰依次向左合并,直到最后形成root,这个操作也被称为山峰的拱起
操作。图2中的root=Hash(Hash(18,17),14)
。
MMR的root是由山峰的拱起
得到,那么最左侧的山峰一定一个完全的二叉树,节点数量为2^Height - 1
,由此我们可以在固定节点数量下(Count)不断尝试左侧山峰的高度,找到2^Height - 1 < Count
的最大的树,如下:
func left_peak_pos_by_height(height int) uint64 {
// -2 表示获取从0开始的山峰的坐标
return (uint64(1) << uint64(height+1)) - 2
}
func left_peak_height_pos(mmrSize uint64) (int, uint64) {
height := 0
prev_pos := uint64(0)
pos := left_peak_pos_by_height(height)
//不断增加山峰的高度,直到最后一次山峰的坐标超过节点总数
for pos < mmrSize {
height += 1
prev_pos = pos
pos = left_peak_pos_by_height(height)
}
return height - 1, prev_pos
}
在计算出左侧山峰后,可以以此为坐标,依次计算出右侧的所有山峰,如下:
func get_right_peak(height int, peakPos, mmrSize uint64) (int, uint64) {
//jump to right sibling
peakPos += sibling_offset(height)
//jump to left child
for {
if peakPos <= mmrSize-1 {
break
}
if height == 0 {
//no right peak exists
return height, 0
}
height -= 1
peakPos -= parent_offset(height)
}
return height, peakPos
}
func get_peaks(mmrSize uint64) []uint64 {
res := make([]uint64, 0, 0)
height, pos := left_peak_height_pos(mmrSize)
res = append(res, pos)
for height > 0 {
height, pos = get_right_peak(height, pos, mmrSize)
if height == 0 && pos == 0 {
break
}
res = append(res, pos)
}
return res
}
获取到所有山峰后,就可以对所有山峰,由左到右依次拱起
,最后得到MMR的root。如下:
func (m *mmr) getRoot() Hash {
if m.cur_size == 0 {
return Hash{0}
}
if m.cur_size == 1 {
return m.values[0].getHash()
}
return m.bag_rhs_peaks(0, get_peaks(m.cur_size))
}
func (m *mmr) bag_rhs_peaks(pos uint64, peaks []uint64) Hash {
rhs_peak_hashes := make([]Hash, 0, 0)
for _, v := range peaks {
if v > pos {
rhs_peak_hashes = append(rhs_peak_hashes, m.values[v].getHash())
}
}
for len(rhs_peak_hashes) > 1 {
last := len(rhs_peak_hashes) - 1
right := rhs_peak_hashes[last]
rhs_peak_hashes = rhs_peak_hashes[:last]
last = len(rhs_peak_hashes) - 1
left := rhs_peak_hashes[last]
rhs_peak_hashes = rhs_peak_hashes[:last]
rhs_peak_hashes = append(rhs_peak_hashes, merge2(right, left))
}
if len(rhs_peak_hashes) == 1 {
return rhs_peak_hashes[0]
} else {
return Hash{0}
}
}
merkle proof
构造叶子节点的 merkle proof,分三个步骤:
构造该叶子到山峰的proof
将右侧的山峰加入proof
将左侧的山峰从右到左加入proof.
如下:
func (m *mmr) gen_proof(pos uint64) *MerkleProof {
proofs := make([]Hash, 0, 0)
height := 0
for pos < m.cur_size {
pos_height, next_height := pos_height_in_tree(pos), pos_height_in_tree(pos+1)
if next_height > pos_height {
// get left child sib
sib_pos := pos - sibling_offset(height)
// break if sib is out of mmr
if sib_pos >= m.cur_size {
break
}
proofs = append(proofs, m.values[sib_pos].getHash())
// goto parent node
pos = pos + 1
} else {
// get right child
sib_pos := pos + sibling_offset(height)
// break if sib is out of mmr
if sib_pos >= m.cur_size {
break
}
proofs = append(proofs, m.values[sib_pos].getHash())
// goto parent node
next_pos := pos + parent_offset(height)
pos = next_pos
}
height += 1
}
// now pos is peak of the mountain(because pos can't find a sibling)
peak_pos := pos
peaks := get_peaks(m.cur_size)
// bagging rhs peaks into one hash
rhs_peak_hash := m.bag_rhs_peaks(peak_pos, peaks)
if !equal_hash(rhs_peak_hash, Hash{0}) {
proofs = append(proofs, rhs_peak_hash)
}
// put left peaks to proof
for i := len(peaks) - 1; i >= 0; i-- {
p := peaks[i]
if p < pos {
proofs = append(proofs, m.values[p].getHash())
}
}
return newMerkleProof(m.cur_size, proofs)
}
merkle verify
proof的验证,以相同的顺序重新计算Merkle Root就可以,如下:
func (m *MerkleProof) verify(root Hash, pos uint64, leaf_hash Hash) bool {
peaks := get_peaks(m.mmrSize)
height := 0
for _, proof := range m.proofs {
// verify bagging peaks
if pos_in_peaks(pos, peaks) {
if pos == peaks[len(peaks)-1] {
leaf_hash = merge2(leaf_hash, proof)
} else {
leaf_hash = merge2(proof, leaf_hash)
pos = peaks[len(peaks)-1]
}
continue
}
// verify merkle path
pos_height, next_height := pos_height_in_tree(pos), pos_height_in_tree(pos+1)
if next_height > pos_height {
// we are in right child
leaf_hash = merge2(proof, leaf_hash)
pos += 1
} else {
leaf_hash = merge2(leaf_hash, proof)
pos += parent_offset(height)
}
height += 1
}
return equal_hash(leaf_hash, root)
}
MMR的作用
MMR可以极大的减少merkle证明的数据量,可以大幅度的减轻存储和网络的负担,提升验证效率,目前Open timestamp 和 Grin 等项目及Fly client的论文中都使用了MMR的证明。