一致性 hash 算法理解与实现

前言

近段时间在了解分布式时,经常绕不开一个算法: 一致性哈希算法。于是在了解并实践这个算法后,就有了此文章。

算法间的对比

在分布式分片中,存在着几种算法: 取模,分段,一致性 hash。

取模 分段 一致性哈希
上层是否感知
迁移成本 低,只涉及相邻节点
单点故障影响 低,只影响相邻节点
算法复杂度
热点数据 存在 存在 存在

一致性哈希主要解决问题

从上述对比可知,一致性哈希主要降低节点上下线中带来的数据迁移成本,同时节点数量的变化与分片原则于上层无感,使上层更专注于领域内逻辑的编写,使整体架构更加灵活。

一致性 hash 原理

  1. 基本数据结构

​ 基本数据类型因人而已,常规的哈希取模采用大多采用将数据 hash 到 2^32 - 1的空间中,一致性哈希通常在此基础上将空间变成一个环。如下图所示。

image

​ 本次实现采用的是 key 按大小排列的哈希表。原打算使用数组承接数据,但排序成本随 key 的增多而加大,遂放弃。

  1. 数据存储

    数据存储与哈希取模算法大致相同,都是通过计算存入数据的哈希值放入对应的哈希槽上。但一致性哈希差异之处在于当计算 hash 不在环上,数据存入首个 hash 槽中。

    场景假设: 现已上线 4 节点(server1 ~ 4),对应 hash 值为 hash1 ~ 4。现有5个数据(hash1 ~ 5)于存入节点中,结果如下图所示。

    image

​ 本次实现采用的思路是

1. 计算存入数据的 hash 值
2. 寻找最近的(比数据 hash 值大的最小的节点 hash)节点并写入
3. 若 2 中未能寻找服务器,则写入第一个(hash 最小)节点中
  1. 节点上线

​ 新节点加入一致性哈希环中,原理是通过计算节点所代表的 hash 值,并根据计算值将节点映射在环上,最后迁移相邻节点数据到新节点上。

​ 场景假设: 现已上线 4 台服务器(server1 ~ 4),对应 hash 值为 hash1 ~ 4。现有一个新节点(hash5)节点上线到环上。结果如下图所示。

image

​ 本次实现采用的思路是

1. 计算上线节点 hash 值
2. 计算上线节点所新增的虚拟节点的 hash 值(若初始化指定虚拟节点数量)
3. 寻找最近的(比上线节点与虚拟节点 hash 值大的最小的节点 hash)节点,取出节点数据
4. 将1 2点节点加入到 hash 环中
5. 将 3 中取出的数据重新放入到 hash 环上
  1. 节点下线

​ 已有节点下线,原理是通过计算节点所代表的 hash 值,取出节点所含数据,下线节点,将取出数据重新放入 hash 环上。

​ 场景假设: 现已上线 5 台服务器(server1 ~ 5),对应 hash 值为 hash1 ~ 5。现节点 server4 下线。结果如下图所示。

image

​ 本次实现采用的思路是

1. 计算下线节点 hash 值
2. 取出下线节点以及虚拟节点(若初始化指定虚拟节点数量)存储数据
3. 将下线节点以及虚拟节点(若初始化指定虚拟节点数量)从 hash 环上移除
4. 将 2 中数据重新放入到环上

代码实现

一致性哈希分为两个方案: 不带虚拟节点与带虚拟节点。而两个方案实现类似,所以本次实现将两种方案合在一起实现。实现如下。

package org.CommonAlgorithms.ConsistentHash;

import org.CommonAlgorithms.HashAlgorithm.HashService;
import java.util.List;

/**
 * consistentHashing interface
 * @author cartoon
 * @since  10/01/2021
 * @version 1.1
 */
public interface ConsistentHashing {

    /**
     * put data to hash loop
     * @param data data list
     * @return put result
     */
    boolean putData(List<String> data);

    /**
     * put data to hash loop
     * @param data data
     * @return put result
     */
    boolean putData(String data);

    /**
     * remove node from hash loop
     * @param node removing node
     * @return remove result
     */
    boolean removeNode(String node);

    /**
     * add node to hash loop
     * @param node adding node
     * @return add result
     */
    boolean addNode(String node);

    /**
     * inject hash method to hash loop
     * @param hashService hash method
     * @throws UnsupportedOperationException if loop already has node
     */
    void setHashMethod(HashService hashService);

    /**
     * print all data in loop according ascending hash value with nodes
     */
    void printAllData();

}


package org.CommonAlgorithms.ConsistentHash;

import org.CommonAlgorithms.HashAlgorithm.HashService;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import java.util.*;

/**
 * consistent hash achieve
 * @author cartoon
 * @since  2021/01/17
 */
public class ConsistentHashingImpl implements ConsistentHashing {

    private static final Logger log = LoggerFactory.getLogger(ConsistentHashingImpl.class);

    /**
     * virtual node name template
     */
    private static final String virtualNodeFormat = "%s&&%d";

    /**
     * real node and its virtual node mapping
     */
    private SortedMap<String, List<String>> realNodeToVirtualNode;

    /**
     * hash and its node mapping
     */
    private SortedMap<Integer, String> hashToNodes;

    /**
     * node and its data mapping
     */
    private Map<String, List<String>> nodeToData;

    /**
     * determine virtual node's number of each node
     */
    private int virtualNodeNum;

    /**
     * inject hash method, if null, use loop default hash method
     */
    private HashService hashService;


    public ConsistentHashingImpl() {
        this(0, new String[0]);
    }

    public ConsistentHashingImpl(String... nodes) {
        this(0, nodes);
    }

    public ConsistentHashingImpl(int virtualNodeNum) {
        this(virtualNodeNum, new String[0]);
    }

    public ConsistentHashingImpl(int virtualNodeNum, String... nodes) {
        //1. intercept virtual num smaller than 0
        if(virtualNodeNum < 0){
            log.error("virtual num is not allow smaller than 0");
            throw new IllegalArgumentException();
        }
        //2. initialize loop member attributes
        this.virtualNodeNum = virtualNodeNum;
        realNodeToVirtualNode = new TreeMap<>();
        hashToNodes = new TreeMap<>();
        nodeToData = new HashMap<>();
        for(String server : nodes){
            hashToNodes.put(getHash(server), server);
            nodeToData.put(server, new LinkedList<>());
        }
        //3. if virtual node number bigger than 0, add virtual node
        if(virtualNodeNum > 0){
            for(String server : nodes){
                addVirtualNode(server);
            }
        }
    }

    @Override
    public boolean putData(List<String> data) {
        //1. circulate call put data method to add data to loop
        for(String incomingData : data){
            if(!putData(incomingData)){
                return false;
            }
        }
        return true;
    }

    @Override
    public boolean putData(String data) {
        if(hashToNodes.isEmpty()){
            log.error("put data, usable server is empty");
            return false;
        }
        //1. calculate data's hash value
        int currentHash = getHash(data);
        //2. get usual node(node's hash value is bigger than data's hash value), if usual node list is empty, get first node in loop
        SortedMap<Integer, String> usableNodes = hashToNodes.tailMap(currentHash);
        String node = usableNodes.isEmpty() ? hashToNodes.get(hashToNodes.firstKey()) : usableNodes.get(usableNodes.firstKey());
        //3. add data to node
        List<String> dataList = nodeToData.get(node);
        dataList.add(data);
        log.info("put data, data {} is placed to server {}, hash: {}", data, node, currentHash);
        return true;
    }

    @Override
    public boolean removeNode(String node) {
        //1. calculate hash value of removing node
        int removeServerHash = getHash(node);
        if(!hashToNodes.containsKey(removeServerHash)){
            log.error("remove server, current server is not in server list, please check server ip");
            return false;
        }
        //2. get data from removing node
        List<String> removeServerData = nodeToData.get(node);
        //3. get removing node's virtual node data, remove all virtual node with removing node
        if(virtualNodeNum != 0){
            for(String virtualNode : realNodeToVirtualNode.get(node)){
                removeServerData.addAll(nodeToData.get(virtualNode));
                hashToNodes.remove(getHash(virtualNode));
                nodeToData.remove(virtualNode);
            }
        }
        //4. remove node from hash loop
        hashToNodes.remove(removeServerHash);
        nodeToData.remove(node);
        if(hashToNodes.size() == 0){
            log.info("remove server, after remove, server list is empty");
            return true;
        }
        //5. put data to loop by call put data method
        putData(removeServerData);
        log.info("remove server, remove server {} success", node);
        return true;
    }

    @Override
    public boolean addNode(String node) {
        //1, calculate adding node's hash value
        int addServerHash = getHash(node);
        //2. add node and migrate data
        if(hashToNodes.isEmpty()){
            //2.1 add node and its virtual node to loop directly when current loop is empty
            hashToNodes.put(addServerHash, node);
            nodeToData.put(node, new LinkedList<>());
            if(virtualNodeNum > 0){
                addVirtualNode(node);
            }
        } else{
            //2.2.1 get data to be migrated from loop
            SortedMap<Integer, String> greatServers = hashToNodes.tailMap(addServerHash);
            String greatServer = greatServers.isEmpty() ? hashToNodes.get(hashToNodes.firstKey()) : greatServers.get(greatServers.firstKey());
            List<String> firstGreatServerData = new LinkedList<>(nodeToData.get(greatServer));
            //2.2.2 add node and its virtual node to loop
            hashToNodes.put(addServerHash, node);
            nodeToData.put(greatServer, new LinkedList<>());
            nodeToData.put(node, new LinkedList<>());
            if(virtualNodeNum != 0){
                addVirtualNode(node);
            }
            //2.2.3 migrate 2.2.1 data to loop by call put data method
            putData(firstGreatServerData);
        }
        log.info("add server, server {} has been added", node);
        return true;
    }

    @Override
    public void printAllData() {
        nodeToData.forEach((server, data) -> log.info("server {} contains data {}", server, data));
    }

    @Override
    public void setHashMethod(HashService hashService) {
        if(!hashToNodes.isEmpty()){
            throw new UnsupportedOperationException();
        }
        this.hashService = hashService;
    }

    private void addVirtualNode(String realNode){
        if(virtualNodeNum > 0){
            List<String> virtualNodeList = new LinkedList<>();
            for(int cnt = 0; cnt < this.virtualNodeNum; cnt++){
                //1. generate virtual node name by default format
                String virtualNodeName = String.format(virtualNodeFormat, realNode, cnt);
                //2. calculate each virtual node's hash value
                int virtualNodeHash = getHash(virtualNodeName);
                //3. current node already exist in loop, continue
                if(hashToNodes.containsKey(virtualNodeHash)){
                    continue;
                }
                //4. add node to loop
                virtualNodeList.add(virtualNodeName);
                hashToNodes.put(virtualNodeHash, virtualNodeName);
                nodeToData.put(virtualNodeName, new LinkedList<>());
            }
            //5. map virtual node to real node
            realNodeToVirtualNode.put(realNode, virtualNodeList);
        }
    }


    private int getHash(String data){
        return hashService == null ? defaultGetHash(data) : hashService.getHash(data);
    }

    private int defaultGetHash(String data){
        int res = 0;
        for(char tempChar : data.toCharArray()){
            if(tempChar >= '0' && tempChar <= '9'){
                res += tempChar;
            }
        }
        return res;
    }
}

测试结果

不带虚拟节点的一致性哈希
测试代码
package ConsistentHash;

import org.CommonAlgorithms.ConsistentHash.ConsistentHashing;
import org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl;
import org.junit.Assert;
import org.junit.Before;
import org.junit.Test;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/**
 * @author cartoon
 * @date 2020/12/27
 */
public class ConsistentHashingWithoutVirtualNodeTest {

    private static final Logger log = LoggerFactory.getLogger(ConsistentHashingWithoutVirtualNodeTest.class);

    private ConsistentHashing consistentHashing;

    private String[] servers;

    private String[] data;

    @Before
    public void before(){
        servers = new String[]{"000", "111", "222", "333", "555"};
        consistentHashing = new ConsistentHashingImpl(servers);
        data = new String[]{"000", "111", "222", "333", "555"};
    }

    @Test
    public void testConsistentHashing(){
        for(String str : data){
            Assert.assertTrue(consistentHashing.putData(str));
        }
        consistentHashing.removeNode("333");
        consistentHashing.addNode("444");
        consistentHashing.putData("444");
        consistentHashing.printAllData();
    }
}
测试结果
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 000 is placed to server 000, hash: 144
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 111 is placed to server 111, hash: 147
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 222 is placed to server 222, hash: 150
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 333 is placed to server 333, hash: 153
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 555 is placed to server 555, hash: 159
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 333 is placed to server 555, hash: 153
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - remove server, remove server 333 success
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 555 is placed to server 555, hash: 159
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 333 is placed to server 444, hash: 153
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - add server, server 444 has been added
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 444 is placed to server 444, hash: 156
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 000 contains data [000]
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 111 contains data [111]
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 222 contains data [222]
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 444 contains data [333, 444]
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 555 contains data [555]
含虚拟节点的一致性哈希测试
测试代码
package ConsistentHash;

import org.CommonAlgorithms.ConsistentHash.ConsistentHashing;
import org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl;
import org.junit.Assert;
import org.junit.Before;
import org.junit.Test;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/**
 * @author cartoon
 * @date 2021/01/17
 */
public class ConsistentHashingWithVirtualNodeTest {

    private static final Logger log = LoggerFactory.getLogger(ConsistentHashingWithVirtualNodeTest.class);

    private ConsistentHashing consistentHashing;

    private String[] servers;

    private String[] data;

    @Before
    public void before(){
        servers = new String[]{"000", "111", "222", "333", "555"};
        consistentHashing = new ConsistentHashingImpl(3, servers);
        data = new String[]{"000", "111", "222", "333", "555"};
    }

    @Test
    public void testConsistentHashing(){
        for(String str : data){
            Assert.assertTrue(consistentHashing.putData(str));
        }
        consistentHashing.removeNode("333");
        consistentHashing.addNode("444");
        consistentHashing.putData("444");
        consistentHashing.putData("555&&0");
        consistentHashing.printAllData();
    }
}
测试结果
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 000 is placed to server 000, hash: 144
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 111 is placed to server 111, hash: 147
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 222 is placed to server 222, hash: 150
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 333 is placed to server 333, hash: 153
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 555 is placed to server 555, hash: 159
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 333 is placed to server 555, hash: 153
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - remove server, remove server 333 success
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 555 is placed to server 555, hash: 159
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 333 is placed to server 444, hash: 153
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - add server, server 444 has been added
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 444 is placed to server 444, hash: 156
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 555&&0 is placed to server 555&&0, hash: 207
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 000&&2 contains data []
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 000&&1 contains data []
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 000&&0 contains data []
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 111&&1 contains data []
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 111&&2 contains data []
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 555&&1 contains data []
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 555&&2 contains data []
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 222&&2 contains data []
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 444&&0 contains data []
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 444&&1 contains data []
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 444&&2 contains data []
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 555&&0 contains data [555&&0]
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 000 contains data [000]
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 111 contains data [111]
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 222 contains data [222]
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 222&&0 contains data []
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 444 contains data [333, 444]
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 555 contains data [555]
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 222&&1 contains data []
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 111&&0 contains data []

实现存在的缺陷

1. 哈希算法过于简单,哈希冲突概率较大
2. 真实节点含有虚拟节点的数量不均
3. 节点上线时真实节点与已存在的虚拟节点的顺序冲突尚未解决

后记

本次实现的所有代码已全部上传到 github,项目主要包含一些常用的算法,如排序算法,限流算法的简单实现,欢迎提 issue。

©著作权归作者所有,转载或内容合作请联系作者
  • 序言:七十年代末,一起剥皮案震惊了整个滨河市,随后出现的几起案子,更是在滨河造成了极大的恐慌,老刑警刘岩,带你破解...
    沈念sama阅读 206,482评论 6 481
  • 序言:滨河连续发生了三起死亡事件,死亡现场离奇诡异,居然都是意外死亡,警方通过查阅死者的电脑和手机,发现死者居然都...
    沈念sama阅读 88,377评论 2 382
  • 文/潘晓璐 我一进店门,熙熙楼的掌柜王于贵愁眉苦脸地迎上来,“玉大人,你说我怎么就摊上这事。” “怎么了?”我有些...
    开封第一讲书人阅读 152,762评论 0 342
  • 文/不坏的土叔 我叫张陵,是天一观的道长。 经常有香客问我,道长,这世上最难降的妖魔是什么? 我笑而不...
    开封第一讲书人阅读 55,273评论 1 279
  • 正文 为了忘掉前任,我火速办了婚礼,结果婚礼上,老公的妹妹穿的比我还像新娘。我一直安慰自己,他们只是感情好,可当我...
    茶点故事阅读 64,289评论 5 373
  • 文/花漫 我一把揭开白布。 她就那样静静地躺着,像睡着了一般。 火红的嫁衣衬着肌肤如雪。 梳的纹丝不乱的头发上,一...
    开封第一讲书人阅读 49,046评论 1 285
  • 那天,我揣着相机与录音,去河边找鬼。 笑死,一个胖子当着我的面吹牛,可吹牛的内容都是我干的。 我是一名探鬼主播,决...
    沈念sama阅读 38,351评论 3 400
  • 文/苍兰香墨 我猛地睁开眼,长吁一口气:“原来是场噩梦啊……” “哼!你这毒妇竟也来了?” 一声冷哼从身侧响起,我...
    开封第一讲书人阅读 36,988评论 0 259
  • 序言:老挝万荣一对情侣失踪,失踪者是张志新(化名)和其女友刘颖,没想到半个月后,有当地人在树林里发现了一具尸体,经...
    沈念sama阅读 43,476评论 1 300
  • 正文 独居荒郊野岭守林人离奇死亡,尸身上长有42处带血的脓包…… 初始之章·张勋 以下内容为张勋视角 年9月15日...
    茶点故事阅读 35,948评论 2 324
  • 正文 我和宋清朗相恋三年,在试婚纱的时候发现自己被绿了。 大学时的朋友给我发了我未婚夫和他白月光在一起吃饭的照片。...
    茶点故事阅读 38,064评论 1 333
  • 序言:一个原本活蹦乱跳的男人离奇死亡,死状恐怖,灵堂内的尸体忽然破棺而出,到底是诈尸还是另有隐情,我是刑警宁泽,带...
    沈念sama阅读 33,712评论 4 323
  • 正文 年R本政府宣布,位于F岛的核电站,受9级特大地震影响,放射性物质发生泄漏。R本人自食恶果不足惜,却给世界环境...
    茶点故事阅读 39,261评论 3 307
  • 文/蒙蒙 一、第九天 我趴在偏房一处隐蔽的房顶上张望。 院中可真热闹,春花似锦、人声如沸。这庄子的主人今日做“春日...
    开封第一讲书人阅读 30,264评论 0 19
  • 文/苍兰香墨 我抬头看了看天上的太阳。三九已至,却和暖如春,着一层夹袄步出监牢的瞬间,已是汗流浃背。 一阵脚步声响...
    开封第一讲书人阅读 31,486评论 1 262
  • 我被黑心中介骗来泰国打工, 没想到刚下飞机就差点儿被人妖公主榨干…… 1. 我叫王不留,地道东北人。 一个月前我还...
    沈念sama阅读 45,511评论 2 354
  • 正文 我出身青楼,却偏偏与公主长得像,于是被迫代替她去往敌国和亲。 传闻我的和亲对象是个残疾皇子,可洞房花烛夜当晚...
    茶点故事阅读 42,802评论 2 345

推荐阅读更多精彩内容