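Preface
Today, while testing on virtual machines I set up myself, I found that one of the three ZooKeeper nodes would not start. The details are as follows:
Faulty node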
$ZK_HOME/bin/zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /opt/module/zookeeper-3.4.10/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
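However, jps shows no ZooKeeper process on this node, even though the start script reported STARTED. The other two nodes are running normally.
The ZooKeeper log on the faulty node shows: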
2021-05-10 11:34:38,908 [myid:1] - INFO [main:Util@190] - Invalid snapshot /opt/module/zookeeper-3.4.10/tmp/version-2/snapshot.380001e0ca len = 0 byte = 0
2021-05-10 11:34:38,908 [myid:1] - INFO [main:Util@190] - Invalid snapshot /opt/module/zookeeper-3.4.10/tmp/version-2/snapshot.380000f211 len = 0 byte = 0
2021-05-10 11:34:38,909 [myid:1] - INFO [main:Util@190] - Invalid snapshot /opt/module/zookeeper-3.4.10/tmp/version-2/snapshot.3500034cd6 len = 0 byte = 0
2021-05-10 11:34:38,909 [myid:1] - INFO [main:FileSnap@83] - Reading snapshot /opt/module/zookeeper-3.4.10/tmp/version-2/snapshot.3500021022
2021-05-10 11:34:38,917 [myid:1] - ERROR [main:QuorumPeer@648] - Unable to load database on disk
java.io.IOException: 输入/输出错误
at java.io.FileInputStream.readBytes(Native Method)
at java.io.FileInputStream.read(FileInputStream.java:255)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
at java.io.FilterInputStream.read(FilterInputStream.java:83)
at org.apache.zookeeper.server.persistence.FileTxnLog$PositionInputStream.read(FileTxnLog.java:452)
at java.io.DataInputStream.readInt(DataInputStream.java:387)
at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
at org.apache.zookeeper.server.persistence.FileHeader.deserialize(FileHeader.java:64)
at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.inStreamCreated(FileTxnLog.java:585)
at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.createInputArchive(FileTxnLog.java:604)
at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.goToNextLog(FileTxnLog.java:570)
at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.init(FileTxnLog.java:552)
at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.<init>(FileTxnLog.java:531)
at org.apache.zookeeper.server.persistence.FileTxnLog.read(FileTxnLog.java:358)
at org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:140)
at org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:223)
at org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:601)
at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:591)
at org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:164)
at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:111)
at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)
2021-05-10 11:34:38,918 [myid:1] - ERROR [main:QuorumPeerMain@89] - Unexpected exception, exiting abnormally
java.lang.RuntimeException: Unable to run quorum server
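The log shows that three snapshot files on the faulty machine are zero-length and therefore invalid, and that loading the database from disk then fails with an I/O error (the message "输入/输出错误" means "Input/output error") while reading the transaction log.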
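To confirm which snapshot files are empty before touching anything, a small helper like the one below can be used. This is only a sketch: the function name list_empty_snapshots is made up for illustration, and the directory argument is assumed to be the version-2 directory seen in the log paths.

```shell
# Hypothetical helper: print every zero-length snapshot file in a directory.
# $1 is the directory to scan, e.g. "$ZK_HOME/tmp/version-2" (assumed layout).
list_empty_snapshots() {
  # -size 0 matches files that occupy zero blocks, i.e. empty files
  find "$1" -maxdepth 1 -type f -name 'snapshot.*' -size 0
}
```

For example, list_empty_snapshots "$ZK_HOME/tmp/version-2" should print the same files that the log reports as "Invalid snapshot".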
So how do we fix it?
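Solution
Since the other two machines are healthy, we can back up the ZooKeeper data folder (version-2) on the faulty machine and let the node copy a fresh snapshot from one of the running nodes when it restarts. Only version-2 is moved; the myid file sits in the data directory itself ($ZK_HOME/tmp here), so it stays in place.
The steps are as follows: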
mv $ZK_HOME/tmp/version-2 $ZK_HOME/tmp/version-2.bak
$ZK_HOME/bin/zkServer.sh start
Use jps to check whether the process has started, and check whether a new snapshot has been synced into $ZK_HOME/tmp/version-2.
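The snapshot check can also be scripted. The sketch below is an assumption-laden example rather than part of the original fix: has_valid_snapshot is a made-up name, and it only tests that at least one non-empty snapshot file exists in the given directory, not that its contents are valid.

```shell
# Hypothetical helper: succeed if the directory holds at least one
# non-empty snapshot file. $1 is the directory, e.g. "$ZK_HOME/tmp/version-2".
has_valid_snapshot() {
  # -size +0c matches files larger than 0 bytes
  [ -n "$(find "$1" -maxdepth 1 -type f -name 'snapshot.*' -size +0c 2>/dev/null | head -n 1)" ]
}
```

A possible usage after the restart: has_valid_snapshot "$ZK_HOME/tmp/version-2" && echo "snapshot synced". On a running node, jps should also list the ZooKeeper main class QuorumPeerMain, which appears in the stack trace above.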
--by 俩只猴