MySQL高可用架构——MHA(二)

从上面的输出可以看出整个MHA的切换过程，共包括以下的步骤：
1.配置文件检查阶段，这个阶段会检查整个集群配置文件配置
2.宕机的master处理，这个阶段包括虚拟ip摘除操作，主机关机操作（这个我这里还没有实现，需要研究）
3.复制dead maste和最新slave相差的relay log，并保存到MHA Manger具体的目录下
4.识别含有最新更新的slave
5.应用从master保存的二进制日志事件（binlog events）
6.提升一个slave为新的master进行复制
7.使其他的slave连接新的master进行复制
最后启动MHA Manger监控，查看集群里面现在谁是master（在切换后监控就停止了。。。还有东西没搞对？）后来在官方网站看到这句话就明白了。

Running MHA Manager from daemontools

Currently MHA Manager process does not run as a daemon. If failover completed successfully or the master process was killed by accident, the manager stops working. To run as a daemon, daemontool. or any external daemon program can be used. Here is an example to run from daemontools.

[root@192.168.0.20 ~]# masterha_check_status --conf=/etc/masterha/app1.cnf
app1 (pid:23971) is running(0:PING_OK), master:192.168.0.60
[root@192.168.0.20 ~]#

二.手动Failover（MHA Manager必须没有运行）

手动failover，这种场景意味着在业务上没有启用MHA自动切换功能，当主服务器故障时，人工手动调用MHA来进行故障切换操作，具体命令如下：

注意：如果，MHA manager检测到没有dead的server，将报错，并结束failover：

Mon Apr 21 21:23:33 2014 - [info] Dead Servers:
Mon Apr 21 21:23:33 2014 - [error][/usr/local/share/perl5/MHA/MasterFailover.pm, ln181] None of server is dead. Stop failover.
Mon Apr 21 21:23:33 2014 - [error][/usr/local/share/perl5/MHA/ManagerUtil.pm, ln178] Got ERROR:  at /usr/local/bin/masterha_master_switch line 53

进行手动切换命令如下：

[root@192.168.0.20 ~]# masterha_master_switch --master_state=dead --conf=/etc/masterha/app1.cnf --dead_master_host=192.168.0.50 --dead_master_port=3306 --new_master_host=192.168.0.60 --new_master_port=3306 --ignore_last_failover

输出的信息会询问你是否进行切换：

Mon Apr 21 21:28:00 2014 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Mon Apr 21 21:28:00 2014 - [info] Reading application default configurations from /etc/masterha/app1.cnf..
Mon Apr 21 21:28:00 2014 - [info] Reading server configurations from /etc/masterha/app1.cnf..
Mon Apr 21 21:28:00 2014 - [info] MHA::MasterFailover version 0.53.
Mon Apr 21 21:28:00 2014 - [info] Starting master failover.
Mon Apr 21 21:28:00 2014 - [info] 
Mon Apr 21 21:28:00 2014 - [info] * Phase 1: Configuration Check Phase..
Mon Apr 21 21:28:00 2014 - [info] 
Mon Apr 21 21:28:00 2014 - [info] Dead Servers:
Mon Apr 21 21:28:00 2014 - [info]   192.168.0.50(192.168.0.50:3306)
Mon Apr 21 21:28:00 2014 - [info] Checking master reachability via mysql(double check)..
Mon Apr 21 21:28:00 2014 - [info]  ok.
Mon Apr 21 21:28:00 2014 - [info] Alive Servers:
Mon Apr 21 21:28:00 2014 - [info]   192.168.0.60(192.168.0.60:3306)
Mon Apr 21 21:28:00 2014 - [info]   192.168.0.70(192.168.0.70:3306)
Mon Apr 21 21:28:00 2014 - [info] Alive Slaves:
Mon Apr 21 21:28:00 2014 - [info]   192.168.0.60(192.168.0.60:3306)  Version=5.5.19-ndb-7.2.4-gpl-log (oldest major version between slaves) log-bin:enabled
Mon Apr 21 21:28:00 2014 - [info]     Replicating from 192.168.0.50(192.168.0.50:3306)
Mon Apr 21 21:28:00 2014 - [info]     Primary candidate for the new Master (candidate_master is set)
Mon Apr 21 21:28:00 2014 - [info]   192.168.0.70(192.168.0.70:3306)  Version=5.5.19-ndb-7.2.4-gpl-log (oldest major version between slaves) log-bin:enabled
Mon Apr 21 21:28:00 2014 - [info]     Replicating from 192.168.0.50(192.168.0.50:3306)
Master 192.168.0.50 is dead. Proceed? (yes/NO): yes
Mon Apr 21 21:36:01 2014 - [info] ** Phase 1: Configuration Check Phase completed.
Mon Apr 21 21:36:01 2014 - [info] 
Mon Apr 21 21:36:01 2014 - [info] * Phase 2: Dead Master Shutdown Phase..
Mon Apr 21 21:36:01 2014 - [info] 
Mon Apr 21 21:36:01 2014 - [info] HealthCheck: SSH to 192.168.0.50 is reachable.
Mon Apr 21 21:36:01 2014 - [info] Forcing shutdown so that applications never connect to the current master..
Mon Apr 21 21:36:01 2014 - [info] Executing master IP deactivatation script:
Mon Apr 21 21:36:01 2014 - [info]   /usr/local/bin/master_ip_failover --orig_master_host=192.168.0.50 --orig_master_ip=192.168.0.50 --orig_master_port=3306 --command=stopssh --ssh_user=root  


IN SCRIPT TEST====/sbin/ifconfig eth1:1 down==/sbin/ifconfig eth1:1 192.168.0.88/24===

Disabling the VIP on old master: 192.168.0.50 
Mon Apr 21 21:36:02 2014 - [info]  done.
Mon Apr 21 21:36:02 2014 - [warning] shutdown_script is not set. Skipping explicit shutting down of the dead master.
Mon Apr 21 21:36:02 2014 - [info] * Phase 2: Dead Master Shutdown Phase completed.
Mon Apr 21 21:36:02 2014 - [info] 
Mon Apr 21 21:36:02 2014 - [info] * Phase 3: Master Recovery Phase..
Mon Apr 21 21:36:02 2014 - [info] 
Mon Apr 21 21:36:02 2014 - [info] * Phase 3.1: Getting Latest Slaves Phase..
Mon Apr 21 21:36:02 2014 - [info] 
Mon Apr 21 21:36:02 2014 - [info] The latest binary log file/position on all slaves is mysql-bin.000020:112
Mon Apr 21 21:36:02 2014 - [info] Latest slaves (Slaves that received relay log files to the latest):
Mon Apr 21 21:36:02 2014 - [info]   192.168.0.60(192.168.0.60:3306)  Version=5.5.19-ndb-7.2.4-gpl-log (oldest major version between slaves) log-bin:enabled
Mon Apr 21 21:36:02 2014 - [info]     Replicating from 192.168.0.50(192.168.0.50:3306)
Mon Apr 21 21:36:02 2014 - [info]     Primary candidate for the new Master (candidate_master is set)
Mon Apr 21 21:36:02 2014 - [info]   192.168.0.70(192.168.0.70:3306)  Version=5.5.19-ndb-7.2.4-gpl-log (oldest major version between slaves) log-bin:enabled
Mon Apr 21 21:36:02 2014 - [info]     Replicating from 192.168.0.50(192.168.0.50:3306)
Mon Apr 21 21:36:02 2014 - [info] The oldest binary log file/position on all slaves is mysql-bin.000020:112
Mon Apr 21 21:36:02 2014 - [info] Oldest slaves:
Mon Apr 21 21:36:02 2014 - [info]   192.168.0.60(192.168.0.60:3306)  Version=5.5.19-ndb-7.2.4-gpl-log (oldest major version between slaves) log-bin:enabled
Mon Apr 21 21:36:02 2014 - [info]     Replicating from 192.168.0.50(192.168.0.50:3306)
Mon Apr 21 21:36:02 2014 - [info]     Primary candidate for the new Master (candidate_master is set)
Mon Apr 21 21:36:02 2014 - [info]   192.168.0.70(192.168.0.70:3306)  Version=5.5.19-ndb-7.2.4-gpl-log (oldest major version between slaves) log-bin:enabled
Mon Apr 21 21:36:02 2014 - [info]     Replicating from 192.168.0.50(192.168.0.50:3306)
Mon Apr 21 21:36:02 2014 - [info] 
Mon Apr 21 21:36:02 2014 - [info] * Phase 3.2: Saving Dead Master's Binlog Phase..
Mon Apr 21 21:36:02 2014 - [info] 
Mon Apr 21 21:36:02 2014 - [info] Fetching dead master's binary logs..
Mon Apr 21 21:36:02 2014 - [info] Executing command on the dead master 192.168.0.50(192.168.0.50:3306): save_binary_logs --command=save --start_file=mysql-bin.000020  --start_pos=112 --binlog_dir=/data/mysql --output_file=/tmp/saved_master_binlog_from_192.168.0.50_3306_20140421212800.binlog --handle_raw_binlog=1 --disable_log_bin=0 --manager_version=0.53
  Creating /tmp if not exists..    ok.
 Concat binary/relay logs from mysql-bin.000020 pos 112 to mysql-bin.000020 EOF into /tmp/saved_master_binlog_from_192.168.0.50_3306_20140421212800.binlog ..
  Dumping binlog format description event, from position 0 to 112.. ok.
  Dumping effective binlog data from /data/mysql/mysql-bin.000020 position 112 to tail(131).. ok.
 Concat succeeded.
saved_master_binlog_from_192.168.0.50_3306_20140421212800.binlog                                                   100%  131     0.1KB/s   00:00    
Mon Apr 21 21:36:02 2014 - [info] scp from root@192.168.0.50:/tmp/saved_master_binlog_from_192.168.0.50_3306_20140421212800.binlog to local:/var/log/masterha/app1.log/saved_master_binlog_from_192.168.0.50_3306_20140421212800.binlog succeeded.
Mon Apr 21 21:36:02 2014 - [info] HealthCheck: SSH to 192.168.0.60 is reachable.
Mon Apr 21 21:36:03 2014 - [info] HealthCheck: SSH to 192.168.0.70 is reachable.
Mon Apr 21 21:36:03 2014 - [info] 
Mon Apr 21 21:36:03 2014 - [info] * Phase 3.3: Determining New Master Phase..
Mon Apr 21 21:36:03 2014 - [info] 
Mon Apr 21 21:36:03 2014 - [info] Finding the latest slave that has all relay logs for recovering other slaves..
Mon Apr 21 21:36:03 2014 - [info] All slaves received relay logs to the same position. No need to resync each other.
Mon Apr 21 21:36:03 2014 - [info] 192.168.0.60 can be new master.
Mon Apr 21 21:36:03 2014 - [info] New master is 192.168.0.60(192.168.0.60:3306)
Mon Apr 21 21:36:03 2014 - [info] Starting master failover..
Mon Apr 21 21:36:03 2014 - [info] 
From:
192.168.0.50 (current master)
 +--192.168.0.60
 +--192.168.0.70

To:
192.168.0.60 (new master)
 +--192.168.0.70

Starting master switch from 192.168.0.50(192.168.0.50:3306) to 192.168.0.60(192.168.0.60:3306)? (yes/NO): yes
Mon Apr 21 21:36:06 2014 - [info] New master decided manually is 192.168.0.60(192.168.0.60:3306)
Mon Apr 21 21:36:06 2014 - [info] 
Mon Apr 21 21:36:06 2014 - [info] * Phase 3.3: New Master Diff Log Generation Phase..
Mon Apr 21 21:36:06 2014 - [info] 
Mon Apr 21 21:36:06 2014 - [info]  This server has all relay logs. No need to generate diff files from the latest slave.
Mon Apr 21 21:36:06 2014 - [info] Sending binlog..
saved_master_binlog_from_192.168.0.50_3306_20140421212800.binlog                                                   100%  131     0.1KB/s   00:00    
Mon Apr 21 21:36:07 2014 - [info] scp from local:/var/log/masterha/app1.log/saved_master_binlog_from_192.168.0.50_3306_20140421212800.binlog to root@192.168.0.60:/tmp/saved_master_binlog_from_192.168.0.50_3306_20140421212800.binlog succeeded.
Mon Apr 21 21:36:07 2014 - [info] 
Mon Apr 21 21:36:07 2014 - [info] * Phase 3.4: Master Log Apply Phase..
Mon Apr 21 21:36:07 2014 - [info] 
Mon Apr 21 21:36:07 2014 - [info] *NOTICE: If any error happens from this phase, manual recovery is needed.
Mon Apr 21 21:36:07 2014 - [info] Starting recovery on 192.168.0.60(192.168.0.60:3306)..
Mon Apr 21 21:36:07 2014 - [info]  Generating diffs succeeded.
Mon Apr 21 21:36:07 2014 - [info] Waiting until all relay logs are applied.
Mon Apr 21 21:36:07 2014 - [info]  done.
Mon Apr 21 21:36:07 2014 - [info] Getting slave status..
Mon Apr 21 21:36:07 2014 - [info] This slave(192.168.0.60)'s Exec_Master_Log_Pos equals to Read_Master_Log_Pos(mysql-bin.000020:112). No need to recover from Exec_Master_Log_Pos.
Mon Apr 21 21:36:07 2014 - [info] Connecting to the target slave host 192.168.0.60, running recover script..
Mon Apr 21 21:36:07 2014 - [info] Executing command: apply_diff_relay_logs --command=apply --slave_user=root --slave_host=192.168.0.60 --slave_ip=192.168.0.60  --slave_port=3306 --apply_files=/tmp/saved_master_binlog_from_192.168.0.50_3306_20140421212800.binlog --workdir=/tmp --target_version=5.5.19-ndb-7.2.4-gpl-log --timestamp=20140421212800 --handle_raw_binlog=1 --disable_log_bin=0 --manager_version=0.53 --slave_pass=xxx
Mon Apr 21 21:36:07 2014 - [info] 
Applying differential binary/relay log files /tmp/saved_master_binlog_from_192.168.0.50_3306_20140421212800.binlog on 192.168.0.60:3306. This may take long time...
Applying log files succeeded.
Mon Apr 21 21:36:07 2014 - [info]  All relay logs were successfully applied.
Mon Apr 21 21:36:07 2014 - [info] Getting new master's binlog name and position..
Mon Apr 21 21:36:07 2014 - [info]  mysql-bin.000022:506716
Mon Apr 21 21:36:07 2014 - [info]  All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='192.168.0.60', MASTER_PORT=3306, MASTER_LOG_FILE='mysql-bin.000022', MASTER_LOG_POS=506716, MASTER_USER='repl', MASTER_PASSWORD='xxx';
Mon Apr 21 21:36:07 2014 - [info] Executing master IP activate script:
Mon Apr 21 21:36:07 2014 - [info]   /usr/local/bin/master_ip_failover --command=start --ssh_user=root --orig_master_host=192.168.0.50 --orig_master_ip=192.168.0.50 --orig_master_port=3306 --new_master_host=192.168.0.60 --new_master_ip=192.168.0.60 --new_master_port=3306  


IN SCRIPT TEST====/sbin/ifconfig eth1:1 down==/sbin/ifconfig eth1:1 192.168.0.88/24===

Enabling the VIP - 192.168.0.88/24 on the new master - 192.168.0.60 
Mon Apr 21 21:36:08 2014 - [info]  OK.
Mon Apr 21 21:36:08 2014 - [info] Setting read_only=0 on 192.168.0.60(192.168.0.60:3306)..
Mon Apr 21 21:36:08 2014 - [info]  ok.
Mon Apr 21 21:36:08 2014 - [info] ** Finished master recovery successfully.
Mon Apr 21 21:36:08 2014 - [info] * Phase 3: Master Recovery Phase completed.
Mon Apr 21 21:36:08 2014 - [info] 
Mon Apr 21 21:36:08 2014 - [info] * Phase 4: Slaves Recovery Phase..
Mon Apr 21 21:36:08 2014 - [info] 
Mon Apr 21 21:36:08 2014 - [info] * Phase 4.1: Starting Parallel Slave Diff Log Generation Phase..
Mon Apr 21 21:36:08 2014 - [info] 
Mon Apr 21 21:36:08 2014 - [info] -- Slave diff file generation on host 192.168.0.70(192.168.0.70:3306) started, pid: 33518. Check tmp log /var/log/masterha/app1.log/192.168.0.70_3306_20140421212800.log if it takes time..
Mon Apr 21 21:36:08 2014 - [info] 
Mon Apr 21 21:36:08 2014 - [info] Log messages from 192.168.0.70 ...
Mon Apr 21 21:36:08 2014 - [info] 
Mon Apr 21 21:36:08 2014 - [info]  This server has all relay logs. No need to generate diff files from the latest slave.
Mon Apr 21 21:36:08 2014 - [info] End of log messages from 192.168.0.70.
Mon Apr 21 21:36:08 2014 - [info] -- 192.168.0.70(192.168.0.70:3306) has the latest relay log events.
Mon Apr 21 21:36:08 2014 - [info] Generating relay diff files from the latest slave succeeded.
Mon Apr 21 21:36:08 2014 - [info] 
Mon Apr 21 21:36:08 2014 - [info] * Phase 4.2: Starting Parallel Slave Log Apply Phase..
Mon Apr 21 21:36:08 2014 - [info] 
Mon Apr 21 21:36:08 2014 - [info] -- Slave recovery on host 192.168.0.70(192.168.0.70:3306) started, pid: 33520. Check tmp log /var/log/masterha/app1.log/192.168.0.70_3306_20140421212800.log if it takes time..
saved_master_binlog_from_192.168.0.50_3306_20140421212800.binlog                                                   100%  131     0.1KB/s   00:00    
Mon Apr 21 21:36:09 2014 - [info] 
Mon Apr 21 21:36:09 2014 - [info] Log messages from 192.168.0.70 ...
Mon Apr 21 21:36:09 2014 - [info] 
Mon Apr 21 21:36:08 2014 - [info] Sending binlog..
Mon Apr 21 21:36:08 2014 - [info] scp from local:/var/log/masterha/app1.log/saved_master_binlog_from_192.168.0.50_3306_20140421212800.binlog to root@192.168.0.70:/tmp/saved_master_binlog_from_192.168.0.50_3306_20140421212800.binlog succeeded.
Mon Apr 21 21:36:08 2014 - [info] Starting recovery on 192.168.0.70(192.168.0.70:3306)..
Mon Apr 21 21:36:08 2014 - [info]  Generating diffs succeeded.
Mon Apr 21 21:36:08 2014 - [info] Waiting until all relay logs are applied.
Mon Apr 21 21:36:08 2014 - [info]  done.
Mon Apr 21 21:36:08 2014 - [info] Getting slave status..
Mon Apr 21 21:36:08 2014 - [info] This slave(192.168.0.70)'s Exec_Master_Log_Pos equals to Read_Master_Log_Pos(mysql-bin.000020:112). No need to recover from Exec_Master_Log_Pos.
Mon Apr 21 21:36:08 2014 - [info] Connecting to the target slave host 192.168.0.70, running recover script..
Mon Apr 21 21:36:08 2014 - [info] Executing command: apply_diff_relay_logs --command=apply --slave_user=root --slave_host=192.168.0.70 --slave_ip=192.168.0.70  --slave_port=3306 --apply_files=/tmp/saved_master_binlog_from_192.168.0.50_3306_20140421212800.binlog --workdir=/tmp --target_version=5.5.19-ndb-7.2.4-gpl-log --timestamp=20140421212800 --handle_raw_binlog=1 --disable_log_bin=0 --manager_version=0.53 --slave_pass=xxx
Mon Apr 21 21:36:09 2014 - [info] 
Applying differential binary/relay log files /tmp/saved_master_binlog_from_192.168.0.50_3306_20140421212800.binlog on 192.168.0.70:3306. This may take long time...
Applying log files succeeded.
Mon Apr 21 21:36:09 2014 - [info]  All relay logs were successfully applied.
Mon Apr 21 21:36:09 2014 - [info]  Resetting slave 192.168.0.70(192.168.0.70:3306) and starting replication from the new master 192.168.0.60(192.168.0.60:3306)..
Mon Apr 21 21:36:09 2014 - [info]  Executed CHANGE MASTER.
Mon Apr 21 21:36:09 2014 - [info]  Slave started.
Mon Apr 21 21:36:09 2014 - [info] End of log messages from 192.168.0.70.
Mon Apr 21 21:36:09 2014 - [info] -- Slave recovery on host 192.168.0.70(192.168.0.70:3306) succeeded.
Mon Apr 21 21:36:09 2014 - [info] All new slave servers recovered successfully.
Mon Apr 21 21:36:09 2014 - [info] 
Mon Apr 21 21:36:09 2014 - [info] * Phase 5: New master cleanup phease..
Mon Apr 21 21:36:09 2014 - [info] 
Mon Apr 21 21:36:09 2014 - [info] Resetting slave info on the new master..
Mon Apr 21 21:36:09 2014 - [info]  192.168.0.60: Resetting slave info succeeded.
Mon Apr 21 21:36:09 2014 - [info] Master failover to 192.168.0.60(192.168.0.60:3306) completed successfully.
Mon Apr 21 21:36:09 2014 - [info] 

----- Failover Report -----

app1: MySQL Master failover 192.168.0.50 to 192.168.0.60 succeeded

Master 192.168.0.50 is down!

Check MHA Manager logs at server01 for details.

Started manual(interactive) failover.
Invalidated master IP address on 192.168.0.50.
The latest slave 192.168.0.60(192.168.0.60:3306) has all relay logs for recovery.
Selected 192.168.0.60 as a new master.
192.168.0.60: OK: Applying all logs succeeded.
192.168.0.60: OK: Activated master IP address.
192.168.0.70: This host has the latest relay log events.
Generating relay diff files from the latest slave succeeded.
192.168.0.70: OK: Applying all logs succeeded. Slave started, replicating from 192.168.0.60.
192.168.0.60: Resetting slave info succeeded.
Master failover to 192.168.0.60(192.168.0.60:3306) completed successfully.

上述模拟了master宕机的情况下手动把192.168.0.60提升为主库的操作过程。

三.在线进行切换

在许多情况下，需要将现有的主服务器迁移到另外一台服务器上。比如主服务器硬件故障，RAID 控制卡需要重建，将主服务器移到性能更好的服务器上等等。维护主服务器引起性能下降，导致停机时间至少无法写入数据。另外，阻塞或杀掉当前运行的会话会导致主主之间数据不一致的问题发生。 MHA 提供快速切换和优雅的阻塞写入，这个切换过程只需要 0.5-2s 的时间，这段时间内数据是无法写入的。在很多情况下，0.5-2s 的阻塞写入是可以接受的。因此切换主服务器不需要计划分配维护时间窗口。

MHA在线切换的大概过程：
1.检测复制设置和确定当前主服务器
2.确定新的主服务器
3.阻塞写入到当前主服务器
4.等待所有从服务器赶上复制
5.授予写入到新的主服务器
6.重新设置从服务器

注意，在线切换的时候应用架构需要考虑以下两个问题：

1.自动识别master和slave的问题（master的机器可能会切换），如果采用了vip的方式，基本可以解决这个问题。

2.负载均衡的问题（可以定义大概的读写比例，每台机器可承担的负载比例，当有机器离开集群时，需要考虑这个问题）

为了保证数据完全一致性，在最快的时间内完成切换，MHA的在线切换必须满足以下条件才会切换成功，否则会切换失败。

1.所有slave的IO线程都在运行

2.所有slave的SQL线程都在运行

3.所有的show slave status的输出中Seconds_Behind_Master参数小于或者等于running_updates_limit秒，如果在切换过程中不指定running_updates_limit,那么默认情况下running_updates_limit为1秒。

4.在master端，通过show processlist输出，没有一个更新花费的时间大于running_updates_limit秒。

在线切换步骤如下：

首先，停掉MHA监控：

[root@192.168.0.20 ~]# masterha_stop --conf=/etc/masterha/app1.cnf

其次，进行在线切换操作（模拟在线切换主库操作，原主库192.168.0.50变为slave，192.168.0.60提升为新的主库）

[root@192.168.0.20 ~]# masterha_master_switch --conf=/etc/masterha/app1.cnf --master_state=alive --new_master_host=192.168.0.60 --new_master_port=3306 --orig_master_is_new_slave --running_updates_limit=10000

最后查看日志，了解切换过程，输出信息如下：

[root@192.168.0.20 ~]#  masterha_master_switch --conf=/etc/masterha/app1.cnf --master_state=alive --new_master_host=192.168.0.60 --new_master_port=3306 --orig_master_is_new_slave --running_updates_limit=10000
Wed Apr 23 00:27:39 2014 - [info] MHA::MasterRotate version 0.53.
Wed Apr 23 00:27:39 2014 - [info] Starting online master switch..
Wed Apr 23 00:27:39 2014 - [info] 
Wed Apr 23 00:27:39 2014 - [info] * Phase 1: Configuration Check Phase..
Wed Apr 23 00:27:39 2014 - [info] 
Wed Apr 23 00:27:39 2014 - [info] Reading default configuratoins from /etc/masterha_default.cnf..
Wed Apr 23 00:27:39 2014 - [info] Reading application default configurations from /etc/masterha/app1.cnf..
Wed Apr 23 00:27:39 2014 - [info] Reading server configurations from /etc/masterha/app1.cnf..
Wed Apr 23 00:27:39 2014 - [info] Multi-master configuration is detected. Current primary(writable) master is 192.168.0.50(192.168.0.50:3306)
Wed Apr 23 00:27:39 2014 - [info] Master configurations are as below: 
Master 192.168.0.60(192.168.0.60:3306), replicating from 192.168.0.50(192.168.0.50:3306), read-only
Master 192.168.0.50(192.168.0.50:3306), replicating from 192.168.0.60(192.168.0.60:3306)

Wed Apr 23 00:27:39 2014 - [info] Current Alive Master: 192.168.0.50(192.168.0.50:3306)
Wed Apr 23 00:27:39 2014 - [info] Alive Slaves:
Wed Apr 23 00:27:39 2014 - [info]   192.168.0.60(192.168.0.60:3306)  Version=5.5.19-ndb-7.2.4-gpl-log (oldest major version between slaves) log-bin:enabled
Wed Apr 23 00:27:39 2014 - [info]     Replicating from 192.168.0.50(192.168.0.50:3306)
Wed Apr 23 00:27:39 2014 - [info]     Primary candidate for the new Master (candidate_master is set)
Wed Apr 23 00:27:39 2014 - [info]   192.168.0.70(192.168.0.70:3306)  Version=5.5.19-ndb-7.2.4-gpl-log (oldest major version between slaves) log-bin:enabled
Wed Apr 23 00:27:39 2014 - [info]     Replicating from 192.168.0.50(192.168.0.50:3306)

It is better to execute FLUSH NO_WRITE_TO_BINLOG TABLES on the master before switching. Is it ok to execute on 192.168.0.50(192.168.0.50:3306)? (YES/no): yes
Wed Apr 23 00:27:40 2014 - [info] Executing FLUSH NO_WRITE_TO_BINLOG TABLES. This may take long time..
Wed Apr 23 00:27:40 2014 - [info]  ok.
Wed Apr 23 00:27:40 2014 - [info] Checking MHA is not monitoring or doing failover..
Wed Apr 23 00:27:40 2014 - [info] Checking replication health on 192.168.0.60..
Wed Apr 23 00:27:40 2014 - [info]  ok.
Wed Apr 23 00:27:40 2014 - [info] Checking replication health on 192.168.0.70..
Wed Apr 23 00:27:40 2014 - [info]  ok.
Wed Apr 23 00:27:40 2014 - [info] 192.168.0.60 can be new master.
Wed Apr 23 00:27:40 2014 - [info] 
From:
192.168.0.50 (current master)
 +--192.168.0.60
 +--192.168.0.70

To:
192.168.0.60 (new master)
 +--192.168.0.70
 +--192.168.0.50

Starting master switch from 192.168.0.50(192.168.0.50:3306) to 192.168.0.60(192.168.0.60:3306)? (yes/NO): yes
Wed Apr 23 00:27:41 2014 - [info] Checking whether 192.168.0.60(192.168.0.60:3306) is ok for the new master..
Wed Apr 23 00:27:41 2014 - [info]  ok.
Wed Apr 23 00:27:41 2014 - [info] ** Phase 1: Configuration Check Phase completed.
Wed Apr 23 00:27:41 2014 - [info] 
Wed Apr 23 00:27:41 2014 - [info] * Phase 2: Rejecting updates Phase..
Wed Apr 23 00:27:41 2014 - [info] 
Wed Apr 23 00:27:41 2014 - [info] Executing master ip online change script to disable write on the current master:
Wed Apr 23 00:27:41 2014 - [info]   /usr/local/bin/master_ip_online_change.pl --command=stop --orig_master_host=192.168.0.50 --orig_master_ip=192.168.0.50 --orig_master_port=3306 --new_master_host=192.168.0.60 --new_master_ip=192.168.0.60 --new_master_port=3306  
Wed Apr 23 00:27:41 2014 714804 Set read_only on the new master.. ok.
Wed Apr 23 00:27:41 2014 719969 Set read_only=1 on the orig master.. ok.
Disabling the VIP on old master: 192.168.0.50 
reverse mapping checking getaddrinfo for bogon [192.168.0.50] failed - POSSIBLE BREAK-IN ATTEMPT!
Wed Apr 23 00:27:51 2014 963762 Killing all application threads..
Wed Apr 23 00:27:51 2014 963869 done.
Wed Apr 23 00:27:51 2014 - [info]  ok.
Wed Apr 23 00:27:51 2014 - [info] Locking all tables on the orig master to reject updates from everybody (including root):
Wed Apr 23 00:27:51 2014 - [info] Executing FLUSH TABLES WITH READ LOCK..
Wed Apr 23 00:27:51 2014 - [info]  ok.
Wed Apr 23 00:27:51 2014 - [info] Orig master binlog:pos is mysql-bin.000028:112.
Wed Apr 23 00:27:51 2014 - [info]  Waiting to execute all relay logs on 192.168.0.60(192.168.0.60:3306)..
Wed Apr 23 00:27:51 2014 - [info]  master_pos_wait(mysql-bin.000028:112) completed on 192.168.0.60(192.168.0.60:3306). Executed 0 events.
Wed Apr 23 00:27:51 2014 - [info]   done.
Wed Apr 23 00:27:51 2014 - [info] Getting new master's binlog name and position..
Wed Apr 23 00:27:51 2014 - [info]  mysql-bin.000023:1550
Wed Apr 23 00:27:51 2014 - [info]  All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='192.168.0.60', MASTER_PORT=3306, MASTER_LOG_FILE='mysql-bin.000023', MASTER_LOG_POS=1550, MASTER_USER='repl', MASTER_PASSWORD='xxx';
Wed Apr 23 00:27:51 2014 - [info] Executing master ip online change script to allow write on the new master:
Wed Apr 23 00:27:51 2014 - [info]   /usr/local/bin/master_ip_online_change.pl --command=start --orig_master_host=192.168.0.50 --orig_master_ip=192.168.0.50 --orig_master_port=3306 --new_master_host=192.168.0.60 --new_master_ip=192.168.0.60 --new_master_port=3306  
Wed Apr 23 00:27:52 2014 077334 Set read_only=0 on the new master.
Enabling the VIP - 192.168.0.88/24 on the new master - 192.168.0.60 
reverse mapping checking getaddrinfo for bogon [192.168.0.60] failed - POSSIBLE BREAK-IN ATTEMPT!
Wed Apr 23 00:28:02 2014 - [info]  ok.
Wed Apr 23 00:28:02 2014 - [info] 
Wed Apr 23 00:28:02 2014 - [info] * Switching slaves in parallel..
Wed Apr 23 00:28:02 2014 - [info] 
Wed Apr 23 00:28:02 2014 - [info] -- Slave switch on host 192.168.0.70(192.168.0.70:3306) started, pid: 3036
Wed Apr 23 00:28:02 2014 - [info] 
Wed Apr 23 00:28:02 2014 - [info] Log messages from 192.168.0.70 ...
Wed Apr 23 00:28:02 2014 - [info] 
Wed Apr 23 00:28:02 2014 - [info]  Waiting to execute all relay logs on 192.168.0.70(192.168.0.70:3306)..
Wed Apr 23 00:28:02 2014 - [info]  master_pos_wait(mysql-bin.000028:112) completed on 192.168.0.70(192.168.0.70:3306). Executed 0 events.
Wed Apr 23 00:28:02 2014 - [info]   done.
Wed Apr 23 00:28:02 2014 - [info]  Resetting slave 192.168.0.70(192.168.0.70:3306) and starting replication from the new master 192.168.0.60(192.168.0.60:3306)..
Wed Apr 23 00:28:02 2014 - [info]  Executed CHANGE MASTER.
Wed Apr 23 00:28:02 2014 - [info]  Slave started.
Wed Apr 23 00:28:02 2014 - [info] End of log messages from 192.168.0.70 ...
Wed Apr 23 00:28:02 2014 - [info] 
Wed Apr 23 00:28:02 2014 - [info] -- Slave switch on host 192.168.0.70(192.168.0.70:3306) succeeded.
Wed Apr 23 00:28:02 2014 - [info] Unlocking all tables on the orig master:
Wed Apr 23 00:28:02 2014 - [info] Executing UNLOCK TABLES..
Wed Apr 23 00:28:02 2014 - [info]  ok.
Wed Apr 23 00:28:02 2014 - [info] Starting orig master as a new slave..
Wed Apr 23 00:28:02 2014 - [info]  Resetting slave 192.168.0.50(192.168.0.50:3306) and starting replication from the new master 192.168.0.60(192.168.0.60:3306)..
Wed Apr 23 00:28:02 2014 - [info]  Executed CHANGE MASTER.
Wed Apr 23 00:28:02 2014 - [info]  Slave started.
Wed Apr 23 00:28:02 2014 - [info] All new slave servers switched successfully.
Wed Apr 23 00:28:02 2014 - [info] 
Wed Apr 23 00:28:02 2014 - [info] * Phase 5: New master cleanup phease..
Wed Apr 23 00:28:02 2014 - [info] 
Wed Apr 23 00:28:02 2014 - [info]  192.168.0.60: Resetting slave info succeeded.
Wed Apr 23 00:28:02 2014 - [info] Switching master to 192.168.0.60(192.168.0.60:3306) completed successfully.

其中参数的意思：

--orig_master_is_new_slave 切换时加上此参数是将原 master 变为 slave 节点，如果不加此参数，原来的 master 将不启动

--running_updates_limit=10000,故障切换时,候选master 如果有延迟的话， mha 切换不能成功，加上此参数表示延迟在此时间范围内都可切换（单位为s），但是切换的时间长短是由recover 时relay 日志的大小决定

注意：由于在线进行切换需要调用到master_ip_online_change这个脚本，但是由于该脚本不完整，需要自己进行相应的修改，我google到后发现还是有问题，脚本中new_master_password这个变量获取不到，导致在线切换失败，所以进行了相关的硬编码，直接把mysql的root用户密码赋值给变量new_master_password，如果有哪位大牛知道原因，请指点指点。这个脚本还可以管理vip。下面贴出脚本：

#!/usr/bin/env perl

#  Copyright (C) 2011 DeNA Co.,Ltd.
#
#  This program is free software; you can redistribute it and/or modify
#  it under the terms of the GNU General Public License as published by
#  the Free Software Foundation; either version 2 of the License, or
#  (at your option) any later version.
#
#  This program is distributed in the hope that it will be useful,
#  but WITHOUT ANY WARRANTY; without even the implied warranty of
#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#  GNU General Public License for more details.
#
#  You should have received a copy of the GNU General Public License
#   along with this program; if not, write to the Free Software
#  Foundation, Inc.,
#  51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA

## Note: This is a sample script and is not complete. Modify the script based on your environment.

use strict;
use warnings FATAL => 'all';

use Getopt::Long;
use MHA::DBHelper;
use MHA::NodeUtil;
use Time::HiRes qw( sleep gettimeofday tv_interval );
use Data::Dumper;

my $_tstart;
my $_running_interval = 0.1;
my (
  $command,          $orig_master_host, $orig_master_ip,
  $orig_master_port, $orig_master_user, 
  $new_master_host,  $new_master_ip,    $new_master_port,
  $new_master_user,  
);


my $vip = '192.168.0.88/24';  # Virtual IP 
my $key = "1"; 
my $ssh_start_vip = "/sbin/ifconfig eth1:$key $vip";
my $ssh_stop_vip = "/sbin/ifconfig eth1:$key down";
my $ssh_user = "root";
my $new_master_password='123456';
my $orig_master_password='123456';
GetOptions(
  'command=s'              => \$command,
  #'ssh_user=s'             => \$ssh_user,  
  'orig_master_host=s'     => \$orig_master_host,
  'orig_master_ip=s'       => \$orig_master_ip,
  'orig_master_port=i'     => \$orig_master_port,
  'orig_master_user=s'     => \$orig_master_user,
  #'orig_master_password=s' => \$orig_master_password,
  'new_master_host=s'      => \$new_master_host,
  'new_master_ip=s'        => \$new_master_ip,
  'new_master_port=i'      => \$new_master_port,
  'new_master_user=s'      => \$new_master_user,
  #'new_master_password=s'  => \$new_master_password,
);

exit &main();

sub current_time_us {
  my ( $sec, $microsec ) = gettimeofday();
  my $curdate = localtime($sec);
  return $curdate . " " . sprintf( "%06d", $microsec );
}

sub sleep_until {
  my $elapsed = tv_interval($_tstart);
  if ( $_running_interval > $elapsed ) {
    sleep( $_running_interval - $elapsed );
  }
}

sub get_threads_util {
  my $dbh                    = shift;
  my $my_connection_id       = shift;
  my $running_time_threshold = shift;
  my $type                   = shift;
  $running_time_threshold = 0 unless ($running_time_threshold);
  $type                   = 0 unless ($type);
  my @threads;

  my $sth = $dbh->prepare("SHOW PROCESSLIST");
  $sth->execute();

  while ( my $ref = $sth->fetchrow_hashref() ) {
    my $id         = $ref->{Id};
    my $user       = $ref->{User};
    my $host       = $ref->{Host};
    my $command    = $ref->{Command};
    my $state      = $ref->{State};
    my $query_time = $ref->{Time};
    my $info       = $ref->{Info};
    $info =~ s/^\s*(.*?)\s*$/$1/ if defined($info);
    next if ( $my_connection_id == $id );
    next if ( defined($query_time) && $query_time < $running_time_threshold );
    next if ( defined($command)    && $command eq "Binlog Dump" );
    next if ( defined($user)       && $user eq "system user" );
    next
      if ( defined($command)
      && $command eq "Sleep"
      && defined($query_time)
      && $query_time >= 1 );

    if ( $type >= 1 ) {
      next if ( defined($command) && $command eq "Sleep" );
      next if ( defined($command) && $command eq "Connect" );
    }

    if ( $type >= 2 ) {
      next if ( defined($info) && $info =~ m/^select/i );
      next if ( defined($info) && $info =~ m/^show/i );
    }

    push @threads, $ref;
  }
  return @threads;
}

sub main {
  if ( $command eq "stop" ) {
    ## Gracefully killing connections on the current master
    # 1. Set read_only= 1 on the new master
    # 2. DROP USER so that no app user can establish new connections
    # 3. Set read_only= 1 on the current master
    # 4. Kill current queries
    # * Any database access failure will result in script die.
    my $exit_code = 1;
    eval {
      ## Setting read_only=1 on the new master (to avoid accident)
      my $new_master_handler = new MHA::DBHelper();

      # args: hostname, port, user, password, raise_error(die_on_error)_or_not
      $new_master_handler->connect( $new_master_ip, $new_master_port,
        $new_master_user, $new_master_password, 1 );
      print current_time_us() . " Set read_only on the new master.. ";
      $new_master_handler->enable_read_only();
      if ( $new_master_handler->is_read_only() ) {
        print "ok.\n";
      }
      else {
        die "Failed!\n";
      }
      $new_master_handler->disconnect();

      # Connecting to the orig master, die if any database error happens
      my $orig_master_handler = new MHA::DBHelper();
      $orig_master_handler->connect( $orig_master_ip, $orig_master_port,
        $orig_master_user, $orig_master_password, 1 );

      ## Drop application user so that nobody can connect. Disabling per-session binlog beforehand
      #$orig_master_handler->disable_log_bin_local();
      #print current_time_us() . " Drpping app user on the orig master..\n";
      #FIXME_xxx_drop_app_user($orig_master_handler);

      ## Waiting for N * 100 milliseconds so that current connections can exit
      my $time_until_read_only = 15;
      $_tstart = [gettimeofday];
      my @threads = get_threads_util( $orig_master_handler->{dbh},
        $orig_master_handler->{connection_id} );
      while ( $time_until_read_only > 0 && $#threads >= 0 ) {
        if ( $time_until_read_only % 5 == 0 ) {
          printf
"%s Waiting all running %d threads are disconnected.. (max %d milliseconds)\n",
            current_time_us(), $#threads + 1, $time_until_read_only * 100;
          if ( $#threads < 5 ) {
            print Data::Dumper->new( [$_] )->Indent(0)->Terse(1)->Dump . "\n"
              foreach (@threads);
          }
        }
        sleep_until();
        $_tstart = [gettimeofday];
        $time_until_read_only--;
        @threads = get_threads_util( $orig_master_handler->{dbh},
          $orig_master_handler->{connection_id} );
      }

      ## Setting read_only=1 on the current master so that nobody(except SUPER) can write
      print current_time_us() . " Set read_only=1 on the orig master.. ";
      $orig_master_handler->enable_read_only();
      if ( $orig_master_handler->is_read_only() ) {
        print "ok.\n";
      }
      else {
        die "Failed!\n";
      }

      ## Waiting for M * 100 milliseconds so that current update queries can complete
      my $time_until_kill_threads = 5;
      @threads = get_threads_util( $orig_master_handler->{dbh},
        $orig_master_handler->{connection_id} );
      while ( $time_until_kill_threads > 0 && $#threads >= 0 ) {
        if ( $time_until_kill_threads % 5 == 0 ) {
          printf
"%s Waiting all running %d queries are disconnected.. (max %d milliseconds)\n",
            current_time_us(), $#threads + 1, $time_until_kill_threads * 100;
          if ( $#threads < 5 ) {
            print Data::Dumper->new( [$_] )->Indent(0)->Terse(1)->Dump . "\n"
              foreach (@threads);
          }
        }
        sleep_until();
        $_tstart = [gettimeofday];
        $time_until_kill_threads--;
        @threads = get_threads_util( $orig_master_handler->{dbh},
          $orig_master_handler->{connection_id} );
      }



                print "Disabling the VIP on old master: $orig_master_host \n";
                &stop_vip();     


      ## Terminating all threads
      print current_time_us() . " Killing all application threads..\n";
      $orig_master_handler->kill_threads(@threads) if ( $#threads >= 0 );
      print current_time_us() . " done.\n";
      #$orig_master_handler->enable_log_bin_local();
      $orig_master_handler->disconnect();

      ## After finishing the script, MHA executes FLUSH TABLES WITH READ LOCK
      $exit_code = 0;
    };
    if ($@) {
      warn "Got Error: $@\n";
      exit $exit_code;
    }
    exit $exit_code;
  }
  elsif ( $command eq "start" ) {
    ## Activating master ip on the new master
    # 1. Create app user with write privileges
    # 2. Moving backup script if needed
    # 3. Register new master's ip to the catalog database

# We don't return error even though activating updatable accounts/ip failed so that we don't interrupt slaves' recovery.
# If exit code is 0 or 10, MHA does not abort
    my $exit_code = 10;
    eval {
      my $new_master_handler = new MHA::DBHelper();

      # args: hostname, port, user, password, raise_error_or_not
      $new_master_handler->connect( $new_master_ip, $new_master_port,
        $new_master_user, $new_master_password, 1 );

      ## Set read_only=0 on the new master
      #$new_master_handler->disable_log_bin_local();
      print current_time_us() . " Set read_only=0 on the new master.\n";
      $new_master_handler->disable_read_only();

      ## Creating an app user on the new master
      #print current_time_us() . " Creating app user on the new master..\n";
      #FIXME_xxx_create_app_user($new_master_handler);
      #$new_master_handler->enable_log_bin_local();
      $new_master_handler->disconnect();

      ## Update master ip on the catalog database, etc
                print "Enabling the VIP - $vip on the new master - $new_master_host \n";
                &start_vip();
                $exit_code = 0;
    };
    if ($@) {
      warn "Got Error: $@\n";
      exit $exit_code;
    }
    exit $exit_code;
  }
  elsif ( $command eq "status" ) {

    # do nothing
    exit 0;
  }
  else {
    &usage();
    exit 1;
  }
}

# A simple system call that enable the VIP on the new master 
sub start_vip() {
    `ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}
# A simple system call that disable the VIP on the old_master
sub stop_vip() {
    `ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}

sub usage {
  print
"Usage: master_ip_online_change --command=start|stop|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
  die;
}

四.修复宕机的Master

通常情况下自动切换以后，原master可能已经废弃掉，待原master主机修复后，如果数据完整的情况下，可能想把原来master重新作为新主库的slave，这时我们可以借助当时自动切换时刻的MHA日志来完成对原master的修复。下面是提取相关日志的命令：

[root@192.168.0.20 app1]# grep -i "All other slaves should start" manager.log 
Mon Apr 21 22:28:33 2014 - [info]  All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='192.168.0.60', MASTER_PORT=3306, MASTER_LOG_FILE='mysql-bin.000022', MASTER_LOG_POS=506716, MASTER_USER='repl', MASTER_PASSWORD='xxx';
[root@192.168.0.20 app1]#

获取上述信息以后，就可以直接在修复后的master上执行change master to相关操作，重新作为从库了。

最后补充一下邮件发送脚本send_report ，这个脚本在询问一位朋友后可以使用，如下：

#!/usr/bin/perl

#  Copyright (C) 2011 DeNA Co.,Ltd.
#
#  This program is free software; you can redistribute it and/or modify
#  it under the terms of the GNU General Public License as published by
#  the Free Software Foundation; either version 2 of the License, or
#  (at your option) any later version.
#
#  This program is distributed in the hope that it will be useful,
#  but WITHOUT ANY WARRANTY; without even the implied warranty of
#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#  GNU General Public License for more details.
#
#  You should have received a copy of the GNU General Public License
#   along with this program; if not, write to the Free Software
#  Foundation, Inc.,
#  51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA

## Note: This is a sample script and is not complete. Modify the script based on your environment.

use strict;
use warnings FATAL => 'all';
use Mail::Sender;
use Getopt::Long;

#new_master_host and new_slave_hosts are set only when recovering master succeeded
my ( $dead_master_host, $new_master_host, $new_slave_hosts, $subject, $body );
my $smtp='smtp.163.com';
my $mail_from='xxxx';
my $mail_user='xxxxx';
my $mail_pass='xxxxx';
my $mail_to=['xxxx','xxxx'];
GetOptions(
  'orig_master_host=s' => \$dead_master_host,
  'new_master_host=s'  => \$new_master_host,
  'new_slave_hosts=s'  => \$new_slave_hosts,
  'subject=s'          => \$subject,
  'body=s'             => \$body,
);

mailToContacts($smtp,$mail_from,$mail_user,$mail_pass,$mail_to,$subject,$body);

sub mailToContacts {
    my ( $smtp, $mail_from, $user, $passwd, $mail_to, $subject, $msg ) = @_;
    open my $DEBUG, "> /tmp/monitormail.log"
        or die "Can't open the debug      file:$!\n";
    my $sender = new Mail::Sender {
        ctype       => 'text/plain; charset=utf-8',
        encoding    => 'utf-8',
        smtp        => $smtp,
        from        => $mail_from,
        auth        => 'LOGIN',
        TLS_allowed => '0',
        authid      => $user,
        authpwd     => $passwd,
        to          => $mail_to,
        subject     => $subject,
        debug       => $DEBUG
    };

    $sender->MailMsg(
        {   msg   => $msg,
            debug => $DEBUG
        }
    ) or print $Mail::Sender::Error;
    return 1;
}
# Do whatever you want here
exit 0;

最后切换以后发送告警的邮件示例，注意，这个是我后续的测试，和上面环境出现的ip不一致不要在意。

master failover mail.png

总结：
目前高可用方案可以一定程度上实现数据库的高可用，比如前面文章介绍的MMM，heartbeat+drbd，Cluster等。还有percona的Galera Cluster等。这些高可用软件各有优劣。在进行高可用方案选择时，主要是看业务还有对数据一致性方面的要求。最后出于对数据库的高可用和数据一致性的要求，推荐使用MHA架构。

参考资料：
http://huoding.com/2011/12/18/139
http://blog.itpub.net/88305/viewspace-730135/
https://code.google.com/p/mysql-master-ha/wiki/TableOfContents?tm=6（自备梯子）
http://k-1-ne-jp.blogspot.com/2013/02/masterhamasterswitch_13.html（自备梯子）

最后编辑于：2017.12.09 23:12:45

人面猴
序言：七十年代末，一起剥皮案震惊了整个滨河市，随后出现的几起案子，更是在滨河造成了极大的恐慌，老刑警刘岩，带你破解...
沈念sama阅读 204,684评论 6赞 478
死咒
序言：滨河连续发生了三起死亡事件，死亡现场离奇诡异，居然都是意外死亡，警方通过查阅死者的电脑和手机，发现死者居然都...
沈念sama阅读 87,143评论 2赞 381
救了他两次的神仙让他今天三更去死
文/潘晓璐我一进店门，熙熙楼的掌柜王于贵愁眉苦脸地迎上来，“玉大人，你说我怎么就摊上这事。” “怎么了？”我有些...
开封第一讲书人阅读 151,214评论 0赞 337
道士缉凶录：失踪的卖姜人
文/不坏的土叔我叫张陵，是天一观的道长。经常有香客问我，道长，这世上最难降的妖魔是什么？我笑而不...
开封第一讲书人阅读 54,788评论 1赞 277
港岛之恋（遗憾婚礼）
正文为了忘掉前任，我火速办了婚礼，结果婚礼上，老公的妹妹穿的比我还像新娘。我一直安慰自己，他们只是感情好，可当我...
茶点故事阅读 63,796评论 5赞 368
恶毒庶女顶嫁案：这布局不是一般人想出来的
文/花漫我一把揭开白布。她就那样静静地躺着，像睡着了一般。火红的嫁衣衬着肌肤如雪。梳的纹丝不乱的头发上，一...
开封第一讲书人阅读 48,665评论 1赞 281
城市分裂传说
那天，我揣着相机与录音，去河边找鬼。笑死，一个胖子当着我的面吹牛，可吹牛的内容都是我干的。我是一名探鬼主播，决...
沈念sama阅读 38,027评论 3赞 399
双鸳鸯连环套：你想象不到人心有多黑
文/苍兰香墨我猛地睁开眼，长吁一口气：“原来是场噩梦啊……” “哼！你这毒妇竟也来了？” 一声冷哼从身侧响起，我...
开封第一讲书人阅读 36,679评论 0赞 258
万荣杀人案实录
序言：老挝万荣一对情侣失踪，失踪者是张志新（化名）和其女友刘颖，没想到半个月后，有当地人在树林里发现了一具尸体，经...
沈念sama阅读 41,346评论 1赞 299
护林员之死
正文独居荒郊野岭守林人离奇死亡，尸身上长有42处带血的脓包…… 初始之章·张勋以下内容为张勋视角年9月15日...
茶点故事阅读 35,664评论 2赞 321
白月光启示录
正文我和宋清朗相恋三年，在试婚纱的时候发现自己被绿了。大学时的朋友给我发了我未婚夫和他白月光在一起吃饭的照片。...
茶点故事阅读 37,766评论 1赞 331
活死人
序言：一个原本活蹦乱跳的男人离奇死亡，死状恐怖，灵堂内的尸体忽然破棺而出，到底是诈尸还是另有隐情，我是刑警宁泽，带...
沈念sama阅读 33,412评论 4赞 321
日本核电站爆炸内幕
正文年R本政府宣布，位于F岛的核电站，受9级特大地震影响，放射性物质发生泄漏。R本人自食恶果不足惜，却给世界环境...
茶点故事阅读 39,015评论 3赞 307
男人毒药：我在死后第九天来索命
文/蒙蒙一、第九天我趴在偏房一处隐蔽的房顶上张望。院中可真热闹，春花似锦、人声如沸。这庄子的主人今日做“春日...
开封第一讲书人阅读 29,974评论 0赞 19
一桩弑父案，背后竟有这般阴谋
文/苍兰香墨我抬头看了看天上的太阳。三九已至，却和暖如春，着一层夹袄步出监牢的瞬间，已是汗流浃背。一阵脚步声响...
开封第一讲书人阅读 31,203评论 1赞 260
情欲美人皮
我被黑心中介骗来泰国打工，没想到刚下飞机就差点儿被人妖公主榨干…… 1. 我叫王不留，地道东北人。一个月前我还...
沈念sama阅读 45,073评论 2赞 350
代替公主和亲
正文我出身青楼，却偏偏与公主长得像，于是被迫代替她去往敌国和亲。传闻我的和亲对象是个残疾皇子，可洞房花烛夜当晚...
茶点故事阅读 42,501评论 2赞 343

MySQL高可用架构——MHA(二)

MySQL高可用架构——MHA(二)

上一篇

Running MHA Manager from daemontools

推荐阅读更多精彩内容