448 阿裏雲技術社區[雲棲]

semi-sync插件崩潰導致MySQL重啟故障詳析

本文講的是semi-sync插件崩潰導致MySQL重啟故障詳析

導讀

semi-sync插件崩潰導致MySQL重啟的故障分析全過程~

作者：劉安
現為愛可生高級測試工程師，擁有豐富的自動測試開發經驗；曾先後在文思海輝、Splunk擔任測試工程師。

一、起因：

在公司測試MySQL高可用組件時發現一個異常。如下：

停止從庫，高可用組件將從庫自動啟動後，主庫發生了重啟。然而，正常情況下，主庫不應發生重啟。

二、環境：

1、OS： CentOS release 6.7 (Final)；

2、MySQL：Ver 14.14 Distrib 5.7.13, for linux-glibc2.5 (x86_64) using EditLine wrapper；

3、主從實例開啟半同步 MySQL semi-sync配置文檔。

三、分析：

(一）首先，分析log文件：

在主庫的mysql-error.log中，我們可以發現如下幾點：

1、主庫的semi-sync在崩潰前做了啟停動作：

2017-07-25T16:02:38.636061+08:00 40 [Note] Semi-sync replication switched OFF.
2017-07-25T16:02:38.636105+08:00 40 [Note] Semi-sync replication disabled on the master.
2017-07-25T16:02:38.636137+08:00 0 [Note] Stopping ack receiver thread
2017-07-25T16:02:38.638008+08:00 40 [Note] Semi-sync replication enabled on the master.
2017-07-25T16:02:38.638131+08:00 0 [Note] Starting ack receiver thread

2、主庫的semi-sync啟動後，立即引發了semi-sync插件崩潰並拋出Assert異常。


mysqld: /export/home/pb2/build/sb_0-19016729-1464157976.67/mysql-5.7.13/plugin/semisync/semisync_master.cc:844: int ReplSemiSyncMaster::commitTrx(const char*, my_off_t): Assertion `entry' failed.
08:02:38 UTC - mysqld got signal 6 ;

3、這裏還有一個相當重要的複現故障的線索。簡單說明一下，這條SQL語句是為了檢測主從數據差，而由高可用中間件不斷地向數據庫更新時間戳。


Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (7f408c0054c0): update universe.u_delay set real_timestamp=now(), logic_timestamp = logic_timestamp + 1 where source = 'ustats'
Connection ID (thread ID): 61
Status: NOT_KILLED

到這裏初步判斷是這個故障應與設置rpl_semi_sync_master_enabled開關相關，並且應該當時是有事務提交的。

（二）其次，進一步驗證：

這裏隻部署一組開啟semi-sync的MySQL實例，而不安裝高可用組件，利用bash腳本向主庫不斷插入數據：


#!/usr/bin/env bash
/opt/mysql/base/bin/mysql -uroot -p1 -S /opt/mysql/data/3306/mysqld.sock -e "create database if not exists test;use test;drop table if exists t1;create table t1(id int) "
i=0
while true 
do
    /opt/mysql/base/bin/mysql -uroot -p1 -S /opt/mysql/data/3306/mysqld.sock -e "insert into test.t1 values($i)"
    i=$((i+1))
done

在主庫的機器上，反複運行如下命令，啟停semi-sync master，不超過5次必能複現這個故障。


/opt/mysql/base/bin/mysql -uroot -p1 -S /opt/mysql/data/3306/mysqld.sock -e'SET GLOBAL rpl_semi_sync_master_enabled = OFF;SET GLOBAL rpl_semi_sync_master_enabled = ON'

所以這個故障複現有了兩個必要條件：

啟停semi-sync master；

數據庫當時有事務提交。

（三）最後分析MySQL源碼：

至於為什麼不能每次啟停semi-sync master都能觸發這個故障，那我們就要看看MySQL的源碼了。還好mysql-error.log明確地指出了拋出異常的位置：


mysqld: /export/home/pb2/build/sb_0-19016729-1464157976.67/mysql-5.7.13/plugin/semisync/semisync_master.cc:844: int ReplSemiSyncMaster::commitTrx(const char*, my_off_t): Assertion `entry' failed.
08:02:38 UTC - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
Attempting to collect some information that could help diagnose the problem.
As this is a crash and something is definitely wrong, the information
collection process might fail.

找到MySQL源碼，拋出異常的的位置 mysql-5.7.13 semisync_master.cc。


      /* wait for the position to be ACK'ed back */
      assert(entry);
      entry->n_waiters++;
      wait_result= mysql_cond_timedwait(&entry->cond, &LOCK_binlog_, &abstime);

看來是這個‘entry’的值為NULL，才拋出的異常，但是這還沒有回答我的疑問；再看MySQL的報錯信息，MySQL自己也認識到這應該是個BUG，那它有沒有修複這個故障呢？

我找到MySQL最新5.7.19的版本，查看它的曆史記錄mysql-5.7.19 semisync_master.cc history 。

這正是我想要的修複和提交，打開看看有沒有對產生故障原因的分析，Bug#22202516: ENABLING SEMI-SYNC DURING COMMIT CAN CAUSE MASTER TO ASSERT。

為了方便解釋這個BUG, 這裏先簡單的介紹一下MySQL binlog提交過程：MySQL 在5.6版本引入了Binary Log Group Commit，二進製日誌的提交由此可以簡化為三個階段：

Flush stage：將事務的日誌寫入binlog文件的緩存中
Sync stage: 將binlog文件中的緩存數據寫入磁盤
Commit stage：根據順序調用存儲引擎提交事務

MYSQL_BIN_LOG::ordered_commit這是事務在binlog階段提交的核心函數。通過該函數，實現了事務日誌寫入binlog文件，以及觸發dump線程將binlog發送到從庫，在最後將事務設置為提交狀態；

事實上有沒有semi-sync機製，上麵介紹的MySQL的有關事務提交中關於binlog的流程都是一樣的。semi-sync的作用隻是主從之間的一個確認過程，主庫等待從庫返回相關位置的binlog已經同步到從庫的確認，沒有得到確認之前，事務提交則在該函數（步驟）上等待直至獲得返回。

在半同步複製中，為了在主庫提交事務時等待從庫的確認，才引出了由‘entry’組成的active transaction鏈表。

在BUG分析中提到：

首先在flush stage，由semi-sync創建一個‘entry’，並關聯一個或一組處於flush stage的事務。這個‘entry’被插入一個active transaction鏈表。


int ReplSemiSyncMaster::writeTranxInBinlog(const char* log_file_name, my_off_t log_file_pos)
{
  ...
if (is_on())
{
  assert(active_tranxs_ != NULL);
  if(active_tranxs_->insert_tranx_node(log_file_name, log_file_pos))
  {
    /*
      if insert tranx_node failed, print a warning message
      and turn off semi-sync
    */
    sql_print_warning("Semi-sync failed to insert tranx_node for binlog file: %s, position: %lu",
                      log_file_name, (ulong)log_file_pos);
    switch_off();
  }
}
 ...
}

接下來是sync stage，在sync stage每個處於這個階段的線程將會設置trx_wait_binlog_name和trx_wait_binlog_pos為事務對應binlog的相對位置；

最後再commit stage，semi-sync為了等從庫的確認，會用trx_wait_binlog_name和trx_wait_binlog_pos獲取這個位置關聯的'entry',


int ReplSemiSyncMaster::commitTrx(const char* trx_wait_binlog_name, my_off_t trx_wait_binlog_pos)
{
...
TranxNode* entry= NULL;
mysql_cond_t* thd_cond= NULL;
if (active_tranxs_ != NULL && trx_wait_binlog_name)
{
  entry=
    active_tranxs_->find_active_tranx_node(trx_wait_binlog_name,
                                           trx_wait_binlog_pos);
  if (entry)
    thd_cond= &entry->cond;
}
 ...
}

這裏有兩個場景會出現‘entry’找不到的情況：

1、從庫已經確認的事務在binlog位置比當前等待從庫確認的事務在binlog的位置要大；

2、當事務進入flush stage的時候semi-sync還沒有開啟，也就沒有‘entry’被創建，然後插入active transaction鏈表。

情況1永遠也不會進入等待階段，因為它已經被從庫確認過了；

情況2則會導致上述的斷言異常，因為這裏它無法在active transaction鏈表找到相應的’entry‘。

到這裏我的疑問終於解開，原來開啟semi-sync master的時機很重要，要恰巧撞到事務提交過了flush stage但還沒有到commit stage這個縫隙。

四、結論：

最終修複的結果也很明顯，就是遇到找不到‘entry’並且從庫也沒有確認的情況，這個事務提交就被認為是一次異步提交。

再看一眼在哪些版本修複了這個BUG：

所以升級MySQL版本可以解決問題。

五、複盤：

最後總結一下我的診斷路徑：

觀察故障出現的現象，分析MySQL的error log；
通過猜想和實驗，構成簡單的複現場景；
通過日誌和MySQL源碼的對應關係, 搜索代碼曆史, 定位BUG號；
通過閱讀BUG的分析，了解異常發生的原理並知道了詳細的複現條件。

原文發布時間為：2017-09-04
本文作者：劉安
本文來自雲棲社區合作夥伴“老葉茶館”，了解相關信息可以關注“老葉茶館”微信公眾號

最後更新：2017-09-22 16:03:59

semi-sync插件崩潰導致MySQL重啟故障詳析

上一篇：道旅鬼穀子分享：如何打好業務監控的組合拳

下一篇：基於 Laravel 開發 ThinkSNS+ 中前端的抉擇

相關內容

熱門內容

最新內容

semi-sync插件崩潰導致MySQL重啟故障詳析

上一篇： 道旅鬼穀子分享：如何打好業務監控的組合拳

下一篇： 基於 Laravel 開發 ThinkSNS+ 中前端的抉擇

相關內容

熱門內容

最新內容

上一篇：道旅鬼穀子分享：如何打好業務監控的組合拳

下一篇：基於 Laravel 開發 ThinkSNS+ 中前端的抉擇