MySQL · 源碼分析 · MySQL replication partial transaction

replication 概述

目前MySQL支持的replication方式多種多樣
1. 普通的master-slave 異步replication
2. 半同步的semi-sync replication
3. 支持多通道的group replication和double binlog

如果按連接協議來區分，又可以分為

非GTID模式，通過binlog文件名和文件的偏移來決定replication位點信息
GTID模式，通過GTID信息來決定replication位點信息

如果按apply binglog的方式來區分，又可以分為

串行，按binlog event順序依次執行
並行，以db, table或transaction為粒度的並行複製，以及基於group commit的LOGICAL_CLOCK並行複製

不論哪種replication, 都離不開replication最基本的組件，

IO thread，負責從master拉取binlog.
SQL thread，負責apply relay log binlog.

replication 異常

複製過程中，由於網絡或者master主機宕機，都會造成slave IO thread異常中斷。
例如以下事務在複製過程中發生上述異常，

SET GTID_NEXT;        # GTID設置為ON時           
BEGIN;  
INSERT row1;
INSERT row2;
COMMIT;

那麼備庫接收的binlog可能不包含完整的事務，備庫可能僅接收到BEGIN，也可能隻接收到INSERT row1.

然而，當IO thread恢複後，SQL線程怎麼正確處理這種異常呢？

異常恢複

IO thread 異常中斷後，SQL線程是正常工作的，SQL執行了部分事務，它會等待IO 線程發送新的binlog. IO thread 線程恢複後，SQL線程可以選擇繼續執行事務或者回滾事務重新執行事務，這是由replication協議決定的。

GTID模式下，設置auto_position=1時，slave會根據GTID信息，從事務起點開始，重新將事務完整binlog發給備庫。此時，備庫需要回滾之前的部分事務。
GTID模式下，設置auto_position=0或非GTID模式下，slave會根據位點信息從master續傳之前的binlog。此時，備庫可以繼續完成之前的部分事務。

繼續執行事務比較簡單，但是回滾之前的部分事務就比較複雜.

分為兩種情況來分析：

串行複製

串行複製時，完整的事務會由SQL thread來執行，當執行到GTID_LOG_EVENT時，會發這個GTID已經分配過了，這時候就可以回滾事物。具體參考

Gtid_log_event::do_apply_event()

  if (thd->owned_gtid.sidno)
  {
    /*
      Slave will execute this code if a previous Gtid_log_event was applied
      but the GTID wasn't consumed yet (the transaction was not committed
      nor rolled back).
      On a client session we cannot do consecutive SET GTID_NEXT without
      a COMMIT or a ROLLBACK in the middle.
      Applying this event without rolling back the current transaction may
      lead to problems, as a "BEGIN" event following this GTID will
      implicitly commit the "partial transaction" and will consume the
      GTID. If this "partial transaction" was left in the relay log by the
      IO thread restarting in the middle of a transaction, you could have
      the partial transaction being logged with the GTID on the slave,
      causing data corruption on replication.
    */
    if (thd->transaction.all.ha_list)
    {
      /* This is not an error (XA is safe), just an information */
      rli->report(INFORMATION_LEVEL, 0,
                  "Rolling back unfinished transaction (no COMMIT "
                  "or ROLLBACK in relay log). A probable cause is partial "
                  "transaction left on relay log because of restarting IO "
                  "thread with auto-positioning protocol.");
      const_cast<Relay_log_info*>(rli)->cleanup_context(thd, 1);
    }
    gtid_rollback(thd);
  }

並行複製

並行複製有別於串行複製，binlog event由worker線程執行。按串行複製的方式來回滾事務是行不通的，因為重新發送的事務binlog並不一定會分配原來的worker來執行。因此，回滾操作需交給coordinate線程(即sql線程)來完成。
GTID模式下，設置auto_position=1時. IO thread重連時，都會發送
ROTATE_LOG_EVENT和FORMAT_DESCRIPTION_EVENT. 並且FORMAT_DESCRIPTION_EVENT的log_pos>0. 通過非auto_position方式重連的FORMAT_DESCRIPTION_EVENT的log_pos在send之前會被置為0. SQL線程通過執行FORMAT_DESCRIPTION_EVENT且其log_pos>0來判斷是否應進入回滾邏輯。而回滾是通過構造Rollback event讓work來執行的。
具體參考

exec_relay_log_event()
/*
      GTID protocol will put a FORMAT_DESCRIPTION_EVENT from the master with
      log_pos != 0 after each (re)connection if auto positioning is enabled.
      This means that the SQL thread might have already started to apply the
      current group but, as the IO thread had to reconnect, it left this
      group incomplete and will start it again from the beginning.
      So, before applying this FORMAT_DESCRIPTION_EVENT, we must let the
      worker roll back the current group and gracefully finish its work,
      before starting to apply the new (complete) copy of the group.
    */
    if (ev->get_type_code() == FORMAT_DESCRIPTION_EVENT &&
        ev->server_id != ::server_id && ev->log_pos != 0 &&
        rli->is_parallel_exec() && rli->curr_group_seen_gtid)
    {
      if (coord_handle_partial_binlogged_transaction(rli, ev))
        /*
          In the case of an error, coord_handle_partial_binlogged_transaction
          will not try to get the rli->data_lock again.
        */
        DBUG_RETURN(1);
    }

MySQL官方針對此問題有過多次改進，詳見以下commit

666aec4a9e976bef4ddd90246c4a31dd456cbca3
3f6ed37fa218ef6a39f28adc896ac0d2f0077ddb
9e2140fc8764feeddd70c58983a8b50f52a12f18

異常case處理

當slave SQL線程處於部分事務異常時，按上節的邏輯，IO thread恢複後，複製是可以正常進行的。但如果IO thread如果長時間不能恢複，那麼SQL apply線程會一直等待新的binlog，並且會一直持有事務中的鎖。當slave切換為master後，新master會接受用戶連接處理事務，這樣SQL apply線程持有的事務鎖，可能阻塞用戶線程的事務。這是我們不希望看到的。

此時可以通過stop slave來停止SQL apply線程，讓事務回滾釋放鎖。

另一種更好的方案是讓SQL apply 線程自動識別這種情況，並加以處理。比如，增加等待超時機製，超時後自動kill sql 線程或回滾SQL線程的部分事務。

最後更新：2017-08-21 09:02:51

MySQL · 源碼分析 · MySQL replication partial transaction

replication 概述

replication 異常

異常恢複

異常case處理

上一篇： MySQL · 特性分析 · 到底是誰執行了FTWL

下一篇： PgSQL · 特性介紹 · 列存元數據掃描介紹

相關內容

熱門內容

最新內容