AliSQL · Feature Introduction · Dynamic Add Column

Background

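Adding a column is the most common kind of schema change driven by business requirements, and the way InnoDB tables handle it has long been a pain point for operations staff.
InnoDB does support an online method, but the cost of copying the whole table keeps growing as the tablespace grows.
AliSQL builds on InnoDB's compact record format and introduces a new record format, comfort, that supports adding columns dynamically.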

Usage

An example:

CREATE TABLE test(
id int primary key,
name varchar(100),
key(name)
)ENGINE=InnoDB  ROW_FORMAT=comfort;

ALTER TABLE test ADD col1 INT;

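No new syntax is introduced here; only a new InnoDB record format is added, and the ALTER statement stays the same.
The ROW_FORMAT can be checked with SHOW CREATE TABLE or by querying information_schema.tables.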

mysql> show create table test\G
*************************** 1. row ***************************
       Table: test
Create Table: CREATE TABLE `test` (
  `id` int(11) NOT NULL,
  `name` varchar(100) DEFAULT NULL,
  `col1` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `name` (`name`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 ROW_FORMAT=COMFORT
1 row in set (0.00 sec)

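Implementation

AliSQL defines a new record format named comfort, which evolved from compact:

The Compact row record format:

Variable-length field length list: a column that can hold at most 255 bytes uses 1 byte for its length; a column that can exceed 255 bytes may need 2 bytes.
NULL flags: a bitmap with one bit per nullable column that marks which columns of the row are NULL. It occupies one byte for up to eight nullable columns.
Record header: a fixed 5 bytes; the meaning of each bit is listed in the table below: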

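Name	Size (bits)	Description
(unused)	1	Unused
(unused)	1	Unused
delete_flag	1	Whether the row has been deleted
min_rec_flag	1	Set to 1 if the record is the predefined minimum record
n_owned	4	Number of records owned by this record
heap_no	13	Position of this record in the index page heap
record_type	3	Record type: 000 = ordinary, 001 = B+ tree node pointer, 010 = infimum, 011 = supremum, 1xx = reserved
next_record	16	Relative offset of the next record in the page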

The new Comfort record format is as follows:

[Lens | N_nulls | N_fields | Extra_bytes | columns...]

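Where:
1. One bit of info_bits in Extra_bytes marks a comfort record; it is one of the two unused bits in the record header.
2. A new N_fields item takes 1 or 2 bytes and stores the number of columns in the current record:
1 byte when the column count is less than 128,
2 bytes when the column count is 128 or more.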
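
The 1-byte/2-byte rule for N_fields can be sketched as below. This is only an illustration: the article does not give the byte layout, so the use of the first byte's high bit as a "two bytes follow" flag, and the names encode_n_fields and decode_n_fields, are assumptions rather than AliSQL's actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical encoding of a per-record column count (N_fields).
// Assumption: counts below 128 fit in one byte; larger counts set the high
// bit of the first byte and keep the low 8 bits in a second byte.
std::vector<std::uint8_t> encode_n_fields(std::uint16_t n) {
    assert(n < 32768);  // 15 usable bits in the two-byte form
    if (n < 128) {
        return {static_cast<std::uint8_t>(n)};
    }
    return {static_cast<std::uint8_t>(0x80 | (n >> 8)),
            static_cast<std::uint8_t>(n & 0xFF)};
}

// Decodes a count written by encode_n_fields and reports how many bytes
// were consumed (1 or 2) through *used.
std::uint16_t decode_n_fields(const std::uint8_t* p, std::size_t* used) {
    assert(p != nullptr && used != nullptr);
    if ((p[0] & 0x80) == 0) {
        *used = 1;
        return p[0];
    }
    *used = 2;
    return static_cast<std::uint16_t>(((p[0] & 0x7F) << 8) | p[1]);
}
```

With this layout, a record with 3 columns spends one byte on N_fields and a record with 200 columns spends two.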

Implementation Logic

Suppose the change is the following case:

CREATE TABLE `test` (
  `id` int(11) NOT NULL,
  `name` varchar(100) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `name` (`name`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 ROW_FORMAT=COMFORT;

alter table test add col1 int;

1. The ALTER change

1. Update the n_cols field in the SYS_TABLES data dictionary table, i.e. the column count.
The InnoDB statement for this change is:

     trx->op_info = "Updating column in SYS_TABLES";
     /* N_COLS include compact format bit.*/
     error = que_eval_sql(
             info,
             "PROCEDURE UPDATE_SYS_TABLES_PROC () IS\n"
             "BEGIN\n"
             "UPDATE SYS_TABLES SET N_COLS=N_COLS+1\n"
             "WHERE ID=:table_id;\n"
             "END;\n",
             FALSE, trx);

2. Insert a record for the new column into the SYS_COLUMNS data dictionary table.
The InnoDB statement for this change is:

       trx->op_info = "inserting column in SYS_COLUMNS";
       error = que_eval_sql(
               info,
               "PROCEDURE INSERT_SYS_COLUMNS_PROC () IS\n"
               "BEGIN\n"
               "INSERT INTO SYS_COLUMNS VALUES\n"
               "(:table_id, :pos, :name, :mtype, :prtype, :len, :prec);\n"
               "END;\n",
               FALSE, trx);

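3. Update the dict_table_t object in the dictionary cache
The new column has to be added to the column array of dict_table_t, after the user-defined columns and before the system columns.

Before the change:
table->columns:
(id, name, row_id, trx_id, undo_ptr)

After the change:
table->columns:
(id, name, col1, row_id, trx_id, undo_ptr)

The code is as follows: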

      /* The new column will be added into after user_def cols,
      before SYS_COLS(ROW_ID, TRX_ID, ROLL_PTR) in dict_table_t */
      for (ulint i= 0; i < n_cols; i++) {
              col = (dict_col_t*)save_cols + i;
              if (i == n_cols - DATA_N_SYS_COLS) {
                      dict_mem_table_add_col(user_table, user_table->heap,
                                      field->field_name,
                                      mtype, prtype, len);
              }
              dict_mem_table_add_col(user_table, user_table->heap,
                                      col_name,
                                      col->mtype, col->prtype, col->len);
              new_col = dict_table_get_nth_col(user_table, user_table->n_def - 1);
              dict_col_copy_ord_prefix(new_col, col);
      }
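
The effect of this loop can be shown on a plain list of column names. The sketch below is a simplified stand-in, not the AliSQL code: add_user_column and kNumSysCols are hypothetical names, and columns are reduced to strings.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Number of trailing system columns (row_id, trx_id, undo_ptr in the article).
constexpr std::size_t kNumSysCols = 3;

// Returns a rebuilt column list with new_col placed after the user-defined
// columns and before the trailing system columns.
std::vector<std::string> add_user_column(const std::vector<std::string>& cols,
                                         const std::string& new_col) {
    assert(cols.size() >= kNumSysCols);
    std::vector<std::string> rebuilt;
    rebuilt.reserve(cols.size() + 1);
    const std::size_t first_sys = cols.size() - kNumSysCols;
    for (std::size_t i = 0; i < cols.size(); ++i) {
        if (i == first_sys) {
            rebuilt.push_back(new_col);  // slot in front of the first system column
        }
        rebuilt.push_back(cols[i]);
    }
    return rebuilt;
}
```

Calling add_user_column on (id, name, row_id, trx_id, undo_ptr) with "col1" gives (id, name, col1, row_id, trx_id, undo_ptr), the same order as in the listing before the code.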

4. Update the dict_index_t object in the dictionary cache (clustered index)

Before the change:
The field array of the primary key is:
(id, trx_id, undo_ptr, name)

After the change:
The field array of the primary key is:
(id, trx_id, undo_ptr, name, col1)

The code is as follows:

       /*The new column will added into after last field in dict_index_t */
       for (ulint i = 0; i < n_fields; i++) {
               dfield = (dict_field_t*)(save_fields) + i;
               if (dfield->col->ind < n_cols - DATA_N_SYS_COLS) {
                       col = dict_table_get_nth_col(user_table, dfield->col->ind);
               } else {
                       col = dict_table_get_nth_col(user_table, dfield->col->ind + 1);
               }
               dict_index_add_col(clust_index, user_table, col, dfield->prefix_len);
       }
       col = dict_table_get_nth_col(user_table, n_cols - DATA_N_SYS_COLS);
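
The index arithmetic in this excerpt can be checked with a small model in which each clustered-index field is just the position of its column in the table's column array. This is a sketch under that simplification; remap_clustered_fields and kNumSysCols are hypothetical names, and appending the new column as the last field follows the "after the change" listing rather than any code shown in the excerpt.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Number of trailing system columns (row_id, trx_id, undo_ptr in the article).
constexpr std::size_t kNumSysCols = 3;

// old_fields holds, per clustered-index field, the position of its column in
// the OLD column array (a stand-in for dfield->col->ind). The result holds
// positions in the NEW column array, with the added column as the last field.
std::vector<std::size_t> remap_clustered_fields(
        const std::vector<std::size_t>& old_fields, std::size_t old_n_cols) {
    assert(old_n_cols >= kNumSysCols);
    const std::size_t new_col_pos = old_n_cols - kNumSysCols;
    std::vector<std::size_t> fields;
    fields.reserve(old_fields.size() + 1);
    for (std::size_t ind : old_fields) {
        // User columns keep their position; system columns move one slot right.
        fields.push_back(ind < new_col_pos ? ind : ind + 1);
    }
    fields.push_back(new_col_pos);  // the added column becomes the last field
    return fields;
}
```

For the example table, the old fields (id, trx_id, undo_ptr, name) are positions {0, 3, 4, 1}; after the remap they become {0, 4, 5, 1, 2}, which reads as (id, trx_id, undo_ptr, name, col1) against the new column array.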

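5. Update the dict_index_t object in the dictionary cache (secondary index)

Before the change:
Field array of the secondary index: (name, id)

After the change:
Field array of the secondary index: (name, id)

The fields of the secondary index are the same before and after the change, and the positions of the columns they map to do not move either. However, because the columns array of the dict_table_t object has been rebuilt, the columns referenced by the fields must be updated, which only requires a reload.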
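
A minimal sketch of such a reload follows, assuming each field remembers the position of its column and only its pointer into the column array goes stale. Column, IndexField and rebind_fields are hypothetical names for illustration; the real dict_field_t and dict_col_t structures carry much more state.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct Column {
    std::string name;
};

struct IndexField {
    std::size_t col_pos;  // position of the column in the table's column array
    const Column* col;    // pointer into that array; stale after a rebuild
};

// Re-points every field at the column in the same position of the rebuilt
// array. Positions of secondary-index columns do not change, so no remapping
// of col_pos is needed.
void rebind_fields(std::vector<IndexField>& fields,
                   const std::vector<Column>& rebuilt_cols) {
    for (IndexField& f : fields) {
        assert(f.col_pos < rebuilt_cols.size());
        f.col = &rebuilt_cols[f.col_pos];
    }
}
```

For the index (name, id), the fields keep positions 1 and 0 and simply end up pointing at the name and id entries of the new array.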

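Comparing the Online and Dynamic Approaches

The steps of InnoDB's native Online approach are roughly:
1. Acquire an exclusive MDL lock.
2. Create a temporary table with the new table structure.
3. Create a log table to record changes to the original table.
4. Downgrade the MDL to a shared lock so that the original table allows DML.
5. Copy the data to the new temporary table while continuing to copy records from the log table.
6. Upgrade the MDL to exclusive.
7. Apply all remaining records in the log table and rename the table.
8. Drop the old table to complete the change.

The steps of the new Dynamic approach are roughly:
1. Acquire an exclusive MDL lock.
2. Downgrade to a shared lock, allowing DML.
3. Upgrade to an exclusive lock.
4. Update the data dictionary (SYS_TABLES, SYS_COLUMNS).
5. Update the data dictionary cache (dict_table_t, dict_index_t).
6. Release the MDL lock.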
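
Why the Dynamic approach can skip the copy is easiest to see in a toy model (C++17). The article does not show how a record written before the ALTER is read afterwards; the sketch below assumes that a record carrying fewer fields than the table definition is padded with NULL on read, which is consistent with the new column being DEFAULT NULL but is an assumption, as are the names Table, Record, add_column and read_row.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

using Value = std::optional<std::string>;  // std::nullopt models SQL NULL

struct Record {
    std::vector<Value> fields;  // fields.size() plays the role of N_fields
};

struct Table {
    std::size_t n_user_cols = 0;  // column count kept in the dictionary
    std::vector<Record> rows;     // stored records, never rewritten below
};

// Metadata-only change: only the dictionary's column count moves.
void add_column(Table& t) {
    ++t.n_user_cols;
}

// Reads row i, padding a short (pre-ALTER) record with NULLs up to the
// current column count.
std::vector<Value> read_row(const Table& t, std::size_t i) {
    assert(i < t.rows.size());
    std::vector<Value> out = t.rows[i].fields;
    assert(out.size() <= t.n_user_cols);
    out.resize(t.n_user_cols);  // new trailing entries are std::nullopt
    return out;
}
```

In this model add_column costs the same regardless of how many rows exist, while a copy-based change would have to visit every row.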

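Test results:

Adding a column to a Compact-format table with more than 200,000 records took 25.98s.

Adding a column to a Comfort-format table with more than 200,000 records took 0.01s.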

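Summary

Dynamic add column completes the schema change within seconds without copying any records, which greatly simplifies routine changes for operations staff and DBAs. The patch for this feature has been open-sourced in AliSQL.
If you are interested, follow the AliSQL open-source project: https://github.com/alibaba/AliSQL

Last updated: 2017-05-21 09:01:42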
