PostgreSQL 10.0 preview feature enhancement - amcheck, a new CHECK interface for access methods

Tags

PostgreSQL , 10.0 , amcheck , logical consistency checking , physical storage checking


Background

Some high-end storage systems, including the ZFS file system, can check blocks and repair damaged ones when RAID is in use.

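For a database, data reliability is a critical property. For example:

1. Whatever was written must be exactly what is read back.

2. When the operating system's collation changes, the order of entries in an index may no longer match the actual collation order, which leads to unstable behavior.

3. A partial write of a data block can corrupt data.

4. Faulty memory pages can cause problems whenever they are used.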

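PostgreSQL avoids problem 3 through full page writes. In addition, data pages can carry a checksum (when data checksums are enabled) so that corruption can be detected.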
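
Both safeguards can be inspected from psql. The following is a minimal sketch that only reads the two settings and assumes nothing beyond a running server:

```sql
-- Full page writes protect against torn (partial) page writes; the default is on.
SHOW full_page_writes;

-- Read-only setting: reports whether the cluster was initialized with
-- data checksums (initdb --data-checksums). If it is off, page checksums
-- are not verified.
SHOW data_checksums;
```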

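PostgreSQL 10.0 provides a check interface that can examine data and detect the problems listed above.

The amcheck interface

amcheck is a framework for verifying data consistency. For example, PG has heap storage as well as b-tree, gist, gin, sp-gist, and brin index storage. amcheck is designed to verify the consistency of the data stored behind each of these access methods.

It is named amcheck because am stands for access method, and what it checks is naturally related to access methods.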

src/backend/access  
  
drwxrwxr-x 2 digoal digoal 4096 Apr 14 23:37 brin  
drwxrwxr-x 2 digoal digoal 4096 Apr 14 23:37 common  
drwxrwxr-x 2 digoal digoal 4096 Apr 14 23:37 gin  
drwxrwxr-x 2 digoal digoal 4096 Apr 14 23:37 gist  
drwxrwxr-x 2 digoal digoal 4096 Apr 14 23:37 hash  
drwxrwxr-x 2 digoal digoal 4096 Apr 14 23:38 heap  
drwxrwxr-x 2 digoal digoal 4096 Apr 14 23:37 index  
-rw-r--r-- 1 digoal digoal  321 Apr 14 12:17 Makefile  
drwxrwxr-x 2 digoal digoal 4096 Apr 14 23:37 nbtree  
-rw-rw-r-- 1 digoal digoal 4759 Apr 14 23:38 objfiles.txt  
drwxrwxr-x 2 digoal digoal 4096 Apr 14 23:37 rmgrdesc  
drwxrwxr-x 2 digoal digoal 4096 Apr 14 23:37 spgist  
drwxrwxr-x 2 digoal digoal 4096 Apr 14 23:37 tablesample  
drwxrwxr-x 2 digoal digoal 4096 Apr 14 23:38 transam  

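At present amcheck can detect anomalies in B-tree indexes (for example, the broken logical ordering caused by the operating system collation change mentioned earlier). More checks will be added in the future.

10.0 also introduces support for ICU, which tackles the collation problem at its source by no longer depending on the operating system's collation definitions.

See "PostgreSQL 10.0 preview feature enhancement - internationalization improvements, support for ICU (International Components for Unicode)".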
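
A minimal sketch of using an ICU collation in 10.0, assuming the server was built with ICU support (--with-icu). The collation name german_icu, the table t_icu, and the index idx_t_icu_name are hypothetical and chosen only for illustration:

```sql
-- Define a collation provided by ICU rather than by the operating system's libc.
CREATE COLLATION german_icu (provider = icu, locale = 'de-DE');

-- Hypothetical table whose text column, and therefore its b-tree index,
-- is ordered by the ICU collation.
CREATE TABLE t_icu (name text COLLATE german_icu);
CREATE INDEX idx_t_icu_name ON t_icu (name);
```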

Anomalies amcheck can detect

1. Structural inconsistencies caused by incorrect operator class implementations.

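This can be caused by a change in the operating system collation, after which queries return results in a different order than they did before the change.

For the detection method, see the consistency-check functions of each access method: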

https://www.postgresql.org/docs/devel/static/xindex.html

2. Corruption caused by hypothetical undiscovered bugs in the underlying PostgreSQL access method code or sort code.

3. Filesystem or storage subsystem faults where checksums happen to simply not be enabled.

4. Corruption caused by faulty RAM, and the broader memory subsystem and operating system.

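Repairing anomalies detected by amcheck

Different anomalies call for different repairs. The one that can usually be applied directly is REINDEX. (Not every anomaly can be repaired, however.)

The community does offer one piece of advice: if the cause is a bug in the code, REINDEX may not fix it, but the pageinspect extension can help diagnose the problem.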
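
The repair and diagnosis steps might look like the following sketch. The index name idx_test is hypothetical. The functions bt_metap, bt_page_stats, and bt_page_items belong to the pageinspect contrib module and require superuser privileges. The example assumes the index has at least one page besides the metapage.

```sql
-- Rebuild a hypothetical index that failed an amcheck check.
REINDEX INDEX idx_test;

-- If the problem persists or needs diagnosis, inspect the b-tree pages directly.
CREATE EXTENSION IF NOT EXISTS pageinspect;

-- Metapage: root block number, tree level, and so on.
SELECT * FROM bt_metap('idx_test');

-- Summary and items of block 1 (block 0 is the metapage and is rejected here).
SELECT * FROM bt_page_stats('idx_test', 1);
SELECT * FROM bt_page_items('idx_test', 1);
```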

About the amcheck patch

Add amcheck extension to contrib.  
  
author	Andres Freund <andres@anarazel.de>	  
Fri, 10 Mar 2017 07:50:40 +0800 (15:50 -0800)  
committer	Andres Freund <andres@anarazel.de>	  
Fri, 10 Mar 2017 08:33:02 +0800 (16:33 -0800)  
  
This is the beginning of a collection of SQL-callable functions to  
verify the integrity of data files.  For now it only contains code to  
verify B-Tree indexes.  
  
This adds two SQL-callable functions, validating B-Tree consistency to  
a varying degree.  Check the, extensive, docs for details.  
  
The goal is to later extend the coverage of the module to further  
access methods, possibly including the heap.  Once checks for  
additional access methods exist, we'll likely add some "dispatch"  
functions that cover multiple access methods.  
  
Author: Peter Geoghegan, editorialized by Andres Freund  
Reviewed-By: Andres Freund, Tomas Vondra, Thomas Munro,  
   Anastasia Lubennikova, Robert Haas, Amit Langote  
Discussion: CAM3SWZQzLMhMwmBqjzK+pRKXrNUZ4w90wYMUWfkeV8mZ3Debvw@mail.gmail.com  

amcheck b-tree checking interface

B-tree index data can be checked with the following two functions.

1. bt_index_check(index regclass) returns void

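Takes an AccessShareLock, the same lock a SELECT takes, so it has almost no impact on other sessions. Note that if the index pages being checked are already in shared buffers, they are not read from disk.

An empty return value means no problem was found; a detected problem is reported as an error.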

test=# SELECT bt_index_check(c.oid), c.relname, c.relpages  
FROM pg_index i  
JOIN pg_opclass op ON i.indclass[0] = op.oid  
JOIN pg_am am ON op.opcmethod = am.oid  
JOIN pg_class c ON i.indexrelid = c.oid  
JOIN pg_namespace n ON c.relnamespace = n.oid  
WHERE am.amname = 'btree' AND n.nspname = 'pg_catalog'  
-- Don't check temp tables, which may be from another session:  
AND c.relpersistence != 't'  
-- Function may throw an error when this is omitted:  
AND i.indisready AND i.indisvalid  
ORDER BY c.relpages DESC LIMIT 10;  
 bt_index_check |             relname             | relpages   
----------------+---------------------------------+----------  
                | pg_depend_reference_index       |       43  
                | pg_depend_depender_index        |       40  
                | pg_proc_proname_args_nsp_index  |       31  
                | pg_description_o_c_o_index      |       21  
                | pg_attribute_relid_attnam_index |       14  
                | pg_proc_oid_index               |       10  
                | pg_attribute_relid_attnum_index |        9  
                | pg_amproc_fam_proc_index        |        5  
                | pg_amop_opr_fam_index           |        5  
                | pg_amop_fam_strat_index         |        5  
(10 rows)  
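
To check a single index rather than a batch, the call can be as simple as the sketch below. The extension has to be installed in the database first; the index name idx_test is hypothetical.

```sql
-- amcheck ships in contrib; install it once per database.
CREATE EXTENSION IF NOT EXISTS amcheck;

-- Returns an empty value when no problem is found; raises an error otherwise.
SELECT bt_index_check('idx_test'::regclass);
```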

2. bt_index_parent_check(index regclass) returns void

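In addition to the checks done by bt_index_check, this function verifies invariants that span parent/child relationships in the tree. It takes a ShareLock on the index being checked and on the table the index belongs to. This conflicts with far more operations: it blocks INSERT, UPDATE, and DELETE, VACUUM of the table, and anything that needs a stronger lock.

bt_index_parent_check(index regclass) cannot be run on a hot standby.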
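
A sketch of running the heavier check on the primary during a quiet period. The index name idx_test is hypothetical, and the lock_timeout value is an arbitrary choice meant to avoid waiting indefinitely for the ShareLock.

```sql
BEGIN;
-- Give up rather than queue behind long-running writers (the value is arbitrary).
SET LOCAL lock_timeout = '5s';
-- Blocks writes to the table while it runs.
SELECT bt_index_parent_check('idx_test'::regclass);
COMMIT;
```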

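For the discussion of this patch, see the mailing list thread; the URLs are at the end of this article.

The PostgreSQL community works very rigorously. A patch may be discussed on the mailing list for months or even years and revised repeatedly based on the feedback, so by the time it is merged into master it is already very mature. This is one reason PostgreSQL is well known for its stability.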

References

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=3717dc149ecf44b8be95350a68605ba7299474fd

https://www.postgresql.org/docs/devel/static/amcheck.html

Last updated: 2017-04-21 00:30:37
