350
技術社區[雲棲]
PostgreSQL flashback(閃回) 功能實現與介紹
標簽
PostgreSQL , 髒讀 , 事務 , flashback , 閃回 , drop , truncate , dml
背景
閃回的需求往往是救命的需求,因為通常情況下數據庫正常運行是不需要閃回的,往往是出現了誤操作,被攻擊,被注入後,數據庫的數據被刪除或惡意纂改並且纂改的事務已提交,也就是說纂改已經被持久化了。
這種情況下需要閃回來救命,回到被破壞前的狀態。
閃回的目標分為兩種:
DML閃回和DDL閃回。
DML閃回指對INSET, UPDATE, DELETE操作的閃回。DDL閃回指DROP, TRUNCATE操作的閃回。
閃回的實現分兩種:
1、物理回退,相當於使用物理備份和歸檔進行時間點恢複,全庫恢複到誤操作前的狀態。
(可以新建一個庫用於恢複,恢複到目標時間點,恢複後,將誤操作前的數據導出來,再導入線上數據庫。)
2、在當前庫回退,在當前庫,將誤操作影響的數據找出來。
閃回的手段:
1、物理回退,PG內核已支持時間點恢複,隻要有誤操作前的全量備份和所有歸檔即可。
2、當前庫回退,使用HOOK,可以實現DROP和TRUNCATE操作的回收站功能。
3、使用延遲垃圾回收、髒讀、行頭事務號、事務提交日誌,可以實現DML操作的閃回。
以前寫的一些閃回方案:
《PostgreSQL 閃回 - flash back query emulate by trigger》
《PostgreSQL Oracle 兼容性之 - 事件觸發器實現類似Oracle的回收站功能》
《PostgreSQL 回收站功能 - 基於HOOK的recycle bin pgtrashcan》
《PostgreSQL 最佳實踐 - 任意時間點恢複源碼分析》
《PostgreSQL 最佳實踐 - 在線增量備份與任意時間點恢複》
《阿裏雲ApsaraDB RDS for PostgreSQL 最佳實踐 - 6 任意時間點恢複》
《PostgreSQL PITR THREE recovery target MODE: name,xid,time USE CASE - 2》
《PostgreSQL PITR THREE recovery target MODE: name,xid,time USE CASE - 1》
本文將詳細介紹“使用延遲垃圾回收、髒讀、行頭事務號、事務提交日誌,實現DML操作的閃回。”的方法。
flashback 前提
1、延遲VACUUM,確保誤操作的數據還沒有被垃圾回收。
vacuum_defer_cleanup_age = 1000000 # 延遲100萬個事務再回收垃圾,誤操作後在100萬個事務內,如果發現了誤操作,才有可能使用本文提到的方法閃回。
2、記錄未被freeze,確保無操作的數據,以及後麵提交的事務號沒有被freeze(抹去)。
vacuum_freeze_min_age = 50000000 # 事務年齡大於5000萬時,才可能被抹去事務號。
3、開啟事務提交時間跟蹤,確保可以從xid得到事務結束的時間。(開啟事務結束時間跟蹤後,會開辟一塊共享內存區存儲這個信息。)。
track_commit_timestamp = on # 開啟事務結束時間跟蹤
4、事務提交時間跟蹤未抹去。可以加大BUFFER來增加可跟蹤的事務數。(重啟數據庫不影響,有持久化已有的跟蹤記錄)
Size
CommitTsShmemBuffers(void)
{
return Min(16, Max(4, NBuffers / 1024));
}
cd $PGDATA
ll
drwx------ 2 digoal users 4.0K Oct 10 10:28 pg_commit_ts
flashback 準備工作
1、將xid轉換為txid的函數,見本文末尾。(因為記錄的行頭部,存儲的是XID,而不是TXID,而查詢事務提交狀態,用的是TXID,因此需要轉換一下。)
2、髒讀插件pg_dirtyread(用於讀取髒頁)
https://github.com/ChristophBerg/pg_dirtyread
create extension pg_dirtyread ;
3、根據txid查詢事務提交狀態
postgres=# select txid_status(2);
txid_status
-------------
committed
(1 row)
4、根據xid查詢事務提交時間
postgres=# select xmin,xmax,xid_to_txid(xmin),txid_status(xid_to_txid(xmin)),pg_xact_commit_timestamp(xmin) from trac ;
xmin | xmax | xid_to_txid | txid_status | pg_xact_commit_timestamp
----------+----------+-------------+-------------+-------------------------------
40477717 | 40477727 | 25810281493 | committed | 2017-10-10 10:29:21.269612+08
40477719 | 40477727 | 25810281495 | committed | 2017-10-10 11:15:05.875067+08
(2 rows)
5、使用髒讀插件,將需要flashback的數據(連同頭信息xmin,xmax)寫入臨時表,根據以上兩種方法生成如下字段:
寫入事務提交狀態、事務提交時間。(xmin)
刪除事務提交狀態、事務提交時間。(xmax)
create table tmp_xxx as
select
xid_to_txid(xmin) as xmin_txid, txid_status(xid_to_txid(xmin)) as xmin_cmstat, pg_xact_commit_timestamp(xmin) as xmin_ts,
xid_to_txid(xmax) as xmax_txid, txid_status(xid_to_txid(xmax)) as xmax_cmstat, pg_xact_commit_timestamp(case xmax when 0 then null else xmax end) as xmax_ts,
xmin as xmin1,xmax as xmax1,dead,oid,
id -- 目標表字段
from
(
SELECT * FROM pg_dirtyread('table'::regclass)
as t (tableoid oid, ctid tid, xmin xid, xmax xid, cmin cid, cmax cid, dead boolean, oid oid,
id int) -- 目標表字段
) t;
例如
postgres=# select
xid_to_txid(xmin) as xmin_txid, txid_status(xid_to_txid(xmin)) as xmin_cmstat, pg_xact_commit_timestamp(xmin) as xmin_ts,
xid_to_txid(xmax) as xmax_txid, txid_status(xid_to_txid(xmax)) as xmax_cmstat, pg_xact_commit_timestamp(case xmax when 0 then null else xmax end) as xmax_ts,
xmin as xmin1,xmax as xmax1,dead,oid,
id -- 目標表字段
from
(
SELECT * FROM pg_dirtyread('trac'::regclass)
as t (tableoid oid, ctid tid, xmin xid, xmax xid, cmin cid, cmax cid, dead boolean, oid oid,
id int) -- 目標表字段
) t;
xmin_txid | xmin_cmstat | xmin_ts | xmax_txid | xmax_cmstat | xmax_ts | xmin1 | xmax1 | dead | oid | id
-------------+-------------+-------------------------------+-------------+-------------+-------------------------------+----------+----------+------+-----+----
25810281493 | committed | 2017-10-10 10:29:21.269612+08 | 25810281526 | committed | 2017-10-10 17:25:34.593189+08 | 40477717 | 40477750 | f | 0 | 1
25810281495 | committed | 2017-10-10 11:15:05.875067+08 | 25810281528 | committed | 2017-10-10 17:25:47.410793+08 | 40477719 | 40477752 | f | 0 | 2
25810281503 | aborted | | 0 | | | 40477727 | 0 | t | 0 | 2
25810281503 | aborted | | 0 | | | 40477727 | 0 | t | 0 | 2
25810281513 | committed | 2017-10-10 16:56:12.206339+08 | 25810281528 | committed | 2017-10-10 17:25:47.410793+08 | 40477737 | 40477752 | f | 0 | 2
25810281514 | committed | 2017-10-10 16:56:13.706233+08 | 0 | | | 40477738 | 0 | f | 0 | 3
25810281515 | committed | 2017-10-10 16:56:15.108331+08 | 0 | | | 40477739 | 0 | f | 0 | 4
25810281516 | committed | 2017-10-10 16:56:16.092184+08 | 25810281521 | committed | 2017-10-10 16:56:58.684832+08 | 40477740 | 40477745 | f | 0 | 5
25810281517 | committed | 2017-10-10 16:56:17.207356+08 | 25810281519 | committed | 2017-10-10 16:56:48.011544+08 | 40477741 | 40477743 | f | 0 | 6
25810281518 | aborted | | 0 | | | 40477742 | 0 | t | 0 | 6
25810281518 | aborted | | 0 | | | 40477742 | 0 | t | 0 | 6
25810281518 | aborted | | 0 | | | 40477742 | 0 | t | 0 | 6
25810281518 | aborted | | 0 | | | 40477742 | 0 | t | 0 | 7
25810281518 | aborted | | 0 | | | 40477742 | 0 | t | 0 | 7
25810281518 | aborted | | 0 | | | 40477742 | 0 | t | 0 | 7
25810281519 | committed | 2017-10-10 16:56:48.011544+08 | 25810281528 | committed | 2017-10-10 17:25:47.410793+08 | 40477743 | 40477752 | f | 0 | 2
25810281520 | aborted | | 0 | | | 40477744 | 0 | t | 0 | 2
25810281521 | committed | 2017-10-10 16:56:58.684832+08 | 25810281528 | committed | 2017-10-10 17:25:47.410793+08 | 40477745 | 40477752 | f | 0 | 2
25810281523 | committed | 2017-10-10 17:21:39.000214+08 | 25810281526 | committed | 2017-10-10 17:25:34.593189+08 | 40477747 | 40477750 | f | 0 | 1
25810281523 | committed | 2017-10-10 17:21:39.000214+08 | 25810281526 | committed | 2017-10-10 17:25:34.593189+08 | 40477747 | 40477750 | f | 0 | 1
25810281523 | committed | 2017-10-10 17:21:39.000214+08 | 25810281526 | committed | 2017-10-10 17:25:34.593189+08 | 40477747 | 40477750 | f | 0 | 1
25810281523 | committed | 2017-10-10 17:21:39.000214+08 | 25810281526 | committed | 2017-10-10 17:25:34.593189+08 | 40477747 | 40477750 | f | 0 | 1
25810281523 | committed | 2017-10-10 17:21:39.000214+08 | 25810281526 | committed | 2017-10-10 17:25:34.593189+08 | 40477747 | 40477750 | f | 0 | 1
(23 rows)
postgres=# select xmin,xmax,* from trac ;
xmin | xmax | id
----------+------+----
40477738 | 0 | 3
40477739 | 0 | 4
(2 rows)
flashback 實踐
1、根據pg_xlogdump找到精確的誤操作xid,以及事務提交對應的時間戳。(或者使用用戶提供的時間戳,大概的誤操作前的時間。)
參考:
《PostgreSQL 使用pg_xlogdump找到誤操作事務號》
2、回退到過去的某個時間點(采用基於臨時表的VIEW來展現) (根據事務提交順序,逆序,逐個事務排除,逐個事務回退。)
select * from tmp_xxx where
(
xmin_cmstat='committed' and xmin_ts <= '$ts'
and
(xmax_cmstat='committed' and xmax_ts < '$ts') is distinct from true
)
;
例子
select * from tmp_xxx where
(
xmin_cmstat='committed' and xmin_ts <= '2017-10-10 16:56:58.684832+08'
and
(xmax_cmstat='committed' and xmax_ts < '2017-10-10 16:56:58.684832+08') is distinct from true
)
;
xmin_txid | xmin_cmstat | xmin_ts | xmax_txid | xmax_cmstat | xmax_ts | xmin1 | xmax1 | dead | oid | id
-------------+-------------+-------------------------------+-------------+-------------+-------------------------------+----------+----------+------+-----+----
25810281493 | committed | 2017-10-10 10:29:21.269612+08 | 25810281526 | committed | 2017-10-10 17:25:34.593189+08 | 40477717 | 40477750 | f | 0 | 1
25810281495 | committed | 2017-10-10 11:15:05.875067+08 | 25810281528 | committed | 2017-10-10 17:25:47.410793+08 | 40477719 | 40477752 | f | 0 | 2
25810281513 | committed | 2017-10-10 16:56:12.206339+08 | 25810281528 | committed | 2017-10-10 17:25:47.410793+08 | 40477737 | 40477752 | f | 0 | 2
25810281514 | committed | 2017-10-10 16:56:13.706233+08 | 0 | | | 40477738 | 0 | f | 0 | 3
25810281515 | committed | 2017-10-10 16:56:15.108331+08 | 0 | | | 40477739 | 0 | f | 0 | 4
25810281516 | committed | 2017-10-10 16:56:16.092184+08 | 25810281521 | committed | 2017-10-10 16:56:58.684832+08 | 40477740 | 40477745 | f | 0 | 5
25810281519 | committed | 2017-10-10 16:56:48.011544+08 | 25810281528 | committed | 2017-10-10 17:25:47.410793+08 | 40477743 | 40477752 | f | 0 | 2
25810281521 | committed | 2017-10-10 16:56:58.684832+08 | 25810281528 | committed | 2017-10-10 17:25:47.410793+08 | 40477745 | 40477752 | f | 0 | 2
(8 rows)
3、找出被誤操作事務 刪除、更新、插入 的記錄(OLD ROW(被刪除、更新前的), NEW ROW(更新後的、誤插入的))。
select * from tmp_xxx where xmax1=? and xmax_cmstat='committed' and xmin_cmstat='committed'; -- 被某個XID刪除、更新前的數據。
select * from tmp_xxx where xmin1=? and xmin_cmstat='committed'; -- 被某個XID插入、更新後的數據。
例子
postgres=# select * from tmp_xxx where xmax1=40477745 and xmax_cmstat='committed' and xmin_cmstat='committed';
xmin_txid | xmin_cmstat | xmin_ts | xmax_txid | xmax_cmstat | xmax_ts | xmin1 | xmax1 | dead | oid | id
-------------+-------------+-------------------------------+-------------+-------------+-------------------------------+----------+----------+------+-----+----
25810281516 | committed | 2017-10-10 16:56:16.092184+08 | 25810281521 | committed | 2017-10-10 16:56:58.684832+08 | 40477740 | 40477745 | f | 0 | 5
(1 row)
postgres=# select * from tmp_xxx where xmin1=40477745 and xmin_cmstat='committed';
xmin_txid | xmin_cmstat | xmin_ts | xmax_txid | xmax_cmstat | xmax_ts | xmin1 | xmax1 | dead | oid | id
-------------+-------------+-------------------------------+-------------+-------------+-------------------------------+----------+----------+------+-----+----
25810281521 | committed | 2017-10-10 16:56:58.684832+08 | 25810281528 | committed | 2017-10-10 17:25:47.410793+08 | 40477745 | 40477752 | f | 0 | 2
(1 row)
flashback 可能造成的負麵影響
flashback的前提,前麵已經講了,但是可能帶來一些負麵影響。
1、由於設置了vacuum_defer_cleanup_age,因此每次都會被觸發VACUUM,掃描表,但是又有一些垃圾不能被回收;
2、可能導致表膨脹。
解決負麵影響
1、提高autovacuum_naptime,從而降低頻繁掃描的概率。
2、膨脹可能無法解決。建議修改內核,實現可以僅針對重要的表設置 vacuum_defer_cleanup_age。從而避免全庫膨脹。
相關知識點
相關參數 - 跟蹤事務時間、延遲回收垃圾
track_commit_timestamp = on # collect timestamp of transaction commit
# (change requires restart)
vacuum_defer_cleanup_age = 100000 # number of xacts by which cleanup is delayed
相關函數 - 事務狀態、事務結束時間
Name | Return Type | Description |
---|---|---|
txid_status(bigint) | txid_status | report the status of the given transaction: committed, aborted, in progress, or null if the transaction ID is too old |
pg_xact_commit_timestamp(xid) | timestamp with time zone | get commit timestamp of a transaction |
https://www.postgresql.org/docs/10/static/functions-info.html
txid_status(bigint)
reports the commit status of a recent transaction.
Applications may use it to determine whether a transaction committed or aborted
when the application and database server become disconnected while a COMMIT is in progress.
The status of a transaction will be reported as either in progress, committed, or aborted,
provided that the transaction is recent enough that the system retains the commit status of that transaction.
If is old enough that no references to that transaction survive in the system and the commit status information has been discarded,
this function will return NULL. Note that prepared transactions are reported as in progress;
applications must check pg_prepared_xacts if they need to determine whether the txid is a prepared transaction.
讀髒頁插件
https://github.com/ChristophBerg/pg_dirtyread
如何跟蹤事務結束時間
開啟事務結束時間跟蹤後,會開辟一塊共享內存,跟蹤事務結束時間。
/*
* Number of shared CommitTS buffers.
*
* We use a very similar logic as for the number of CLOG buffers; see comments
* in CLOGShmemBuffers.
*/
Size
CommitTsShmemBuffers(void)
{
return Min(16, Max(4, NBuffers / 1024));
}
/*
* Shared memory sizing for CommitTs
*/
Size
CommitTsShmemSize(void)
{
return SimpleLruShmemSize(CommitTsShmemBuffers(), 0) +
sizeof(CommitTimestampShared);
}
xid和txid的區別
xid是32位的整型,會被FREEZE,循環使用,xmin,xmax都是XID類型。通過xid可以得到事務結束時間。
txid是64位整型,不會被循環使用,通過epoch可以從xid轉換為txid。通過txid可以得到事務的提交狀態。
/*
* Export internal transaction IDs to user level.
*
* Note that only top-level transaction IDs are ever converted to TXID.
* This is important because TXIDs frequently persist beyond the global
* xmin horizon, or may even be shipped to other machines, so we cannot
* rely on being able to correlate subtransaction IDs with their parents
* via functions such as SubTransGetTopmostTransaction().
*
/* txid will be signed int8 in database, so must limit to 63 bits */
#define MAX_TXID UINT64CONST(0x7FFFFFFFFFFFFFFF)
/*
* do a TransactionId -> txid conversion for an XID near the given epoch
*/
static txid
convert_xid(TransactionId xid, const TxidEpoch *state)
{
uint64 epoch;
/* return special xid's as-is */
if (!TransactionIdIsNormal(xid))
return (txid) xid;
/* xid can be on either side when near wrap-around */
epoch = (uint64) state->epoch;
if (xid > state->last_xid &&
TransactionIdPrecedes(xid, state->last_xid))
epoch--;
else if (xid < state->last_xid &&
TransactionIdFollows(xid, state->last_xid))
epoch++;
return (epoch << 32) | xid;
}
如何將xid轉換為txid
我們可以自定義一個函數,將xid轉換為txid。
vi xid_to_txid.c
#include "postgres.h"
#include "fmgr.h"
#include "access/xact.h"
#include "access/transam.h"
#include "access/xlog.h"
/* Use unsigned variant internally */
typedef uint64 txid;
/*
* Epoch values from xact.c
*/
typedef struct
{
TransactionId last_xid;
uint32 epoch;
} TxidEpoch;
PG_MODULE_MAGIC;
PG_FUNCTION_INFO_V1(xid_to_txid);
/*
* do a TransactionId -> txid conversion for an XID near the given epoch
*/
static txid
convert_xid(TransactionId xid, const TxidEpoch *state)
{
uint64 epoch;
/* return special xid's as-is */
if (!TransactionIdIsNormal(xid))
return (txid) xid;
/* xid can be on either side when near wrap-around */
epoch = (uint64) state->epoch;
if (xid > state->last_xid &&
TransactionIdPrecedes(xid, state->last_xid))
epoch--;
else if (xid < state->last_xid &&
TransactionIdFollows(xid, state->last_xid))
epoch++;
return (epoch << 32) | xid;
}
/*
* Fetch epoch data from xact.c.
*/
static void
load_xid_epoch(TxidEpoch *state)
{
GetNextXidAndEpoch(&state->last_xid, &state->epoch);
}
Datum
xid_to_txid(PG_FUNCTION_ARGS)
{
txid val;
TxidEpoch state;
TransactionId xid;
xid = DatumGetTransactionId(PG_GETARG_DATUM(0));
load_xid_epoch(&state);
val = convert_xid(xid, &state);
PG_RETURN_INT64(val);
}
編譯
gcc -O3 -Wall -Wextra -Werror -I /home/digoal/postgresql-10beta4/src/include -g -fPIC -c ./xid_to_txid.c -o xid_to_txid.o
gcc -O3 -Wall -Wextra -Werror -I /home/digoal/postgresql-10beta4/src/include -g -shared xid_to_txid.o -o libxid_to_txid.so
拷貝到數據庫軟件LIB目錄
cp libxid_to_txid.so pgsql10/lib/
創建函數
create or replace function xid_to_txid(xid) returns int8 as '$libdir/libxid_to_txid.so', 'xid_to_txid' language C STRICT;
測試
-- 當前值
postgres=# select xmin,xmax,xid_to_txid(xmin),txid_status(xid_to_txid(xmin)),pg_xact_commit_timestamp(xmin) from trac ;
xmin | xmax | xid_to_txid | txid_status | pg_xact_commit_timestamp
----------+------+-------------+-------------+-------------------------------
40477717 | 0 | 8630412309 | committed | 2017-10-10 10:29:21.269612+08
40477719 | 0 | 8630412311 | committed | 2017-10-10 11:15:05.875067+08
(2 rows)
-- 產生一些DML
postgres=# begin;
BEGIN
postgres=# update trac set id =2;
UPDATE 2
postgres=# rollback;
ROLLBACK
-- 值的變化
postgres=# select xmin,xmax,xid_to_txid(xmin),txid_status(xid_to_txid(xmin)),pg_xact_commit_timestamp(xmin) from trac ;
xmin | xmax | xid_to_txid | txid_status | pg_xact_commit_timestamp
----------+----------+-------------+-------------+-------------------------------
40477717 | 40477727 | 8630412309 | committed | 2017-10-10 10:29:21.269612+08
40477719 | 40477727 | 8630412311 | committed | 2017-10-10 11:15:05.875067+08
(2 rows)
postgres=# select xmin,xmax,xid_to_txid(xmin),txid_status(xid_to_txid(xmin)),pg_xact_commit_timestamp(xmin),xid_to_txid(xmax),txid_status(xid_to_txid(xmax)),pg_xact_commit_timestamp(xmax) from trac ;
xmin | xmax | xid_to_txid | txid_status | pg_xact_commit_timestamp | xid_to_txid | txid_status | pg_xact_commit_timestamp
----------+----------+-------------+-------------+-------------------------------+-------------+-------------+--------------------------
40477717 | 40477727 | 8630412309 | committed | 2017-10-10 10:29:21.269612+08 | 8630412319 | aborted |
40477719 | 40477727 | 8630412311 | committed | 2017-10-10 11:15:05.875067+08 | 8630412319 | aborted |
(2 rows)
-- 停庫,修改epoch
pg_ctl stop -m fast
pg_resetwal -e 5 $PGDATA
Write-ahead log reset
-- 通過控製文件可以得到這個epoch
pg_controldata
Latest checkpoint's NextXID: 5:40477728
-- 繼續修改epoch
pg_resetwal -e 6 $PGDATA
Write-ahead log reset
-- 通過控製文件可以得到這個epoch
pg_controldata
Latest checkpoint's NextXID: 6:40477728
-- 啟動
pg_ctl start
-- epoch修改後,txid發生了變化
postgres=# select xmin,xmax,xid_to_txid(xmin),txid_status(xid_to_txid(xmin)),pg_xact_commit_timestamp(xmin),xid_to_txid(xmax),txid_status(xid_to_txid(xmax)),pg_xact_commit_timestamp(xmax) from trac ;
xmin | xmax | xid_to_txid | txid_status | pg_xact_commit_timestamp | xid_to_txid | txid_status | pg_xact_commit_timestamp
----------+----------+-------------+-------------+-------------------------------+-------------+-------------+--------------------------
40477717 | 40477727 | 25810281493 | committed | 2017-10-10 10:29:21.269612+08 | 25810281503 | aborted |
40477719 | 40477727 | 25810281495 | committed | 2017-10-10 11:15:05.875067+08 | 25810281503 | aborted |
(2 rows)
為什麼會有EPOCH呢?因為XID是循環使用的,每一個循環都會使得EPOCH自增1。從而使得TXID可以持續增長。
參考
《PostgreSQL 閃回 - flash back query emulate by trigger》
《Use pg_resetxlog simulate tuple disappear within PostgreSQL》
《PostgreSQL xid(int4) to txid(int8)》
《PostgreSQL 使用pg_xlogdump找到誤操作事務號》
https://github.com/ChristophBerg/pg_dirtyread
https://www.postgresql.org/docs/10/static/functions-info.html
《PostgreSQL Oracle 兼容性之 - 事件觸發器實現類似Oracle的回收站功能》
《PostgreSQL 回收站功能 - 基於HOOK的recycle bin pgtrashcan》
其他
誤操作後,如果又發生了DDL,例如新增字段,修改字段等。使用本文提供的閃回方法,有沒有問題?
測一下就知道了,不會有問題。
postgres=# alter table trac add column info text;
ALTER TABLE
postgres=# insert into trac values (2,'test');
INSERT 0 1
postgres=# select
xid_to_txid(xmin) as xmin_txid, txid_status(xid_to_txid(xmin)) as xmin_cmstat, pg_xact_commit_timestamp(xmin) as xmin_ts,
xid_to_txid(xmax) as xmax_txid, txid_status(xid_to_txid(xmax)) as xmax_cmstat, pg_xact_commit_timestamp(case xmax when 0 then null else xmax end) as xmax_ts,
xmin as xmin1,xmax as xmax1,dead,oid,
id,info -- 目標表字段
from
(
SELECT * FROM pg_dirtyread('trac'::regclass)
as t (tableoid oid, ctid tid, xmin xid, xmax xid, cmin cid, cmax cid, dead boolean, oid oid,
id int, info text) -- 目標表字段
) t;
xmin_txid | xmin_cmstat | xmin_ts | xmax_txid | xmax_cmstat | xmax_ts | xmin1 | xmax1 | dead | oid | id | info
-------------+-------------+-------------------------------+-------------+-------------+-------------------------------+----------+----------+------+-----+----+------
25810281493 | committed | 2017-10-10 10:29:21.269612+08 | 25810281526 | committed | 2017-10-10 17:25:34.593189+08 | 40477717 | 40477750 | f | 0 | 1 |
25810281495 | committed | 2017-10-10 11:15:05.875067+08 | 25810281528 | committed | 2017-10-10 17:25:47.410793+08 | 40477719 | 40477752 | f | 0 | 2 |
25810281503 | aborted | | 0 | | | 40477727 | 0 | t | 0 | 2 |
25810281503 | aborted | | 0 | | | 40477727 | 0 | t | 0 | 2 |
25810281513 | committed | 2017-10-10 16:56:12.206339+08 | 25810281528 | committed | 2017-10-10 17:25:47.410793+08 | 40477737 | 40477752 | f | 0 | 2 |
25810281514 | committed | 2017-10-10 16:56:13.706233+08 | 0 | | | 40477738 | 0 | f | 0 | 3 |
25810281515 | committed | 2017-10-10 16:56:15.108331+08 | 0 | | | 40477739 | 0 | f | 0 | 4 |
25810281516 | committed | 2017-10-10 16:56:16.092184+08 | 25810281521 | committed | 2017-10-10 16:56:58.684832+08 | 40477740 | 40477745 | f | 0 | 5 |
25810281517 | committed | 2017-10-10 16:56:17.207356+08 | 25810281519 | committed | 2017-10-10 16:56:48.011544+08 | 40477741 | 40477743 | f | 0 | 6 |
25810281518 | aborted | | 0 | | | 40477742 | 0 | t | 0 | 6 |
25810281518 | aborted | | 0 | | | 40477742 | 0 | t | 0 | 6 |
25810281518 | aborted | | 0 | | | 40477742 | 0 | t | 0 | 6 |
25810281518 | aborted | | 0 | | | 40477742 | 0 | t | 0 | 7 |
25810281518 | aborted | | 0 | | | 40477742 | 0 | t | 0 | 7 |
25810281518 | aborted | | 0 | | | 40477742 | 0 | t | 0 | 7 |
25810281519 | committed | 2017-10-10 16:56:48.011544+08 | 25810281528 | committed | 2017-10-10 17:25:47.410793+08 | 40477743 | 40477752 | f | 0 | 2 |
25810281520 | aborted | | 0 | | | 40477744 | 0 | t | 0 | 2 |
25810281521 | committed | 2017-10-10 16:56:58.684832+08 | 25810281528 | committed | 2017-10-10 17:25:47.410793+08 | 40477745 | 40477752 | f | 0 | 2 |
25810281523 | committed | 2017-10-10 17:21:39.000214+08 | 25810281526 | committed | 2017-10-10 17:25:34.593189+08 | 40477747 | 40477750 | f | 0 | 1 |
25810281523 | committed | 2017-10-10 17:21:39.000214+08 | 25810281526 | committed | 2017-10-10 17:25:34.593189+08 | 40477747 | 40477750 | f | 0 | 1 |
25810281523 | committed | 2017-10-10 17:21:39.000214+08 | 25810281526 | committed | 2017-10-10 17:25:34.593189+08 | 40477747 | 40477750 | f | 0 | 1 |
25810281523 | committed | 2017-10-10 17:21:39.000214+08 | 25810281526 | committed | 2017-10-10 17:25:34.593189+08 | 40477747 | 40477750 | f | 0 | 1 |
25810281523 | committed | 2017-10-10 17:21:39.000214+08 | 25810281526 | committed | 2017-10-10 17:25:34.593189+08 | 40477747 | 40477750 | f | 0 | 1 |
25810281532 | committed | 2017-10-10 18:58:50.720095+08 | 0 | | | 40477756 | 0 | f | 0 | 2 | test
(24 rows)
小結
本文介紹了原地閃回的一種方法:
1、新增髒讀功能、新增表級vacuum_defer_cleanup_age(避免全庫膨脹)功能、開啟track_commit_timestamp。可以實現dml flashback(閃回)。
2、增加回收站(通過HOOK)功能,可以實現DDL flashback。
全庫閃回,可以參考PostgreSQL的PITR的功能。
原地閃回的另一種方法:利用redo的undo內容進行閃回,這種方法可以避免膨脹的問題,回退到什麼時間點,與保留的REDO文件數有關。使用redo來閃回也會引入一個問題,UNDO需要記錄更多的內容,導致REDO文件內容變多。因此建議也是設置表級redo的UNDO內容。
create table\alter table
REPLICA IDENTITY { DEFAULT | USING INDEX index_name | FULL | NOTHING }
需要設置為FULL,存儲所有的OLD VALUE,才能實現UNDO。
REPLICA IDENTITY
This form changes the information which is written to the write-ahead log
to identify rows which are updated or deleted.
This option has no effect except when logical replication is in use.
DEFAULT (the default for non-system tables) records the old values of the columns
of the primary key, if any.
USING INDEX records the old values of the columns covered by the named index,
which must be unique, not partial, not deferrable, and include only columns marked NOT NULL.
FULL records the old values of all columns in the row.
NOTHING records no information about the old row.
(This is the default for system tables.)
In all cases, no old values are logged unless at least one of the columns that would
be logged differs between the old and new versions of the row.
最後更新:2017-10-28 23:34:02