閱讀201 返回首頁    go 阿裏雲 go 技術社區[雲棲]


ALICloudDB for PostgreSQL 試用報告 - 4 水平分庫 之 節點擴展

RDS現在還欠缺一個功能,就是數據庫克隆,你可以這麼理解,給現有的數據庫創建STANDBY,然後將這個STANDBY激活,就完成了對數據庫的克隆。
為什麼我們需要數據庫克隆功能呢?
這會使得數據庫的擴容變得非常簡單,比如我們這裏的應用場景,如果要將16個RDS,變成32個RDS,那麼克隆無疑是最好的辦法。因為不需要做邏輯數據遷移的事情,隻需要刪除不需要的數據庫,以及調整plproxy的cluster配置即可。
我們先假設RDS有創建STANDBYD的功能(相信未來會增加),看看如何來實現RDS的擴容。

假設主RDS1包含db0,db16兩個庫,現在要拆分為兩個RDS,RDS1(db0), RDS2(db16),分別包含db0和db16。
1. 為需要拆分的主RDS創建standby RDS, 確認流複製正常,確認standby RDS到達catchup狀態。

2. 配置密碼文件~/.pgpass,包含克隆庫的認證信息。

3. 調整plproxy cluster配置。使用隻讀事務連接需要遷移的數據庫,避免寫操作帶來的主備不一致。
例如 : 
alter server rds_pg_cluster options (set p16 'host=old_rds_hostname dbname=db16 port=3433 user=digoal keepalives_idle=30 keepalives_interval=10 keepalives_count=10 application_name=16 options=''-c default_transaction_read_only=true'' ');

4. 確認主RDS需遷移的庫(db16)沒有連接,確認standby處於catchup狀態。

5. 激活standby。

6. 調整plproxy cluster配置。原連接RDS1(db16),修改為RDS2(db16)。
例如 : 
alter server rds_pg_cluster options (set p16 'host=new_rds_hostname dbname=db16 port=3433 user=digoal keepalives_idle=30 keepalives_interval=10 keepalives_count=10 application_name=16');

7. 刪除主RDS節點已遷移的數據庫(db16),刪除standby節點多餘的數據庫(db0)。

循環1-7,將所有RDS拆分。

回到現實,現實是目前RDS沒有提供克隆功能。那麼我們需要人工實現數據遷移,需遷移的東西還挺多,包括表,視圖,函數,。。。。。可以用pg_dump,但是怎麼實現增量呢?可以通過PostgreSQL的logical decoding來實現增量複製。現在阿裏雲RDS PG已經支持了邏輯複製的功能。

下麵是人力擴容的例子:
源:
postgres=> select datname,pg_database_size(oid)/1024/1024 from pg_database;
  datname  | ?column? 
-----------+----------
 template1 |        6
 template0 |        6
 postgres  |     3618
 digoal    |        6
 db7       |      179
 db23      |      179
(6 rows)

目標,把db23遷移到以下RDS:
postgres=> select datname,pg_database_size(oid)/1024/1024 from pg_database;
  datname  | ?column? 
-----------+----------
 template1 |        6
 template0 |        6
 postgres  |        6
 digoal    |        6
 db8       |      179
 db24      |      179
(6 rows)

在目標RDS創建db23數據庫:
postgres=> create database db23;
CREATE DATABASE
postgres=> \c db23
psql (9.4.3, server 9.4.1)
You are now connected to database "db23" as user "digoal".
創建schema和需要遷移的函數:
db23=> create schema digoal;
CREATE SCHEMA
db23=> CREATE OR REPLACE FUNCTION digoal.dy(sql text)
db23->      RETURNS SETOF record
db23->      LANGUAGE plpgsql
db23->      STRICT
db23->     AS $function$
db23$>       declare
db23$>         rec record;
db23$>       begin
db23$>         for rec in execute sql loop
db23$>           return next rec;
db23$>         end loop;
db23$>         return;
db23$>       end;
db23$>     $function$;
CREATE FUNCTION
db23=> CREATE OR REPLACE FUNCTION digoal.dy_ddl(sql text)
db23->      RETURNS VOID
db23->      LANGUAGE plpgsql
db23->      STRICT
db23->     AS $function$
db23$>       declare
db23$>       begin
db23$>         execute sql;
db23$>         return;
db23$>       exception when others then return;
db23$>       end;
db23$>     $function$;
CREATE FUNCTION
準備需要遷移的數據的外部表:
db23=> create extension postgres_fdw;
CREATE EXTENSION

CREATE SERVER db23
        FOREIGN DATA WRAPPER postgres_fdw
        OPTIONS (host 'old.pg.rds.aliyuncs.com', port '3433', dbname 'db23');

CREATE USER MAPPING FOR digoal  --  locale user
        SERVER db23
        OPTIONS (user 'digoal', password 'digoal');  --  remote user/password

CREATE FOREIGN TABLE digoal.ft_userinfo (
        dbid int default 23,
        userid int,
    info text
)
        SERVER db23
        OPTIONS (schema_name 'digoal', table_name 'userinfo');

CREATE FOREIGN TABLE digoal.ft_session (
        dbid int default 23,
        userid int,
    last_login timestamp without time zone
)
        SERVER db23
        OPTIONS (schema_name 'digoal', table_name 'session');

CREATE FOREIGN TABLE digoal.ft_tbl_small (
        userid int,
    info text
)
        SERVER db23
        OPTIONS (schema_name 'digoal', table_name 'tbl_small'); 

CREATE FOREIGN TABLE digoal.ft_login_log (
        dbid int default 23,
        userid int,
        db_user name,
        client_addr inet,
        client_port int,
        server_addr inet,
        server_port int,
        login_time timestamp without time zone
)
        SERVER db23
        OPTIONS (schema_name 'digoal', table_name 'login_log');

創建物化視圖,日誌表(日誌數據不遷移)
CREATE MATERIALIZED VIEW digoal.userinfo (
        dbid ,
        userid ,
    info 
) as select * from digoal.ft_userinfo;
set maintenance_work_mem='10GB';  --  超出RDS內存限製,可能會被殺掉
create unique index pk_userinfo on digoal.userinfo (userid);

CREATE MATERIALIZED VIEW digoal.session (
        dbid ,
        userid ,
    last_login 
) as select * from digoal.ft_session;
set maintenance_work_mem='10GB';  --  超出RDS內存限製,可能會被殺掉
create unique index pk_session on digoal.session (userid);

CREATE MATERIALIZED VIEW digoal.tbl_small (
        userid ,
    info 
) as select * from digoal.ft_tbl_small;
set maintenance_work_mem='10GB';  --  超出RDS內存限製,可能會被殺掉
create unique index pk_tbl_small on digoal.tbl_small (userid);

CREATE TABLE digoal.login_log (
        dbid int default 23,
        userid int,
        db_user name,
        client_addr inet,
        client_port int,
        server_addr inet,
        server_port int,
        login_time timestamp without time zone
);

創建需要遷移的函數:
CREATE OR REPLACE FUNCTION digoal.dy_generate_test_ddl()
     RETURNS VOID
     LANGUAGE plpgsql
     STRICT
    AS $function$
      declare
        node int;
    sql text;
      begin
        select application_name::int into node from pg_stat_activity where pid=pg_backend_pid();
    sql := $a$insert into digoal.userinfo select $a$||node||$a$,generate_series($a$||node||$a$,32000000,32)$a$;
    execute sql;
    sql := $a$insert into digoal.session select dbid,userid from digoal.userinfo$a$;
    execute sql;
        return;
      exception when others then return;
      end;
    $function$;

CREATE OR REPLACE FUNCTION digoal.query_pk(IN i_userid int, OUT dbid int, OUT userid int, OUT info text)
     RETURNS record
     LANGUAGE plpgsql
     STRICT
    AS $function$
      declare
      begin
        select t.dbid,t.userid,t.info into dbid,userid,info from digoal.userinfo t where t.userid=i_userid;
        return;
      end;
    $function$;

CREATE OR REPLACE FUNCTION digoal.insert_log(IN i_userid int)
     RETURNS void
     LANGUAGE plpgsql
     STRICT
    AS $function$
      declare
      begin
        insert into digoal.login_log (userid,db_user,client_addr,client_port,server_addr,server_port,login_time)
        values (i_userid,current_user,inet_client_addr(),inet_client_port(),inet_server_addr(),inet_server_port(),now());
      end;
    $function$;

CREATE OR REPLACE FUNCTION digoal.query_insert(IN i_userid int, OUT dbid int, OUT userid int, OUT info text)
     RETURNS record
     LANGUAGE plpgsql
     STRICT
    AS $function$
      declare
      begin
        select t.dbid,t.userid,t.info into dbid,userid,info from digoal.userinfo t where t.userid=i_userid;
        insert into digoal.login_log (userid,db_user,client_addr,client_port,server_addr,server_port,login_time)
        values (i_userid,current_user,inet_client_addr(),inet_client_port(),inet_server_addr(),inet_server_port(),now());
        return;
      end;
    $function$;


CREATE OR REPLACE FUNCTION digoal.update_pk(IN i_userid int)
     RETURNS void
     LANGUAGE plpgsql
     STRICT
    AS $function$
      declare
      begin
        update digoal.session t set last_login=now() where t.userid=i_userid;
      end;
    $function$;

CREATE OR REPLACE FUNCTION digoal.query_update_insert(IN i_userid int, OUT dbid int, OUT userid int, OUT info text)
     RETURNS record
     LANGUAGE plpgsql
     STRICT
    AS $function$
      declare
      begin
        select t.dbid,t.userid,t.info into dbid,userid,info from digoal.userinfo t where t.userid=i_userid;
        insert into digoal.login_log (userid,db_user,client_addr,client_port,server_addr,server_port,login_time)
        values (i_userid,current_user,inet_client_addr(),inet_client_port(),inet_server_addr(),inet_server_port(),now());
        update digoal.session t set last_login=now() where t.userid=i_userid;
        return;
      end;
    $function$;

CREATE OR REPLACE FUNCTION digoal.query_smalltbl(IN i_userid int, OUT userid int, OUT info text)
     RETURNS record
     LANGUAGE plpgsql
     STRICT
    AS $function$
      declare
      begin
        select t.userid,t.info into userid,info from digoal.tbl_small t where t.userid=i_userid;
        return;
      end;
    $function$;


PL/Proxy節點操作如下:
配置.pgpass, 新增:
new.pg.rds.aliyuncs.com:3433:*:digoal:digoal
使用default_transaction_read_only默認讀事務,屏蔽寫操作.
在遷移時間段內,用戶可以正常執行讀請求,但是如果執行寫請求會失敗,這樣確保數據遷移的一致性,同時降低遷移過程對業務的影響。
postgres# alter server rds_pg_cluster options (set p23 'host=old.pg.rds.aliyuncs.com dbname=db23 port=3433 user=digoal keepalives_idle=30 keepalives_interval=10 keepalives_count=10 application_name=23 options=''-c default_transaction_read_only=true''');    -- 注意裏麵是兩個單引號
測試讀正常,寫失敗:
postgres=# select query_pk(23);
 query_pk 
----------
 (23,23,)
(1 row)

postgres=# select insert_log(23);
ERROR:  public.insert_log(1): [db23] REMOTE ERROR: cannot execute INSERT in a read-only transaction
CONTEXT:  Remote context: SQL statement "insert into digoal.login_log (userid,db_user,client_addr,client_port,server_addr,server_port,login_time)
    values (i_userid,current_user,inet_client_addr(),inet_client_port(),inet_server_addr(),inet_server_port(),now())"
PL/pgSQL function insert_log(integer) line 4 at SQL statement

new RDS執行操作如下,刷新物化視圖:
refresh materialized view CONCURRENTLY digoal.userinfo;
REFRESH MATERIALIZED VIEW
Time: 10953.220 ms
refresh materialized view CONCURRENTLY digoal.session;
REFRESH MATERIALIZED VIEW
Time: 11013.860 ms
refresh materialized view CONCURRENTLY digoal.tbl_small;
REFRESH MATERIALIZED VIEW
Time: 5084.118 ms

這裏卡住,因為要修改數據字典需要超級用戶。而RDS提供的用戶是普通用戶,無法修改數據字典(雖然有風險,這裏隻為演示)。所以這樣遷移行不通。
如果是超級用戶,那麼操作請參考我前期寫的BLOG
https://blog.163.com/digoal@126/blog/static/163877040201559105235803/

為了演示下去,我隻能選擇全量遷移。(其他增量方法也有,本文不演示)
db23=> drop materialized view digoal.session ;
DROP MATERIALIZED VIEW
Time: 16.528 ms
db23=> drop materialized view digoal.userinfo;
DROP MATERIALIZED VIEW
Time: 15.781 ms
db23=> drop materialized view digoal.tbl_small;
DROP MATERIALIZED VIEW
Time: 9.458 ms

為了提高遷移速度,用了一些手段。
set synchronous_commit=off;
set maintenance_work_mem='10GB';  --  超出RDS內存限製,可能會被殺掉
CREATE TABLE digoal.userinfo (
        dbid int default 23,
        userid int,
    info text
) with (autovacuum_enabled=off, toast.autovacuum_enabled=off);

CREATE TABLE digoal.session (
        dbid int default 23,
        userid int,
    last_login timestamp without time zone
) with (autovacuum_enabled=off, toast.autovacuum_enabled=off);

CREATE TABLE digoal.tbl_small (
        userid int,
    info text
) with (autovacuum_enabled=off, toast.autovacuum_enabled=off);

insert into digoal.userinfo select * from digoal.ft_userinfo;
INSERT 0 1000000
Time: 45290.701 ms
insert into digoal.session select * from digoal.ft_session;
INSERT 0 1000000
Time: 42212.278 ms
insert into digoal.tbl_small select * from digoal.ft_tbl_small;
INSERT 0 500000
Time: 22885.456 ms

alter table digoal.userinfo add constraint pk_userinfo primary key (userid);
ALTER TABLE
Time: 16962.174 ms
alter table digoal.session add constraint pk_session primary key (userid);
ALTER TABLE
Time: 20809.422 ms
alter table digoal.tbl_small add constraint pk_tbl_small primary key (userid);
ALTER TABLE
Time: 17484.201 ms

vacuum analyze digoal.userinfo;
Time: 65.790 ms
vacuum analyze digoal.session;
Time: 65.427 ms
vacuum analyze digoal.tbl_small;
Time: 45.453 ms

alter table digoal.userinfo set (autovacuum_enabled=on, toast.autovacuum_enabled=on);
alter table digoal.session set (autovacuum_enabled=on, toast.autovacuum_enabled=on);
alter table digoal.tbl_small set (autovacuum_enabled=on, toast.autovacuum_enabled=on);


PL/Proxy,  修改集群,db23的目標主機為新的RDS,並且開放讀寫權限:
postgres# alter server rds_pg_cluster options (set p23 'host=old.pg.rds.aliyuncs.com dbname=db23 port=3433 user=digoal keepalives_idle=30 keepalives_interval=10 keepalives_count=10 application_name=23');


刪除老RDS上的db23.
psql -h old.pg.rds.aliyuncs.com -p 3433 -U digoal postgres
postgres=> drop database db23;
DROP DATABASE

測試plproxy分發是否正常分發到新的數據庫:
postgres=# select * from query_pk(23);
 dbid | userid | info 
------+--------+------
   23 |     23 | 
(1 row)

vi test.sql
\setrandom id 1 32000000
select insert_log(:id);
pgbench -M prepared -n -r -f ./test.sql -P 1 -c 30 -j 30 -T 30
progress: 1.0 s, 7853.2 tps, lat 3.340 ms stddev 6.056
progress: 2.0 s, 10766.4 tps, lat 2.432 ms stddev 5.433
progress: 3.0 s, 11395.0 tps, lat 2.277 ms stddev 4.590
progress: 4.0 s, 11622.1 tps, lat 2.216 ms stddev 4.493
progress: 5.0 s, 10519.9 tps, lat 2.454 ms stddev 5.600
progress: 6.0 s, 11153.4 tps, lat 2.317 ms stddev 4.795
progress: 7.0 s, 11474.3 tps, lat 2.312 ms stddev 4.802
progress: 8.0 s, 11398.5 tps, lat 2.253 ms stddev 4.308
progress: 9.0 s, 12106.7 tps, lat 2.174 ms stddev 3.302
progress: 10.0 s, 12567.8 tps, lat 2.068 ms stddev 2.075

連接到新的db23:
psql -h new_rds -p 3433 -U digoal
postgres=> \c db23
psql (9.4.3, server 9.4.1)
You are now connected to database "db23" as user "digoal".
db23=> select count(*) from login_log ;
 count 
-------
 10547
(1 row)

[補充]
1. 如果實際使用內存超出了RDS內存限製,會被殺掉。
postgres=> set maintenance_work_mem='10GB';
SET
postgres=> alter table session add constraint pk_session primary key(userid);
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!> \q

[參考]
1. https://blog.163.com/digoal@126/blog/static/163877040201559105235803/
2. https://www.postgresql.org/docs/9.4/static/libpq-connect.html

最後更新:2017-04-01 13:44:33

  上一篇:go Greenplum 添加或刪除standby master節點
  下一篇:go PostgreSQL的遠程數據操作---postgres_fdw