閱讀284 返回首頁    go 技術社區[雲棲]


PostgreSQL 按需切片的實現(TimescaleDB插件自動切片功能的plpgsql schemaless實現)

標簽

PostgreSQL , schemaless , 自動切片 , track_count , 計數器 , udf , plpgsql , timescaledb


背景

TimescaleDB是PostgreSQL的一款時序數據庫插件,其中自動切片是一個非常深入人心的功能。

https://www.timescale.com/

pic

實際上PostgreSQL plpgsql也可以實現類似的功能,當然,前提是需要使用schemaless的模式。

schemaless的設計思路和應用舉例:

《PostgreSQL 在鐵老大訂單係統中的schemaless設計和性能壓測》

《PostgreSQL schemaless 的實現(類mongodb collection)》

《PostgreSQL 時序最佳實踐 - 證券交易係統數據庫設計 - 阿裏雲RDS PostgreSQL最佳實踐》

下麵,就自動切片這個功能,我們看看schemaless的實現例子。

1、首先要監測寫入量,通過track_counts參數,可以給數據的寫入計數(默認track_counts參數是開啟的)。

《PostgreSQL pg_stat_reset清除track_counts的隱患》

postgres=# select * from pg_stat_all_tables where relname='test1';    
-[ RECORD 1 ]-------+-------    
relid               | 31129    
schemaname          | public    
relname             | test1    
seq_scan            | 0    
seq_tup_read        | 0    
idx_scan            |     
idx_tup_fetch       |     
n_tup_ins           | 1000    
n_tup_upd           | 0    
n_tup_del           | 0    
n_tup_hot_upd       | 0    
n_live_tup          | 1000    
n_dead_tup          | 0    
n_mod_since_analyze | 1000    
last_vacuum         |     
last_autovacuum     |     
last_analyze        |     
last_autoanalyze    |     
vacuum_count        | 0    
autovacuum_count    | 0    
analyze_count       | 0    
autoanalyze_count   | 0    

2、當數據寫入到一定量時,自動寫下一張表。

schemaless自動切片例子

1、設計:

寫入時,通過UDF寫入,並且實時監控每種數據流的寫入速度,並動態做數據分片。

2、測試表:

create table log(id int, info text, crt_time timestamp default now());    
create index idx_log_crt_time on log(crt_time);    
create table log_tmp (like log including all) inherits(log);    

3、分片規則:

當記錄數超過100000時,自動切換分區。

4、UDF定義:

create or replace function f(v_id int, v_info text) returns void as $$    
declare    
  suffix int;    
  v_rows int8;    
  min_time timestamp;    
  max_time timestamp;    
begin    
  -- 插入    
  insert into log_tmp(id,info) values (v_id, v_info);    
      
  -- 判斷記錄數是否達到切換閾值    
  select n_live_tup into v_rows from pg_stat_all_tables where relname='log_tmp' and schemaname='public';    
      
  -- 達到閾值,切換表    
  if v_rows >=100000 then    
    select count(*) into suffix from pg_inherits where inhparent='log'::regclass;    
    select min(crt_time), max(crt_time) into min_time, max_time from log_tmp ;    
    execute 'alter table log_tmp add constraint ck_log_'||suffix||'_1 check (crt_time>='''||min_time||''' and crt_time<='''||max_time||''')';    
    execute 'alter table log_tmp rename to log_'||suffix;    
    create table log_tmp (like log including all) inherits(log);    
  end if;    
      
  return;    
  exception when others then    
    return;    
end;    
$$ language plpgsql strict;    

5、壓測

vi test.sql    
    
select f(1, 'test');    
pgbench -M prepared -n -r -P 1 -f ./test.sql -c 64 -j 64 -T 120    
    
    
NOTICE:  merging column "id" with inherited definition    
NOTICE:  merging column "info" with inherited definition    
NOTICE:  merging column "crt_time" with inherited definition    
progress: 1.0 s, 25350.5 tps, lat 2.487 ms stddev 0.986    
progress: 2.0 s, 26309.0 tps, lat 2.432 ms stddev 0.688    
progress: 3.0 s, 26251.9 tps, lat 2.438 ms stddev 0.741    
progress: 4.0 s, 26451.0 tps, lat 2.420 ms stddev 0.737    
NOTICE:  merging column "id" with inherited definition    
NOTICE:  merging column "info" with inherited definition    
NOTICE:  merging column "crt_time" with inherited definition    
progress: 5.0 s, 29471.0 tps, lat 2.172 ms stddev 0.844    
progress: 6.0 s, 32971.0 tps, lat 1.941 ms stddev 0.670    
progress: 7.0 s, 33028.0 tps, lat 1.938 ms stddev 0.661    
progress: 8.0 s, 33101.0 tps, lat 1.933 ms stddev 0.656    
NOTICE:  merging column "id" with inherited definition    
NOTICE:  merging column "info" with inherited definition    
NOTICE:  merging column "crt_time" with inherited definition    
progress: 9.0 s, 32805.0 tps, lat 1.951 ms stddev 0.752    
    
......    

自動切片成功:

postgres=# \d log_1    
                          Table "public.log_1"    
  Column  |            Type             | Collation | Nullable | Default     
----------+-----------------------------+-----------+----------+---------    
 id       | integer                     |           |          |     
 info     | text                        |           |          |     
 crt_time | timestamp without time zone |           |          | now()    
Indexes:    
    "log_tmp_crt_time_idx" btree (crt_time)    
Check constraints:    
    "ck_log_1_1" CHECK (crt_time >= '2017-11-02 23:52:34.264264'::timestamp without time zone AND crt_time <= '2017-11-02 23:54:05.939958'::timestamp without time zone)    
Inherits: log    
    
postgres=# \d log_2    
                          Table "public.log_2"    
  Column  |            Type             | Collation | Nullable | Default     
----------+-----------------------------+-----------+----------+---------    
 id       | integer                     |           |          |     
 info     | text                        |           |          |     
 crt_time | timestamp without time zone |           |          | now()    
Indexes:    
    "log_tmp_crt_time_idx1" btree (crt_time)    
Check constraints:    
    "ck_log_2_1" CHECK (crt_time >= '2017-11-02 23:54:05.948796'::timestamp without time zone AND crt_time <= '2017-11-02 23:54:10.946987'::timestamp without time zone)    
Inherits: log    

小結

timescaleDB插件還有很多其他的功能,在使用方便也更加的邊界,待TimescaleDB插件成熟,還是推薦使用TimescaleDB。

對於阿裏雲RDS PG,使用本文提到的方法,還可以實現實時寫入RDS PG,同時根據設置的閾值,批量寫入OSS外部表(寫OSS外部表可以使用DBLINK的異步接口)。

《阿裏雲RDS PostgreSQL OSS 外部表 - (dblink異步調用封裝)並行寫提速案例》

pic

最後更新:2017-11-12 01:34:44

  上一篇:go  PostgreSQL Oracle兼容性 - 計算字符長度與字節長度(char(?) 與varchar(?)空格如何計算長度)
  下一篇:go  PostgreSQL 多字段任意組合搜索的性能