PostgreSQL 多字段任意組合搜索的性能

標簽

PostgreSQL , 多字段搜索 , 任意字段搜索

背景

PostgreSQL 多字段，任意組合搜索，有三種優化手段：

1、gin索引（支持任意字段組合的查詢）

《寶劍贈英雄 - 任意組合字段等效查詢, 探探PostgreSQL多列展開式B樹 (GIN)》

2、bloom索引（支持任意隻讀組合的等值查詢）

《PostgreSQL 9.6 黑科技 bloom 算法索引，一個索引支撐任意列組合查詢》

3、每個單列btree索引（支持任意字段組合的查詢）

《PostgreSQL bitmapAnd, bitmapOr, bitmap index scan, bitmap heap scan》

例子

create table test(c1 int, c2 int, c3 int, c4 int, c5 int);

bloom, gin, multi-btree幾種索引創建方法

1、bloom

postgres=# create extension bloom ;  
CREATE EXTENSION  
postgres=# create index idx_test12_1 on test12 using bloom (c1,c2,c3,c4,c5);  
CREATE INDEX  
postgres=# explain select * from test12 where c1=1;  
                                 QUERY PLAN                                   
----------------------------------------------------------------------------  
 Bitmap Heap Scan on test12  (cost=13.95..20.32 rows=8 width=20)  
   Recheck Cond: (c1 = 1)  
   ->  Bitmap Index Scan on idx_test12_1  (cost=0.00..13.95 rows=8 width=0)  
         Index Cond: (c1 = 1)  
(4 rows)  
postgres=# explain select * from test12 where c1=1 and c2=1;  
                                 QUERY PLAN                                   
----------------------------------------------------------------------------  
 Bitmap Heap Scan on test12  (cost=18.20..19.42 rows=1 width=20)  
   Recheck Cond: ((c1 = 1) AND (c2 = 1))  
   ->  Bitmap Index Scan on idx_test12_1  (cost=0.00..18.20 rows=1 width=0)  
         Index Cond: ((c1 = 1) AND (c2 = 1))  
(4 rows)  
postgres=# explain select * from test12 where c1=1 or c2=1;  
                                    QUERY PLAN                                      
----------------------------------------------------------------------------------  
 Bitmap Heap Scan on test12  (cost=27.91..38.16 rows=17 width=20)  
   Recheck Cond: ((c1 = 1) OR (c2 = 1))  
   ->  BitmapOr  (cost=27.91..27.91 rows=17 width=0)  
         ->  Bitmap Index Scan on idx_test12_1  (cost=0.00..13.95 rows=8 width=0)  
               Index Cond: (c1 = 1)  
         ->  Bitmap Index Scan on idx_test12_1  (cost=0.00..13.95 rows=8 width=0)  
               Index Cond: (c2 = 1)  
(7 rows)

2、gin

postgres=# create extension btree_gin;  
CREATE EXTENSION  
postgres=# create index idx_test12_1 on test12 using gin (c1,c2,c3,c4,c5);  
CREATE INDEX  
postgres=# explain select * from test12 where c1=1 or c2=1;  
                                   QUERY PLAN                                      
---------------------------------------------------------------------------------  
 Bitmap Heap Scan on test12  (cost=4.94..15.19 rows=17 width=20)  
   Recheck Cond: ((c1 = 1) OR (c2 = 1))  
   ->  BitmapOr  (cost=4.94..4.94 rows=17 width=0)  
         ->  Bitmap Index Scan on idx_test12_1  (cost=0.00..2.46 rows=8 width=0)  
               Index Cond: (c1 = 1)  
         ->  Bitmap Index Scan on idx_test12_1  (cost=0.00..2.46 rows=8 width=0)  
               Index Cond: (c2 = 1)  
(7 rows)  
  
postgres=# explain select * from test12 where c1=1 and c2=1;  
                                QUERY PLAN                                   
---------------------------------------------------------------------------  
 Bitmap Heap Scan on test12  (cost=3.60..4.82 rows=1 width=20)  
   Recheck Cond: ((c1 = 1) AND (c2 = 1))  
   ->  Bitmap Index Scan on idx_test12_1  (cost=0.00..3.60 rows=1 width=0)  
         Index Cond: ((c1 = 1) AND (c2 = 1))  
(4 rows)

3、multi-btree

postgres=# drop index idx_test12_1 ;  
DROP INDEX  
postgres=# create index idx_test12_1 on test12 using btree(c1);  
CREATE INDEX  
postgres=# create index idx_test12_2 on test12 using btree(c2);  
CREATE INDEX  
postgres=# create index idx_test12_3 on test12 using btree(c3);  
CREATE INDEX  
postgres=# create index idx_test12_4 on test12 using btree(c4);  
CREATE INDEX  
postgres=# create index idx_test12_5 on test12 using btree(c5);  
CREATE INDEX  
  
postgres=# explain select * from test12 where c1=1 and c2=1;  
                                   QUERY PLAN                                      
---------------------------------------------------------------------------------  
 Bitmap Heap Scan on test12  (cost=3.08..4.29 rows=1 width=20)  
   Recheck Cond: ((c2 = 1) AND (c1 = 1))  
   ->  BitmapAnd  (cost=3.08..3.08 rows=1 width=0)  
         ->  Bitmap Index Scan on idx_test12_2  (cost=0.00..1.41 rows=8 width=0)  
               Index Cond: (c2 = 1)  
         ->  Bitmap Index Scan on idx_test12_1  (cost=0.00..1.41 rows=8 width=0)  
               Index Cond: (c1 = 1)  
(7 rows)  
  
postgres=# explain select * from test12 where c1=1 or c2=1;  
                                   QUERY PLAN                                      
---------------------------------------------------------------------------------  
 Bitmap Heap Scan on test12  (cost=2.83..13.09 rows=17 width=20)  
   Recheck Cond: ((c1 = 1) OR (c2 = 1))  
   ->  BitmapOr  (cost=2.83..2.83 rows=17 width=0)  
         ->  Bitmap Index Scan on idx_test12_1  (cost=0.00..1.41 rows=8 width=0)  
               Index Cond: (c1 = 1)  
         ->  Bitmap Index Scan on idx_test12_2  (cost=0.00..1.41 rows=8 width=0)  
               Index Cond: (c2 = 1)  
(7 rows)

gin, bloom, btree bitmap scan的性能如何呢？

1600個列的寬表，任意字段組合搜索性能

1、建表

postgres=# do language plpgsql $$  
declare  
  sql text;  
begin  
  sql := 'create table test1 (';  
  for i in 1..1600 loop  
    sql := sql||' c'||i||' int2 default random()*100,';  
  end loop;  
  sql := rtrim(sql,',');  
  sql := sql||')';  
  execute sql;   
    
  for i in 1..1600 loop   
    execute 'create index idx_test1_'||i||' on test1 (c'||i||')';   
  end loop;  
end;  
$$;  
DO

2、寫入測試數據

postgres=# insert into test1 (c1)  select generate_series(1,10000);  
INSERT 0 10000

3、測試腳本

vi test.sql  
  
\set c2 random(1,100)  
\set c3 random(1,100)  
\set c4 random(1,100)  
\set c5 random(1,100)  
\set c6 random(1,100)  
\set c7 random(1,100)  
select c1600 from test1 where c2=:c2 and c3=:c3 and c4=:c4 or (c5=:c5 and c6=:c6 and c7=:c7);

4、測試

pgbench -M prepared -n -r -P 1 -f ./test.sql -c 64 -j 64 -T 120

5、性能

progress: 33.0 s, 208797.8 tps, lat 0.307 ms stddev 0.016  
progress: 34.0 s, 208516.0 tps, lat 0.307 ms stddev 0.032  
progress: 35.0 s, 208574.0 tps, lat 0.307 ms stddev 0.050  
progress: 36.0 s, 208858.2 tps, lat 0.306 ms stddev 0.013  
progress: 37.0 s, 208686.8 tps, lat 0.307 ms stddev 0.043  
progress: 38.0 s, 208764.2 tps, lat 0.307 ms stddev 0.013

注意，使用prepared statement，可以減少硬解析，提高性能。

從測試來看，任意字段的搜索，可以達到0.3毫秒的響應。

最後更新：2017-11-12 01:34:41

PostgreSQL 多字段任意組合搜索的性能

標簽

背景

bloom, gin, multi-btree幾種索引創建方法

1600個列的寬表，任意字段組合搜索性能

上一篇： PostgreSQL 按需切片的實現(TimescaleDB插件自動切片功能的plpgsql schemaless實現)

下一篇： PostgreSQL pg_stat_reset清除track_counts的隱患

相關內容

熱門內容

最新內容