540 阿裏雲技術社區[雲棲]

PostgreSQL 十億級模煳查詢最佳實踐

標簽

PostgreSQL , 模煳查詢 , 正則查詢 , pg_trgm , bytea , gin , 函數索引

背景

前模煳(有前綴的模煳)，後模煳(有後綴的模煳)，前後模煳(無前後綴的模煳)，正則匹配都屬於文本搜索領域常見的需求。

PostgreSQL擁有很強的文本搜索能力，除了支持全文檢索，還支持模煳查詢、正則查詢。內置的pg_trgm插件是一般數據庫沒有的，可能很多人沒有聽說過。同時還內置了表達式索引、GIN索引的功能。

不同的模煳查詢需求，有不同的優化方法。

對於前模煳和後模煳，PostgreSQL則與其他數據庫一樣，可以使用btree來加速。後模煳可以使用反轉函數的函數索引來加速。

對於前後模煳和正則匹配，一種方法是使用pg_trgm插件，利用GIN索引加速模煳和正則查詢（輸入3個或3個以上字符的模煳查詢效果很好）。另一種方法是自定義GIN表達式索引的方法，適合於定製的模煳查詢。

一、前模煳與後模煳的優化

1. 前模煳(有前綴的模煳)優化方法

使用b-tree可以支持前模煳的查詢。僅適合於collate="C"的查詢，當數據庫默認的lc_collate<>C時，索引和查詢都需要明確指定collate "C"。

索引、查詢條件的collate必須一致才能使用索引。

例子

test=# create table test(id int, info text);    
CREATE TABLE    
test=# insert into test select generate_series(1,1000000),md5(random()::text);    
INSERT 0 1000000    
test=# create index idx on test(info collate "C");    
CREATE INDEX    
    
test=# explain (analyze,verbose,timing,costs,buffers) select * from test where info like 'abcd%' collate "C";    
                                                      QUERY PLAN                                                          
----------------------------------------------------------------------------------------------------------------------    
 Index Scan using idx on public.test  (cost=0.42..16.76 rows=100 width=37) (actual time=0.057..0.093 rows=18 loops=1)    
   Output: id, info    
   Index Cond: ((test.info >= 'abcd'::text) AND (test.info < 'abce'::text))    
   Filter: (test.info ~~ 'abcd%'::text COLLATE "C")    
   Buffers: shared hit=18 read=3    
 Planning time: 0.424 ms    
 Execution time: 0.124 ms    
(7 rows)

2. 後模煳(有後綴的模煳)的優化方法

使用反轉函數(reverse)索引，可以支持後模煳的查詢。僅適合於collate="C"的查詢，當數據庫默認的lc_collate<>C時，索引和查詢都需要明確指定collate "C"。

索引、查詢條件的collate必須一致才能使用索引。

例子

test=# create index idx1 on test(reverse(info) collate "C");    
CREATE INDEX    
test=# select * from test limit 1;    
 id |               info                   
----+----------------------------------    
  1 | b3275976cdd437a033d4329775a52514    
(1 row)    
    
test=# explain (analyze,verbose,timing,costs,buffers) select * from test where reverse(info) like '4152%' collate "C";    
                                                        QUERY PLAN                                                            
--------------------------------------------------------------------------------------------------------------------------    
 Index Scan using idx1 on public.test  (cost=0.42..4009.43 rows=5000 width=37) (actual time=0.061..0.097 rows=18 loops=1)    
   Output: id, info    
   Index Cond: ((reverse(test.info) >= '4152'::text) AND (reverse(test.info) < '4153'::text))    
   Filter: (reverse(test.info) ~~ '4152%'::text COLLATE "C")    
   Buffers: shared hit=18 read=3    
 Planning time: 0.128 ms    
 Execution time: 0.122 ms    
(7 rows)    
    
test=# select * from test where reverse(info) like '4152%' collate "C";    
   id   |               info                   
--------+----------------------------------    
 847904 | abe2ecd90393b5275df8e34a39702514    
 414702 | 97f66d26545329321164042657d02514    
 191232 | 7820972c6220c2b01d46c11ebb532514    
 752742 | 93232ac39c6632e2540df44627c42514    
 217302 | 39e518893a1a7b1e691619bd1fc42514    
      1 | b3275976cdd437a033d4329775a52514    
 615718 | 4948f94c484c13dc6c4fae8a3db52514    
 308815 | fc2918ceff7c7a4dafd2e04031062514    
 149521 | 546d963842ea5ca593e622c810262514    
 811093 | 4b6eca2eb6b665af67b2813e91a62514    
 209000 | 1dfd0d4e326715c1739f031cca992514    
 937616 | 8827fd81f5b673fb5afecbe3e11b2514    
 419553 | bd6e01ce360af16137e8b6abc8ab2514    
 998324 | 7dff51c19dc5e5d9979163e7d14c2514    
 771518 | 8a54e30003a48539fff0aedc73ac2514    
 691566 | f90368348e3b6bf983fcbe10db2d2514    
 652274 | 8bf4a97b5f122a5540a21fa85ead2514    
 233437 | 739ed715fc203d47e37e79b5bcbe2514    
(18 rows)

3. 前、後模煳的合體優化方法

使用pg_trgm索引，可以支持前、後模煳的查詢，注意：

(有前綴的模煳)至少輸入1個字符，(有後綴的模煳)至少輸入2個字符。才有好的索引過濾效果。如果要支持中文，數據庫lc_collate,lc_ctype不能為"C"。

索引、查詢條件的collate必須一致才能使用索引。

test=# \l+ test    
                                              List of databases    
 Name |  Owner   | Encoding |  Collate   |   Ctype    | Access privileges |  Size  | Tablespace | Description     
------+----------+----------+------------+------------+-------------------+--------+------------+-------------    
 test | postgres | UTF8     | zh_CN.utf8 | zh_CN.utf8 |                   | 245 MB | pg_default |     
(1 row)    
    
test=# create extension pg_trgm;    
    
test=# create table test001(c1 text);    
CREATE TABLE

生成隨機中文字符串的函數

test=# create or replace function gen_hanzi(int) returns text as $$                
declare      
  res text;      
begin      
  if $1 >=1 then      
    select string_agg(chr(19968+(random()*20901)::int), '') into res from generate_series(1,$1);      
    return res;      
  end if;      
  return null;      
end;      
$$ language plpgsql strict;      
CREATE FUNCTION

生成隨機數據

test=# insert into test001 select gen_hanzi(20) from generate_series(1,100000);    
INSERT 0 100000    
    
test=# create index idx_test001_1 on test001 using gin (c1 gin_trgm_ops);    
CREATE INDEX    
    
test=# select * from test001 limit 5;    
                    c1                        
------------------------------------------    
 埳噪辦甾訥昃碇玾隉箖燋邢賀浮媊踮菵暔谉橅    
 秌橑籛鴎擬倶敤麁黿醠轇坙騉鏦纗蘛婃坹嫻儅    
 蔎緾鎧爪鵬二悲膼朠麻鸂鋬楨窷違繇糭嘓索籓    
 馳泅薬鐗愅撞竅溮滲蛁灎厀攚摐瞪拡擜詜隝緼    
 襳鋪煃匶瀌懲荼黹樆惺篋搔羾憯墆鋃硍蔓恧顤    
(5 rows)

模煳查詢

test=# explain (analyze,verbose,timing,costs,buffers) select * from test001 where c1 like '你%';    
                                                      QUERY PLAN                                                           
-----------------------------------------------------------------------------------------------------------------------    
 Bitmap Heap Scan on public.test001  (cost=5.08..15.20 rows=10 width=61) (actual time=0.030..0.034 rows=3 loops=1)    
   Output: c1    
   Recheck Cond: (test001.c1 ~~ '你%'::text)    
   Heap Blocks: exact=3    
   Buffers: shared hit=7    
   ->  Bitmap Index Scan on idx_test001_1  (cost=0.00..5.08 rows=10 width=0) (actual time=0.020..0.020 rows=3 loops=1)    
         Index Cond: (test001.c1 ~~ '你%'::text)    
         Buffers: shared hit=4    
 Planning time: 0.119 ms    
 Execution time: 0.063 ms    
(10 rows)    
    
test=# explain (analyze,verbose,timing,costs,buffers) select * from test001 where c1 like '%恧顤';    
                                                      QUERY PLAN                                                           
-----------------------------------------------------------------------------------------------------------------------    
 Bitmap Heap Scan on public.test001  (cost=5.08..15.20 rows=10 width=61) (actual time=0.031..0.034 rows=1 loops=1)    
   Output: c1    
   Recheck Cond: (test001.c1 ~~ '%恧顤'::text)    
   Rows Removed by Index Recheck: 1    
   Heap Blocks: exact=2    
   Buffers: shared hit=6    
   ->  Bitmap Index Scan on idx_test001_1  (cost=0.00..5.08 rows=10 width=0) (actual time=0.020..0.020 rows=2 loops=1)    
         Index Cond: (test001.c1 ~~ '%恧顤'::text)    
         Buffers: shared hit=4    
 Planning time: 0.136 ms    
 Execution time: 0.062 ms    
(11 rows)

二、前後均模煳的優化

使用pg_trgm插件，支持前後模煳的查詢。注意：

如果需要讓pg_trgm支持中文的模煳查詢，數據庫lc_collate,lc_ctype不能為"C"。

比如輸入3個或3個以上字符，否則效果不佳。

例子

test=# explain (analyze,verbose,timing,costs,buffers) select * from test001 where c1 like '%燋邢賀%';    
                                                      QUERY PLAN                                                           
-----------------------------------------------------------------------------------------------------------------------    
 Bitmap Heap Scan on public.test001  (cost=5.08..15.20 rows=10 width=61) (actual time=0.038..0.038 rows=1 loops=1)    
   Output: c1    
   Recheck Cond: (test001.c1 ~~ '%燋邢賀%'::text)    
   Heap Blocks: exact=1    
   Buffers: shared hit=5    
   ->  Bitmap Index Scan on idx_test001_1  (cost=0.00..5.08 rows=10 width=0) (actual time=0.025..0.025 rows=1 loops=1)    
         Index Cond: (test001.c1 ~~ '%燋邢賀%'::text)    
         Buffers: shared hit=4    
 Planning time: 0.170 ms    
 Execution time: 0.076 ms    
(10 rows)    
    
test=# explain (analyze,verbose,timing,costs,buffers) select * from test001 where c1 like '%燋邢%';    
                                                              QUERY PLAN                                                                  
--------------------------------------------------------------------------------------------------------------------------------------    
 Bitmap Heap Scan on public.test001  (cost=7615669.08..7615679.20 rows=10 width=61) (actual time=147.524..178.232 rows=1 loops=1)    
   Output: c1    
   Recheck Cond: (test001.c1 ~~ '%燋邢%'::text)    
   Rows Removed by Index Recheck: 99999    
   Heap Blocks: exact=1137    
   Buffers: shared hit=14429    
   ->  Bitmap Index Scan on idx_test001_1  (cost=0.00..7615669.08 rows=10 width=0) (actual time=147.377..147.377 rows=100000 loops=1)    
         Index Cond: (test001.c1 ~~ '%燋邢%'::text)    
         Buffers: shared hit=13292    
 Planning time: 0.133 ms    
 Execution time: 178.265 ms    
(11 rows)

三、正則匹配的優化

目前，pg_trgm對中文(或多字節字符)的正則匹配效果不佳，對ascii字符的正則匹配效果很好。

Rows Removed by Index Recheck很小，說明索引過濾性很好。

例子

test=# explain (analyze,verbose,timing,costs,buffers) select * from test001 where c1 ~ '12[0-9]{3,9}';    
                                                       QUERY PLAN                                                           
------------------------------------------------------------------------------------------------------------------------    
 Bitmap Heap Scan on public.test001  (cost=65.08..75.20 rows=10 width=61) (actual time=0.196..0.196 rows=0 loops=1)    
   Output: c1    
   Recheck Cond: (test001.c1 ~ '12[0-9]{3,9}'::text)    
   Rows Removed by Index Recheck: 1    
   Heap Blocks: exact=1    
   Buffers: shared hit=50    
   ->  Bitmap Index Scan on idx_test001_1  (cost=0.00..65.08 rows=10 width=0) (actual time=0.183..0.183 rows=1 loops=1)    
         Index Cond: (test001.c1 ~ '12[0-9]{3,9}'::text)    
         Buffers: shared hit=49    
 Planning time: 0.452 ms    
 Execution time: 0.221 ms    
(11 rows)    
    
test=# explain (analyze,verbose,timing,costs,buffers) select * from test001 where c1 ~ '燋邢賀';    
                                                              QUERY PLAN                                                                  
--------------------------------------------------------------------------------------------------------------------------------------    
 Bitmap Heap Scan on public.test001  (cost=7615669.08..7615679.20 rows=10 width=61) (actual time=143.846..232.278 rows=1 loops=1)    
   Output: c1    
   Recheck Cond: (test001.c1 ~ '燋邢賀'::text)    
   Rows Removed by Index Recheck: 99999    
   Heap Blocks: exact=1137    
   Buffers: shared hit=14429    
   ->  Bitmap Index Scan on idx_test001_1  (cost=0.00..7615669.08 rows=10 width=0) (actual time=143.688..143.688 rows=100000 loops=1)    
         Index Cond: (test001.c1 ~ '燋邢賀'::text)    
         Buffers: shared hit=13292    
 Planning time: 0.254 ms    
 Execution time: 232.312 ms    
(11 rows)

pg_trgm模煳查詢的原理

首先，pg_trgm將字符串的前端添加2個空格，末尾添加1個空格。

然後，每連續的3個字符為一個TOKEN，拆開。

最後，對TOKEN建立GIN倒排索引。

查看字符串的TOKEN，可以使用如下方法。

test=# select show_trgm('123');    
        show_trgm            
-------------------------    
 {"  1"," 12",123,"23 "}    
(1 row)

pg_trgm前後模煳字符個數要求的原因

使用pg_trgm時，如果要獲得最好的效果，最好滿足這些條件。

1. 有前綴的模煳查詢，例如a%，至少需要提供1個字符。( 搜索的是token=' a' )

2. 有後綴的模煳查詢，例如%ab，至少需要提供2個字符。( 搜索的是token='ab ' )

3. 前後模煳查詢，例如%abcd%，至少需要提供3個字符。( 這個使用數組搜索，搜索的是token(s) 包含 {" a"," ab",abc,bcd,"cd "} )

原因是什麼呢？

因為pg_trgm生成的TOKEN是三個字符，隻有在以上三個條件下，才能匹配到對應的TOKEN。

test=# select show_trgm('123');    
        show_trgm            
-------------------------    
 {"  1"," 12",123,"23 "}    
(1 row)

四、小於3個輸入字符的模煳查詢的優化

當需要前後模煳搜索1個或者2個字符時，pg_trgm無法滿足需求，但是我們可以使用表達式GIN索引。

使用表達式，將字符串拆成1個單字，兩個連續的字符的數組，對數組建立GIN索引即可。

例子

test=# create or replace function split001(text) returns text[] as $$    
declare    
  res text[];    
begin    
  select regexp_split_to_array($1,'') into res;    
  for i in 1..length($1)-1 loop    
    res := array_append(res, substring($1,i,2));    
  end loop;    
  return res;    
end;    
$$ language plpgsql strict immutable;    
CREATE FUNCTION    
    
test=# create index idx_test001_2 on test001 using gin (split001(c1));    
    
test=# explain (analyze,verbose,timing,costs,buffers) select * from test001 where split001(c1) @> array['你好'];    
                                                       QUERY PLAN                                                           
------------------------------------------------------------------------------------------------------------------------    
 Bitmap Heap Scan on public.test001  (cost=8.87..550.12 rows=500 width=61) (actual time=0.041..0.041 rows=0 loops=1)    
   Output: c1    
   Recheck Cond: (split001(test001.c1) @> '{你好}'::text[])    
   Buffers: shared hit=4    
   ->  Bitmap Index Scan on idx_test001_2  (cost=0.00..8.75 rows=500 width=0) (actual time=0.039..0.039 rows=0 loops=1)    
         Index Cond: (split001(test001.c1) @> '{你好}'::text[])    
         Buffers: shared hit=4    
 Planning time: 0.104 ms    
 Execution time: 0.068 ms    
(9 rows)    
    
test=# explain (analyze,verbose,timing,costs,buffers) select * from test001 where split001(c1) @> array['你'];    
                                                       QUERY PLAN                                                            
-------------------------------------------------------------------------------------------------------------------------    
 Bitmap Heap Scan on public.test001  (cost=8.87..550.12 rows=500 width=61) (actual time=0.063..0.183 rows=86 loops=1)    
   Output: c1    
   Recheck Cond: (split001(test001.c1) @> '{你}'::text[])    
   Heap Blocks: exact=80    
   Buffers: shared hit=84    
   ->  Bitmap Index Scan on idx_test001_2  (cost=0.00..8.75 rows=500 width=0) (actual time=0.048..0.048 rows=86 loops=1)    
         Index Cond: (split001(test001.c1) @> '{你}'::text[])    
         Buffers: shared hit=4    
 Planning time: 0.101 ms    
 Execution time: 0.217 ms    
(10 rows)    
    
test=# select * from test001 where split001(c1) @> array['你'];    
                    c1                        
------------------------------------------    
 殐踨洪冨垓丩賢閚偉垢胸鍘崩你萭隡劭芛雫袰    
 靅慨熱臉罆淓寘鰻總襎戍謸棖陪丼倫柆套你仮    
......

五、相似查詢優化

模煳查詢和正則匹配都是找出完全符合條件的記錄，還有一種需求是相似查詢。

例如postgresql字符串，輸入 p0stgresgl 也能根據相似度匹配到。

例子

test=# create index idx_test001_3 on test001 using gist (c1 gist_trgm_ops);    
CREATE INDEX    
      
test=# explain (analyze,verbose,timing,costs,buffers) SELECT t, c1 <-> '癷磛鶿蠌鰓蠲123鶡埀婎鯿苿奶垨惸溴蔻筴熝憡' AS dist      
  FROM test001 t      
  ORDER BY dist LIMIT 5;      
                                                                   QUERY PLAN                                                                        
-------------------------------------------------------------------------------------------------------------------------------------------------    
 Limit  (cost=0.28..0.52 rows=5 width=89) (actual time=37.462..37.639 rows=5 loops=1)    
   Output: t.*, ((c1 <-> '癷磛鶿蠌鰓蠲123鶡埀婎鯿苿奶垨惸溴蔻筴熝憡'::text))    
   Buffers: shared hit=1631    
   ->  Index Scan using idx_test001_3 on public.test001 t  (cost=0.28..4763.28 rows=100000 width=89) (actual time=37.461..37.636 rows=5 loops=1)    
         Output: t.*, (c1 <-> '癷磛鶿蠌鰓蠲123鶡埀婎鯿苿奶垨惸溴蔻筴熝憡'::text)    
         Order By: (t.c1 <-> '癷磛鶿蠌鰓蠲123鶡埀婎鯿苿奶垨惸溴蔻筴熝憡'::text)    
         Buffers: shared hit=1631    
 Planning time: 0.089 ms    
 Execution time: 37.668 ms    
(9 rows)    
    
test=# SELECT t, c1 <-> '癷磛鶿蠌鰓蠲123鶡埀婎鯿苿奶垨惸溴蔻筴熝憡' AS dist      
  FROM test001 t      
  ORDER BY dist LIMIT 5;      
                     t                      |   dist       
--------------------------------------------+----------    
 (癷磛鶿蠌鰓蠲你鶡埀婎鯿苿奶垨惸溴蔻筴熝憡) | 0.307692    
 (坆桻悁斾耾瑚豌腏炁悿隖軻盃掗稐睟礓蜮鉛湆) | 0.976744    
 (癷鉜餯祂鼃恫蝅瓟顡廕梍蛸歡僷贊敔欓侑韌鐹) | 0.976744    
 (癷謔鳬戚蹪熼胘檙佌欔韜挹樷覄惶蹝頊鑜鞖媗) | 0.976744    
 (癷饎瞲餿堒歃峽盾豼擔禞嵪豦咢脈餛竨濟隘緘) | 0.976744    
(5 rows)

六、小結

1. 如果隻有前模煳查詢需求，使用collate "C"的b-tree索引。

2. 如果隻有後模煳的查詢需求，使用collate "C"的reverse()表達式的b-tree索引。

3. 如果有前後模煳查詢需求，並且包含中文，請使用lc_collate,lc_ctype <> "C"的數據庫，同時使用pg_trgm插件的gin索引。

4. 如果有前後模煳查詢需求，並且不包含中文，請使用pg_trgm插件的gin索引。

5. 如果有含有ascii字符的正則表達式查詢需求，請使用pg_trgm插件的gin索引。

6. 如果有輸入條件少於3個字符的模煳查詢需求，可以使用GIN表達式索引，通過數組包含的方式進行搜索，性能一樣非常好。

七、性能

1億條記錄，每條記錄15個隨機中文，一共8GB。測試前後模煳查詢性能。

1. 生成測試數據

vi test.sql    
insert into test001 select gen_hanzi(15) from generate_series(1,2500000);    
    
pgbench -n -r -P 1 -f ./test.sql -c 40 -j 40 -t 1 test    
  
  
  
test=# select count(*) from test001;    
   count       
-----------    
 100000000    
(1 row)    
test=# select * from test001 limit 10;  
               c1                 
--------------------------------  
 釾笉皜鰈確艄騚馺腃彊釲忰采汦擇  
 槮搮圮墔婂蹾飄孡鶒鎮貲聵線麯櫕  
 孨鄈韞萅赫炧暤蟠檼駧餪崉媧譌筯  
 烸喖醝稦怩鷟棾奜妛曫仾飛饡繪韋  
 撐豁襉峊炠眏罱襄彊鰮莆壏妒辷闤  
 蜁愊鶱磹貳帵眲嚉榑蒼潵簷簄椰魨  
 瑄翁蠃巨躋壾蛸湗鑂顂櫟砣八癱栵  
 餷巍笿鞽裝棊嘢恓煓熴錩鋈蹃煿屓  
 訆韄踔牤嘇糺絢軿鵑燿螛梋鰢謇郼  
 撲蓨傷釱糕觩嬖蓷鰼繩圓醷熌靉掑  
(10 rows)

2. 創建索引

test=# set maintenance_work_mem ='32GB';    
test=# create index idx_test001_1 on test001 using gin (c1 gin_trgm_ops);

3. 模煳查詢性能測試

3.1 前模煳

響應時間：9毫秒

返回4701行

select * from test001 where c1 like '你%';  
  
test=# explain (analyze,verbose,timing,costs,buffers) select * from test001 where c1 like '你%';  
                                                          QUERY PLAN                                                            
------------------------------------------------------------------------------------------------------------------------------  
 Bitmap Heap Scan on public.test001  (cost=89.50..10161.50 rows=10000 width=46) (actual time=1.546..8.868 rows=4701 loops=1)  
   Output: c1  
   Recheck Cond: (test001.c1 ~~ '你%'::text)  
   Rows Removed by Index Recheck: 85  
   Heap Blocks: exact=4776  
   Buffers: shared hit=4784  
   ->  Bitmap Index Scan on idx_test001_1  (cost=0.00..87.00 rows=10000 width=0) (actual time=0.879..0.879 rows=4786 loops=1)  
         Index Cond: (test001.c1 ~~ '你%'::text)  
         Buffers: shared hit=8  
 Planning time: 0.099 ms  
 Execution time: 9.166 ms  
(11 rows)

3.2 後模煳

響應時間：0.25毫秒

返回2行

select * from test001 where c1 like '%靉掑';  
  
test=# explain (analyze,verbose,timing,costs,buffers) select * from test001 where c1 like '%靉掑';  
                                                         QUERY PLAN                                                           
----------------------------------------------------------------------------------------------------------------------------  
 Bitmap Heap Scan on public.test001  (cost=89.50..10161.50 rows=10000 width=46) (actual time=0.049..0.223 rows=2 loops=1)  
   Output: c1  
   Recheck Cond: (test001.c1 ~~ '%靉掑'::text)  
   Rows Removed by Index Recheck: 87  
   Heap Blocks: exact=89  
   Buffers: shared hit=94  
   ->  Bitmap Index Scan on idx_test001_1  (cost=0.00..87.00 rows=10000 width=0) (actual time=0.031..0.031 rows=89 loops=1)  
         Index Cond: (test001.c1 ~~ '%靉掑'::text)  
         Buffers: shared hit=5  
 Planning time: 0.113 ms  
 Execution time: 0.249 ms  
(11 rows)

3.3 前後模煳

響應時間：0.2毫秒

返回1行

select * from test001 where c1 like '%螛梋鰢%';  
  
test=# explain (analyze,verbose,timing,costs,buffers) select * from test001 where c1 like '%螛梋鰢%';  
                                                         QUERY PLAN                                                           
----------------------------------------------------------------------------------------------------------------------------  
 Bitmap Heap Scan on public.test001  (cost=89.50..10161.50 rows=10000 width=46) (actual time=0.044..0.175 rows=1 loops=1)  
   Output: c1  
   Recheck Cond: (test001.c1 ~~ '%螛梋鰢%'::text)  
   Rows Removed by Index Recheck: 81  
   Heap Blocks: exact=82  
   Buffers: shared hit=87  
   ->  Bitmap Index Scan on idx_test001_1  (cost=0.00..87.00 rows=10000 width=0) (actual time=0.027..0.027 rows=82 loops=1)  
         Index Cond: (test001.c1 ~~ '%螛梋鰢%'::text)  
         Buffers: shared hit=5  
 Planning time: 0.112 ms  
 Execution time: 0.201 ms  
(11 rows)

最後更新：2017-04-26 17:00:37

PostgreSQL 十億級模煳查詢最佳實踐

標簽

背景

一、前模煳與後模煳的優化

二、前後均模煳的優化

三、正則匹配的優化

pg_trgm模煳查詢的原理

pg_trgm前後模煳字符個數要求的原因

四、小於3個輸入字符的模煳查詢的優化

五、相似查詢優化

六、小結

七、性能

上一篇： PostgreSQL 中生成隨機漢字

下一篇：知識點

相關內容

熱門內容

最新內容

PostgreSQL 十億級模煳查詢最佳實踐

標簽

背景

一、前模煳與後模煳的優化

二、前後均模煳的優化

三、正則匹配的優化

pg_trgm模煳查詢的原理

pg_trgm前後模煳字符個數要求的原因

四、小於3個輸入字符的模煳查詢的優化

五、相似查詢優化

六、小結

七、性能

上一篇： PostgreSQL 中生成隨機漢字

下一篇： 知識點

相關內容

熱門內容

最新內容

下一篇：知識點