540 阿里云技术社区[云栖]

PostgreSQL 十亿级模糊查询最佳实践

背景

前模糊(有前缀的模糊)，后模糊(有后缀的模糊)，前后模糊(无前后缀的模糊)，正则匹配都属于文本搜索领域常见的需求。

PostgreSQL拥有很强的文本搜索能力，除了支持全文检索，还支持模糊查询、正则查询。内置的pg_trgm插件是一般数据库没有的，可能很多人没有听说过。同时还内置了表达式索引、GIN索引的功能。

不同的模糊查询需求，有不同的优化方法。

对于前模糊和后模糊，PostgreSQL则与其他数据库一样，可以使用btree来加速。后模糊可以使用反转函数的函数索引来加速。

对于前后模糊和正则匹配，一种方法是使用pg_trgm插件，利用GIN索引加速模糊和正则查询（输入3个或3个以上字符的模糊查询效果很好）。另一种方法是自定义GIN表达式索引的方法，适合于定制的模糊查询。

一、前模糊与后模糊的优化

1. 前模糊(有前缀的模糊)优化方法

使用b-tree可以支持前模糊的查询。仅适合于collate="C"的查询，当数据库默认的lc_collate<>C时，索引和查询都需要明确指定collate "C"。

索引、查询条件的collate必须一致才能使用索引。

例子

test=# create table test(id int, info text);    
CREATE TABLE    
test=# insert into test select generate_series(1,1000000),md5(random()::text);    
INSERT 0 1000000    
test=# create index idx on test(info collate "C");    
CREATE INDEX    
    
test=# explain (analyze,verbose,timing,costs,buffers) select * from test where info like 'abcd%' collate "C";    
                                                      QUERY PLAN                                                          
----------------------------------------------------------------------------------------------------------------------    
 Index Scan using idx on public.test  (cost=0.42..16.76 rows=100 width=37) (actual time=0.057..0.093 rows=18 loops=1)    
   Output: id, info    
   Index Cond: ((test.info >= 'abcd'::text) AND (test.info < 'abce'::text))    
   Filter: (test.info ~~ 'abcd%'::text COLLATE "C")    
   Buffers: shared hit=18 read=3    
 Planning time: 0.424 ms    
 Execution time: 0.124 ms    
(7 rows)

2. 后模糊(有后缀的模糊)的优化方法

使用反转函数(reverse)索引，可以支持后模糊的查询。仅适合于collate="C"的查询，当数据库默认的lc_collate<>C时，索引和查询都需要明确指定collate "C"。

索引、查询条件的collate必须一致才能使用索引。

例子

test=# create index idx1 on test(reverse(info) collate "C");    
CREATE INDEX    
test=# select * from test limit 1;    
 id |               info                   
----+----------------------------------    
  1 | b3275976cdd437a033d4329775a52514    
(1 row)    
    
test=# explain (analyze,verbose,timing,costs,buffers) select * from test where reverse(info) like '4152%' collate "C";    
                                                        QUERY PLAN                                                            
--------------------------------------------------------------------------------------------------------------------------    
 Index Scan using idx1 on public.test  (cost=0.42..4009.43 rows=5000 width=37) (actual time=0.061..0.097 rows=18 loops=1)    
   Output: id, info    
   Index Cond: ((reverse(test.info) >= '4152'::text) AND (reverse(test.info) < '4153'::text))    
   Filter: (reverse(test.info) ~~ '4152%'::text COLLATE "C")    
   Buffers: shared hit=18 read=3    
 Planning time: 0.128 ms    
 Execution time: 0.122 ms    
(7 rows)    
    
test=# select * from test where reverse(info) like '4152%' collate "C";    
   id   |               info                   
--------+----------------------------------    
 847904 | abe2ecd90393b5275df8e34a39702514    
 414702 | 97f66d26545329321164042657d02514    
 191232 | 7820972c6220c2b01d46c11ebb532514    
 752742 | 93232ac39c6632e2540df44627c42514    
 217302 | 39e518893a1a7b1e691619bd1fc42514    
      1 | b3275976cdd437a033d4329775a52514    
 615718 | 4948f94c484c13dc6c4fae8a3db52514    
 308815 | fc2918ceff7c7a4dafd2e04031062514    
 149521 | 546d963842ea5ca593e622c810262514    
 811093 | 4b6eca2eb6b665af67b2813e91a62514    
 209000 | 1dfd0d4e326715c1739f031cca992514    
 937616 | 8827fd81f5b673fb5afecbe3e11b2514    
 419553 | bd6e01ce360af16137e8b6abc8ab2514    
 998324 | 7dff51c19dc5e5d9979163e7d14c2514    
 771518 | 8a54e30003a48539fff0aedc73ac2514    
 691566 | f90368348e3b6bf983fcbe10db2d2514    
 652274 | 8bf4a97b5f122a5540a21fa85ead2514    
 233437 | 739ed715fc203d47e37e79b5bcbe2514    
(18 rows)

3. 前、后模糊的合体优化方法

使用pg_trgm索引，可以支持前、后模糊的查询，注意：

(有前缀的模糊)至少输入1个字符，(有后缀的模糊)至少输入2个字符。才有好的索引过滤效果。如果要支持中文，数据库lc_collate,lc_ctype不能为"C"。

索引、查询条件的collate必须一致才能使用索引。

test=# \l+ test    
                                              List of databases    
 Name |  Owner   | Encoding |  Collate   |   Ctype    | Access privileges |  Size  | Tablespace | Description     
------+----------+----------+------------+------------+-------------------+--------+------------+-------------    
 test | postgres | UTF8     | zh_CN.utf8 | zh_CN.utf8 |                   | 245 MB | pg_default |     
(1 row)    
    
test=# create extension pg_trgm;    
    
test=# create table test001(c1 text);    
CREATE TABLE

生成随机中文字符串的函数

test=# create or replace function gen_hanzi(int) returns text as $$                
declare      
  res text;      
begin      
  if $1 >=1 then      
    select string_agg(chr(19968+(random()*20901)::int), '') into res from generate_series(1,$1);      
    return res;      
  end if;      
  return null;      
end;      
$$ language plpgsql strict;      
CREATE FUNCTION

生成随机数据

test=# insert into test001 select gen_hanzi(20) from generate_series(1,100000);    
INSERT 0 100000    
    
test=# create index idx_test001_1 on test001 using gin (c1 gin_trgm_ops);    
CREATE INDEX    
    
test=# select * from test001 limit 5;    
                    c1                        
------------------------------------------    
 埳噪办甾讷昃碇玾陧箖燋邢贺浮媊踮菵暔谉橅    
 秌橑籛鴎拟倶敤麁鼋醠轇坙騉鏦纗蘛婃坹娴儅    
 蔎緾铠爪鹏二悲膼朠麻鸂鋬桢窷违繇糭嘓索籓    
 驰泅薬鐗愅撞窍浉渗蛁灎厀攚摐瞪拡擜詜隝緼    
 襳铺煃匶瀌惩荼黹樆惺箧搔羾憯墆锒硍蔓恧顤    
(5 rows)

模糊查询

test=# explain (analyze,verbose,timing,costs,buffers) select * from test001 where c1 like '你%';    
                                                      QUERY PLAN                                                           
-----------------------------------------------------------------------------------------------------------------------    
 Bitmap Heap Scan on public.test001  (cost=5.08..15.20 rows=10 width=61) (actual time=0.030..0.034 rows=3 loops=1)    
   Output: c1    
   Recheck Cond: (test001.c1 ~~ '你%'::text)    
   Heap Blocks: exact=3    
   Buffers: shared hit=7    
   ->  Bitmap Index Scan on idx_test001_1  (cost=0.00..5.08 rows=10 width=0) (actual time=0.020..0.020 rows=3 loops=1)    
         Index Cond: (test001.c1 ~~ '你%'::text)    
         Buffers: shared hit=4    
 Planning time: 0.119 ms    
 Execution time: 0.063 ms    
(10 rows)    
    
test=# explain (analyze,verbose,timing,costs,buffers) select * from test001 where c1 like '%恧顤';    
                                                      QUERY PLAN                                                           
-----------------------------------------------------------------------------------------------------------------------    
 Bitmap Heap Scan on public.test001  (cost=5.08..15.20 rows=10 width=61) (actual time=0.031..0.034 rows=1 loops=1)    
   Output: c1    
   Recheck Cond: (test001.c1 ~~ '%恧顤'::text)    
   Rows Removed by Index Recheck: 1    
   Heap Blocks: exact=2    
   Buffers: shared hit=6    
   ->  Bitmap Index Scan on idx_test001_1  (cost=0.00..5.08 rows=10 width=0) (actual time=0.020..0.020 rows=2 loops=1)    
         Index Cond: (test001.c1 ~~ '%恧顤'::text)    
         Buffers: shared hit=4    
 Planning time: 0.136 ms    
 Execution time: 0.062 ms    
(11 rows)

二、前后均模糊的优化

使用pg_trgm插件，支持前后模糊的查询。注意：

如果需要让pg_trgm支持中文的模糊查询，数据库lc_collate,lc_ctype不能为"C"。

比如输入3个或3个以上字符，否则效果不佳。

例子

test=# explain (analyze,verbose,timing,costs,buffers) select * from test001 where c1 like '%燋邢贺%';    
                                                      QUERY PLAN                                                           
-----------------------------------------------------------------------------------------------------------------------    
 Bitmap Heap Scan on public.test001  (cost=5.08..15.20 rows=10 width=61) (actual time=0.038..0.038 rows=1 loops=1)    
   Output: c1    
   Recheck Cond: (test001.c1 ~~ '%燋邢贺%'::text)    
   Heap Blocks: exact=1    
   Buffers: shared hit=5    
   ->  Bitmap Index Scan on idx_test001_1  (cost=0.00..5.08 rows=10 width=0) (actual time=0.025..0.025 rows=1 loops=1)    
         Index Cond: (test001.c1 ~~ '%燋邢贺%'::text)    
         Buffers: shared hit=4    
 Planning time: 0.170 ms    
 Execution time: 0.076 ms    
(10 rows)    
    
test=# explain (analyze,verbose,timing,costs,buffers) select * from test001 where c1 like '%燋邢%';    
                                                              QUERY PLAN                                                                  
--------------------------------------------------------------------------------------------------------------------------------------    
 Bitmap Heap Scan on public.test001  (cost=7615669.08..7615679.20 rows=10 width=61) (actual time=147.524..178.232 rows=1 loops=1)    
   Output: c1    
   Recheck Cond: (test001.c1 ~~ '%燋邢%'::text)    
   Rows Removed by Index Recheck: 99999    
   Heap Blocks: exact=1137    
   Buffers: shared hit=14429    
   ->  Bitmap Index Scan on idx_test001_1  (cost=0.00..7615669.08 rows=10 width=0) (actual time=147.377..147.377 rows=100000 loops=1)    
         Index Cond: (test001.c1 ~~ '%燋邢%'::text)    
         Buffers: shared hit=13292    
 Planning time: 0.133 ms    
 Execution time: 178.265 ms    
(11 rows)

三、正则匹配的优化

目前，pg_trgm对中文(或多字节字符)的正则匹配效果不佳，对ascii字符的正则匹配效果很好。

Rows Removed by Index Recheck很小，说明索引过滤性很好。

例子

test=# explain (analyze,verbose,timing,costs,buffers) select * from test001 where c1 ~ '12[0-9]{3,9}';    
                                                       QUERY PLAN                                                           
------------------------------------------------------------------------------------------------------------------------    
 Bitmap Heap Scan on public.test001  (cost=65.08..75.20 rows=10 width=61) (actual time=0.196..0.196 rows=0 loops=1)    
   Output: c1    
   Recheck Cond: (test001.c1 ~ '12[0-9]{3,9}'::text)    
   Rows Removed by Index Recheck: 1    
   Heap Blocks: exact=1    
   Buffers: shared hit=50    
   ->  Bitmap Index Scan on idx_test001_1  (cost=0.00..65.08 rows=10 width=0) (actual time=0.183..0.183 rows=1 loops=1)    
         Index Cond: (test001.c1 ~ '12[0-9]{3,9}'::text)    
         Buffers: shared hit=49    
 Planning time: 0.452 ms    
 Execution time: 0.221 ms    
(11 rows)    
    
test=# explain (analyze,verbose,timing,costs,buffers) select * from test001 where c1 ~ '燋邢贺';    
                                                              QUERY PLAN                                                                  
--------------------------------------------------------------------------------------------------------------------------------------    
 Bitmap Heap Scan on public.test001  (cost=7615669.08..7615679.20 rows=10 width=61) (actual time=143.846..232.278 rows=1 loops=1)    
   Output: c1    
   Recheck Cond: (test001.c1 ~ '燋邢贺'::text)    
   Rows Removed by Index Recheck: 99999    
   Heap Blocks: exact=1137    
   Buffers: shared hit=14429    
   ->  Bitmap Index Scan on idx_test001_1  (cost=0.00..7615669.08 rows=10 width=0) (actual time=143.688..143.688 rows=100000 loops=1)    
         Index Cond: (test001.c1 ~ '燋邢贺'::text)    
         Buffers: shared hit=13292    
 Planning time: 0.254 ms    
 Execution time: 232.312 ms    
(11 rows)

pg_trgm模糊查询的原理

首先，pg_trgm将字符串的前端添加2个空格，末尾添加1个空格。

然后，每连续的3个字符为一个TOKEN，拆开。

最后，对TOKEN建立GIN倒排索引。

查看字符串的TOKEN，可以使用如下方法。

test=# select show_trgm('123');    
        show_trgm            
-------------------------    
 {"  1"," 12",123,"23 "}    
(1 row)

pg_trgm前后模糊字符个数要求的原因

使用pg_trgm时，如果要获得最好的效果，最好满足这些条件。

1. 有前缀的模糊查询，例如a%，至少需要提供1个字符。( 搜索的是token=' a' )

2. 有后缀的模糊查询，例如%ab，至少需要提供2个字符。( 搜索的是token='ab ' )

3. 前后模糊查询，例如%abcd%，至少需要提供3个字符。( 这个使用数组搜索，搜索的是token(s) 包含 {" a"," ab",abc,bcd,"cd "} )

原因是什么呢？

因为pg_trgm生成的TOKEN是三个字符，只有在以上三个条件下，才能匹配到对应的TOKEN。

test=# select show_trgm('123');    
        show_trgm            
-------------------------    
 {"  1"," 12",123,"23 "}    
(1 row)

四、小于3个输入字符的模糊查询的优化

当需要前后模糊搜索1个或者2个字符时，pg_trgm无法满足需求，但是我们可以使用表达式GIN索引。

使用表达式，将字符串拆成1个单字，两个连续的字符的数组，对数组建立GIN索引即可。

例子

test=# create or replace function split001(text) returns text[] as $$    
declare    
  res text[];    
begin    
  select regexp_split_to_array($1,'') into res;    
  for i in 1..length($1)-1 loop    
    res := array_append(res, substring($1,i,2));    
  end loop;    
  return res;    
end;    
$$ language plpgsql strict immutable;    
CREATE FUNCTION    
    
test=# create index idx_test001_2 on test001 using gin (split001(c1));    
    
test=# explain (analyze,verbose,timing,costs,buffers) select * from test001 where split001(c1) @> array['你好'];    
                                                       QUERY PLAN                                                           
------------------------------------------------------------------------------------------------------------------------    
 Bitmap Heap Scan on public.test001  (cost=8.87..550.12 rows=500 width=61) (actual time=0.041..0.041 rows=0 loops=1)    
   Output: c1    
   Recheck Cond: (split001(test001.c1) @> '{你好}'::text[])    
   Buffers: shared hit=4    
   ->  Bitmap Index Scan on idx_test001_2  (cost=0.00..8.75 rows=500 width=0) (actual time=0.039..0.039 rows=0 loops=1)    
         Index Cond: (split001(test001.c1) @> '{你好}'::text[])    
         Buffers: shared hit=4    
 Planning time: 0.104 ms    
 Execution time: 0.068 ms    
(9 rows)    
    
test=# explain (analyze,verbose,timing,costs,buffers) select * from test001 where split001(c1) @> array['你'];    
                                                       QUERY PLAN                                                            
-------------------------------------------------------------------------------------------------------------------------    
 Bitmap Heap Scan on public.test001  (cost=8.87..550.12 rows=500 width=61) (actual time=0.063..0.183 rows=86 loops=1)    
   Output: c1    
   Recheck Cond: (split001(test001.c1) @> '{你}'::text[])    
   Heap Blocks: exact=80    
   Buffers: shared hit=84    
   ->  Bitmap Index Scan on idx_test001_2  (cost=0.00..8.75 rows=500 width=0) (actual time=0.048..0.048 rows=86 loops=1)    
         Index Cond: (split001(test001.c1) @> '{你}'::text[])    
         Buffers: shared hit=4    
 Planning time: 0.101 ms    
 Execution time: 0.217 ms    
(10 rows)    
    
test=# select * from test001 where split001(c1) @> array['你'];    
                    c1                        
------------------------------------------    
 殐踨洪冨垓丩贤閚伟垢胸铡崩你萭隡劭芛雫袰    
 靅慨热脸罆淓寘鳗总襎戍謸枨陪丼伦柆套你仮    
......

五、相似查询优化

模糊查询和正则匹配都是找出完全符合条件的记录，还有一种需求是相似查询。

例如postgresql字符串，输入 p0stgresgl 也能根据相似度匹配到。

例子

test=# create index idx_test001_3 on test001 using gist (c1 gist_trgm_ops);    
CREATE INDEX    
      
test=# explain (analyze,verbose,timing,costs,buffers) SELECT t, c1 <-> '癷磛鹚蠌鳃蠲123鹖埀婎鳊苿奶垨惸溴蔻筴熝憡' AS dist      
  FROM test001 t      
  ORDER BY dist LIMIT 5;      
                                                                   QUERY PLAN                                                                        
-------------------------------------------------------------------------------------------------------------------------------------------------    
 Limit  (cost=0.28..0.52 rows=5 width=89) (actual time=37.462..37.639 rows=5 loops=1)    
   Output: t.*, ((c1 <-> '癷磛鹚蠌鳃蠲123鹖埀婎鳊苿奶垨惸溴蔻筴熝憡'::text))    
   Buffers: shared hit=1631    
   ->  Index Scan using idx_test001_3 on public.test001 t  (cost=0.28..4763.28 rows=100000 width=89) (actual time=37.461..37.636 rows=5 loops=1)    
         Output: t.*, (c1 <-> '癷磛鹚蠌鳃蠲123鹖埀婎鳊苿奶垨惸溴蔻筴熝憡'::text)    
         Order By: (t.c1 <-> '癷磛鹚蠌鳃蠲123鹖埀婎鳊苿奶垨惸溴蔻筴熝憡'::text)    
         Buffers: shared hit=1631    
 Planning time: 0.089 ms    
 Execution time: 37.668 ms    
(9 rows)    
    
test=# SELECT t, c1 <-> '癷磛鹚蠌鳃蠲123鹖埀婎鳊苿奶垨惸溴蔻筴熝憡' AS dist      
  FROM test001 t      
  ORDER BY dist LIMIT 5;      
                     t                      |   dist       
--------------------------------------------+----------    
 (癷磛鹚蠌鳃蠲你鹖埀婎鳊苿奶垨惸溴蔻筴熝憡) | 0.307692    
 (坆桻悁斾耾瑚豌腏炁悿隖轲盃挜稐睟礓蜮铅湆) | 0.976744    
 (癷鉜餯祂鼃恫蝅瓟顡廕梍蛸欢僷贊敔欓侑韧鐹) | 0.976744    
 (癷嚯鳬戚蹪熼胘檙佌欔韬挹樷覄惶蹝顼鑜鞖媗) | 0.976744    
 (癷饎瞲馊堒歃峡盾豼担禞嵪豦咢脉馄竨济隘缄) | 0.976744    
(5 rows)

六、小结

1. 如果只有前模糊查询需求，使用collate "C"的b-tree索引。

2. 如果只有后模糊的查询需求，使用collate "C"的reverse()表达式的b-tree索引。

3. 如果有前后模糊查询需求，并且包含中文，请使用lc_collate,lc_ctype <> "C"的数据库，同时使用pg_trgm插件的gin索引。

4. 如果有前后模糊查询需求，并且不包含中文，请使用pg_trgm插件的gin索引。

5. 如果有含有ascii字符的正则表达式查询需求，请使用pg_trgm插件的gin索引。

6. 如果有输入条件少于3个字符的模糊查询需求，可以使用GIN表达式索引，通过数组包含的方式进行搜索，性能一样非常好。

七、性能

1亿条记录，每条记录15个随机中文，一共8GB。测试前后模糊查询性能。

1. 生成测试数据

vi test.sql    
insert into test001 select gen_hanzi(15) from generate_series(1,2500000);    
    
pgbench -n -r -P 1 -f ./test.sql -c 40 -j 40 -t 1 test    
  
  
  
test=# select count(*) from test001;    
   count       
-----------    
 100000000    
(1 row)    
test=# select * from test001 limit 10;  
               c1                 
--------------------------------  
 釾笉皜鲽确艄騚馺腃彊釲忰采汦择  
 槮搮圮墔婂蹾飘孡鶒镇赀聩线麯櫕  
 孨鄈韫萅赫炧暤蟠檼駧餪崉娲譌筯  
 烸喖醝稦怩鷟棾奜妛曫仾飞饡绘韦  
 撑豁襉峊炠眏罱襄彊鳁莆壏妒辷阛  
 蜁愊鶱磹贰帵眲嚉榑苍潵檐簄椰鲀  
 瑄翁蠃巨跻壾蛸湗鑂顂栎砣八瘫栵  
 馇巍笿鞒装棊嘢恓煓熴锠鋈蹃煿屓  
 訆韄踔牤嘇糺绚軿鹃燿螛梋鰢謇郼  
 扑蓨伤釱糕觩嬖蓷鳛绳圆醷熌叆掑  
(10 rows)

2. 创建索引

test=# set maintenance_work_mem ='32GB';    
test=# create index idx_test001_1 on test001 using gin (c1 gin_trgm_ops);

3. 模糊查询性能测试

3.1 前模糊

响应时间：9毫秒

返回4701行

select * from test001 where c1 like '你%';  
  
test=# explain (analyze,verbose,timing,costs,buffers) select * from test001 where c1 like '你%';  
                                                          QUERY PLAN                                                            
------------------------------------------------------------------------------------------------------------------------------  
 Bitmap Heap Scan on public.test001  (cost=89.50..10161.50 rows=10000 width=46) (actual time=1.546..8.868 rows=4701 loops=1)  
   Output: c1  
   Recheck Cond: (test001.c1 ~~ '你%'::text)  
   Rows Removed by Index Recheck: 85  
   Heap Blocks: exact=4776  
   Buffers: shared hit=4784  
   ->  Bitmap Index Scan on idx_test001_1  (cost=0.00..87.00 rows=10000 width=0) (actual time=0.879..0.879 rows=4786 loops=1)  
         Index Cond: (test001.c1 ~~ '你%'::text)  
         Buffers: shared hit=8  
 Planning time: 0.099 ms  
 Execution time: 9.166 ms  
(11 rows)

3.2 后模糊

响应时间：0.25毫秒

返回2行

select * from test001 where c1 like '%叆掑';  
  
test=# explain (analyze,verbose,timing,costs,buffers) select * from test001 where c1 like '%叆掑';  
                                                         QUERY PLAN                                                           
----------------------------------------------------------------------------------------------------------------------------  
 Bitmap Heap Scan on public.test001  (cost=89.50..10161.50 rows=10000 width=46) (actual time=0.049..0.223 rows=2 loops=1)  
   Output: c1  
   Recheck Cond: (test001.c1 ~~ '%叆掑'::text)  
   Rows Removed by Index Recheck: 87  
   Heap Blocks: exact=89  
   Buffers: shared hit=94  
   ->  Bitmap Index Scan on idx_test001_1  (cost=0.00..87.00 rows=10000 width=0) (actual time=0.031..0.031 rows=89 loops=1)  
         Index Cond: (test001.c1 ~~ '%叆掑'::text)  
         Buffers: shared hit=5  
 Planning time: 0.113 ms  
 Execution time: 0.249 ms  
(11 rows)

3.3 前后模糊

响应时间：0.2毫秒

返回1行

select * from test001 where c1 like '%螛梋鰢%';  
  
test=# explain (analyze,verbose,timing,costs,buffers) select * from test001 where c1 like '%螛梋鰢%';  
                                                         QUERY PLAN                                                           
----------------------------------------------------------------------------------------------------------------------------  
 Bitmap Heap Scan on public.test001  (cost=89.50..10161.50 rows=10000 width=46) (actual time=0.044..0.175 rows=1 loops=1)  
   Output: c1  
   Recheck Cond: (test001.c1 ~~ '%螛梋鰢%'::text)  
   Rows Removed by Index Recheck: 81  
   Heap Blocks: exact=82  
   Buffers: shared hit=87  
   ->  Bitmap Index Scan on idx_test001_1  (cost=0.00..87.00 rows=10000 width=0) (actual time=0.027..0.027 rows=82 loops=1)  
         Index Cond: (test001.c1 ~~ '%螛梋鰢%'::text)  
         Buffers: shared hit=5  
 Planning time: 0.112 ms  
 Execution time: 0.201 ms  
(11 rows)

最后更新：2017-04-26 17:00:37

PostgreSQL 十亿级模糊查询最佳实践

标签

背景

一、前模糊与后模糊的优化

二、前后均模糊的优化

三、正则匹配的优化

pg_trgm模糊查询的原理

pg_trgm前后模糊字符个数要求的原因

四、小于3个输入字符的模糊查询的优化

五、相似查询优化

六、小结

七、性能

上一篇： PostgreSQL 中生成随机汉字

下一篇：知识点

相关内容

热门内容

最新内容

PostgreSQL 十亿级模糊查询最佳实践

标签

背景

一、前模糊与后模糊的优化

二、前后均模糊的优化

三、正则匹配的优化

pg_trgm模糊查询的原理

pg_trgm前后模糊字符个数要求的原因

四、小于3个输入字符的模糊查询的优化

五、相似查询优化

六、小结

七、性能

上一篇： PostgreSQL 中生成随机汉字

下一篇： 知识点

相关内容

热门内容

最新内容

下一篇：知识点