PostgreSQL 聚合表達式 FILTER , order , within group 用法

標簽

PostgreSQL , 聚合 , filter , order , within group

背景

PostgreSQL的分析功能還是比較強大的，例如支持多維分析，支持4大類聚合，支持窗口查詢，支持遞歸查詢等。

4大類聚合的用法請參考

《PostgreSQL aggregate function 1 : General-Purpose Aggregate Functions》

《PostgreSQL aggregate function 2 : Aggregate Functions for Statistics》

《PostgreSQL aggregate function 3 : Aggregate Functions for Ordered-Set》

《PostgreSQL aggregate function 4 : Hypothetical-Set Aggregate Functions》

多維分析請參考

《Greenplum 最佳實踐 - 多維分析的使用(CUBE, ROLLUP, GROUPING SETS in GreenPlum and Oracle)》

《PostgreSQL 9.5 new feature - Support GROUPING SETS, CUBE and ROLLUP.》

窗口查詢請參考

《時序數據合並場景加速分析和實現 - 複合索引，窗口分組查詢加速，變態遞歸加速》

《快速入門PostgreSQL應用開發與管理 - 4 高級SQL用法》

遞歸查詢請參考

《快速入門PostgreSQL應用開發與管理 - 3 訪問數據》

本文主要介紹一下聚合表達式的高級用法

aggregate_name (expression [ , ... ] [ order_by_clause ] ) [ FILTER ( WHERE filter_clause ) ]  
  
aggregate_name (ALL expression [ , ... ] [ order_by_clause ] ) [ FILTER ( WHERE filter_clause ) ]  
  
aggregate_name (DISTINCT expression [ , ... ] [ order_by_clause ] ) [ FILTER ( WHERE filter_clause ) ]  
  
aggregate_name ( * ) [ FILTER ( WHERE filter_clause ) ]  
  
aggregate_name ( [ expression [ , ... ] ] ) WITHIN GROUP ( order_by_clause ) [ FILTER ( WHERE filter_clause ) ]

例子

1. 我們在分組後，需要查出分組中複合條件的count，以及分組的count。

postgres=# create table test(id int, c1 int);  
CREATE TABLE  
postgres=# insert into test select generate_series(1,10000), random()*10;  
INSERT 0 10000  
  
postgres=# select * from test limit 10;  
 id | c1   
----+----  
  1 | 10  
  2 |  4  
  3 |  6  
  4 |  1  
  5 |  4  
  6 |  9  
  7 |  9  
  8 |  7  
  9 |  5  
 10 |  4  
(10 rows)

postgres=# select count(*), count(*) filter (where id<1000) from test group by c1;  
 count | count   
-------+-------  
  1059 |   118  
   998 |   109  
   999 |   101  
  1010 |    95  
   468 |    48  
   544 |    43  
   964 |   107  
   956 |   103  
  1021 |    87  
   977 |   101  
  1004 |    87  
(11 rows)

2. 我們需要按順序，將多條記錄，聚合為一個字符串或數組，同樣也可以加filter，隻聚合複合條件的記錄

postgres=# select string_agg(id::text, '-' order by id) filter (where id<100) from test group by c1;  
                string_agg                   
-------------------------------------------  
 35-65-74-97  
 4-12-19-31-36-40-85-89-90-98-99  
 17-18-22-42-43-44-58-59-64-70-75-83-84  
 11-14-15-16-21-30-41-54-62-67-73-80-81-94  
 2-5-10-51-79-93-96  
 9-26-45-46-47-61  
 3-27-28-37-48-55-56-68-69-77-92  
 8-20-24-33-34-49-50-60-63-66-78-91  
 25-39-53-57-71-76-82-87-95  
 6-7-29-32-38-72-86-88  
 1-13-23-52  
(11 rows)

3. 我們需要去每個分組，某個字段中值

postgres=# select percentile_cont(0.5) within group (order by id) from test group by c1;  
 percentile_cont   
-----------------  
          4911.5  
            5210  
            4698  
          4699.5  
            4955  
          5061.5  
            5115  
            5176  
          4897.5  
            5087  
            4973  
(11 rows)

4. 去過濾條件後的中值

postgres=# select percentile_cont(0.5) within group (order by id) filter (where id<100) from test group by c1;  
 percentile_cont   
-----------------  
            69.5  
              40  
              58  
            47.5  
              51  
            45.5  
              55  
            49.5  
              71  
              35  
              18  
(11 rows)

小結

PostgreSQL 的分析方法很全麵，建議用戶可以多了解一下我在開始時給出的鏈接，幫助提高生產力。

參考

https://www.postgresql.org/docs/9.6/static/sql-expressions.html

最後更新：2017-05-04 21:01:14

PostgreSQL 聚合表達式 FILTER , order , within group 用法

標簽

背景

例子

小結

參考

上一篇： NLP技術的應用及思考

下一篇：異步流複製模式如何保證不丟數據?

相關內容

熱門內容

最新內容

PostgreSQL 聚合表達式 FILTER , order , within group 用法

標簽

背景

例子

小結

參考

上一篇： NLP技術的應用及思考

下一篇： 異步流複製模式如何保證不丟數據?

相關內容

熱門內容

最新內容

下一篇：異步流複製模式如何保證不丟數據?