閱讀355 返回首頁    go 阿裏雲 go 技術社區[雲棲]


PostgreSQL數據庫監控中的統計學 - 對象空間的數據分布圖

有時,我們為了直觀的顯示對象的空間占用及分布情況,我們可能會以圖表的形式展示。
通常情況下,我們可以根據需求,以柱狀圖或者聚集圖的形式來展示,從各個維度來了解數據空間占用的分布情況,例如:
1. bucket分布,就是按大小排序,選定要劃分為幾個bucket,每個bucket放同樣數量的對象,輸出bucket的邊界,形式和pg_stats.histogram_bounds的輸出類似。
方法舉例,需要用到窗口函數ntile:
postgres=# select bucket,min(size),max(size),count(*) from (select relname,ntile(10) over( order by pg_relation_size(oid) ) bucket, pg_relation_size(oid) size from pg_class) t group by 1 order by 1;
 bucket |  min  |   max    | count 
--------+-------+----------+-------
      1 |     0 |        0 |    31
      2 |     0 |        0 |    30
      3 |     0 |        0 |    30
      4 |     0 |        0 |    30
      5 |     0 |     8192 |    30
      6 |  8192 |     8192 |    30
      7 |  8192 |    16384 |    30
      8 | 16384 |    16384 |    30
      9 | 16384 |    32768 |    30
     10 | 32768 | 36249600 |    30
(10 rows)

2. 按等間距線性分布,例如每100MB輸出一組落在對應SIZE的對象。這種方式有點像systemtap的
@hist_linear分布
https://sourceware.org/systemtap/langref/Statistics_aggregates.html#SECTION00094100000000000000
方法舉例:
postgres=# select pg_relation_size(oid)/1024/1024,count(*) from pg_class group by 1 order by 1;
 ?column? | count 
----------+-------
        0 |   299
       21 |     1
       34 |     1
(3 rows)

3. 按2^n間距指數分布,這種方式有點像systemtap的
@hist_log分布
https://sourceware.org/systemtap/langref/Statistics_aggregates.html#SECTION00094100000000000000
方法舉例:
首先要將int轉為二進製
https://blog.163.com/digoal@126/blog/static/16387704020132592725462/
create or replace function si32tob(i_num int) returns varbit as $$
declare
  o_bit text;
  o_len int;
  i_conv int;
  i_num_abs int;
  i_pos int;
begin
  if i_num = 0 then return varbit '0'; end if; 
   o_len := 32;
  i_conv := 2;
  i_num_abs := abs(i_num);
  i_pos := trunc((dlog1(i_num_abs))/0.693147180559945);
  o_bit := mod(i_num_abs,i_conv)::text;
  if i_pos >= 1 then
    for i in 1..i_pos loop
      o_bit := mod(i_num_abs>>i, i_conv)||o_bit;
    end loop;
  end if;
  if i_num >=0 then
    null;
  else
    o_len := o_len - char_length(o_bit) - 1;
    o_bit := repeat('0', o_len)||o_bit;
    o_bit := '1'||o_bit;
  end if;
  return o_bit::varbit;
end;
$$ language plpgsql;
輸出
postgres=# select 2^(bit_length(si32tob((pg_relation_size(oid))::int4))-1), count(*) from pg_class group by 1 order by 1;
 ?column? | count 
----------+-------
        1 |   145
     8192 |    67
    16384 |    60
    32768 |    14
    65536 |     6
   131072 |     4
   262144 |     7
   524288 |     1
 16777216 |     1
 33554432 |     1
(10 rows)

4. 聚集分布,可以用k-mean分布插件
https://blog.163.com/digoal@126/blog/static/163877040201571745048121/
[圖]
K_Means

查詢舉例:
postgres=# select class,min(size),max(size),count(*) from (select kmeans(array[pg_relation_size(oid)],10) over() as class,pg_relation_size(oid) size from pg_class ) t group by 1 order by 1;
 class |   min    |   max    | count 
-------+----------+----------+-------
     0 |   671744 |   671744 |     1
     1 |   483328 |   483328 |     1
     2 |   475136 |   475136 |     1
     3 | 22487040 | 22487040 |     1
     4 |   352256 |   352256 |     1
     5 | 36249600 | 36249600 |     1
     6 |   278528 |   319488 |     4
     7 |   221184 |   221184 |     1
     8 |    57344 |   139264 |    10
     9 |        0 |    49152 |   285
(10 rows)

K_Means

最後更新:2017-04-01 13:37:08

  上一篇:go 答疑解惑 · 歸檔進程cp命令的core文件追查
  下一篇:go MongoDB權限管理代碼分析