PgSQL · 源碼分析 · PG 優化器中的pathkey與索引在排序時的使用

概要

SQL在PostgreSQL中的處理，是類似於流水線方式的處理，先後由：

詞法、語法解析，生成解析樹後，將其交給語義解析
語義解析，生成查詢樹，將其交給Planner
Planner根據查詢樹，生成執行計劃，交給執行器
執行器執行完成後返回結果

數據庫優化器在生成執行計劃的時候，優化器會考慮是否需要使用索引，而使用了索引之後，則會考慮如何利用索引已經排過序的特點，來優化相關的排序，比如ORDER BY / GROUP BY等。

先來看個索引對ORDER BY起作用的例子：

postgres=# create table t(id int, name text, value int);
CREATE TABLE
postgres=# create index t_value on t(value);
CREATE INDEX
postgres=# explain select * from 
postgres-# t order by value;
                              QUERY PLAN                              
----------------------------------------------------------------------
 Index Scan using t_value on t  (cost=0.15..61.55 rows=1160 width=40)
(1 row)

postgres=# explain select * from 
t order by name;
                         QUERY PLAN                         
------------------------------------------------------------
 Sort  (cost=80.64..83.54 rows=1160 width=40)
   Sort Key: name
   ->  Seq Scan on t  (cost=0.00..21.60 rows=1160 width=40)
(3 rows)

由此可見，通過索引進行查詢後，是可以直接利用已經索引的有序需不需要再次進行排序。

本文將介紹優化器如何在已有索引的基礎上，優化排序的。

SQL的流水線處理

數據庫以流水線的方式處理SQL請求，當一個SQL到來後：

以SQL中SELECT語句的基本表達形式為例：

SELCT $targets FROM $tables_or_sub_queries WHERE $quals GROUP BY $columns ORDER BY $columns LIMIT $num OFFSET $columns;

為了表示一個SELECT的語句，語義解析之前是SelectStmt結構，其中包括targetlist、FROM 子句、WHERE子句、GROUP BY子句等。

在語義解析之後，會引入一個Query結構，該Query結構隻表示當前語句中的內容，並不直接包括需要遞歸的子句，比如子查詢(子查詢用RangeTblEntry描述，存放在Query->rtable列表中)等。在Query之後，優化器根據其中的內容生成RelOptInfo，作為整個執行計劃的入口。

Query 結構如下，此處我們著重關注rtable和jointree：

   95	/*
   96  * Query -
   97  *    Parse analysis turns all statements into a Query tree
   98  *    for further processing by the rewriter and planner.
   99  *
  100  *    Utility statements (i.e. non-optimizable statements) have the
  101  *    utilityStmt field set, and the rest of the Query is mostly dummy.
  102  *
  103  *    Planning converts a Query tree into a Plan tree headed by a PlannedStmt
  104  *    node --- the Query structure is not used by the executor.
  105  */
  106 typedef struct Query
  107 {
  108     NodeTag     type;
  109 
  110     CmdType     commandType;    /* select|insert|update|delete|utility */
  111 
  112     QuerySource querySource;    /* where did I come from? */
  113 
  114     uint32      queryId;        /* query identifier (can be set by plugins) */
  115 
  116     bool        canSetTag;      /* do I set the command result tag? */
  117 
  118     Node       *utilityStmt;    /* non-null if commandType == CMD_UTILITY */
  119 
  120     int         resultRelation; /* rtable index of target relation for
  121                                  * INSERT/UPDATE/DELETE; 0 for SELECT */
  122 
  ...
  
  
  133     List       *cteList;        /* WITH list (of CommonTableExpr's) */
  134 
  135     List       *rtable;         /* list of range table entries */
  136     FromExpr   *jointree;       /* table join tree (FROM and WHERE clauses) */
  137 
  138     List       *targetList;     /* target list (of TargetEntry) */
  139 
  
  ...
  
  146     List       *groupClause;    /* a list of SortGroupClause's */
  147 

  156     List       *sortClause;     /* a list of SortGroupClause's */
  157 
  158     Node       *limitOffset;    /* # of result tuples to skip (int8 expr) */
  159     Node       *limitCount;     /* # of result tuples to return (int8 expr) */

  ...
  
  180     int         stmt_len;       /* length in bytes; 0 means "rest of string" */
  181 } Query;

在Query中，此次最關注的是由$tables_or_sub_queries解析得來的Query->rtable。Query->rtable是一個RangeTblEntry的列表，用於表示$tables_or_sub_queries中的以下幾種類型：

表
子查詢，表示出另外一個子句，但不包含子句中的Query，而是由RangeTblEntry中的subquery來描述其對應的Query
JOIN，除了將JOIN相關的表添加到Query->rtable外，還會加入一個RangeTblEntry表示JOIN表達式用於後麵的執行計劃
函數

同時也會添加到相應的ParseState->p_joinlist，後轉換為FromExpr作為Query->jointree。後麵的執行計劃生成階段主要依賴Query->jointree和Query->rtable用於處理pathkey相關的信息。

執行計劃生成

在SQL的操作中，幾乎所有的操作（比如查詢）最終都會落在實際的表上，那麼在執行計劃中表的表示就比較重要。PostgreSQL用RelOptInfo結構體來表示，如下：


  518 typedef struct RelOptInfo
  519 {
  520     NodeTag     type;
  521 
  522     RelOptKind  reloptkind;
  523 
  524     /* all relations included in this RelOptInfo */
  525     Relids      relids;         /* set of base relids (rangetable indexes) */
  526 

  ...  

  537 
  538     /* materialization information */
  539     List       *pathlist;       /* Path structures */
  540     List       *ppilist;        /* ParamPathInfos used in pathlist */
  541     List       *partial_pathlist;   /* partial Paths */
  542     struct Path *cheapest_startup_path;
  543     struct Path *cheapest_total_path;
  544     struct Path *cheapest_unique_path;
  545     List       *cheapest_parameterized_paths;
  
  ...
  
  552     /* information about a base rel (not set for join rels!) */
  553     Index       relid;
  554     Oid         reltablespace;  /* containing tablespace */
  555     RTEKind     rtekind;        /* RELATION, SUBQUERY, FUNCTION, etc */
  
  ...
  
  562     List       *indexlist;      /* list of IndexOptInfo */
  563     List       *statlist;       /* list of StatisticExtInfo */

  ...
  584     /* used by various scans and joins: */
  585     List       *baserestrictinfo;   /* RestrictInfo structures (if base rel) */
  586     QualCost    baserestrictcost;   /* cost of evaluating the above */
  587     Index       baserestrict_min_security;  /* min security_level found in
  588                                              * baserestrictinfo */
  589     List       *joininfo;       /* RestrictInfo structures for join clauses

  ...
  595 } RelOptInfo;

事實上，RelOptInfo是執行計劃路徑生成的主要數據結構，同樣用於表述表、子查詢、函數等。

在SQL查詢中，JOIN是最為耗時，執行計劃的生成首先考慮JOIN。因此，整個執行計劃路徑的入口即為一個JOIN類型的RelOptInfo。當隻是單表的查詢時，則執行計劃入口為這張表的RelOptInfo。

執行計劃的生成過程，就是從下往上處理到最上層的RelOptInfo->pathlist的過程，選擇有成本較優先節點、刪除無用節點，最後得到一個成本最優的執行計劃。

在整個過程中，大約分為以下幾步：

獲取表信息
創建表RelOptInfo，將所有該表的掃瞄路徑加入到該表的RelOptInfo->pathlist
創建JOIN的RelOptInfo，將所有可能的JOIN順序和方式以Path結構體添加到RelOptInfo->pathlist
針對JOIN的RelOptInfo，添加GROUP BY、ORDER BY等節點

生成範圍表的掃瞄節點

執行計劃一開始，即首先將獲取所有的表信息，並以RelOptInfo（baserel）存放在PlannerInfo結構體中的simple_rel_array中，如RelOptInfo中的indexlist用於表示這張表的索引信息，用於判斷是否可以用上索引。

為每張表建立掃瞄路徑，一般有順序掃瞄和索引掃瞄兩種。掃瞄路徑用Path結構體來表示，並存放在該表對應的RelOptInfo->pathlist中。Path結構體如下：

  948 typedef struct Path
  949 {
  950     NodeTag     type;
  951 
  952     NodeTag     pathtype;       /* tag identifying scan/join method */
  953 
  954     RelOptInfo *parent;         /* the relation this path can build */
  955     PathTarget *pathtarget;     /* list of Vars/Exprs, cost, width */
  956 
  957     ParamPathInfo *param_info;  /* parameterization info, or NULL if none */
  958 
  959     bool        parallel_aware; /* engage parallel-aware logic? */
  960     bool        parallel_safe;  /* OK to use as part of parallel plan? */
  961     int         parallel_workers;   /* desired # of workers; 0 = not parallel */
  962 
  963     /* estimated size/costs for path (see costsize.c for more info) */
  964     double      rows;           /* estimated number of result tuples */
  965     Cost        startup_cost;   /* cost expended before fetching any tuples */
  966     Cost        total_cost;     /* total cost (assuming all tuples fetched) */
  967 
  968     List       *pathkeys;       /* sort ordering of path's output */
  969     /* pathkeys is a List of PathKey nodes; see above */
  970 } Path;

在添加表的掃瞄路徑時，會首先添加順序掃瞄（seqscan）到這張表的RelOptInfo->pathlist，保證表數據的獲取。而後考慮indexscan掃瞄節點等其他方式。

當RelOptInfo->indexlist滿足RelOptInfo->baserestrictinfo中的過濾條件，或滿足RelOptInfo->joininfo等條件時，則認為index是有效的。然後根據統計信息（如過濾性等）計算成本後，建立index掃瞄節點。

在建立index掃瞄節點時，根據索引建立時的情況（排序順序、比較操作符等），創建PathKeys的列表（可能多個字段），存放在IndexPath->Path->pathkeys中。PathKeys的結構體如下：

  830 /*
  831  * PathKeys
  832  *
  833  * The sort ordering of a path is represented by a list of PathKey nodes.
  834  * An empty list implies no known ordering.  Otherwise the first item
  835  * represents the primary sort key, the second the first secondary sort key,
  836  * etc.  The value being sorted is represented by linking to an
  837  * EquivalenceClass containing that value and including pk_opfamily among its
  838  * ec_opfamilies.  The EquivalenceClass tells which collation to use, too.
  839  * This is a convenient method because it makes it trivial to detect
  840  * equivalent and closely-related orderings. (See optimizer/README for more
  841  * information.)
  842  *
  843  * Note: pk_strategy is either BTLessStrategyNumber (for ASC) or
  844  * BTGreaterStrategyNumber (for DESC).  We assume that all ordering-capable
  845  * index types will use btree-compatible strategy numbers.
  846  */
  847 typedef struct PathKey
  848 {
  849     NodeTag     type;
  850 
  851     EquivalenceClass *pk_eclass;    /* the value that is ordered */
  852     Oid         pk_opfamily;    /* btree opfamily defining the ordering */
  853     int         pk_strategy;    /* sort direction (ASC or DESC) */
  854     bool        pk_nulls_first; /* do NULLs come before normal values? */
  855 } PathKey;
  856

事實上，PathKeys可以用於所有已排過序的RelOptInfo中，用於表示這個表、函數、子查詢、JOIN等是有序的，作為上層判斷選擇Path的依據之一。

在建立除seqscan之外的其他節點時，會與pathlist中已有的每個節點根據啟動成本和總體成本做對比（相差在一定比值，默認1%），則分為四種情況：

新建節點和已有節點，其中一方啟動成本和總成本都更優，且其pathkeys也更優，那麼刪除另外一個
新建節點和所有已有節點的啟動成本和總成本兩方麵的對比不一致（如總成本高但啟動成本較低，或反過來），且新建節點總成本較低，則會全部保留並添加到RelOptInfo->pathlist中。
新節點和已有節點，其中一方啟動成和總成本都更優，但其pathkeys不夠，則兩者都保留，由上層Path節點來判斷
當新建節點和已有節點成本相同時，則對比兩者的pathkeys，選擇保留更優pathkeys的節點

此時，即完成一張表所有的Path的生成，保存在該表的RelOptInfo->pathlist中，並從中選擇一條成本最低的Path，作為RelOptInfo->cheapest_total_path。索引掃瞄節點的pathkeys將會被上層路徑在與排序相關節點中用到，如ORDER BY、GROUP BY、MERGE JOIN等。

生成JOIN節點

JOIN節點生成的算法較為複雜，簡單來說，會針對所有參與JOIN的表，動態規劃不同的順序和JOIN方式，然後生成不同的Path加到這個JOIN的RelOptInfo->pathlist中。

最終執行計劃的生成

在完成JOIN的各個路徑判斷後，針對各路徑選擇成本最低的Path（表的JOIN順序和JOIN方式）作為最優路徑，並依據這個路徑上的pathkeys處理ORDER BY、GROUP BY等其他子句的計劃，從而完成最終的執行計劃。

在前麵的介紹中，每張表的RelOptInfo->pathlist中的indexscan的Path都帶有pathkeys信息，即表明這個節點執行完之後的結果是按pathkeys來排序的。那麼在以下幾個地方則可以用到該特性：

MERGE JOIN

在建立JOIN節點時，會有多種JOIN方式可以選擇，如NESTLOOP、MERGE JOIN等。當建立了MERGE JOIN節點之後，一般是需要對兩張表進行排序。但當某張表的掃瞄節點返回的是有序的，且該順序與查詢所需完全一致，則會去除這個排序節點，從而在成本上占據優勢。
ORDER BY

當最終的RelOptInfo節點建立完成後，會拿表RelOptInfo->pathlist中成本最低的Path，與帶有pathkeys的Path做成本上的對比，選擇成本更低的路徑。如果最終是pathkeys的路徑，那麼該RelOptInfo的pathkeys會保留。

若該SQL語句中帶ORDER BY，則可以判斷該RelOptInfo的pathkeys是否對ORDER BY（字段和排列順序一致），則不必再建立ORDER BY節點。如果pathkeys沒有幫助，則會建立排序節點
GROUP BY

GROUP BY有多種方式。如果RelOptInfo中的pathkeys與在解析階段產生的GROUP BY的pathkeys一致，則從成本上對RelOptInfo結果集的pathkeys對該GROUP BY是否有效，從而可以考慮選用SORT加AGG的方式。這種方式，因為pathkeys的存在，則不必再建SORT 節點。然後再對比與其他方式的成本，擇優采用。
子查詢

如果JOIN中包含子查詢，那麼則在JOIN的RelOptInfo->pathlist中添加一個subquery類型的Path，並把子查詢中的排序的結果指定為pathkeys放在該Path中。從而上層節點，可以用上麵同樣的方法，選用該RelOptInfo中最優的Path，並根據pathkeys決定是否需要排序。

總結

通過以上表述，可以說明一條SQL語句的執行計劃入口是一個RelOptInfo結構，其中成員pathlist則標示所有不同的查找路徑，在這些路徑中最終會落在表的RelOptInfo->pathlist中最優的Path中。如果該Path帶有pathkeys，那麼上層在處理ORT相關的操作時，可以根據pathkeys是否對排序有效而決定是否需要排序節點，從而選擇成本更低的路徑。

最後更新：2017-08-21 09:03:04

PgSQL · 源碼分析 · PG 優化器中的pathkey與索引在排序時的使用

概要

SQL的流水線處理

執行計劃生成

生成範圍表的掃瞄節點

生成JOIN節點

最終執行計劃的生成

總結

上一篇： MSSQL· 實現分析 · Extend Event日誌文件的分析方法

下一篇： MySQL · 源碼分析 · 內存分配機製

相關內容

熱門內容

最新內容