PostgreSQL 源码解读(31)- 查询语句#16(查询优化-表达式预处理#1)

本节简单介绍了PG查询优化对表达式预处理中连接Var(RTE中的Var,其中RTE_KIND=RTE_JOIN)溯源的过程。处理逻辑在主函数subquery_planner中通过调用flatten_join_alias_vars函数实现,该函数位于src/backend/optimizer/util/var.c文件中。

一、基本概念

连接Var溯源,意思是把连接产生的中间结果(中间结果也是Relation关系的一种)的投影替换为实际存在的关系的列(在PG中通过Var表示)。
如下面的SQL语句:

select a.*,b.grbh,b.je 
from t_dwxx a,
     lateral (select t1.dwbh,t1.grbh,t2.je 
              from t_grxx t1 inner join t_jfxx t2 
                   on t1.dwbh = a.dwbh 
                      and t1.grbh = t2.grbh) b;

b与a连接运算,查询树Query中的投影b.grbh和b.je这两列如需依赖关系b(子查询产生的中间结果),则需要把中间关系的投影列替换为实际的Relation的投影列,即t_grxx和t_jfxx的数据列.

PG源码中的注释:

     /*
      * If the query has any join RTEs, replace join alias variables with
      * base-relation variables.  We must do this first, since any expressions
      * we may extract from the joinaliasvars lists have not been preprocessed.
      * For example, if we did this after sublink processing, sublinks expanded
      * out from join aliases would not get processed.  But we can skip this in
      * non-lateral RTE functions, VALUES lists, and TABLESAMPLE clauses, since
      * they can't contain any Vars of the current query level.
      */
     if (root->hasJoinRTEs &&
         !(kind == EXPRKIND_RTFUNC ||
           kind == EXPRKIND_VALUES ||
           kind == EXPRKIND_TABLESAMPLE ||
           kind == EXPRKIND_TABLEFUNC))
        expr = flatten_join_alias_vars(root, expr);

二、源码解读

flatten_join_alias_vars

/*
  * flatten_join_alias_vars
  *    Replace Vars that reference JOIN outputs with references to the original
  *    relation variables instead.  This allows quals involving such vars to be
  *    pushed down.  Whole-row Vars that reference JOIN relations are expanded
  *    into RowExpr constructs that name the individual output Vars.  This
  *    is necessary since we will not scan the JOIN as a base relation, which
  *    is the only way that the executor can directly handle whole-row Vars.
  *
  * This also adjusts relid sets found in some expression node types to
  * substitute the contained base rels for any join relid.
  *
  * If a JOIN contains sub-selects that have been flattened, its join alias
  * entries might now be arbitrary expressions, not just Vars.  This affects
  * this function in one important way: we might find ourselves inserting
  * SubLink expressions into subqueries, and we must make sure that their
  * Query.hasSubLinks fields get set to true if so.  If there are any
  * SubLinks in the join alias lists, the outer Query should already have
  * hasSubLinks = true, so this is only relevant to un-flattened subqueries.
  *
  * NOTE: this is used on not-yet-planned expressions.  We do not expect it
  * to be applied directly to the whole Query, so if we see a Query to start
  * with, we do want to increment sublevels_up (this occurs for LATERAL
  * subqueries).
  */
 Node *
 flatten_join_alias_vars(PlannerInfo *root, Node *node)
 {
     flatten_join_alias_vars_context context;
 
     context.root = root;
     context.sublevels_up = 0;
     /* flag whether join aliases could possibly contain SubLinks */
     context.possible_sublink = root->parse->hasSubLinks;
     /* if hasSubLinks is already true, no need to work hard */
     context.inserted_sublink = root->parse->hasSubLinks;
     //调用flatten_join_alias_vars_mutator处理Vars
     return flatten_join_alias_vars_mutator(node, &context);
 }
 
 static Node *
 flatten_join_alias_vars_mutator(Node *node,
                                 flatten_join_alias_vars_context *context)
 {
     if (node == NULL)
         return NULL;
     if (IsA(node, Var))//Var类型
     {
         Var        *var = (Var *) node;
         RangeTblEntry *rte;
         Node       *newvar;
 
         /* No change unless Var belongs to a JOIN of the target level */
         if (var->varlevelsup != context->sublevels_up)
             return node;        /* no need to copy, really */
         rte = rt_fetch(var->varno, context->root->parse->rtable);
         if (rte->rtekind != RTE_JOIN)
             return node;
         //在rte->rtekind == RTE_JOIN时才需要处理
         if (var->varattno == InvalidAttrNumber)
         {
             /* Must expand whole-row reference */
             RowExpr    *rowexpr;
             List       *fields = NIL;
             List       *colnames = NIL;
             AttrNumber  attnum;
             ListCell   *lv;
             ListCell   *ln;
 
             attnum = 0;
             Assert(list_length(rte->joinaliasvars) == list_length(rte->eref->colnames));
             forboth(lv, rte->joinaliasvars, ln, rte->eref->colnames)
             {
                 newvar = (Node *) lfirst(lv);
                 attnum++;
                 /* Ignore dropped columns */
                 if (newvar == NULL)
                     continue;
                 newvar = copyObject(newvar);
 
                 /*
                  * If we are expanding an alias carried down from an upper
                  * query, must adjust its varlevelsup fields.
                  */
                 if (context->sublevels_up != 0)
                     IncrementVarSublevelsUp(newvar, context->sublevels_up, 0);
                 /* Preserve original Var's location, if possible */
                 if (IsA(newvar, Var))
                     ((Var *) newvar)->location = var->location;
                 /* Recurse in case join input is itself a join */
                 /* (also takes care of setting inserted_sublink if needed) */
                 newvar = flatten_join_alias_vars_mutator(newvar, context);
                 fields = lappend(fields, newvar);
                 /* We need the names of non-dropped columns, too */
                 colnames = lappend(colnames, copyObject((Node *) lfirst(ln)));
             }
             rowexpr = makeNode(RowExpr);
             rowexpr->args = fields;
             rowexpr->row_typeid = var->vartype;
             rowexpr->row_format = COERCE_IMPLICIT_CAST;
             rowexpr->colnames = colnames;
             rowexpr->location = var->location;
 
             return (Node *) rowexpr;
         }
 
         /* Expand join alias reference */
         //扩展join alias Var
         Assert(var->varattno > 0);
         newvar = (Node *) list_nth(rte->joinaliasvars, var->varattno - 1);
         Assert(newvar != NULL);
         newvar = copyObject(newvar);
 
         /*
          * If we are expanding an alias carried down from an upper query, must
          * adjust its varlevelsup fields.
          */
         if (context->sublevels_up != 0)
             IncrementVarSublevelsUp(newvar, context->sublevels_up, 0);
 
         /* Preserve original Var's location, if possible */
         if (IsA(newvar, Var))
             ((Var *) newvar)->location = var->location;
 
         /* Recurse in case join input is itself a join */
         newvar = flatten_join_alias_vars_mutator(newvar, context);
 
         /* Detect if we are adding a sublink to query */
         if (context->possible_sublink && !context->inserted_sublink)
             context->inserted_sublink = checkExprHasSubLink(newvar);
 
         return newvar;
     }
     if (IsA(node, PlaceHolderVar))//占位符
     {
         /* Copy the PlaceHolderVar node with correct mutation of subnodes */
         PlaceHolderVar *phv;
 
         phv = (PlaceHolderVar *) expression_tree_mutator(node,
                                                          flatten_join_alias_vars_mutator,
                                                          (void *) context);
         /* now fix PlaceHolderVar's relid sets */
         if (phv->phlevelsup == context->sublevels_up)
         {
             phv->phrels = alias_relid_set(context->root,
                                           phv->phrels);
         }
         return (Node *) phv;
     }
 
     if (IsA(node, Query))//查询树
     {
         /* Recurse into RTE subquery or not-yet-planned sublink subquery */
         Query      *newnode;
         bool        save_inserted_sublink;
 
         context->sublevels_up++;
         save_inserted_sublink = context->inserted_sublink;
         context->inserted_sublink = ((Query *) node)->hasSubLinks;
         newnode = query_tree_mutator((Query *) node,
                                      flatten_join_alias_vars_mutator,
                                      (void *) context,
                                      QTW_IGNORE_JOINALIASES);
         newnode->hasSubLinks |= context->inserted_sublink;
         context->inserted_sublink = save_inserted_sublink;
         context->sublevels_up--;
         return (Node *) newnode;
     }
     /* Already-planned tree not supported */
     Assert(!IsA(node, SubPlan));
     /* Shouldn't need to handle these planner auxiliary nodes here */
     Assert(!IsA(node, SpecialJoinInfo));
     Assert(!IsA(node, PlaceHolderInfo));
     Assert(!IsA(node, MinMaxAggInfo));
     //其他表达式
     return expression_tree_mutator(node, flatten_join_alias_vars_mutator,
                                    (void *) context);
 }
 

query_tree_mutator

 /*
  * query_tree_mutator --- initiate modification of a Query's expressions
  *
  * This routine exists just to reduce the number of places that need to know
  * where all the expression subtrees of a Query are.  Note it can be used
  * for starting a walk at top level of a Query regardless of whether the
  * mutator intends to descend into subqueries.  It is also useful for
  * descending into subqueries within a mutator.
  *
  * Some callers want to suppress mutating of certain items in the Query,
  * typically because they need to process them specially, or don't actually
  * want to recurse into subqueries.  This is supported by the flags argument,
  * which is the bitwise OR of flag values to suppress mutating of
  * indicated items.  (More flag bits may be added as needed.)
  *
  * Normally the Query node itself is copied, but some callers want it to be
  * modified in-place; they must pass QTW_DONT_COPY_QUERY in flags.  All
  * modified substructure is safely copied in any case.
  */
 Query *
 query_tree_mutator(Query *query,
                    Node *(*mutator) (),
                    void *context,
                    int flags)//遍历查询树
 {
     Assert(query != NULL && IsA(query, Query));
 
     if (!(flags & QTW_DONT_COPY_QUERY))
     {
         Query      *newquery;
 
         FLATCOPY(newquery, query, Query);
         query = newquery;
     }
 
     MUTATE(query->targetList, query->targetList, List *);//投影列
     MUTATE(query->withCheckOptions, query->withCheckOptions, List *);
     MUTATE(query->onConflict, query->onConflict, OnConflictExpr *);
     MUTATE(query->returningList, query->returningList, List *);
     MUTATE(query->jointree, query->jointree, FromExpr *);
     MUTATE(query->setOperations, query->setOperations, Node *);
     MUTATE(query->havingQual, query->havingQual, Node *);
     MUTATE(query->limitOffset, query->limitOffset, Node *);
     MUTATE(query->limitCount, query->limitCount, Node *);
     if (!(flags & QTW_IGNORE_CTE_SUBQUERIES))
         MUTATE(query->cteList, query->cteList, List *);
     else                        /* else copy CTE list as-is */
         query->cteList = copyObject(query->cteList);
     query->rtable = range_table_mutator(query->rtable,
                                         mutator, context, flags);//RTE
     return query;
 }

range_table_mutator

 /*
  * range_table_mutator is just the part of query_tree_mutator that processes
  * a query's rangetable.  This is split out since it can be useful on
  * its own.
  */
 List *
 range_table_mutator(List *rtable,
                     Node *(*mutator) (),
                     void *context,
                     int flags)
 {
     List       *newrt = NIL;
     ListCell   *rt;
 
     foreach(rt, rtable)//遍历RTE
     {
         RangeTblEntry *rte = (RangeTblEntry *) lfirst(rt);
         RangeTblEntry *newrte;
 
         FLATCOPY(newrte, rte, RangeTblEntry);
         switch (rte->rtekind)
         {
             case RTE_RELATION:
                 MUTATE(newrte->tablesample, rte->tablesample,
                        TableSampleClause *);
                 /* we don't bother to copy eref, aliases, etc; OK? */
                 break;
             case RTE_CTE:
             case RTE_NAMEDTUPLESTORE:
                 /* nothing to do */
                 break;
             case RTE_SUBQUERY:
                 if (!(flags & QTW_IGNORE_RT_SUBQUERIES))
                 {
                     CHECKFLATCOPY(newrte->subquery, rte->subquery, Query);
                     MUTATE(newrte->subquery, newrte->subquery, Query *);//遍历处理子查询
                 }
                 else
                 {
                     /* else, copy RT subqueries as-is */
                     newrte->subquery = copyObject(rte->subquery);
                 }
                 break;
             case RTE_JOIN://连接,遍历处理joinaliasvars
                 if (!(flags & QTW_IGNORE_JOINALIASES))
                     MUTATE(newrte->joinaliasvars, rte->joinaliasvars, List *);
                 else
                 {
                     /* else, copy join aliases as-is */
                     newrte->joinaliasvars = copyObject(rte->joinaliasvars);
                 }
                 break;
             case RTE_FUNCTION:
                 MUTATE(newrte->functions, rte->functions, List *);
                 break;
             case RTE_TABLEFUNC:
                 MUTATE(newrte->tablefunc, rte->tablefunc, TableFunc *);
                 break;
             case RTE_VALUES:
                 MUTATE(newrte->values_lists, rte->values_lists, List *);
                 break;
         }
         MUTATE(newrte->securityQuals, rte->securityQuals, List *);
         newrt = lappend(newrt, newrte);
     }
     return newrt;
 }

三、跟踪分析

在PG11中,没有进入"Expand join alias reference"的实现逻辑,猜测在上拉子查询的时候已作优化.

四、小结

1、优化过程:介绍了通过遍历的方式实现joinaliasvars链表变量中的处理;
2、遍历处理:通过统一的遍历方式,改变XX_mutator函数对Node进行处理。

最后编辑于
©著作权归作者所有,转载或内容合作请联系作者
  • 序言:七十年代末,一起剥皮案震惊了整个滨河市,随后出现的几起案子,更是在滨河造成了极大的恐慌,老刑警刘岩,带你破解...
    沈念sama阅读 205,033评论 6 478
  • 序言:滨河连续发生了三起死亡事件,死亡现场离奇诡异,居然都是意外死亡,警方通过查阅死者的电脑和手机,发现死者居然都...
    沈念sama阅读 87,725评论 2 381
  • 文/潘晓璐 我一进店门,熙熙楼的掌柜王于贵愁眉苦脸地迎上来,“玉大人,你说我怎么就摊上这事。” “怎么了?”我有些...
    开封第一讲书人阅读 151,473评论 0 338
  • 文/不坏的土叔 我叫张陵,是天一观的道长。 经常有香客问我,道长,这世上最难降的妖魔是什么? 我笑而不...
    开封第一讲书人阅读 54,846评论 1 277
  • 正文 为了忘掉前任,我火速办了婚礼,结果婚礼上,老公的妹妹穿的比我还像新娘。我一直安慰自己,他们只是感情好,可当我...
    茶点故事阅读 63,848评论 5 368
  • 文/花漫 我一把揭开白布。 她就那样静静地躺着,像睡着了一般。 火红的嫁衣衬着肌肤如雪。 梳的纹丝不乱的头发上,一...
    开封第一讲书人阅读 48,691评论 1 282
  • 那天,我揣着相机与录音,去河边找鬼。 笑死,一个胖子当着我的面吹牛,可吹牛的内容都是我干的。 我是一名探鬼主播,决...
    沈念sama阅读 38,053评论 3 399
  • 文/苍兰香墨 我猛地睁开眼,长吁一口气:“原来是场噩梦啊……” “哼!你这毒妇竟也来了?” 一声冷哼从身侧响起,我...
    开封第一讲书人阅读 36,700评论 0 258
  • 序言:老挝万荣一对情侣失踪,失踪者是张志新(化名)和其女友刘颖,没想到半个月后,有当地人在树林里发现了一具尸体,经...
    沈念sama阅读 42,856评论 1 300
  • 正文 独居荒郊野岭守林人离奇死亡,尸身上长有42处带血的脓包…… 初始之章·张勋 以下内容为张勋视角 年9月15日...
    茶点故事阅读 35,676评论 2 323
  • 正文 我和宋清朗相恋三年,在试婚纱的时候发现自己被绿了。 大学时的朋友给我发了我未婚夫和他白月光在一起吃饭的照片。...
    茶点故事阅读 37,787评论 1 333
  • 序言:一个原本活蹦乱跳的男人离奇死亡,死状恐怖,灵堂内的尸体忽然破棺而出,到底是诈尸还是另有隐情,我是刑警宁泽,带...
    沈念sama阅读 33,430评论 4 321
  • 正文 年R本政府宣布,位于F岛的核电站,受9级特大地震影响,放射性物质发生泄漏。R本人自食恶果不足惜,却给世界环境...
    茶点故事阅读 39,034评论 3 307
  • 文/蒙蒙 一、第九天 我趴在偏房一处隐蔽的房顶上张望。 院中可真热闹,春花似锦、人声如沸。这庄子的主人今日做“春日...
    开封第一讲书人阅读 29,990评论 0 19
  • 文/苍兰香墨 我抬头看了看天上的太阳。三九已至,却和暖如春,着一层夹袄步出监牢的瞬间,已是汗流浃背。 一阵脚步声响...
    开封第一讲书人阅读 31,218评论 1 260
  • 我被黑心中介骗来泰国打工, 没想到刚下飞机就差点儿被人妖公主榨干…… 1. 我叫王不留,地道东北人。 一个月前我还...
    沈念sama阅读 45,174评论 2 352
  • 正文 我出身青楼,却偏偏与公主长得像,于是被迫代替她去往敌国和亲。 传闻我的和亲对象是个残疾皇子,可洞房花烛夜当晚...
    茶点故事阅读 42,526评论 2 343

推荐阅读更多精彩内容