XQuery Optimizations

From BaseX Documentation
Revision as of 12:23, 13 July 2020 by CG (talk | contribs)
Jump to navigation Jump to search

This article is part of the XQuery Portal. Optimizations are presented that speed up the execution time and reduce memory consumption. The article will be incrementally enhanced with more examples.

Introduction

An XQuery expression is evaluated in multiple steps:

  1. At parse time, the query string – an XQuery main module – is transformed to a tree representation, the abstract syntax tree (AST).
  2. At compile time, the syntax tree is decorated with additional information (type information, expression properties); expressions are relocated, simplified, or pre-evaluated.
  3. At evaluation time, the resulting expression tree is processed.
  4. The results are returned to the user. Some expression (such as simple loops) can be evaluated in iterative manner, whereas others (such as sort operations) need to be fully evaluated before the first result is available.

Each of the steps allows for numerous optimizations, some of which are described in this article.

If you run a query on command-line, you can use -V to output detailed query information. In the GUI, you can enable the Info View panel.

Compile-Time Optimizations

Pre-Evaluation

Parts of the query that are static and would be executed multiple times can already be evaluated at compile time:

<syntaxhighlight lang="xquery"> for $i in 1 to 10 return 2 * 3

(: will be rewritten to :) for $i in 1 to 10 return 6 </syntaxhighlight>

Variable Inlining

The value of a variable can be inlined: The variables references are replaced by the expression that is bound to the variable. The resulting expression can often be simplified, and further optimizations can be triggered:

<syntaxhighlight lang="xquery"> declare variable $INFO := true();

let $nodes := //nodes where $INFO return 'Results: ' || count($nodes)

(: will be rewritten to :) let $nodes := //nodes where true() return 'Results: ' || count($nodes)

(: will be rewritten to :) let $nodes := //nodes return 'Results: ' || count($nodes)

(: will be rewritten to :) 'Results: ' || count(//nodes) </syntaxhighlight>

As the example shows, variable declarations might be located in the query prolog and in FLWOR expressions. They may also occur (and be inlined) in try/catch, switch or typeswitch expressions.

Function Inlining

Functions can be inlined as well. The parameters are rewitten to let clauses and the function is body is bound to the return clause.

<syntaxhighlight lang="xquery"> declare function local:inc($i) { $i + 1 }; for $n in 1 to 5 return local:inc($n)

(: will be rewritten to :) for $n in 1 to 5 return (

 let $_ := $n
 return $_ + 1

)

(: will be rewritten to :) for $n in 1 to 5 return $n + 1 </syntaxhighlight>

Subsequent rewritings might result in query plans that differ a lot from the original query. As this might complicate debugging, you can disable function inling during development by setting INLINELIMIT to 0.

Static Typing

If the type of a value is known at compile time, type checks can be removed. In the example below, the static information that $i will always reference items of type xs:integer can be utilized to simplify the expression:

<syntaxhighlight lang="xquery"> for $i in 1 to 5 return typeswitch($i)

 case xs:numeric return 'number'
 default return 'string'

(: will be rewritten to :) for $i in 1 to 5 return 'number' </syntaxhighlight>

FLWOR Rewritings

FLWOR expressions are central to XQuery and the most complex constructs the language offers. Numerous optimizations have been realized to improve the execution time:

  • Nested FLWOR expressions are flattened.
  • for clauses with single items are rewritten to let clauses.
  • let clauses that are iterated multiple times are lifted up.
  • Expressions of let clauses are inlined.
  • Unused variables are removed.
  • where clauses are rewritten to predicates.
  • if expressions in the return clause are rewritten to where clauses.

Since Version 9.4, the last for clause is merged into the return clause and rewritten to a Simple_Map_Operator|simple map expression.

Various of these rewriting are demonstrated in the following example:

<syntaxhighlight lang="xquery"> for $a in 1 to 5 for $b in 2 where $a > 3 let $c := $a + $b return $c

(: for is rewritten to let :) for $a in 1 to 5 let $b := 2 where $a > 3 let $c := $a + $b return $c

(: let is lifted up :) let $b := 2 for $a in 1 to 5 where $a > 3 let $c := $a + $b return $c

(: the where expression is rewritten to a predicate :) let $b := 2 for $a in 1 to 5[. > 3] let $c := $a + $b return $c

(: $b is inlined :) for $a in 1 to 5[. > 3] let $c := $a + 2 return $c

(: $c is inlined :) for $a in 1 to 5[. > 3] return $a + 2

(: the for and return clauses are merged and rewritten to a simple map :) (1 to 5)[. > 3] ! (. + 2) </syntaxhighlight>

Changelog

Introduced with Version 9.4.