XQuery Extensions

From BaseX Documentation

Revision as of 18:03, 16 October 2017

This article is part of the XQuery Portal. It lists extensions and optimizations that are specific to the BaseX XQuery processor.

Suffixes

In BaseX, files with the suffixes .xq, .xqm, .xqy, .xql, .xqu and .xquery are treated as XQuery files. In XQuery, there are main and library modules:

  • Main modules have an expression as query body. Here is a minimal example:
'Hello World!'
  • Library modules start with a module namespace declaration and have no query body:
module namespace hello = 'http://basex.org/examples/hello';

declare function hello:world() {
  'Hello World!'
};

We recommend .xq as the suffix for main modules, and .xqm for library modules. However, the actual module type will be detected dynamically when a file is opened and parsed.
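
As a minimal sketch of how the two module types fit together, the following main module imports the library module shown above and calls its function. The file name hello.xqm and its location next to the main module are assumptions made for illustration:

```xquery
(: main module, e.g. stored as main.xq (hypothetical file name) :)
(: assumes the library module above was saved as hello.xqm in the same directory :)
import module namespace hello = 'http://basex.org/examples/hello' at 'hello.xqm';

(: the query body: calls the function declared in the library module :)
hello:world()
```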

Option Declarations

Local database options can be set in the prolog of an XQuery main module. In the option declaration, options need to be bound to the Database Module namespace. All values will be reset after the evaluation of a query:

declare option db:chop 'false';
doc('doc.xml')

Pragmas

A local database option can also be set for a single expression via a pragma. Examples:

  • Whitespace chopping is disabled for a particular document (see CHOP):
(# db:chop false #) { doc('doc.xml') }
  • Version 9.0: Enforce index rewriting if database name is not static (see Enforce Rewritings for more examples):
(# db:enforceindex #) {
  for $db in ('persons1', 'persons2', 'persons3')
  return db:open($db)//name[text() = 'John']
}

Many optimizations and query rewritings can be disabled by marking an expression as non-deterministic:

count( (# basex:non-deterministic #) { 1 to 10 })

Annotations

basex:inline

%basex:inline([limit]) controls whether functions will be inlined.

If XQuery functions are inlined, the function call will be replaced by a FLWOR expression, in which the function variables are bound to let clauses, and in which the function body is returned. This optimization triggers further query rewritings that will speed up your query. An example:

Query:

declare function local:square($a) { $a * $a };
for $i in 1 to 3
return local:square($i)

Query after function inlining:

for $i in 1 to 3
return
  let $a := $i
  return $a * $a

Query after further optimizations:

for $i in 1 to 3
return $i * $i

By default, XQuery functions will be inlined if the function body is not too large, i.e., if it does not exceed a fixed number of expressions. This number can be adjusted via the INLINELIMIT option.

The annotation can be used to override this global limit: Function inlining will be enforced if no argument is specified. Inlining will be disabled if 0 is specified.

Example:

(: disable function inlining; the full stack trace will be shown... :)
declare %basex:inline(0) function local:e() { error() };
local:e()

Result:

Stopped at query.xq, 1/53:
[FOER0000] Halted on error().

Stack Trace:
- query.xq, 2/9
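
The opposite case, enforcing inlining by omitting the argument, might look as follows. This is a sketch that reuses the local:square function from the earlier example; whether the rewritten query matches the optimized form shown there depends on the compiler and has not been verified here:

```xquery
(: enforce function inlining, regardless of the global INLINELIMIT value :)
declare %basex:inline function local:square($a) { $a * $a };
for $i in 1 to 3
return local:square($i)
```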

basex:lazy

%basex:lazy enforces the lazy evaluation of a global variable.

Example:

declare %basex:lazy variable $january := doc('does-not-exist');
if(month-from-date(current-date()) = 1) then $january else ()

The annotation ensures that an error will only be thrown if the condition yields true. Without the annotation, the error will always be raised, because the referenced document is not found.

Serialization

  • basex is used as the default serialization method: nodes are serialized as XML, atomic values are serialized as strings, and items of binary type are output in their native byte representation. Function items (including maps and arrays) are output just like with the adaptive method.
  • csv allows you to output XML nodes as CSV data (see the CSV Module for more details).

For more information and some additional BaseX-specific parameters, see the article on Serialization.
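
A serialization method can be chosen in the query prolog via an option declaration in the standard serialization namespace. The following sketch selects the csv method; the element names in the query body are made up for illustration, and the exact output depends on the CSV parameters in effect:

```xquery
(: standard serialization namespace; declared explicitly to keep the query self-contained :)
declare namespace output = 'http://www.w3.org/2010/xslt-xquery-serialization';
(: serialize the result as CSV instead of using the default basex method :)
declare option output:method 'csv';

(: illustrative input: one record with two fields :)
<csv>
  <record>
    <name>John</name>
    <city>Berlin</city>
  </record>
</csv>
```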

Non-determinism

In XQuery, deterministic functions are “guaranteed to produce ·identical· results from repeated calls within a single ·execution scope· if the explicit and implicit arguments are identical”. In BaseX, many extension functions are non-deterministic or side-effecting. If an expression is internally flagged as non-deterministic, various optimizations that might change its execution order will not be applied.

(: QUERY A... :)
let $n := 456
for $i in 1 to 2
return $n

(: ...will be optimized to :)
for $i in 1 to 2
return 456

(: QUERY B will not be rewritten :)
let $n := random:integer()
for $i in 1 to 2
return $n

In some cases, functions may contain non-deterministic code, but the query compiler may not be able to detect this statically. See the following example:

for $read in (file:read-text#1, file:read-binary#1)
let $ignored := non-deterministic $read('input.file')
return ()

Two non-deterministic functions will be bound to $read, and the result of the function call will be bound to $ignored. As the variable is not referenced in the subsequent code, the let clause would usually be discarded by the compiler. In the given query, however, execution will be enforced because of the BaseX-specific non-deterministic keyword.
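
The same query can presumably also be written with the basex:non-deterministic pragma from the Pragmas section. This is a sketch under the assumption that the pragma has the same effect on the wrapped function call as the keyword:

```xquery
for $read in (file:read-text#1, file:read-binary#1)
(: the pragma marks the wrapped call as non-deterministic, so the let clause should be kept :)
let $ignored := (# basex:non-deterministic #) { $read('input.file') }
return ()
```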

Miscellaneous

Various other extensions are described in the articles on XQuery Full Text and XQuery Update.