Difference between revisions of "XQuery Extensions"
Latest revision as of 18:39, 1 December 2023
This article is part of the XQuery Portal. It lists extensions and optimizations that are specific to the BaseX XQuery processor.
Expressions
Removed with Version 11: Elvis operator ?:, in favor of the new otherwise expression.
Ternary If
The ternary if operator provides a short syntax for conditions. It is also called the conditional operator or ternary operator. In most languages, the syntax is a ? b : c. As ? and : are already taken in XQuery, the syntax of Perl 6 (now Raku) is used:
(: if/then/else :)
if ($ok) then 1 else 0,
(: ternary if :)
$ok ?? 1 !! 0
Both expressions return 1 if the effective boolean value of $ok is true, and 0 otherwise.
If Without Else
In XQuery 3.1, both branches of the if expression need to be specified. In many cases, only one branch is required, so the else branch was made optional in BaseX. If the else branch is omitted and the effective boolean value of the test expression is false, an empty sequence is returned. Some examples:
if (doc-available($doc)) then doc($doc),
if (file:exists($file)) then file:delete($file),
if (permissions:valid($user)) then <html>Welcome!</html>
If conditions are nested, a trailing else branch will be associated with the innermost if:
if ($a) then if($b) then '$a and $b is true' else 'only $a is true'
In general, if you have multiple or nested if expressions, additional parentheses can improve the readability of your code:
if ($a) then (
  if($b) then '$a and $b is true' else 'only $a is true'
)
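The omitted else branch behaves like an explicit else (). The following is a minimal sketch, not taken from the original article; the variable $n is hypothetical and chosen so that the test is false. It compares the BaseX short form with the standard XQuery 3.1 form:

```xquery
(: $n is a hypothetical value for which the test is false :)
let $n := 0
return (
  (: BaseX extension: no else branch, yields the empty sequence :)
  count(if ($n > 0) then 'positive'),
  (: standard XQuery 3.1 equivalent with an explicit empty branch :)
  count(if ($n > 0) then 'positive' else ())
)
```

Both calls to count should return 0, as each if expression yields an empty sequence.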
Functions
Regular Expressions
In analogy with Saxon, you can specify the flag j to revert to Java’s default regex parser. For example, this allows you to use the word boundary option \b, which has not been included in the XQuery grammar for regular expressions:
Example:
(: yields "!Hi! !there!" :)
replace('Hi there', '\b', '!', 'j')
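The flag is not limited to replace. The sketch below assumes that fn:matches accepts the j flag in the same way and that it can be combined with standard flags such as i; the input strings are made up for illustration:

```xquery
(: \b requires Java's regex parser, which is selected with the j flag :)
matches('Hi there', '\bthere\b', 'j'),
(: assumption: j can be combined with the standard case-insensitive flag i :)
matches('Hi there', '\bTHERE\b', 'ji')
```

Without the j flag, the XQuery regex grammar is expected to reject \b as an invalid escape.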
Serialization
- basex is used as the default serialization method: nodes are serialized as XML, atomic values are serialized as strings, and items of binary type are output in their native byte representation. Function items (including maps and arrays) are output just like with the adaptive method.
- With csv, you can output XML nodes as CSV data (see the CSV Module for more details).
- With json, items are output as JSON as described in the official specification. If the root node is of type element(json), items are serialized as described for the direct format in the JSON Module.
For more information and some additional BaseX-specific parameters, see the article on Serialization.
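As a rough illustration of how a serialization method is selected, the following sketch sets the json method with a standard option declaration in the query prolog. The map contents are invented for this example, and the sketch relies on the output prefix being statically bound in BaseX:

```xquery
(: select the json serialization method for the query result :)
declare option output:method 'json';
(: hypothetical data; any map or array could be returned here :)
map { 'name': 'BaseX', 'tags': array { 'xml', 'xquery' } }
```

The same parameter can also be supplied from outside the query, for example via the SERIALIZER option described in the Serialization article.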
Option Declarations
Database Options
Local database options can be set in the prolog of an XQuery main module. In the option declaration, options need to be bound to the Database Module namespace. All values will be reset after the evaluation of a query:
declare option db:catalog 'etc/w3-catalog.xml';
doc('doc.xml')
XQuery Locks
If locks are declared in the query prolog of a module via the basex:lock option, access to functions of this module will be controlled by the central transaction management. See Transaction Management for further details.
Pragmas
BaseX Pragmas
Updated with Version 11: Renamed from non-deterministic to nondeterministic.
Many optimizations in BaseX will only be performed if an expression is deterministic (i.e., if it always yields the same output and does not have side effects). By flagging an expression as nondeterministic, optimizations and query rewritings can be suppressed:
sum( (# basex:nondeterministic #) {
  1 to 100000000
})
This pragma can be helpful when debugging your code.
In analogy with option declarations and function annotations, XQuery locks can also be set via pragmas. See Transaction Management for details and examples.
(# basex:write-lock CONFIGLOCK #) {
  file:write('config.xml', <config/>)
}
Database Pragmas
Local database options can also be assigned via pragmas:
- Index access rewritings can be enforced. This is helpful if the name of a database is not static (see Enforce Rewritings for more details):
(# db:enforceindex #) {
  for $db in ('persons1', 'persons2', 'persons3')
  return db:get($db)//name[text() = 'John']
}
- Node copying in node constructors can be disabled (see COPYNODE for more details). The following query will consume much less memory than without the pragma, as the database nodes will not be fully duplicated, but only attached to the xml parent element:
file:write(
  'wrapped-db-nodes.xml',
  (# db:copynode false #) {
    <xml>{ db:get('huge') }</xml>
  }
)
- An XML catalog can be specified for URI rewritings. See the Catalog Resolver section for an example.
Annotations
Function Inlining
%basex:inline([limit]) controls if functions will be inlined.
If XQuery functions are inlined, the function call will be replaced by a FLWOR expression, in which the function variables are bound to let clauses, and in which the function body is returned. This optimization triggers further query rewritings that will speed up your query. An example:
Query:
declare function local:square($a) { $a * $a };
for $i in 1 to 3
return local:square($i)
Query after function inlining:
for $i in 1 to 3
return
  let $a := $i
  return $a * $a
Query after further optimizations:
for $i in 1 to 3
return $i * $i
By default, XQuery functions will be inlined if the query body is not too large and does not exceed a fixed number of expressions, which can be adjusted via the INLINELIMIT option.
The annotation can be used to override this global limit: function inlining is enforced if no argument is specified, and it is disabled if 0 is specified.
Example:
(: disable function inlining; the full stack trace will be shown... :)
declare %basex:inline(0) function local:e() { error() };
local:e()
Result:
Stopped at query.xq, 1/53:
[FOER0000] Halted on error().
Stack Trace:
- query.xq, 2/9
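Conversely, inlining can be requested explicitly. The following sketch reuses the local:square function from the first example and adds the annotation without an argument, which, according to the description above, enforces inlining regardless of the global limit:

```xquery
(: no argument: inlining is enforced for this function :)
declare %basex:inline function local:square($a) { $a * $a };
for $i in 1 to 3
return local:square($i)
```

Whether the rewriting has taken place can be checked in the optimized query plan, which is not reproduced here.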
Lazy Evaluation
%basex:lazy enforces lazy evaluation of a global variable.
Example:
declare %basex:lazy variable $january := doc('does-not-exist.xml');
if(month-from-date(current-date()) = 1) then $january else ()
The annotation ensures that an error is only raised if the condition yields true. Without the annotation, the error is always raised if the referenced document is not found.
XQuery Locks
In analogy with option declarations and pragmas, locks can also be set via annotations. See Transaction Management for details and examples.
Namespaces
In XQuery, some namespaces are statically bound to prefixes. The following query requires no additional namespace declarations in the query prolog:
<xml:abc xmlns:prefix='uri' local:fn='x'/>,
fn:exists(1)
In BaseX, various other namespaces are predefined. Apart from the namespaces that are listed on the Module Library page, the following namespaces are statically bound:
| Description | Prefix | Namespace URI |
|---|---|---|
| BaseX Annotations, Pragmas, … | basex | http://basex.org |
| RESTXQ: Input Options | input | http://basex.org/modules/input |
| EXPath Packages | pkg | http://expath.org/ns/pkg |
| XQuery Errors | err | http://www.w3.org/2005/xqt-errors |
| Serialization | output | http://www.w3.org/2010/xslt-xquery-serialization |
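As a small sketch of what these static bindings allow, the following query uses the err prefix in a catch clause without declaring it in the prolog. The failing expression and the message text are chosen for illustration only:

```xquery
(: err is statically bound in BaseX, so no namespace declaration is needed :)
try {
  1 div 0
} catch err:FOAR0001 {
  'Division by zero: ' || $err:description
}
```

In a processor that does not predefine the prefix, a declaration of the http://www.w3.org/2005/xqt-errors namespace would have to be added to the prolog.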
Suffixes
In BaseX, files with the suffixes .xq, .xqm, .xqy, .xql, .xqu and .xquery are treated as XQuery files. In XQuery, there are main and library modules:
- Main modules have an expression as query body. Here is a minimal example:
'Hello World!'
- Library modules start with a module namespace declaration and have no query body:
module namespace hello = 'http://basex.org/examples/hello';

declare function hello:world() {
  'Hello World!'
};
We recommend .xq as suffix for main modules, and .xqm for library modules. However, the actual module type will be detected dynamically when a file is opened and parsed.
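To show how the two module types fit together, here is a sketch of a main module that imports the library module from above. It assumes that the library module was saved as hello.xqm in the same directory; both file names are hypothetical:

```xquery
(: main module, e.g. main.xq; the location hint 'hello.xqm' is an assumption :)
import module namespace hello = 'http://basex.org/examples/hello' at 'hello.xqm';
hello:world()
```

The namespace URI in the import must match the one declared in the library module; the prefix can be chosen freely.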
Miscellaneous
Various other extensions are described in the articles on XQuery Full Text and XQuery Update.
Changelog
Version 11
- Removed: Elvis operator ?:, in favor of the new otherwise expression.
- Updated: Pragma renamed from non-deterministic to nondeterministic.
Version 9.1
- Added: New Expressions: Ternary if, Elvis operator, if without else.
- Added: XQuery Locks via pragmas and function annotations.
- Added: Regular Expressions, j flag for using Java’s default regex parser.