XQuery 4.0

This article provides a summary of the most important features of the upcoming XQuery 4.0 specification that are already available in BaseX.

Please note that the current specification is in draft status, so everything is subject to change until it is finalized. We have chosen to present only those features that appear sufficiently stable, but you may come across various other features in our implementation that we have not selected for documentation yet.

Operators and Expressions

Otherwise

The otherwise operator has two operands. The value of the first operand will be returned only if non-empty; otherwise, the value of the second operand will be returned. The expression…
$value otherwise $fallback
…is equivalent to:
if (exists($value)) then $value else $fallback
The operator can also be chained:
$node/(@full-name otherwise @name otherwise @id)
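The following minimal sketch illustrates that the operator tests for emptiness, not for truthiness. The helper name local:pick is hypothetical and only introduced for this illustration:

```xquery
(: hypothetical helper: returns $value unless it is the empty sequence :)
declare function local:pick($value as item()*, $fallback as item()*) as item()* {
  $value otherwise $fallback
};

local:pick((), 'fallback'),      (: empty first operand: 'fallback' :)
local:pick('name', 'fallback'),  (: non-empty first operand: 'name' :)
local:pick(0, 1),                (: 0 is non-empty, so 0 is returned :)
local:pick(false(), true())      (: false() is non-empty, so false() is returned :)
```

Values such as 0, false() or the empty string are non-empty sequences, so they are never replaced by the fallback.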

Pipeline Operator

The pipeline operator -> evaluates an expression and binds the result to the context value before evaluating another expression. It can be used to easily chain multiple expressions that would otherwise be expressed with verbose FLWOR expressions. Examples:

Tokenizes a string, counts the tokens, creates a concatenated string and returns count=3:

'a b c' -> tokenize(.) -> count(.) -> concat('count=', .)

Calculates the sum of powers of 2 and returns 2046:

(1 to 10) ! math:pow(2, .) -> sum(.)

Reduces a long sequence to at most 9 elements, with dots appended, and returns a single string:

$dictionary/word
-> (if(count(.) < 10) then . else (.[1 to 9], '…'))
-> string-join(., '; ')
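
As a small sketch of how the pipeline operator differs from the simple map operator !, the following query applies the same right-hand expression with both operators:

```xquery
(: -> binds the whole sequence to the context value: one evaluation :)
(1 to 3) -> count(.),   (: 3 :)

(: ! evaluates the right-hand side once for each item :)
(1 to 3) ! count(.)     (: 1, 1, 1 :)
```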

Mapping Arrow

The mapping arrow operator =!> complements the XQuery 3.1 arrow operator =>. It applies a function to each item in a sequence. In the following example, an input string is tokenized, each token is replaced by an integer indicating its string length, and the resulting integers are joined into a single string:
'You never know'
=> tokenize()
=!> string-length()
=> string-join(', ')
The result is:
3, 5, 4
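The difference between the two arrow operators can be sketched as follows. The variable name $words is chosen for illustration only:

```xquery
let $words := ('You', 'never', 'know')
return (
  (: => passes the whole sequence as first argument :)
  $words => count(),        (: 3 :)
  (: =!> calls the function once for each item :)
  $words =!> count(),       (: 1, 1, 1 :)
  $words =!> upper-case()   (: 'YOU', 'NEVER', 'KNOW' :)
)
```

Note that $words => upper-case() would raise a type error, as upper-case accepts a single string and not a sequence of strings.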

Maps and Arrays

Map Order

Unlike in the past, map entries will retain the order in which they are inserted into a map. The order is preserved when creating maps with the Map Constructor, with map:build, map:merge, or map:of-pairs. If map:put is used, new entries will be appended to the existing entries. The order becomes visible when a map is serialized, or when iterating over its entries:
map:for-each(
  { 3: 'three', 2: 'two', 1: 'one' },
  fn($k, $v) { $k || ': ' || $v }
)
…returns:
3: three
2: two
1: one
Note that, for reasons of backward compatibility, fn:deep-equal ignores the order. The following query returns true:
deep-equal(
  { 3: 'three', 2: 'two', 1: 'one' },
  { 1: 'one', 2: 'two', 3: 'three' }
)
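The behavior of map:put described above (new entries are appended to the existing ones) can be sketched as follows. The variable names are chosen for illustration only:

```xquery
let $map := { 'b': 2, 'a': 1 }
(: 'c' is a new key, so the entry is appended after 'b' and 'a' :)
let $extended := map:put($map, 'c', 3)
return map:for-each($extended, fn($k, $v) { $k || '=' || $v })
(: 'b=2', 'a=1', 'c=3' :)
```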

Map Constructors

The keyword map in map constructors has become optional. The following expressions return the same map:
{ 1: 'one', 2: 'two' }
map { 1: 'one', 2: 'two' }
Note that the new syntax is similar to the JSON syntax. However, it is not identical, and it is more powerful because you can include arbitrary expressions that are evaluated when the map is generated. For example, you can use variables and arbitrary complex expressions to create the keys and values:
let $xml := document {
  <text>
    <line>Hi there</line>
    <line>Bye there</line>
  </text>
}
for $line at $pos in $xml//line
return { $pos: data($line) }

Map and Array Filter

Maps and arrays can be filtered with the syntax EXPR?[PREDICATE]:

Map Filter

If EXPR yields a map, PREDICATE is evaluated for each pair of the map, represented as { 'key': KEY, 'value': VALUE }. The result is a map containing those entries for which the predicate returns true.

The following example returns { 1: 'one' }:

{ 1: 'one', 2: 'two', 3: 'three' }?[?key = 1]

The following example creates a map with 12 entries and returns { 'jul': 7, 'aug': 8 }, i.e., a map with all entries whose key contains u and whose value is greater than 6:

let $map := map:build(
  tokenize('jan feb mar apr may jun jul aug sep oct nov dec'),
  fn($value, $pos) { $value },
  fn($value, $pos) { $pos }
)
return $map?[contains(?key, 'u') and ?value > 6]

Array Filter

If EXPR yields an array, PREDICATE is evaluated for each member of the array. The result is an array containing those members for which the predicate returns true.

The following example returns [ 1, 2, 3, 4, 5 ]:

array { 1 to 10 }?[. < 6]

The following example returns [ "all", "but", "the" ]:

[ "all", "but", "the", "last" ]?[position() < last()]

Syntax

String Templates

Single backticks can be used to create literal strings with ampersands and angle brackets:
`<Jill & Jack>`
Curly braces can be used to include expressions in the string:
let $name := 'Shiva'
let $age := 43
return `Person: { $name }, { $age }`
The syntax represents a compact alternative for classical string concatenations…
'Person: ' || $name || ', ' || $age
…and the string constructor:
``[Person: `{ $name }`, `{ $age }`]``

Numeric Literals

Underscores can be included in numeric literals to improve readability:
(: returns the decimal value 1001000.1001 :)
1_000_000 + 1_000 + 0.100_1
If numbers are prefixed with 0x or 0b, the input will be interpreted as a hexadecimal or a binary number:
(: returns (15, 15, 15) :)
(15, 0xF, 0b1111)

QName Literals

The notation of QNames is cumbersome as it requires a function call:
<xml/>[node-name() = QName('URI', 'name')]
element { parse-QName('Q{URI}name') } {}
error(xs:QName('err:UNEXPECTED'))
The QName literal syntax allows you to specify a QName by prefixing an EQName with the character #. Examples:
<xml/>[node-name() = #xml]
element #Q{URI}name {}
error(#err:UNEXPECTED)
Note that error(err:UNEXPECTED) would not work: Without the #, the QName would be interpreted as an XPath step.

Node constructors

With the traditional syntax for node constructors, you can specify the name of an element, attribute, processing instruction or namespace directly after the keyword that denotes the node type:
element div { },
attribute eq { }
Due to the 4.0 change that made the map keyword optional, certain constructs in the language have become ambiguous. For example, the following expression…
let $f := element return {}
…could be interpreted either as the clause let $f := element followed by return {}, or as let $f := followed by the element constructor element return {}. The following keywords may cause ambiguity when used as names in node constructors, and are expected to be disallowed in version 4.0:
and case div else eq except ge gt idiv intersect is le lt
mod ne or otherwise return satisfies to union where while
BaseX still accepts the keywords – in the example above, element return {} is parsed as an element constructor – but it is advisable from now on to use either the new QName Literals syntax…
element #div { },
attribute #eq { }
…or resort to the alternative 3.1 curly-bracket syntax:
element { 'div' } { },
attribute { 'eq' } { }

Braced if

In many programming languages, the branches of conditional expressions can be enclosed in curly braces. This syntax is now supported in XQuery as well. It can be used to omit the else branch:
if ($a) { $b }
if ($a) { if ($b) { $c } }
…is equivalent to:
if ($a) then $b else ()
if ($a) then (if ($b) then $c else ()) else ()
With BaseX, the else branch can be specified or omitted with both constructs (see if without else).

Lookup Operator

If the right-hand operand of a lookup expression is a variable or a string, the parentheses can be omitted:
$address?$name
$city?'city code'
…is equivalent to:
$address?($name)
$city?('city code')

Switch

To simplify formatting, the cases of switch expressions can now be enclosed in curly braces. In addition, multiple values can be supplied in a case branch:
for $city in ('tokyo', 'lima')
return switch($city) {
  case ('tokyo', 'kyoto')
    return 'Japan'
  default
    return 'World'
}
If the comparand is omitted, it defaults to true(), and the construct can be used to specify multiple boolean conditions:
for $name in ('ANNA', 'Hanna', 'hannah')
return switch() {
  case matches($name, '^[a-z]+$')
    return 'lower-case'
  case matches($name, '^[A-Z]+$')
    return 'upper-case'
  default
    return 'mixed'
}

Typeswitch

Similarly, curly braces can also be used for typeswitch:
typeswitch(<xml/>) {
  case node()
    return 'node'
  default
    return 'no node'
}

Annotations

The boolean functions true() and false() are now valid annotation values.

With the new fn:function-annotations function (which replaces inspect:function-annotations), you can retrieve the annotations of a function:

declare %local:deprecated(true()) function local:inc($n) { $n + 1 };

let $anns := function-annotations(local:inc#1)
where map:get($anns, xs:QName('local:deprecated'))
return 'Deprecated!'

Node Tests

Multiple element/attribute names can be specified in the element() and attribute() tests of axis steps:
(: returns all child elements named 'a' or 'b' :)
let $xml := <xml><a/><b/></xml>
return $xml/element(a|b)

Multiple node tests can be supplied with an axis:
(: returns all descendant elements and comments :)
let $xml := <xml><a/><!-- abc --></xml>
return $xml/descendant::(element()|comment())

The name of the root element can be embedded in the document-node test:

let $xml := document { <world><country>...</country></world> }
return $xml instance of document-node(world)

The syntax is a shortcut for the existing document-node(element(world)) test.

FLWOR Expressions

For member

Iterating over an array has been simplified. The member keyword can be used to bind the members of an array to a variable:
(: returns (0, 1, 5) :)
for member $m in [ (), 1, (2, 3) ]
return sum($m)
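As a sketch of why the member keyword matters, the following query contrasts it with an iteration over $array?*, which flattens the members into a single sequence before iterating. The variable names are chosen for illustration only:

```xquery
let $array := [ (), 1, (2, 3) ]
return (
  (: one iteration per member: sum(()), sum(1), sum((2, 3)) :)
  (for member $m in $array return sum($m)),   (: 0, 1, 5 :)
  (: ?* flattens the members first: one iteration per item :)
  (for $i in $array?* return sum($i))         (: 1, 2, 3 :)
)
```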

For key and value

Similarly, with the key and value keywords, the components of map entries can be bound to variables. The following example returns 1: one, 2: two and 3: three:

for key $k value $v in { 1: 'one', 2: 'two', 3: 'three' }
return $k || ': ' || $v
It is possible to omit either key or value.
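A minimal sketch of both shortened forms, reusing the map from the example above:

```xquery
let $map := { 1: 'one', 2: 'two', 3: 'three' }
return (
  (: only the keys are bound :)
  (for key $k in $map return $k * 10),           (: 10, 20, 30 :)
  (: only the values are bound :)
  (for value $v in $map return upper-case($v))   (: 'ONE', 'TWO', 'THREE' :)
)
```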

Let Clause

When variables are bound with the let clause, and when a type is specified, the input is now coerced (i.e., converted) to the target type:
let $name as xs:string := <name>Midori</name>
return $name
With XQuery 3.1, this expression would have been rejected, as the node could not be treated as string.

While Clause

Added: New clause.

With the while clause, an iteration is interrupted as soon as a condition fails:
for $i in 1 to 100000000000
while ($i * $i) < 10
return $i
This construct has advantages over where if it is known that no more results are to be expected after the first negative test.
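The difference between the two clauses can be sketched with a short input sequence in which a failing item is followed by further matching items:

```xquery
let $values := (1, 2, 5, 1, 2)
return (
  (: while stops at the first item that fails the test :)
  (for $v in $values while $v < 3 return $v),   (: 1, 2 :)
  (: where skips failing items, but keeps iterating :)
  (for $v in $values where $v < 3 return $v)    (: 1, 2, 1, 2 :)
)
```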

Window Clause

The window clause has become more flexible. The start clause can now be omitted, and so can the when condition; both default to true(). The following expression…
for tumbling window $w in (2, 4, 6, 8, 10, 12, 14)
  (: start when true() :)
  end $last when $last mod 3 = 0
return <window>{ $w }</window>
…returns:
<window>2 4 6</window>
<window>8 10 12</window>
<window>14</window>
For the following query…
for sliding window $w in 1 to 3
  start at $s (: when true() :)
  end at $e when $e - $s eq 2
return array { $w }
…you get:
[ 1, 2, 3 ]
[ 2, 3 ]
[ 3 ]

Types

Alternatives

With the new “Choice Item Type”, multiple types can be specified as a single type. An item matches this type if it matches one of the alternative target types. The following function accepts a string or an integer:
declare function local:number(
  $value  as (xs:integer|xs:string)
) as xs:integer {
  if ($value instance of xs:integer) then (
    $value
  ) else (
    parse-integer($value)
  )
};
local:number('123')
If none of the types match, the argument will be converted to the target types in the given order. Once a conversion succeeds, the converted value becomes the effective value. The order in which the types are specified therefore matters. For example, if the function above is invoked with the element <value>123</value>, the effective value will be an integer. If the two target types are swapped, the value will be a string.

Enumerations

The type system has been extended to restrict a string argument to an enumeration of strings. The following function can only be called with one of the specified continent names:
declare function local:country(
  $name       as xs:string,
  $continent  as enum('Africa', 'America', 'Asia', 'Australia', 'Europe')
) as element(country) {
  <country name='{ $name }' continent='{ $continent }'/>
};
local:country('Ghana', 'Africa')
Note that a string is not an instance of an enumeration. In the given example, the supplied string Africa is not an instance of enum('Africa', 'America', 'Asia', 'Australia', 'Europe'). Instead, it is converted to an instance of the enumeration type when the above function is called.

Record Types

Added: Record types added.

With record types, the structure of a map can be further defined. A record type contains a list of keys with optional types, and an optional wildcard, which allows arbitrary other map entries. For example, the following test returns true:
{
  'latitude' : -17.8292,
  'longitude': 31.0522,
  'info'     : 'Harare'
} instance of record(
  latitude   as xs:decimal,
  longitude  as xs:decimal,
  *
)
With Record Declarations, record types can be named.

Type Declarations

Added: Type declarations added.

With the arrival of Alternatives and Enumerations, the definitions of types can become pretty verbose. Type declarations can be used to counter that: Names can be assigned to such types, and the names can be used anywhere in the code, for example in function declarations…

module namespace data = 'http://basex.org/modules/data';

declare type data:person as element(person);
declare type data:index as map(xs:integer, xs:string);

declare function data:prepare($person as data:person) as data:index {
  map:merge(
    for $data at $pos in $person/*
    return map:entry($pos, $data)
  )
};

…or in FLWOR expressions:

declare type continent as enum('Africa', 'America', 'Asia', 'Australia', 'Europe');
declare type binary as (xs:base64Binary | xs:hexBinary);

for $name as continent in ('Africa', 'Asia')
let $data as binary := file:read-binary($name || '.data')
return bin:length($data)
As seen in the first example, types must be prefixed in library modules. Prefixes are optional in main modules.

Record Declarations

Added: Record declaration added.

Records are maps with a defined structure. With a record declaration,

  • a record type is defined,
  • a name is assigned to that record, and
  • a constructor function is provided for creating maps that adhere to that record definition.

The following example declares a data:vector record type with two or more entries. Multiple instances of this record are created, and a function is called that computes the dimension (the number of values) of the resulting records:

declare namespace data = 'data';

declare record data:vector(
  a  as xs:double,
  b  as xs:double := 1,
  c?,
  *
);

declare function data:dimension($vector as data:vector) as xs:integer {
  count($vector?*)
};

for $vector in (
  data:vector(1),
  data:vector(1, 2),
  data:vector(1, 2, 3),
  data:vector(a := 1, c := 3),
  data:vector(1, 2, 3, { 'd': 4, 'e': 5 })
)
return $vector => data:dimension()
The query returns 2, 2, 3, 3, 5. The data:vector record declaration contains the following definitions:
  • a is a mandatory double value. The constructor function call data:vector() would raise an error, as at least one value must be supplied for this entry.
  • b comes with a default value. This value is added to the resulting map if there is no matching value in the constructor.
  • c is optional and can have any type. It is only added to the map if it is supplied in the constructor.
  • Due to the wildcard *, the record can contain arbitrary other entries. With the constructor function, remaining map entries can be supplied by a map.

Records can also be created without the constructor function. The standard map constructor is sufficient, provided that the map created matches the targeted record definition:

let $vector as data:vector := { 'a': 1, 'b': 2, 'c': 3, 'd': 4 }
return data:dimension($vector),

(: the map is converted to a record when the function is called :)
data:dimension({ 'a': 1, 'b': 2 })

Functions

Built-in Functions

Numerous new functions have been added to the specification. Please visit Standard Functions, Array Functions, Map Functions and Math Functions for more details.

A separate article exists for fn:invisible-xml, which allows you to parse Invisible XML grammars.

Positional Access

An optional position parameter has been added to most higher-order functions. It can be used, for example, to enumerate results:

for-each(
  ('one', 'two', 'three'),
  fn($item, $pos) { $pos || '. ' || $item }
)
…returns:
1. one
2. two
3. three

Calling Higher-Order Functions

When a higher-order function expects another function as an argument, it is now possible to supply a function with fewer arguments than specified.

For example, the signature of fn:for-each is:

fn:for-each(
  $input   as item()*,
  $action  as fn($item as item(), $pos as xs:integer) as item()*
) as item()*

The type of the $action parameter indicates that a passed function will be called internally with two arguments: The first argument is the item currently being processed, and the second argument is the current position of this item. Even though the type defines two parameters, it is possible to pass a function with 1 or 0 parameters. The following function calls are all valid:

(: function argument with 2 parameters :)
for-each(1 to 5, fn($item, $pos) { $pos || '. ' || format-integer($item, 'w') }),

(: function argument with 1 parameter :)
for-each(1 to 5, fn($item) { $item * $item }),

(: function argument with 0 parameters :)
for-each(1 to 5, fn() { 42 })

Default Values

Function parameters can be enriched with default values, thus making these parameters optional. If the following function…
(:~
 : Creates a string representation of a value with a specific serialization method.
 : @param  $value   value to be serialized
 : @param  $method  serialization method (default: html)
 :)
declare function local:serialize(
  $value,
  $method := 'html'
) {
  serialize($value, { 'method': $method })
};
…is called with one argument, the default value is assigned to $method. If a second argument is supplied, the default value is ignored:
local:serialize(<html/>)
local:serialize(<html/>, 'xhtml')
If a default value is attached to a function parameter, all subsequent parameters must have default values, too.

Keyword Arguments

Code can become more readable if the parameters are addressed by their name. In the following example…
declare function local:coordinates(
  $x  as xs:integer := 0,
  $y  as xs:integer := 0,
  $z  as xs:integer := 0
) {
  { "x" : $x, "y" : $y, "z": $z }
};
local:coordinates(y := 56)
…a map with three keys will be returned:
{ "x" : 0, "y" : 56, "z": 0 }
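As a further sketch, positional and keyword arguments can be combined in a single call, provided the positional arguments come first. The function declaration is repeated to keep the query self-contained:

```xquery
declare function local:coordinates(
  $x  as xs:integer := 0,
  $y  as xs:integer := 0,
  $z  as xs:integer := 0
) {
  { "x": $x, "y": $y, "z": $z }
};

(: positional argument for $x, keyword argument for $z, default for $y :)
local:coordinates(1, z := 3),
(: keyword arguments can be supplied in any order :)
local:coordinates(z := 3, x := 1)
```

Both calls return the map { "x": 1, "y": 0, "z": 3 }.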

Inline Functions

Function items were introduced with XQuery 3.0. As the existing syntax of inline functions is pretty verbose…
let $inc := function($n) { $n + 1 }
return $inc(99)
…a short fn keyword has been added to the language. It is a plain alias:
let $inc := fn($n) { $n + 1 }
return $inc(99)

Focus Functions

Many functions have a single parameter. The parenthesized parameter declaration can now be omitted, and the generated function item will take one argument that is bound to the context value and can be addressed with the dot syntax . in the function body. The example from the previous paragraph can be further simplified to:
let $inc := fn { . + 1 }
return $inc(99)
This syntax is particularly concise when passing function items as arguments in function calls:
filter(1 to 10, fn { . > 5 }),
array:build(1 to 10, fn { format-integer(., 'w') })

Methods

If functions in maps are decorated with the %method annotation, it becomes possible to access the map entries in the function body. When the function is retrieved with the lookup operator, the map is bound to the context value. The following example calls the function sum, which adds the values of the map entries x and y. It returns 11:
let $vector := {
  'x': 5,
  'y': 6,
  'sum': %method fn() { ?x + ?y }
} 
return $vector?sum()
Please note that this code calls the function sum that is defined inside the map (it does not call the built-in function fn:sum). The second example creates a map with a numeric value, and a function to create a new version of this map with the value incremented by 1:
let $number := {
  'value': 1,
  'inc': %method fn() { map:put(., 'value', ?value + 1) }
} 
return $number?inc()

Miscellaneous

Context Value

The context item has been generalized to the context value. Sequences can either be bound to the context with the context value declaration…
(: the 'countries' database can contain zero, one, or more documents :)
declare context value := db:get('countries');
.//name
…or with external bindings.

As previously, the context value can be addressed with ., the context item reference. In BaseX, this was already possible in previous versions, and the generalization is now standardized.
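
A minimal sketch with a context value that is a sequence of atomic items, assuming that no external context binding overrides the declaration:

```xquery
(: the context value is bound to a sequence of three integers :)
declare context value := (1, 2, 3);

sum(.),    (: 6 :)
count(.)   (: 3 :)
```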

Try/Catch

Finally block

The try/catch expression has a new finally clause. Its expression will be evaluated after the expressions of the try clause and a possibly evaluated catch clause. If it raises an error, this error is returned instead of a result or an error that resulted from a try or catch expression. If it raises no error, it must yield an empty sequence; otherwise, a dynamic error is raised.

In the following example, fn:message will always be called, no matter if the division succeeds:

for $i in 0 to 2
return try {
  1 div $i
} catch err:FOAR0001 {
  'division error'
} finally {
  message('1 was divided by ' || $i)
}

Stack Trace

With $err:stack-trace, the stack trace can be retrieved within a try/catch clause. In BaseX, the returned trace is a string that refers to the position in the code that failed and the function call that led to the error. Every string consists of the module path and a line and column number.

Note that the executed code is an optimized version of the source code. As a consequence, the trace does not necessarily contain all function calls of the original code. For example, the following query returns a single string, test.xq, 1/30, because $f will be inlined by the compiler:

(: test.xq :)
let $f := fn() { error() }
return try { $f() } catch * { $err:stack-trace }
You can disable function inlining by setting INLINELIMIT to 0, either globally or locally:

let $f := %basex:inline(0) fn() { error() }
return try { $f() } catch * { $err:stack-trace }
…returns…
test.xq, 1/47
test.xq, 2/16
In previous versions of BaseX, the stack trace could be retrieved with $err:additional.

Coercion Rules

The coercion rules define how a value is converted to a target type. These rules have been enhanced to reduce the necessity of explicit type conversions: Strings/URIs, binary data and numbers are implicitly cast to the target type, or relabeled, if the input does not exceed the range of the target type. The following type assignments would all have raised an error with XQuery 3.1:
let $base64  as xs:base64Binary := xs:hexBinary('41')
let $uri     as xs:anyURI       := 'string'  (: string :)
let $integer as xs:integer      := 1.0  (: decimal :)
let $byte    as xs:byte         := 127  (: integer :)
return '✓'
The same rules apply when calling functions:
declare function local:f($value as xs:byte) { (: ... :) };
local:f(127)   (: succeeds :)
local:f(12345) (: fails :)

Predeclared Namespaces

All implementations must now predeclare the namespace prefixes math, map, array, err, and output, which means that nothing changes for users of BaseX.
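As a short sketch, the following query uses three of these prefixes without any namespace declaration in the prolog:

```xquery
(: no "declare namespace" is needed for math, map and array :)
math:sqrt(16),                  (: 4 :)
map:size({ 'a': 1, 'b': 2 }),   (: 2 :)
array:size([ 1, 2, 3 ])         (: 3 :)
```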

Changelog

Version 12.0
  • Added: Map Order: The insertion order is preserved when creating maps.
  • Added: Pipeline Operator: Example: (1 to 5) -> count(.).
  • Added: Type declarations. Example: declare type person as element(person).
  • Added: Record types. Example: { 'a': 1 } instance of record(a).
  • Added: Record declarations. Example: declare record coord(x, y).
  • Added: While clause. Example: for $i in 1 to 5 while ($i * $i) < 10 return $i.
  • Added: Map and array filter. Example: [ 1, 2, 3 ]?[. != 2].
  • Added: Methods: functions belonging to maps.
  • Added: Retrieval of stack trace in errors. Example: $err:stack-trace.
  • Added: Finally block: The try/catch expression has a new finally clause.
  • Added: QName Literals: Example: error(#err:UNEXPECTED).
  • Updated: Multiple node tests can be supplied with an axis. Example: child::(element()|text()).
  • Updated: Document node tests with name of root node. Example: $node instance of document-node(root).
  • Updated: QName literals in node constructors: element #div {}.
Version 11.0
  • Added: First release with support for various new XQuery 4.0 features.
