https://docs.basex.org/api.php?action=feedcontributions&user=James+Ball&feedformat=atomBaseX Documentation - User contributions [en]2024-03-29T09:34:33ZUser contributionsMediaWiki 1.34.0https://docs.basex.org/index.php?title=MacOS&diff=16300MacOS2023-02-25T16:14:03Z<p>James Ball: Removed unnecessary brackets</p>
<hr />
<div>Tested using Amazon Corretto JDK on an M1 MacBook Air<br />
<br />
Java version when tested:<br />
<br />
<pre><br />
% java --version<br />
openjdk 17.0.2 2022-01-18 LTS<br />
OpenJDK Runtime Environment Corretto-17.0.2.8.1 (build 17.0.2+8-LTS)<br />
OpenJDK 64-Bit Server VM Corretto-17.0.2.8.1 (build 17.0.2+8-LTS, mixed mode, sharing)<br />
</pre><br />
<br />
1) Create a working folder<br />
<br />
2) Download the Zip archive of the BaseX version that you would like to use<br />
<br />
3) Unzip the archive so you have the '''basex''' folder<br />
<br />
4) Remove the .basex configuration files from the folder - to ensure that each user gets a personal file generated automatically<br />
<br />
<pre>% rm ./basex/.basex*</pre><br />
<br />
5) Create a new file <pre>basex-config.cfg</pre> in your working folder and add the following content. Remember to change the app version (here 9.7) to your selected version:<br />
<br />
<pre><br />
[Application]<br />
app.classpath=$APPDIR/BaseX.jar<br />
app.mainclass=org.basex.BaseXGUI<br />
app.classpath=$APPDIR/lib/*<br />
app.classpath=$APPDIR/lib/custom/*<br />
<br />
[JavaOptions]<br />
java-options=-Djpackage.app-version=9.7<br />
</pre><br />
<br />
6) If you would like to include custom libraries in the build, add them to the folder basex/lib/custom.<br />
<br />
7) Add the basex.icns file to your working folder. [[basex.icns]]<br />
<br />
8) Run the command to make a disk image (.dmg) of BaseX - remember to set the app version to match your selected version, here 9.7<br />
<br />
<pre><br />
jpackage --input target/ \<br />
--input basex \<br />
--name "BaseX" \<br />
--main-jar BaseX.jar \<br />
--main-class org.basex.BaseXGUI \<br />
--type dmg \<br />
--icon "basex.icns" \<br />
--app-version "9.7" \<br />
--vendor "BaseX GmbH" \<br />
--copyright "Copyright 2023 BaseX GmbH" \<br />
--mac-package-name "BaseX" \<br />
--module-path "lib" \<br />
--verbose<br />
</pre><br />
<br />
9) A disk image will be created in your working folder. Open it and you can copy the application to your Applications folder or any chosen location<br />
<br />
10) ''Optional'' If you want to add custom libraries later then show the package contents of the application. Replace the content of the file BaseX {version}.cfg with the original from step 5. Now you can add and remove libraries and they will be loaded/unloaded with application restart.</div>James Ballhttps://docs.basex.org/index.php?title=Developing&diff=16299Developing2023-02-25T16:13:24Z<p>James Ball: </p>
<hr />
<div>This page is one of the [[Main Page|Main Sections]] of the documentation.<br />
It provides useful information for developers. Here you can find information<br />
on various alternatives to integrate BaseX into your own project.<br />
<br />
<table><tr><td width='50%' valign='top'><br />
;Integrate & Contribute<br />
* [[Developing with Eclipse|Eclipse]]: Compile and run BaseX from within Eclipse<br />
* [[Git]]: Learn how to work with Git<br />
* [[Maven]]: Embed BaseX into your own projects<br />
* [[Releases]]: Official releases, snapshots, old versions<br />
* [[Translations]]: Contribute a new translation to BaseX!<br />
<br />
;Web Technology<br />
* [[RESTXQ]]: Write web services with XQuery<br />
* [[REST]]: Access and update databases via HTTP requests<br />
* [[WebDAV]]: Access databases from your filesystem<br />
</td><td width='50%' valign='top'><br />
<br />
;APIs<br />
* [[Clients]]: Communicate with BaseX using C#, PHP, Python, Perl, C, ...<br />
* [[Java Examples]]: Code examples for developing with BaseX<br />
* [http://xqj.net/basex XQJ API]: Closed source, implemented by Charles Foster (restricted to XQuery 3.0)<br />
* [https://github.com/fancellu/xqs XQuery for Scala API], based on XQJ and written by Dino Fancellu<br />
<br />
;Extensions<br />
* [[YAJSW|Service/daemon]]: Install BaseX server as a service<br />
* [[Android]]: Running BaseX with Android<br />
* [[macOS]]: How to build a standalone macOS application<br />
</td></tr><br />
</table><br />
<br />
;Code, Questions, Bugs<br />
* The [https://github.com/basexdb/basex Source Code] is available on GitHub.<br />
* For questions, bug reports and feature requests, please write to our [https://basex.org/open-source/ mailing list].<br />
* The [https://github.com/basexdb/basex/issues Issue Tracker] contains confirmed bugs and feature requests.<br />
__NOTOC__</div>James Ballhttps://docs.basex.org/index.php?title=MacOS&diff=16298MacOS2023-02-25T16:03:50Z<p>James Ball: New page to explain the steps required to make a standalone</p>
<hr />
<div>Tested using Amazon Corretto JDK on an M1 MacBook Air<br />
<br />
My Java version:<br />
<br />
<pre><br />
% java --version<br />
openjdk 17.0.2 2022-01-18 LTS<br />
OpenJDK Runtime Environment Corretto-17.0.2.8.1 (build 17.0.2+8-LTS)<br />
OpenJDK 64-Bit Server VM Corretto-17.0.2.8.1 (build 17.0.2+8-LTS, mixed mode, sharing)<br />
</pre><br />
<br />
1) Create a working folder<br />
<br />
2) Download the Zip archive of the BaseX version that you would like to use<br />
<br />
3) Unzip the archive so you have the '''basex''' folder<br />
<br />
4) Remove the .basex configuration files from the folder - to ensure that each user gets a personal file generated automatically<br />
<br />
<pre>% rm ./basex/.basex*</pre><br />
<br />
5) Create a new file <pre>basex-config.cfg</pre> in your working folder and add the following content. Remember to change the app version (here 9.7) to your selected version:<br />
<br />
<pre><br />
[Application]<br />
app.classpath=$APPDIR/BaseX.jar<br />
app.mainclass=org.basex.BaseXGUI<br />
app.classpath=$APPDIR/lib/*<br />
app.classpath=$APPDIR/lib/custom/*<br />
<br />
[JavaOptions]<br />
java-options=-Djpackage.app-version=9.7<br />
</pre><br />
<br />
6) If you would like to include custom libraries in the build, add them to the folder basex/lib/custom.<br />
<br />
7) Add the basex.icns file to your working folder. [[basex.icns]]<br />
<br />
8) Run the command to make a disk image (.dmg) of BaseX - remember to set the app version to match your selected version, here 9.7<br />
<br />
<pre><br />
jpackage --input target/ \<br />
--input basex \<br />
--name "BaseX" \<br />
--main-jar BaseX.jar \<br />
--main-class org.basex.BaseXGUI \<br />
--type dmg \<br />
--icon "basex.icns" \<br />
--app-version "9.7" \<br />
--vendor "BaseX GmbH" \<br />
--copyright "Copyright 2023 BaseX GmbH" \<br />
--mac-package-name "BaseX" \<br />
--module-path "lib" \<br />
--verbose<br />
</pre><br />
<br />
9) A disk image will be created in your working folder. Open it and you can copy the application to your Applications folder or any chosen location<br />
<br />
10) ''Optional'' If you want to add custom libraries later then show the package contents of the application. Replace the content of the file BaseX {version}.cfg with the original from step 5. Now you can add and remove libraries and they will be loaded/unloaded with application restart.</div>James Ballhttps://docs.basex.org/index.php?title=Developing&diff=16297Developing2023-02-25T15:35:54Z<p>James Ball: Adding section for building a macOS bundle application of BaseX</p>
<hr />
<div>This page is one of the [[Main Page|Main Sections]] of the documentation.<br />
It provides useful information for developers. Here you can find information<br />
on various alternatives to integrate BaseX into your own project.<br />
<br />
<table><tr><td width='50%' valign='top'><br />
;Integrate & Contribute<br />
* [[Developing with Eclipse|Eclipse]]: Compile and run BaseX from within Eclipse<br />
* [[Git]]: Learn how to work with Git<br />
* [[Maven]]: Embed BaseX into your own projects<br />
* [[Releases]]: Official releases, snapshots, old versions<br />
* [[Translations]]: Contribute a new translation to BaseX!<br />
<br />
;Web Technology<br />
* [[RESTXQ]]: Write web services with XQuery<br />
* [[REST]]: Access and update databases via HTTP requests<br />
* [[WebDAV]]: Access databases from your filesystem<br />
</td><td width='50%' valign='top'><br />
<br />
;APIs<br />
* [[Clients]]: Communicate with BaseX using C#, PHP, Python, Perl, C, ...<br />
* [[Java Examples]]: Code examples for developing with BaseX<br />
* [http://xqj.net/basex XQJ API]: Closed source, implemented by Charles Foster (restricted to XQuery 3.0)<br />
* [https://github.com/fancellu/xqs XQuery for Scala API], based on XQJ and written by Dino Fancellu<br />
<br />
;Extensions<br />
* [[YAJSW|Service/daemon]]: Install BaseX server as a service<br />
* [[Android]]: Running BaseX with Android<br />
* [[macOS]]: How to build a macOS application bundle<br />
</td></tr><br />
</table><br />
<br />
;Code, Questions, Bugs<br />
* The [https://github.com/basexdb/basex Source Code] is available on GitHub.<br />
* For questions, bug reports and feature requests, please write to our [https://basex.org/open-source/ mailing list].<br />
* The [https://github.com/basexdb/basex/issues Issue Tracker] contains confirmed bugs and feature requests.<br />
__NOTOC__</div>James Ballhttps://docs.basex.org/index.php?title=XQuery_3.1&diff=15532XQuery 3.12021-10-26T11:00:22Z<p>James Ball: Removed rogue i in Enclosed Expressions example</p>
<hr />
<div>This article is part of the [[XQuery|XQuery Portal]]. It provides a summary of the most important features of the [https://www.w3.org/TR/xquery-31/ XQuery 3.1] Recommendation.<br />
<br />
=Maps=<br />
<br />
A ''map'' is a function that associates a set of keys with values, resulting in a collection of key/value pairs. Each key/value pair in a map is called an entry. A key is an arbitrary atomic value, and the associated value is an arbitrary sequence. Within a map, no two entries have the same key, when compared using the {{Code|eq}} operator. It is not necessary that all the keys should be mutually comparable (for example, they can include a mixture of integers and strings).<br />
<br />
Maps can be constructed as follows:<br />
<br />
<syntaxhighlight lang="xquery"><br />
map { }, (: empty map :)<br />
map { 'key': true(), 1984: (<a/>, <b/>) }, (: map with two entries :)<br />
map:merge( (: map with ten entries :)<br />
for $i in 1 to 10<br />
return map { $i: 'value' || $i }<br />
)<br />
</syntaxhighlight><br />
<br />
The function corresponding to the map has the signature {{Code|function($key as xs:anyAtomicType) as item()*}}. The expression {{Code|$map($key)}} returns the associated value; the function call {{Code|map:get($map, $key)}} is equivalent. For example, if {{Code|$books-by-isbn}} is a map whose keys are ISBNs and whose associated values are {{Code|book}} elements, then the expression {{Code|$books-by-isbn("0470192747")}} returns the {{Code|book}} element with the given ISBN. The fact that a map is a function item allows it to be passed as an argument to [[Higher-Order Functions]] that expect a function item as one of their arguments. As an example, the following query uses the higher-order function {{Code|fn:for-each($seq, $f)}} to extract all bound values from a map:<br />
<br />
<syntaxhighlight lang="xquery"><br />
let $map := map { 'foo': 42, 'bar': 'baz', 123: 456 }<br />
return fn:for-each(map:keys($map), $map)<br />
</syntaxhighlight><br />
<br />
This returns some permutation of {{Code|(42, 'baz', 456)}}.<br />
<br />
Because a map is a function item, functions that apply to functions also apply to maps. A map is an anonymous function, so {{Code|fn:function-name}} returns the empty sequence; {{Code|fn:function-arity}} always returns {{Code|1}}.<br />
<br />
Like all other values, maps are immutable. For example, the <code>[[Map Module#map:remove|map:remove]]</code> function creates a new map by removing an entry from an existing map, but the existing map is not changed by the operation. Like sequences, maps have no identity. It is meaningful to compare the contents of two maps, but there is no way of asking whether they are "the same map": two maps with the same content are indistinguishable.<br />
<br />
Maps may be compared using the {{Code|fn:deep-equal}} function. The [[Map Module]] describes the available set of map functions.<br />
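<br />
As a minimal sketch of immutability and comparison (the keys and values are arbitrary examples), the following query derives two new maps from an existing one and shows that the original map remains unchanged:<br />
<br />
<syntaxhighlight lang="xquery"><br />
let $original := map { 'a': 1, 'b': 2 }<br />
let $updated := map:put($original, 'c', 3)<br />
let $reduced := map:remove($original, 'a')<br />
return (<br />
map:size($original) (: 2; the original map is unchanged :),<br />
map:size($updated) (: 3 :),<br />
map:keys($reduced) (: 'b' :),<br />
deep-equal($original, map { 'b': 2, 'a': 1 }) (: true; the order of entries is irrelevant :)<br />
)<br />
</syntaxhighlight><br />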
<br />
=Arrays=<br />
<br />
An ''array'' is a function that associates a set of positions, represented as positive integer keys, with values. The first position in an array is associated with the integer {{Code|1}}. The values of an array are called its members. In the type hierarchy, array has a distinct type, which is derived from function. In BaseX, arrays (as well as sequences) are based on an efficient [https://en.wikipedia.org/wiki/Finger_tree Finger Tree] implementation.<br />
<br />
Arrays can be constructed in two ways. With the square bracket notation, the comma serves as delimiter:<br />
<br />
<syntaxhighlight lang="xquery"><br />
<br />
[], (: empty array :)<br />
[ (1, 2) ], (: array with single member :)<br />
[ 1 to 2, 3 ] (: array with two members; same as: [ (1, 2), 3 ] :)<br />
</syntaxhighlight><br />
<br />
With the {{Code|array}} keyword and curly brackets, the inner expression is evaluated as usual, and the resulting values will be the members of the array:<br />
<br />
<syntaxhighlight lang="xquery"><br />
<br />
array { }, (: empty array; same as: array { () } :) <br />
array { (1, 2) }, (: array with two members; same as: array { 1, 2 } :)<br />
array { 1 to 2, 3 } (: array with three members; same as: array { 1, 2, 3 } :)<br />
</syntaxhighlight><br />
<br />
The function corresponding to the array has the signature {{Code|function($index as xs:integer) as item()*}}. The expression {{Code|$array($index)}} returns an addressed member of the array. The following query returns the five array members {{Code|48 49 50 51 52}} as result:<br />
<br />
<syntaxhighlight lang="xquery"><br />
<br />
let $array := array { 48 to 52 }<br />
for $i in 1 to array:size($array)<br />
return $array($i)<br />
</syntaxhighlight><br />
<br />
Like all other values, arrays are immutable. For example, the <code>[[Array Module#array:reverse|array:reverse]]</code> function creates a new array containing a re-ordering of the members of an existing array, but the existing array is not changed by the operation. Like sequences, arrays have no identity. It is meaningful to compare the contents of two arrays, but there is no way of asking whether they are "the same array": two arrays with the same content are indistinguishable.<br />
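<br />
A small sketch of this behaviour, using arbitrary example values: the reversed array is a new value, and the original array can still be accessed unchanged:<br />
<br />
<syntaxhighlight lang="xquery"><br />
let $array := [ 'a', 'b', 'c' ]<br />
let $reversed := array:reverse($array)<br />
return (<br />
$array(1) (: 'a'; the original array is unchanged :),<br />
$reversed(1) (: 'c' :),<br />
deep-equal($array, array:reverse($reversed)) (: true :)<br />
)<br />
</syntaxhighlight><br />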
<br />
==Atomization==<br />
<br />
If an array is ''atomized'', all of its members will be atomized. As a result, an atomized item may now result in more than one item. Some examples:<br />
<br />
<syntaxhighlight lang="xquery"><br />
<br />
fn:data([1 to 2]) (: returns the sequence 1, 2 :)<br />
[ 'a', 'b', 'c' ] = 'b' (: returns true :)<br />
<a>{ [ 1, 2 ] }</a> (: returns <a>1 2</a> :)<br />
array { 1 to 2 } + 3 (: error: the left operand returns two items :)<br />
</syntaxhighlight><br />
<br />
Atomization also applies to function arguments. The following query returns 5, because the array will be atomized to a sequence of 5 integers:<br />
<br />
<syntaxhighlight lang="xquery"><br />
<br />
let $f := function($x as xs:integer*) { count($x) }<br />
return $f([1 to 5])<br />
</syntaxhighlight><br />
<br />
However, the next query returns 1, because the array is already of the general type {{Code|item()}}, and no atomization will take place:<br />
<br />
<syntaxhighlight lang="xquery"><br />
<br />
let $f := function($x as item()*) { count($x) }<br />
return $f([1 to 5])<br />
</syntaxhighlight><br />
<br />
Arrays can be compared with the {{Code|fn:deep-equal}} function. The [[Array Module]] describes the available set of array functions.<br />
<br />
=Lookup Operator=<br />
<br />
The lookup operator provides some syntactic sugar to access values of maps or array members. It is introduced by the question mark ({{Code|?}}) and followed by a specifier. The specifier can be:<br />
<br />
# A wildcard {{Code|*}},<br />
# The name of the key,<br />
# The integer offset, or<br />
# Any other parenthesized expression.<br />
<br />
The following example demonstrates the four alternatives:<br />
<br />
<syntaxhighlight lang="xquery"><br />
<br />
let $map := map { 'R': 'red', 'G': 'green', 'B': 'blue' }<br />
return (<br />
 $map?* (: 1. returns all values; same as: map:keys($map) ! $map(.) :),<br />
 $map?R (: 2. returns the value associated with the key 'R'; same as: $map('R') :),<br />
 $map?('G','B') (: 4. returns the values associated with the keys 'G' and 'B' :)<br />
),<br />
<br />
let $array := [ 'one', 'two', 'three' ]<br />
return (<br />
 $array?* (: 1. returns all values; same as: (1 to array:size($array)) ! $array(.) :),<br />
 $array?1 (: 3. returns the first value; same as: $array(1) :),<br />
 $array?(2 to 3) (: 4. returns the second and third values; same as: (2 to 3) ! $array(.) :)<br />
)<br />
</syntaxhighlight><br />
<br />
The lookup operator can also be used without left operand. In this case, the context item will be used as input. This query returns {{Code|Akureyri}}:<br />
<br />
<syntaxhighlight lang="xquery"><br />
<br />
let $maps := (<br />
map { 'name': 'Guðrún', 'city': 'Reykjavík' },<br />
map { 'name': 'Hildur', 'city': 'Akureyri' }<br />
)<br />
return $maps[?name = 'Hildur'] ?city<br />
</syntaxhighlight><br />
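<br />
Lookups can also be chained to navigate nested maps and arrays. The following sketch uses a hypothetical data structure; the keys {{Code|languages}} and {{Code|address}} are illustrative assumptions:<br />
<br />
<syntaxhighlight lang="xquery"><br />
let $data := map {<br />
 'name': 'Hildur',<br />
 'languages': [ 'Icelandic', 'English' ],<br />
 'address': map { 'city': 'Akureyri' }<br />
}<br />
return (<br />
 $data?address?city (: 'Akureyri' :),<br />
 $data?languages?1 (: 'Icelandic' :),<br />
 $data?languages?* (: 'Icelandic', 'English' :)<br />
)<br />
</syntaxhighlight><br />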
<br />
=Arrow Operator=<br />
<br />
The arrow operator <code>=></code> provides a convenient alternative syntax for applying functions to a value. The expression that precedes the operator will be supplied as first argument of the function that follows the arrow. If <code>$v</code> is a value and <code>f()</code> is a function, then <code>$v => f()</code> is equivalent to <code>f($v)</code>, and <code>$v => f($j)</code> is equivalent to <code>f($v, $j)</code>:<br />
<br />
<syntaxhighlight lang="xquery"><br />
<br />
(: Returns 3 :)<br />
count(('A', 'B', 'C')),<br />
('A', 'B', 'C') => count(),<br />
('A', 'B', 'C') => (function( $sequence) { count( $sequence)})(),<br />
<br />
(: Returns W-E-L-C-O-M-E :)<br />
string-join(tokenize(upper-case('w e l c o m e')), '-'),<br />
'w e l c o m e' => upper-case() => tokenize() => string-join('-'),<br />
<br />
(: Returns xfmdpnf :)<br />
codepoints-to-string(<br />
for $i in string-to-codepoints('welcome')<br />
return $i + 1<br />
),<br />
(for $i in 'welcome' => string-to-codepoints()<br />
return $i + 1) => codepoints-to-string()<br />
</syntaxhighlight><br />
<br />
The syntax makes nested function calls more readable, as it is easy to see if parentheses are balanced.<br />
<br />
=String Constructor=<br />
<br />
The string constructor has been inspired by [https://en.wikipedia.org/wiki/Here_document here document] literals of the Unix shell and script languages. It allows you to generate strings that contain various characters that would otherwise be interpreted as XQuery delimiters.<br />
<br />
The string constructor syntax uses two backticks and a square bracket to open and close a string:<br />
<br />
<syntaxhighlight lang="xquery"><br />
<br />
(: Returns "This is a 'new' & 'flexible' syntax." :)<br />
``["This is a 'new' & 'flexible' syntax."]``<br />
</syntaxhighlight><br />
<br />
XQuery expressions can be embedded via backticks and a curly bracket. The evaluated results will be separated with spaces, and all strings will eventually be concatenated:<br />
<br />
<syntaxhighlight lang="xquery"><br />
<br />
(: Returns »Count 1 2 3, and I will be there.« :)<br />
let $c := 1 to 3<br />
return ``[»Count `{ $c }`, and I will be there.«]``</syntaxhighlight><br />
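<br />
String constructors are handy for text that is full of quotes and brackets, such as JSON. In the following sketch, the variable {{Code|$name}} and the generated JSON structure are illustrative assumptions:<br />
<br />
<syntaxhighlight lang="xquery"><br />
(: Returns { "name": "BaseX", "tags": [ "xml", "xquery" ] } :)<br />
let $name := 'BaseX'<br />
return ``[{ "name": "`{ $name }`", "tags": [ "xml", "xquery" ] }]``<br />
</syntaxhighlight><br />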
<br />
=Serialization=<br />
<br />
Two [[Serialization]] methods have been added to the [https://www.w3.org/TR/xslt-xquery-serialization-31 Serialization spec]:<br />
<br />
==Adaptive Serialization==<br />
<br />
The {{Code|adaptive}} serialization provides an intuitive textual representation for all XDM types, including maps and arrays, functions, attributes, and namespaces. All items will be separated by the value of the {{Code|item-separator}} parameter, which by default is a newline character. It is utilized by the functions <code>[[Profiling Module#prof:dump|prof:dump]]</code> and <code>[https://www.w3.org/TR/xpath-functions-31/#func-trace fn:trace]</code>.<br />
<br />
Example:<br />
<br />
<syntaxhighlight lang="xquery"><br />
<br />
declare option output:method 'adaptive';<br />
<element id='id0'/>/@id,<br />
xs:token("abc"),<br />
map { 'key': 'value' },<br />
true#0<br />
</syntaxhighlight><br />
<br />
Result:<br />
<br />
<syntaxhighlight lang="xml"><br />
id="id0"<br />
xs:token("abc")<br />
map {<br />
"key": "value"<br />
}<br />
fn:true#0<br />
</syntaxhighlight><br />
<br />
==JSON Serialization==<br />
<br />
The new {{Code|json}} serialization output method can be used to serialize XQuery maps, arrays, atomic values and empty sequences as JSON.<br />
<br />
The {{Code|json}} output method has been introduced in BaseX before it was added to the official specification. It complies with the standard serialization rules and, at the same time, preserves the existing semantics:<br />
<br />
* If an XML node of type {{Code|element(json)}} is found, it will be serialized following the serialization rules of the [[JSON Module]].<br />
* Any other node or atomic value, map, array, or empty sequence will be serialized according to the [https://www.w3.org/TR/xslt-xquery-serialization-31/#json-output rules in the specification].<br />
<br />
The following two queries will both return the JSON snippet <code>{ "key": "value" }</code>:<br />
<br />
<syntaxhighlight lang="xquery"><br />
<br />
declare option output:method 'json';<br />
map { "key": "value" }<br />
</syntaxhighlight><br />
<br />
<syntaxhighlight lang="xquery"><br />
<br />
declare option output:method 'json';<br />
<json type='object'><br />
<key>value</key><br />
</json><br />
</syntaxhighlight><br />
<br />
=Functions=<br />
<br />
The following functions have been added in the [https://www.w3.org/TR/xpath-functions-31/ XQuery 3.1 Functions and Operators] Specification:<br />
<br />
==Map Functions==<br />
<br />
<code>map:merge</code>, <code>map:size</code>, <code>map:keys</code>, <code>map:contains</code>, <code>map:get</code>, <code>map:entry</code>, <code>map:put</code>, <code>map:remove</code>, <code>map:for-each</code><br />
<br />
Please check out the [[Map Module]] for more details.<br />
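<br />
As a brief sketch of how some of these functions work together (the colour entries are arbitrary examples):<br />
<br />
<syntaxhighlight lang="xquery"><br />
let $map := map:merge((<br />
 map:entry('R', 'red'),<br />
 map:entry('G', 'green')<br />
))<br />
return (<br />
 map:contains($map, 'R') (: true :),<br />
 map:get($map, 'G') (: 'green' :),<br />
 map:size(map:put($map, 'B', 'blue')) (: 3 :),<br />
 map:for-each($map, function($key, $value) { $key || '=' || $value })<br />
 (: 'R=red' and 'G=green', in implementation-dependent order :)<br />
)<br />
</syntaxhighlight><br />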
<br />
==Array Functions==<br />
<br />
<code>array:size</code>, <code>array:append</code>, <code>array:subarray</code>, <code>array:remove</code>, <code>array:insert-before</code>, <code>array:head</code>, <code>array:tail</code>, <code>array:reverse</code>, <code>array:join</code>, <code>array:flatten</code>, <code>array:for-each</code>, <code>array:filter</code>, <code>array:fold-left</code>, <code>array:fold-right</code>, <code>array:for-each-pair</code><br />
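<br />
The following sketch applies a few of these functions to a small example array; each function returns a new value and leaves {{Code|$array}} unchanged:<br />
<br />
<syntaxhighlight lang="xquery"><br />
let $array := [ 1, 2, 3 ]<br />
return (<br />
 array:size(array:append($array, 4)) (: 4 :),<br />
 array:head($array) (: 1 :),<br />
 array:tail($array) (: [ 2, 3 ] :),<br />
 array:join(($array, [ 4, 5 ])) (: [ 1, 2, 3, 4, 5 ] :),<br />
 array:for-each($array, function($m) { $m * 2 }) (: [ 2, 4, 6 ] :),<br />
 array:filter($array, function($m) { $m > 1 }) (: [ 2, 3 ] :),<br />
 array:fold-left($array, 0, function($sum, $m) { $sum + $m }) (: 6 :)<br />
)<br />
</syntaxhighlight><br />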
<br />
==JSON Functions==<br />
<br />
With XQuery 3.1, native support for JSON objects was added. Strings and resources can be parsed to XQuery items and, as [[#JSON Serialization|shown above]], serialized back to their original form.<br />
<br />
===fn:parse-json===<br />
<br />
; Signatures<br />
* <code>fn:parse-json($input as xs:string) as item()?</code><br />
* <code>fn:parse-json($input as xs:string, $options as map(*)) as item()?</code><br />
<br />
Parses the supplied string as JSON text and returns its item representation. The result may be a map, an array, a string, a double, a boolean, or an empty sequence. The allowed options can be looked up in the [https://www.w3.org/TR/xpath-functions-31/#func-parse-json specification].<br />
<br />
<syntaxhighlight lang="xquery"><br />
<br />
parse-json('{ "name": "john" }') (: yields { "name": "john" } :),<br />
parse-json('[ 1, 2, 4, 8, 16]') (: yields [ 1, 2, 4, 8, 16 ] :)<br />
</syntaxhighlight><br />
<br />
===fn:json-doc===<br />
<br />
; Signatures<br />
* <code>fn:json-doc($uri as xs:string) as item()?</code><br />
* <code>fn:json-doc($uri as xs:string, $options as map(*)) as item()?</code><br />
<br />
Retrieves the text from the specified URI, parses the supplied string as JSON text and returns its item representation (see [[#fn:parse-json|fn:parse-json]] for more details).<br />
<br />
<syntaxhighlight lang="xquery"><br />
<br />
json-doc("http://ip.jsontest.com/")('ip') (: returns your IP address :)<br />
</syntaxhighlight><br />
<br />
===fn:json-to-xml===<br />
<br />
; Signatures<br />
* <code>fn:json-to-xml($string as xs:string?) as node()?</code><br />
<br />
Converts a JSON string to an XML node representation. The allowed options can be looked up in the [https://www.w3.org/TR/xpath-functions-31/#func-json-to-xml specification].<br />
<br />
<syntaxhighlight lang="xquery"><br />
<br />
json-to-xml('{ "message": "world" }')<br />
<br />
(: result:<br />
<map xmlns="http://www.w3.org/2005/xpath-functions"><br />
<string key="message">world</string><br />
</map> :)<br />
</syntaxhighlight><br />
<br />
===fn:xml-to-json===<br />
<br />
; Signatures<br />
* <code>fn:xml-to-json($node as node()?) as xs:string?</code><br />
<br />
Converts an XML node, whose format conforms to the results created by [[#fn:json-to-xml|fn:json-to-xml]], to a JSON string representation. The allowed options can be looked up in the [https://www.w3.org/TR/xpath-functions-31/#func-xml-to-json specification].<br />
<br />
<syntaxhighlight lang="xquery"><br />
<br />
(: returns "JSON" :)<br />
xml-to-json(<string xmlns="http://www.w3.org/2005/xpath-functions">JSON</string>)<br />
</syntaxhighlight><br />
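<br />
The two conversion functions can be combined for a round trip. The following sketch uses an arbitrary example string, converts it to XML, reads a value from the XML representation, and converts the XML back to JSON:<br />
<br />
<syntaxhighlight lang="xquery"><br />
let $json := '{ "message": "world", "values": [ 1, true, null ] }'<br />
let $xml := json-to-xml($json)<br />
return (<br />
 $xml//*:string[@key = 'message']/string() (: 'world' :),<br />
 parse-json(xml-to-json($xml))?message (: 'world' :)<br />
)<br />
</syntaxhighlight><br />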
<br />
==fn:sort==<br />
<br />
; Signatures<br />
* <code>fn:sort($input as item()*) as item()*</code><br />
* <code>fn:sort($input as item()*, $collation as xs:string?) as item()*</code><br />
* <code>fn:sort($input as item()*, $collation as xs:string?, $key as function(item()) as xs:anyAtomicType*) as item()*</code><br />
<br />
Returns a new sequence with sorted {{Code|$input}} items, using an optional {{Code|$collation}}. If a {{Code|$key}} function is supplied, it is applied to each item to compute its sort key. The sort keys are compared using the semantics of the {{Code|lt}} operator.<br />
<br />
<syntaxhighlight lang="xquery"><br />
<br />
sort(reverse(1 to 3)) (: yields 1, 2, 3 :),<br />
reverse(sort(1 to 3)) (: returns the sorted order in descending order :),<br />
sort((3,-2,1), (), abs#1) (: yields 1, -2, 3 :),<br />
sort((1,2,3), (), function($x) { -$x }) (: yields 3, 2, 1 :),<br />
sort((1,'a')) (: yields an error, as strings and integers cannot be compared :)<br />
</syntaxhighlight><br />
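<br />
The key function is particularly useful for sorting structured items. In the following sketch, the maps and their {{Code|age}} values are hypothetical example data:<br />
<br />
<syntaxhighlight lang="xquery"><br />
(: Returns 'Guðrún', 'Hildur' :)<br />
let $people := (<br />
 map { 'name': 'Hildur', 'age': 42 },<br />
 map { 'name': 'Guðrún', 'age': 35 }<br />
)<br />
return sort($people, (), function($p) { $p?age }) ! ?name<br />
</syntaxhighlight><br />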
<br />
==fn:contains-token==<br />
<br />
; Signatures<br />
* <code>fn:contains-token($input as xs:string*, $token as xs:string) as xs:boolean</code><br />
* <code>fn:contains-token($input as xs:string*, $token as xs:string, $collation as xs:string) as xs:boolean</code><br />
<br />
The supplied strings will be tokenized at whitespace boundaries. The function returns {{Code|true}} if one of the resulting tokens equals the supplied token, possibly under the rules of a supplied collation:<br />
<br />
<syntaxhighlight lang="xquery"><br />
<br />
contains-token(('a', 'b c', 'd'), 'c') (: yields true :),<br />
<xml class='one two'/>/contains-token(@class, 'one') (: yields true :)<br />
</syntaxhighlight><br />
<br />
==fn:parse-ietf-date==<br />
<br />
; Signature<br />
* <code>fn:parse-ietf-date($input as xs:string?) as xs:dateTime?</code><br />
<br />
Parses a string in the IETF format (which is widely used on the Internet) and returns a {{Code|xs:dateTime}} item:<br />
<br />
<syntaxhighlight lang="xquery"><br />
<br />
fn:parse-ietf-date('28-Feb-1984 07:07:07') (: yields 1984-02-28T07:07:07Z :),<br />
fn:parse-ietf-date('Wed, 01 Jun 2001 23:45:54 +02:00') (: yields 2001-06-01T23:45:54+02:00 :)<br />
</syntaxhighlight><br />
<br />
==fn:apply==<br />
<br />
; Signatures<br />
* <code>fn:apply($function as function(*), $arguments as array(*)) as item()*</code><br />
<br />
The supplied {{Code|$function}} is invoked with the specified {{Code|$arguments}}. The arity of the function must be the same as the size of the array.<br />
<br />
Example:<br />
<br />
<syntaxhighlight lang="xquery"><br />
<br />
fn:apply(concat#5, array { 1 to 5 }) (: 12345 :),<br />
fn:apply(function($a) { sum($a) }, [ 1 to 5 ]) (: 15 :),<br />
fn:apply(count#1, [ 1,2 ]) (: error: the array has two members :)<br />
</syntaxhighlight><br />
<br />
==fn:random-number-generator==<br />
<br />
; Signatures<br />
* <code>fn:random-number-generator() as map(xs:string, item())</code><br />
* <code>fn:random-number-generator($seed as xs:anyAtomicType) as map(xs:string, item())</code><br />
<br />
Creates a random number generator, using an optional seed. The returned map contains three entries:<br />
<br />
* {{Code|number}} is a random double between 0 and 1<br />
* {{Code|next}} is a function that returns another random number generator<br />
* {{Code|permute}} is a function that returns a random permutation of its argument<br />
<br />
The returned random generator is ''deterministic'': If the function is called twice with the same arguments and in the same execution scope, it will always return the same result.<br />
<br />
Example:<br />
<br />
<syntaxhighlight lang="xquery"><br />
<br />
let $rng := fn:random-number-generator()<br />
let $number := $rng('number') (: returns a random number :)<br />
let $next-rng := $rng('next')() (: returns a new generator :)<br />
let $next-number := $next-rng('number') (: returns another random number :)<br />
let $permutation := $rng('permute')(1 to 5) (: returns a random permutation of (1,2,3,4,5) :)<br />
return ($number, $next-number, $permutation)<br />
</syntaxhighlight><br />
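<br />
Since each generator yields only a single number, a sequence of random numbers is obtained by following the chain of {{Code|next}} functions. The following sketch shows one way to do this; the helper function {{Code|local:randoms}} and the seed {{Code|42}} are illustrative assumptions and not part of the specification:<br />
<br />
<syntaxhighlight lang="xquery"><br />
(: returns $n random numbers by repeatedly advancing the generator :)<br />
declare function local:randoms(<br />
 $rng as map(xs:string, item()),<br />
 $n as xs:integer<br />
) as xs:double* {<br />
 if ($n le 0) then () else (<br />
 $rng?number,<br />
 local:randoms($rng?next(), $n - 1)<br />
 )<br />
};<br />
local:randoms(random-number-generator(42), 3)<br />
</syntaxhighlight><br />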
<br />
==fn:format-number==<br />
<br />
The function has been extended to support scientific notation:<br />
<br />
<syntaxhighlight lang="xquery"><br />
<br />
format-number(1984.42, '00.0e0') (: yields 19.8e2 :)<br />
</syntaxhighlight><br />
<br />
==fn:tokenize==<br />
<br />
If no separator is specified as second argument, a string will be tokenized at whitespace boundaries:<br />
<br />
<syntaxhighlight lang="xquery"><br />
<br />
fn:tokenize(" a b c d") (: yields "a", "b", "c", "d" :)<br />
</syntaxhighlight><br />
<br />
==fn:trace==<br />
<br />
The second argument can now be omitted:<br />
<br />
<syntaxhighlight lang="xquery"><br />
<br />
fn:trace(<xml/>, "Node: ")/node() (: yields the debugging output "Node: <xml/>" :),<br />
fn:trace(<xml/>)/node() (: returns the debugging output "<xml/>" :)<br />
</syntaxhighlight><br />
<br />
==fn:string-join==<br />
<br />
The type of the first argument is now <code>xs:anyAtomicType*</code>, and all items will be implicitly cast to strings:<br />
<br />
<syntaxhighlight lang="xquery"><br />
<br />
fn:string-join(1 to 3) (: yields the string "123" :)<br />
</syntaxhighlight><br />
<br />
==fn:default-language==<br />
<br />
Returns the default language used for formatting numbers and dates. BaseX always returns {{Code|en}}.<br />
<br />
==Appendix==<br />
<br />
The three functions <code>fn:transform</code>, <code>fn:load-xquery-module</code> and <code>fn:collation-key</code> may be added in a future version of BaseX as their implementation might require the use of additional external libraries.<br />
<br />
=Binary Data=<br />
<br />
Items of type <code>xs:hexBinary</code> and <code>xs:base64Binary</code> can be compared against each other. The following queries all yield {{Code|true}}:<br />
<br />
<syntaxhighlight lang="xquery"><br />
<br />
xs:hexBinary('') < xs:hexBinary('bb'),<br />
xs:hexBinary('aa') < xs:hexBinary('bb'),<br />
max((xs:hexBinary('aa'), xs:hexBinary('bb'))) = xs:hexBinary('bb')<br />
</syntaxhighlight><br />
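<br />
Since an order is defined for binary items, they can also be used as sort keys. The following sketch uses arbitrary sample values and should return {{Code|true}}, followed by the three hex strings in ascending order:<br />
<br />
<syntaxhighlight lang="xquery"><br />
(: the octets 01 and 02, encoded as Base64 :)<br />
xs:base64Binary('AQ==') < xs:base64Binary('Ag=='),<br />
(: sort binary items and return their string representation :)<br />
for $item in (xs:hexBinary('0c'), xs:hexBinary('0a'), xs:hexBinary('0b'))<br />
order by $item<br />
return string($item)<br />
</syntaxhighlight><br />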
<br />
=Collations=<br />
<br />
XQuery 3.1 defines an additional collation, which allows for a case-insensitive comparison of ASCII characters (<code>A-Z</code> = <code>a-z</code>). If it is declared as default collation, the following query returns <code>true</code>:<br />
<br />
<syntaxhighlight lang="xquery"><br />
<br />
declare default collation 'http://www.w3.org/2005/xpath-functions/collation/html-ascii-case-insensitive';<br />
'HTML' = 'html'<br />
</syntaxhighlight><br />
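<br />
The collation does not need to be declared as default; it can also be passed on to functions that accept a collation argument. A minimal sketch, assuming that the default collation of the query is the Unicode codepoint collation (the variable name <code>$ci</code> is chosen for illustration):<br />
<br />
<syntaxhighlight lang="xquery"><br />
let $ci := 'http://www.w3.org/2005/xpath-functions/collation/html-ascii-case-insensitive'<br />
return (<br />
  compare('HTML', 'html', $ci),        (: yields 0 :)<br />
  compare('HTML', 'html'),             (: yields -1, as codepoints are compared :)<br />
  index-of(('a', 'B', 'c'), 'b', $ci)  (: yields 2 :)<br />
)<br />
</syntaxhighlight><br />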
<br />
If the [http://site.icu-project.org/download ICU Library] is downloaded and added to the classpath, the full [https://www.w3.org/TR/xpath-functions-31/#uca-collations Unicode Collation Algorithm] features become available in BaseX:<br />
<br />
<syntaxhighlight lang="xquery"><br />
<br />
(: returns 0 (both strings are compared as equal) :)<br />
compare('a-b', 'ab', 'http://www.w3.org/2013/collation/UCA?alternate=shifted')<br />
</syntaxhighlight><br />
<br />
=Enclosed Expressions=<br />
<br />
''Enclosed expression'' is the syntactical term for the expressions that are specified inside a function body, try/catch clauses, node constructors and some other expressions. In each of the following examples, the enclosed expression is the empty sequence:<br />
<br />
<syntaxhighlight lang="xquery"><br />
<br />
declare function local:x() { () };<br />
try { () } catch * { () },<br />
element x { () },<br />
text { () }<br />
</syntaxhighlight><br />
<br />
With XQuery 3.1, the expression can be omitted. The following query is equivalent to the one above:<br />
<br />
<syntaxhighlight lang="xquery"><br />
<br />
declare function local:x() { };<br />
try { } catch * { },<br />
element x { },<br />
text { }<br />
</syntaxhighlight><br />
<br />
=Changelog=<br />
<br />
;Version 8.6<br />
<br />
* Updated: Collation argument was inserted between first and second argument.<br />
<br />
;Version 8.4<br />
<br />
* Added: [[#String Constructors|String Constructors]], [[#fn:default-language|fn:default-language]], [[#Enclosed Expressions|Enclosed Expressions]]<br />
* Updated: [[#Adaptive Serialization|Adaptive Serialization]], [[#fn:string-join|fn:string-join]]<br />
<br />
;Version 8.2<br />
<br />
* Added: [[#fn:json-to-xml|fn:json-to-xml]], [[#fn:xml-to-json|fn:xml-to-json]].<br />
<br />
;Version 8.1<br />
<br />
* Updated: arrays are now based on a [https://en.wikipedia.org/wiki/Finger_tree Finger Tree] implementation.<br />
<br />
Introduced with Version 8.0.</div>James Ballhttps://docs.basex.org/index.php?title=Repository&diff=15333Repository2020-12-18T18:03:15Z<p>James Ball: Minor grammatical updates</p>
<hr />
<div>This article is part of the [[XQuery|XQuery Portal]].<br />
It describes how external XQuery modules and Java code can be installed<br />
in the XQuery module repository, and how new packages are built and deployed.<br />
<br />
=Introduction=<br />
<br />
One of the things that makes languages successful is the availability of external libraries. As XQuery comes with only 150 pre-defined functions, which cannot meet all requirements, additional library modules exist – such as [http://www.functx.com/ FunctX] – which extend the language with new features.<br />
<br />
BaseX offers the following mechanisms to make external modules accessible to the XQuery processor:<br />
<br />
# The internal [[#Packaging|Packaging]] mechanism will install single XQuery and JAR modules in the repository.<br />
# The [[#EXPath Packaging|EXPath Packaging]] system provides a generic mechanism for adding XQuery modules to query processors. A package is defined as a {{Code|.xar}} archive, which encapsulates one or more extension libraries.<br />
<br />
==Accessing Modules==<br />
<br />
Library modules can be imported with the {{Code|import module}} statement, followed by a freely choosable prefix and the namespace of the target module. A location can be specified with the {{Code|at}} keyword. It may be absolute or relative; in the latter case, it is resolved against the location (i.e., ''static base URI'') of the calling module. Import module statements must be placed at the beginning of a module:<br />
<br />
'''Main Module''' <code>hello-universe.xq</code>:<br />
<br />
<syntaxhighlight lang="xquery"><br />
import module namespace m = 'http://basex.org/modules/Hello' at 'hello-world.xqm';<br />
m:hello("Universe")<br />
</syntaxhighlight><br />
<br />
'''Library Module''' <code>hello-world.xqm</code> (in the same directory):<br />
<br />
<syntaxhighlight lang="xquery"><br />
module namespace m = 'http://basex.org/modules/Hello';<br />
declare function m:hello($world) {<br />
  'Hello ' || $world<br />
};<br />
</syntaxhighlight><br />
<br />
If no location is supplied, modules will be looked up in the repository. Repository modules are stored in the {{Code|repo}} directory, which resides in your [[Configuration#Home Directory|home directory]]. XQuery modules can be manually copied to the repository directory or installed and deleted via [[#Commands|commands]].<br />
<br />
The following example calls a function from the FunctX module in the repository:<br />
<br />
<syntaxhighlight lang="xquery"><br />
import module namespace functx = 'http://www.functx.com';<br />
functx:capitalize-first('test')<br />
</syntaxhighlight><br />
<br />
=Commands=<br />
<br />
There are various ways to organize your packages:<br />
<br />
* Execute BaseX REPO commands (listed below)<br />
* Call XQuery functions of the [[Repository Module]]<br />
* Use the GUI (''Options'' → ''Packages'')<br />
<br />
You can even manually add and remove packages in the repository directory; all changes will automatically be detected by BaseX.<br />
<br />
==Installation==<br />
<br />
A module or package can be installed with {{Command|REPO INSTALL}}. The path to the file has to be given as a parameter:<br />
<br />
REPO INSTALL http://files.basex.org/modules/expath/functx-1.0.xar<br />
REPO INSTALL hello-world.xqm<br />
<br />
The installation will only succeed if the specified file conforms to the constraints described below. If you know that your input is valid, you may as well copy the files directly to the repository directory, or edit them in the repository without deleting and reinstalling them.<br />
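<br />
The installation can also be triggered from XQuery via the [[Repository Module]]. A minimal sketch, assuming that the {{Code|repo}} prefix is statically bound and that the package URL from the example above is reachable:<br />
<br />
<syntaxhighlight lang="xquery"><br />
(: counterpart of the REPO INSTALL command :)<br />
repo:install('http://files.basex.org/modules/expath/functx-1.0.xar')<br />
</syntaxhighlight><br />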
<br />
==Listing==<br />
<br />
All currently installed packages can be listed with {{Command|REPO LIST}}. The names of all packages are listed, along with their version, their package type, and the repository path:<br />
<br />
Name Version Type Path<br />
-----------------------------------------------------------------<br />
<nowiki>http://www.functx.com</nowiki> 1.0 EXPath http-www.functx.com-1.0<br />
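<br />
The XQuery counterpart of the command is provided by the [[Repository Module]]. The following sketch assumes that the {{Code|repo}} prefix is statically bound; instead of a table, the function returns the packages as a sequence of items:<br />
<br />
<syntaxhighlight lang="xquery"><br />
(: counterpart of the REPO LIST command :)<br />
repo:list()<br />
</syntaxhighlight><br />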
<br />
==Removal==<br />
<br />
A package can be deleted with {{Command|REPO DELETE}} and an additional argument, containing its name or the name suffixed with a hyphen and the package version:<br />
<br />
REPO DELETE <nowiki>http://www.functx.com</nowiki><br />
REPO DELETE <nowiki>http://www.functx.com-1.0</nowiki><br />
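<br />
A package can be removed from within a query as well. A minimal sketch with the [[Repository Module]], assuming that the FunctX package from the examples above is installed:<br />
<br />
<syntaxhighlight lang="xquery"><br />
(: counterpart of the REPO DELETE command :)<br />
repo:delete('http://www.functx.com')<br />
</syntaxhighlight><br />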
<br />
=Packaging=<br />
<br />
==XQuery==<br />
<br />
If an XQuery file is specified as input for the install command, it will be parsed as an XQuery library module. If the file can be parsed successfully, the module URI will be [[Java Bindings#URI Rewriting|rewritten]] to a file path, the {{Code|.xqm}} file suffix will be appended, and the original file will be copied to that path in the repository (and renamed if necessary).<br />
<br />
'''Example:'''<br />
<br />
Installation (the original file will be copied to the {{Code|org/basex/modules/Hello}} sub-directory of the repository):<br />
<br />
REPO INSTALL http://files.basex.org/modules/org/basex/modules/Hello/HelloWorld.xqm<br />
<br />
Importing the repository module:<br />
<br />
<syntaxhighlight lang="xquery"><br />
import module namespace m = 'http://basex.org/modules/Hello';<br />
m:hello("Universe")<br />
</syntaxhighlight><br />
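<br />
The rewriting of a module URI to a repository path can be illustrated with a few lines of XQuery. The function {{Code|local:uri-to-path}} is a hypothetical helper, not a part of BaseX. It is a simplified sketch that only handles plain URIs with a scheme, a host and an optional path; the complete rules are described in the [[Java Bindings#URI Rewriting|URI Rewriting]] section:<br />
<br />
<syntaxhighlight lang="xquery"><br />
(: simplified sketch: reverse the host components and append the path :)<br />
declare function local:uri-to-path($uri as xs:string) as xs:string {<br />
  let $no-scheme := replace($uri, '^[a-z]+://', '')<br />
  let $host := substring-before($no-scheme || '/', '/')<br />
  let $path := substring-after($no-scheme, '/')<br />
  let $reversed := string-join(reverse(tokenize($host, '\.')), '/')<br />
  return string-join(($reversed, $path[. != '']), '/')<br />
};<br />
<br />
(: yields org/basex/modules/Hello.xqm :)<br />
local:uri-to-path('http://basex.org/modules/Hello') || '.xqm'<br />
</syntaxhighlight><br />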
<br />
==Java==<br />
<br />
For general notes on importing Java classes, please read the Java Bindings article on [[Java Bindings#Module_Imports|Module Imports]].<br />
<br />
Java archives (JARs) may contain one or more class files. One of them will be chosen as the main class, which must be specified in a {{Code|Main-Class}} entry in the manifest file ({{Code|META-INF/MANIFEST.MF}}). This fully qualified Java class name will be rewritten to a file path by replacing the dots with slashes and attaching the {{Code|.jar}} file suffix, and the original file will be renamed and copied to that path in the repository.<br />
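<br />
The path computation for the main class can be written down as a one-line expression. This is merely an illustration of the rule stated above, not the code that is used by BaseX:<br />
<br />
<syntaxhighlight lang="xquery"><br />
(: replace dots with slashes, attach the suffix: yields org/basex/modules/Hello.jar :)<br />
translate('org.basex.modules.Hello', '.', '/') || '.jar'<br />
</syntaxhighlight><br />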
<br />
If the class is imported in the prolog of an XQuery module, an instance of it will be created, and its public functions can then be addressed from XQuery. A class may extend the {{Code|QueryModule}} class to get access to the current query context and to be enriched by some helpful annotations (see [[Java_Bindings#Annotations|Annotations]]).<br />
<br />
'''Example:'''<br />
<br />
Structure of the <code>[https://files.basex.org/modules/org/basex/modules/Hello/HelloWorld.jar HelloWorld.jar]</code> archive:<br />
<br />
META-INF/<br />
MANIFEST.MF<br />
org/basex/modules/<br />
Hello.class<br />
<br />
Contents of the file {{Code|MANIFEST.MF}} (the whitespace is obligatory):<br />
<br />
Manifest-Version: 1.0<br />
Main-Class: org.basex.modules.Hello<br />
<br />
Contents of the file {{Code|Hello.java}} (comments removed):<br />
<br />
<syntaxhighlight lang="java"><br />
package org.basex.modules;<br />
public class Hello {<br />
  public String hello(final String world) {<br />
    return "Hello " + world;<br />
  }<br />
}<br />
</syntaxhighlight><br />
<br />
Installation (the file will be copied to {{Code|org/basex/modules/Hello.jar}}):<br />
<br />
REPO INSTALL HelloWorld.jar<br />
<br />
XQuery file <code>[https://files.basex.org/modules/org/basex/modules/Hello/HelloUniverse.xq HelloUniverse.xq]</code> (same as above):<br />
<br />
<syntaxhighlight lang="xquery"><br />
import module namespace m = 'http://basex.org/modules/Hello';<br />
m:hello("Universe")<br />
</syntaxhighlight><br />
<br />
After having installed the module, all of the following URIs can be used in XQuery to import this module or call its functions (see [[Java Bindings#URI Rewriting|URI Rewriting]] for more information):<br />
<br />
<nowiki>http://basex.org/modules/Hello</nowiki><br />
org/basex/modules/Hello<br />
org.basex.modules.Hello<br />
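<br />
For example, the module can be imported via its Java class name. The following sketch assumes that {{Code|HelloWorld.jar}} has been installed as shown above:<br />
<br />
<syntaxhighlight lang="xquery"><br />
(: same query as above, but with the class name as module URI :)<br />
import module namespace m = 'org.basex.modules.Hello';<br />
m:hello("Universe")<br />
</syntaxhighlight><br />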
<br />
===Additional Libraries===<br />
<br />
A Java class may depend on additional libraries. The dependencies can be resolved by creating a fat JAR file, i.e., extracting all files of the library archives and producing a single, flat JAR package.<br />
<br />
Another solution is to copy the libraries into a {{Code|lib}} directory of the JAR package. When the package is installed, the additional library archives will be extracted and copied to a hidden sub-directory in the repository. If the package is deleted, the hidden sub-directory will be removed as well.<br />
<br />
; Exemplary contents of {{Code|Image.jar}}<br />
<br />
lib/<br />
Images.jar<br />
META-INF/<br />
MANIFEST.MF<br />
org/basex/modules/<br />
Image.class<br />
<br />
; Directory structure of the repository directory after installing the package<br />
<br />
org/basex/modules/<br />
Image.class<br />
.Images/<br />
Images.jar<br />
<br />
==Combined==<br />
<br />
It makes sense to combine the advantages of XQuery and Java packages:<br />
<br />
* Instead of directly calling Java code, a wrapper module can be provided. This module contains functions that invoke the Java functions.<br />
* These functions can be strictly typed. This reduces the danger of erroneous or unexpected conversions between XQuery and Java code.<br />
* In addition, the entry functions can have properly maintained XQuery comments.<br />
<br />
XQuery and Java can be combined as follows:<br />
<br />
* First, a JAR package is created (as described above).<br />
* A new XQuery wrapper module is created, which is named identically to the Java main class.<br />
* The URI of the {{Code|import module}} statement in the wrapper module must start with the {{Code|java:}} prefix.<br />
* The finalized XQuery module must be copied into the JAR file, and placed in the same directory as the Java main class.<br />
<br />
If the resulting JAR file is installed, the embedded XQuery module will be extracted, and it will be evaluated first when the module is imported.<br />
<br />
; Main Module {{Code|hello-universe.xq}}:<br />
<br />
<syntaxhighlight lang="xquery"><br />
import module namespace m = 'http://basex.org/modules/Hello';<br />
m:hello("Universe")<br />
</syntaxhighlight><br />
<br />
; Wrapper Module {{Code|Hello.xqm}}:<br />
<br />
<syntaxhighlight lang="xquery"><br />
module namespace hello = 'http://basex.org/modules/Hello';<br />
<br />
(: Import JAR file :)<br />
import module namespace java = 'java:org.basex.modules.Hello';<br />
<br />
(:~<br />
: Say hello to someone.<br />
: @param $world the one to be greeted<br />
: @return welcome string<br />
:)<br />
declare function hello:hello(<br />
  $world as xs:string<br />
) as xs:string {<br />
  java:hello($world)<br />
};<br />
</syntaxhighlight><br />
<br />
; Java class {{Code|Hello.java}}:<br />
<br />
<syntaxhighlight lang="java"><br />
package org.basex.modules;<br />
<br />
public class Hello {<br />
  public String hello(final String world) {<br />
    return "Hello " + world;<br />
  }<br />
}<br />
</syntaxhighlight><br />
<br />
If the JAR file is installed, {{Code|Combined}} will be displayed as type:<br />
<br />
REPO INSTALL http://files.basex.org/modules/org/basex/modules/Hello.jar<br />
REPO LIST<br />
<br />
Name Version Type Path<br />
-----------------------------------------------------------------------<br />
org.basex.modules.Hello - Combined org/basex/modules/Hello.xqm<br />
<br />
=EXPath Packaging=<br />
<br />
The [http://expath.org/spec/pkg EXPath specification] defines the structure of a {{Code|.xar}} archive. The package contains at its root a package descriptor named <code>expath-pkg.xml</code>. This descriptor provides some metadata about the package, the libraries it contains, and their dependencies on other libraries or processors.<br />
<br />
==XQuery==<br />
<br />
Apart from the package descriptor, a {{Code|.xar}} archive contains a directory which includes the actual XQuery modules. For example, the [https://files.basex.org/modules/expath/functx-1.0.xar FunctX XAR archive] is packaged as follows:<br />
<br />
<syntaxhighlight><br />
expath-pkg.xml<br />
functx/<br />
functx.xql<br />
functx.xsl<br />
</syntaxhighlight><br />
<br />
==Java==<br />
<br />
If you want to package an EXPath archive with Java code, some additional requirements have to be fulfilled:<br />
<br />
* Apart from the package descriptor <code>expath-pkg.xml</code>, the package has to contain a descriptor file at its root, defining the included jars and the binary names of their public classes. It must be named <code>basex.xml</code> and must conform to the following structure:<br />
<br />
<syntaxhighlight lang="xml"><br />
<package xmlns="http://expath.org/ns/pkg"><br />
<jar>...</jar><br />
....<br />
<class>...</class><br />
<class>...</class><br />
....<br />
</package><br />
</syntaxhighlight><br />
<br />
* The JAR file itself, along with an XQuery file defining wrapper functions around the Java methods, has to reside in the module directory. The following example illustrates how Java methods are wrapped with XQuery functions:<br />
<br />
'''Example:'''<br>Suppose we have a simple class <code>Printer</code> having just one public method <code>print()</code>:<br />
<br />
<syntaxhighlight lang="java"><br />
package test;<br />
<br />
public final class Printer {<br />
  public String print(final String s) {<br />
    return new Writer(s).write();<br />
  }<br />
}<br />
</syntaxhighlight><br />
<br />
We want to extend BaseX with this class and use its method. In order to make this possible we have to define an XQuery function which wraps the <code>print</code> method of our class. This can be done in the following way:<br />
<br />
<syntaxhighlight lang="xquery"><br />
module namespace j="http://basex.org/lib/testJar";<br />
<br />
declare namespace p="java:test.Printer";<br />
<br />
declare function j:print($str as xs:string) as xs:string {<br />
  let $printer := p:new()<br />
  return p:print($printer, $str)<br />
};<br />
</syntaxhighlight><br />
<br />
As can be seen, the class {{Code|Printer}} is declared with its binary name as a namespace prefixed with {{Code|java:}}, and the XQuery function is implemented using the [http://docs.basex.org/wiki/Java_Bindings Java Bindings] offered by BaseX.<br />
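<br />
Once such a package has been installed, the wrapper function can be called like any other repository function. The following sketch assumes that the package has been installed under the namespace shown above; the argument is an arbitrary sample value, and the result depends on the {{Code|Writer}} class, which is not shown here:<br />
<br />
<syntaxhighlight lang="xquery"><br />
(: no location is supplied, so the module is looked up in the repository :)<br />
import module namespace j = 'http://basex.org/lib/testJar';<br />
j:print('Hello')<br />
</syntaxhighlight><br />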
<br />
On our [https://files.basex.org/modules/ file server], you can find some example libraries packaged as XML archives (xar files). You can use them to try our packaging API or just as a reference for creating your own packages.<br />
<br />
=Performance=<br />
<br />
Importing XQuery modules that are located in the repository is just as fast as importing any other modules. Modules that are imported several times in a project will only be compiled once.<br />
<br />
Imported Java archives will be dynamically added to the classpath and unregistered after query execution. This requires some constant overhead and may lead to unexpected effects in scenarios with highly concurrent read operations. For optimal performance, it is advisable to move your JAR files into the {{Code|lib/custom}} directory of BaseX. This way, the archive will be added to the classpath when BaseX is started. If you have installed a [[#Combined|Combined Package]], you can simply keep your XQuery module in the repository, and the Java classes will be automatically detected.<br />
<br />
=Changelog=<br />
<br />
;Version 9.0<br />
<br />
* Added: [[#Combined|Combined]] XQuery and Java packages<br />
* Added: [[#Additional Libraries|Additional Libraries]]<br />
<br />
;Version 7.2.1<br />
<br />
* Updated: [[#Installation|Installation]]: existing packages will be replaced without raising an error<br />
* Updated: [[#Removal|Removal]]: remove specific version of a package<br />
<br />
;Version 7.1<br />
<br />
* Added: [[Repository Module]]<br />
<br />
;Version 7.0<br />
<br />
* Added: [[#EXPath Packaging|EXPath Packaging]]</div>James Ballhttps://docs.basex.org/index.php?title=Repository&diff=15332Repository2020-12-18T18:02:12Z<p>James Ball: Minor grammatical updates</p>
<hr />
<div>This article is part of the [[XQuery|XQuery Portal]].<br />
It describes how external XQuery modules and Java code can be installed<br />
in the XQuery module repository, and how new packages are built and deployed.<br />
<br />
=Introduction=<br />
<br />
One of the things that makes languages successful is the availability of external libraries. As XQuery comes with only 150 pre-defined functions, which cannot meet all requirements, additional library modules exist – such as [http://www.functx.com/ FunctX] – which extend the language with new features.<br />
<br />
BaseX offers the following mechanisms to make external modules accessible to the XQuery processor:<br />
<br />
# The internal [[#Packaging|Packaging]] mechanism will install single XQuery and JAR modules in the repository.<br />
# The [[#EXPath Packaging|EXPath Packaging]] system provides a generic mechanism for adding XQuery modules to query processors. A package is defined as a {{Code|.xar}} archive, which encapsulates one or more extension libraries.<br />
<br />
==Accessing Modules==<br />
<br />
Library modules can be imported with the {{Code|import module}} statement, followed by a freely choosable prefix and the namespace of the target module. A location can be specified with the {{Code|at}} keyword. It may be absolute or relative; in the latter case, it is resolved against the location (i.e., ''static base URI'') of the calling module. Import module statements must be placed at the beginning of a module:<br />
<br />
'''Main Module''' <code>hello-universe.xq</code>:<br />
<br />
<syntaxhighlight lang="xquery"><br />
import module namespace m = 'http://basex.org/modules/Hello' at 'hello-world.xqm';<br />
m:hello("Universe")<br />
</syntaxhighlight><br />
<br />
'''Library Module''' <code>hello-world.xqm</code> (in the same directory):<br />
<br />
<syntaxhighlight lang="xquery"><br />
module namespace m = 'http://basex.org/modules/Hello';<br />
declare function m:hello($world) {<br />
  'Hello ' || $world<br />
};<br />
</syntaxhighlight><br />
<br />
If no location is supplied, modules will be looked up in the repository. Repository modules are stored in the {{Code|repo}} directory, which resides in your [[Configuration#Home Directory|home directory]]. XQuery modules can be manually copied to the repository directory or installed and deleted via [[#Commands|commands]].<br />
<br />
The following example calls a function from the FunctX module in the repository:<br />
<br />
<syntaxhighlight lang="xquery"><br />
import module namespace functx = 'http://www.functx.com';<br />
functx:capitalize-first('test')<br />
</syntaxhighlight><br />
<br />
=Commands=<br />
<br />
There are various ways to organize your packages:<br />
<br />
* Execute BaseX REPO commands (listed below)<br />
* Call XQuery functions of the [[Repository Module]]<br />
* Use the GUI (''Options'' → ''Packages'')<br />
<br />
You can even manually add and remove packages in the repository directory; all changes will automatically be detected by BaseX.<br />
<br />
==Installation==<br />
<br />
A module or package can be installed with {{Command|REPO INSTALL}}. The path to the file has to be given as a parameter:<br />
<br />
REPO INSTALL http://files.basex.org/modules/expath/functx-1.0.xar<br />
REPO INSTALL hello-world.xqm<br />
<br />
The installation will only succeed if the specified file conforms to the constraints described below. If you know that your input is valid, you may as well copy the files directly to the repository directory, or edit them in the repository without deleting and reinstalling them.<br />
<br />
==Listing==<br />
<br />
All currently installed packages can be listed with {{Command|REPO LIST}}. The names of all packages are listed, along with their version, their package type, and the repository path:<br />
<br />
Name Version Type Path<br />
-----------------------------------------------------------------<br />
<nowiki>http://www.functx.com</nowiki> 1.0 EXPath http-www.functx.com-1.0<br />
<br />
==Removal==<br />
<br />
A package can be deleted with {{Command|REPO DELETE}} and an additional argument, containing its name or the name suffixed with a hyphen and the package version:<br />
<br />
REPO DELETE <nowiki>http://www.functx.com</nowiki><br />
REPO DELETE <nowiki>http://www.functx.com-1.0</nowiki><br />
<br />
=Packaging=<br />
<br />
==XQuery==<br />
<br />
If an XQuery file is specified as input for the install command, it will be parsed as an XQuery library module. If the file can be parsed successfully, the module URI will be [[Java Bindings#URI Rewriting|rewritten]] to a file path, the {{Code|.xqm}} file suffix will be appended, and the original file will be copied to that path in the repository (and renamed if necessary).<br />
<br />
'''Example:'''<br />
<br />
Installation (the original file will be copied to the {{Code|org/basex/modules/Hello}} sub-directory of the repository):<br />
<br />
REPO INSTALL http://files.basex.org/modules/org/basex/modules/Hello/HelloWorld.xqm<br />
<br />
Importing the repository module:<br />
<br />
<syntaxhighlight lang="xquery"><br />
import module namespace m = 'http://basex.org/modules/Hello';<br />
m:hello("Universe")<br />
</syntaxhighlight><br />
<br />
==Java==<br />
<br />
For general notes on importing Java classes, please read the Java Bindings article on [[Java Bindings#Module_Imports|Module Imports]].<br />
<br />
Java archives (JARs) may contain one or more class files. One of them will be chosen as the main class, which must be specified in a {{Code|Main-Class}} entry in the manifest file ({{Code|META-INF/MANIFEST.MF}}). This fully qualified Java class name will be rewritten to a file path by replacing the dots with slashes and attaching the {{Code|.jar}} file suffix, and the original file will be renamed and copied to that path in the repository.<br />
<br />
If the class is imported in the prolog of an XQuery module, an instance of it will be created, and its public functions can then be addressed from XQuery. A class may extend the {{Code|QueryModule}} class to get access to the current query context and to be enriched by some helpful annotations (see [[Java_Bindings#Annotations|Annotations]]).<br />
<br />
'''Example:'''<br />
<br />
Structure of the <code>[https://files.basex.org/modules/org/basex/modules/Hello/HelloWorld.jar HelloWorld.jar]</code> archive:<br />
<br />
META-INF/<br />
MANIFEST.MF<br />
org/basex/modules/<br />
Hello.class<br />
<br />
Contents of the file {{Code|MANIFEST.MF}} (the whitespace is obligatory):<br />
<br />
Manifest-Version: 1.0<br />
Main-Class: org.basex.modules.Hello<br />
<br />
Contents of the file {{Code|Hello.java}} (comments removed):<br />
<br />
<syntaxhighlight lang="java"><br />
package org.basex.modules;<br />
public class Hello {<br />
  public String hello(final String world) {<br />
    return "Hello " + world;<br />
  }<br />
}<br />
</syntaxhighlight><br />
<br />
Installation (the file will be copied to {{Code|org/basex/modules/Hello.jar}}):<br />
<br />
REPO INSTALL HelloWorld.jar<br />
<br />
XQuery file <code>[https://files.basex.org/modules/org/basex/modules/Hello/HelloUniverse.xq HelloUniverse.xq]</code> (same as above):<br />
<br />
<syntaxhighlight lang="xquery"><br />
import module namespace m = 'http://basex.org/modules/Hello';<br />
m:hello("Universe")<br />
</syntaxhighlight><br />
<br />
After having installed the module, all of the following URIs can be used in XQuery to import this module or call its functions (see [[Java Bindings#URI Rewriting|URI Rewriting]] for more information):<br />
<br />
<nowiki>http://basex.org/modules/Hello</nowiki><br />
org/basex/modules/Hello<br />
org.basex.modules.Hello<br />
<br />
===Additional Libraries===<br />
<br />
A Java class may depend on additional libraries. The dependencies can be resolved by creating a fat JAR file, i.e., extracting all files of the library archives and producing a single, flat JAR package.<br />
<br />
Another solution is to copy the libraries into a {{Code|lib}} directory of the JAR package. When the package is installed, the additional library archives will be extracted and copied to a hidden sub-directory in the repository. If the package is deleted, the hidden sub-directory will be removed as well.<br />
<br />
; Exemplary contents of {{Code|Image.jar}}<br />
<br />
lib/<br />
Images.jar<br />
META-INF/<br />
MANIFEST.MF<br />
org/basex/modules/<br />
Image.class<br />
<br />
; Directory structure of the repository directory after installing the package<br />
<br />
org/basex/modules/<br />
Image.class<br />
.Images/<br />
Images.jar<br />
<br />
==Combined==<br />
<br />
It makes sense to combine the advantages of XQuery and Java packages:<br />
<br />
* Instead of directly calling Java code, a wrapper module can be provided. This module contains functions that invoke the Java functions.<br />
* These functions can be strictly typed. This reduces the danger of erroneous or unexpected conversions between XQuery and Java code.<br />
* In addition, the entry functions can have properly maintained XQuery comments.<br />
<br />
XQuery and Java can be combined as follows:<br />
<br />
* First, a JAR package is created (as described above).<br />
* A new XQuery wrapper module is created, which is named identically to the Java main class.<br />
* The URI of the {{Code|import module}} statement in the wrapper module must start with the {{Code|java:}} prefix.<br />
* The finalized XQuery module must be copied into the JAR file, and placed in the same directory as the Java main class.<br />
<br />
If the resulting JAR file is installed, the embedded XQuery module will be extracted, and it will be evaluated first when the module is imported.<br />
<br />
; Main Module {{Code|hello-universe.xq}}:<br />
<br />
<syntaxhighlight lang="xquery"><br />
import module namespace m = 'http://basex.org/modules/Hello';<br />
m:hello("Universe")<br />
</syntaxhighlight><br />
<br />
; Wrapper Module {{Code|Hello.xqm}}:<br />
<br />
<syntaxhighlight lang="xquery"><br />
module namespace hello = 'http://basex.org/modules/Hello';<br />
<br />
(: Import JAR file :)<br />
import module namespace java = 'java:org.basex.modules.Hello';<br />
<br />
(:~<br />
: Say hello to someone.<br />
: @param $world the one to be greeted<br />
: @return welcome string<br />
:)<br />
declare function hello:hello(<br />
  $world as xs:string<br />
) as xs:string {<br />
  java:hello($world)<br />
};<br />
</syntaxhighlight><br />
<br />
; Java class {{Code|Hello.java}}:<br />
<br />
<syntaxhighlight lang="java"><br />
package org.basex.modules;<br />
<br />
public class Hello {<br />
  public String hello(final String world) {<br />
    return "Hello " + world;<br />
  }<br />
}<br />
</syntaxhighlight><br />
<br />
If the JAR file is installed, {{Code|Combined}} will be displayed as type:<br />
<br />
REPO INSTALL http://files.basex.org/modules/org/basex/modules/Hello.jar<br />
REPO LIST<br />
<br />
Name Version Type Path<br />
-----------------------------------------------------------------------<br />
org.basex.modules.Hello - Combined org/basex/modules/Hello.xqm<br />
<br />
=EXPath Packaging=<br />
<br />
The [http://expath.org/spec/pkg EXPath specification] defines the structure of a {{Code|.xar}} archive. The package contains at its root a package descriptor named <code>expath-pkg.xml</code>. This descriptor provides some metadata about the package, the libraries it contains, and their dependencies on other libraries or processors.<br />
<br />
==XQuery==<br />
<br />
Apart from the package descriptor, a {{Code|.xar}} archive contains a directory which includes the actual XQuery modules. For example, the [https://files.basex.org/modules/expath/functx-1.0.xar FunctX XAR archive] is packaged as follows:<br />
<br />
<syntaxhighlight><br />
expath-pkg.xml<br />
functx/<br />
functx.xql<br />
functx.xsl<br />
</syntaxhighlight><br />
<br />
==Java==<br />
<br />
If you want to package an EXPath archive with Java code, some additional requirements have to be fulfilled:<br />
<br />
* Apart from the package descriptor <code>expath-pkg.xml</code>, the package has to contain a descriptor file at its root, defining the included jars and the binary names of their public classes. It must be named <code>basex.xml</code> and must conform to the following structure:<br />
<br />
<syntaxhighlight lang="xml"><br />
<package xmlns="http://expath.org/ns/pkg"><br />
<jar>...</jar><br />
....<br />
<class>...</class><br />
<class>...</class><br />
....<br />
</package><br />
</syntaxhighlight><br />
<br />
* The JAR file itself, along with an XQuery file defining wrapper functions around the Java methods, has to reside in the module directory. The following example illustrates how Java methods are wrapped with XQuery functions:<br />
<br />
'''Example:'''<br>Suppose we have a simple class <code>Printer</code> having just one public method <code>print()</code>:<br />
<br />
<syntaxhighlight lang="java"><br />
package test;<br />
<br />
public final class Printer {<br />
  public String print(final String s) {<br />
    return new Writer(s).write();<br />
  }<br />
}<br />
</syntaxhighlight><br />
<br />
We want to extend BaseX with this class and use its method. In order to make this possible we have to define an XQuery function which wraps the <code>print</code> method of our class. This can be done in the following way:<br />
<br />
<syntaxhighlight lang="xquery"><br />
module namespace j="http://basex.org/lib/testJar";<br />
<br />
declare namespace p="java:test.Printer";<br />
<br />
declare function j:print($str as xs:string) as xs:string {<br />
  let $printer := p:new()<br />
  return p:print($printer, $str)<br />
};<br />
</syntaxhighlight><br />
<br />
As can be seen, the class {{Code|Printer}} is declared with its binary name as a namespace, prefixed with "java:", and the XQuery function is implemented using the [http://docs.basex.org/wiki/Java_Bindings Java Bindings] offered by BaseX.<br />
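<br />
Once the package has been installed, the wrapper function could be called from any query. The following is a minimal sketch, assuming that the module above is available under the namespace URI shown and that the class <code>test.Writer</code> used by <code>Printer</code> (not shown in this article) is part of the jar:<br />
<br />
<syntaxhighlight lang="xquery"><br />
import module namespace j = "http://basex.org/lib/testJar";<br />
<br />
(: calls the XQuery wrapper, which in turn invokes Printer.print() :)<br />
j:print("Hello World")<br />
</syntaxhighlight><br />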
<br />
On our [https://files.basex.org/modules/ file server], you can find some example libraries packaged as XML archives (xar files). You can use them to try our packaging API or just as a reference for creating your own packages.<br />
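<br />
One of these packages could, for instance, be installed with the [[Repository Module]]. The following is a minimal sketch, assuming that the file server is reachable and that the current user has sufficient permissions:<br />
<br />
<syntaxhighlight lang="xquery"><br />
(: installs the FunctX package from the file server :)<br />
repo:install('https://files.basex.org/modules/expath/functx-1.0.xar')<br />
</syntaxhighlight><br />
<br />
Afterwards, <code>repo:list()</code> can be run as a separate query to check that the package is listed.<br />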
<br />
=Performance=<br />
<br />
Importing XQuery modules that are located in the repository is just as fast as importing any other modules. Modules that are imported several times in a project will only be compiled once.<br />
<br />
Imported Java archives will be dynamically added to the classpath and unregistered after query execution. This requires some constant overhead and may lead to unexpected effects in scenarios with highly concurrent read operations. For optimal performance, it is recommended to move your JAR files into the {{Code|lib/custom}} directory of BaseX. This way, the archive will be added to the classpath when BaseX is started. If you have installed a [[#Combined|Combined Package]], you can simply keep your XQuery module in the repository, and the Java classes will be automatically detected.<br />
<br />
=Changelog=<br />
<br />
;Version 9.0<br />
<br />
* Added: [[#Combined|Combined]] XQuery and Java packages<br />
* Added: [[#Additional Libraries|Additional Libraries]]<br />
<br />
;Version 7.2.1<br />
<br />
* Updated: [[#Installation|Installation]]: existing packages will be replaced without raising an error<br />
* Updated: [[#Removal|Removal]]: remove specific version of a package<br />
<br />
;Version 7.1<br />
<br />
* Added: [[Repository Module]]<br />
<br />
;Version 7.0<br />
<br />
* Added: [[#EXPath Packaging|EXPath Packaging]]</div>James Ballhttps://docs.basex.org/index.php?title=Storage_Layout&diff=15331Storage Layout2020-12-12T22:52:28Z<p>James Ball: Added information regarding pth.basex and idp.basex</p>
<hr />
<div>This article is part of the [[Advanced User's Guide]]. It presents some low-level details on how data is stored in the database files.<br />
<br />
=Data Types=<br />
<br />
The following data types are used for specifying the storage layout:<br />
<br />
{| class="wikitable" width="100%"<br />
|-<br />
! Type<br />
! Description<br />
! Example (native → hex integers)<br />
|-<br />
| {{Type|Num}}<br />
| Compressed integer (1-5 bytes), specified in [https://github.com/BaseXdb/basex/blob/master/basex-core/src/main/java/org/basex/util/Num.java Num.java]<br />
| {{Code|15}} → {{Code|0F}}; {{Code|511}} → {{Code|41 FF}}<br/><br />
|-<br />
| {{Type|Token}}<br />
| Length ({{Type|Num}}) and bytes of the UTF-8 byte representation<br />
| {{Code|Hello}} → {{Code|05 48 65 6c 6c 6f}}<br />
|-<br />
| {{Type|Double}}<br />
| Number, stored as token<br />
| {{Code|123}} → {{Code|03 31 32 33}}<br />
|-<br />
| {{Type|Boolean}}<br />
| Boolean (1 byte, {{Code|00}} or {{Code|01}})<br />
| {{Code|true}} → {{Code|01}}<br />
|-<br />
| {{Type|Nums}}, {{Type|Tokens}}, {{Type|Doubles}}<br />
| Arrays of values, introduced with the number of entries<br />
| {{Code|1,2}} → {{Code|02 01 31 01 32}}<br />
|-<br />
| {{Type|TokenSet}}<br />
| Key array ({{Type|Tokens}}), next/bucket/size arrays (3x {{Type|Nums}})<br />
|<br />
|}<br />
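<br />
The {{Type|Token}} encoding can be reproduced in XQuery with the EXPath Binary Module. The following is a minimal sketch, not the BaseX implementation: the function name <code>local:token</code> is hypothetical, and the code assumes that lengths below 64 are stored as a single {{Type|Num}} byte, which is an inference from the examples in the table above:<br />
<br />
<syntaxhighlight lang="xquery"><br />
(: Sketch: encodes a short string as Token (length byte + UTF-8 bytes).<br />
   Only meant for strings with fewer than 64 bytes, for which the length<br />
   is assumed to fit into a single Num byte. :)<br />
declare function local:token($string as xs:string) as xs:hexBinary {<br />
  let $bytes := bin:encode-string($string, 'UTF-8')<br />
  let $length := bin:from-octets(bin:length($bytes))<br />
  return xs:hexBinary(bin:join(($length, $bytes)))<br />
};<br />
<br />
local:token('Hello')<br />
</syntaxhighlight><br />
<br />
For {{Code|Hello}}, the result should be the hex value {{Code|0548656C6C6F}}, which corresponds to the example in the table.<br />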
<br />
=Database Files=<br />
<br />
The following tables illustrate the layout of the BaseX database files. All files are suffixed with {{Code|.basex}}.<br />
<br />
==Meta Data, Name/Path/Doc Indexes: {{Code|inf}}==<br />
<br />
{| class="wikitable" width="100%"<br />
|-<br />
! Description<br />
! Format<br />
! Method<br />
|-<br />
| '''1. Meta Data'''<br />
| 1. Key/value pairs, in no particular order ({{Type|Token}}/{{Type|Token}}):<br/>&nbsp; &bull; Examples: {{Code|FNAME}}, {{Code|TIME}}, {{Code|SIZE}}, ...<br />&nbsp; &bull; {{Code|PERM}} → Number of users ({{Type|Num}}), and name/password/permission values for each user ({{Type|Token}}/{{Type|Token}}/{{Type|Num}})<br/>2. Empty key as finalizer<br />
| [https://github.com/BaseXdb/basex/blob/master/basex-core/src/main/java/org/basex/data/DiskData.java DiskData()]<br/>[https://github.com/BaseXdb/basex/blob/master/basex-core/src/main/java/org/basex/data/MetaData.java MetaData()]<br/>[https://github.com/BaseXdb/basex/blob/master/basex-core/src/main/java/org/basex/core/Users.java Users()]<br />
|-<br />
| '''2. Main memory indexes'''<br />
| 1. Key/value pairs, in no particular order ({{Type|Token}}/{{Type|Token}}):<br />&nbsp; &bull; {{Code|TAGS}} → Element Name Index<br />&nbsp; &bull; {{Code|ATTS}} → Attribute Name Index<br />&nbsp; &bull; {{Code|PATH}} → Path Index<br />&nbsp; &bull; {{Code|NS}} → Namespaces<br />&nbsp; &bull; {{Code|DOCS}} → Document Index<br/>2. Empty key as finalizer<br />
| [https://github.com/BaseXdb/basex/blob/master/basex-core/src/main/java/org/basex/data/DiskData.java DiskData()]<br />
|-<br />
| '''2 a) Name Index'''<br/>Element/attribute names<br />
| 1. Token set, storing all names ({{Type|TokenSet}})<br />2. One StatsKey instance per entry:<br/>2.1. Content kind ({{Type|Num}}):<br />2.1.1. Number: min/max ({{Type|Doubles}})<br />2.1.2. Category: number of entries ({{Type|Num}}), entries ({{Type|Tokens}})<br />2.2. Number of entries ({{Type|Num}})<br />2.3. Leaf flag ({{Type|Boolean}})<br />2.4. Maximum text length ({{Type|Double}}; legacy, could be {{Type|Num}})<br />
| [https://github.com/BaseXdb/basex/blob/master/basex-core/src/main/java/org/basex/index/Names.java Names()]<br/>[https://github.com/BaseXdb/basex/blob/master/basex-core/src/main/java/org/basex/util/hash/TokenSet.java TokenSet.read()]<br/>[https://github.com/BaseXdb/basex/blob/master/basex-core/src/main/java/org/basex/index/StatsKey.java StatsKey()]<br />
|-<br />
| '''2 b) Path Index'''<br />
| 1. Flag for path definition ({{Type|Boolean}}, always {{Code|true}}; legacy)<br/>2. PathNode:<br/>2.1. Name reference ({{Type|Num}})<br/>2.2. Node kind ({{Type|Num}})<br/>2.3. Number of occurrences ({{Type|Num}})<br/>2.4. Number of children ({{Type|Num}})<br/>2.5. {{Type|Double}}; legacy, can be reused or discarded<br/>2.6. Recursive generation of child nodes (→ 2)<br />
| [https://github.com/BaseXdb/basex/blob/master/basex-core/src/main/java/org/basex/index/path/PathSummary.java PathSummary()]<br/>[https://github.com/BaseXdb/basex/blob/master/basex-core/src/main/java/org/basex/index/path/PathNode.java PathNode()]<br />
|-<br />
| '''2 c) Namespaces'''<br />
| 1. Token set, storing prefixes ({{Type|TokenSet}})<br/>2. Token set, storing URIs ({{Type|TokenSet}})<br/>3. NSNode:<br/>3.1. pre value ({{Type|Num}})<br/>3.2. References to prefix/URI pairs ({{Type|Nums}})<br/>3.3. Number of children ({{Type|Num}})<br/>3.4. Recursive generation of child nodes (→ 3)<br />
| [https://github.com/BaseXdb/basex/blob/master/basex-core/src/main/java/org/basex/data/Namespaces.java Namespaces()]<br/>[https://github.com/BaseXdb/basex/blob/master/basex-core/src/main/java/org/basex/data/NSNode.java NSNode()]<br />
|-<br />
| '''2 d) Document Index'''<br />
| Array of integers, representing the distances between all document pre values ({{Type|Nums}})<br />
| [https://github.com/BaseXdb/basex/blob/master/basex-core/src/main/java/org/basex/index/DocIndex.java DocIndex()]<br />
|}<br />
<br />
==Node Table: {{Code|tbl}}, {{Code|tbli}}==<br />
<br />
* {{Code|tbl}}: Main database table, stored in blocks.<br />
* {{Code|tbli}}: Database directory, organizing the database blocks.<br />
<br />
Some more information on the [[Node Storage|node storage]] is available.<br />
<br />
==Texts: {{Code|txt}}, {{Code|atv}}==<br />
<br />
* {{Code|txt}}: Heap file for text values (document names, string values of texts, comments and processing instructions)<br />
* {{Code|atv}}: Heap file for attribute values.<br />
<br />
==Value Indexes: {{Code|txtl}}, {{Code|txtr}}, {{Code|atvl}}, {{Code|atvr}}==<br />
<br />
'''Text Index:'''<br />
* {{Code|txtl}}: Heap file with ID lists.<br />
* {{Code|txtr}}: Index file with references to ID lists.<br />
The '''Attribute Index''' is contained in the files {{Code|atvl}} and {{Code|atvr}}, the '''Token Index''' in {{Code|tokl}} and {{Code|tokr}}. All have the same layout.<br />
<br />
For a more detailed discussion and examples of these file formats please see [[Index File Structure]].<br />
<br />
==Document Path Index: {{Code|pth}}==<br />
<br />
Provides an index of all document paths in the database. For databases with a large number of paths, this file can become quite large, so it is only generated the first time a function requesting a path lookup is run. For databases in which path lookups are never used, this file will not exist.<br />
<br />
'''Note:''' On Windows/Mac systems this file is case insensitive (all paths are lower case). On UNIX-like systems this file is case sensitive. The behaviour of path lookups will therefore vary between systems, and copying this file between system types may lead to unexpected behaviour.<br />
<br />
==ID/Pre Mapping: {{Code|idp}}==<br />
<br />
This file is only created if incremental indexing (UPDINDEX) is enabled for a database. It is used to provide a quick lookup of the pre value for a database node ID.<br />
<br />
==Full-Text Fuzzy Index: {{Code|ftxx}}, {{Code|ftxy}}, {{Code|ftxz}}==<br />
<br />
...may soon be reimplemented.</div>James Ballhttps://docs.basex.org/index.php?title=File_Module&diff=15310File Module2020-11-15T17:48:10Z<p>James Ball: file:size, corrected the parameter name from $file to $path to match Summary</p>
<hr />
<div>This [[Module Library|XQuery Module]] contains functions related to file system operations, such as listing, reading, or writing files.<br />
<br />
This module is based on the [http://expath.org/spec/file EXPath File Module]. The following enhancements have not been added to the specification yet:<br />
<br />
{| class="wikitable"<br />
|- valign="top"<br />
! Function<br />
! Description<br />
|- valign="top"<br />
| [[#file:descendants|file:descendants]]<br />
| new function<br />
|- valign="top"<br />
| [[#file:is-absolute|file:is-absolute]]<br />
| new function<br />
|- valign="top"<br />
| [[#file:read-text|file:read-text]], [[#file:read-text-lines|file:read-text-lines]]<br />
| <code>$fallback</code> argument added<br />
|- valign="top"<br />
| [[#file:read-text-lines|file:read-text-lines]]<br />
| <code>$offset</code> and <code>$length</code> arguments added<br />
|- valign="top"<br />
| [[#file:resolve-path|file:resolve-path]]<br />
| <code>$base</code> argument added<br />
|}<br />
<br />
=Conventions=<br />
<br />
All functions and errors in this module are assigned to the <code><nowiki>http://expath.org/ns/file</nowiki></code> namespace, which is statically bound to the {{Code|file}} prefix.<br/><br />
<br />
For serialization parameters, the <code><nowiki>http://www.w3.org/2010/xslt-xquery-serialization</nowiki></code> namespace is used, which is statically bound to the {{Code|output}} prefix.<br/><br />
<br />
The error <code>[[#Errors|invalid-path]]</code> is raised if a path is invalid.<br />
<br />
==File Paths==<br />
<br />
* All file paths are resolved against the ''current working directory'' (the directory from which BaseX or, more generally, the Java Virtual Machine, was started). This directory can be retrieved via [[#file:base-dir|file:base-dir]].<br />
<br />
* A path can be specified as local filesystem path or as file URI.<br />
<br />
* Returned strings that refer to existing directories are suffixed with a directory separator.<br />
<br />
=Read Operations=<br />
<br />
==file:list==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|file:list|$dir as xs:string|xs:string*}}<br />{{Func|file:list|$dir as xs:string, $recursive as xs:boolean|xs:string*}}<br />{{Func|file:list|$dir as xs:string, $recursive as xs:boolean, $pattern as xs:string|xs:string*}}<br /><br />
|-<br />
| '''Summary'''<br />
|Lists all files and directories found in the specified {{Code|$dir}}. The returned paths are relative to the provided path.<br />The optional parameter {{Code|$recursive}} specifies whether sub-directories will be traversed, too.<br />The optional parameter {{Code|$pattern}} defines a file name pattern in the [[Commands#Glob_Syntax|Glob Syntax]]. If present, only those files and directories are returned that correspond to the pattern. Several patterns can be separated with a comma ({{Code|,}}).<br /><br />
|-<br />
| '''Errors'''<br />
|{{Error|not-found|#Errors}} the specified file does not exist.<br />{{Error|no-dir|#Errors}} the specified path does not point to a directory.<br />{{Error|io-error|#Errors}} the operation fails for some other reason.<br /><br />
|}<br />
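<br />
A recursive listing with several patterns might look as follows. This is a minimal sketch; the directory and the file suffixes are only examples:<br />
<br />
<syntaxhighlight lang="xquery"><br />
(: all XML and XQuery files in and below the current directory, as relative paths :)<br />
file:list('.', true(), '*.xml,*.xq')<br />
</syntaxhighlight><br />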
<br />
==file:children==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|file:children|$dir as xs:string|xs:string*}}<br />
|-<br />
| '''Summary'''<br />
|Returns the full paths to all files and directories found in the specified {{Code|$dir}}.<br/>The inverse function is [[#file:parent|file:parent]]. The related function [[#file:list|file:list]] returns relative file paths.<br />
|-<br />
| '''Errors'''<br />
|{{Error|not-found|#Errors}} the specified file does not exist.<br />{{Error|no-dir|#Errors}} the specified path does not point to a directory.<br />{{Error|io-error|#Errors}} the operation fails for some other reason.<br /><br />
|}<br />
<br />
==file:descendants==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|file:descendants|$dir as xs:string|xs:string*}}<br />
|-<br />
| '''Summary'''<br />
|Returns the full paths to all files and directories found in the specified {{Code|$dir}} and its sub-directories.<br/>The related function [[#file:list|file:list]] returns relative file paths.<br />
|-<br />
| '''Errors'''<br />
|{{Error|not-found|#Errors}} the specified file does not exist.<br />{{Error|no-dir|#Errors}} the specified path does not point to a directory.<br />{{Error|io-error|#Errors}} the operation fails for some other reason.<br /><br />
|}<br />
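<br />
As the function returns full paths, the result can be filtered with standard string functions. A minimal sketch, in which the directory and the file suffix are only examples:<br />
<br />
<syntaxhighlight lang="xquery"><br />
(: full paths of all XML files in and below the current directory :)<br />
for $path in file:descendants('.')<br />
where ends-with($path, '.xml')<br />
return $path<br />
</syntaxhighlight><br />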
<br />
==file:read-binary==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|file:read-binary|$path as xs:string|xs:base64Binary}}<br />{{Func|file:read-binary|$path as xs:string, $offset as xs:integer|xs:base64Binary}}<br />{{Func|file:read-binary|$path as xs:string, $offset as xs:integer, $length as xs:integer|xs:base64Binary}}<br />
|-<br />
| '''Summary'''<br />
|Reads the binary content of the file specified by {{Code|$path}} and returns it as [[Lazy Module|lazy]] {{Code|xs:base64Binary}} item.<br />The optional parameters {{Code|$offset}} and {{Code|$length}} can be used to read chunks of a file.<br />
|-<br />
| '''Errors'''<br />
|{{Error|not-found|#Errors}} the specified file does not exist.<br />{{Error|is-dir|#Errors}} the specified path is a directory.<br />{{Error|out-of-range|#Errors}} the offset or length is negative, or the chosen values would exceed the file bounds.<br />{{Error|io-error|#Errors}} the operation fails for some other reason.<br /><br />
|-<br />
| '''Examples'''<br />
|<br />
* <code><nowiki>lazy:cache(file:read-binary("config.data"))</nowiki></code> enforces the file access (otherwise, it will be delayed until first requested).<br />
|}<br />
<br />
==file:read-text==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|file:read-text|$path as xs:string|xs:string}}<br />{{Func|file:read-text|$path as xs:string, $encoding as xs:string|xs:string}}<br />{{Func|file:read-text|$path as xs:string, $encoding as xs:string, $fallback as xs:boolean|xs:string}}<br /><br />
|-<br />
| '''Summary'''<br />
|Reads the textual contents of the file specified by {{Code|$path}} and returns it as [[Lazy Module|lazy]] {{Code|xs:string}} item:<br />
* The UTF-8 default encoding can be overwritten with the optional {{Code|$encoding}} argument.<br />
* By default, invalid characters will be rejected. If {{Code|$fallback}} is set to true, these characters will be replaced with the Unicode replacement character <code>FFFD</code> (&#xFFFD;).<br />
|-<br />
| '''Errors'''<br />
|{{Error|not-found|#Errors}} the specified file does not exist.<br />{{Error|is-dir|#Errors}} the specified path is a directory.<br />{{Error|unknown-encoding|#Errors}} the specified encoding is not supported, or unknown.<br />{{Error|io-error|#Errors}} the operation fails for some other reason.<br />
|-<br />
| '''Examples'''<br />
|<br />
* <code><nowiki>lazy:cache(file:read-text("ids.txt"))</nowiki></code> enforces the file access (otherwise, it will be delayed until first requested).<br />
|}<br />
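<br />
The optional arguments could be used as follows. This is a minimal sketch; it assumes that a file <code>ids.txt</code> exists and that the chosen encoding is supported by the Java runtime:<br />
<br />
<syntaxhighlight lang="xquery"><br />
(: reads a Latin-1 encoded file :)<br />
file:read-text('ids.txt', 'ISO-8859-1'),<br />
(: reads a UTF-8 file, replacing invalid characters instead of raising an error :)<br />
file:read-text('ids.txt', 'UTF-8', true())<br />
</syntaxhighlight><br />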
<br />
==file:read-text-lines==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|file:read-text-lines|$path as xs:string|xs:string*}}<br />{{Func|file:read-text-lines|$path as xs:string, $encoding as xs:string|xs:string*}}<br />{{Func|file:read-text-lines|$path as xs:string, $encoding as xs:string, $fallback as xs:boolean|xs:string*}}<br />{{Func|file:read-text-lines|$path as xs:string, $encoding as xs:string, $fallback as xs:boolean, $offset as xs:integer|xs:string*}}<br />{{Func|file:read-text-lines|$path as xs:string, $encoding as xs:string, $fallback as xs:boolean, $offset as xs:integer, $length as xs:integer|xs:string*}}<br /><br />
|-<br />
| '''Summary'''<br />
|Reads the textual contents of the file specified by {{Code|$path}} and returns it as a sequence of {{Code|xs:string}} items:<br />
* The UTF-8 default encoding can be overwritten with the optional {{Code|$encoding}} argument.<br />
* By default, invalid characters will be rejected. If {{Code|$fallback}} is set to true, these characters will be replaced with the Unicode replacement character <code>FFFD</code> (&#xFFFD;).<br />
The lines to be read can be restricted with the optional parameters {{Code|$offset}} and {{Code|$length}}.<br />
|-<br />
| '''Errors'''<br />
|{{Error|not-found|#Errors}} the specified file does not exist.<br />{{Error|is-dir|#Errors}} the specified path is a directory.<br />{{Error|unknown-encoding|#Errors}} the specified encoding is not supported, or unknown.<br />{{Error|io-error|#Errors}} the operation fails for some other reason.<br /><br />
|}<br />
<br />
=Write Operations=<br />
<br />
==file:create-dir==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|file:create-dir|$dir as xs:string|empty-sequence()}}<br /><br />
|-<br />
| '''Summary'''<br />
|Creates the directory specified by {{Code|$dir}} if it does not already exist. Non-existing parent directories will be created as well.<br /><br />
|-<br />
| '''Errors'''<br />
|{{Error|exists|#Errors}} the specified target exists, but is not a directory.<br />{{Error|io-error|#Errors}} the operation fails for some other reason.<br /><br />
|}<br />
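<br />
A nested directory can be created in a single call. A minimal sketch, in which the directory names are hypothetical and the two expressions are assumed to be evaluated in the given order:<br />
<br />
<syntaxhighlight lang="xquery"><br />
(: creates 'reports' and 'reports/2024' if they do not exist yet :)<br />
file:create-dir('reports/2024'),<br />
(: checks that the directory is available now :)<br />
file:is-dir('reports/2024')<br />
</syntaxhighlight><br />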
<br />
==file:create-temp-dir==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|file:create-temp-dir|$prefix as xs:string, $suffix as xs:string|xs:string}}<br />{{Func|file:create-temp-dir|$prefix as xs:string, $suffix as xs:string, $dir as xs:string|xs:string}}<br />
|-<br />
| '''Summary'''<br />
|Creates a new temporary directory that did not exist before this function was called, and returns its full file path. The directory name begins and ends with the specified {{Code|$prefix}} and {{Code|$suffix}}. If no directory is specified via {{Code|$dir}}, the directory will be placed in the system’s default temporary directory. The operation will create all non-existing parent directories.<br />
|-<br />
| '''Errors'''<br />
|{{Error|no-dir|#Errors}} the specified directory points to a file.<br />{{Error|io-error|#Errors}} the directory could not be created.<br />
|}<br />
<br />
==file:create-temp-file==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|file:create-temp-file|$prefix as xs:string, $suffix as xs:string|xs:string}}<br />{{Func|file:create-temp-file|$prefix as xs:string, $suffix as xs:string, $dir as xs:string|xs:string}}<br />
|-<br />
| '''Summary'''<br />
|Creates a new temporary file that did not exist before this function was called, and returns its full file path. The file name begins and ends with the specified {{Code|$prefix}} and {{Code|$suffix}}. If no directory is specified via {{Code|$dir}}, the file will be placed in the system’s default temporary directory. The operation will create all non-existing parent directories.<br />
|-<br />
| '''Errors'''<br />
|{{Error|no-dir|#Errors}} the specified directory points to a file.<br />{{Error|io-error|#Errors}} the directory could not be created.<br />
|}<br />
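<br />
The returned path can be passed on to other functions of this module. A minimal sketch, in which the prefix and suffix are only examples:<br />
<br />
<syntaxhighlight lang="xquery"><br />
(: creates a temporary file, writes a string to it and returns its path :)<br />
let $path := file:create-temp-file('export-', '.txt')<br />
return (<br />
  file:write-text($path, 'Hello'),<br />
  $path<br />
)<br />
</syntaxhighlight><br />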
<br />
==file:delete==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|file:delete|$path as xs:string|empty-sequence()}}<br />{{Func|file:delete|$path as xs:string, $recursive as xs:boolean|empty-sequence()}}<br /><br />
|-<br />
| '''Summary'''<br />
|Deletes a file or directory specified by {{Code|$path}}.<br />The optional parameter {{Code|$recursive}} specifies whether sub-directories will be deleted, too.<br /><br />
|-<br />
| '''Errors'''<br />
|{{Error|not-found|#Errors}} the specified path does not exist.<br />{{Error|io-error|#Errors}} the operation fails for some other reason.<br /><br />
|}<br />
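<br />
As {{Error|not-found|#Errors}} is raised for non-existing paths, it may be useful to check for existence first. A minimal sketch, in which the directory name is hypothetical:<br />
<br />
<syntaxhighlight lang="xquery"><br />
(: deletes the directory and everything below it, if it exists :)<br />
let $dir := 'reports'<br />
return if (file:exists($dir)) then file:delete($dir, true()) else ()<br />
</syntaxhighlight><br />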
<br />
==file:write==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|file:write|$path as xs:string, $items as item()*|empty-sequence()}}<br />{{Func|file:write|$path as xs:string, $items as item()*, $params as item()|empty-sequence()}}<br /><br />
|-<br />
| '''Summary'''<br />
|Writes a serialized sequence of items to the specified file. If the file already exists, it will be overwritten.<br />The {{Code|$params}} argument contains [[Serialization|serialization parameters]]. As with [https://www.w3.org/TR/xpath-functions-31/#func-serialize fn:serialize()], the parameters can be specified<br /><br />
* either as children of an {{Code|&lt;output:serialization-parameters/&gt;}} element:<br />
<syntaxhighlight lang="xml"><br />
<output:serialization-parameters><br />
<output:method value='xml'/><br />
<output:cdata-section-elements value="div"/><br />
...<br />
</output:serialization-parameters><br />
</syntaxhighlight><br />
* or as map, which contains all key/value pairs:<br />
<syntaxhighlight lang="xquery"><br />
map { "method": "xml", "cdata-section-elements": "div", ... }<br />
</syntaxhighlight><br />
|-<br />
| '''Errors'''<br />
|{{Error|no-dir|#Errors}} the parent of the specified path is not a directory.<br />{{Error|is-dir|#Errors}} the specified path is a directory.<br />{{Error|io-error|#Errors}} the operation fails for some other reason.<br /><br />
|-<br />
| '''Examples'''<br />
|<br />
* <code><nowiki>file:write('data.bin', xs:hexBinary('414243'))</nowiki></code> writes a hex representation to the specified file.<br />
* <code><nowiki>file:write('data.bin', xs:hexBinary('414243'), map { 'method': 'basex' })</nowiki></code> writes binary data to the specified file (see [[XQuery_Extensions#Serialization|Serialization]] for more details).<br />
|}<br />
<br />
==file:write-binary==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|file:write-binary|$path as xs:string, $value as xs:anyAtomicType|empty-sequence()}}<br />{{Func|file:write-binary|$path as xs:string, $value as xs:anyAtomicType, $offset as xs:integer|empty-sequence()}}<br /><br />
|-<br />
| '''Summary'''<br />
|Writes a binary item (xs:base64Binary, xs:hexBinary) to the specified file. If the file already exists, it will be overwritten.<br />If {{Code|$offset}} is specified, data will be written at this file position. An existing file may be resized by that operation.<br />
|-<br />
| '''Errors'''<br />
|{{Error|no-dir|#Errors}} the parent of the specified path is not a directory.<br />{{Error|is-dir|#Errors}} the specified path is a directory.<br />{{Error|out-of-range|#Errors}} the offset is negative, or it exceeds the current file size.<br/>{{Error|io-error|#Errors}} the operation fails for some other reason.<br /><br />
|}<br />
<br />
==file:write-text==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|file:write-text|$path as xs:string, $value as xs:string|empty-sequence()}}<br />{{Func|file:write-text|$path as xs:string, $value as xs:string, $encoding as xs:string|empty-sequence()}}<br /><br />
|-<br />
| '''Summary'''<br />
|Writes a string to the specified file. If the file already exists, it will be overwritten.<br />The optional parameter {{Code|$encoding}} defines the output encoding (default: UTF-8).<br /><br />
|-<br />
| '''Errors'''<br />
|{{Error|no-dir|#Errors}} the parent of the specified path is not a directory.<br />{{Error|is-dir|#Errors}} the specified path is a directory.<br />{{Error|unknown-encoding|#Errors}} the specified encoding is not supported, or unknown.<br />{{Error|io-error|#Errors}} the operation fails for some other reason.<br /><br />
|}<br />
<br />
==file:write-text-lines==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|file:write-text-lines|$path as xs:string, $values as xs:string*|empty-sequence()}}<br />{{Func|file:write-text-lines|$path as xs:string, $values as xs:string*, $encoding as xs:string|empty-sequence()}}<br /><br />
|-<br />
| '''Summary'''<br />
|Writes a sequence of strings to the specified file, each followed by the system-specific newline character. If the file already exists, it will be overwritten.<br />The optional parameter {{Code|$encoding}} defines the output encoding (default: UTF-8).<br /><br />
|-<br />
| '''Errors'''<br />
|{{Error|no-dir|#Errors}} the parent of the specified path is not a directory.<br />{{Error|is-dir|#Errors}} the specified path is a directory.<br />{{Error|unknown-encoding|#Errors}} the specified encoding is not supported, or unknown.<br />{{Error|io-error|#Errors}} the operation fails for some other reason.<br /><br />
|}<br />
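<br />
Written lines can be read back with [[#file:read-text-lines|file:read-text-lines]]. A minimal sketch, in which the file name and values are hypothetical and the two expressions are assumed to be evaluated in the given order:<br />
<br />
<syntaxhighlight lang="xquery"><br />
(: writes three lines to the file, replacing any existing content :)<br />
file:write-text-lines('names.txt', ('Ann', 'Bob', 'Carl')),<br />
(: reads the lines back as a sequence of strings :)<br />
file:read-text-lines('names.txt')<br />
</syntaxhighlight><br />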
<br />
==file:append==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|file:append|$path as xs:string, $items as item()*|empty-sequence()}}<br />{{Func|file:append|$path as xs:string, $items as item()*, $params as item()|empty-sequence()}}<br /><br />
|-<br />
| '''Summary'''<br />
|Appends a serialized sequence of items to the specified file. If the file does not exist, a new file is created.<br /><br />
|-<br />
| '''Errors'''<br />
|{{Error|no-dir|#Errors}} the parent of the specified path is not a directory.<br />{{Error|is-dir|#Errors}} the specified path is a directory.<br />{{Error|io-error|#Errors}} the operation fails for some other reason.<br /><br />
|}<br />
<br />
==file:append-binary==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|file:append-binary|$path as xs:string, $value as xs:anyAtomicType|empty-sequence()}}<br /><br />
|-<br />
| '''Summary'''<br />
|Appends a binary item (xs:base64Binary, xs:hexBinary) to the specified file. If the file does not exist, a new one is created.<br /><br />
|-<br />
| '''Errors'''<br />
|{{Error|no-dir|#Errors}} the parent of the specified path is not a directory.<br />{{Error|is-dir|#Errors}} the specified path is a directory.<br />{{Error|io-error|#Errors}} the operation fails for some other reason.<br /><br />
|}<br />
<br />
==file:append-text==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|file:append-text|$path as xs:string, $value as xs:string|empty-sequence()}}<br />{{Func|file:append-text|$path as xs:string, $value as xs:string, $encoding as xs:string|empty-sequence()}}<br /><br />
|-<br />
| '''Summary'''<br />
|Appends a string to a file specified by {{Code|$path}}. If the specified file does not exist, a new file is created.<br />The optional parameter {{Code|$encoding}} defines the output encoding (default: UTF-8).<br /><br />
|-<br />
| '''Errors'''<br />
|{{Error|no-dir|#Errors}} the parent of the specified path is not a directory.<br />{{Error|is-dir|#Errors}} the specified path is a directory.<br />{{Error|unknown-encoding|#Errors}} the specified encoding is not supported, or unknown.<br />{{Error|io-error|#Errors}} the operation fails for some other reason.<br /><br />
|}<br />
<br />
==file:append-text-lines==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|file:append-text-lines|$path as xs:string, $values as xs:string*|empty-sequence()}}<br />{{Func|file:append-text-lines|$path as xs:string, $values as xs:string*, $encoding as xs:string|empty-sequence()}}<br /><br />
|-<br />
| '''Summary'''<br />
|Appends a sequence of strings to the specified file, each followed by the system-specific newline character. If the specified file does not exist, a new file is created.<br />The optional parameter {{Code|$encoding}} defines the output encoding (default: UTF-8).<br /><br />
|-<br />
| '''Errors'''<br />
|{{Error|no-dir|#Errors}} the parent of the specified path is not a directory.<br />{{Error|is-dir|#Errors}} the specified path is a directory.<br />{{Error|unknown-encoding|#Errors}} the specified encoding is not supported, or unknown.<br />{{Error|io-error|#Errors}} the operation fails for some other reason.<br /><br />
|}<br />
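<br />
The append functions are convenient for simple log files. A minimal sketch, in which the file name and the message are hypothetical:<br />
<br />
<syntaxhighlight lang="xquery"><br />
(: appends a timestamped entry; the file is created if it does not exist yet :)<br />
file:append-text-lines('app.log', current-dateTime() || ' query started')<br />
</syntaxhighlight><br />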
<br />
==file:copy==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|file:copy|$source as xs:string, $target as xs:string|empty-sequence()}}<br /><br />
|-<br />
| '''Summary'''<br />
|Copies a file or directory specified by {{Code|$source}} to the file or directory specified by {{Code|$target}}. If the target file already exists, it will be overwritten. No operation will be performed if the source and target path are equal.<br /><br />
|-<br />
| '''Errors'''<br />
|{{Error|not-found|#Errors}} the specified source does not exist.<br />{{Error|exists|#Errors}} the specified source is a directory and the target is a file.<br />{{Error|no-dir|#Errors}} the parent of the specified target is not a directory.<br />{{Error|io-error|#Errors}} the operation fails for some other reason.<br /><br />
|}<br />
<br />
==file:move==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|file:move|$source as xs:string, $target as xs:string|empty-sequence()}}<br /><br />
|-<br />
| '''Summary'''<br />
|Moves or renames the file or directory specified by {{Code|$source}} to the path specified by {{Code|$target}}. If the target file already exists, it will be overwritten. No operation will be performed if the source and target path are equal.<br /><br />
|-<br />
| '''Errors'''<br />
|{{Error|not-found|#Errors}} the specified source does not exist.<br />{{Error|exists|#Errors}} the specified source is a directory and the target is a file.<br />{{Error|no-dir|#Errors}} the parent of the specified target is not a directory.<br />{{Error|io-error|#Errors}} the operation fails for some other reason.<br /><br />
|}<br />
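<br />
[[#file:copy|file:copy]] and [[#file:move|file:move]] can be combined, for example to keep a backup before a file is replaced. A minimal sketch, in which all file names are hypothetical and the two operations are assumed to be evaluated in the given order:<br />
<br />
<syntaxhighlight lang="xquery"><br />
let $file := 'config.xml'<br />
return (<br />
  (: keeps a copy of the current version :)<br />
  file:copy($file, $file || '.bak'),<br />
  (: replaces the current version with the new one :)<br />
  file:move('config-new.xml', $file)<br />
)<br />
</syntaxhighlight><br />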
<br />
=File Properties=<br />
<br />
==file:exists==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|file:exists|$path as xs:string|xs:boolean}}<br /><br />
|-<br />
| '''Summary'''<br />
|Returns an {{Code|xs:boolean}} indicating whether a file or directory specified by {{Code|$path}} exists in the file system.<br /><br />
|}<br />
<br />
==file:is-dir==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|file:is-dir|$path as xs:string|xs:boolean}}<br /><br />
|-<br />
| '''Summary'''<br />
|Returns an {{Code|xs:boolean}} indicating whether the argument {{Code|$path}} points to an existing directory.<br /><br />
|}<br />
<br />
==file:is-absolute==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|file:is-absolute|$path as xs:string|xs:boolean}}<br /><br />
|-<br />
| '''Summary'''<br />
|Returns an {{Code|xs:boolean}} indicating whether the argument {{Code|$path}} is absolute.<br />The behavior of this function depends on the operating system: On Windows, an absolute path starts with the drive letter and a colon, whereas on Linux it starts with a slash.<br />
|}<br />
<br />
==file:is-file==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|file:is-file|$path as xs:string|xs:boolean}}<br /><br />
|-<br />
| '''Summary'''<br />
|Returns an {{Code|xs:boolean}} indicating whether the argument {{Code|$path}} points to an existing file.<br /><br />
|}<br />
<br />
==file:last-modified==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|file:last-modified|$path as xs:string|xs:dateTime}}<br /><br />
|-<br />
| '''Summary'''<br />
|Retrieves the timestamp of the last modification of the file or directory specified by {{Code|$path}}.<br /><br />
|-<br />
| '''Errors'''<br />
|{{Error|not-found|#Errors}} the specified path does not exist.<br /><br />
|}<br />
<br />
==file:size==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|file:size|$path as xs:string|xs:integer}}<br /><br />
|-<br />
| '''Summary'''<br />
|Returns the size, in bytes, of the file specified by {{Code|$path}}, or {{Code|0}} for directories.<br /><br />
|-<br />
| '''Errors'''<br />
|{{Error|not-found|#Errors}} the specified file does not exist.<br /><br />
|}<br />
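
As an illustration, the property functions can be combined to describe a single path. This is a minimal sketch that inspects the temporary directory; the map keys are arbitrary labels and not part of the module:

<syntaxhighlight lang="xquery">
let $path := file:temp-dir()
return if (file:exists($path)) then map {
  'is-dir': file:is-dir($path),
  'is-file': file:is-file($path),
  'absolute': file:is-absolute($path),
  'modified': file:last-modified($path),
  'size': file:size($path)  (: 0 for directories :)
} else (
  'not found: ' || $path
)
</syntaxhighlight>

Checking {{Code|file:exists}} first avoids the {{Code|not-found}} error that {{Code|file:last-modified}} and {{Code|file:size}} raise for missing paths.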
<br />
=Path Functions=<br />
<br />
==file:name==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|file:name|$path as xs:string|xs:string}}<br />
|-<br />
| '''Summary'''<br />
|Returns the name of a file or directory specified by {{Code|$path}}. An empty string is returned if the path points to the root directory.<br />
|}<br />
<br />
==file:parent==<br />
<br />
{| width='100%'<br />
<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|file:parent|$path as xs:string|xs:string?}}<br /><br />
|-<br />
| '''Summary'''<br />
|Returns the absolute path to the parent directory of a file or directory specified by {{Code|$path}}. An empty sequence is returned if the path points to a root directory.<br/>The inverse function is [[#file:children|file:children]].<br /><br />
|-<br />
| '''Examples'''<br />
|<br />
* <code><nowiki>file:parent(static-base-uri())</nowiki></code> returns the directory of the current XQuery module.<br />
|}<br />
<br />
==file:path-to-native==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|file:path-to-native|$path as xs:string|xs:string}}<br /><br />
|-<br />
| '''Summary'''<br />
|Transforms the {{Code|$path}} argument to its native representation on the operating system.<br /><br />
|-<br />
| '''Errors'''<br />
|{{Error|not-found|#Errors}} the specified file does not exist.<br />{{Error|io-error|#Errors}} the specified path cannot be transformed to its native representation.<br /><br />
|}<br />
<br />
==file:resolve-path==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|file:resolve-path|$path as xs:string|xs:string}}<br />{{Func|file:resolve-path|$path as xs:string, $base as xs:string|xs:string}}<br /><br />
|-<br />
| '''Summary'''<br />
|Transforms the {{Code|$path}} argument to an absolute operating system path.<br />If the path is relative, and if an absolute {{Code|$base}} path is specified, it will be resolved against this path.<br />
|-<br />
| '''Errors'''<br />
|{{Error|is-relative|#Errors}} the specified base path is relative.<br /><br />
|-<br />
| '''Examples'''<br />
|The following examples apply to Windows:<br />
* {{Code|file:resolve-path('file.txt', 'C:/Temp/')}} returns {{Code|C:/Temp/file.txt}}.<br />
* {{Code|file:resolve-path('file.txt', 'C:/Temp')}} returns {{Code|C:/file.txt}}.<br />
* {{Code|file:resolve-path('file.txt', 'Temp')}} raises an error.<br />
|}<br />
<br />
==file:path-to-uri==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|file:path-to-uri|$path as xs:string|xs:string}}<br /><br />
|-<br />
| '''Summary'''<br />
|Transforms the path specified by {{Code|$path}} into a URI with the {{Code|file://}} scheme.<br /><br />
|}<br />
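
The path functions can be chained, as the following sketch shows. The relative path {{Code|notes/todo.txt}} is hypothetical, and the example assumes that these functions can be applied to paths that do not exist in the file system:

<syntaxhighlight lang="xquery">
(: resolve a relative path against an absolute base directory :)
let $path := file:resolve-path('notes/todo.txt', file:temp-dir())
return (
  $path,
  file:name($path),  (: last path segment: 'todo.txt' :)
  file:parent($path),
  file:path-to-uri($path)
)
</syntaxhighlight>

The exact strings depend on the operating system and the location of the temporary directory.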
<br />
=System Properties=<br />
<br />
==file:dir-separator==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|file:dir-separator||xs:string}}<br /><br />
|-<br />
| '''Summary'''<br />
|Returns the directory separator used by the operating system, such as {{Code|/}} or {{Code|\}}.<br /><br />
|}<br />
<br />
==file:path-separator==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|file:path-separator||xs:string}}<br /><br />
|-<br />
| '''Summary'''<br />
|Returns the path separator used by the operating system, such as {{Code|;}} or {{Code|:}}.<br /><br />
|}<br />
<br />
==file:line-separator==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|file:line-separator||xs:string}}<br />
|-<br />
| '''Summary'''<br />
|Returns the line separator used by the operating system, such as {{Code|&amp;#10;}}, {{Code|&amp;#13;&amp;#10;}} or {{Code|&amp;#13;}}.<br /><br />
|}<br />
<br />
==file:temp-dir==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|file:temp-dir||xs:string}}<br />
|-<br />
| '''Summary'''<br />
|Returns the system’s default temporary-file directory.<br /><br />
|}<br />
<br />
==file:current-dir==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|file:current-dir||xs:string}}<br />
|-<br />
| '''Summary'''<br />
|Returns the current working directory. This function returns the same result as the function call <code>file:resolve-path("")</code>.<br />
|}<br />
<br />
==file:base-dir==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|file:base-dir||xs:string?}}<br />
|-<br />
| '''Summary'''<br />
|Returns the parent directory of the static base URI. If the Base URI property is undefined, the empty sequence is returned. If a static base URI exists, and if it points to a local file path, this function returns the same result as the expression {{Code|file:parent(static-base-uri())}}.<br />
|}<br />
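
A short sketch that prints the system properties of the current environment, one per line. The labels are arbitrary, and the output differs between operating systems:

<syntaxhighlight lang="xquery">
string-join((
  'dir separator:  ' || file:dir-separator(),
  'path separator: ' || file:path-separator(),
  'temp dir:       ' || file:temp-dir(),
  'current dir:    ' || file:current-dir(),
  (: file:base-dir() may return an empty sequence, which is concatenated as an empty string :)
  'base dir:       ' || file:base-dir()
), file:line-separator())
</syntaxhighlight>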
<br />
=Errors=<br />
<br />
{| class="wikitable" width="100%"<br />
! width="160"|Code<br />
!Description<br />
|-<br />
|{{Code|exists}}<br />
|A file with the same path already exists.<br />
|-<br />
|{{Code|invalid-path}}<br />
|A specified path is invalid.<br />
|-<br />
|{{Code|io-error}}<br />
|The operation fails for some other reason specific to the operating system.<br />
|-<br />
|{{Code|is-dir}}<br />
|The specified path is a directory.<br />
|-<br />
|{{Code|is-relative}}<br />
|The specified path is relative (and must be absolute).<br />
|-<br />
|{{Code|no-dir}}<br />
|The specified path does not point to a directory.<br />
|-<br />
|{{Code|not-found}}<br />
|A specified path does not exist.<br />
|-<br />
|{{Code|out-of-range}}<br />
|The specified offset or length is negative, or the chosen values would exceed the file bounds.<br />
|-<br />
|{{Code|unknown-encoding}}<br />
|The specified encoding is not supported, or unknown.<br />
|}<br />
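
The error codes can be caught with the {{Code|try}}/{{Code|catch}} construct of XQuery 3.0. The following sketch assumes that the (hypothetical) file does not exist and that the {{Code|file}} prefix is bound to the module namespace:

<syntaxhighlight lang="xquery">
let $path := file:temp-dir() || 'no-such-file-example.txt'  (: hypothetical; assumed to be missing :)
return try {
  file:size($path)
} catch file:not-found {
  'Missing: ' || $path
} catch file:* {
  (: any other error of this module :)
  'File error ' || $err:code || ': ' || $err:description
}
</syntaxhighlight>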
<br />
=Changelog=<br />
<br />
;Version 9.3<br />
* Added: [[#file:descendants|file:descendants]]<br />
<br />
;Version 9.0<br />
* Updated: [[#file:read-text-lines|file:read-text-lines]]: <code>$offset</code> and <code>$length</code> arguments added.<br />
<br />
;Version 8.5<br />
* Updated: [[#file:read-text|file:read-text]], [[#file:read-text-lines|file:read-text-lines]]: <code>$fallback</code> argument added.<br />
<br />
;Version 8.2<br />
* Added: [[#file:is-absolute|file:is-absolute]]<br />
* Updated: [[#file:resolve-path|file:resolve-path]]: base argument added<br />
<br />
;Version 8.0<br />
* Added: [[#file:current-dir|file:current-dir]], [[#file:base-dir|file:base-dir]], [[#file:children|file:children]]<br />
<br />
;Version 7.8<br />
* Added: [[#file:parent|file:parent]], [[#file:name|file:name]]<br />
* Updated: error codes; [[#file:read-binary|file:read-binary]], [[#file:write-binary|file:write-binary]]: {{Code|$offset}} and {{Code|$length}} arguments added.<br />
* Deleted: file:base-name, file:dir-name<br />
<br />
;Version 7.7<br />
* Added: [[#file:create-temp-dir|file:create-temp-dir]], [[#file:create-temp-file|file:create-temp-file]], [[#file:temp-dir|file:temp-dir]]<br />
* Updated: all returned strings that refer to existing directories will be suffixed with a directory separator.<br />
<br />
;Version 7.3<br />
* Added: [[#file:append-text|file:append-text]], [[#file:write-text|file:write-text]], [[#file:append-text-lines|file:append-text-lines]], [[#file:write-text-lines|file:write-text-lines]], [[#file:line-separator|file:line-separator]]<br />
* Aligned with latest specification: $file:directory-separator → [[#file:dir-separator|file:dir-separator]], $file:path-separator → [[#file:path-separator|file:path-separator]], file:is-directory → [[#file:is-dir|file:is-dir]], file:create-directory → [[#file:create-dir|file:create-dir]]<br />
* Updated: [[#file:write-binary|file:write-binary]], [[#file:append-binary|file:append-binary]]: output limited to a single value<br />
<br />
;Version 7.2.1<br />
* Updated: [[#file:delete|file:delete]]: {{Code|$recursive}} parameter added to prevent sub-directories from being accidentally deleted.<br />
* Fixed: [[#file:list|file:list]] now returns relative instead of absolute paths.</div>
James Ball
https://docs.basex.org/index.php?title=XQuery_3.0&diff=15265
XQuery 3.0
2020-07-28T15:24:04Z
<p>James Ball: Correction of code example in Count where angle brackets were showing entities</p>
<hr />
<div>This article is part of the [[XQuery|XQuery Portal]]. It provides a summary of the most important features of the [https://www.w3.org/TR/xquery-30/ XQuery 3.0] Recommendation.<br />
<br />
=Enhanced FLWOR Expressions=<br />
<br />
Most clauses of FLWOR expressions can be specified in an arbitrary order: additional {{Code|let}} and {{Code|for}} clauses can be put after a {{Code|where}} clause, and multiple {{Code|where}}, {{Code|order by}} and {{Code|group by}} statements can be used. This means that many nested loops can now be rewritten to a single FLWOR expression.<br />
<br />
'''Example:''' <br />
<syntaxhighlight lang="xquery"><br />
for $country in db:open('factbook')//country<br />
where $country/@population > 100000000<br />
for $city in $country//city[population > 1000000]<br />
group by $name := $country/name[1]<br />
count $id<br />
return <country id='{ $id }' name='{ $name }'>{ $city/name }</country><br />
</syntaxhighlight><br />
<br />
==group by==<br />
<br />
FLWOR expressions have been extended to include the [https://www.w3.org/TR/xquery-30/#id-group-by group by] clause, which is well-established in SQL. <code>group by</code> can be used to apply value-based partitioning to query results:<br />
<br />
'''XQuery:''' <br />
<syntaxhighlight lang="xquery"><br />
for $ppl in doc('xmark')//people/person <br />
let $ic := $ppl/profile/@income<br />
let $income := if($ic < 30000) then<br />
"challenge" <br />
else if($ic >= 30000 and $ic < 100000) then <br />
"standard" <br />
else if($ic >= 100000) then <br />
"preferred" <br />
else <br />
"na" <br />
group by $income<br />
order by $income<br />
return element { $income } { count($ppl) }<br />
</syntaxhighlight> <br />
<br />
This query is a rewrite of [https://www.ins.cwi.nl/projects/xmark/Assets/xmlquery.txt Query #20] contained in the [https://projects.cwi.nl/xmark/ XMark Benchmark Suite] to use <code>group by</code>.<br />
The query partitions the customers based on their income. <br />
<br />
'''Result:''' <br />
<syntaxhighlight lang="xml"><br />
<challenge>4731</challenge><br />
<na>12677</na><br />
<preferred>314</preferred><br />
<standard>7778</standard><br />
</syntaxhighlight><br />
<br />
In contrast to the relational GROUP BY statement, the XQuery counterpart concatenates the values of all non-grouping variables that belong to a specific group. In the context of our example, all nodes in <code>//people/person</code> that belong to the <code>preferred</code> partition are concatenated in <code class="brush:xquery">$ppl</code> after grouping has finished. You can see this effect by changing the return statement to:<br />
<br />
<syntaxhighlight lang="xquery"> <br />
...<br />
return element { $income } { $ppl }<br />
</syntaxhighlight><br />
<br />
'''Result:'''<br />
<syntaxhighlight lang="xml"><br />
<challenge><br />
<person id="person0"><br />
<name>Kasidit Treweek</name><br />
…<br />
<person id="personX"><br />
…<br />
</challenge><br />
</syntaxhighlight><br />
<br />
Moreover, a value can be assigned to the grouping variable. This is shown in the following example:<br />
<br />
'''XQuery:''' <br />
<syntaxhighlight lang="xquery"><br />
let $data :=<br />
<xml><br />
<person country='USA' name='John'/><br />
<person country='USA' name='Jack'/><br />
<person country='Germany' name='Johann'/><br />
</xml><br />
for $person in $data/person<br />
group by $country := $person/@country/string()<br />
return element persons {<br />
attribute country { $country },<br />
$person/@name ! element name { data() }<br />
}<br />
</syntaxhighlight><br />
<br />
'''Result:'''<br />
<syntaxhighlight lang="xml"><br />
<persons country="USA"><br />
<name>John</name><br />
<name>Jack</name><br />
</persons><br />
<persons country="Germany"><br />
<name>Johann</name><br />
</persons><br />
</syntaxhighlight><br />
<br />
==count==<br />
<br />
The {{Code|count}} clause enhances the FLWOR expression with a variable that enumerates the iterated tuples.<br />
<br />
<syntaxhighlight lang="xquery"><br />
for $n in (1 to 10)[. mod 2 = 1]<br />
count $c<br />
return <number count="{ $c }" number="{ $n }"/><br />
</syntaxhighlight><br />
<br />
==allowing empty==<br />
<br />
The {{Code|allowing empty}} clause provides functionality similar to outer joins in SQL:<br />
<br />
<syntaxhighlight lang="xquery"><br />
for $n allowing empty in ()<br />
return 'empty? ' || empty($n)<br />
</syntaxhighlight><br />
<br />
==window==<br />
<br />
Window clauses provide a rich set of variable declarations to process sub-sequences of iterated tuples. An example:<br />
<br />
<syntaxhighlight lang="xquery"><br />
for tumbling window $w in (2, 4, 6, 8, 10, 12, 14)<br />
start at $s when fn:true()<br />
only end at $e when $e - $s eq 2<br />
return <window>{ $w }</window><br />
</syntaxhighlight><br />
<br />
More information on window clauses, and all other enhancements, can be found in the [https://www.w3.org/TR/xquery-30/#id-windows specification].<br />
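
For comparison, a minimal sketch of a ''sliding'' window, in which windows may overlap. The input sequence is chosen for illustration:

<syntaxhighlight lang="xquery">
for sliding window $w in (1 to 5)
start at $s when true()
only end at $e when $e - $s eq 2
return <window>{ $w }</window>
</syntaxhighlight>

Every item starts a new window, and a window is closed two positions after its start. Because of the {{Code|only}} keyword, the windows starting at the items 4 and 5 are discarded, as they never reach their end condition. The query yields three windows with the contents <code>1 2 3</code>, <code>2 3 4</code> and <code>3 4 5</code>.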
<br />
=Function Items=<br />
<br />
Among the most distinctive features added in ''XQuery 3.0'' are ''function items'', also known as ''lambdas'' or ''lambda functions''. They make it possible to abstract over functions and thus to write more modular code.<br />
<br />
'''Examples:'''<br />
<br />
Function items can be obtained in three different ways:<br />
<br />
<ul><br />
<li>Declaring a new ''inline function'':<br />
<syntaxhighlight lang="xquery">let $f := function($x, $y) { $x + $y }<br />
return $f(17, 25)</syntaxhighlight> <br />
'''Result:''' <code>42</code><br />
</li><br />
<li>Getting the function item of an existing (built-in or user-defined) XQuery function. The arity (number of arguments) has to be specified as there can be more than one function with the same name:<br />
<syntaxhighlight lang="xquery">let $f := math:pow#2<br />
return $f(5, 2)</syntaxhighlight> <br />
'''Result:''' <code>25</code><br />
</li><br />
<li>''Partially applying'' another function or function item. This is done by supplying only some of the required arguments, writing the placeholder <code>?</code> in the positions of the arguments left out. The produced function item has one argument for every placeholder.<br />
<syntaxhighlight lang="xquery">let $f := fn:substring(?, 1, 3)<br />
return (<br />
$f('foo123'),<br />
$f('bar456')<br />
)</syntaxhighlight> <br />
'''Result:''' <code>foo bar</code><br />
</li><br />
</ul><br />
<br />
Function items can also be passed as arguments to and returned as results from functions. These so-called [[Higher-Order Functions]] like <code>fn:for-each</code> and <code>fn:fold-left</code> are discussed in more depth on their own Wiki page.<br />
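
A minimal sketch of passing function items to higher-order functions; the variable names are chosen for illustration:

<syntaxhighlight lang="xquery">
let $square := function($x) { $x * $x }
let $add := function($a, $b) { $a + $b }
(: square each number, then sum up the squares, starting with 0 :)
return fn:fold-left(fn:for-each(1 to 4, $square), 0, $add)
</syntaxhighlight>
'''Result:''' <code>30</code>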
<br />
=Simple Map Operator=<br />
<br />
The [https://www.w3.org/TR/xquery-30/#id-map-operator simple map] operator {{Code|!}} provides a compact notation for applying the results of a first expression to a second one: the resulting items of the first expression are bound to the context item one by one, and the second expression is evaluated for each item. The map operator may be used as a replacement for FLWOR expressions:<br />
<br />
'''Example:''' <br />
<syntaxhighlight lang="xquery"><br />
(: Simple map notation :)<br />
(1 to 10) ! element node { . },<br />
(: FLWOR notation :)<br />
for $i in 1 to 10<br />
return element node { $i }<br />
</syntaxhighlight><br />
<br />
In contrast to path expressions, the results of the map operator are neither made duplicate-free nor sorted into document order.<br />
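
The difference can be observed with a small sketch, in which the same node sequence is processed once with a path step and once with the map operator:

<syntaxhighlight lang="xquery">
let $doc := <xml><a/><b/></xml>
let $nodes := ($doc/b, $doc/a, $doc/a)
return (
  count($nodes / self::*),  (: path: duplicates removed, document order :)
  count($nodes ! self::*)   (: map: all three items are kept :)
)
</syntaxhighlight>
'''Result:''' <code>2 3</code>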
<br />
=Try/Catch=<br />
<br />
The [https://www.w3.org/TR/xquery-30/#id-try-catch try/catch] construct can be used to handle errors at runtime:<br />
<br />
'''Example:''' <br />
<syntaxhighlight lang="xquery"><br />
try {<br />
1 + '2'<br />
} catch err:XPTY0004 {<br />
'Typing error: ' || $err:description<br />
} catch * {<br />
'Error [' || $err:code || ']: ' || $err:description<br />
}<br />
</syntaxhighlight><br />
'''Result:''' <code>Typing error: '+' operator: number expected, xs:string found.</code><br />
<br />
Within the scope of the catch clause, a number of variables are implicitly declared, giving information about the error that occurred:<br />
<br />
* {{Code|$err:code}}: error code<br />
* {{Code|$err:description}}: error message<br />
* {{Code|$err:value}}: value associated with the error (optional)<br />
* {{Code|$err:module}}: URI of the module where the error occurred<br />
* {{Code|$err:line-number}}: line number where the error occurred<br />
* {{Code|$err:column-number}}: column number where the error occurred<br />
* {{Code|$err:additional}}: error stack trace<br />
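
The variables listed above can be sketched with a user-defined error. The namespace URI and the error name are hypothetical:

<syntaxhighlight lang="xquery">
declare namespace app = 'http://example.org/app';  (: hypothetical namespace :)
try {
  (: raise an error with a code, a description and an attached value :)
  fn:error(xs:QName('app:invalid'), 'Invalid input', (1, 2, 3))
} catch app:invalid {
  $err:description || ': ' || count($err:value) || ' values'
}
</syntaxhighlight>
'''Result:''' <code>Invalid input: 3 values</code>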
<br />
=Switch=<br />
<br />
The [https://www.w3.org/TR/xquery-30/#id-switch switch] statement is available in many other programming languages. It chooses one of several expressions to evaluate based on its input value.<br />
<br />
'''Example:''' <br />
<syntaxhighlight lang="xquery"><br />
for $fruit in ("Apple", "Pear", "Peach")<br />
return switch ($fruit)<br />
case "Apple" return "red"<br />
case "Pear" return "green"<br />
case "Peach" return "pink"<br />
default return "unknown"<br />
</syntaxhighlight> <br />
'''Result:''' <code>red green pink</code><br />
<br />
The expression to evaluate can correspond to multiple input values.<br />
<br />
'''Example:'''<br />
<syntaxhighlight lang="xquery"><br />
for $fruit in ("Apple", "Cherry")<br />
return switch ($fruit)<br />
case "Apple"<br />
case "Cherry"<br />
return "red"<br />
case "Pear"<br />
return "green"<br />
case "Peach"<br />
return "pink"<br />
default<br />
return "unknown"<br />
</syntaxhighlight><br />
'''Result:''' <code>red red</code><br />
<br />
=Expanded QNames=<br />
<br />
A ''QName'' can be prefixed with the letter "Q" and a namespace URI in the [http://www.jclark.com/xml/xmlns.htm Clark Notation].<br />
<br />
'''Examples:'''<br />
* <code><nowiki>Q{http://www.w3.org/2005/xpath-functions/math}pi()</nowiki></code> returns the number π<br />
* <code>Q{java:java.io.FileOutputStream}new("output.txt")</code> creates a new Java file output stream<br />
<br />
=Namespace Constructors=<br />
<br />
New namespaces can be created via so-called 'Computed Namespace Constructors'.<br />
<br />
<syntaxhighlight lang="xquery"> <br />
element node { namespace pref { 'http://url.org/' } }<br />
</syntaxhighlight><br />
<br />
=String Concatenations=<br />
<br />
Two vertical bars <code>||</code> (also named ''pipe characters'') can be used to concatenate strings. This operator is a shortcut for the {{Code|fn:concat()}} function.<br />
<br />
<syntaxhighlight lang="xquery"> <br />
'Hello' || ' ' || 'Universe'<br />
</syntaxhighlight><br />
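
As with {{Code|fn:concat()}}, the operands are atomized, and an empty sequence is treated as an empty string. A short sketch:

<syntaxhighlight lang="xquery">
'Hello' || () || ' ' || 42,      (: 'Hello 42' :)
fn:concat('Hello', (), ' ', 42)  (: same result :)
</syntaxhighlight>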
<br />
=External Variables=<br />
<br />
Default values can be attached to external variable declarations. This way, an expression can also be evaluated if its external variables have not been bound to a new value.<br />
<br />
<syntaxhighlight lang="xquery"> <br />
declare variable $user external := "admin";<br />
"User:", $user<br />
</syntaxhighlight><br />
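
A default value can be combined with a type declaration. In the following sketch, the variable name {{Code|$limit}} is chosen for illustration:

<syntaxhighlight lang="xquery">
declare variable $limit as xs:integer external := 3;
(: returns the first $limit items of the sequence :)
(1 to 10)[position() <= $limit]
</syntaxhighlight>

If no value is bound to the variable, the default value is used and the query returns <code>1 2 3</code>.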
<br />
=Serialization=<br />
<br />
[[Serialization]] parameters can be defined within XQuery expressions. Parameters are placed in the query prolog and need to be specified as option declarations, using the <code>output</code> prefix.<br />
<br />
'''Example:''' <br />
<syntaxhighlight lang="xquery"><br />
declare namespace output = "http://www.w3.org/2010/xslt-xquery-serialization";<br />
declare option output:omit-xml-declaration "no";<br />
declare option output:method "xhtml";<br />
<html/><br />
</syntaxhighlight> <br />
'''Result:''' <code>&lt;?xml version="1.0" encoding="UTF-8"?&gt;&lt;html&gt;&lt;/html&gt;</code><br />
<br />
In BaseX, the {{Code|output}} prefix is statically bound and can thus be omitted. Note that all namespaces need to be specified when using external APIs, such as [http://xqj.net/basex/ XQJ].<br />
<br />
=Context Item=<br />
<br />
The context item can be specified in the prolog of an XQuery expression:<br />
<br />
'''Example:''' <br />
<syntaxhighlight lang="xquery"><br />
declare context item := document {<br />
<xml><br />
<text>Hello</text><br />
<text>World</text><br />
</xml><br />
};<br />
<br />
for $t in .//text()<br />
return string-length($t)<br />
</syntaxhighlight> <br />
'''Result:''' <code>5 5</code><br />
<br />
=Annotations=<br />
<br />
XQuery 3.0 introduces annotations to declare properties associated with functions and variables. For instance, a function may be declared %public, %private, or %updating.<br />
<br />
'''Example:''' <br />
<syntaxhighlight lang="xquery"><br />
declare %private function local:max($x1, $x2) {<br />
if($x1 > $x2) then $x1 else $x2<br />
};<br />
<br />
local:max(2, 3)<br />
</syntaxhighlight><br />
<br />
=Functions=<br />
<br />
The following functions have been added in the [https://www.w3.org/TR/xpath-functions-30/ XQuery 3.0 Functions and Operators] Specification:<br />
<br />
<code>fn:analyze-string</code>, <code>fn:available-environment-variables</code>, <code>fn:element-with-id</code>, <code>fn:environment-variable</code>, <code>fn:filter</code>, <code>fn:fold-left</code>, <code>fn:fold-right</code>, <code>fn:for-each</code>, <code>fn:for-each-pair</code>, <code>fn:format-date</code>, <code>fn:format-dateTime</code>, <code>fn:format-integer</code>, <code>fn:format-number</code>, <code>fn:format-time</code>, <code>fn:function-arity</code>, <code>fn:function-lookup</code>, <code>fn:function-name</code>, <code>fn:generate-id</code>, <code>fn:has-children</code>, <code>fn:head</code>, <code>fn:innermost</code>, <code>fn:outermost</code>, <code>fn:parse-xml</code>, <code>fn:parse-xml-fragment</code>, <code>fn:path</code>, <code>fn:serialize</code>, <code>fn:tail</code>, <code>fn:unparsed-text</code>, <code>fn:unparsed-text-available</code>, <code>fn:unparsed-text-lines</code>, <code>fn:uri-collection</code><br />
<br />
New signatures have been added for the following functions:<br />
<br />
<code>fn:document-uri</code>, <code>fn:string-join</code>, <code>fn:node-name</code>, <code>fn:round</code>, <code>fn:data</code><br />
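
A few of these functions in a short sketch, including the new single-argument signature of <code>fn:string-join</code>:

<syntaxhighlight lang="xquery">
let $seq := ('a', 'b', 'c')
return (
  fn:head($seq),                (: first item :)
  fn:tail($seq),                (: all items except the first :)
  fn:format-integer(7, '000'),  (: zero-padded to three digits :)
  fn:string-join($seq)          (: no separator argument required :)
)
</syntaxhighlight>
'''Result:''' <code>a b c 007 abc</code>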
<br />
=Changelog=<br />
<br />
;Version 8.4<br />
<br />
* Added: %non-deterministic<br />
<br />
;Version 8.0<br />
<br />
* Added: %basex:inline, %basex:lazy<br />
<br />
;Version 7.7<br />
<br />
* Added: [[#Enhanced FLWOR Expressions|Enhanced FLWOR Expressions]]<br />
<br />
;Version 7.3<br />
<br />
* Added: [[#Simple Map Operator|Simple Map Operator]]<br />
<br />
;Version 7.2<br />
<br />
* Added: [[#Annotations|Annotations]]<br />
* Updated: [[#Expanded QNames|Expanded QNames]]<br />
<br />
;Version 7.1<br />
<br />
* Added: [[#Expanded QNames|Expanded QNames]], [[#Namespace Constructors|Namespace Constructors]]<br />
<br />
;Version 7.0<br />
<br />
* Added: [[#String Concatenations|String Concatenations]]<br />
<br />
[[Category:XQuery]]</div>
James Ball
https://docs.basex.org/index.php?title=HTML_Module&diff=14568
HTML Module
2019-06-28T11:31:27Z
<p>James Ball: Made link to Wikipedia HTTPS for binary example - as HTTP returns nothing</p>
<hr />
<div>This [[Module Library|XQuery Module]] provides functions for converting HTML to XML. Conversion will only take place if TagSoup is included in the classpath (see [[Parsers#HTML Parser|HTML Parsing]] for more details).<br />
<br />
=Conventions=<br />
<br />
All functions and errors in this module are assigned to the <code><nowiki>http://basex.org/modules/html</nowiki></code> namespace, which is statically bound to the {{Code|html}} prefix.<br/><br />
<br />
=Functions=<br />
<br />
==html:parser==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Code|'''html:parser'''() as xs:string}}<br /><br />
|-<br />
| '''Summary'''<br />
|Returns the name of the applied HTML parser (currently: {{Code|TagSoup}}). If an ''empty string'' is returned, TagSoup was not found in the classpath, and the input will be treated as well-formed XML.<br /><br />
|}<br />
<br />
==html:parse==<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|html:parse|$input as xs:anyAtomicType|document-node()}}<br />{{Func|html:parse|$input as xs:anyAtomicType, $options as map(*)?|document-node()}}<br /><br />
|-<br />
| '''Summary'''<br />
|Converts the HTML document specified by {{Code|$input}} to XML, and returns a document node:<br/><br />
* The input may either be a string or a binary item (xs:hexBinary, xs:base64Binary).<br />
* If the input is passed on in its binary representation, the HTML parser will try to automatically choose the correct encoding.<br />
<br />
The {{Code|$options}} argument can be used to set [[Parsers#Options|TagSoup Options]].<br />
|-<br />
| '''Errors'''<br />
|{{Error|parse|#Errors}} the input cannot be converted to XML.<br />
|}<br />
<br />
=Examples=<br />
<br />
===Basic Example===<br />
<br />
The following query converts the specified string to an XML document node.<br />
<br />
;Query:<br />
<pre class="brush:xquery"><br />
html:parse("<html>")<br />
</pre><br />
<br />
;Result:<br />
<pre class="brush:xml"><br />
<html xmlns="http://www.w3.org/1999/xhtml"/><br />
</pre><br />
<br />
===Specifying Options===<br />
<br />
The next query creates an XML document with namespaces:<br />
<br />
;Query:<br />
<pre class="brush:xquery"><br />
html:parse("<a href='ok.html'/>", map { 'nons': false() })<br />
</pre><br />
<br />
;Result:<br />
<pre class="brush:xml"><br />
<html xmlns="http://www.w3.org/1999/xhtml"><br />
<body><br />
<a shape="rect" href="ok.html"/><br />
</body><br />
</html><br />
</pre><br />
<br />
===Parsing Binary Input===<br />
<br />
If the input encoding is unknown, the data to be processed can be passed on in its binary representation.<br />
The HTML parser will automatically try to detect the correct encoding:<br />
<br />
;Query:<br />
<pre class="brush:xquery"><br />
html:parse(fetch:binary("https://en.wikipedia.org"))<br />
</pre><br />
<br />
;Result:<br />
<pre class="brush:xml"><br />
<html xmlns="http://www.w3.org/1999/xhtml" class="client-nojs" dir="ltr" lang="en"><br />
<head><br />
<title>Wikipedia, the free encyclopedia</title><br />
<meta charset="UTF-8"/><br />
...<br />
</pre><br />
<br />
=Errors=<br />
<br />
{| class="wikitable" width="100%"<br />
! width="110"|Code<br />
!Description<br />
|-<br />
|{{Code|parse}}<br />
|The input cannot be converted to XML.<br />
|}<br />
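
The error can be caught with {{Code|try}}/{{Code|catch}}. The following sketch first checks whether an HTML parser is available; the input string is chosen for illustration, and whether it can be converted depends on the parser:

<pre class="brush:xquery">
let $input := '<p>Hello <b>World'
return if (html:parser() = '') then (
  'TagSoup is not available'
) else try {
  html:parse($input)
} catch html:parse {
  'Conversion failed: ' || $err:description
}
</pre>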
<br />
=Changelog=<br />
<br />
;Version 9.0<br />
<br />
* Updated: error codes updated; errors now use the module namespace<br />
<br />
The module was introduced with Version 7.6.</div>
James Ball
https://docs.basex.org/index.php?title=XQuery_3.1&diff=14530
XQuery 3.1
2019-04-16T22:07:40Z
<p>James Ball: Grammatical correction</p>
<hr />
<div>This article is part of the [[XQuery|XQuery Portal]]. It provides a summary of the most important features of the [http://www.w3.org/TR/xquery-31/ XQuery 3.1] Recommendation.<br />
<br />
=Maps=<br />
<br />
A ''map'' is a function that associates a set of keys with values, resulting in a collection of key/value pairs. Each key/value pair in a map is called an entry. A key is an arbitrary atomic value, and the associated value is an arbitrary sequence. Within a map, no two entries have the same key, when compared using the {{Code|eq}} operator. It is not necessary that all the keys should be mutually comparable (for example, they can include a mixture of integers and strings).<br />
<br />
Maps can be constructed as follows:<br />
<br />
<pre class="brush:xquery"><br />
map { }, (: empty map :)<br />
map { 'key': true(), 1984: (<a/>, <b/>) }, (: map with two entries :)<br />
map:merge( (: map with ten entries :)<br />
for $i in 1 to 10<br />
return map { $i: 'value' || $i }<br />
)<br />
</pre><br />
<br />
The function corresponding to the map has the signature {{Code|function($key as xs:anyAtomicType) as item()*}}. The expression {{Code|$map($key)}} returns the associated value; the function call {{Code|map:get($map, $key)}} is equivalent. For example, if {{Code|$books-by-isbn}} is a map whose keys are ISBNs and whose associated values are {{Code|book}} elements, then the expression {{Code|$books-by-isbn("0470192747")}} returns the {{Code|book}} element with the given ISBN. The fact that a map is a function item allows it to be passed as an argument to higher-order functions that expect a function item as one of their arguments. As an example, the following query uses the higher-order function {{Code|fn:for-each($seq, $f)}} to extract all bound values from a map:<br />
<br />
<pre class="brush:xquery"><br />
let $map := map { 'foo': 42, 'bar': 'baz', 123: 456 }<br />
return fn:for-each(map:keys($map), $map)<br />
</pre><br />
<br />
This returns some permutation of {{Code|(42, 'baz', 456)}}.<br />
<br />
Because a map is a function item, functions that apply to functions also apply to maps. A map is an anonymous function, so {{Code|fn:function-name}} returns the empty sequence; {{Code|fn:function-arity}} always returns {{Code|1}}.<br />
<br />
Like all other values, maps are immutable. For example, the <code>[[Map Module#map:remove|map:remove]]</code> function creates a new map by removing an entry from an existing map, but the existing map is not changed by the operation. Like sequences, maps have no identity. It is meaningful to compare the contents of two maps, but there is no way of asking whether they are "the same map": two maps with the same content are indistinguishable.<br />
<br />
Maps may be compared using the {{Code|fn:deep-equal}} function. The [[Map Module]] describes the available set of map functions.<br />
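
The immutability of maps and their comparison can be illustrated with a minimal sketch; the keys and values are chosen for illustration:

<pre class="brush:xquery">
let $map := map { 'a': 1, 'b': 2 }
let $smaller := map:remove($map, 'a')
return (
  map:size($map),      (: 2: the original map is unchanged :)
  map:size($smaller),  (: 1 :)
  (: re-adding the entry yields a map with the same content :)
  fn:deep-equal($map, map:put($smaller, 'a', 1))
)
</pre>

This returns {{Code|2 1 true}}.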
<br />
=Arrays=<br />
<br />
An ''array'' is a function that associates a set of positions, represented as positive integer keys, with values. The first position in an array is associated with the integer {{Code|1}}. The values of an array are called its members. In the type hierarchy, array has a distinct type, which is derived from function. In BaseX, arrays (as well as sequences) are based on an efficient [http://en.wikipedia.org/wiki/Finger_tree Finger Tree] implementation.<br />
<br />
Arrays can be constructed in two ways. With the square bracket notation, the comma serves as delimiter:<br />
<br />
<pre class="brush:xquery"><br />
[], (: empty array :)<br />
[ (1, 2) ], (: array with single member :)<br />
[ 1 to 2, 3 ] (: array with two members; same as: [ (1, 2), 3 ] :)<br />
</pre><br />
<br />
With the {{Code|array}} keyword and curly brackets, the inner expression is evaluated as usual, and the resulting values will be the members of the array:<br />
<br />
<pre class="brush:xquery"><br />
array { }, (: empty array; same as: array { () } :) <br />
array { (1, 2) }, (: array with two members; same as: array { 1, 2 } :)<br />
array { 1 to 2, 3 } (: array with three members; same as: array { 1, 2, 3 } :)<br />
</pre><br />
<br />
The function corresponding to the array has the signature {{Code|function($index as xs:integer) as item()*}}. The expression {{Code|$array($index)}} returns an addressed member of the array. The following query returns the five array members {{Code|48 49 50 51 52}} as result:<br />
<br />
<pre class="brush:xquery"><br />
let $array := array { 48 to 52 }<br />
for $i in 1 to array:size($array)<br />
return $array($i)<br />
</pre><br />
<br />
Like all other values, arrays are immutable. For example, the <code>[[Array Module#array:reverse|array:reverse]]</code> function creates a new array containing a re-ordering of the members of an existing array, but the existing array is not changed by the operation. Like sequences, arrays have no identity. It is meaningful to compare the contents of two arrays, but there is no way of asking whether they are "the same array": two arrays with the same content are indistinguishable.<br />
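<br />
As a short sketch of this behavior (the variable names are illustrative), reversing an array yields a new array and leaves the original untouched:<br />
<br />
<pre class="brush:xquery"><br />
let $a := [ 1, 2, 3 ]<br />
let $b := array:reverse($a)<br />
return (<br />
$a(1) (: 1: the original array is unchanged :),<br />
$b(1) (: 3: first member of the new, reversed array :),<br />
deep-equal($a, array:reverse($b)) (: true: both arrays have the same content :)<br />
)<br />
</pre><br />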
<br />
==Atomization==<br />
<br />
If an array is ''atomized'', all of its members will be atomized. As a result, an atomized item may now result in more than one item. Some examples:<br />
<br />
<pre class="brush:xquery"><br />
fn:data([1 to 2]) (: returns the sequence 1, 2 :)<br />
[ 'a', 'b', 'c' ] = 'b' (: returns true :)<br />
<a>{ [ 1, 2 ] }</a> (: returns <a>1 2</a> :)<br />
array { 1 to 2 } + 3 (: error: the left operand returns two items :)<br />
</pre><br />
<br />
Atomization also applies to function arguments. The following query returns 5, because the array will be atomized to a sequence of 5 integers:<br />
<br />
<pre class="brush:xquery"><br />
let $f := function($x as xs:integer*) { count($x) }<br />
return $f([1 to 5])<br />
</pre><br />
<br />
However, the next query returns 1, because the array is already of the general type {{Code|item()}}, and no atomization will take place:<br />
<br />
<pre class="brush:xquery"><br />
let $f := function($x as item()*) { count($x) }<br />
return $f([1 to 5])<br />
</pre><br />
<br />
Arrays can be compared with the {{Code|fn:deep-equal}} function. The [[Array Module]] describes the available set of array functions.<br />
<br />
=Lookup Operator=<br />
<br />
The lookup operator provides some syntactic sugar to access values of maps or array members. It is introduced by the question mark ({{Code|?}}) and followed by a specifier. The specifier can be:<br />
<br />
# A wildcard {{Code|*}},<br />
# The name of the key,<br />
# The integer offset, or<br />
# Any other parenthesized expression.<br />
<br />
The following example demonstrates the four alternatives:<br />
<br />
<pre class="brush:xquery"><br />
let $map := map { 'R': 'red', 'G': 'green', 'B': 'blue' }<br />
return (<br />
$map?* (: 1. returns all values; same as: map:keys($map) ! $map(.) :),<br />
$map?R (: 2. returns the value associated with the key 'R'; same as: $map('R') :),<br />
$map?('G','B') (: 4. returns the values associated with the keys 'G' and 'B' :)<br />
),<br />
<br />
let $array := [ 'one', 'two', 'three' ]<br />
return (<br />
$array?* (: 1. returns all values; same as: (1 to array:size($array)) ! $array(.) :),<br />
$array?1 (: 3. returns the first value; same as: $array(1) :),<br />
$array?(2 to 3) (: 4. returns the second and third values; same as: (2 to 3) ! $array(.) :)<br />
)<br />
</pre><br />
<br />
The lookup operator can also be used without left operand. In this case, the context item will be used as input. This query returns {{Code|Akureyri}}:<br />
<br />
<pre class="brush:xquery"><br />
let $maps := (<br />
map { 'name': 'Guðrún', 'city': 'Reykjavík' },<br />
map { 'name': 'Hildur', 'city': 'Akureyri' }<br />
)<br />
return $maps[?name = 'Hildur'] ?city<br />
</pre><br />
<br />
=Arrow Operator=<br />
<br />
The arrow operator <code>=></code> provides a convenient alternative syntax for applying functions to a value. The expression that precedes the operator will be supplied as first argument of the function that follows the arrow. If <code>$v</code> is a value and <code>f()</code> is a function, then <code>$v => f()</code> is equivalent to <code>f($v)</code>, and <code>$v => f($j)</code> is equivalent to <code>f($v, $j)</code>:<br />
<br />
<pre class="brush:xquery"><br />
(: Returns 3 :)<br />
count(('A', 'B', 'C')),<br />
('A', 'B', 'C') => count(),<br />
('A', 'B', 'C') => (function( $sequence) { count( $sequence)})(),<br />
<br />
(: Returns W-E-L-C-O-M-E :)<br />
string-join(tokenize(upper-case('w e l c o m e')), '-'),<br />
'w e l c o m e' => upper-case() => tokenize() => string-join('-'),<br />
<br />
(: Returns xfmdpnf :)<br />
codepoints-to-string(<br />
for $i in string-to-codepoints('welcome')<br />
return $i + 1<br />
),<br />
(for $i in 'welcome' => string-to-codepoints()<br />
return $i + 1) => codepoints-to-string()<br />
</pre><br />
<br />
The syntax makes nested function calls more readable, as it is easy to see if parentheses are balanced.<br />
<br />
=String Constructor=<br />
<br />
The string constructor has been inspired by [https://en.wikipedia.org/wiki/Here_document here document] literals of the Unix shell and script languages. It allows you to generate strings that contain various characters that would otherwise be interpreted as XQuery delimiters.<br />
<br />
The string constructor syntax uses two backticks and a square bracket for opening and closing a string:<br />
<br />
<pre class="brush:xquery"><br />
(: Returns "This is a 'new' & 'flexible' syntax." :)<br />
``["This is a 'new' & 'flexible' syntax."]``<br />
</pre><br />
<br />
XQuery expressions can be embedded via backticks and a curly bracket. The evaluated results will be separated with spaces, and all strings will eventually be concatenated:<br />
<br />
<pre class="brush:xquery"><br />
(: Returns »Count 1 2 3, and I will be there.« :)<br />
let $c := 1 to 3<br />
return ``[»Count `{ $c }`, and I will be there.«]``</pre><br />
<br />
=Serialization=<br />
<br />
Two [[Serialization]] methods have been added to the [http://www.w3.org/TR/xslt-xquery-serialization-31 Serialization spec]:<br />
<br />
==Adaptive Serialization==<br />
<br />
The {{Code|adaptive}} serialization provides an intuitive textual representation for all XDM types, including maps and arrays, functions, attributes, and namespaces. All items will be separated by the value of the {{Code|item-separator}} parameter, which by default is a newline character. It is utilized by the functions <code>[[Profiling Module#prof:dump|prof:dump]]</code> and <code>[http://www.w3.org/TR/xpath-functions-31/#func-trace fn:trace]</code>.<br />
<br />
Example:<br />
<br />
<pre class="brush:xquery"><br />
declare option output:method 'adaptive';<br />
<element id='id0'/>/@id,<br />
xs:token("abc"),<br />
map { 'key': 'value' },<br />
true#0<br />
</pre><br />
<br />
Result:<br />
<br />
<pre class="brush:xml"><br />
id="id0"<br />
xs:token("abc")<br />
map {<br />
"key": "value"<br />
}<br />
fn:true#0<br />
</pre><br />
<br />
==JSON Serialization==<br />
<br />
The new {{Code|json}} serialization output method can be used to serialize XQuery maps, arrays, atomic values and empty sequences as JSON.<br />
<br />
The {{Code|json}} output method has been introduced in BaseX before it was added to the official specification. It complies with the standard serialization rules and, at the same time, preserves the existing semantics:<br />
<br />
* If an XML node of type {{Code|element(json)}} is found, it will be serialized following the serialization rules of the [[JSON Module]].<br />
* Any other node or atomic value, map, array, or empty sequence will be serialized according to the [http://www.w3.org/TR/xslt-xquery-serialization-31/#json-output rules in the specification].<br />
<br />
The following two queries will both return the JSON snippet <code>{ "key": "value" }</code>:<br />
<br />
<pre class="brush:xquery"><br />
declare option output:method 'json';<br />
map { "key": "value" }<br />
</pre><br />
<br />
<pre class="brush:xquery"><br />
declare option output:method 'json';<br />
<json type='object'><br />
<key>value</key><br />
</json><br />
</pre><br />
<br />
=Functions=<br />
<br />
The following functions have been added in the [http://www.w3.org/TR/xpath-functions-31/ XQuery 3.1 Functions and Operators] Specification:<br />
<br />
==Map Functions==<br />
<br />
<code>map:merge</code>, <code>map:size</code>, <code>map:keys</code>, <code>map:contains</code>, <code>map:get</code>, <code>map:entry</code>, <code>map:put</code>, <code>map:remove</code>, <code>map:for-each</code><br />
<br />
Please check out the [[Map Module]] for more details.<br />
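<br />
The following sketch combines several of the listed functions; the map contents are made up for illustration:<br />
<br />
<pre class="brush:xquery"><br />
let $map := map:merge((<br />
map:entry('R', 'red'),<br />
map:entry('G', 'green')<br />
))<br />
let $map := map:put($map, 'B', 'blue')<br />
return (<br />
map:size($map) (: 3 :),<br />
map:contains($map, 'G') (: true :),<br />
map:get($map, 'B') (: blue :),<br />
(: yields R=red, G=green and B=blue, in implementation-dependent order :)<br />
map:for-each($map, function($key, $value) { $key || '=' || $value })<br />
)<br />
</pre><br />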
<br />
==Array Functions==<br />
<br />
<code>array:size</code>, <code>array:append</code>, <code>array:subarray</code>, <code>array:remove</code>, <code>array:insert-before</code>, <code>array:head</code>, <code>array:tail</code>, <code>array:reverse</code>, <code>array:join</code>, <code>array:flatten</code>, <code>array:for-each</code>, <code>array:filter</code>, <code>array:fold-left</code>, <code>array:fold-right</code>, <code>array:for-each-pair</code><br />
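<br />
A few of these functions are shown in the following sketch; the input values are arbitrary, and the [[Array Module]] remains the reference for the exact semantics:<br />
<br />
<pre class="brush:xquery"><br />
let $array := array:append([ 1, 2, 3 ], 4)<br />
return (<br />
array:size($array) (: 4 :),<br />
array:head($array) (: 1 :),<br />
array:subarray($array, 2, 2) (: [ 2, 3 ] :),<br />
array:for-each($array, function($m) { $m * 2 }) (: [ 2, 4, 6, 8 ] :),<br />
array:filter($array, function($m) { $m mod 2 = 0 }) (: [ 2, 4 ] :),<br />
array:fold-left($array, 0, function($sum, $m) { $sum + $m }) (: 10 :),<br />
array:flatten([ 1, [ 2, [ 3 ] ] ]) (: 1, 2, 3 :)<br />
)<br />
</pre><br />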
<br />
==JSON Functions==<br />
<br />
With XQuery 3.1, native support for JSON objects was added. Strings and resources can be parsed to XQuery items and, as [[#JSON Serialization|shown above]], serialized back to their original form.<br />
<br />
===fn:parse-json===<br />
<br />
; Signatures<br />
* <code>fn:parse-json($input as xs:string) as item()?</code><br />
* <code>fn:parse-json($input as xs:string, $options as map(*)) as item()?</code><br />
<br />
Parses the supplied string as JSON text and returns its item representation. The result may be a map, an array, a string, a double, a boolean, or an empty sequence. The allowed options can be looked up in the [http://www.w3.org/TR/xpath-functions-31/#func-parse-json specification].<br />
<br />
<pre class="brush:xquery"><br />
parse-json('{ "name": "john" }') (: yields { "name": "john" } :),<br />
parse-json('[ 1, 2, 4, 8, 16]') (: yields [ 1, 2, 4, 8, 16 ] :)<br />
</pre><br />
<br />
===fn:json-doc===<br />
<br />
; Signatures<br />
* <code>fn:json-doc($uri as xs:string) as item()?</code><br />
* <code>fn:json-doc($uri as xs:string, $options as map(*)) as item()?</code><br />
<br />
Retrieves the text from the specified URI, parses the supplied string as JSON text and returns its item representation (see [[#fn:parse-json|fn:parse-json]] for more details).<br />
<br />
<pre class="brush:xquery"><br />
json-doc("http://ip.jsontest.com/")('ip') (: returns your IP address :)<br />
</pre><br />
<br />
===fn:json-to-xml===<br />
<br />
; Signatures<br />
* <code>fn:json-to-xml($string as xs:string?) as node()?</code><br />
<br />
Converts a JSON string to an XML node representation. The allowed options can be looked up in the [http://www.w3.org/TR/xslt-30/#func-json-to-xml specification].<br />
<br />
<pre class="brush:xquery"><br />
json-to-xml('{ "message": "world" }')<br />
<br />
(: result:<br />
<map xmlns="http://www.w3.org/2005/xpath-functions"><br />
<string key="message">world</string><br />
</map> :)<br />
</pre><br />
<br />
===fn:xml-to-json===<br />
<br />
; Signatures<br />
* <code>fn:xml-to-json($node as node()?) as xs:string?</code><br />
<br />
Converts an XML node, whose format conforms to the results created by [[#fn:json-to-xml|fn:json-to-xml]], to a JSON string representation. The allowed options can be looked up in the [http://www.w3.org/TR/xslt-30/#func-xml-to-json specification].<br />
<br />
<pre class="brush:xquery"><br />
(: returns "JSON" :)<br />
xml-to-json(<string xmlns="http://www.w3.org/2005/xpath-functions">JSON</string>)<br />
</pre><br />
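<br />
The two conversion functions can be combined into a round trip. The following sketch assumes the default options; the exact whitespace of the resulting string may differ:<br />
<br />
<pre class="brush:xquery"><br />
(: returns a JSON string equivalent to the input, such as {"message":"world"} :)<br />
let $xml := json-to-xml('{ "message": "world" }')<br />
return xml-to-json($xml)<br />
</pre><br />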
<br />
==fn:sort==<br />
<br />
; Signatures<br />
* <code>fn:sort($input as item()*) as item()*</code><br />
* <code>fn:sort($input as item()*, $collation as xs:string?) as item()*</code><br />
* <code>fn:sort($input as item()*, $collation as xs:string?, $key as function(item()) as xs:anyAtomicType*) as item()*</code><br />
<br />
Returns a new sequence with sorted {{Code|$input}} items, using an optional {{Code|$collation}}. If a {{Code|$key}} function is supplied, it will be applied on all items. The items of the resulting values will be sorted using the semantics of the {{Code|lt}} expression.<br />
<br />
<pre class="brush:xquery"><br />
sort(reverse(1 to 3)) (: yields 1, 2, 3 :),<br />
reverse(sort(1 to 3)) (: yields 3, 2, 1 (descending order) :),<br />
sort((3,-2,1), (), abs#1) (: yields 1, -2, 3 :),<br />
sort((1,2,3), (), function($x) { -$x }) (: yields 3, 2, 1 :),<br />
sort((1,'a')) (: yields an error, as strings and integers cannot be compared :)<br />
</pre><br />
<br />
==fn:contains-token==<br />
<br />
; Signatures<br />
* <code>fn:contains-token($input as xs:string*, $token as xs:string) as xs:boolean</code><br />
* <code>fn:contains-token($input as xs:string*, $token as xs:string, $collation as xs:string) as xs:boolean</code><br />
<br />
The supplied strings will be tokenized at whitespace boundaries. The function returns {{Code|true}} if one of the resulting tokens equals the supplied token, possibly under the rules of a supplied collation:<br />
<br />
<pre class="brush:xquery"><br />
contains-token(('a', 'b c', 'd'), 'c') (: yields true :),<br />
<xml class='one two'/>/contains-token(@class, 'one') (: yields true :)<br />
</pre><br />
<br />
==fn:parse-ietf-date==<br />
<br />
; Signature<br />
* <code>fn:parse-ietf-date($input as xs:string?) as xs:dateTime?</code><br />
<br />
Parses a string in the IETF format (which is widely used on the Internet) and returns a {{Code|xs:dateTime}} item:<br />
<br />
<pre class="brush:xquery"><br />
fn:parse-ietf-date('28-Feb-1984 07:07:07') (: yields 1984-02-28T07:07:07Z :),<br />
fn:parse-ietf-date('Wed, 01 Jun 2001 23:45:54 +02:00') (: yields 2001-06-01T23:45:54+02:00 :)<br />
</pre><br />
<br />
==fn:apply==<br />
<br />
; Signatures<br />
* <code>fn:apply($function as function(*), $arguments as array(*)) as item()*</code><br />
<br />
The supplied {{Code|$function}} is invoked with the specified {{Code|$arguments}}. The arity of the function must be the same as the size of the array.<br />
<br />
Example:<br />
<br />
<pre class="brush:xquery"><br />
fn:apply(concat#5, array { 1 to 5 }) (: 12345 :),<br />
fn:apply(function($a) { sum($a) }, [ 1 to 5 ]) (: 15 :),<br />
fn:apply(count#1, [ 1,2 ]) (: error: the array has two members :)<br />
</pre><br />
<br />
==fn:random-number-generator==<br />
<br />
; Signatures<br />
* <code>fn:random-number-generator() as map(xs:string, item())</code><br />
* <code>fn:random-number-generator($seed as xs:anyAtomicType) as map(xs:string, item())</code><br />
<br />
Creates a random number generator, using an optional seed. The returned map contains three entries:<br />
<br />
* {{Code|number}} is a random double between 0 and 1<br />
* {{Code|next}} is a function that returns another random number generator<br />
* {{Code|permute}} is a function that returns a random permutation of its argument<br />
<br />
The returned random generator is ''deterministic'': If the function is called twice with the same arguments and in the same execution scope, it will always return the same result.<br />
<br />
Example:<br />
<br />
<pre class="brush:xquery"><br />
let $rng := fn:random-number-generator()<br />
let $number := $rng('number') (: returns a random number :)<br />
let $next-rng := $rng('next')() (: returns a new generator :)<br />
let $next-number := $next-rng('number') (: returns another random number :)<br />
let $permutation := $rng('permute')(1 to 5) (: returns a random permutation of (1,2,3,4,5) :)<br />
return ($number, $next-number, $permutation)<br />
</pre><br />
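<br />
The deterministic behavior can be sketched with an explicit seed. The seed value {{Code|42}} is arbitrary; two generators created with the same seed in the same query are expected to yield the same results:<br />
<br />
<pre class="brush:xquery"><br />
let $a := fn:random-number-generator(42)<br />
let $b := fn:random-number-generator(42)<br />
return (<br />
$a('number') = $b('number') (: true: same seed, same number :),<br />
deep-equal($a('permute')(1 to 5), $b('permute')(1 to 5)) (: true: same permutation :),<br />
count($a('permute')(1 to 5)) (: 5: a permutation preserves all items :)<br />
)<br />
</pre><br />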
<br />
==fn:format-number==<br />
<br />
The function has been extended to support scientific notation:<br />
<br />
<pre class="brush:xquery"><br />
format-number(1984.42, '00.0e0') (: yields 19.8e2 :)<br />
</pre><br />
<br />
==fn:tokenize==<br />
<br />
If no separator is specified as second argument, a string will be tokenized at whitespace boundaries:<br />
<br />
<pre class="brush:xquery"><br />
fn:tokenize(" a b c d") (: yields "a", "b", "c", "d" :)<br />
</pre><br />
<br />
==fn:trace==<br />
<br />
The second argument can now be omitted:<br />
<br />
<pre class="brush:xquery"><br />
fn:trace(<xml/>, "Node: ")/node() (: yields the debugging output "Node: <xml/>" :),<br />
fn:trace(<xml/>)/node() (: returns the debugging output "<xml/>" :)<br />
</pre><br />
<br />
==fn:string-join==<br />
<br />
The type of the first argument is now <code>xs:anyAtomicType*</code>, and all items will be implicitly cast to strings:<br />
<br />
<pre class="brush:xquery"><br />
fn:string-join(1 to 3) (: yields the string "123" :)<br />
</pre><br />
<br />
==fn:default-language==<br />
<br />
Returns the default language used for formatting numbers and dates. BaseX always returns {{Code|en}}.<br />
<br />
==Appendix==<br />
<br />
The three functions <code>fn:transform</code>, <code>fn:load-xquery-module</code> and <code>fn:collation-key</code> may be added in a future version of BaseX as their implementation might require the use of additional external libraries.<br />
<br />
=Binary Data=<br />
<br />
Items of type <code>xs:hexBinary</code> and <code>xs:base64Binary</code> can be compared against each other. The following queries all yield {{Code|true}}:<br />
<br />
<pre class="brush:xquery"><br />
xs:hexBinary('') < xs:hexBinary('bb'),<br />
xs:hexBinary('aa') < xs:hexBinary('bb'),<br />
max((xs:hexBinary('aa'), xs:hexBinary('bb'))) = xs:hexBinary('bb')<br />
</pre><br />
<br />
=Collations=<br />
<br />
XQuery 3.1 provides a default collation, which allows for a case-insensitive comparison of ASCII characters (<code>A-Z</code> = <code>a-z</code>). This query returns <code>true</code>:<br />
<br />
<pre class="brush:xquery"><br />
declare default collation 'http://www.w3.org/2005/xpath-functions/collation/html-ascii-case-insensitive';<br />
'HTML' = 'html'<br />
</pre><br />
<br />
If the [http://site.icu-project.org/download ICU Library] is downloaded and added to the classpath, the full [http://www.w3.org/TR/xpath-functions-31/#uca-collations Unicode Collation Algorithm] features become available in BaseX:<br />
<br />
<pre class="brush:xquery"><br />
(: returns 0 (both strings are compared as equal) :)<br />
compare('a-b', 'ab', 'http://www.w3.org/2013/collation/UCA?alternate=shifted')<br />
</pre><br />
<br />
=Enclosed Expressions=<br />
<br />
''Enclosed expression'' is the syntactical term for the expressions that are specified inside a function body, try/catch clauses, node constructors and some other expressions. In each of the following examples, the enclosed expression is the empty sequence:<br />
<br />
<pre class="brush:xquery"><br />
declare function local:x() { () };<br />
try { () } catch * { () },<br />
element x { () },<br />
text { () }<br />
</pre><br />
<br />
With XQuery 3.1, the expression can be omitted. The following query is equivalent to the one above:<br />
<br />
<pre class="brush:xquery"><br />
declare function local:x() { };<br />
try { } catch * { },<br />
element x { },<br />
text { }<br />
</pre><br />
<br />
=Changelog=<br />
<br />
;Version 8.6<br />
<br />
* Updated: [[#fn:sort|fn:sort]]: Collation argument was inserted between first and second argument.<br />
<br />
;Version 8.4<br />
<br />
* Added: [[#String Constructor|String Constructor]], [[#fn:default-language|fn:default-language]], [[#Enclosed Expressions|Enclosed Expressions]]<br />
* Updated: [[#Adaptive Serialization|Adaptive Serialization]], [[#fn:string-join|fn:string-join]]<br />
<br />
;Version 8.2<br />
<br />
* Added: [[#fn:json-to-xml|fn:json-to-xml]], [[#fn:xml-to-json|fn:xml-to-json]].<br />
<br />
;Version 8.1<br />
<br />
* Updated: arrays are now based on a [http://en.wikipedia.org/wiki/Finger_tree Finger Tree] implementation.<br />
<br />
Introduced with Version 8.0.</div>James Ballhttps://docs.basex.org/index.php?title=Utility_Module&diff=14345Utility Module2019-02-12T15:15:39Z<p>James Ball: Corrected function name in Signature for util:last</p>
<hr />
<div>This [[Module Library|XQuery Module]] contains various small utility and helper functions. Please note that some of the functions are used for internal query rewritings. They may be renamed or moved to other modules in future versions of BaseX.<br />
<br />
=Conventions=<br />
<br />
All functions and errors in this module are assigned to the <code><nowiki>http://basex.org/modules/util</nowiki></code> namespace, which is statically bound to the {{Code|util}} prefix.<br/><br />
<br />
=Conditions=<br />
<br />
==util:if==<br />
<br />
{{Mark|Introduced with Version 9.1:}}<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|util:if|$condition as item()*, $then as item()*|item()*}}<br/>{{Func|util:if|$condition as item()*, $then as item()*, $else as item()*|item()*}}<br/><br />
|-<br />
| '''Summary'''<br />
|Alternative notation for the if/then/else expression:<br />
* If the ''effective boolean value'' of {{Code|$condition}} is true, the {{Code|$then}} branch will be evaluated.<br />
* Otherwise, {{Code|$else}} will be evaluated. If no third argument is supplied, an empty sequence will be returned.<br />
|-<br />
| '''Examples'''<br />
|<br />
* <code>util:if(true(), 123, 456)</code> returns {{Code|123}}.<br />
* <code>util:if(0, 'wrong!')</code> returns an empty sequence.<br />
|}<br />
<br />
==util:or==<br />
<br />
{{Mark|Introduced with Version 9.1:}}<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|util:or|$items as item()*, $default as item()*|item()*}}<br />
|-<br />
| '''Summary'''<br />
|Returns {{Code|$items}} if it is a non-empty sequence. Otherwise, returns {{Code|$default}}. The function is equivalent to the expression <code>if(exists($items)) then $items else $default</code>.<br />
|-<br />
| '''Examples'''<br />
|<br />
* <code>util:or(123, 456)</code> returns {{Code|123}}.<br />
* <code>util:or(1[. = 0], -1)</code> returns {{Code|-1}}.<br />
|}<br />
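<br />
The equivalence stated in the summary can be sketched as follows, assuming a BaseX version that provides {{Code|util:or}}; the fallback string is arbitrary:<br />
<br />
<pre class="brush:xquery"><br />
let $items := ()<br />
return (<br />
util:or($items, 'n/a') (: n/a: $items is empty :),<br />
if(exists($items)) then $items else 'n/a' (: n/a: the equivalent standard expression :)<br />
)<br />
</pre><br />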
<br />
=Positional Access=<br />
<br />
==util:item==<br />
<br />
{{Mark|Updated with Version 9.2}}: Renamed (before: {{Code|util:item-at}}).<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|util:item|$sequence as item()*, $position as xs:double|item()?}}<br/><br />
|-<br />
| '''Summary'''<br />
|Returns the item from {{Code|$sequence}} at the specified {{Code|$position}}. Equivalent to <code>$sequence[$position]</code>.<br />
|-<br />
| '''Examples'''<br />
|<br />
* <code>util:item(reverse(1 to 5), 1)</code> returns <code>5</code>.<br />
* <code>util:item(('a','b'), 0)</code> returns an empty sequence.<br />
|}<br />
<br />
==util:range==<br />
<br />
{{Mark|Updated with Version 9.2}}: Renamed (before: {{Code|util:item-range}}).<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|util:range|$sequence as item()*, $first as xs:double, $last as xs:double|item()*}}<br/><br />
|-<br />
| '''Summary'''<br />
|Returns items from {{Code|$sequence}}, starting at position {{Code|$first}} and ending at {{Code|$last}}. Equivalent to <code>subsequence($sequence, $first, $last - $first + 1)</code>.<br />
|-<br />
| '''Examples'''<br />
|<br />
* <code>util:range(//item, 11, 20)</code> returns all path results from (if available) position 11 to 20.<br />
|}<br />
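<br />
As a small sketch of the positional functions and their standard equivalents (the input sequence is made up; the function names are those of Version 9.2):<br />
<br />
<pre class="brush:xquery"><br />
let $seq := (1 to 10) ! (. * 10)<br />
return (<br />
util:range($seq, 3, 5) (: 30 40 50 :),<br />
subsequence($seq, 3, 3) (: 30 40 50 :),<br />
util:item($seq, 3) (: 30 :),<br />
$seq[3] (: 30 :)<br />
)<br />
</pre><br />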
<br />
==util:last==<br />
<br />
{{Mark|Updated with Version 9.2}}: Renamed (before: {{Code|util:last-from}}).<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|util:last|$sequence as item()*|item()?}}<br/><br />
|-<br />
| '''Summary'''<br />
|Returns last item of a {{Code|$sequence}}. Equivalent to <code>$sequence[last()]</code>.<br />
|-<br />
| '''Examples'''<br />
|<br />
* <code>util:last(reverse(1 to 100))</code> returns <code>1</code>.<br />
|}<br />
<br />
==util:init==<br />
<br />
{{Mark|Introduced with Version 9.2:}}<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|util:init|$sequence as item()*|item()*}}<br/><br />
|-<br />
| '''Summary'''<br />
|Returns all items of a {{Code|$sequence}} except for the last one. Equivalent to <code>$sequence[position() < last()]</code>.<br />
|-<br />
| '''Examples'''<br />
|<br />
* <code>util:init(1 to 4)</code> returns <code>1 2 3</code>.<br />
|}<br />
<br />
=Helper Functions=<br />
<br />
==util:replicate==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|util:replicate|$sequence as item()*, $count as xs:integer|item()*}}<br/><br />
|-<br />
| '''Summary'''<br />
|Returns {{Code|$count}} instances of the specified {{Code|$sequence}}. A similar result can be generated with <code>(1 to $count) ! $sequence</code>, but in the latter case, the right-hand expression will be evaluated multiple times.<br />
|-<br />
| '''Errors'''<br />
|{{Error|negative|#Errors}} The specified number is negative.<br />
|-<br />
| '''Examples'''<br />
|<br />
* <code>util:replicate('A', 3)</code> returns <code>A A A</code>.<br />
|}<br />
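<br />
The difference between the two variants becomes visible with a non-deterministic argument. The following sketch uses {{Code|random:double}} from the [[Random Module]]; the second count is only likely, not guaranteed, to be {{Code|3}}:<br />
<br />
<pre class="brush:xquery"><br />
let $same := util:replicate(random:double(), 3)<br />
let $diff := (1 to 3) ! random:double()<br />
return (<br />
count(distinct-values($same)) (: 1: the argument was evaluated once :),<br />
count(distinct-values($diff)) (: most likely 3: the expression was evaluated three times :)<br />
)<br />
</pre><br />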
<br />
=Errors=<br />
<br />
{| class="wikitable" width="100%"<br />
! width="110"|Code<br />
|Description<br />
|-<br />
|{{Code|negative}}<br />
|The specified number is negative.<br />
|}<br />
<br />
=Changelog=<br />
<br />
;Version 9.2<br />
* Added: [[#util:init|util:init]]<br />
* Updated: [[#util:item|util:item]], [[#util:last|util:last]], [[#util:range|util:range]] renamed (before: {{Code|util:item-at}}, {{Code|util:last-from}}, {{Code|util:item-range}})<br />
<br />
;Version 9.1<br />
* Added: [[#util:if|util:if]], [[#util:or|util:or]]<br />
<br />
;Version 9.0<br />
* Added: [[#util:replicate|util:replicate]]<br />
<br />
The Module was introduced with Version 8.5.</div>James Ballhttps://docs.basex.org/index.php?title=Index_Module&diff=13884Index Module2018-05-31T18:26:58Z<p>James Ball: Added note and link to Value Indexes in Database Module</p>
<hr />
<div>This [[Module Library|XQuery Module]] provides functions for displaying information stored in the database index structures.<br />
<br />
For functions that use the indexes to return nodes see [[Database_Module#Value_Indexes|Value Indexes]] in the [[Database_Module|Database Module]].<br />
<br />
=Conventions=<br />
<br />
All functions and errors in this module are assigned to the <code><nowiki>http://basex.org/modules/index</nowiki></code> namespace, which is statically bound to the {{Code|index}} prefix.<br/><br />
<br />
=Functions=<br />
<br />
==index:facets==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|index:facets|$db as xs:string|xs:string}}<br/>{{Func|index:facets|$db as xs:string, $type as xs:string|xs:string}}<br />
|-<br />
|'''Summary'''<br />
|Returns information about all facets and facet values of the database {{Code|$db}} in document structure format.<br/>If {{Code|$type}} is specified as {{Code|flat}}, the function returns this information in a flat summarized version. The returned data is derived from the [[Indexes#Path Index|Path Index]].<br />
|-<br />
|'''Errors'''<br />
|{{Error|db:open|Database Module#Errors}} The addressed database does not exist or could not be opened.<br />
|-<br />
|'''Examples'''<br />
|<br />
* {{Code|index:facets("DB")}} returns information about facets and facet values on the database {{Code|DB}} in document structure.<br />
* {{Code|index:facets("DB", "flat")}} returns information about facets and facet values on the database {{Code|DB}} in a summarized flat structure.<br />
|}<br />
<br />
==index:texts==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|index:texts|$db as xs:string|element(value)*}}<br/>{{Func|index:texts|$db as xs:string, $prefix as xs:string|element(value)*}}<br/>{{Func|index:texts|$db as xs:string, $start as xs:string, $ascending as xs:boolean|element(value)*}}<br />
|-<br />
|'''Summary'''<br />
|Returns all strings stored in the [[Indexes#Text Index|Text Index]] of the database {{Code|$db}}, along with their number of occurrences.<br/>If {{Code|$prefix}} is specified, the returned entries will be refined to the ones starting with that prefix.<br/>If {{Code|$start}} and {{Code|$ascending}} are specified, all nodes will be returned after or before the specified start entry.<br />
|-<br />
|'''Errors'''<br />
|{{Error|db:open|Database Module#Errors}} The addressed database does not exist or could not be opened.<br/>{{Error|db:no-index|Database Module#Errors}} The index is not available.<br />
|}<br />
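<br />
The three signatures might be used as in the following sketch. It assumes that a database named {{Code|DB}} exists and has a text index; the prefix and start strings are arbitrary:<br />
<br />
<pre class="brush:xquery"><br />
(: number of distinct strings in the text index :)<br />
count(index:texts('DB')),<br />
(: entries that start with the prefix 'a' :)<br />
index:texts('DB', 'a'),<br />
(: entries in ascending order, beginning at the start entry 'm' :)<br />
index:texts('DB', 'm', true())<br />
</pre><br />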
<br />
==index:attributes==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|index:attributes|$db as xs:string|element(value)*}}<br/>{{Func|index:attributes|$db as xs:string, $prefix as xs:string|element(value)*}}<br/>{{Func|index:attributes|$db as xs:string, $start as xs:string, $ascending as xs:boolean|element(value)*}}<br />
|-<br />
|'''Summary'''<br />
|Returns all strings stored in the [[Indexes#Attribute Index|Attribute Index]] of the database {{Code|$db}}, along with their number of occurrences.<br/>If {{Code|$prefix}} is specified, the returned entries will be refined to the ones starting with that prefix.<br/>If {{Code|$start}} and {{Code|$ascending}} are specified, all nodes will be returned after or before the specified start entry.<br />
|-<br />
|'''Errors'''<br />
|{{Error|db:open|Database Module#Errors}} The addressed database does not exist or could not be opened.<br/>{{Error|db:no-index|Database Module#Errors}} The index is not available.<br />
|}<br />
<br />
==index:tokens==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|index:tokens|$db as xs:string|element(value)*}}<br />
|-<br />
|'''Summary'''<br />
|Returns all strings stored in the [[Indexes#Token Index|Token Index]] of the database {{Code|$db}}, along with their number of occurrences.<br />
|-<br />
|'''Errors'''<br />
|{{Error|db:open|Database Module#Errors}} The addressed database does not exist or could not be opened.<br/>{{Error|db:no-index|Database Module#Errors}} The index is not available.<br />
|}<br />
<br />
==index:element-names==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|index:element-names|$db as xs:string|element(value)*}}<br />
|-<br />
|'''Summary'''<br />
|Returns all element names stored in the [[Indexes#Name Index|Name Index]] of the database {{Code|$db}}, along with their number of occurrences.<br />
|-<br />
|'''Errors'''<br />
|{{Error|db:open|Database Module#Errors}} The addressed database does not exist or could not be opened.<br />
|}<br />
<br />
==index:attribute-names==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|index:attribute-names|$db as xs:string|element(value)*}}<br />
|-<br />
|'''Summary'''<br />
|Returns all attribute names stored in the [[Indexes#Name Index|Name Index]] of the database {{Code|$db}}, along with their number of occurrences.<br />
|-<br />
|'''Errors'''<br />
|{{Error|db:open|Database Module#Errors}} The addressed database does not exist or could not be opened.<br />
|}<br />
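<br />
The functions above return index entries together with their number of occurrences. As a minimal sketch, the following query lists the ten most frequent element names of a database. The database name {{Code|db}} is a placeholder, and the sketch assumes that each returned element carries its number of occurrences in a {{Code|count}} attribute:<br />
<br />
<pre class="brush:xquery"><br />
(: sort all element names by their number of occurrences (assumed: @count) :)<br />
let $entries :=<br />
  for $entry in index:element-names("db")<br />
  order by xs:integer($entry/@count) descending<br />
  return $entry<br />
(: keep the ten most frequent names :)<br />
return subsequence($entries, 1, 10)<br />
</pre><br />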
<br />
=Changelog=<br />
<br />
;Version 8.4<br />
<br />
* Added: [[#index:tokens|index:tokens]]<br />
<br />
;Version 7.7<br />
<br />
* Updated: the functions no longer accept [[Database Module#Database Nodes|Database Nodes]] as reference. Instead, the name of a database must now be specified.<br />
<br />
;Version 7.3<br />
<br />
* Updated: [[#index:texts|index:texts]], [[#index:attributes|index:attributes]]: signature with three arguments added.<br />
<br />
The module was introduced with Version 7.1.</div>James Ballhttps://docs.basex.org/index.php?title=Fetch_Module&diff=13280Fetch Module2017-09-09T08:24:28Z<p>James Ball: Added an example to show fetch:text() with optional attributes</p>
<hr />
<div>This [[Module Library|XQuery Module]] provides simple functions to fetch the content of resources identified by URIs. Resources can be stored locally or remotely and e.g. use the {{Code|file://}} or {{Code|http://}} scheme. If more control over HTTP requests is required, the [[HTTP Module]] can be used. With the [[HTML Module]], retrieved HTML documents can be converted to XML.<br />
<br />
=Conventions=<br />
<br />
All functions in this module are assigned to the <code><nowiki>http://basex.org/modules/fetch</nowiki></code> namespace, which is statically bound to the {{Code|fetch}} prefix.<br/><br />
All errors are assigned to the <code><nowiki>http://basex.org/errors</nowiki></code> namespace, which is statically bound to the {{Code|bxerr}} prefix.<br />
<br />
URI arguments can be URLs or point to local files. Relative file paths will be resolved against the ''current working directory'' (for more details, have a look at the [[File Module#File Paths|File Module]]).<br />
<br />
=Functions=<br />
<br />
==fetch:binary==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|fetch:binary|$uri as xs:string|xs:base64Binary}}<br/><br />
|-<br />
| '''Summary'''<br />
|Fetches the resource referred to by the given URI and returns it as [[Streaming Module|streamable]] {{Code|xs:base64Binary}}.<br />
|-<br />
| '''Errors'''<br />
|{{Error|BXFE0001|XQuery Errors#Functions Errors}} the URI could not be resolved, or the resource could not be retrieved.<br />
|-<br />
| '''Examples'''<br />
|<br />
* <code><nowiki>fetch:binary("http://images.trulia.com/blogimg/c/5/f/4/679932_1298401950553_o.jpg")</nowiki></code> returns the addressed image.<br />
* <code><nowiki>stream:materialize(fetch:binary("http://en.wikipedia.org"))</nowiki></code> returns a materialized representation of the streamable result.<br />
|}<br />
<br />
==fetch:text==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|fetch:text|$uri as xs:string|xs:string}}<br/>{{Func|fetch:text|$uri as xs:string, $encoding as xs:string|xs:string}}<br/>{{Func|fetch:text|$uri as xs:string, $encoding as xs:string, $fallback as xs:boolean|xs:string}}<br/><br />
|-<br />
| '''Summary'''<br />
|Fetches the resource referred to by the given {{Code|$uri}} and returns it as [[Streaming Module|streamable]] {{Code|xs:string}}:<br />
* The UTF-8 default encoding can be overridden with the optional {{Code|$encoding}} argument.<br />
* By default, invalid characters will be rejected. If {{Code|$fallback}} is set to true, these characters will be replaced with the Unicode replacement character <code>FFFD</code> (&#xFFFD;).<br />
|-<br />
| '''Errors'''<br />
|{{Error|BXFE0001|XQuery Errors#Functions Errors}} the URI could not be resolved, or the resource could not be retrieved.<br/>{{Error|BXFE0002|XQuery Errors#Functions Errors}} the specified encoding is not supported, or unknown.<br />
|-<br />
| '''Examples'''<br />
|<br />
* <code><nowiki>fetch:text("http://en.wikipedia.org")</nowiki></code> returns a string representation of the English Wikipedia main HTML page.<br />
* <code><nowiki>fetch:text("http://www.bbc.com","US-ASCII",true())</nowiki></code> returns the BBC homepage in US-ASCII with all non-US-ASCII characters replaced with &#xFFFD;.<br />
* <code><nowiki>stream:materialize(fetch:text("http://en.wikipedia.org"))</nowiki></code> returns a materialized representation of the streamable result.<br />
|}<br />
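<br />
Both errors listed above can be caught with a standard XQuery 3.0 {{Code|try/catch}} expression. The following is only a sketch: it relies on the {{Code|bxerr}} prefix described in the Conventions section and assumes that the {{Code|err}} prefix for the error variables is predeclared as well:<br />
<br />
<pre class="brush:xquery"><br />
try {<br />
  (: invalid characters are replaced instead of being rejected :)<br />
  fetch:text("http://en.wikipedia.org", "UTF-8", true())<br />
} catch bxerr:BXFE0001 {<br />
  (: the URI could not be resolved, or the resource could not be retrieved :)<br />
  "Resource not available: " || $err:description<br />
} catch bxerr:BXFE0002 {<br />
  (: the encoding is unknown or not supported :)<br />
  "Encoding problem: " || $err:description<br />
}<br />
</pre><br />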
<br />
==fetch:xml==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|fetch:xml|$uri as xs:string|document-node()}}<br/>{{Func|fetch:xml|$uri as xs:string, $options as map(*)|document-node()}}<br />
|-<br />
| '''Summary'''<br />
|Fetches the resource referred to by the given {{Code|$uri}} and returns it as an XML document node.<br/>In contrast to <code>fn:doc</code>, each function call returns a different document node. As a consequence, document instances created by this function will not be kept in memory until the end of query evaluation.<br/>The {{Code|$options}} argument can be used to change the parsing behavior. Allowed options are all [[Options#Parsing|parsing]] and [[Options#XML Parsing|XML parsing]] options in lower case.<br />
|-<br />
| '''Errors'''<br />
|{{Error|BXFE0001|XQuery Errors#Functions Errors}} the URI could not be resolved, or the resource could not be retrieved.<br />
|-<br />
| '''Examples'''<br />
|<br />
* Retrieve an XML representation of the English Wikipedia main HTML page, chop all whitespace nodes:<br />
<pre class="brush:xquery"><br />
fetch:xml("http://en.wikipedia.org", map { 'chop': true() })<br />
</pre><br />
* Return a document located in the current base directory:<br />
<pre class="brush:xquery"><br />
fetch:xml(file:base-dir() || "example.xml")<br />
</pre><br />
|}<br />
<br />
==fetch:xml-binary==<br />
<br />
{{Mark|Introduced with Version 8.7:}}<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|fetch:xml-binary|$data as xs:base64Binary|document-node()}}<br/>{{Func|fetch:xml-binary|$data as xs:base64Binary, $options as map(*)|document-node()}}<br />
|-<br />
| '''Summary'''<br />
|Parses binary {{Code|$data}} and returns it as an XML document node.<br/>In contrast to <code>fn:parse-xml</code>, which expects an XQuery string, the input of this function can be arbitrarily encoded. The encoding will be derived from the XML declaration or (in case of UTF16 or UTF32) from the first bytes of the input.<br/>The {{Code|$options}} argument can be used to change the parsing behavior. Allowed options are all [[Options#Parsing|parsing]] and [[Options#XML Parsing|XML parsing]] options in lower case.<br />
|-<br />
| '''Examples'''<br />
|<br />
* Retrieves file input as binary data and parses it as XML:<br />
<pre class="brush:xquery"><br />
fetch:xml-binary(file:read-binary('doc.xml'))<br />
</pre><br />
* Encodes a string as CP1252 and parses it as XML. The input and the string {{Code|touché}} will be correctly decoded because of the XML declaration:<br />
<pre class="brush:xquery"><br />
fetch:xml-binary(convert:string-to-base64(<br />
"<?xml version='1.0' encoding='CP1252'?><xml>touché</xml>",<br />
"CP1252"<br />
))<br />
</pre><br />
* Encodes a string as UTF16 and parses it as XML. The document will be correctly decoded, as the first bytes of the data indicate that the input must be UTF16:<br />
<pre class="brush:xquery"><br />
fetch:xml-binary(convert:string-to-base64("<xml/>", "UTF16"))<br />
</pre><br />
|}<br />
<br />
==fetch:content-type==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|fetch:content-type|$uri as xs:string|xs:string}}<br/><br />
|-<br />
| '''Summary'''<br />
|Returns the content-type (also called mime-type) of the resource specified by {{Code|$uri}}:<br />
* If a remote resource is addressed, the content-type header of the response will be evaluated.<br />
* If the addressed resource is locally stored, the content-type will be guessed based on the file extension.<br />
|-<br />
| '''Errors'''<br />
|{{Error|BXFE0001|XQuery Errors#Functions Errors}} the URI could not be resolved, or the resource could not be retrieved.<br />
|-<br />
| '''Examples'''<br />
|<br />
* <code><nowiki>fetch:content-type("http://docs.basex.org/skins/vector/images/wiki.png")</nowiki></code> returns {{Code|image/png}}.<br />
|}<br />
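<br />
The returned content-type can be used to decide which of the other functions of this module should retrieve a resource. The following sketch reuses the URI from the example above; the checks on the content-type string are simplifying assumptions and will not cover every media type:<br />
<br />
<pre class="brush:xquery"><br />
let $uri := "http://docs.basex.org/skins/vector/images/wiki.png"<br />
let $type := fetch:content-type($uri)<br />
return<br />
  (: XML resources are parsed, textual resources returned as string :)<br />
  if (contains($type, "xml")) then fetch:xml($uri)<br />
  else if (starts-with($type, "text/")) then fetch:text($uri)<br />
  (: everything else is returned as binary item :)<br />
  else fetch:binary($uri)<br />
</pre><br />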
<br />
=Errors=<br />
<br />
{| class="wikitable" width="100%"<br />
! width="110"|Code<br />
|Description<br />
|-<br />
|{{Code|BXFE0001}}<br />
|The URI could not be resolved, or the resource could not be retrieved.<br />
|-<br />
|{{Code|BXFE0002}}<br />
|The specified encoding is not supported, or unknown.<br />
|}<br />
<br />
=Changelog=<br />
<br />
;Version 8.7<br />
<br />
* Added: [[#fetch:xml-binary|fetch:xml-binary]]<br />
<br />
;Version 8.5<br />
<br />
* Updated: [[#fetch:text|fetch:text]]: <code>$fallback</code> argument added.<br />
<br />
;Version 8.0<br />
<br />
* Added: [[#fetch:xml|fetch:xml]]<br />
<br />
The module was introduced with Version 7.6.</div>James Ballhttps://docs.basex.org/index.php?title=Conversion_Module&diff=13263Conversion Module2017-08-13T16:35:17Z<p>James Ball: Removed the example from convert:bytes-to-base64 as it was actually an example of convert:string-to-base64.</p>
<hr />
<div>This [[Module Library|XQuery Module]] contains functions to convert data between different formats.<br />
<br />
=Conventions=<br />
<br />
All functions in this module are assigned to the <code><nowiki>http://basex.org/modules/convert</nowiki></code> namespace, which is statically bound to the {{Code|convert}} prefix.<br/><br />
All errors are assigned to the <code><nowiki>http://basex.org/errors</nowiki></code> namespace, which is statically bound to the {{Code|bxerr}} prefix.<br />
<br />
=Strings=<br />
<br />
==convert:binary-to-string==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|convert:binary-to-string|$bytes as xs:anyAtomicType|xs:string}}<br/>{{Func|convert:binary-to-string|$bytes as xs:anyAtomicType, $encoding as xs:string|xs:string}}<br/>{{Func|convert:binary-to-string|$bytes as xs:anyAtomicType, $encoding as xs:string, $fallback as xs:boolean|xs:string}}<br />
|-<br />
| '''Summary'''<br />
|Converts the specified binary data (xs:base64Binary, xs:hexBinary) to a string:<br />
* The UTF-8 default encoding can be overridden with the optional {{Code|$encoding}} argument.<br />
* By default, invalid characters will be rejected. If {{Code|$fallback}} is set to true, these characters will be replaced with the Unicode replacement character <code>FFFD</code> (&#xFFFD;).<br />
|-<br />
| '''Errors'''<br />
|{{Error|BXCO0001|#Errors}} The input is an invalid XML string, or the wrong encoding has been specified.<br/>{{Error|BXCO0002|#Errors}} The specified encoding is invalid or not supported.<br />
|-<br />
| '''Examples'''<br />
|<br />
* {{Code|convert:binary-to-string(xs:hexBinary('48656c6c6f576f726c64'))}} yields {{Code|HelloWorld}}.<br />
|}<br />
<br />
==convert:string-to-base64==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|convert:string-to-base64|$input as xs:string|xs:base64Binary}}<br/>{{Func|convert:string-to-base64|$input as xs:string, $encoding as xs:string|xs:base64Binary}}<br />
|-<br />
| '''Summary'''<br />
|Converts the specified string to an {{Code|xs:base64Binary}} item. If the default encoding is chosen, conversion will be cheap, as both {{Code|xs:string}} and {{Code|xs:base64Binary}} items are internally represented as byte arrays.<br/>The UTF-8 default encoding can be overridden with the optional {{Code|$encoding}} argument.<br />
|-<br />
| '''Errors'''<br />
|{{Error|BXCO0001|#Errors}} The input cannot be represented in the specified encoding.<br/>{{Error|BXCO0002|#Errors}} The specified encoding is invalid or not supported.<br />
|-<br />
| '''Examples'''<br />
|<br />
* {{Code|string(convert:string-to-base64('HelloWorld'))}} yields <code>SGVsbG9Xb3JsZA==</code>.<br />
|}<br />
<br />
==convert:string-to-hex==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|convert:string-to-hex|$input as xs:string|xs:hexBinary}}<br/>{{Func|convert:string-to-hex|$input as xs:string, $encoding as xs:string|xs:hexBinary}}<br />
|-<br />
| '''Summary'''<br />
|Converts the specified string to an {{Code|xs:hexBinary}} item. If the default encoding is chosen, conversion will be cheap, as both {{Code|xs:string}} and {{Code|xs:hexBinary}} items are internally represented as byte arrays.<br/>The UTF-8 default encoding can be overridden with the optional {{Code|$encoding}} argument.<br />
|-<br />
| '''Errors'''<br />
|{{Error|BXCO0001|#Errors}} The input cannot be represented in the specified encoding.<br/>{{Error|BXCO0002|#Errors}} The specified encoding is invalid or not supported.<br />
|-<br />
| '''Examples'''<br />
|<br />
* {{Code|string(convert:string-to-hex('HelloWorld'))}} yields {{Code|48656C6C6F576F726C64}}.<br />
|}<br />
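<br />
The string functions of this section can be combined into a round trip. The following sketch encodes a string with a non-default encoding and decodes it again; the same encoding must be passed to both calls, and the sample string and encoding are chosen for illustration only:<br />
<br />
<pre class="brush:xquery"><br />
(: encode the string as CP1252 bytes :)<br />
let $binary := convert:string-to-base64('touché', 'CP1252')<br />
(: decode the bytes with the same encoding; this restores the input string :)<br />
return convert:binary-to-string($binary, 'CP1252')<br />
</pre><br />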
<br />
=Binary Data=<br />
<br />
==convert:bytes-to-base64==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|convert:bytes-to-base64|$input as xs:byte*|xs:base64Binary}}<br />
|-<br />
| '''Summary'''<br />
|Converts the specified byte sequence to a {{Code|xs:base64Binary}} item. Conversion is cheap, as {{Code|xs:base64Binary}} items are internally represented as byte arrays.<br />
|}<br />
<br />
==convert:bytes-to-hex==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|convert:bytes-to-hex|$input as xs:byte*|xs:hexBinary}}<br />
|-<br />
| '''Summary'''<br />
|Converts the specified byte sequence to a {{Code|xs:hexBinary}} item. Conversion is cheap, as {{Code|xs:hexBinary}} items are internally represented as byte arrays.<br />
|}<br />
<br />
==convert:binary-to-bytes==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|convert:binary-to-bytes|$bin as xs:anyAtomicType|xs:byte*}}<br />
|-<br />
| '''Summary'''<br />
|Returns the specified binary data (xs:base64Binary, xs:hexBinary) as a sequence of bytes.<br />
|-<br />
| '''Examples'''<br />
|<br />
* <code>convert:binary-to-bytes(xs:base64Binary('QmFzZVggaXMgY29vbA=='))</code> yields the sequence {{Code|(66, 97, 115, 101, 88, 32, 105, 115, 32, 99, 111, 111, 108)}}.<br />
* {{Code|convert:binary-to-bytes(xs:hexBinary("4261736558"))}} yields the sequence {{Code|(66, 97, 115, 101, 88)}}.<br />
|}<br />
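<br />
The three functions of this section complement each other. As a small sketch, the following query converts a byte sequence to an {{Code|xs:hexBinary}} item and back. The integers are explicitly cast to {{Code|xs:byte}}, as the signatures above expect {{Code|xs:byte*}}:<br />
<br />
<pre class="brush:xquery"><br />
(: the bytes of the string "BaseX" :)<br />
let $bytes := (66, 97, 115, 101, 88) ! xs:byte(.)<br />
let $hex := convert:bytes-to-hex($bytes)<br />
return (<br />
  (: string representation of the binary item: 4261736558 :)<br />
  string($hex),<br />
  (: back to the original byte sequence :)<br />
  convert:binary-to-bytes($hex)<br />
)<br />
</pre><br />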
<br />
=Numbers=<br />
<br />
==convert:integer-to-base==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|convert:integer-to-base|$num as xs:integer, $base as xs:integer|xs:string}}<br /><br />
|-<br />
| '''Summary'''<br />
|Converts {{Code|$num}} to base {{Code|$base}}, interpreting it as a 64-bit unsigned integer.<br />The first {{Code|$base}} elements of the sequence {{Code|'0',..,'9','a',..,'z'}} are used as digits.<br />Valid bases are {{Code|2, .., 36}}.<br /><br />
|-<br />
| '''Errors'''<br />
|{{Error|BXCO0004|#Errors}} The specified base is not in the range 2-36.<br />
|-<br />
| '''Examples'''<br />
|<br />
* {{Code|convert:integer-to-base(-1, 16)}} yields {{Code|'ffffffffffffffff'}}.<br />
* {{Code|convert:integer-to-base(22, 5)}} yields {{Code|'42'}}.<br />
|}<br />
<br />
==convert:integer-from-base==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|convert:integer-from-base|$str as xs:string, $base as xs:integer|xs:integer}}<br /><br />
|-<br />
| '''Summary'''<br />
|Decodes an {{Code|xs:integer}} from {{Code|$str}}, assuming that it's encoded in base {{Code|$base}}.<br /> The first {{Code|$base}} elements of the sequence {{Code|'0',..,'9','a',..,'z'}} are allowed as digits; case does not matter. <br />Valid bases are 2 - 36.<br /> If {{Code|$str}} contains more than 64 bits of information, the result is truncated arbitrarily.<br />
|-<br />
| '''Errors'''<br />
|{{Error|BXCO0004|#Errors}} The specified base is not in the range 2-36.<br/>{{Error|BXCO0005|#Errors}} The specified digit is not valid for the given range.<br />
|-<br />
| '''Examples'''<br />
|<br />
* {{Code|convert:integer-from-base('ffffffffffffffff', 16)}} yields {{Code|-1}}.<br />
* {{Code|convert:integer-from-base('CAFEBABE', 16)}} yields {{Code|3405691582}}.<br />
* {{Code|convert:integer-from-base('42', 5)}} yields {{Code|22}}.<br />
* {{Code|convert:integer-from-base(convert:integer-to-base(123, 7), 7)}} yields {{Code|123}}.<br />
|}<br />
<br />
=Dates and Durations=<br />
<br />
==convert:integer-to-dateTime==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|convert:integer-to-dateTime|$ms as xs:integer|xs:dateTime}}<br /><br />
|-<br />
| '''Summary'''<br />
|Converts the specified number of milliseconds since 1 Jan 1970 to an item of type xs:dateTime.<br /><br />
|-<br />
| '''Examples'''<br />
|<br />
* {{Code|convert:integer-to-dateTime(0)}} yields {{Code|1970-01-01T00:00:00Z}}.<br />
* {{Code|convert:integer-to-dateTime(1234567890123)}} yields {{Code|2009-02-13T23:31:30.123Z}}.<br />
* {{Code|convert:integer-to-dateTime(prof:current-ms())}} returns the current point in time, given in milliseconds, as an {{Code|xs:dateTime}} item.<br />
|}<br />
<br />
==convert:dateTime-to-integer==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|convert:dateTime-to-integer|$dateTime as xs:dateTime|xs:integer}}<br /><br />
|-<br />
| '''Summary'''<br />
|Converts the specified item of type xs:dateTime to the number of milliseconds since 1 Jan 1970.<br /><br />
|-<br />
| '''Examples'''<br />
|<br />
* {{Code|convert:dateTime-to-integer(xs:dateTime('1970-01-01T00:00:00Z'))}} yields {{Code|0}}.<br />
|}<br />
<br />
==convert:integer-to-dayTime==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|convert:integer-to-dayTime|$ms as xs:integer|xs:dayTimeDuration}}<br /><br />
|-<br />
| '''Summary'''<br />
|Converts the specified number of milliseconds to an item of type xs:dayTimeDuration.<br /><br />
|-<br />
| '''Examples'''<br />
|<br />
* {{Code|convert:integer-to-dayTime(1234)}} yields {{Code|PT1.234S}}.<br />
|}<br />
<br />
==convert:dayTime-to-integer==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|convert:dayTime-to-integer|$dayTime as xs:dayTimeDuration|xs:integer}}<br /><br />
|-<br />
| '''Summary'''<br />
|Converts the specified item of type xs:dayTimeDuration to milliseconds represented by an integer.<br /><br />
|-<br />
| '''Examples'''<br />
|<br />
* {{Code|convert:dayTime-to-integer(xs:dayTimeDuration('PT1S'))}} yields {{Code|1000}}.<br />
|}<br />
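<br />
As all four functions of this section work with milliseconds, they can be combined for simple date arithmetic. The following sketch adds one day to a timestamp; the input values are arbitrary examples, and the result should be {{Code|2009-02-14T23:31:30Z}}:<br />
<br />
<pre class="brush:xquery"><br />
let $start := xs:dateTime('2009-02-13T23:31:30Z')<br />
(: milliseconds since 1 Jan 1970 :)<br />
let $ms := convert:dateTime-to-integer($start)<br />
(: one day, expressed in milliseconds :)<br />
let $day := convert:dayTime-to-integer(xs:dayTimeDuration('P1D'))<br />
return convert:integer-to-dateTime($ms + $day)<br />
</pre><br />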
<br />
=Errors=<br />
<br />
{| class="wikitable" width="100%"<br />
! width="110"|Code<br />
| Description<br />
|-<br />
|{{Code|BXCO0001}}<br />
|The input is an invalid XML string, or the wrong encoding has been specified.<br />
|-<br />
|{{Code|BXCO0002}}<br />
|The specified encoding is invalid or not supported.<br />
|-<br />
|{{Code|BXCO0003}}<br />
|The specified base is not in the range 2-36.<br />
|-<br />
|{{Code|BXCO0004}}<br />
|The specified encoding is invalid or not supported.<br />
|-<br />
|{{Code|BXCO0005}}<br />
|The specified digit is not valid for the given range.<br />
|}<br />
<br />
=Changelog=<br />
<br />
;Version 8.5<br />
<br />
* Updated: [[#convert:binary-to-string|convert:binary-to-string]]: <code>$fallback</code> argument added.<br />
<br />
;Version 7.5<br />
<br />
* Added: [[#convert:integer-to-dateTime|convert:integer-to-dateTime]], [[#convert:dateTime-to-integer|convert:dateTime-to-integer]], [[#convert:integer-to-dayTime|convert:integer-to-dayTime]], [[#convert:dayTime-to-integer|convert:dayTime-to-integer]].<br />
<br />
The module was introduced with Version 7.3. Some of the functions have been adopted from the obsolete Utility Module.</div>James Ballhttps://docs.basex.org/index.php?title=Databases&diff=13045Databases2016-11-28T10:11:26Z<p>James Ball: Grammatical edit</p>
<hr />
<div>This page is part of the [[Getting Started]] Section.<br />
<br />
In BaseX, a ''database'' is a pretty light-weight concept and can be compared<br />
to a ''collection''. It contains an arbitrary number of '''resources''',<br />
addressed by their unique database path. Resources can either be<br />
'''XML documents''' or '''raw files''' (binaries).<br />
Some information on [[Binary Data|binary data]] can be found on an extra page.<br />
<br />
=Create Databases=<br />
<br />
New databases can be created via commands, in the GUI, or with any of our<br />
[[Developing|APIs]]. If some input is specified along with the create operation, it will be added to the database in a bulk operation:<br />
<br />
* [[Startup#BaseX Standalone|Console]]: <code>CREATE DB db /path/to/resources</code> will add initial documents to a database<br />
* [[Startup#BaseX GUI|GUI]]: Go to ''Database'' → ''New'', press ''Browse'' to choose an initial file or directory, and press ''OK''<br />
<br />
Database names must follow the [[Valid Names|valid names constraints]].<br />
Various [[parsers]] can be chosen to influence the database creation, or to convert different formats to XML.<br />
<br />
'''Note:''' A main-memory only database can be created using the <code>SET MAINMEM true</code> command before calling <code>CREATE DB</code> ([[Databases#In Memory Database|see below]] for more).<br />
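<br />
A database can also be created from within XQuery. The following is a minimal sketch that uses {{Code|db:create}} from the [[Database Module]]; the database name and the input path are placeholders:<br />
<br />
<pre class="brush:xquery"><br />
(: creates the database "example" and adds all resources found at the given path :)<br />
db:create("example", "/path/to/resources")<br />
</pre><br />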
<br />
=Access Resources=<br />
<br />
Stored resources and external documents can be accessed in different ways:<br />
<br />
==XML Documents==<br />
<br />
Various XQuery functions exist to access XML documents in databases:<br />
<br />
{| class="wikitable"<br />
|-<br />
!Function<br />
!Example<br />
!Description<br />
|-<br />
|[[Database Module#db:open|db:open]]<br />
|{{Code|db:open("db", "path/to/docs")}}<br />
|Returns all documents that are found in the database {{Code|db}} at the (optional) path {{Code|path/to/docs}}.<br />
|-<br />
|[http://www.xqueryfunctions.com/xq/fn_collection.html fn:collection]<br />
|{{Code|collection("db/path/to/docs")}}<br />
|Returns all documents at the location {{Code|path/to/docs}} in the database {{Code|db}}.<br/>If no path is specified after the database, all documents in the database will be returned.<br/>If no argument is specified, all documents of the database that has been opened in the global context will be returned.<br />
|-<br />
|[http://www.xqueryfunctions.com/xq/fn_doc.html fn:doc]<br />
|{{Code|doc("db/path/to/doc.xml")}}<br />
|Returns the document at the location {{Code|path/to/doc.xml}} in the database {{Code|db}}.<br/>An error is raised if the specified address yields zero or more than one document.<br />
|}<br />
<br />
You can access multiple databases in a single query:<br />
<br />
<pre class="brush:xquery"><br />
for $i in 1 to 100<br />
return db:open('books' || $i)//book/title<br />
</pre><br />
<br />
If the [[Options#DEFAULTDB|DEFAULTDB]] option is turned on, the path argument of the {{Code|fn:doc}} or {{Code|fn:collection}} function will first be resolved against the globally opened database.<br />
<br />
Two more functions are available for retrieving information on database nodes:<br />
<br />
{| class="wikitable"<br />
|-<br />
!Function<br />
!Example<br />
!Description<br />
|-<br />
|[[Database Module#db:name|db:name]]<br />
|{{Code|db:name($node)}}<br />
|Returns the name of the database in which the specified {{Code|$node}} is stored.<br />
|-<br />
|[[Database Module#db:path|db:path]]<br />
|{{Code|db:path($node)}}<br />
|Returns the path of the database document in which the specified {{Code|$node}} is stored.<br />
|}<br />
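<br />
Both functions can be used to find out where a query result is stored. The following sketch returns the database name and document path for every title; the database name {{Code|books}} and the addressed elements are hypothetical:<br />
<br />
<pre class="brush:xquery"><br />
for $title in db:open('books')//book/title<br />
(: database name and document path, joined by a slash :)<br />
return db:name($title) || '/' || db:path($title)<br />
</pre><br />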
<br />
The {{Code|fn:document-uri}} and {{Code|fn:base-uri}} functions return URIs that can also be reused as arguments for the {{Code|fn:doc}} and {{Code|fn:collection}} functions. As a result, the following example query always returns {{Code|true}}:<br />
<br />
<pre class="brush:xquery"><br />
every $c in collection('anyDB')<br />
satisfies doc-available(document-uri($c))<br />
</pre><br />
<br />
If the argument of {{Code|fn:doc}} or {{Code|fn:collection}} does not start with a valid database name, or if the addressed database does not exist, the string is interpreted as URI reference, and the documents found at this location will be returned. Examples:<br />
<br />
* {{Code|doc("http://web.de")}}: retrieves the addressed URI and returns it as a main-memory document node.<br />
* {{Code|doc("myfile.xml")}}: retrieves the given file from the file system and returns it as a main-memory document node. Note that updates to main-memory nodes are not automatically written back to disk unless the <code>[[Options#WRITEBACK|WRITEBACK]]</code> option is set.<br />
* {{Code|collection("/path/to/docs")}}: returns a main-memory collection with all XML documents found at the addressed file path.<br />
<br />
==Raw Files==<br />
<br />
The <code>[[Commands#RETRIEVE|RETRIEVE]]</code> command and the <code>[[Database Module#db:retrieve|db:retrieve]]</code> function can be used to return files in their native byte representation.<br />
<br />
If the API you use does not support binary output (this is e.g. the case for various [[Clients|Client]] language bindings), you need to convert your binary data to its string representation before returning it to the client:<br />
<br />
<pre class="brush:xquery"><br />
string(db:retrieve('multimedia', 'sample.avi'))<br />
</pre><br />
<br />
==HTTP Services==<br />
<br />
* With [[REST]] and [[WebDAV]], all database resources can be requested in a uniform way, no matter if they are well-formed XML documents or binary files.<br />
<br />
=Update Resources=<br />
<br />
Once you have created a database, additional commands exist to modify its contents:<br />
<br />
* XML documents can be added with the <code>[[Commands#ADD|ADD]]</code> command.<br />
* Raw files are added with <code>[[Commands#STORE|STORE]]</code>.<br />
* Existing resources can be replaced with the <code>[[Commands#REPLACE|REPLACE]]</code> command.<br />
* Resources can be deleted via <code>[[Commands#DELETE|DELETE]]</code>.<br />
<br />
The [[Options#AUTOFLUSH|AUTOFLUSH]] option can be turned off before ''bulk operations'' (i.e. before a large number of new resources is added to the database).<br />
<br />
The [[Options#ADDCACHE|ADDCACHE]] option will first cache the input before adding it to the database. This is helpful when the input documents to be added are expected to eat up too much main memory.<br />
<br />
The following commands create an empty database, add two resources, explicitly flush data structures to disk, and finally delete all inserted data:<br />
<br />
<pre><br />
CREATE DB example<br />
SET AUTOFLUSH false<br />
ADD example.xml<br />
SET ADDCACHE true<br />
ADD /path/to/xml/documents<br />
STORE TO images/ 123.jpg<br />
FLUSH<br />
DELETE /<br />
</pre><br />
<br />
You may also use the BaseX-specific [[Database Module|XQuery Database Functions]] to create, add, replace, and delete XML documents:<br />
<br />
<pre class="brush:xquery"><br />
let $root := "/path/to/xml/documents/"<br />
for $file in file:list($root)<br />
return db:add("database", $root || $file)<br />
</pre><br />
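<br />
Replacing and deleting resources works in a similar way. The following sketch uses {{Code|db:replace}} and {{Code|db:delete}} from the [[Database Module]]; the database name and all paths are placeholders. Both updates are collected first and applied together at the end of the query:<br />
<br />
<pre class="brush:xquery"><br />
(: replace a stored document with a new version from disk :)<br />
db:replace("database", "docs/example.xml", "/path/to/xml/documents/example.xml"),<br />
(: delete another resource from the same database :)<br />
db:delete("database", "docs/obsolete.xml")<br />
</pre><br />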
<br />
Last but not least, XML documents can also be added via the GUI and the ''Database'' menu.<br />
<br />
=Export Data=<br />
<br />
All resources stored in a database can be ''exported'', i.e., written back to disk. This can be done in several ways:<br />
<br />
* Commands: <code>[[Commands#EXPORT|EXPORT]]</code> writes all resources to the specified target directory<br />
* GUI: Go to ''Database'' → ''Export'', choose the target directory and press ''OK''<br />
* WebDAV: Locate the database directory (or a sub-directory of it) and copy all contents to another location<br />
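<br />
An export can also be triggered via XQuery. The following sketch uses {{Code|db:export}} from the [[Database Module]]; the database name and the target directory are placeholders:<br />
<br />
<pre class="brush:xquery"><br />
(: writes all resources of the database to the target directory :)<br />
db:export("database", "/path/to/export")<br />
</pre><br />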
<br />
=In Memory Database=<br />
<br />
* In the standalone context, a main-memory database can be created (using <code>CREATE DB</code>), which can then be accessed by subsequent commands.<br />
* If a BaseX server instance is started, and if a database is created in its context (using <code>CREATE DB</code>), other BaseX client instances can access (and update) this database (using OPEN, db:open, etc.) as long as no other database is opened/created by the server.<br />
* You can force an ordinary database to be copied to memory by using <code>db:open('some-db') update {}</code><br />
<br />
'''Note:''' main-memory database instances are also created by the invocation of <code>doc(...)</code> or <code>collection(...)</code>, if the argument is not a<br />
database (no matter which value is set for MAINMEM). In other words:<br />
the same internal representation is used for main-memory databases and<br />
documents/collections generated via XQuery.<br />
<br />
=Changelog=<br />
<br />
;Version 8.4<br />
<br />
* Updated: [[#Raw Files|Raw Files]]: Items of binary type can be output without specifying the obsolete <code>raw</code> serialization method.<br />
<br />
;Version 7.2.1<br />
<br />
* Updated: {{Code|fn:document-uri}} and {{Code|fn:base-uri}} now return strings that can be reused with {{Code|fn:doc}} or {{Code|fn:collection}} to reopen the original document.</div>James Ballhttps://docs.basex.org/index.php?title=Serialization&diff=12238Serialization2016-01-13T19:31:23Z<p>James Ball: Added details on the behaviour of include-content-type</p>
<hr />
<div>This page is part of the [[XQuery|XQuery Portal]].<br />
Serialization parameters define how XQuery items and XML nodes are textually output, i.e., ''serialized''. (For input, see [[Parsers]].)<br />
They have been formalized in the [http://www.w3.org/TR/xslt-xquery-serialization-31 W3C XQuery Serialization 3.1] document.<br />
In BaseX, they can be specified by…<br />
<br />
* including them in the [[XQuery_3.0#Serialization|prolog of the XQuery expression]],<br />
* specifying them in the XQuery functions [[File_Module#file:write|file:write()]] or [[XQuery_3.0#Functions|fn:serialize()]]. The serialization parameters are specified as<br />
** children of an {{Code|&lt;output:serialization-parameters/&gt;}} element, as defined for the [http://www.w3.org/TR/xpath-functions-30/#func-serialize fn:serialize()] function, or as<br />
** map, which contains all key/value pairs: <code>map { "method": "xml", "cdata-section-elements": "div", ... }</code>,<br />
* using the {{Code|-s}} flag of the BaseX [[Command-Line Options#BaseX Standalone|command-line]] clients,<br />
* setting the [[Options#SERIALIZER|SERIALIZER]] option before running a query,<br />
* setting the [[Options#EXPORTER|EXPORTER]] option before exporting a database, or<br />
* setting them as [[REST#Parameters|REST]] query parameters.<br />
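<br />
The two ways of passing parameters to {{Code|fn:serialize()}} can be sketched as follows. The items and the chosen separator are arbitrary; the {{Code|output}} namespace is declared explicitly, so the query does not depend on a predeclared prefix. Both calls are expected to produce the same string:<br />
<br />
<pre class="brush:xquery"><br />
declare namespace output = "http://www.w3.org/2010/xslt-xquery-serialization";<br />
<br />
let $items := (<a/>, <b/>)<br />
return (<br />
  (: parameters specified as map :)<br />
  serialize($items, map { "method": "xml", "item-separator": "|" }),<br />
  (: parameters specified as children of an element :)<br />
  serialize($items,<br />
    <output:serialization-parameters><br />
      <output:method value="xml"/><br />
      <output:item-separator value="|"/><br />
    </output:serialization-parameters><br />
  )<br />
)<br />
</pre><br />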
<br />
=Parameters=<br />
<br />
{{Mark|Updated with Version 8.4}}: New serialization method <code>basex</code>; method <code>raw</code> was removed. By default, items of binary type are now output in their native byte representation.<br />
<br />
The following table gives a brief summary of all serialization parameters recognized by BaseX. For details, please refer to the official specification.<br />
<br />
{| class="wikitable sortable" width="100%"<br />
|- valign="top"<br />
! width="140" | Parameter<br />
! Description<br />
! Allowed<br />
! Default<br />
|- valign="top"<br />
| {{Code|method}}<br />
| Specifies the serialization method. {{Code|xml}}, {{Code|xhtml}}, {{Code|html}}, {{Code|text}}, {{Code|json}}, and {{Code|adaptive}} are adopted from the official specification. The methods {{Code|basex}} and {{Code|csv}} are specific to BaseX (see [[XQuery Extensions#Serialization|XQuery Extensions]]).<br />
| {{Code|xml}}, {{Code|xhtml}}, {{Code|html}}, {{Code|text}}, {{Code|json}}, {{Code|adaptive}}, {{Code|csv}}, {{Code|basex}}<br />
| {{Code|basex}}<br />
|- valign="top"<br />
| {{Code|version}}<br />
| Specifies the version of the serialization method.<br />
| xml/xhtml: {{Code|1.0}}, {{Code|1.1}}<br/>html: {{Code|4.0}}, {{Code|4.01}}, {{Code|5.0}}<br/><br />
| {{Code|1.0}}<br />
|- valign="top"<br />
| {{Code|html-version}}<br />
| Specifies the version of the HTML serialization method.<br />
| {{Code|4.0}}, {{Code|4.01}}, {{Code|5.0}}<br />
| {{Code|4.0}}<br />
|- valign="top"<br />
| {{Code|item-separator}}<br />
| Determines a string to be used as item separator. If a separator is specified, the default separation of atomic values with single whitespaces will be skipped.<br />
| ''arbitrary strings'', {{Code|\n}}, {{Code|\r\n}}, {{Code|\r}}<br />
| ''empty''<br />
|- valign="top"<br />
| {{Code|encoding}}<br />
| Encoding to be used for outputting the data.<br />
| ''[http://docs.oracle.com/javase/7/docs/technotes/guides/intl/encoding.doc.html all encodings supported by Java]''<br />
| {{Code|UTF-8}}<br />
|- valign="top"<br />
| {{Code|indent}}<br />
| Adjusts whitespace to make the output more readable.<br />
| {{Code|yes}}, {{Code|no}}<br />
| {{Code|yes}}<br />
|- valign="top"<br />
| {{Code|cdata-section-elements}}<br />
| List of elements to be output as CDATA, separated by whitespaces.<br />Example: {{Code|&lt;text&gt;&lt;![CDATA[ &lt;&gt; ]]&gt;&lt;/text&gt;}}<br />
| <br />
| <br />
|- valign="top"<br />
| {{Code|omit-xml-declaration}}<br />
| Omits the XML declaration, which is serialized before the actual query result<br />Example: <code>&lt;?xml version="1.0" encoding="UTF-8"?&gt;</code><br />
| {{Code|yes}}, {{Code|no}}<br />
| {{Code|yes}}<br />
|- valign="top"<br />
| {{Code|standalone}}<br />
| Prints or omits the "standalone" attribute in the XML declaration.<br />
| {{Code|yes}}, {{Code|no}}, {{Code|omit}}<br />
| {{Code|omit}}<br />
|- valign="top"<br />
| {{Code|doctype-system}}<br />
| Introduces the output with a document type declaration and the given system identifier.<br />Example: {{Code|&lt;!DOCTYPE x SYSTEM "entities.dtd"&gt;}}<br />
|<br />
|<br />
|- valign="top"<br />
| {{Code|doctype-public}}<br />
| If {{Code|doctype-system}} is specified, adds a public identifier.<br />Example: {{Code|&lt;!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "<nowiki>http://www.w3.org/TR/html4/strict.dtd</nowiki>"&gt;}}<br />
| <br />
|<br />
|- valign="top"<br />
| {{Code|undeclare-prefixes}}<br />
| Undeclares prefixes in XML 1.1.<br />
| {{Code|yes}}, {{Code|no}}<br />
| {{Code|no}}<br />
|- valign="top"<br />
| {{Code|normalization-form}}<br />
| Specifies a normalization form. BaseX supports Form C ({{Code|NFC}}).<br />
| {{Code|NFC}}, {{Code|none}}<br />
| {{Code|NFC}}<br />
|- valign="top"<br />
| {{Code|media-type}}<br />
| Specifies the media type.<br />
| <br />
| {{Code|application/xml}}<br />
|- valign="top"<br />
| {{Code|parameter-document}}<br />
| Parses the value as XML document with additional serialization parameters (see the [http://www.w3.org/TR/xslt-xquery-serialization-31/#serparams-in-xdm-instance Serialization Specification] for more details).<br />
| <br />
| <br />
|- valign="top"<br />
| {{Code|use-character-maps}}<br />
| Defines character mappings. May only occur in documents parsed with {{Code|parameter-document}}.<br />
| <br />
| <br />
|- valign="top"<br />
| {{Code|byte-order-mark}}<br />
| Prints a byte-order-mark before starting serialization.<br />
| {{Code|yes}}, {{Code|no}}<br />
| {{Code|no}}<br />
|- valign="top"<br />
| {{Code|escape-uri-attributes}}<br />
| Escapes URI information in certain HTML attributes.<br />Example: <code>&lt;a&nbsp;href="%C3%A4%C3%B6%C3%BC"&gt;äöü&lt;/a&gt;</code><br />
| {{Code|yes}}, {{Code|no}}<br />
| {{Code|no}}<br />
|- valign="top"<br />
| {{Code|include-content-type}}<br />
| Inserts a {{Code|meta}} content-type element into the head element if the result is output as HTML<br />Example: <code>&lt;head&gt;&lt;meta http-equiv="Content-Type" content="text/html; charset=UTF-8"&gt;&lt;/head&gt;</code>. The head element must already exist or nothing will be added. Any existing {{Code|meta}} content-type elements will be removed.<br />
| {{Code|yes}}, {{Code|no}}<br />
| {{Code|no}}<br />
|}<br />
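<br />
For illustration, several of these parameters can be combined as option declarations in the query prolog. This is only a sketch: the element names are made up, and the parameter values are taken from the "Allowed" column above:<br />
<br />
<pre class="brush:xquery"><br />
declare namespace output = "http://www.w3.org/2010/xslt-xquery-serialization";<br />
(: serialize as XML, print the XML declaration, and indent the result :)<br />
declare option output:method "xml";<br />
declare option output:omit-xml-declaration "no";<br />
declare option output:indent "yes";<br />
<root><child>text</child></root><br />
</pre><br />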
<br />
BaseX provides some additional serialization parameters:<br />
<br />
{| class="wikitable sortable" width="100%"<br />
|- valign="top"<br />
! width="140" | Parameter<br />
! Description<br />
! Allowed<br />
! Default<br />
|- valign="top"<br />
| {{Code|csv}}<br />
| Defines how data is serialized as CSV.<br />
| see [[CSV Module]]<br />
|<br />
|- valign="top"<br />
| {{Code|json}}<br />
| Defines how data is serialized as JSON.<br />
| see [[JSON Module]]<br />
| <br />
|- valign="top"<br />
| {{Code|tabulator}}<br />
| Uses tab characters ({{Code|\t}}) instead of spaces for indenting elements.<br />
| {{Code|yes}}, {{Code|no}}<br />
| {{Code|no}}<br />
|- valign="top"<br />
| {{Code|indents}}<br />
| Specifies the number of characters to be indented.<br />
| ''positive number''<br />
| {{Code|2}}<br />
|- valign="top"<br />
| {{Code|newline}}<br />
| Specifies the type of newline to be used as end-of-line marker. <br />
| {{Code|\n}}, {{Code|\r\n}}, {{Code|\r}}<br />
| ''system dependent''<br />
|- valign="top"<br />
| {{Code|limit}}<br />
| Stops serialization after the specified number of bytes has been serialized. If a negative number is specified, everything will be output.<br />
| ''positive number''<br />
| {{Code|-1}}<br />
|- valign="top"<br />
| {{Code|binary}}<br />
| Indicates if items of binary type are output in their native byte representation. Only applicable to the <code>basex</code> serialization method.<br />
| {{Code|yes}}, {{Code|no}}<br />
| {{Code|yes}}<br />
|}<br />
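<br />
The BaseX-specific parameters are set in the same way as the standard ones. The following sketch, with made-up element names, requests an indentation of four characters and a fixed newline marker; the values are assumptions chosen from the "Allowed" column above:<br />
<br />
<pre class="brush:xquery"><br />
declare namespace output = "http://www.w3.org/2010/xslt-xquery-serialization";<br />
declare option output:indent "yes";<br />
(: BaseX-specific: indent by 4 characters instead of the default 2 :)<br />
declare option output:indents "4";<br />
(: BaseX-specific: use \n as end-of-line marker on all systems :)<br />
declare option output:newline "\n";<br />
<root><child>text</child></root><br />
</pre><br />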
<br />
The {{Code|csv}} and {{Code|json}} parameters are supplied with a list of options. Option names and values are combined with <code>=</code>, several options are separated by <code>,</code>:<br />
<br />
'''Query''':<br />
<pre class="brush:xquery"><br />
(: The output namespace declaration is optional, because it is statically declared in BaseX :)<br />
declare namespace output = "http://www.w3.org/2010/xslt-xquery-serialization";<br />
declare option output:method "csv";<br />
declare option output:csv "header=yes, separator=semicolon";<br />
<csv><br />
<record><br />
<Name>John</Name><br />
<City>Newton</City><br />
</record><br />
<record><br />
<Name>Jack</Name><br />
<City>Oldtown</City><br />
</record><br />
</csv><br />
</pre><br />
<br />
'''Result''':<br />
<pre class="brush:xml"><br />
Name;City<br />
John;Newton<br />
Jack;Oldtown<br />
</pre><br />
<br />
=Changelog=<br />
<br />
;Version 8.4<br />
<br />
* Added: Serialization parameter {{Code|binary}}.<br />
* Updated: New serialization method <code>basex</code>. By default, items of binary type are now output in their native byte representation. The method <code>raw</code> was removed.<br />
<br />
;Version 8.0<br />
<br />
* Added: Support for {{Code|use-character-maps}} and {{Code|parameter-document}}.<br />
* Added: Serialization method {{Code|adaptive}}.<br />
* Updated: {{Code|adaptive}} is new default method (before: {{Code|xml}}).<br />
* Removed: {{Code|format}}, {{Code|wrap-prefix}}, {{Code|wrap-uri}}.<br />
<br />
;Version 7.8.2<br />
<br />
* Added: {{Code|limit}}: Stops serialization after the specified number of bytes has been serialized.<br />
<br />
;Version 7.8<br />
<br />
* Added: {{Code|csv}} and {{Code|json}} serialization parameters.<br />
* Removed: {{Code|separator}} option (use {{Code|item-separator}} instead).<br />
<br />
;Version 7.7.2<br />
<br />
* Added: {{Code|csv}} serialization method.<br />
* Added: temporary serialization methods {{Code|csv-header}}, {{Code|csv-separator}}, {{Code|json-unescape}}, {{Code|json-spec}}, {{Code|json-format}}.<br />
<br />
;Version 7.5<br />
<br />
* Added: official {{Code|item-separator}} and {{Code|html-version}} parameter.<br />
* Updated: <code>method=html5</code> removed; serializers updated with the [http://www.w3.org/TR/2013/WD-xslt-xquery-serialization-30-20130108/ latest version of the specification], using <code>method=html</code> and <code>version=5.0</code>.<br />
<br />
;Version 7.2<br />
<br />
* Added: {{Code|separator}} parameter.<br />
<br />
;Version 7.1<br />
<br />
* Added: {{Code|newline}} parameter.<br />
<br />
;Version 7.0<br />
<br />
* Added: Serialization parameters added to [[REST API]]; JSON/JsonML/raw methods.</div>James Ballhttps://docs.basex.org/index.php?title=Options&diff=12233Options2016-01-12T16:44:23Z<p>James Ball: Added details of how to set global options for the Mac OS X packaged application.</p>
<hr />
<div>This page is linked from the [[Getting Started]] Section.<br />
<br />
The options listed on this page influence how database [[Commands|commands]] are executed and XQuery expressions are evaluated. Options are divided into [[#Global Options|'''global options''']], which are valid for all BaseX instances, and '''local options''', which are specific to a client or session. Values of options are either ''strings'', ''numbers'' or ''booleans''.<br />
<br />
The {{Code|.basex}} [[Configuration#Configuration Files|configuration file]] is parsed by every new local BaseX instance. It contains all global options and, optionally, local options at the end of the file.<br />
<br />
Various ways exist to access and change options:<br />
<br />
* The current value of an option can be requested with the [[Commands#GET|GET]] command. Local options can be changed via [[Commands#SET|SET]]. All values are ''static'': They stay valid until they are changed once again by another operation. If an option is of type ''boolean'', and if no value is specified, its current value will be inverted.<br />
<br />
* Initial values for global options can also be specified via system properties, which can be passed on with the [http://docs.oracle.com/javase/1.4.2/docs/tooldocs/windows/java.html#options -D flag] on the command line, or set with [http://docs.oracle.com/javase/6/docs/api/java/lang/System.html#setProperty(java.lang.String,%20java.lang.String) System.setProperty()] before creating a BaseX instance. The specified keys need to be prefixed with {{Code|org.basex.}}. An example:<br />
<br />
<pre class="brush:bash"><br />
java -Dorg.basex.CHOP=false -cp basex.jar org.basex.BaseX -c"get chop"<br />
CHOP: false<br />
</pre><br />
<br />
* If you use the Mac OS X packaged application, global options can be set in the {{Code|Info.plist}} file, which is located in the {{Code|Contents}} folder of the application package. For example:<br />
<br />
<key>JVMOptions</key><br />
<array><br />
<string>-Dorg.basex.CHOP=false</string><br />
</array><br />
<br />
* In XQuery, local options can be set via option declarations and pragmas (see [[XQuery Extensions]]).<br />
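<br />
As a hedged sketch of the last variant, the following two queries turn off the {{Code|CHOP}} option for a single query. They assume the statically declared {{Code|db}} prefix described in [[XQuery Extensions]]; the document name {{Code|doc.xml}} is hypothetical. The first query uses an option declaration, which applies to the whole query:<br />
<br />
<pre class="brush:xquery"><br />
declare option db:chop "false";<br />
doc("doc.xml")<br />
</pre><br />
<br />
The second query uses a pragma, which only applies to the enclosed expression:<br />
<br />
<pre class="brush:xquery"><br />
(# db:chop false #) { doc("doc.xml") }<br />
</pre><br />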
<br />
If options are implicitly changed by operations in the [[GUI]], the underlying commands will be listed in the [[GUI#Visualizations|Info View]].<br/><br/><br />
<br />
=Global Options=<br />
<br />
Global options are constants. They can only be set in the configuration file or via system properties (see above). One exception is the [[#debug|DEBUG]] option, which can also be changed at runtime by users with [[User Management|admin permissions]].<br />
<br />
==General Options==<br />
<br />
===DEBUG===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|DEBUG [boolean]}}<br />
|-<br />
| '''Default'''<br />
|{{Code|false}}<br />
|-<br />
| '''Summary'''<br />
|Sends internal debug info to STDERR. This option can be turned on to get additional information for development and debugging purposes. It can also be triggered on [[Command-Line Options#BaseX Standalone|command line]] via <code>-d</code>.<br />
|}<br />
<br />
===DBPATH===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|DBPATH [path]}}<br />
|-<br />
| '''Default'''<br />
|<code>[[Configuration#Database Directory|{home}/BaseXData]]</code> or <code>[[Configuration#Database Directory|{home}/data]]</code><br />
|-<br />
| '''Summary'''<br />
|Points to the directory in which all databases are located.<br />
|}<br />
<br />
===REPOPATH===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|REPOPATH [path]}}<br />
|-<br />
| '''Default'''<br />
|<code>[[Configuration#Database Directory|{home}/BaseXRepo]]</code><br />
|-<br />
| '''Summary'''<br />
|Points to the [[Repository]], in which all XQuery modules are located.<br />
|}<br />
<br />
===LANG===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|LANG [language]}}<br />
|-<br />
| '''Default'''<br />
|{{Code|English}}<br />
|-<br />
| '''Summary'''<br />
|Specifies the interface language. Currently, seven languages are available: 'English', 'German', 'French', 'Dutch', 'Italian', 'Japanese', and 'Vietnamese'.<br />
|}<br />
<br />
===LANGKEY===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|LANGKEY [boolean]}}<br />
|-<br />
| '''Default'''<br />
|{{Code|false}}<br />
|-<br />
| '''Summary'''<br />
|Prefixes all texts with the internal language keys. This option is helpful if BaseX is translated into another language, and if you want to see where particular texts are displayed.<br />
|}<br />
<br />
===GLOBALLOCK===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|GLOBALLOCK [boolean]}}<br />
|-<br />
| '''Default'''<br />
|{{Code|false}}<br />
|-<br />
| '''Summary'''<br />
|Controls if local (database) or global (process) locking will be used for managing read and write operations. The article on [[Transaction Management]] provides more details on concurrency control.<br />
|}<br />
<br />
==Client/Server Architecture==<br />
<br />
===HOST===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|HOST [host]}}<br />
|-<br />
| '''Default'''<br />
|{{Code|localhost}}<br />
|-<br />
| '''Summary'''<br />
|This host name is used by the client when connecting to a server. This option can also be changed when running the client on [[Command-Line Options#BaseX Client|command line]] via <code>-n</code>.<br />
|}<br />
<br />
===PORT===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|PORT [port]}}<br />
|-<br />
| '''Default'''<br />
|{{Code|1984}}<br />
|-<br />
| '''Summary'''<br />
|This port is used by the client when connecting to a server. This option can also be changed when running the client on [[Command-Line Options#BaseX Client|command line]] via <code>-p</code>.<br />
|}<br />
<br />
===SERVERPORT===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|SERVERPORT [port]}}<br />
|-<br />
| '''Default'''<br />
|{{Code|1984}}<br />
|-<br />
| '''Summary'''<br />
|This is the port the database server will be listening to. This option can also be changed when running the server on [[Command-Line Options#BaseX Server|command line]] via <code>-p</code>.<br />
|}<br />
<br />
===USER===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|USER [name]}}<br />
|-<br />
| '''Default'''<br />
|''empty''<br />
|-<br />
| '''Summary'''<br />
|Represents a user name, which is used for accessing the server or an HTTP service:<br />
* The default value will be overwritten if a client specifies its own credentials.<br />
* If the default value is empty, login will only be possible if the client specifies credentials.<br />
* The option can also be changed on [[Command-Line Options#BaseX Client|command line]] via <code>-U</code>.<br />
|}<br />
<br />
===PASSWORD===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|PASSWORD [password]}}<br />
|-<br />
| '''Default'''<br />
|''empty''<br />
|-<br />
| '''Summary'''<br />
|Represents a password, which is used for accessing the server or an HTTP service:<br />
* The default value will be overwritten if a client specifies its own credentials.<br />
* If the default value is empty, login will only be possible if the client specifies credentials.<br />
* The option can also be changed on [[Command-Line Options#BaseX Client|command line]] via <code>-P</code>.<br />
* Please note that it is a security risk to specify your password in plain text.<br />
|}<br />
<br />
===AUTHMETHOD===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|AUTHMETHOD [method]}}<br />
|-<br />
| '''Default'''<br />
|''Basic''<br />
|-<br />
| '''Summary'''<br />
|Specifies the HTTP Authentication, which will be proposed by the [[Web Application|HTTP server]] if a client sends an unauthorized request. Allowed values are {{Code|Basic}} and {{Code|Digest}}.<br />
|}<br />
<br />
===SERVERHOST===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|SERVERHOST [host&#x7c;ip]}}<br />
|-<br />
| '''Default'''<br />
|''empty''<br />
|-<br />
| '''Summary'''<br />
|This is the host name or IP address the server is bound to. If the option is set to an empty string (which is the default), the server will be open to all clients.<br />
|}<br />
<br />
===PROXYHOST===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|PROXYHOST [host]}}<br />
|-<br />
| '''Default'''<br />
|''empty'' <br />
|-<br />
| '''Summary'''<br />
|This is the host name of a proxy server. If the value is an empty string, it will be ignored.<br />
|}<br />
<br />
===PROXYPORT===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|PROXYPORT [port]}}<br />
|-<br />
| '''Default'''<br />
|{{Code|0}}<br />
|-<br />
| '''Summary'''<br />
|This is the port number of a proxy server. If the value is set to {{Code|0}}, it will be ignored.<br />
|}<br />
<br />
===NONPROXYHOSTS===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|NONPROXYHOSTS [hosts]}}<br />
|-<br />
| '''Default'''<br />
|''empty''<br />
|-<br />
| '''Summary'''<br />
|This is a list of hosts that should be directly accessed. If the value is an empty string, it will be ignored.<br />
|}<br />
<br />
===IGNORECERT===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|IGNORECERT [boolean]}}<br />
|-<br />
| '''Default'''<br />
|{{Code|false}}<br />
|-<br />
| '''Summary'''<br />
|This option can be turned on to ignore untrusted certificates when connecting to servers. Please use this option carefully.<br />
|}<br />
<br />
===TIMEOUT===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|TIMEOUT [seconds]}}<br />
|-<br />
| '''Default'''<br />
|{{Code|30}}<br />
|-<br />
| '''Summary'''<br />
|Specifies the maximum time a read-only transaction may take. If an operation takes longer than the specified timeout, it will be aborted. Write operations will not be affected by this timeout, as this would corrupt the integrity of the database. The timeout is deactivated if the timeout is set to {{Code|0}}. It is ignored for {{Code|ADMIN}} operations.<br />
|}<br />
<br />
===KEEPALIVE===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|KEEPALIVE [seconds]}}<br />
|-<br />
| '''Default'''<br />
|{{Code|600}}<br />
|-<br />
| '''Summary'''<br />
|Specifies the maximum time a client will be remembered by the server. If there has been no interaction with a client for a longer time than specified by this timeout, it will be disconnected. Running operations will not be affected by this option. The keepalive check is deactivated if the value is set to {{Code|0}}.<br />
|}<br />
<br />
===PARALLEL===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|PARALLEL [number]}}<br />
|-<br />
| '''Default'''<br />
|{{Code|8}}<br />
|-<br />
| '''Summary'''<br />
|Denotes the maximum allowed {{Code|number}} of parallel [[Transaction Management|transactions]].<br/>Note that a higher number of parallel operations may increase disk activity and thus slow down queries. In some cases, a single transaction may even give you better results than any parallel activity. The main reason for allowing parallel operations is to prevent slow transactions from blocking all other operations.<br />
|}<br />
<br />
===LOG===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|LOG [boolean]}}<br />
|-<br />
| '''Default'''<br />
|{{Code|true}}<br />
|-<br />
| '''Summary'''<br />
|Turns [[Logging]] of server operations and HTTP requests on/off. This option can also be changed when running the server on [[Command-Line Options#BaseX Server|command line]] via <code>-z</code>.<br />
|}<br />
<br />
===LOGMSGMAXLEN===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|LOGMSGMAXLEN [length]}}<br />
|-<br />
| '''Default'''<br />
|{{Code|1000}}<br />
|-<br />
| '''Summary'''<br />
|Specifies the maximum length of a single [[Logging|log message]].<br />
|}<br />
<br />
==HTTP Services==<br />
<br />
If BaseX is run as a web servlet, the HTTP options must be specified in the {{Code|jetty.xml}} and {{Code|web.xml}} configuration files of the <code>[https://github.com/BaseXdb/basex/tree/master/basex-api/src/main/webapp/WEB-INF webapp/WEB-INF]</code> directory.<br />
<br />
===WEBPATH===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|WEBPATH [path]}}<br />
|-<br />
| '''Default'''<br />
|<code>[[Configuration#Database Directory|{home}/BaseXWeb]]</code> or <code>[[Configuration#Database Directory|{home}/webapp]]</code><br />
|-<br />
| '''Summary'''<br />
|Points to the directory in which all the [[Web Application]] contents are stored, including XQuery, Script, [[RESTXQ]] and configuration files. This option is ignored if BaseX is deployed as [[Web Application#Servlet_Container|web servlet]].<br />
|}<br />
<br />
===RESTXQPATH===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|RESTXQPATH [path]}}<br />
|-<br />
| '''Default'''<br />
|''empty''<br />
|-<br />
| '''Summary'''<br />
|Points to the directory which contains the [[RESTXQ]] modules of a web application. Relative paths will be resolved against the [[#WEBPATH|WEBPATH]] directory.<br />
|}<br />
<br />
===CACHERESTXQ===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|CACHERESTXQ}}<br />
|-<br />
| '''Default'''<br />
|{{Code|false}}<br />
|-<br />
| '''Summary'''<br />
|Caches [[RESTXQ]] modules once when starting the web application.<br/>The option is helpful in production environments with a high load, but files should not be replaced while the web server is running.<br />
|}<br />
<br />
===RESTPATH===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|RESTPATH [path]}}<br />
|-<br />
| '''Default'''<br />
|''empty''<br />
|-<br />
| '''Summary'''<br />
|Points to the directory which contains XQuery files and command scripts, which can be evaluated via the [[REST#GET Requests|REST run operation]]. Relative paths will be resolved against the [[#WEBPATH|WEBPATH]] directory.<br />
|}<br />
<br />
===HTTPLOCAL===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|HTTPLOCAL [boolean]}}<br />
|-<br />
| '''Default'''<br />
|{{Code|false}}<br />
|-<br />
| '''Summary'''<br />
|By default, if BaseX is run as a [[Web Application]], a database server instance will be started as soon as the first HTTP service is called. The server can then be addressed by other BaseX clients in parallel to the HTTP services.<br/>If the option is set to {{Code|true}}, the database server will be disabled.<br />
|}<br />
<br />
===STOPPORT===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|STOPPORT [port]}}<br />
|-<br />
| '''Default'''<br />
|{{Code|8985}}<br />
|-<br />
| '''Summary'''<br />
|This is the port on which the [[Startup#BaseX HTTP Server|HTTP Server]] can be locally closed:<br />
* The listener for stopping the web server will only be started if the specified value is greater than {{Code|0}}.<br />
* The option is ignored if BaseX is used as a [[Web Application]] or started via [[Web Application#Maven|Maven]].<br />
* This option can also be changed when running the HTTP server on [[Command-Line Options#BaseX Server|command line]] via <code>-s</code>.<br />
|}<br />
<br />
=Create Options=<br />
<br />
==General==<br />
<br />
===MAINMEM===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|MAINMEM [boolean]}}<br />
|-<br />
| '''Default'''<br />
|{{Code|false}}<br />
|-<br />
| '''Summary'''<br />
|If this option is turned on, new databases will be exclusively created in main memory. Most queries will be evaluated faster in main memory mode, but all data is lost if BaseX is shut down. The value of this option will be assigned once to a new database, and cannot be changed after that.<br />
|}<br />
<br />
===ADDCACHE===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|ADDCACHE [boolean]}}<br />
|-<br />
| '''Default'''<br />
|{{Code|false}}<br />
|-<br />
| '''Summary'''<br />
|If this option is activated, data structures of documents will first be cached to disk before being added to the final database. This option is helpful when larger documents need to be added, and if the existing heuristics cannot estimate the input size (e.g. when adding directories or sending input streams).<br />
|}<br />
<br />
==Parsing==<br />
<br />
===CREATEFILTER===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|CREATEFILTER [filter]}}<br />
|-<br />
| '''Default'''<br />
|{{Code|*.xml}}<br />
|-<br />
| '''Summary'''<br />
|File filter in the [[Commands#Glob Syntax|Glob Syntax]], which is applied whenever new databases are created, or resources are added to a database.<br />
|}<br />
<br />
===ADDARCHIVES===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|ADDARCHIVES [boolean]}}<br />
|-<br />
| '''Default'''<br />
|{{Code|true}}<br />
|-<br />
| '''Summary'''<br />
|If this option is set to {{Code|true}}, files within archives (ZIP, GZIP, TAR, TGZ, DOCX, etc.) are parsed whenever new databases are created or resources are added to a database.<br />
|}<br />
<br />
===ARCHIVENAME===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|ARCHIVENAME [boolean]}}<br />
|-<br />
| '''Default'''<br />
|{{Code|false}}<br />
|-<br />
| '''Summary'''<br />
|If this option is set to {{Code|true}}, the file name of parsed archives will be included in the document paths.<br />
|}<br />
<br />
===SKIPCORRUPT===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|SKIPCORRUPT [boolean]}}<br />
|-<br />
| '''Default'''<br />
|{{Code|false}}<br />
|-<br />
| '''Summary'''<br />
|Skips corrupt (i.e., not well-formed) files while creating a database or adding new documents. If this option is activated, document updates are slowed down, as all files will be parsed twice. In addition, main memory consumption will be higher, as parsed files will be cached in main memory.<br />
|}<br />
<br />
===ADDRAW===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|ADDRAW [boolean]}}<br />
|-<br />
| '''Default'''<br />
|{{Code|false}}<br />
|-<br />
| '''Summary'''<br />
|If this option is activated, and if new resources are added to a database, all files that are not filtered by the [[#CREATEFILTER|CREATEFILTER]] option will be added as ''raw'' files (i.e., in their binary representation).<br />
|}<br />
<br />
===PARSER===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|PARSER [type]}}<br />
|-<br />
| '''Default'''<br />
|{{Code|XML}}<br />
|-<br />
| '''Summary'''<br />
|Defines a [[Parsers|parser]] for importing new files to the database. Currently, 'XML', 'JSON', 'CSV', 'TEXT', and 'HTML' are available as parsers. HTML input will be parsed as normal XML if [http://home.ccil.org/~cowan/XML/tagsoup/ TagSoup] is not found in the classpath.<br />
|}<br />
<br />
===CSVPARSER===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|CSVPARSER [options]}}<br />
|-<br />
| '''Default'''<br />
|''empty''<br />
|-<br />
| '''Summary'''<br />
|Specifies how CSV data will be parsed. The available options are listed in the [[CSV Module#Options|CSV Module]].<br />
|}<br />
<br />
===JSONPARSER===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|JSONPARSER [options]}}<br />
|-<br />
| '''Default'''<br />
|''empty''<br />
|-<br />
| '''Summary'''<br />
|Specifies how JSON data will be parsed. The available options are listed in the [[JSON Module#Options|JSON Module]].<br />
|}<br />
<br />
===HTMLPARSER===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|HTMLPARSER [options]}}<br />
|-<br />
| '''Default'''<br />
|''empty''<br />
|-<br />
| '''Summary'''<br />
|Specifies how HTML data will be parsed. The available options are listed in the [[Parsers]] article.<br />
|}<br />
<br />
===TEXTPARSER===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|TEXTPARSER [options]}}<br />
|-<br />
| '''Default'''<br />
|''empty''<br />
|-<br />
| '''Summary'''<br />
|Specifies how TEXT data will be parsed. The available options are listed in the [[Parsers]] article.<br />
|}<br />
<br />
==XML Parsing==<br />
<br />
===CHOP===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|CHOP [boolean]}}<br />
|-<br />
| '''Default'''<br />
|{{Code|true}}<br />
|-<br />
| '''Summary'''<br />
|Many XML documents include whitespaces that have been added to improve readability. The {{Code|CHOP}} option controls the [http://www.w3.org/TR/REC-xml/#sec-white-space white-space processing mode] of the XML parser:<br />
* By default, this option is set to {{Code|true}}. This way, leading and trailing whitespaces from text nodes will be chopped and all empty text nodes will be discarded.<br />
* The flag should be turned off if a document contains [[Full-Text#Mixed Content|mixed content]].<br />
* The flag can also be turned off on [[Command-Line Options#BaseX Standalone|command line]] via <code>-w</code>.<br />
* If the <code>xml:space="preserve"</code> attribute is attached to an element, chopping will be turned off for all descendant text nodes. In the following example document, the whitespaces in the text nodes of the {{Code|text}} element will not be chopped:<br />
<pre class="brush:xml"><br />
<xml><br />
<title><br />
Demonstrating the CHOP flag<br />
</title><br />
<text xml:space="preserve">To <b>be</b>, or not to <b>be</b>, that is the question.</text><br />
</xml><br />
</pre><br />
|}<br />
<br />
===STRIPNS===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|STRIPNS [boolean]}}<br />
|-<br />
| '''Default'''<br />
|{{Code|false}}<br />
|-<br />
| '''Summary'''<br />
|Strips all namespaces from an XML document and all elements while parsing.<br />
|}<br />
<br />
===INTPARSE===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|INTPARSE [boolean]}}<br />
|-<br />
| '''Default'''<br />
|{{Code|false}}<br />
|-<br />
| '''Summary'''<br />
|Uses the internal XML parser instead of the standard Java XML parser. The internal parser is faster, more fault tolerant and supports common HTML entities out-of-the-box, but it does not support all features needed for parsing DTDs.<br />
|}<br />
<br />
===DTD===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|DTD [boolean]}}<br />
|-<br />
| '''Default'''<br />
|{{Code|false}}<br />
|-<br />
| '''Summary'''<br />
|Parses referenced DTDs and resolves XML entities. By default, this option is switched to {{Code|false}}, as many DTDs are located externally, which may completely block the process of creating new databases. The [[#CATFILE|CATFILE]] option can be changed to locally resolve DTDs.<br />
|}<br />
<br />
===XINCLUDE===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|XINCLUDE [boolean]}}<br />
|-<br />
| '''Default'''<br />
|{{Code|true}}<br />
|-<br />
| '''Summary'''<br />
|Resolves XInclude inclusion tags and merges referenced XML documents. By default, this option is switched to {{Code|true}}. This option is only available if the standard Java XML Parser is used (see [[#INTPARSE|INTPARSE]]).<br />
|}<br />
<br />
===CATFILE===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|CATFILE [path]}}<br />
|-<br />
| '''Default'''<br />
|''empty''<br />
|-<br />
| '''Summary'''<br />
|Specifies a catalog file to locally resolve DTDs; see the entry on [[Catalog Resolver]]s for more details.<br />
|}<br />
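<br />
The following sketch shows how DTD parsing might be combined with a local catalog file, so that externally referenced DTDs do not need to be fetched. The catalog path, database name and input path are hypothetical:<br />
<pre class="brush:xml"><br />
SET DTD true<br />
SET CATFILE /path/to/catalog.xml<br />
CREATE DB articles /path/to/articles.xml<br />
</pre><br />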
<br />
==Indexing==<br />
<br />
The current values of the index options will be stored in a newly created database, and will be updated if the [[Commands#OPTIMIZE|OPTIMIZE]] command or the [[Database_Module#db:optimize|db:optimize]] function is called.<br />
<br />
===TEXTINDEX===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|TEXTINDEX [boolean]}}<br />
|-<br />
| '''Default'''<br />
|{{Code|true}}<br />
|-<br />
| '''Summary'''<br />
|Creates a text index whenever a new database is created. A text index speeds up queries with equality comparisons on text nodes; see [[Indexes#Value Indexes|Indexes]] for more details.<br />
|}<br />
<br />
===ATTRINDEX===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|ATTRINDEX [boolean]}}<br />
|-<br />
| '''Default'''<br />
|{{Code|true}}<br />
|-<br />
| '''Summary'''<br />
|Creates an attribute index whenever a new database is created. An attribute index speeds up queries with equality comparisons on attribute values; see [[Indexes#Value Indexes|Indexes]] for more details.<br />
|}<br />
<br />
===FTINDEX===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|FTINDEX [boolean]}}<br />
|-<br />
| '''Default'''<br />
|{{Code|false}}<br />
|-<br />
| '''Summary'''<br />
|Creates a full-text index whenever a new database is created. A full-text index speeds up queries with full-text expressions; see [[Indexes#Value Indexes|Indexes]] for more details.<br />
|}<br />
<br />
===TEXTINCLUDE===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|TEXTINCLUDE [names]}}<br />
|-<br />
| '''Default'''<br />
|''empty''<br />
|-<br />
| '''Summary'''<br />
|Defines name patterns for the parent elements of texts that are indexed. By default, all text nodes will be indexed.<br/>Name patterns are separated by commas. See [[Indexes#Selective Indexing|Selective Indexing]] for more details.<br />
|}<br />
<br />
===ATTRINCLUDE===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|ATTRINCLUDE [names]}}<br />
|-<br />
| '''Default'''<br />
|''empty''<br />
|-<br />
| '''Summary'''<br />
|Defines name patterns for the attributes to be indexed. By default, all attribute nodes will be indexed.<br/>Name patterns are separated by commas. See [[Indexes#Selective Indexing|Selective Indexing]] for more details.<br />
|}<br />
<br />
===FTINCLUDE===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|FTINCLUDE [names]}}<br />
|-<br />
| '''Default'''<br />
|''empty''<br />
|-<br />
| '''Summary'''<br />
|Defines name patterns for the parent elements of texts that are indexed. By default, all text nodes will be indexed.<br/>Name patterns are separated by commas. See [[Indexes#Selective Indexing|Selective Indexing]] for more details.<br />
|}<br />
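<br />
As an illustration of selective indexing, the include options can be set before a database is created. The element names {{Code|title}} and {{Code|author}}, the attribute name {{Code|id}}, and the database name and path are assumptions chosen for this sketch:<br />
<pre class="brush:xml"><br />
SET TEXTINCLUDE title,author<br />
SET ATTRINCLUDE id<br />
CREATE DB library /path/to/library.xml<br />
</pre><br />
With these settings, only texts of {{Code|title}} and {{Code|author}} elements and the values of {{Code|id}} attributes would be added to the value indexes.<br />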
<br />
===MAXLEN===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|MAXLEN [int]}}<br />
|-<br />
| '''Default'''<br />
|{{Code|96}}<br />
|-<br />
| '''Summary'''<br />
|Specifies the maximum length of strings that are to be indexed by the name, path, value, and full-text index structures. The value of this option will be assigned once to a new database, and cannot be changed after that.<br />
|}<br />
<br />
===MAXCATS===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|MAXCATS [int]}}<br />
|-<br />
| '''Default'''<br />
|{{Code|100}}<br />
|-<br />
| '''Summary'''<br />
|Specifies the maximum number of distinct values (categories) that will be stored together with the element/attribute names or unique paths in the [[Index#Name Index|Name Index]] or [[Index#Path Index|Path Index]]. The value of this option will be assigned once to a new database, and cannot be changed after that.<br />
|}<br />
<br />
===UPDINDEX===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|UPDINDEX [boolean]}}<br />
|-<br />
| '''Default'''<br />
|{{Code|false}}<br />
|-<br />
| '''Summary'''<br />
|If turned on, incremental indexing will be applied to new databases:<br />
* With each update, the text and attribute indexes will be refreshed as well.<br />
* The advantage is that the value index structures will always be up-to-date.<br />
* However, updates will usually take longer (the article on [[Index#Updates|Index Structures]] provides more details).<br />
* The value of this option will be assigned once to a new database. It can be reassigned by running [[Commands#OPTIMIZE|OPTIMIZE ALL]] or [[Database_Module#db:optimize|db:optimize($db, true())]].<br />
|}<br />
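<br />
Based on the last point above, incremental indexing could be enabled for an existing database roughly as follows. The database name {{Code|library}} is hypothetical:<br />
<pre class="brush:xml"><br />
SET UPDINDEX true<br />
OPEN library<br />
OPTIMIZE ALL<br />
</pre><br />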
<br />
===AUTOOPTIMIZE===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|AUTOOPTIMIZE [boolean]}}<br />
|-<br />
| '''Default'''<br />
|{{Code|false}}<br />
|-<br />
| '''Summary'''<br />
|If turned on, auto optimization will be applied to new databases:<br />
* With each update, outdated indexes and database statistics will be recreated.<br />
* As a result, the index structures will always be up-to-date.<br />
* However, updates can take much longer, so this option should only be activated for medium-sized databases.<br />
* The value of this option will be assigned once to a new database. It can be reassigned by running [[Commands#OPTIMIZE|OPTIMIZE]] or [[Database_Module#db:optimize|db:optimize]].<br />
|}<br />
<br />
===INDEXSPLITSIZE===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|INDEXSPLITSIZE [num]}}<br />
|-<br />
| '''Default'''<br />
|{{Code|0}}<br />
|-<br />
| '''Summary'''<br />
|This option affects the [[Indexes#Index Construction|construction]] of new text and attribute indexes. It specifies the number of index build operations that are performed before writing partial index data to disk. By default, if the value is set to 0, some dynamic split heuristics are applied. By setting the value to its maximum (2147483647), the index will never be split.<br />
|}<br />
<br />
===FTINDEXSPLITSIZE===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|FTINDEXSPLITSIZE [num]}}<br />
|-<br />
| '''Default'''<br />
|{{Code|0}}<br />
|-<br />
| '''Summary'''<br />
|This option affects the [[Indexes#Index Construction|construction]] of new full-text indexes. It specifies the number of index build operations that are performed before writing partial index data to disk. By default, if the value is set to 0, some dynamic split heuristics are applied. By setting the value to its maximum (2147483647), the index will never be split.<br />
|}<br />
<br />
==Full-Text==<br />
<br />
===STEMMING===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|STEMMING [boolean]}}<br />
|-<br />
| '''Default'''<br />
|{{Code|false}}<br />
|-<br />
| '''Summary'''<br />
|If {{Code|true}}, all tokens will be stemmed during full-text indexing, using a language-specific stemmer implementation. By default, tokens will not be stemmed.<br />
|}<br />
<br />
===CASESENS===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|CASESENS [boolean]}}<br />
|-<br />
| '''Default'''<br />
|{{Code|false}}<br />
|-<br />
| '''Summary'''<br />
|If {{Code|true}}, the case of tokens will be preserved during full-text indexing. By default, case will be ignored (all tokens will be indexed in lower case).<br />
|}<br />
<br />
===DIACRITICS===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|DIACRITICS [boolean]}}<br />
|-<br />
| '''Default'''<br />
|{{Code|false}}<br />
|-<br />
| '''Summary'''<br />
|If set to {{Code|true}}, diacritics will be preserved during full-text indexing. By default, diacritics will be removed.<br />
|}<br />
<br />
===LANGUAGE===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|LANGUAGE [lang]}}<br />
|-<br />
| '''Default'''<br />
|{{Code|en}}<br />
|-<br />
| '''Summary'''<br />
|The specified language influences how an input text is tokenized. This option is mainly important if tokens are to be stemmed, or if the tokenization of a language differs from Western languages.<br />
|}<br />
<br />
===STOPWORDS===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|STOPWORDS [path]}}<br />
|-<br />
| '''Default'''<br />
|''empty''<br />
|-<br />
| '''Summary'''<br />
|A new full-text index will drop tokens that are listed in the specified stopword list. A stopword list may decrease the size of the full-text index. A standard stopword list for English texts is provided in the file {{Code|etc/stopwords.txt}} of the official releases and is also available online at http://files.basex.org/etc/stopwords.txt.<br />
|}<br />
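<br />
A possible combination of the full-text options is sketched below. It assumes that a stemmer is available for the chosen language code ({{Code|de}} is used here as an example) and uses hypothetical paths and a hypothetical database name. The [[#FTINDEX|FTINDEX]] option is set as well, because the other options only affect full-text indexing:<br />
<pre class="brush:xml"><br />
SET FTINDEX true<br />
SET STEMMING true<br />
SET LANGUAGE de<br />
SET STOPWORDS /path/to/stopwords.txt<br />
CREATE DB texts /path/to/texts.xml<br />
</pre><br />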
<br />
=Query Options=<br />
<br />
===QUERYINFO===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|QUERYINFO [boolean]}}<br />
|-<br />
| '''Default'''<br />
|{{Code|false}}<br />
|-<br />
| '''Summary'''<br />
|Prints more information on internal query rewritings, optimizations, and performance. By default, this info is shown in the [[GUI#Visualizations|Info View]] in the GUI. It can also be activated on [[Command-Line Options#BaseX Standalone|command line]] via <code>-V</code>. <br />
|}<br />
<br />
===XQUERY3===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|XQUERY3}}<br />
|-<br />
| '''Default'''<br />
|{{Code|true}}<br />
|-<br />
| '''Summary'''<br />
|Enables all [[XQuery 3.0]] features supported by BaseX. If this option is set to {{Code|false}}, the XQuery parser will only accept expressions of the XQuery 1.0 specification.<br />
|}<br />
<br />
===MIXUPDATES===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|MIXUPDATES}}<br />
|-<br />
| '''Default'''<br />
|{{Code|false}}<br />
|-<br />
| '''Summary'''<br />
|Allows queries to contain both updating and non-updating expressions. All updating constraints will be turned off, and nodes to be returned will be copied before they are modified by an updating expression. By default, this option is set to {{Code|false}}, because the XQuery Update Facility does not allow an updating query to [[XQuery Update#Returning Results|return results]].<br />
|}<br />
<br />
===BINDINGS===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|BINDINGS [vars]}}<br />
|-<br />
| '''Default'''<br />
|''empty''<br />
|-<br />
| '''Summary'''<br />
|Contains external variables to be bound to a query. The string must comply with the following rules:<br />
* Variable names and values must be separated by equality signs.<br />
* Multiple variables must be delimited by commas.<br />
* Commas in values must be duplicated.<br />
* Variables may optionally be introduced with a leading dollar sign.<br />
* If a variable uses a namespace different to the default namespace, it can be specified with the [http://www.jclark.com/xml/xmlns.htm Clark Notation] or [http://www.w3.org/TR/xquery-30/#id-basics Expanded QName Notation].<br />
This option can also be used on [[Command-Line Options#BaseX Standalone|command line]] with the flag <code>-b</code>.<br />
|-<br />
| '''Examples'''<br />
|<br />
* <code>$a=1,$b=2</code> &nbsp; binds the values {{Code|1}} and {{Code|2}} to the variables $a and $b<br />
* <code>a=1,,2</code> &nbsp; binds the value {{Code|1,2}} to the variable $a<br />
* <code>{URI}a=x</code> &nbsp; binds the value {{Code|x}} to the variable $a with the namespace {{Code|URI}}.<br />
* In the following [[Commands#Command_Scripts| Command Script]], the value {{Code|hello world!}} is bound to the variable $GREETING:<br />
<pre class="brush:xml"><br />
SET BINDINGS GREETING="hello world!"<br />
XQUERY declare variable $GREETING external; $GREETING<br />
</pre><br />
|}<br />
<br />
===QUERYPATH===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|QUERYPATH [path]}}<br />
|-<br />
| '''Default'''<br />
|''empty''<br />
|-<br />
| '''Summary'''<br />
|Contains the path (''base URI'') to the executed query (default: ''empty''). This directory will be used to resolve relative paths to documents, query modules, and other resources addressed in a query.<br />
|}<br />
<br />
===INLINELIMIT===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|INLINELIMIT}}<br />
|-<br />
| '''Default'''<br />
|{{Code|100}}<br />
|-<br />
| '''Summary'''<br />
|The XQuery compiler inlines functions to speed up query evaluation. Inlining will only take place if a function body is not too large (i.e., if it does not contain too many expressions). With this option, this maximum number of expressions can be specified.<br/>Function inlining can be turned off by setting the value to {{Code|0}}. The limit can be locally overridden via the <code>[[XQuery_3.0#Annotations|%basex:inline]]</code> annotation.<br />
|}<br />
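<br />
As a small sketch, function inlining can be switched off globally before a query is run, and [[#QUERYINFO|QUERYINFO]] can be enabled to inspect how the query was compiled. The function {{Code|local:double}} is a made-up example:<br />
<pre class="brush:xml"><br />
SET INLINELIMIT 0<br />
SET QUERYINFO true<br />
XQUERY declare function local:double($n) { 2 * $n }; local:double(21)<br />
</pre><br />
The query returns {{Code|42}} in either case. Only the compiled query shown in the info output should differ.<br />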
<br />
===TAILCALLS===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|TAILCALLS}}<br />
|-<br />
| '''Default'''<br />
|{{Code|256}}<br />
|-<br />
| '''Summary'''<br />
|Specifies how many stack frames of [http://en.wikipedia.org/wiki/Tail_call tail-calls] are allowed on the stack at any time. When this limit is reached, tail-call optimization takes place and some call frames are eliminated. The feature can be turned off by setting the value to {{Code|-1}}.<br />
|}<br />
<br />
===DEFAULTDB===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|DEFAULTDB}}<br />
|-<br />
| '''Default'''<br />
|{{Code|false}}<br />
|-<br />
| '''Summary'''<br />
|If this option is turned on, paths specified in the {{Code|fn:doc}} and {{Code|fn:collection}} functions will first be resolved against a database that has been opened in the global context outside the query (e.g. by the [[Commands#OPEN|OPEN]] command). If the path does not match any existing resources, it will be resolved as described in the article on [[Databases#Access Resources|accessing database resources]].<br />
|}<br />
<br />
===FORCECREATE===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|FORCECREATE [boolean]}}<br />
|-<br />
| '''Default'''<br />
|{{Code|false}}<br />
|-<br />
| '''Summary'''<br />
|By activating this option, the XQuery {{Code|doc()}} and {{Code|collection()}} functions will create database instances for the addressed input files.<br />
|}<br />
<br />
===CHECKSTRINGS===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|CHECKSTRINGS [boolean]}}<br />
|-<br />
| '''Default'''<br />
|{{Code|true}}<br />
|-<br />
| '''Summary'''<br />
|If this option is turned off, strings from external sources will be adopted as is, i.e., without being checked for valid XML characters:<br />
* This option affects [[Java Bindings]] and the string conversion and input functions [[Archive Module#archive:create|archive:create]], [[Archive Module#archive:extract-text|archive:extract-text]], [[Archive Module#archive:update|archive:update]], [[Conversion Module#convert:binary-to-string|convert:binary-to-string]], [[Fetch Module#fetch:text|fetch:text]], [[File Module#file:read-text|file:read-text]], and [[ZIP Module#zip:text-entry|zip:text-entry]].<br />
* Please be aware that careless use of this option may cause unexpected behavior when storing or outputting strings.<br />
|}<br />
<br />
===LSERROR===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|LSERROR [error]}}<br />
|-<br />
| '''Default'''<br />
|{{Code|0}}<br />
|-<br />
| '''Summary'''<br />
|This option specifies the maximum Levenshtein error for the BaseX-specific fuzzy match option. See the page on [[Full-Text#Fuzzy_Querying|Full-Texts]] for more information on fuzzy querying.<br />
|}<br />
<br />
===RUNQUERY===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|RUNQUERY [boolean]}}<br />
|-<br />
| '''Default'''<br />
|{{Code|true}}<br />
|-<br />
| '''Summary'''<br />
|Specifies if a query will be executed or parsed only. This option can also be changed on [[Command-Line Options#BaseX Standalone|command line]] via <code>-R</code>.<br />
|}<br />
<br />
===RUNS===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|RUNS [num]}}<br />
|-<br />
| '''Default'''<br />
|{{Code|1}}<br />
|-<br />
| '''Summary'''<br />
|Specifies how often a query will be evaluated. The result is serialized only once, and the measured times are averages of all runs. This option can also be changed on [[Command-Line Options#BaseX Standalone|command line]] via <code>-r</code>.<br />
|}<br />
<br />
=Serialization Options=<br />
<br />
===SERIALIZE===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|SERIALIZE [boolean]}}<br />
|-<br />
| '''Default'''<br />
|{{Code|true}}<br />
|-<br />
| '''Summary'''<br />
|Results of XQuery expressions will be serialized if this option is turned on. For debugging purposes and performance measurements, this option can be set to {{Code|false}}. It can also be turned off on [[Command-Line Options#BaseX Standalone|command line]] via <code>-z</code>. <br />
|}<br />
<br />
===SERIALIZER===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|SERIALIZER [params]}}<br />
|-<br />
| '''Default'''<br />
|''empty''<br />
|-<br />
| '''Summary'''<br />
|Parameters for [[Serialization|serializing]] query results. The string must comply with the following rules:<br />
* Variable names and values must be separated by equality signs.<br />
* Multiple variables must be delimited by commas.<br />
* Commas in values must be duplicated.<br />
The option can also be used on [[Command-Line Options#BaseX Standalone|command line]] with the flag <code>-s</code>.<br />
|-<br />
| '''Examples'''<br />
|<br />
* <code>encoding=US-ASCII,omit-xml-declaration=no</code> : sets the encoding to {{Code|US-ASCII}} and prints the XML declaration.<br />
* <code>item-separator=,,</code> : separates serialized items by a single comma.<br />
|}<br />
<br />
===EXPORTER===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|EXPORTER [params]}}<br />
|-<br />
| '''Default'''<br />
|''empty''<br />
|-<br />
| '''Summary'''<br />
|Contains parameters for exporting all resources of a database; see [[Serialization]] for more details. Keys and values are separated by equality signs, multiple parameters are delimited by commas.<br />
|}<br />
<br />
===XMLPLAN===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|XMLPLAN [boolean]}}<br />
|-<br />
| '''Default'''<br />
|{{Code|false}}<br />
|-<br />
| '''Summary'''<br />
|Prints the execution plan of an XQuery expression in its XML representation. This option can also be activated on [[Command-Line Options#BaseX Standalone|command line]] via <code>-x</code>. <br />
|}<br />
<br />
===COMPPLAN===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|COMPPLAN [boolean]}}<br />
|-<br />
| '''Default'''<br />
|{{Code|true}}<br />
|-<br />
| '''Summary'''<br />
|Generates the query plan, which can be activated via [[#XMLPLAN|XMLPLAN]], before or after query compilation. This option can also be activated on [[Command-Line Options#BaseX Standalone|command line]] via <code>-X</code>. <br />
|}<br />
<br />
===DOTPLAN===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|DOTPLAN [boolean]}}<br />
|-<br />
| '''Default'''<br />
|{{Code|false}}<br />
|-<br />
| '''Summary'''<br />
|Visualizes the execution plan of an XQuery expression with [http://www.graphviz.org dotty] and saves its dot file in the query directory.<br />
|}<br />
<br />
===DOTCOMPACT===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|DOTCOMPACT [boolean]}}<br />
|-<br />
| '''Default'''<br />
|{{Code|false}}<br />
|-<br />
| '''Summary'''<br />
|Chooses a compact dot representation.<br />
|}<br />
<br />
=Other Options=<br />
<br />
===AUTOFLUSH===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|AUTOFLUSH [boolean]}}<br />
|-<br />
| '''Default'''<br />
|{{Code|true}}<br />
|-<br />
| '''Summary'''<br />
|Flushes database buffers to disk after each update. If this option is set to {{Code|false}}, bulk operations (multiple single updates) will be evaluated faster. As a drawback, the chance of data loss increases if the database is not explicitly flushed via the [[Commands#FLUSH|FLUSH]] command.<br />
|}<br />
<br />
===WRITEBACK===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|WRITEBACK [boolean]}}<br />
|-<br />
| '''Default'''<br />
|{{Code|false}}<br />
|-<br />
| '''Summary'''<br />
|Propagates updates on main-memory instances of files that have been retrieved via {{Code|fn:doc}} or {{Code|fn:collection}} back to disk. This option can also be activated on [[Command-Line Options#BaseX Standalone|command line]] via <code>-u</code>. Please note that, when turning this option on, your original files will not be backed up.<br />
|}<br />
<br />
===MAXSTAT===<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signature'''<br />
|{{Code|MAXSTAT [num]}}<br />
|-<br />
| '''Default'''<br />
|{{Code|30}}<br />
|-<br />
| '''Summary'''<br />
|Specifies the maximum number of index occurrences printed by the <code>[[Commands#INFO|INFO INDEX]]</code> command.<br />
|}<br />
<br />
=Changelog=<br />
<br />
;Version 8.3<br />
<br />
* Added: <code>[[#CACHERESTXQ|CACHERESTXQ]]</code>, <code>[[#TEXTINCLUDE|TEXTINCLUDE]]</code>, <code>[[#ATTRINCLUDE|ATTRINCLUDE]]</code>, <code>[[#FTINCLUDE|FTINCLUDE]]</code>, <code>[[#ARCHIVENAME|ARCHIVENAME]]</code><br />
<br />
;Version 8.2<br />
<br />
* Removed: <code>EVENTPORT</code>, <code>CACHEQUERY</code><br />
<br />
;Version 8.1<br />
<br />
* Added: <code>[[#IGNORECERT|IGNORECERT]]</code>, <code>[[#RESTPATH|RESTPATH]]</code><br />
<br />
;Version 8.0<br />
<br />
* Added: <code>[[#MIXUPDATES|MIXUPDATES]]</code>, <code>[[#AUTOOPTIMIZE|AUTOOPTIMIZE]]</code>, <code>[[#AUTHMETHOD|AUTHMETHOD]]</code>, <code>[[#XINCLUDE|XINCLUDE]]</code><br />
* Updated: <code>[[#PROXYPORT|PROXYPORT]]</code>: default set to 0; will be ignored. <code>[[#PROXYHOST|PROXYHOST]]</code>, <code>[[#NONPROXYHOSTS|NONPROXYHOSTS]]</code>: empty strings will be ignored.<br />
<br />
;Version 7.8.1<br />
* Updated: <code>[[#ADDARCHIVES|ADDARCHIVES]]</code>: parsing of TAR and TGZ files.<br />
<br />
;Version 7.8<br />
<br />
* Added: <code>[[#CSVPARSER|CSVPARSER]]</code>, <code>[[#JSONPARSER|JSONPARSER]]</code>, <code>[[#TEXTPARSER|TEXTPARSER]]</code>, <code>[[#HTMLPARSER|HTMLPARSER]]</code>, <code>[[#INLINELIMIT|INLINELIMIT]]</code>, <code>[[#TAILCALLS|TAILCALLS]]</code>, <code>[[#DEFAULTDB|DEFAULTDB]]</code>, <code>[[#RUNQUERY|RUNQUERY]]</code><br />
* Updated: <code>[[#WRITEBACK|WRITEBACK]]</code> only applies to main-memory document instances.<br />
* Updated: <code>[[#DEBUG|DEBUG]]</code> option can be changed at runtime by users with admin permissions.<br />
* Updated: default of <code>[[#INTPARSE|INTPARSE]]</code> is now {{Code|false}}.<br />
* Removed: <code>HTMLOPT</code> (replaced with <code>[[#HTMLPARSER|HTMLPARSER]]</code>), <code>PARSEROPT</code> (replaced with parser-specific options), <code>DOTDISPLAY</code>, <code>DOTTY</code><br />
<br />
;Version 7.7<br />
<br />
* Added: <code>[[#ADDCACHE|ADDCACHE]]</code>, <code>[[#CHECKSTRINGS|CHECKSTRINGS]]</code>, <code>[[#FTINDEXSPLITSIZE|FTINDEXSPLITSIZE]]</code>, <code>[[#INDEXSPLITSIZE|INDEXSPLITSIZE]]</code><br />
<br />
;Version 7.6<br />
<br />
* Added: <code>[[#GLOBALLOCK|GLOBALLOCK]]</code><br />
* Added: store local options in configuration file after {{Code|# Local Options}} comments.<br />
<br />
;Version 7.5<br />
<br />
* Added: options can now be set via system properties<br />
* Added: a pragma expression can be used to locally change database options<br />
* Added: <code>[[#USER|USER]]</code>, <code>[[#PASSWORD|PASSWORD]]</code>, <code>[[#LOG|LOG]]</code>, <code>[[#LOGMSGMAXLEN|LOGMSGMAXLEN]]</code>, <code>[[#WEBPATH|WEBPATH]]</code>, <code>[[#RESTXQPATH|RESTXQPATH]]</code>, <code>[[#HTTPLOCAL|HTTPLOCAL]]</code>, <code>[[#CREATEONLY|CREATEONLY]]</code>, <code>[[#STRIPNS|STRIPNS]]</code><br />
* Removed: {{Code|HTTPPATH}}; {{Code|HTTPPORT}}: {{Code|jetty.xml}} configuration file is used instead<br />
* Removed: global options cannot be changed anymore during the lifetime of a BaseX instance<br />
<br />
;Version 7.3<br />
<br />
* Updated: <code>[[#KEEPALIVE|KEEPALIVE]]</code>, <code>[[#TIMEOUT|TIMEOUT]]</code>: default values changed<br />
* Removed: {{Code|WILDCARDS}}; new index supports both fuzzy and wildcard queries<br />
* Removed: {{Code|SCORING}}; new scoring model will focus on lengths of text nodes and match options<br />
<br />
;Version 7.2<br />
<br />
* Added: <code>[[#PROXYHOST|PROXYHOST]]</code>, <code>[[#PROXYPORT|PROXYPORT]]</code>, <code>[[#NONPROXYHOSTS|NONPROXYHOSTS]]</code>, <code>[[#HTMLOPT|HTMLOPT]]</code><br />
* Updated: <code>[[#TIMEOUT|TIMEOUT]]</code>: ignore timeout for admin users<br />
<br />
;Version 7.1<br />
<br />
* Added: <code>[[#ADDRAW|ADDRAW]]</code>, <code>[[#MAXLEN|MAXLEN]]</code>, <code>[[#MAXCATS|MAXCATS]]</code>, <code>[[#UPDINDEX|UPDINDEX]]</code><br />
* Updated: <code>[[#BINDINGS|BINDINGS]]</code><br />
<br />
;Version 7.0<br />
<br />
* Added: <code>[[#SERVERHOST|SERVERHOST]]</code>, <code>[[#KEEPALIVE|KEEPALIVE]]</code>, <code>[[#AUTOFLUSH|AUTOFLUSH]]</code>, <code>[[#QUERYPATH|QUERYPATH]]</code></div>James Ballhttps://docs.basex.org/index.php?title=Profiling_Module&diff=11732Profiling Module2015-05-13T08:02:45Z<p>James Ball: Corrected examples for prof:mem that had command as prof:mb</p>
<hr />
<div>This [[Module Library|XQuery Module]] contains various testing, profiling and helper functions.<br />
<br />
=Conventions=<br />
<br />
All functions in this module are assigned to the <code><nowiki>http://basex.org/modules/prof</nowiki></code> namespace, which is statically bound to the {{Code|prof}} prefix.<br/><br />
All errors are assigned to the <code><nowiki>http://basex.org/errors</nowiki></code> namespace, which is statically bound to the {{Code|bxerr}} prefix.<br />
<br />
=Functions=<br />
<br />
==prof:time==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|prof:time|$expr as item()|item()*}}<br />{{Func|prof:time|$expr as item(), $cache as xs:boolean|item()*}}<br />{{Func|prof:time|$expr as item(), $cache as xs:boolean, $label as xs:string|item()*}}<br />
|-<br />
| '''Summary'''<br />
|Measures the time needed to evaluate {{Code|$expr}} and sends it to standard error or, if the GUI is used, to the Info View.<br />If {{Code|$cache}} is set to {{Code|true()}}, the result will be temporarily cached. This way, a potential iterative execution of the expression (which often yields different memory usage) is blocked.<br/>A third, optional argument {{Code|$label}} may be specified to tag the profiling result.<br />
|-<br />
| '''Properties'''<br />
|The function is ''non-deterministic'': evaluation order will be preserved by the compiler.<br />
|-<br />
| '''Examples'''<br />
|<br />
* {{Code|prof:time("1 to 100000")}} may output {{Code|25.69 ms}}.<br />
* {{Code|prof:time("1 to 100000", true())}} may output {{Code|208.12 ms}}.<br />
|}<br />
<br />
==prof:mem==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|prof:mem|$expr as item()|item()*}}<br />{{Func|prof:mem|$expr as item(), $cache as xs:boolean|item()*}}<br />{{Func|prof:mem|$expr as item(), $cache as xs:boolean, $label as xs:string|item()*}}<br />
|-<br />
| '''Summary'''<br />
|Measures the memory allocated by evaluating {{Code|$expr}} and sends it to standard error or, if the GUI is used, to the Info View.<br />If {{Code|$cache}} is set to {{Code|true()}}, the result will be temporarily cached. This way, a potential iterative execution of the expression (which often yields different memory usage) is blocked.<br/>A third, optional argument {{Code|$label}} may be specified to tag the profiling result.<br />
|-<br />
| '''Properties'''<br />
|The function is ''non-deterministic'': evaluation order will be preserved by the compiler.<br />
|-<br />
| '''Examples'''<br />
|<br />
* {{Code|prof:mem("1 to 100000")}} may output {{Code|0 Bytes}}.<br />
* {{Code|prof:mem("1 to 100000", true())}} may output {{Code|26.678 mb}}.<br />
|}<br />
<br />
==prof:sleep==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|prof:sleep|$ms as xs:integer|empty-sequence()}}<br /><br />
|-<br />
| '''Summary'''<br />
|Sleeps for the specified number of milliseconds.<br />
|-<br />
| '''Properties'''<br />
|The function is ''non-deterministic'': evaluation order will be preserved by the compiler.<br />
|}<br />
<br />
==prof:human==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|prof:human|$number as xs:integer|xs:string}}<br />
|-<br />
| '''Summary'''<br />
|Returns a human-readable representation of the specified {{Code|$number}}.<br />
|-<br />
| '''Example'''<br />
|<br />
* {{Code|prof:human(16384)}} returns {{Code|16K}}.<br />
|}<br />
<br />
==prof:dump==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|prof:dump|$expr as item()|empty-sequence()}}<br />{{Func|prof:dump|$expr as item(), $label as xs:string|empty-sequence()}}<br /><br />
|-<br />
| '''Summary'''<br />
|Dumps a serialized representation of {{Code|$expr}} to {{Code|STDERR}}, optionally prefixed with {{Code|$label}}, and returns an empty sequence. If the GUI is used, the dumped result is shown in the [[Graphical User Interface#Visualizations|Info View]].<br />
|-<br />
| '''Properties'''<br />
|In contrast to {{Code|fn:trace()}}, the consumed expression will not be passed on.<br />
|}<br />
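<br />
The difference to {{Code|fn:trace()}} can be illustrated with a short sketch. Because {{Code|prof:dump}} returns an empty sequence, the dumped value must be returned separately if it is still needed. The label {{Code|i: }} is an arbitrary choice:<br />
<pre class="brush:xquery"><br />
for $i in 1 to 3<br />
return (<br />
  (: writes the current value to STDERR or the Info View, returns nothing :)<br />
  prof:dump($i, 'i: '),<br />
  (: the actual result of the iteration :)<br />
  $i * 2<br />
)<br />
</pre><br />
The query result is the sequence {{Code|2}}, {{Code|4}}, {{Code|6}}. The dumped values only appear in the error stream or in the Info View.<br />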
<br />
==prof:variables==<br />
<br />
{{Mark|Introduced with Version 8.1}}:<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|prof:variables||empty-sequence()}}<br />
|-<br />
| '''Summary'''<br />
|Prints a list of all current local and global variable assignments to standard error or, if the GUI is used, to the Info View.<br />As every query is optimized before being evaluated, not all of the original variables may be visible in the output. Moreover, many variables of function calls will disappear because functions are inlined. Function inlining can be turned off by setting the [[Options#INLINELIMIT|INLINELIMIT]] option to <code>0</code>.<br />
|-<br />
| '''Properties'''<br />
|The function is ''non-deterministic'': evaluation order will be preserved by the compiler.<br />
|-<br />
| '''Examples'''<br />
|<br />
* {{Code|for $x in 1 to 2 return prof:variables()}} will dump the values of <code>$x</code> to standard error.<br />
|}<br />
<br />
==prof:current-ms==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|prof:current-ms||xs:integer}}<br /><br />
|-<br />
| '''Summary'''<br />
|Returns the number of milliseconds passed since 1970/01/01 UTC. The granularity of the value depends on the underlying operating system and may be larger. For example, many operating systems measure time in units of tens of milliseconds.<br />
|-<br />
| '''Properties'''<br />
|In contrast to {{Code|fn:current-time()}}, the function is ''non-deterministic'', as it returns different values every time it is called. Its evaluation order will be preserved by the compiler.<br />
|}<br />
<br />
==prof:current-ns==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|prof:current-ns||xs:integer}}<br /><br />
|-<br />
| '''Summary'''<br />
|Returns the current value of the most precise available system timer in nanoseconds.<br />
|-<br />
| '''Properties'''<br />
|In contrast to {{Code|fn:current-time()}}, the function is ''non-deterministic'', as it returns different values every time it is called. Its evaluation order will be preserved by the compiler.<br />
|}<br />
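<br />
The timer functions can be combined to measure elapsed time. The following sketch assumes that a pause of 100 milliseconds stands in for the code to be measured; the variable name is illustrative. It relies on the documented guarantee that the evaluation order of these non-deterministic functions is preserved:<br />
<pre class="brush:xquery"><br />
(: take a start value, pause, then compute the elapsed time in milliseconds :)<br />
let $start := prof:current-ns()<br />
return (<br />
  prof:sleep(100),<br />
  (prof:current-ns() - $start) idiv 1000000<br />
)<br />
</pre><br />
The result is a single integer, as {{Code|prof:sleep}} returns an empty sequence. The exact value depends on the granularity of the system timer.<br />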
<br />
==prof:void==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|prof:void|$value as item()*|empty-sequence()}}<br />
|-<br />
| '''Summary'''<br />
|Swallows all items of the specified {{Code|$value}} and returns an empty sequence. This function is helpful if some code needs to be evaluated and if the actual result is irrelevant.<br />
|-<br />
| '''Properties'''<br />
|The function is ''non-deterministic'': evaluation order will be preserved by the compiler.<br />
|-<br />
| '''Examples'''<br />
|<br />
* {{Code|prof:void(fetch:binary('http://my.rest.service'))}} performs an HTTP request and ignores the result.<br />
|}<br />
<br />
=Changelog=<br />
<br />
;Version 8.1<br />
<br />
* Added: <code>[[#prof:variables|prof:variables]]</code><br />
<br />
;Version 7.7<br />
<br />
* Added: <code>[[#prof:void|prof:void]]</code><br />
<br />
;Version 7.6<br />
<br />
* Added: <code>[[#prof:human|prof:human]]</code><br />
<br />
;Version 7.5<br />
<br />
* Added: <code>[[#prof:dump|prof:dump]]</code>, <code>[[#prof:current-ms|prof:current-ms]]</code>, <code>[[#prof:current-ns|prof:current-ns]]</code><br />
<br />
This module was introduced with Version 7.3.<br />
<br />
[[Category:XQuery]]</div>James Ballhttps://docs.basex.org/index.php?title=Web_Module&diff=11726Web Module2015-05-06T11:21:05Z<p>James Ball: Spelling correction: reponse to response</p>
<hr />
<div>This [[Module Library|XQuery Module]] provides convenience functions for building web applications with [[RESTXQ]].<br />
<br />
=Conventions=<br />
<br />
All functions in this module are assigned to the <code><nowiki>http://basex.org/modules/web</nowiki></code> namespace, which is statically bound to the {{Code|web}} prefix.<br/><br />
All errors are assigned to the <code><nowiki>http://basex.org/errors</nowiki></code> namespace, which is statically bound to the {{Code|bxerr}} prefix.<br />
<br />
=Functions=<br />
<br />
==web:content-type==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|web:content-type|$path as xs:string|xs:string}}<br/><br />
|-<br />
| '''Summary'''<br />
|Returns the content type of a path by analyzing its file suffix. <code>application/octet-stream</code> is returned if the file suffix is unknown.<br />
|-<br />
| '''Examples'''<br />
|<br />
* <code><nowiki>web:content-type("sample.mp3")</nowiki></code> returns <code>audio/mpeg</code><br />
|}<br />
<br />
==web:create-url==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|web:create-url|$url as xs:string, $parameters as map(*)|xs:string}}<br/><br />
|-<br />
| '''Summary'''<br />
|Creates a new URL from the specified <code>$url</code> string and the <code>$parameters</code> specified in a map. The keys and values of the map entries will be converted to strings, URI-encoded, and appended to the URL as query parameters. If a map entry has more than a single item, each item will be appended as a separate parameter.<br />
|-<br />
| '''Examples'''<br />
|<br />
* <code><nowiki>web:create-url('http://find.me', map { 'q': 'dog' })</nowiki></code> returns <code><nowiki>http://find.me?q=dog</nowiki></code><br />
* <code><nowiki>web:create-url('search', map { 'year': (2000,2001), 'title':() })</nowiki></code> returns <code>search?year=2000&year=2001</code><br />
|}<br />
<br />
==web:redirect==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|web:redirect|$location as xs:string|element(rest:response)}}<br/>{{Func|web:redirect|$location as xs:string, $parameters as map(*)|element(rest:response)}}<br/><br />
|-<br />
| '''Summary'''<br />
|Creates a [[RESTXQ#Forwards and Redirects|RESTXQ redirection]] to the specified location. The returned response will only work if no other items are returned by the RESTXQ function.<br/>If <code>$parameters</code> are specified, they will be appended as query parameters to the URL as described for [[#web:create-url|web:create-url]].<br />
|-<br />
| '''Examples'''<br />
|The query <code><nowiki>web:redirect('/a/b')</nowiki></code> returns the following response element:<br />
<pre class="brush:xml"><br />
<rest:response xmlns:rest="http://exquery.org/ns/restxq"><br />
<http:response xmlns:http="http://expath.org/ns/http-client" status="302"><br />
<http:header name="location" value="/a/b"/><br />
</http:response><br />
</rest:response><br />
</pre><br />
|}<br />
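<br />
The two-argument variant can be sketched as follows. The paths {{Code|old/{$id}}} and {{Code|/new}} and the function name are hypothetical; according to the rules described for [[#web:create-url|web:create-url]], the resulting location should have the form <code>/new?id=…</code>:<br />
<pre class="brush:xquery"><br />
(: hypothetical RESTXQ function: forwards requests for an old path to a new one :)<br />
declare %rest:path('old/{$id}') function local:moved($id) {<br />
  (: the redirect must be the only item returned by the function :)<br />
  web:redirect('/new', map { 'id': $id })<br />
};<br />
</pre><br />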
<br />
==web:response-header==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|web:response-header||element(rest:response)}}<br/>{{Func|web:response-header|$headers as map(*)|element(rest:response)}}<br/>{{Func|web:response-header|$headers as map(*), $output as map(*)|element(rest:response)}}<br/><br />
|-<br />
| '''Summary'''<br />
|Creates a [[RESTXQ#Response|RESTXQ response header]] with a default caching directive and default serialization parameters.<br/><br />
Header options can be supplied via the <code>$headers</code> argument. Empty string values can be specified to invalidate default values. By default, the following header options will be returned:<br />
* <code>Cache-Control</code>: <code>max-age=3600,public</code><br />
Serialization parameters can be supplied via the <code>$output</code> argument. Empty string values can be specified to invalidate default values. By default, the following serialization parameters will be returned:<br />
* <code>media-type</code>: <code>application/octet-stream</code><br />
* <code>method</code>: <code>raw</code><br />
|-<br />
| '''Examples'''<br />
|<br />
* The function call <code>web:response-header()</code> returns the following response element:<br />
<pre class="brush:xml"><br />
<rest:response xmlns:rest="http://exquery.org/ns/restxq"><br />
<http:response xmlns:http="http://expath.org/ns/http-client"><br />
<http:header name="Cache-Control" value="max-age=3600,public"/><br />
</http:response><br />
<output:serialization-parameters xmlns:output="http://www.w3.org/2010/xslt-xquery-serialization"><br />
<output:media-type value="application/octet-stream"/><br />
<output:method value="raw"/><br />
</output:serialization-parameters><br />
</rest:response><br />
</pre><br />
* If the following RESTXQ function is called by a web browser…<br/><br />
<pre class="brush:xquery"><br />
declare %rest:path('media/{$file}') function local:get($file) {<br />
let $path := 'path/to/' || $file<br />
return (<br />
web:response-header(map { 'media-type': web:content-type($path) }),<br />
file:read-binary($path)<br />
)<br />
};<br />
</pre><br />
…a file resource with the correct content type will be returned to the user (provided that it exists in the web server's file system).<br />
|}<br />
<br />
=Changelog=<br />
<br />
The module was introduced with Version 8.1.<br />
<br />
[[Category:XQuery]]</div>James Ballhttps://docs.basex.org/index.php?title=Binary_Module&diff=10958Binary Module2014-08-22T20:03:57Z<p>James Ball: Correction bin:length was listed as bin:bin in the title.</p>
<hr />
<div>This [[Module Library|XQuery Module]] contains functions to process binary data, including extracting subparts, searching, basic binary operations and conversion between binary and structured forms.<br />
<br />
This module is based on the [http://expath.org/spec/binary EXPath Binary Module].<br />
<br />
=Conventions=<br />
<br />
All functions and errors in this module are assigned to the {{Code|http://expath.org/ns/binary}} namespace, which is statically bound to the {{Code|bin}} prefix.<br/><br />
<br />
=Constants and Conversions=<br />
<br />
==bin:hex==<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|bin:hex|$in as xs:string?|xs:base64Binary?}}<br />
|-<br />
| '''Summary'''<br />
|Returns the binary form of the set of octets written as a sequence of (ASCII) hex digits ([0-9A-Fa-f]).<br/>{{Code|$in}} will be effectively zero-padded from the left to generate an integral number of octets, i.e. an even number of hexadecimal digits. If {{Code|$in}} is an empty string, then the result will be an {{Code|xs:base64Binary}} with no embedded data. Byte order in the result follows (per-octet) character order in the string. If the value of {{Code|$in}} is the empty sequence, the function returns an empty sequence.<br />
|-<br />
| '''Errors'''<br />
|{{Error|non-numeric-character|#Errors}} the input cannot be parsed as a hexadecimal number.<br />
|-<br />
| '''Examples'''<br />
|<code>bin:hex('11223F4E')</code> yields <code>ESI/Tg==</code>.<br/><code>xs:hexBinary(bin:hex('FF'))</code> yields <code>FF</code>.<br />
|}<br />
<br />
==bin:bin==<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|bin:bin|$in as xs:string?|xs:base64Binary?}}<br />
|-<br />
| '''Summary'''<br />
|Returns the binary form of the set of octets written as a sequence of (8-wise) (ASCII) binary digits ([01]).<br/><code>$in</code> will be effectively zero-padded from the left to generate an integral number of octets. If <code>$in</code> is an empty string, then the result will be an <code>xs:base64Binary</code> with no embedded data. Byte order in the result follows (per-octet) character order in the string. If the value of <code>$in</code> is the empty sequence, the function returns an empty sequence.<br />
|-<br />
| '''Errors'''<br />
|{{Error|non-numeric-character|#Errors}} the input cannot be parsed as a binary number.<br />
|-<br />
| '''Examples'''<br />
|<code>bin:bin('1101000111010101')</code> yields <code>0dU=</code>.<br/><code>xs:hexBinary(bin:bin('1000111010101'))</code> yields <code>11D5</code>.<br />
|}<br />
<br />
==bin:octal==<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|bin:octal|$in as xs:string?|xs:base64Binary?}}<br />
|-<br />
| '''Summary'''<br />
|Returns the binary form of the set of octets written as a sequence of (ASCII) octal digits ([0-7]).<br/><code>$in</code> will be effectively zero-padded from the left to generate an integral number of octets. If <code>$in</code> is an empty string, then the result will be an <code>xs:base64Binary</code> with no embedded data. Byte order in the result follows (per-octet) character order in the string. If the value of <code>$in</code> is the empty sequence, the function returns an empty sequence.<br />
|-<br />
| '''Errors'''<br />
|{{Error|non-numeric-character|#Errors}} the input cannot be parsed as an octal number.<br />
|-<br />
| '''Examples'''<br />
|<code>xs:hexBinary(bin:octal('11223047'))</code> yields <code>252627</code>.<br />
|}<br />
<br />
==bin:to-octets==<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|bin:to-octets|$in as xs:base64Binary|xs:integer*}}<br />
|-<br />
| '''Summary'''<br />
|Returns binary data as a sequence of octets.<br/>If <code>$in</code> is zero-length binary data, the empty sequence is returned. Octets are returned as integers from 0 to 255.<br />
|}<br />
<br />
==bin:from-octets==<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|bin:from-octets|$in as xs:integer*|xs:base64Binary}}<br />
|-<br />
| '''Summary'''<br />
|Converts a sequence of octets into binary data.<br/>Octets are integers from 0 to 255. If the value of <code>$in</code> is the empty sequence, the function returns zero-sized binary data.<br />
|-<br />
| '''Errors'''<br />
|{{Error|octet-out-of-range|#Errors}} one of the octets lies outside the range 0 - 255.<br />
|}<br />
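<br />
A short round-trip sketch of the two octet conversions; the input values are illustrative, and the expected results noted in the comments follow from the descriptions above:<br />
<pre class="brush:xquery"><br />
(<br />
  (: binary data to integers: 255, 0 :)<br />
  bin:to-octets(bin:hex('FF00')),<br />
  (: integers to binary data, shown as hex: 1122 :)<br />
  xs:hexBinary(bin:from-octets((17, 34)))<br />
)<br />
</pre><br />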
<br />
=Basic Operations=<br />
<br />
==bin:length==<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|bin:length|$in as xs:base64Binary|xs:integer}}<br />
|-<br />
| '''Summary'''<br />
|Returns the size of binary data in octets.<br />
|}<br />
<br />
==bin:part==<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|bin:part|$in as xs:base64Binary?, $offset as xs:integer|xs:base64Binary?}}<br/>{{Func|bin:part|$in as xs:base64Binary?, $offset as xs:integer, $size as xs:integer|xs:base64Binary?}}<br />
|-<br />
| '''Summary'''<br />
|Returns a section of binary data starting at the {{Code|$offset}} octet.<br/>If {{Code|$size}} is specified, the size of the returned binary data is {{Code|$size}} octets. If {{Code|$size}} is absent, all remaining data from {{Code|$offset}} is returned. The {{Code|$offset}} is zero based. If the value of {{Code|$in}} is the empty sequence, the function returns an empty sequence.<br />
|-<br />
| '''Errors'''<br />
|{{Error|negative-size|#Errors}} the specified size is negative.<br/>{{Error|index-out-of-range|#Errors}} the specified offset + size is out of range.<br />
|-<br />
| '''Examples'''<br />
|Test whether binary data starts with binary content consistent with a PDF file:<br/><code>bin:part($data, 0, 4) eq bin:hex("25504446")</code>.<br />
|}<br />
<br />
==bin:join==<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|bin:join|$in as xs:base64Binary*|xs:base64Binary}}<br />
|-<br />
| '''Summary'''<br />
|Returns an {{Code|xs:base64Binary}} created by concatenating the items in the sequence {{Code|$in}}, in order. If the value of {{Code|$in}} is the empty sequence, the function returns a binary item containing no data bytes.<br />
|}<br />
<br />
==bin:insert-before==<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|bin:insert-before|$in as xs:base64Binary?, $offset as xs:integer, $extra as xs:base64Binary?|xs:base64Binary?}}<br />
|-<br />
| '''Summary'''<br />
|Returns binary data consisting sequentially of the data from {{Code|$in}} up to and including the {{Code|$offset - 1}} octet, followed by all the data from {{Code|$extra}}, and then the remaining data from {{Code|$in}}.<br/>The {{Code|$offset}} is zero based. If the value of {{Code|$in}} is the empty sequence, the function returns an empty sequence.<br />
|-<br />
| '''Errors'''<br />
|{{Error|index-out-of-range|#Errors}} the specified offset is out of range.<br />
|}<br />
<br />
==bin:pad-left==<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|bin:pad-left|$in as xs:base64Binary?, $size as xs:integer|xs:base64Binary?}}<br/>{{Func|bin:pad-left|$in as xs:base64Binary?, $size as xs:integer, $octet as xs:integer|xs:base64Binary?}}<br />
|-<br />
| '''Summary'''<br />
|Returns an {{Code|xs:base64Binary}} created by padding the input with {{Code|$size}} octets in front of the input. If {{Code|$octet}} is specified, the padding octets each have that value, otherwise they are zero.<br/>If the value of {{Code|$in}} is the empty sequence, the function returns an empty sequence.<br />
|-<br />
| '''Errors'''<br />
|{{Error|negative-size|#Errors}} the specified size is negative.<br/>{{Error|octet-out-of-range|#Errors}} the specified octet lies outside the range 0-255.<br />
|}<br />
<br />
==bin:pad-right==<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|bin:pad-right|$in as xs:base64Binary?, $size as xs:integer|xs:base64Binary?}}<br/>{{Func|bin:pad-right|$in as xs:base64Binary?, $size as xs:integer, $octet as xs:integer|xs:base64Binary?}}<br />
|-<br />
| '''Summary'''<br />
|Returns an {{Code|xs:base64Binary}} created by padding the input with {{Code|$size}} octets after the input. If {{Code|$octet}} is specified, the padding octets each have that value, otherwise they are zero.<br/>If the value of {{Code|$in}} is the empty sequence, the function returns an empty sequence.<br />
|-<br />
| '''Errors'''<br />
|{{Error|negative-size|#Errors}} the specified size is negative.<br/>{{Error|octet-out-of-range|#Errors}} the specified octet lies outside the range 0-255.<br />
|}<br />
<br />
==bin:find==<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|bin:find|$in as xs:base64Binary?, $offset as xs:integer, $search as xs:base64Binary|xs:integer?}}<br />
|-<br />
| '''Summary'''<br />
|Returns the first location of the binary search sequence in the input, or if not found, the empty sequence.<br/>The {{Code|$offset}} and the returned location are zero based. If the value of {{Code|$in}} is the empty sequence, the function returns an empty sequence.<br />
|-<br />
| '''Errors'''<br />
|{{Error|index-out-of-range|#Errors}} the specified offset is out of range.<br />
|}<br />
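<br />
The basic operations can be tried out with small hexadecimal inputs. This is only a sketch with illustrative values; the results in the comments are derived from the function descriptions above, with binary results converted to {{Code|xs:hexBinary}} for readability:<br />
<pre class="brush:xquery"><br />
(<br />
  bin:length(bin:hex('11223344')),                                  (: 4 :)<br />
  xs:hexBinary(bin:join((bin:hex('11'), bin:hex('22')))),           (: 1122 :)<br />
  xs:hexBinary(bin:insert-before(bin:hex('1122'), 1, bin:hex('FF'))), (: 11FF22 :)<br />
  xs:hexBinary(bin:pad-left(bin:hex('11'), 2)),                     (: 000011 :)<br />
  xs:hexBinary(bin:pad-right(bin:hex('11'), 2, 255)),               (: 11FFFF :)<br />
  bin:find(bin:hex('11223344'), 0, bin:hex('33'))                   (: 2, as offsets are zero based :)<br />
)<br />
</pre><br />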
<br />
=Text Decoding and Encoding=<br />
<br />
==bin:decode-string==<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|bin:decode-string|$in as xs:base64Binary?, $encoding as xs:string|xs:string?}}<br/>{{Func|bin:decode-string|$in as xs:base64Binary?, $encoding as xs:string, $offset as xs:integer|xs:string?}}<br/>{{Func|bin:decode-string|$in as xs:base64Binary?, $encoding as xs:string, $offset as xs:integer, $size as xs:integer|xs:string?}}<br/><br />
|-<br />
| '''Summary'''<br />
|Decodes binary data as a string in a given {{Code|$encoding}}.<br/>If {{Code|$offset}} and {{Code|$size}} are provided, the {{Code|$size}} octets from {{Code|$offset}} are decoded. If {{Code|$offset}} alone is provided, octets from {{Code|$offset}} to the end are decoded. If the value of {{Code|$in}} is the empty sequence, the function returns an empty sequence.<br />
|-<br />
| '''Errors'''<br />
|{{Error|negative-size|#Errors}} the specified size is negative.<br/>{{Error|index-out-of-range|#Errors}} the specified offset + size is out of range.<br/>{{Error|unknown-encoding|#Errors}} the specified encoding is unknown.<br/>{{Error|conversion-error|#Errors}} an error occurred or malformed input was found while decoding the string.<br />
|-<br />
| '''Examples'''<br />
|Tests whether the binary data starts with binary content consistent with a PDF file:<br/><code>bin:decode-string($data, 'UTF-8', 0, 4) eq '%PDF'</code>.<br />
|}<br />
<br />
==bin:encode-string==<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|bin:encode-string|$in as xs:string?, $encoding as xs:string|xs:base64Binary?}}<br />
|-<br />
| '''Summary'''<br />
|Encodes a string into binary data using a given {{Code|$encoding}}.<br/>If the value of {{Code|$in}} is the empty sequence, the function returns an empty sequence.<br />
|-<br />
| '''Errors'''<br />
|{{Error|unknown-encoding|#Errors}} the specified encoding is unknown.<br/>{{Error|conversion-error|#Errors}} an error occurred or malformed input was found while encoding the string.<br />
|}<br />
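<br />
Encoding and decoding are inverse operations when the same encoding is used. The following sketch uses illustrative strings; the hexadecimal result matches the PDF signature used in the [[#bin:part|bin:part]] example:<br />
<pre class="brush:xquery"><br />
(<br />
  (: the four characters of the PDF signature as octets: 25504446 :)<br />
  xs:hexBinary(bin:encode-string('%PDF', 'UTF-8')),<br />
  (: round trip back to the original string: BaseX :)<br />
  bin:decode-string(bin:encode-string('BaseX', 'UTF-8'), 'UTF-8')<br />
)<br />
</pre><br />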
<br />
=Packing and Unpacking of Numeric Values=<br />
<br />
The functions have an optional parameter {{Code|$octet-order}} whose string value controls the octet order: Least-significant-first order is indicated by any of the values {{Code|least-significant-first}}, {{Code|little-endian}}, or {{Code|LE}}. Most-significant-first order is indicated by any of the values {{Code|most-significant-first}}, {{Code|big-endian}}, or {{Code|BE}}.<br />
<br />
==bin:pack-double==<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|bin:pack-double|$in as xs:double|xs:base64Binary}}<br/>{{Func|bin:pack-double|$in as xs:double, $octet-order as xs:string|xs:base64Binary}}<br />
|-<br />
| '''Summary'''<br />
|Returns the 8-octet binary representation of a double value.<br/>Most-significant-octet-first number representation is assumed unless the {{Code|$octet-order}} parameter is specified.<br />
|-<br />
| '''Errors'''<br />
|{{Error|unknown-significance-order|#Errors}} the specified octet order is unknown.<br />
|}<br />
<br />
==bin:pack-float==<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|bin:pack-float|$in as xs:float|xs:base64Binary}}<br/>{{Func|bin:pack-float|$in as xs:float, $octet-order as xs:string|xs:base64Binary}}<br />
|-<br />
| '''Summary'''<br />
|Returns the 4-octet binary representation of a float value.<br/>Most-significant-octet-first number representation is assumed unless the {{Code|$octet-order}} parameter is specified.<br />
|-<br />
| '''Errors'''<br />
|{{Error|unknown-significance-order|#Errors}} the specified octet order is unknown.<br />
|}<br />
<br />
==bin:pack-integer==<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|bin:pack-integer|$in as xs:integer, $size as xs:integer|xs:base64Binary}}<br/>{{Func|bin:pack-integer|$in as xs:integer, $size as xs:integer, $octet-order as xs:string|xs:base64Binary}}<br />
|-<br />
| '''Summary'''<br />
|Returns the two's-complement binary representation of an integer value treated as {{Code|$size}} octets long. Any 'excess' high-order bits are discarded.<br/>Most-significant-octet-first number representation is assumed unless the {{Code|$octet-order}} parameter is specified. Specifying a {{Code|$size}} of zero yields empty binary data.<br />
|-<br />
| '''Errors'''<br />
|{{Error|unknown-significance-order|#Errors}} the specified octet order is unknown.<br/>{{Error|negative-size|#Errors}} the specified size is negative.<br/><br />
|}<br />
<br />
==bin:unpack-double==<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|bin:unpack-double|$in as xs:base64Binary, $offset as xs:integer|xs:double}}<br/>{{Func|bin:unpack-double|$in as xs:base64Binary, $offset as xs:integer, $octet-order as xs:string|xs:double}}<br />
|-<br />
| '''Summary'''<br />
|Extracts the double value stored at the particular offset in binary data.<br/>Most-significant-octet-first number representation is assumed unless the {{Code|$octet-order}} parameter is specified. The {{Code|$offset}} is zero based.<br />
|-<br />
| '''Errors'''<br />
|{{Error|index-out-of-range|#Errors}} the specified offset is out of range.<br/>{{Error|unknown-significance-order|#Errors}} the specified octet order is unknown.<br />
|}<br />
<br />
==bin:unpack-float==<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|bin:unpack-float|$in as xs:base64Binary, $offset as xs:integer|xs:float}}<br/>{{Func|bin:unpack-float|$in as xs:base64Binary, $offset as xs:integer, $octet-order as xs:string|xs:float}}<br />
|-<br />
| '''Summary'''<br />
|Extracts the float value stored at the particular offset in binary data.<br/>Most-significant-octet-first number representation is assumed unless the {{Code|$octet-order}} parameter is specified. The {{Code|$offset}} is zero based.<br />
|-<br />
| '''Errors'''<br />
|{{Error|index-out-of-range|#Errors}} the specified offset + size is out of range.<br/>{{Error|unknown-significance-order|#Errors}} the specified octet order is unknown.<br />
|}<br />
<br />
==bin:unpack-integer==<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|bin:unpack-integer|$in as xs:base64Binary, $offset as xs:integer, $size as xs:integer|xs:integer}}<br/>{{Func|bin:unpack-integer|$in as xs:base64Binary, $offset as xs:integer, $size as xs:integer, $octet-order as xs:string|xs:integer}}<br />
|-<br />
| '''Summary'''<br />
|Returns a signed integer value represented by the {{Code|$size}} octets starting from {{Code|$offset}} in the input binary representation. Necessary sign extension is performed (i.e. the result is negative if the high order bit is '1').<br/>Most-significant-octet-first number representation is assumed unless the {{Code|$octet-order}} parameter is specified. The {{Code|$offset}} is zero based. Specifying a {{Code|$size}} of zero yields the integer {{Code|0}}.<br />
|-<br />
| '''Errors'''<br />
|{{Error|negative-size|#Errors}} the specified size is negative.<br/>{{Error|index-out-of-range|#Errors}} the specified offset + size is out of range.<br/>{{Error|unknown-significance-order|#Errors}} the specified octet order is unknown.<br />
|}<br />
<br />
==bin:unpack-unsigned-integer==<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|bin:unpack-unsigned-integer|$in as xs:base64Binary, $offset as xs:integer, $size as xs:integer|xs:integer}}<br/>{{Func|bin:unpack-unsigned-integer|$in as xs:base64Binary, $offset as xs:integer, $size as xs:integer, $octet-order as xs:string|xs:integer}}<br />
|-<br />
| '''Summary'''<br />
|Returns an unsigned integer value represented by the {{Code|$size}} octets starting from {{Code|$offset}} in the input binary representation.<br/>Most-significant-octet-first number representation is assumed unless the {{Code|$octet-order}} parameter is specified. The {{Code|$offset}} is zero based. Specifying a {{Code|$size}} of zero yields the integer {{Code|0}}.<br />
|-<br />
| '''Errors'''<br />
|{{Error|negative-size|#Errors}} the specified size is negative.<br/>{{Error|index-out-of-range|#Errors}} the specified offset + size is out of range.<br/>{{Error|unknown-significance-order|#Errors}} the specified octet order is unknown.<br />
|}<br />
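<br />
The effect of the {{Code|$octet-order}} parameter and of sign extension can be illustrated with a two-octet integer. The values below are illustrative; the results in the comments follow from the descriptions above:<br />
<pre class="brush:xquery"><br />
(<br />
  (: 258 = 0x0102; most-significant-first is the default: 0102 :)<br />
  xs:hexBinary(bin:pack-integer(258, 2)),<br />
  (: the same value in least-significant-first order: 0201 :)<br />
  xs:hexBinary(bin:pack-integer(258, 2, 'LE')),<br />
  (: the high-order bit is set, so the signed result is negative: -2 :)<br />
  bin:unpack-integer(bin:hex('FFFE'), 0, 2),<br />
  (: the same octets read as an unsigned value: 65534 :)<br />
  bin:unpack-unsigned-integer(bin:hex('FFFE'), 0, 2)<br />
)<br />
</pre><br />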
<br />
=Bitwise Operations=<br />
<br />
==bin:or==<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|bin:or|$a as xs:base64Binary?, $b as xs:base64Binary?|xs:base64Binary?}}<br />
|-<br />
| '''Summary'''<br />
|Returns the "bitwise or" of two binary arguments.<br/>If either argument is the empty sequence, an empty sequence is returned.<br />
|-<br />
| '''Errors'''<br />
|{{Error|differing-length-arguments|#Errors}} the input arguments are of differing length.<br />
|}<br />
<br />
==bin:xor==<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|bin:xor|$a as xs:base64Binary?, $b as xs:base64Binary?|xs:base64Binary?}}<br />
|-<br />
| '''Summary'''<br />
|Returns the "bitwise xor" of two binary arguments.<br/>If either argument is the empty sequence, an empty sequence is returned.<br />
|-<br />
| '''Errors'''<br />
|{{Error|differing-length-arguments|#Errors}} the input arguments are of differing length.<br />
|}<br />
<br />
==bin:and==<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|bin:and|$a as xs:base64Binary?, $b as xs:base64Binary?|xs:base64Binary?}}<br />
|-<br />
| '''Summary'''<br />
|Returns the "bitwise and" of two binary arguments.<br/>If either argument is the empty sequence, an empty sequence is returned.<br />
|-<br />
| '''Errors'''<br />
|{{Error|differing-length-arguments|#Errors}} the input arguments are of differing length.<br />
|}<br />
<br />
==bin:not==<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|bin:not|$in as xs:base64Binary?|xs:base64Binary?}}<br />
|-<br />
| '''Summary'''<br />
|Returns the "bitwise not" of a binary argument.<br/>If the argument is the empty sequence, an empty sequence is returned.<br />
|}<br />
<br />
==bin:shift==<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|bin:shift|$in as xs:base64Binary?, $by as xs:integer|xs:base64Binary?}}<br />
|-<br />
| '''Summary'''<br />
|Shifts bits in binary data.<br/>If {{Code|$by}} is zero, the result is identical to {{Code|$in}}. If {{Code|$by}} is positive then bits are shifted to the left. Otherwise, bits are shifted to the right. If the absolute value of <code>$by</code> is greater than the bit-length of {{Code|$in}} then an all-zeros result is returned. The result always has the same size as {{Code|$in}}. The shifting is logical: zeros are placed into discarded bits. If the value of {{Code|$in}} is the empty sequence, the function returns an empty sequence.<br />
|}<br />
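<br />
A sketch of the bitwise operations on single octets; the operands are illustrative, and the results in the comments are derived from the definitions above, shown as {{Code|xs:hexBinary}}:<br />
<pre class="brush:xquery"><br />
(<br />
  xs:hexBinary(bin:or(bin:hex('F0'), bin:hex('0F'))),   (: FF :)<br />
  xs:hexBinary(bin:and(bin:hex('F0'), bin:hex('3C'))),  (: 30 :)<br />
  xs:hexBinary(bin:xor(bin:hex('F0'), bin:hex('3C'))),  (: CC :)<br />
  xs:hexBinary(bin:not(bin:hex('F0'))),                 (: 0F :)<br />
  xs:hexBinary(bin:shift(bin:hex('0F'), 4)),            (: F0, shifted to the left :)<br />
  xs:hexBinary(bin:shift(bin:hex('F0'), -4))            (: 0F, shifted to the right :)<br />
)<br />
</pre><br />
Both operands of {{Code|bin:or}}, {{Code|bin:xor}} and {{Code|bin:and}} must have the same length; otherwise, {{Code|differing-length-arguments}} is raised.<br />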
<br />
=Errors=<br />
<br />
{| class="wikitable" width="100%"<br />
! width="240"|Code<br />
!Description<br />
|-<br />
|{{Code|differing-length-arguments}} <br />
|The arguments to a bitwise operation have different lengths.<br />
|-<br />
|{{Code|index-out-of-range}}<br />
|An offset value is out of range.<br />
|-<br />
|{{Code|negative-size}}<br />
|A size value is negative.<br />
|-<br />
|{{Code|octet-out-of-range}}<br />
|An octet value lies outside the range 0-255.<br />
|-<br />
|{{Code|non-numeric-character}}<br />
|The input cannot be parsed as a number.<br />
|-<br />
|{{Code|unknown-encoding}}<br />
|An encoding is not supported.<br />
|-<br />
|{{Code|conversion-error}}<br />
|An error occurred or malformed input was found while converting a string.<br />
|-<br />
|{{Code|unknown-significance-order}}<br />
|An octet-order value is unknown.<br />
|}<br />
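<br />
As the errors are assigned to the {{Code|bin}} namespace, they can be caught by name with the XQuery 3.0 {{Code|try}}/{{Code|catch}} expression. This is a minimal sketch; the invalid octet value and the message text are illustrative:<br />
<pre class="brush:xquery"><br />
try {<br />
  (: 256 lies outside the range 0-255 :)<br />
  bin:from-octets(256)<br />
} catch bin:octet-out-of-range {<br />
  'Invalid octet: ' || $err:description<br />
}<br />
</pre><br />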
<br />
=Changelog=<br />
<br />
Introduced with Version 7.8.<br />
<br />
[[Category:XQuery]]</div>James Ballhttps://docs.basex.org/index.php?title=XQuery_Errors&diff=10957XQuery Errors2014-08-22T19:49:38Z<p>James Ball: Alteration of wording to XPST0005 to make it clearer.</p>
<hr />
<div>This article is part of the [[XQuery|XQuery Portal]].<br />
It summarizes all error codes that may be thrown by the BaseX XQuery processor.<br />
<br />
As the original specifications are rather bulky and meticulous, <br />
we tried our best to make this overview comprehensible to a wider<br />
range of readers. The following tables list the error codes that<br />
are known to BaseX, a short description, and examples of queries<br />
raising these errors.<br />
<br />
Original definitions of the error codes are found in the<br />
[http://www.w3.org/TR/xquery-30/ XQuery 3.0],<br />
[http://www.w3.org/TR/xpath-functions-30/ XQuery 3.0 Functions],<br />
[http://www.w3.org/TR/xquery-update-10/ XQuery 1.0 Update],<br />
[http://www.w3.org/TR/xpath-full-text-10/ XQuery 1.0 Full Text],<br />
and [http://www.expath.org/spec/http-client EXPath HTTP]<br />
Specifications.<br />
<br />
==BaseX Errors==<br />
<br />
Error Codes: {{Code|BASX}}<br />
<br />
{| class="wikitable" width="100%"<br />
! width="110"|Code<br />
! width="50%" |Description<br />
!Examples<br />
|- valign="top" scope="row"<br />
|<code>BASX0000</code><br />
| Generic error, which is used for exceptions in [[Java Bindings#Context-Awareness|context-aware Java bindings]].<br />
|<br />
|- valign="top" scope="row"<br />
|<code>BASX0001</code><br />
| The current user has insufficient [[User Management|permissions]] to execute an expression.<br />
|<code>file:delete('file.txt')</code>: ''Create'' rights needed.<br />
|- valign="top" scope="row"<br />
|<code>BASX0002</code><br />
| The specified database option is unknown.<br />
|<code>declare option db:xyz "no"; 1</code><br />
|- valign="top" scope="row"<br />
|<code>BASX0003</code><br />
| Errors related to [[RESTXQ]].<br />
|<code>%restxq:GET('x')</code><br />
|}<br />
<br />
Additional, module-specific error codes are listed in the descriptions of the query modules.<br />
<br />
==Static Errors==<br />
<br />
Error Codes: {{Code|XPST}}, {{Code|XQST}}<br />
<br />
{| class="wikitable" width="100%"<br />
! width="110"|Code<br />
! width="50%" |Description<br />
|Examples<br />
|- valign="top" scope="row"<br />
|<code>XPST0003</code><br />
|An error occurred while ''parsing'' the query string (i.e., before the query could be compiled and executed). This error is the most common one, and may be accompanied by a variety of different error messages. <br />
|<code>1+<hr/>for i in //* return $i</code><br />
|- valign="top" scope="row"<br />
|<code>XPST0005</code><br />
|An expression will never return any results, no matter what input is provided.<br />
|<code>doc('input')/..</code><br />
|- valign="top" scope="row"<br />
|<code>XPST0008</code><br />
|A variable or type name is used that has not been defined in the current scope.<br />
|<code>$a---<hr/>element(*, x)</code><br />
|- valign="top" scope="row"<br />
|<code>XPST0017</code><br />
| • The specified function is unknown, or<br />• it uses the wrong number of arguments.<br />
|<code>unknown()<hr/>count(1,2,3)</code><br />
|- valign="top" scope="row"<br />
|<code>XPST0051</code><br />
| An unknown QName is used in a ''sequence type'' (e.g. in the target type of the {{Code|cast}} expression).<br />
|<code>1 instance of x<hr/>"test"&nbsp;cast&nbsp;as&nbsp;xs:itr</code><br />
|- valign="top" scope="row"<br />
|<code>XPST0080</code><br />
|<code>xs:NOTATION</code> or {{Code|xs:anyAtomicType}} is used as target type of {{Code|cast}} or {{Code|castable}}.<br />
|<code>1 castable as xs:NOTATION</code><br />
|- valign="top" scope="row"<br />
|<code>XPST0081</code><br />
| • A QName uses a prefix that has not been bound to any namespace, or<br />• a pragma or option declaration has not been prefixed.<br />
|<code>unknown:x<hr/>(# pragma #) { 1 }</code><br />
|-<br />
| colspan=3 style="background-color:white;"|<br />
|- valign="top" scope="row"<br />
|<code>XQST0009</code><br />
| The query imports a schema (schema import is not supported by BaseX).<br />
|<code>import schema "x"; ()</code><br />
|- valign="top" scope="row"<br />
|<code>XQST0022</code><br />
| Namespace values must be constant strings.<br />
|<code><elem xmlns="{ 'dynamic' }"/></code><br />
|- valign="top" scope="row"<br />
|<code>XQST0031</code><br />
| The specified XQuery version is not supported.<br />
|<code>xquery version "9.9"; ()</code><br />
|- valign="top" scope="row"<br />
|<code>XQST0032</code><br />
| The base URI was declared more than once.<br />
|<code>declare base-uri ...</code><br />
|- valign="top" scope="row"<br />
|<code>XQST0033</code><br />
| A namespace prefix was declared more than once.<br />
|<code>declare namespace a="a";<br/>declare namespace a="b"; ()</code><br />
|- valign="top" scope="row"<br />
|<code>XQST0034</code><br />
| A function was declared more than once.<br />
|<code>declare function local:a() { 1 };<br/>declare function local:a() { 2 }; local:a()</code><br />
|- valign="top" scope="row"<br />
|<code>XQST0038</code><br />
| The default collation was declared more than once.<br />
|<code>declare default collation ...</code><br />
|- valign="top" scope="row"<br />
|<code>XQST0039</code><br />
| Two or more parameters in a user-defined function have the same name.<br />
|<code>declare function local:fun($a, $a) { $a * $a };<br/>local:fun(1,2)</code><br />
|- valign="top" scope="row"<br />
|<code>XQST0040</code><br />
| Two or more attributes in an element have the same node name.<br />
|<code><elem a="1" a="12"/></code><br />
|- valign="top" scope="row"<br />
|<code>XQST0045</code><br />
| A user-defined function uses a reserved namespace.<br />
|<code>declare function fn:fun() { 1 }; ()</code><br />
|- valign="top" scope="row"<br />
|<code>XQST0047</code><br />
| A module was defined more than once.<br />
|<code>import module ...</code><br />
|- valign="top" scope="row"<br />
|<code>XQST0048</code><br />
| A module declaration does not match the namespace of the specified module.<br />
|<code>import module namespace invalid="uri"; 1</code><br />
|- valign="top" scope="row"<br />
|<code>XQST0049</code><br />
| A global variable was declared more than once.<br />
|<code>declare variable $a := 1;<br/>declare variable $a := 1; $a</code><br />
|- valign="top" scope="row"<br />
|<code>XQST0054</code><br />
| A global variable depends on itself. This may be triggered by a circular variable definition.<br />
|<code>declare variable $a := local:a();<br/>declare function local:a() { $a }; $a</code><br />
|- valign="top" scope="row"<br />
|<code>XQST0055</code><br />
| The mode for copying namespaces was declared more than once.<br />
|<code>declare copy-namespaces ...</code><br />
|- valign="top" scope="row"<br />
|<code>XQST0057</code><br />
| The namespace of a schema import may not be empty.<br />
|<code>import schema ""; ()</code><br />
|- valign="top" scope="row"<br />
|<code>XQST0059</code><br />
| The schema or module with the specified namespace cannot be found or processed.<br />
|<code>import module "unknown"; ()</code><br />
|- valign="top" scope="row"<br />
|<code>XQST0060</code><br />
| A user-defined function has no namespace.<br />
|<code>declare default function namespace "";<br/>declare function x() { 1 }; 1</code><br />
|- valign="top" scope="row"<br />
|<code>XQST0065</code><br />
| The ordering mode was declared more than once.<br />
|<code>declare ordering ...</code><br />
|- valign="top" scope="row"<br />
|<code>XQST0066</code><br />
| The default namespace mode for elements or functions was declared more than once.<br />
|<code>declare default element namespace ...</code><br />
|- valign="top" scope="row"<br />
|<code>XQST0067</code><br />
| The construction mode was declared more than once.<br />
|<code>declare construction ...</code><br />
|- valign="top" scope="row"<br />
|<code>XQST0068</code><br />
| The mode for handling boundary spaces was declared more than once.<br />
|<code>declare boundary-space ...</code><br />
|- valign="top" scope="row"<br />
|<code>XQST0069</code><br />
| The default order for empty sequences was declared more than once.<br />
|<code>declare default order empty ...</code><br />
|- valign="top" scope="row"<br />
|<code>XQST0070</code><br />
| A namespace declaration overwrites a reserved namespace.<br />
|<code>declare namespace xml=""; ()</code><br />
|- valign="top" scope="row"<br />
|<code>XQST0071</code><br />
| A namespace is declared more than once in an element constructor.<br />
|<code><a xmlns="uri1" xmlns="uri2"/></code><br />
|- valign="top" scope="row"<br />
|<code>XQST0075</code><br />
| The query contains a validate expression (validation is not supported by BaseX).<br />
|<code>validate strict { () }</code><br />
|- valign="top" scope="row"<br />
|<code>XQST0076</code><br />
| A {{Code|group by}} or {{Code|order by}} clause specifies an unknown collation.<br />
|<code>for $i in 1 to 10<br/>order by $i collation "unknown"<br/>return $i</code><br />
|- valign="top" scope="row"<br />
|<code>XQST0079</code><br />
| A pragma was specified without the expression that is to be evaluated.<br />
|<code>(# xml:a #) {}</code><br />
|- valign="top" scope="row"<br />
|<code>XQST0085</code><br />
| An empty namespace URI was specified.<br />
|<code><pref:elem xmlns:pref=""/></code><br />
|- valign="top" scope="row"<br />
|<code>XQST0087</code><br />
| An unknown encoding was specified. Note that the encoding declaration is currently ignored in BaseX.<br />
|<code>xquery version "1.0" encoding "a b"; ()</code><br />
|- valign="top" scope="row"<br />
|<code>XQST0088</code><br />
| An empty module namespace was specified.<br />
|<code>import module ""; ()</code><br />
|- valign="top" scope="row"<br />
|<code>XQST0089</code><br />
| Two variables in a {{Code|for}} or {{Code|let}} clause have the same name.<br />
|<code>for $a at $a in 1 return $a</code><br />
|- valign="top" scope="row"<br />
|<code>XQST0090</code><br />
| A character reference specifies an invalid character.<br />
|<code>"&#0;"</code><br />
|- valign="top" scope="row"<br />
|<code>XQST0093</code><br />
| A module depends on itself. This may be triggered by a circular module definition.<br />
|<code>import module ...</code><br />
|- valign="top" scope="row"<br />
|<code>XQST0094</code><br />
|<code>group by</code> references a variable that has not been declared before.<br />
|<code>for $a in 1 group by $b return $a</code><br />
|- valign="top" scope="row"<br />
|<code>XQST0097</code><br />
| A {{Code|decimal-format}} property is invalid.<br />
|<code>declare default decimal-format digit = "xxx"; 1</code><br />
|- valign="top" scope="row"<br />
|<code>XQST0098</code><br />
| A single {{Code|decimal-format}} character was assigned to multiple properties.<br />
|<code>declare default decimal-format digit = "%"; 1</code><br />
|- valign="top" scope="row"<br />
|<code>XQST0099</code><br />
| The context item was declared more than once.<br />
|<code>declare context item ...</code><br />
|- valign="top" scope="row"<br />
|<code>XQST0106</code><br />
| An annotation has been declared twice in a variable or function declaration.<br />
|<code>declare %updating %updating function ...</code><br />
|- valign="top" scope="row"<br />
|<code>XQST0108</code><br />
| Output declarations may only be specified in the main module.<br />
|Module: <code>declare output ...</code><br />
|- valign="top" scope="row"<br />
|<code>XQST0109</code><br />
| The specified serialization parameter is unknown.<br />
|<code>declare option output:unknown "..."; 1</code><br />
|- valign="top" scope="row"<br />
|<code>XQST0110</code><br />
| A serialization parameter was specified more than once in the output declarations.<br />
|<code>declare option output:indent "no";<br/>declare option output:indent "no"; 1</code><br />
|- valign="top" scope="row"<br />
|<code>XQST0111</code><br />
| A decimal format was declared more than once.<br />
|<code>declare decimal-format ...</code><br />
|- valign="top" scope="row"<br />
|<code>XQST0113</code><br />
| A context item value may only be specified in the main module.<br />
|Module: <code>declare context item := 1;</code><br />
|- valign="top" scope="row"<br />
|<code>XQST0114</code><br />
| A decimal-format property has been specified more than once.<br />
|<code>declare decimal-format EN NaN="!" NaN="?"; ()</code><br />
|}<br />
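<br />
Parse errors such as {{Code|XPST0003}} are raised before a query is executed, so a {{Code|try}}/{{Code|catch}} expression in the same query cannot intercept them. One possible workaround is sketched below. It assumes that the BaseX-specific {{Code|xquery:eval}} function of the [[XQuery Module]] is available and that it passes errors raised by the evaluated string on to the calling query; the query string itself is a made-up example:<br />
<br />
<pre><br />
(: Sketch: evaluate a deliberately incomplete query string at runtime. :)<br />
try {<br />
  xquery:eval("1 +")<br />
} catch * {<br />
  (: $err:code is expected to name the static error, e.g. XPST0003 :)<br />
  concat("Static error ", $err:code, ": ", $err:description)<br />
}<br />
</pre><br />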
<br />
==Type Errors==<br />
<br />
Error Codes: {{Code|XPTY}}, {{Code|XQTY}}<br />
<br />
{| class="wikitable" width="100%"<br />
! width="110"|Code<br />
! width="50%" |Description<br />
|Examples<br />
|- valign="top" scope="row"<br />
|<code>XPTY0004</code><br />
| This error is raised if an expression has the wrong type, or cannot be cast into the specified type. It may be raised either statically (during query compilation) or dynamically (at runtime).<br />
|<code>1 + "A"<hr/>abs("a")<hr/>1 cast as xs:gYear</code><br />
|- valign="top" scope="row"<br />
|<code>XPTY0018</code><br />
| The result of the last step in a path expression contains both nodes and atomic values.<br />
|<code>doc('input.xml')/(*, 1)</code><br />
|- valign="top" scope="row"<br />
|<code>XPTY0019</code><br />
| The result of a step (other than the last step) in a path expression contains atomic values.<br />
|<code>(1 to 10)/*</code><br />
|-<br />
| colspan=3 style="background-color:white;"|<br />
|- valign="top" scope="row"<br />
|<code>XQTY0024</code><br />
| An attribute node cannot be bound to its parent element, as other nodes of a different type were specified before.<br />
|<code><elem>text { attribute a { "val" } }</elem></code><br />
|- valign="top" scope="row"<br />
|<code>XQTY0105</code><br />
| A function item has been specified as content of an element.<br />
|<code><X>{ false#0 }</X></code><br />
|}<br />
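<br />
Type errors can be handled with the XQuery 3.0 {{Code|try}}/{{Code|catch}} expression. The following minimal sketch reuses the first {{Code|XPTY0004}} example from the table; it relies on the {{Code|err}} prefix being predeclared, as in XQuery 3.0:<br />
<br />
<pre><br />
(: Catch only XPTY0004; any other error is passed on unchanged. :)<br />
try {<br />
  1 + "A"<br />
} catch err:XPTY0004 {<br />
  concat("Type error: ", $err:description)<br />
}<br />
</pre><br />
<br />
Naming the error code in the {{Code|catch}} clause, rather than using the wildcard {{Code|*}}, keeps unrelated errors visible.<br />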
<br />
==Dynamic Errors==<br />
<br />
Error Codes: {{Code|XPDY}}, {{Code|XQDY}}<br />
<br />
{| class="wikitable" width="100%"<br />
! width="110"|Code<br />
! width="50%" |Description<br />
|Examples<br />
|- valign="top" scope="row"<br />
|<code>XPDY0002</code><br />
| • No value has been defined for an external variable, or<br />• no context item has been set before the query was executed.<br />
|<code>declare variable $x external; $x<hr/>descendant::*</code><br />
|- valign="top" scope="row"<br />
|<code>XPDY0050</code><br />
| • The operand type of a {{Code|treat}} expression does not match the type of the argument, or<br/>• the root of the context item must be a document node.<br />
|<code>"string" treat as xs:int<hr/>"string"[/]</code><br />
|-<br />
| colspan=3 style="background-color:white;"|<br />
|- valign="top" scope="row"<br />
|<code>XQDY0025</code><br />
| Two or more attributes in a constructed element have the same node name.<br />
|<code>element x { attribute a { "" } attribute a { "" } }</code><br />
|- valign="top" scope="row"<br />
|<code>XQDY0026</code><br />
| The content of a computed processing instruction contains "?>".<br />
|<code>processing-instruction pi { "?>" }</code><br />
|- valign="top" scope="row"<br />
|<code>XQDY0041</code><br />
| The name of a processing instruction is invalid.<br />
|<code>processing-instruction { "1" } { "" }</code><br />
|- valign="top" scope="row"<br />
|<code>XQDY0044</code><br />
| The node name of an attribute uses reserved prefixes or namespaces.<br />
|<code>attribute xmlns { "etc" }</code><br />
|- valign="top" scope="row"<br />
|<code>XQDY0064</code><br />
| The name of a processing instruction equals "XML" (case insensitive).<br />
|<code>processing-instruction xml { "etc" }</code><br />
|- valign="top" scope="row"<br />
|<code>XQDY0072</code><br />
| The content of a computed comment contains "--" or ends with "-".<br />
|<code>comment { "one -- two" }</code><br />
|- valign="top" scope="row"<br />
|<code>XQDY0074</code><br />
| The name of a computed attribute or element is invalid, or uses an unbound prefix.<br />
|<code>element { "x y" } { "" }</code><br />
|- valign="top" scope="row"<br />
|<code>XQDY0095</code><br />
| A sequence with more than one item was bound to a {{Code|group by}} clause.<br />
|<code>let $a := (1,2) group by $a return $a</code><br />
|- valign="top" scope="row"<br />
|<code>XQDY0096</code><br />
| The node name of an element uses reserved prefixes or namespaces.<br />
|<code>element { QName("uri", "xml:n") } {}</code><br />
|- valign="top" scope="row"<br />
|<code>XQDY0101</code><br />
| Invalid namespace declaration.<br />
|<code>namespace xmlns { 'x' }</code><br />
|- valign="top" scope="row"<br />
|<code>XQDY0102</code><br />
| Duplicate namespace declaration.<br />
|<code>element x { namespace a {'b'}, namespace a {'c'} }</code><br />
|}<br />
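<br />
Dynamic errors often depend on input data, for example when a node name is computed from a string. The sketch below wraps a computed element constructor in {{Code|try}}/{{Code|catch}}; the function name {{Code|local:make}} and the fallback element {{Code|invalid-name}} are purely illustrative:<br />
<br />
<pre><br />
(: Build an element from a string, falling back if the name is invalid. :)<br />
declare function local:make($name as xs:string) as element() {<br />
  try {<br />
    element { $name } { "content" }<br />
  } catch err:XQDY0074 {<br />
    (: hypothetical fallback: report the rejected name as text :)<br />
    element invalid-name { $name }<br />
  }<br />
};<br />
<br />
local:make("ok"), local:make("x y")<br />
</pre><br />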
<br />
==Functions Errors==<br />
<br />
Error Codes: {{Code|FOAR}}, {{Code|FOCA}}, {{Code|FOCH}}, {{Code|FODC}}, {{Code|FODF}}, {{Code|FODT}}, {{Code|FOER}}, {{Code|FOFD}}, {{Code|FONS}}, {{Code|FORG}}, {{Code|FORX}}, {{Code|FOTY}}, {{Code|FOUT}}<br />
<br />
{| class="wikitable" width="100%"<br />
! width="110"|Code<br />
! width="50%" |Description<br />
|Examples<br />
|- valign="top" scope="row"<br />
|<code>FOAR0001</code><br />
| A value was divided by zero.<br />
|<code>1 div 0</code><br />
|- valign="top" scope="row"<br />
|<code>FOAR0002</code><br />
| A numeric declaration or operation causes an over- or underflow.<br />
|<code>12345678901234567890<hr/>xs:double("-INF") idiv 1</code><br />
|-<br />
| colspan=3 style="background-color:white;"|<br />
|- valign="top" scope="row"<br />
|<code>FOCA0002</code><br />
| • A float number cannot be converted to a decimal or integer value, or<br />• a function argument cannot be converted to a valid QName.<br />
|<code>xs:int(xs:double("INF"))<hr/>QName("", "el em")</code><br />
|- valign="top" scope="row"<br />
|<code>FOCA0003</code><br />
| A value is too large to be represented as integer.<br />
|<code>xs:integer(99e100)</code><br />
|- valign="top" scope="row"<br />
|<code>FOCA0005</code><br />
|<code>"NaN"</code> is supplied to duration operations.<br />
|<code>xs:yearMonthDuration("P1Y") * xs:double("NaN")</code><br />
|-<br />
| colspan=3 style="background-color:white;"|<br />
|- valign="top" scope="row"<br />
|<code>FOCH0001</code><br />
| A codepoint was specified that does not represent a valid XML character.<br />
|<code>codepoints-to-string(0)</code><br />
|- valign="top" scope="row"<br />
|<code>FOCH0002</code><br />
| An unsupported collation was specified in a function.<br />
|<code>compare('a', 'a', 'unknown')</code><br />
|- valign="top" scope="row"<br />
|<code>FOCH0003</code><br />
| An unsupported normalization form was specified in a function.<br />
|<code>normalize-unicode('a', 'unknown')</code><br />
|-<br />
| colspan=3 style="background-color:white;"|<br />
|- valign="top" scope="row"<br />
|<code>FODC0001</code><br />
| The argument specified in {{Code|fn:id()}} or {{Code|fn:idref()}} must have a document node as root.<br />
|<code>id("id0", <xml/>)</code><br />
|- valign="top" scope="row"<br />
|<code>FODC0002</code><br />
| The specified document resource cannot be retrieved.<br />
|<code>doc("unknown.xml")</code><br />
|- valign="top" scope="row"<br />
|<code>FODC0004</code><br />
| The specified collection cannot be retrieved.<br />
|<code>collection("unknown")</code><br />
|- valign="top" scope="row"<br />
|<code>FODC0005</code><br />
| The specified URI to a document resource is invalid.<br />
|<code>doc("<xml/>")</code><br />
|- valign="top" scope="row"<br />
|<code>FODC0006</code><br />
| The string passed to {{Code|fn:parse-xml()}} is not well-formed.<br />
|<code>parse-xml("<x/")</code><br />
|- valign="top" scope="row"<br />
|<code>FODC0007</code><br />
| The base URI passed to {{Code|fn:parse-xml()}} is invalid.<br />
|<code>parse-xml("<x/>", ":")</code><br />
|-<br />
| colspan=3 style="background-color:white;"|<br />
|- valign="top" scope="row"<br />
|<code>FODF1280</code><br />
| The name of the decimal format passed to {{Code|fn:format-number()}} is invalid.<br />
|<code>format-number(1, "0", "invalid")</code><br />
|- valign="top" scope="row"<br />
|<code>FODF1310</code><br />
| The picture string passed to {{Code|fn:format-number()}} is invalid.<br />
|<code>format-number(1, "invalid")</code><br />
|-<br />
| colspan=3 style="background-color:white;"|<br />
|- valign="top" scope="row"<br />
|<code>FODT0001</code><br />
| An arithmetic duration operation causes an over- or underflow.<br />
|<code>xs:date('2000-01-01') + xs:duration('P99999Y')</code><br />
|- valign="top" scope="row"<br />
|<code>FODT0002</code><br />
| A duration declaration or operation causes an over- or underflow.<br />
|<code>implicit-timezone() div 0</code><br />
|- valign="top" scope="row"<br />
|<code>FODT0003</code><br />
| An invalid timezone was specified.<br />
|<code>adjust-time-to-timezone(xs:time("01:01:01"), xs:dayTimeDuration("PT20H"))</code><br />
|-<br />
| colspan=3 style="background-color:white;"|<br />
|- valign="top" scope="row"<br />
|<code>FOER0000</code><br />
| Error triggered by the {{Code|fn:error()}} function.<br />
|<code>error()</code><br />
|-<br />
| colspan=3 style="background-color:white;"|<br />
|- valign="top" scope="row"<br />
|<code>FOFD1340</code><br />
| The picture string passed to {{Code|fn:format-date()}}, {{Code|fn:format-time()}} or {{Code|fn:format-dateTime()}} is invalid.<br />
|<code>format-date(current-date(), "[]")</code><br />
|- valign="top" scope="row"<br />
|<code>FOFD1350</code><br />
| The picture string passed to {{Code|fn:format-date()}}, {{Code|fn:format-time()}} or {{Code|fn:format-dateTime()}} specifies a component that is not available.<br />
|<code>format-time(current-time(), "[Y2]")</code><br />
|-<br />
| colspan=3 style="background-color:white;"|<br />
|- valign="top" scope="row"<br />
|<code>FONS0004</code><br />
| A function has a QName as argument that specifies an unbound prefix.<br />
|<code>resolve-QName("x:e", <e/>)</code><br />
|-<br />
| colspan=3 style="background-color:white;"|<br />
|- valign="top" scope="row"<br />
|<code>FORG0001</code><br />
| A value cannot be cast to the required target type.<br />
|<code>xs:integer("A")<hr/>1 + <x>a</x></code><br />
|- valign="top" scope="row"<br />
|<code>FORG0002</code><br />
| The URI passed to {{Code|fn:resolve-URI()}} is invalid.<br />
|<code>resolve-URI(":")</code><br />
|- valign="top" scope="row"<br />
|<code>FORG0003</code><br />
|<code>fn:zero-or-one()</code> was called with more than one item.<br />
|<code>zero-or-one((1, 2))</code><br />
|- valign="top" scope="row"<br />
|<code>FORG0004</code><br />
|<code>fn:one-or-more()</code> was called with zero items.<br />
|<code>one-or-more(())</code><br />
|- valign="top" scope="row"<br />
|<code>FORG0005</code><br />
|<code>fn:exactly-one()</code> was called with zero or more than one item.<br />
|<code>exactly-one((1, 2))</code><br />
|- valign="top" scope="row"<br />
|<code>FORG0006</code><br />
| A wrong argument type was specified in a function call.<br />
|<code>sum((1, "string"))</code><br />
|- valign="top" scope="row"<br />
|<code>FORG0008</code><br />
| The arguments passed to {{Code|fn:dateTime()}} have different timezones.<br />
|<code>dateTime(xs:date("2001-01-01+01:01"), current-time())</code><br />
|-<br />
| colspan=3 style="background-color:white;"|<br />
|- valign="top" scope="row"<br />
|<code>FORX0001</code><br />
| A function specifies an invalid regular expression flag.<br />
|<code>matches('input', 'query', 'invalid')</code><br />
|- valign="top" scope="row"<br />
|<code>FORX0002</code><br />
| A function specifies an invalid regular expression.<br />
|<code>matches('input', '[')</code><br />
|- valign="top" scope="row"<br />
|<code>FORX0003</code><br />
| A regular expression matches an empty string.<br />
|<code>tokenize('input', '.?')</code><br />
|- valign="top" scope="row"<br />
|<code>FORX0004</code><br />
| The replacement string of a regular expression is invalid.<br />
|<code>replace("input", "match", "\")</code><br />
|-<br />
| colspan=3 style="background-color:white;"|<br />
|- valign="top" scope="row"<br />
|<code>FOTY0012</code><br />
| An item has no typed value.<br />
|<code>count#1</code><br />
|- valign="top" scope="row"<br />
|<code>FOTY0013</code><br />
| Function items cannot be atomized, have no defined equality, and have no string representation.<br />
|<code>data(false#0)</code><br />
|- valign="top" scope="row"<br />
|<code>FOTY0014</code><br />
| Function items have no string representation.<br />
|<code>string(map {})</code><br />
|- valign="top" scope="row"<br />
|<code>FOTY0015</code><br />
| Function items cannot be compared.<br />
|<code>deep-equal(false#0, true#0)</code><br />
|-<br />
| colspan=3 style="background-color:white;"|<br />
|- valign="top" scope="row"<br />
|<code>FOUT1170</code><br />
| Function argument cannot be used to retrieve a text resource.<br />
|<code>unparsed-text(':')</code><br />
|- valign="top" scope="row"<br />
|<code>FOUT1190</code><br />
| Encoding to retrieve a text resource is invalid or not supported.<br />
|<code>unparsed-text('file.txt', 'InvalidEncoding')</code><br />
|}<br />
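<br />
Besides the predefined codes, {{Code|fn:error()}} can raise application-specific errors, which are caught in the same way. In the sketch below, the namespace URI, the error name {{Code|my:ZERO}} and the function {{Code|local:ratio}} are assumptions made for illustration only:<br />
<br />
<pre><br />
declare namespace my = "http://example.org/my-errors";<br />
<br />
(: Raise a custom error instead of relying on FOAR0001. :)<br />
declare function local:ratio($a as xs:integer, $b as xs:integer) as xs:decimal {<br />
  if ($b = 0)<br />
  then error(xs:QName("my:ZERO"), "Divisor must not be zero", $b)<br />
  else $a div $b<br />
};<br />
<br />
try {<br />
  local:ratio(1, 0)<br />
} catch my:ZERO {<br />
  (: $err:code, $err:description and $err:value are bound in the catch clause :)<br />
  concat(local-name-from-QName($err:code), ": ", $err:description)<br />
}<br />
</pre><br />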
<br />
==Serialization Errors==<br />
<br />
Error Codes: {{Code|SEPM}}, {{Code|SERE}}, {{Code|SESU}}<br />
<br />
{| class="wikitable" width="100%"<br />
! width="110"|Code<br />
! width="50%" |Description<br />
|Examples<br />
|- valign="top" scope="row"<br />
|<code>SESU0007</code><br />
| The specified encoding is not supported.<br />
|<code>declare option output:encoding "xyz"; 1</code><br />
|- valign="top" scope="row"<br />
|<code>SEPM0009</code><br />
|<code>omit-xml-declaration</code> is set to {{Code|yes}}, and {{Code|standalone}} has a value other than {{Code|omit}}.<br />
|<br />
|- valign="top" scope="row"<br />
|<code>SEPM0010</code><br />
|<code>method</code> is set to {{Code|xml}}, {{Code|undeclare-prefixes}} is set to {{Code|yes}}, and {{Code|version}} is set to {{Code|1.0}}.<br />
|<br />
|- valign="top" scope="row"<br />
|<code>SERE0014</code><br />
|<code>method</code> is set to {{Code|html}}, and an invalid HTML character is found.<br />
|<br />
|- valign="top" scope="row"<br />
|<code>SERE0015</code><br />
|<code>method</code> is set to {{Code|html}}, and a closing bracket (&gt;) appears inside a processing instruction.<br />
|<br />
|- valign="top" scope="row"<br />
|<code>SEPM0016</code><br />
| A specified parameter is unknown or has an invalid value.<br />
|<code>declare option output:indent "nope"; 1</code><br />
|}<br />
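<br />
Serialization parameters are usually set with output declarations in the query prolog. The following sketch shows a combination that avoids {{Code|SEPM0009}}. The explicit namespace declaration is added for portability and may be redundant in BaseX, where the {{Code|output}} prefix appears to be predeclared:<br />
<br />
<pre><br />
declare namespace output = "http://www.w3.org/2010/xslt-xquery-serialization";<br />
<br />
(: Setting omit-xml-declaration to "yes" together with standalone "yes"<br />
   is the conflicting combination described for SEPM0009. :)<br />
declare option output:omit-xml-declaration "no";<br />
declare option output:standalone "yes";<br />
<br />
<root/><br />
</pre><br />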
<br />
==Update Errors==<br />
<br />
Error Codes: {{Code|FOUP}}, {{Code|XUDY}}, {{Code|XUST}}, {{Code|XUTY}}<br />
<br />
{| class="wikitable" width="100%"<br />
! width="110"|Code<br />
! width="50%" |Description<br />
|Examples<br />
|- valign="top" scope="row"<br />
|<code>FOUP0001</code><br />
| The first argument of {{Code|fn:put()}} must be a document node or element.<br />
|<code>fn:put(text { 1 }, 'file.txt')</code><br />
|- valign="top" scope="row"<br />
|<code>FOUP0002</code><br />
| The second argument of {{Code|fn:put()}} is not a valid URI.<br />
|<code>fn:put(<a/>, '//')</code><br />
|-<br />
| colspan=3 style="background-color:white;"|<br />
|- valign="top" scope="row"<br />
|<code>XUDY0009</code><br />
| The target node of a replace expression needs a parent in order to be replaced.<br />
|<code>replace node <target/> with <new/></code><br />
|- valign="top" scope="row"<br />
|<code>XUDY0014</code><br />
| The expression updated by the {{Code|modify}} clause was not created by the {{Code|copy}} clause.<br />
|<code>let $a := doc('a') return copy $b := $a modify delete node $a/* return $b</code><br />
|- valign="top" scope="row"<br />
|<code>XUDY0015</code><br />
| In a {{Code|rename}} expression, a target is renamed more than once.<br />
|<code>let $a := <xml/> return (rename node $a as 'a', rename node $a as 'b')</code><br />
|- valign="top" scope="row"<br />
|<code>XUDY0016</code><br />
| In a {{Code|replace}} expression, a target is replaced more than once.<br />
|<code>let $a := <x>x</x>/node() return (replace node $a with <a/>, replace node $a with <b/>)</code><br />
|- valign="top" scope="row"<br />
|<code>XUDY0017</code><br />
| In a {{Code|replace value of}} expression, a target is replaced more than once.<br />
|<code>let $a := <x/> return (replace value of node $a with 'a', replace value of node $a with 'a')</code><br />
|- valign="top" scope="row"<br />
|<code>XUDY0021</code><br />
| The resulting update expression contains duplicate attributes.<br />
|<code>copy $c := <x a='a'/> modify insert node attribute a {""} into $c return $c</code><br />
|- valign="top" scope="row"<br />
|<code>XUDY0023</code><br />
| The resulting update expression conflicts with existing namespaces.<br />
|<code>rename node <a:ns xmlns:a='uri'/> as QName('URI', 'a:ns')</code><br />
|- valign="top" scope="row"<br />
|<code>XUDY0024</code><br />
| New namespaces conflict with each other.<br />
|<code>copy $n := <x/> modify (insert node attribute { QName('uri1', 'a') } { "" } into $n, insert node attribute { QName('uri2', 'a') } { "" } into $n) return $n</code><br />
|- valign="top" scope="row"<br />
|<code>XUDY0027</code><br />
| Target of an update expression is an empty sequence.<br />
|<code>insert node <x/> into ()</code><br />
|- valign="top" scope="row"<br />
|<code>XUDY0029</code><br />
| The target of an update expression has no parent node.<br />
|<code>insert node <new/> before <target/></code><br />
|- valign="top" scope="row"<br />
|<code>XUDY0030</code><br />
| Attributes cannot be inserted before or after the child of a document node.<br />
|<code>insert node <e a='a'/>/@a after document { <e/> }/*</code><br />
|- valign="top" scope="row"<br />
|<code>XUDY0031</code><br />
| Multiple calls to {{Code|fn:put()}} address the same URI.<br />
|<code>for $i in 1 to 3 return put(<a/>, 'file.txt')</code><br />
|-<br />
| colspan=3 style="background-color:white;"|<br />
|- valign="top" scope="row"<br />
|<code>XUST0001</code><br />
| No updating expression is allowed here.<br />
|<code>delete node /, "finished."</code><br />
|- valign="top" scope="row"<br />
|<code>XUST0002</code><br />
| An updating expression is expected in the {{Code|modify}} clause or an updating function.<br />
|<code>copy $a := <x/> modify 1 return $a</code><br />
|- valign="top" scope="row"<br />
|<code>XUST0003</code><br />
| The revalidation mode was declared more than once.<br />
|<code>declare revalidation ...</code><br />
|- valign="top" scope="row"<br />
|<code>XUST0026</code><br />
| The query contains a revalidate expression (revalidation is not supported by BaseX).<br />
|<code>declare revalidation ...</code><br />
|- valign="top" scope="row"<br />
|<code>XUST0028</code><br />
| No return type may be specified in an updating function.<br />
|<code>declare updating function local:x() as item() { () }; ()</code><br />
|-<br />
| colspan=3 style="background-color:white;"|<br />
|- valign="top" scope="row"<br />
|<code>XUTY0004</code><br />
| Attributes to be inserted must precede all other nodes in the insertion sequence.<br />
|<code>insert node (<a/>, attribute a {""}) into <a/></code><br />
|- valign="top" scope="row"<br />
|<code>XUTY0005</code><br />
| A single element or document node is expected as target of an {{Code|insert}} expression.<br />
|<code>insert node <new/> into attribute a { "" }</code><br />
|- valign="top" scope="row"<br />
|<code>XUTY0006</code><br />
| A single element, text, comment or processing instruction is expected as target of an {{Code|insert before/after}} expression.<br />
|<code>insert node <new/> after attribute a { "" }</code><br />
|- valign="top" scope="row"<br />
|<code>XUTY0007</code><br />
| Only nodes can be deleted.<br />
|<code>delete node "string"</code><br />
|- valign="top" scope="row"<br />
|<code>XUTY0008</code><br />
| A single element, text, attribute, comment or processing instruction is expected as target of a {{Code|replace}} expression.<br />
|<code>replace node document { <a/> } with <b/></code><br />
|- valign="top" scope="row"<br />
|<code>XUTY0010</code><br />
| If the target of a {{Code|replace}} expression is not an attribute, the replacing nodes must not be attributes either.<br />
|<code>replace node <a><b/></a>/b with attribute size { 1 }</code><br />
|- valign="top" scope="row"<br />
|<code>XUTY0011</code><br />
| If the target of a {{Code|replace}} expression is an attribute, the replacing nodes must be attributes as well.<br />
|<code>replace node <e a=""/>/@a with <a/></code><br />
|- valign="top" scope="row"<br />
|<code>XUTY0012</code><br />
| In a {{Code|rename}} expression, the target node must be an element, attribute or processing instruction.<br />
|<code>rename node text { 1 } as <x/></code><br />
|- valign="top" scope="row"<br />
|<code>XUTY0013</code><br />
| An expression in the {{Code|copy}} clause must return a single node.<br />
|<code>copy $c := (<a/>, <b/>) modify () return $c</code><br />
|- valign="top" scope="row"<br />
|<code>XUTY0022</code><br />
| An attribute must not be inserted into a document node.<br />
|<code>insert node <e a=""/>/@a into document {'a'}</code><br />
|}<br />
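<br />
Several update errors, such as {{Code|XUDY0027}}, occur when a target expression unexpectedly yields an empty sequence. One way to avoid this, sketched here with made-up element names, is to iterate over the targets so that the updating expression is simply not evaluated when nothing matches:<br />
<br />
<pre><br />
(: The path selects no item, so no insert is attempted and no error is raised. :)<br />
copy $doc := <items><item id="1"/></items><br />
modify (<br />
  for $target in $doc/item[@id = "2"]<br />
  return insert node <note/> into $target<br />
)<br />
return $doc<br />
</pre><br />
<br />
Iterating also ensures that each {{Code|insert}} receives exactly one target node, which is what {{Code|XUTY0005}} requires.<br />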
<br />
==Full-Text Errors==<br />
<br />
Error Codes: {{Code|FTDY}}, {{Code|FTST}}<br />
<br />
{| class="wikitable" width="100%"<br />
! width="110"|Code<br />
! width="50%" |Description<br />
|Examples<br />
|- valign="top" scope="row"<br />
|<code>FTDY0016</code><br />
| The specified weight value is out of range.<br />
|<code>'a' contains text 'a' weight { 1001 }</code><br />
|- valign="top" scope="row"<br />
|<code>FTDY0017</code><br />
| The {{Code|not in}} operator contains a ''string exclude''.<br />
|<code>'a' contains text 'a' not in (ftnot 'a')</code><br />
|- valign="top" scope="row"<br />
|<code>FTDY0020</code><br />
| The search term uses an invalid wildcard syntax.<br />
|<code>'a' contains text '.{}' using wildcards</code><br />
|-<br />
| colspan=3 style="background-color:white;"|<br />
|- valign="top" scope="row"<br />
|<code>FTST0007</code><br />
| The full-text expression contains an ignore option (the {{Code|ignore option}} is not supported by BaseX).<br />
|<code>'a' contains text 'a' without content 'x'</code><br />
|- valign="top" scope="row"<br />
|<code>FTST0008</code><br />
| The specified stop word file could not be opened or processed.<br />
|<code>'a' contains text 'a' using stop words at 'unknown.txt'</code><br />
|- valign="top" scope="row"<br />
|<code>FTST0009</code><br />
| The specified language is not supported.<br />
|<code>'a' contains text 'a' using language 'aaa'</code><br />
|- valign="top" scope="row"<br />
|<code>FTST0018</code><br />
| The specified thesaurus file could not be opened or processed.<br />
|<code>'a' contains text 'a' using thesaurus at 'aaa'</code><br />
|- valign="top" scope="row"<br />
|<code>FTST0019</code><br />
| A match option was specified more than once.<br />
|<code>'a' contains text 'a' using stemming using stemming</code><br />
|}<br />
<br />
[[Category:XQuery]]</div>James Ballhttps://docs.basex.org/index.php?title=Index_File_Structure&diff=10956Index File Structure2014-08-22T16:29:30Z<p>James Ball: Added category to page - internals</p>
<hr />
<div>This page is intended for those who are interested in the specific format of the index files used by BaseX. It was written mainly to aid research into why these files keep growing when the <code>[[Options#UPDINDEX|UPDINDEX]]</code> option is used.<br />
<br />
== Attribute Index Files ==<br />
<br />
=== atvr.basex ===<br />
<br />
This file is made up of a series of 5-byte records. There is one record for each attribute value in the index, and the records are sorted in ascending order of value. Each record is a big-endian integer giving the position at which the ID list for that value starts in the atvl.basex file.<br />
<br />
In the example below we have the file for a database with attribute values entered in the order 100, 200, 1 and d. The attribute values themselves are not held in this file, but because the records are ordered by value we can infer that the ID list for 1 is at position 8, the ID list for 100 is at position 4, the ID list for 200 is at position 6 and the ID list for d is at position 10.<br />
<br />
The bytes of the file (in hex) are:<br />
<br />
<code>00 00 00 00 08</code><code>00 00 00 00 04</code><code>00 00 00 00 06</code><code>00 00 00 00 0A</code><br />
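<br />
As an illustration only, the records above could be read with a few lines of Python. This is a minimal sketch based on the description on this page, not BaseX code; the names <code>read_references</code> and <code>RECORD_SIZE</code> are made up for this example.<br />

```python
# Bytes of the example atvr.basex file shown above.
atvr = bytes.fromhex("0000000008" "0000000004" "0000000006" "000000000A")

RECORD_SIZE = 5  # assumption taken from the text: 5-byte big-endian records


def read_references(data: bytes) -> list[int]:
    """Return the atvl.basex position stored in each 5-byte record."""
    if len(data) % RECORD_SIZE != 0:
        raise ValueError("file length is not a multiple of 5 bytes")
    return [
        int.from_bytes(data[i:i + RECORD_SIZE], "big")
        for i in range(0, len(data), RECORD_SIZE)
    ]


# One position per attribute value, in ascending order of value (1, 100, 200, d).
print(read_references(atvr))  # [8, 4, 6, 10]
```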
<br />
=== atvl.basex ===<br />
<br />
This file provides a number of details.<br />
*The total number of attribute values<br />
*The number of times each attribute value appears<br />
*The offsets for each occurrence of each attribute value<br />
<br />
In the example below we have the file for the database used in the atvr.basex example above. The first four bytes provide a big-endian integer value of the total number of different attribute values in the index - in this case 4.<br />
<br />
The remainder of the file is made up of ID lists. Each list starts at one of the positions recorded in atvr.basex - in our example there is a list starting at byte position 8 (counting from 0). The first item in the list is a count of the number of attributes with this value - here it is 1. The list then holds the locations of the attributes - in our case there is only one attribute and it is at position 8. This means that it is offset 8 positions from the beginning of the database (use the [[Commands#INFO_STORAGE|INFO STORAGE]] command to view the order).<br />
<br />
<code>00 00 00 04</code><code>[01] 02</code><code>[01] 05</code>'''<code>[01] 08</code>'''<code>[01] 0B</code><br />
<br />
The first offset in a list is relative to the beginning of the database; every following offset is relative to the previous occurrence in the list. So:<br />
<br />
<code>[03] 0B 02 02</code><br />
<br />
would show that there are three occurrences of the attribute value. The first is offset 11 places from the beginning of the database (ID 11), the second is offset 2 places from that (ID 13) and the third is offset 2 places from the second (ID 15).<br />
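<br />
The running-sum rule can be sketched in Python as follows. This is only an illustration of the description above: the function name <code>decode_id_list</code> is hypothetical, and the sketch assumes that every entry fits in a single byte, as in the small examples on this page.<br />

```python
def decode_id_list(data: bytes, pos: int = 0) -> list[int]:
    """Decode one ID list starting at byte `pos`.

    Assumption: the count and every offset occupy exactly one byte.
    """
    count = data[pos]
    deltas = data[pos + 1:pos + 1 + count]
    if len(deltas) != count:
        raise ValueError("ID list is truncated")
    ids = []
    current = 0
    for delta in deltas:
        current += delta  # each offset is relative to the previous ID
        ids.append(current)
    return ids


print(decode_id_list(bytes.fromhex("030B0202")))  # [11, 13, 15]
```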
<br />
==== Compressed Integers ====<br />
<br />
The example here is from a very small database so the elements of the ID lists are all one byte long. However the ID lists actually use compressed integers (see <code>Num</code> under [[Storage_Layout#Data Types|Storage Layout]] and [https://github.com/BaseXdb/basex/blob/master/basex-core/src/main/java/org/basex/util/Num.java Num.java]). This means that each element in the list can be from one to five bytes in length. The value of the first byte tells you how many bytes the element is and how to interpret the value.<br />
<br />
Values from <code>00</code> to <code>3F</code> are single-byte elements that can be read directly as 0 to 63.<br />
<br />
Values from <code>40</code> to <code>7F</code> are two-byte elements. To get the value, subtract 64 from the integer value of the first byte, multiply the result by 256 and add the value of the second byte.<br />
<br />
<code>40 A1</code> = ((64 - 64) * 256) + 161 = 161<br />
<br />
<code>41 A1</code> = ((65 - 64) * 256) + 161 = 417<br />
<br />
<code>51 A1</code> = ((81 - 64) * 256) + 161 = 4513<br />
<br />
<code>7F FF</code> = ((127 - 64) * 256) + 255 = 16383<br />
<br />
Values from <code>80</code> to <code>BF</code> are four-byte elements. To get the value, subtract 128 from the integer value of the first byte, multiply the result by 16,777,216 and add the value of the following three bytes read as a big-endian integer.<br />
<br />
<code>80 11 12 13</code> = ((128 - 128) * 16777216) + 1118739 = 1,118,739<br />
<br />
A value of <code>C0</code> shows a five-byte element. The value of the element is the direct value of the following four bytes.<br />
<br />
<code>C0 11 12 13 14</code> is 286,397,204<br />
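<br />
The decoding rules above can be condensed into a short Python sketch. It follows the description on this page rather than reproducing Num.java, and the function name <code>read_num</code> is made up for this example. It treats any first byte whose top two bits are set as the five-byte form, which is an assumption beyond the single <code>C0</code> case described here.<br />

```python
def read_num(data: bytes, pos: int = 0) -> tuple[int, int]:
    """Decode one compressed integer at `pos`.

    Returns (value, number of bytes consumed).
    """
    first = data[pos]
    kind = first >> 6  # the top two bits select the encoding
    if kind == 0:  # 00..3F: single byte, value 0..63
        return first, 1
    if kind == 1:  # 40..7F: two bytes
        return ((first - 0x40) << 8) | data[pos + 1], 2
    if kind == 2:  # 80..BF: four bytes
        rest = int.from_bytes(data[pos + 1:pos + 4], "big")
        return ((first - 0x80) << 24) | rest, 4
    # C0 (assumed: any byte with both top bits set): value in the next four bytes
    return int.from_bytes(data[pos + 1:pos + 5], "big"), 5


for example in ("3F", "40A1", "41A1", "51A1", "7FFF", "80111213", "C011121314"):
    value, length = read_num(bytes.fromhex(example))
    print(example, "->", value, f"({length} bytes)")
```

Because the function also returns the number of bytes consumed, a caller can step through an ID list element by element without knowing the element sizes in advance.<br />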
<br />
==== Gaps in the File ====<br />
<br />
When an index is created by using the functions in the GUI, commands or XQuery then the atvl.basex file will be a continuous run of ID lists. However, when the index is being updated automatically because UPDINDEX is set to true then gaps can be created in the file.<br />
<br />
Consider the earlier example, where each attribute value had only one occurrence in the database. Now we add a new file, so that there are two instances of '''d''' as an attribute value.<br />
<br />
The atvr.basex file may now look like this:<br />
<br />
<span style="background-color:#FFE4C4"><code>00 00 00 00 08</code></span><br />
<span style="background-color:#FFF8DC"><code>00 00 00 00 04</code></span><br />
<span style="background-color:#DEB887"><code>00 00 00 00 06</code></span><br />
<span style="background-color:#F4A460"><code>00 00 00 00 0C</code></span><br />
<br />
The atvl.basex file may now look like this:<br />
<br />
<code>00 00 00 04</code><br />
<span style="background-color:#FFF8DC"><code>[01] 02</code></span><br />
<span style="background-color:#DEB887"><code>[01] 05</code></span><br />
<span style="background-color:#FFE4C4"><code>[01] 08</code></span><br />
<code>[01] 0B</code><br />
<span style="background-color:#F4A460"><code>[02] 0B 03</code></span><br />
<br />
The header tells us that there are 4 attribute values but we can see there are 5 ID lists in the file. One has become orphaned - a new longer list was required to include the newly added attribute and has been appended to the end of the file.<br />
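<br />
One way to make the orphaned bytes visible is to mark every byte that is reachable from atvr.basex and report the rest. The following Python sketch does this for the example files above. It is an illustration under simplifying assumptions, not a BaseX tool: the names <code>list_length</code> and <code>orphaned_bytes</code> are hypothetical, and all list entries are assumed to be one byte long.<br />

```python
HEADER_SIZE = 4  # 4-byte count of attribute values at the start of atvl.basex
RECORD_SIZE = 5  # 5-byte references in atvr.basex


def list_length(atvl: bytes, pos: int) -> int:
    """Length in bytes of the ID list at `pos` (assumes one-byte entries)."""
    return 1 + atvl[pos]  # count byte plus one byte per occurrence


def orphaned_bytes(atvr: bytes, atvl: bytes) -> list[int]:
    """Return the positions in atvl that no record in atvr refers to."""
    used = set(range(HEADER_SIZE))
    for i in range(0, len(atvr), RECORD_SIZE):
        start = int.from_bytes(atvr[i:i + RECORD_SIZE], "big")
        used.update(range(start, start + list_length(atvl, start)))
    return [p for p in range(len(atvl)) if p not in used]


atvr = bytes.fromhex("0000000008" "0000000004" "0000000006" "000000000C")
atvl = bytes.fromhex("00000004" "0102" "0105" "0108" "010B" "020B03")
print(orphaned_bytes(atvr, atvl))  # [10, 11]
```

Positions 10 and 11 hold the old one-entry list for '''d''', which is no longer referenced.<br />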
<br />
In versions of BaseX prior to 8.0 when items are deleted and a shorter list is required it will be updated in place. When items are added and a longer list is required the new list is always added at the end of the file. Over a period of time the file will grow - running the [[Commands#OPTIMIZE|OPTIMIZE]] command will recreate the index from scratch and recover the lost space.<br />
<br />
From BaseX 8.0, some optimisations have been applied so that, while a database is open, a list of free spaces is maintained and a new list is only appended to the end of the file if no free space is large enough to hold it. However, this list of free spaces is lost when the database is closed, so operations performed after the database is reopened are not aware of the existing free space. This, and the fact that small spaces (single bytes, for example) are unlikely to be filled, means that the index file may still grow larger than it needs to be. This space can be recovered, as before, by running [[Commands#OPTIMIZE|OPTIMIZE]].<br />
<br />
== Value Index Files ==<br />
<br />
These files, txtr.basex and txtl.basex, work in the same way as the attribute index files, but with references to text nodes instead of attributes.<br />
<br />
[[Category:Internals]]</div>James Ballhttps://docs.basex.org/index.php?title=Index_File_Structure&diff=10955Index File Structure2014-08-22T16:25:56Z<p>James Ball: Corrected link to OPTIMIZE</p>
<hr />
<div>This page is provided to help those who are interested in the specific file format of the index files used by BaseX. It was predominantly written to aid research into the reasons for ever increasing file size when using the <code>[[Options#UPDINDEX|UPDINDEX]]</code> option.<br />
<br />
== Attribute Index Files ==<br />
<br />
=== atvr.basex ===<br />
<br />
This file is made up of a series of 5-byte records. There is one record for each attribute value in the index and the records are sorted by ascending order of value. The record itself is big-endian integer value giving the position of the start of the ID list in the atvl.basex file.<br />
<br />
In the example below we have a file for a database with attribute values entered in the order 100, 200, 1 & d. The actual attribute values are not held in this file but because we know the records in this file are ordered by value so we can interpret that the ID list for 1 is at position 8, the ID list for 100 is at position 4, the ID list for 200 is at position 6.<br />
<br />
The bytes of the file (in hex) are:<br />
<br />
<code>00 00 00 00 08</code><code>00 00 00 00 04</code><code>00 00 00 00 06</code><code>00 00 00 00 0A</code><br />
<br />
=== atvl.basex ===<br />
<br />
This file provides a number of details.<br />
*The total number of attribute values<br />
*The number of times each attribute value appears<br />
*The offsets for each occurrence of each attribute value<br />
<br />
In the example below we have the file for the database used in the atvr.basex example above. The first four bytes provide a big-endian integer value of the total number of different attribute values in the index - in this case 4.<br />
<br />
The remainder of the file is made up of ID lists. Each list starts at one of the positions recorded in atvr.basex - in our example there is a list starting at byte position 8 (counting from 0). The first item in the list is a count of the number of attributes with this value - here it is 1. The list then holds the locations of the attributes - in our case there is only one attribute and it is at position 8. This means that it is offset 8 positions from the beginning of the database (use the [[Commands#INFO_STORAGE|INFO STORAGE]] command to view the order).<br />
<br />
<code>00 00 00 04</code><code>[01] 02</code><code>[01] 05</code>'''<code>[01] 08</code>'''<code>[01] 0B</code><br />
<br />
The offset is against the beginning of the database for the first occurrence of the attribute and then against the previous attribute for all the rest in the list. So:<br />
<br />
<code>[03] 0B 02 02</code><br />
<br />
would show that there are three values of the attribute. The first is offset 11 places from the beginning of the database (ID 11), the second is offset 2 places from that (ID 13) and the third is offset two places from the second (ID 15).<br />
<br />
==== Compressed Integers ====<br />
<br />
The example here is from a very small database so the elements of the ID lists are all one byte long. However the ID lists actually use compressed integers (see <code>Num</code> under [[Storage_Layout#Data Types|Storage Layout]] and [https://github.com/BaseXdb/basex/blob/master/basex-core/src/main/java/org/basex/util/Num.java Num.java]). This means that each element in the list can be from one to five bytes in length. The value of the first byte tells you how many bytes the element is and how to interpret the value.<br />
<br />
Values from <code>00</code> to <code>3F</code> are single-byte elements that can be read directly as 0 to 63.<br />
<br />
Value from <code>40</code> to <code>7F</code> are dual byte elements. The value is given by the integer value of the first byte, subtract 64 and multiply by 256 plus the direct value of the second byte.<br />
<br />
<code>40 A1</code> = ((64 - 64) * 256) + 161 = 161<br />
<br />
<code>41 A1</code> = ((65 - 64) * 256) + 161 = 417<br />
<br />
<code>51 A1</code> = ((81 - 64) * 256) + 161 = 4513<br />
<br />
<code>7F FF</code> = ((127 - 64) * 256) + 255 = 16383<br />
<br />
Values from <code>80</code> to <code>BF</code> are four-byte elements. The value is given by the integer value of the first byte, subtract 128 and multiply by 16,777,216 plus the direct value of the following three bytes.<br />
<br />
<code>80 11 12 13</code> = ((128 - 128) * 16777216) + 1118739 = 1,118,739<br />
<br />
A value of <code>C0</code> shows a five-byte element. The value of the element is the direct value of the following four bytes.<br />
<br />
<code>C0 11 12 13 14</code> is 286,397,204<br />
<br />
==== Gaps in the File ====<br />
<br />
When an index is created by using the functions in the GUI, commands or XQuery then the atvl.basex file will be a continuous run of ID lists. However, when the index is being updated automatically because UPDINDEX is set to true then gaps can be created in the file.<br />
<br />
Consider the earlier example, where each attribute value had only one occurrence in the database. Now we add a new file, so that there are two instances of '''d''' as an attribute value.<br />
<br />
The atvr.basex file may now look like this:<br />
<br />
<span style="background-color:#FFE4C4"><code>00 00 00 00 08</code></span><br />
<span style="background-color:#FFF8DC"><code>00 00 00 00 04</code></span><br />
<span style="background-color:#DEB887"><code>00 00 00 00 06</code></span><br />
<span style="background-color:#F4A460"><code>00 00 00 00 0C</code></span><br />
<br />
The atvl.basex file may now look like this:<br />
<br />
<code>00 00 00 04</code><br />
<span style="background-color:#FFF8DC"><code>[01] 02</code></span><br />
<span style="background-color:#DEB887"><code>[01] 05</code></span><br />
<span style="background-color:#FFE4C4"><code>[01] 08</code></span><br />
<code>[01] 0B</code><br />
<span style="background-color:#F4A460"><code>[02] 0B 03</code></span><br />
<br />
The header tells us that there are 4 attribute values but we can see there are 5 ID lists in the file. One has become orphaned - a new longer list was required to include the newly added attribute and has been appended to the end of the file.<br />
<br />
In versions of BaseX prior to 8.0 when items are deleted and a shorter list is required it will be updated in place. When items are added and a longer list is required the new list is always added at the end of the file. Over a period of time the file will grow - running the [[Commands#OPTIMIZE|OPTIMIZE]] command will recreate the index from scratch and recover the lost space.<br />
<br />
From BaseX 8.0, some optimisations have been applied so that, while a database is open, a list of free spaces is maintained and a new list is only appended to the end of the file if no free space is large enough to hold it. However, this list of free spaces is lost when the database is closed, so operations performed after the database is reopened are not aware of the existing free space. This, and the fact that small spaces (single bytes, for example) are unlikely to be filled, means that the index file may still grow larger than it needs to be. This space can be recovered, as before, by running [[Commands#OPTIMIZE|OPTIMIZE]].<br />
<br />
== Value Index Files ==<br />
<br />
These files, txtr.basex and txtl.basex work in the same way as the attribute index files but with references to the text nodes instead of attributes.</div>James Ballhttps://docs.basex.org/index.php?title=Index_File_Structure&diff=10954Index File Structure2014-08-22T16:24:29Z<p>James Ball: Corrected link to INFO STORAGE</p>
<hr />
<div>This page is provided to help those who are interested in the specific file format of the index files used by BaseX. It was predominantly written to aid research into the reasons for ever increasing file size when using the <code>[[Options#UPDINDEX|UPDINDEX]]</code> option.<br />
<br />
== Attribute Index Files ==<br />
<br />
=== atvr.basex ===<br />
<br />
This file is made up of a series of 5-byte records. There is one record for each attribute value in the index and the records are sorted by ascending order of value. The record itself is big-endian integer value giving the position of the start of the ID list in the atvl.basex file.<br />
<br />
In the example below we have a file for a database with attribute values entered in the order 100, 200, 1 & d. The actual attribute values are not held in this file but because we know the records in this file are ordered by value so we can interpret that the ID list for 1 is at position 8, the ID list for 100 is at position 4, the ID list for 200 is at position 6.<br />
<br />
The bytes of the file (in hex) are:<br />
<br />
<code>00 00 00 00 08</code><code>00 00 00 00 04</code><code>00 00 00 00 06</code><code>00 00 00 00 0A</code><br />
<br />
=== atvl.basex ===<br />
<br />
This file provides a number of details.<br />
*The total number of attribute values<br />
*The number of times each attribute value appears<br />
*The offsets for each occurrence of each attribute value<br />
<br />
In the example below we have the file for the database used in the atvr.basex example above. The first four bytes provide a big-endian integer value of the total number of different attribute values in the index - in this case 4.<br />
<br />
The remainder of the file is made up of ID lists. Each list starts at one of the positions recorded in atvr.basex - in our example there is a list starting at byte position 8 (counting from 0). The first item in the list is a count of the number of attributes with this value - here it is 1. The list then holds the locations of the attributes - in our case there is only one attribute and it is at position 8. This means that it is offset 8 positions from the beginning of the database (use the [[Commands#INFO_STORAGE|INFO STORAGE]] command to view the order).<br />
<br />
<code>00 00 00 04</code><code>[01] 02</code><code>[01] 05</code>'''<code>[01] 08</code>'''<code>[01] 0B</code><br />
<br />
The offset is against the beginning of the database for the first occurrence of the attribute and then against the previous attribute for all the rest in the list. So:<br />
<br />
<code>[03] 0B 02 02</code><br />
<br />
would show that there are three values of the attribute. The first is offset 11 places from the beginning of the database (ID 11), the second is offset 2 places from that (ID 13) and the third is offset two places from the second (ID 15).<br />
<br />
==== Compressed Integers ====<br />
<br />
The example here is from a very small database so the elements of the ID lists are all one byte long. However the ID lists actually use compressed integers (see <code>Num</code> under [[Storage_Layout#Data Types|Storage Layout]] and [https://github.com/BaseXdb/basex/blob/master/basex-core/src/main/java/org/basex/util/Num.java Num.java]). This means that each element in the list can be from one to five bytes in length. The value of the first byte tells you how many bytes the element is and how to interpret the value.<br />
<br />
Values from <code>00</code> to <code>3F</code> are single-byte elements that can be read directly as 0 to 63.<br />
<br />
Value from <code>40</code> to <code>7F</code> are dual byte elements. The value is given by the integer value of the first byte, subtract 64 and multiply by 256 plus the direct value of the second byte.<br />
<br />
<code>40 A1</code> = ((64 - 64) * 256) + 161 = 161<br />
<br />
<code>41 A1</code> = ((65 - 64) * 256) + 161 = 417<br />
<br />
<code>51 A1</code> = ((81 - 64) * 256) + 161 = 4513<br />
<br />
<code>7F FF</code> = ((127 - 64) * 256) + 255 = 16383<br />
<br />
Values from <code>80</code> to <code>BF</code> are four-byte elements. The value is given by the integer value of the first byte, subtract 128 and multiply by 16,777,216 plus the direct value of the following three bytes.<br />
<br />
<code>80 11 12 13</code> = ((128 - 128) * 16777216) + 1118739 = 1,118,739<br />
<br />
A value of <code>C0</code> shows a five-byte element. The value of the element is the direct value of the following four bytes.<br />
<br />
<code>C0 11 12 13 14</code> is 286,397,204<br />
<br />
==== Gaps in the File ====<br />
<br />
When an index is created by using the functions in the GUI, commands or XQuery then the atvl.basex file will be a continuous run of ID lists. However, when the index is being updated automatically because UPDINDEX is set to true then gaps can be created in the file.<br />
<br />
Consider the earlier example, where each attribute value had only one occurrence in the database. Now we add a new file, so that there are two instances of '''d''' as an attribute value.<br />
<br />
The atvr.basex file may now look like this:<br />
<br />
<span style="background-color:#FFE4C4"><code>00 00 00 00 08</code></span><br />
<span style="background-color:#FFF8DC"><code>00 00 00 00 04</code></span><br />
<span style="background-color:#DEB887"><code>00 00 00 00 06</code></span><br />
<span style="background-color:#F4A460"><code>00 00 00 00 0C</code></span><br />
<br />
The atvl.basex file may now look like this:<br />
<br />
<code>00 00 00 04</code><br />
<span style="background-color:#FFF8DC"><code>[01] 02</code></span><br />
<span style="background-color:#DEB887"><code>[01] 05</code></span><br />
<span style="background-color:#FFE4C4"><code>[01] 08</code></span><br />
<code>[01] 0B</code><br />
<span style="background-color:#F4A460"><code>[02] 0B 03</code></span><br />
<br />
The header tells us that there are 4 attribute values but we can see there are 5 ID lists in the file. One has become orphaned - a new longer list was required to include the newly added attribute and has been appended to the end of the file.<br />
<br />
In versions of BaseX prior to 8.0 when items are deleted and a shorter list is required it will be updated in place. When items are added and a longer list is required the new list is always added at the end of the file. Over a period of time the file will grow - running the [[Options#OPTIMIZE|OPTIMIZE]] command will recreate the index from scratch and recover the lost space.<br />
<br />
From BaseX 8.0, some optimisations have been applied so that, while a database is open, a list of free spaces is maintained and a new list is only appended to the end of the file if no free space is large enough to hold it. However, this list of free spaces is lost when the database is closed, so operations performed after the database is reopened are not aware of the existing free space. This, and the fact that small spaces (single bytes, for example) are unlikely to be filled, means that the index file may still grow larger than it needs to be. This space can be recovered, as before, by running [[Options#OPTIMIZE|OPTIMIZE]].<br />
<br />
== Value Index Files ==<br />
<br />
These files, txtr.basex and txtl.basex work in the same way as the attribute index files but with references to the text nodes instead of attributes.</div>James Ballhttps://docs.basex.org/index.php?title=Storage_Layout&diff=10953Storage Layout2014-08-22T14:51:51Z<p>James Ball: Provided a link to the Index File Structure page giving more details of the layout of those files.</p>
<hr />
<div>This article is part of the [[Advanced User's Guide]]. It presents some low-level details on how data is stored in the database files.<br />
<br />
=Data Types=<br />
<br />
The following data types are used for specifying the storage layout:<br />
<br />
{| class="wikitable" width="100%"<br />
|-<br />
! Type<br />
! Description<br />
! Example (native → hex integers)<br />
|-<br />
| {{Type|Num}}<br />
| Compressed integer (1-5 bytes), specified in [https://github.com/BaseXdb/basex/blob/master/basex-core/src/main/java/org/basex/util/Num.java Num.java]<br />
| {{Code|15}} → {{Code|0F}}; {{Code|511}} → {{Code|41 FF}}<br/><br />
|-<br />
| {{Type|Token}}<br />
| Length ({{Type|Num}}) followed by the bytes of the UTF-8 representation<br />
| {{Code|Hello}} → {{Code|05 48 65 6c 6c 6f}}<br />
|-<br />
| {{Type|Double}}<br />
| Number, stored as token<br />
| {{Code|123}} → {{Code|03 31 32 33}}<br />
|-<br />
| {{Type|Boolean}}<br />
| Boolean (1 byte, {{Code|00}} or {{Code|01}})<br />
| {{Code|true}} → {{Code|01}}<br />
|-<br />
| {{Type|Nums}}, {{Type|Tokens}}, {{Type|Doubles}}<br />
| Arrays of values, introduced with the number of entries<br />
| {{Code|1,2}} → {{Code|02 01 31 01 32}}<br />
|-<br />
| {{Type|TokenSet}}<br />
| Key array ({{Type|Tokens}}), next/bucket/size arrays (3x {{Type|Nums}})<br />
|<br />
|}<br />
<br />
=Database Files=<br />
<br />
The following tables illustrate the layout of the BaseX database files. All files are suffixed with {{Code|.basex}}.<br />
<br />
==Meta Data, Name/Path/Doc Indexes: {{Code|inf}}==<br />
<br />
{| class="wikitable" width="100%"<br />
|-<br />
! Description<br />
! Format<br />
! Method<br />
|-<br />
| '''1. Meta Data'''<br />
| 1. Key/value pairs, in no particular order ({{Type|Token}}/{{Type|Token}}):<br/>&nbsp; &bull; Examples: {{Code|FNAME}}, {{Code|TIME}}, {{Code|SIZE}}, ...<br />&nbsp; &bull; {{Code|PERM}} → Number of users ({{Type|Num}}), and name/password/permission values for each user ({{Type|Token}}/{{Type|Token}}/{{Type|Num}})<br/>2. Empty key as finalizer<br />
| [https://github.com/BaseXdb/basex/blob/master/basex-core/src/main/java/org/basex/data/DiskData.java DiskData()]<br/>[https://github.com/BaseXdb/basex/blob/master/basex-core/src/main/java/org/basex/data/MetaData.java MetaData()]<br/>[https://github.com/BaseXdb/basex/blob/master/basex-core/src/main/java/org/basex/core/Users.java Users()]<br />
|-<br />
| '''2. Main memory indexes'''<br />
| 1. Key/value pairs, in no particular order ({{Type|Token}}/{{Type|Token}}):<br />&nbsp; &bull; {{Code|TAGS}} → Tag Index<br />&nbsp; &bull; {{Code|ATTS}} → Attribute Name Index<br />&nbsp; &bull; {{Code|PATH}} → Path Index<br />&nbsp; &bull; {{Code|NS}} → Namespaces<br />&nbsp; &bull; {{Code|DOCS}} → Document Index<br/>2. Empty key as finalizer<br />
| [https://github.com/BaseXdb/basex/blob/master/basex-core/src/main/java/org/basex/data/DiskData.java DiskData()]<br />
|-<br />
| '''2 a) Name Index'''<br/>Tag/attribute names<br />
| 1. Token set, storing all names ({{Type|TokenSet}})<br />2. One StatsKey instance per entry:<br/>2.1. Content kind ({{Type|Num}}):<br />2.1.1. Number: min/max ({{Type|Doubles}})<br />2.1.2. Category: number of entries ({{Type|Num}}), entries ({{Type|Tokens}})<br />2.2. Number of entries ({{Type|Num}})<br />2.3. Leaf flag ({{Type|Boolean}})<br />2.4. Maximum text length ({{Type|Double}}; legacy, could be {{Type|Num}})<br />
| [https://github.com/BaseXdb/basex/blob/master/basex-core/src/main/java/org/basex/index/Names.java Names()]<br/>[https://github.com/BaseXdb/basex/blob/master/basex-core/src/main/java/org/basex/util/hash/TokenSet.java TokenSet.read()]<br/>[https://github.com/BaseXdb/basex/blob/master/basex-core/src/main/java/org/basex/index/StatsKey.java StatsKey()]<br />
|-<br />
| '''2 b) Path Index'''<br />
| 1. Flag for path definition ({{Type|Boolean}}, always {{Code|true}}; legacy)<br/>2. PathNode:<br/>2.1. Name reference ({{Type|Num}})<br/>2.2. Node kind ({{Type|Num}})<br/>2.3. Number of occurrences ({{Type|Num}})<br/>2.4. Number of children ({{Type|Num}})<br/>2.5. {{Type|Double}}; legacy, can be reused or discarded<br/>2.6. Recursive generation of child nodes (→ 2)<br />
| [https://github.com/BaseXdb/basex/blob/master/basex-core/src/main/java/org/basex/index/path/PathSummary.java PathSummary()]<br/>[https://github.com/BaseXdb/basex/blob/master/basex-core/src/main/java/org/basex/index/path/PathNode.java PathNode()]<br />
|-<br />
| '''2 c) Namespaces'''<br />
| 1. Token set, storing prefixes ({{Type|TokenSet}})<br/>2. Token set, storing URIs ({{Type|TokenSet}})<br/>3. NSNode:<br/>3.1. pre value ({{Type|Num}})<br/>3.2. References to prefix/URI pairs ({{Type|Nums}})<br/>3.3. Number of children ({{Type|Num}})<br/>3.4. Recursive generation of child nodes (→ 3)<br />
| [https://github.com/BaseXdb/basex/blob/master/basex-core/src/main/java/org/basex/data/Namespaces.java Namespaces()]<br/>[https://github.com/BaseXdb/basex/blob/master/basex-core/src/main/java/org/basex/data/NSNode.java NSNode()]<br />
|-<br />
| '''2 d) Document Index'''<br />
| Array of integers, representing the distances between all document pre values ({{Type|Nums}})<br />
| [https://github.com/BaseXdb/basex/blob/master/basex-core/src/main/java/org/basex/index/DocIndex.java DocIndex()]<br />
|}<br />
<br />
==Node Table: {{Code|tbl}}, {{Code|tbli}}==<br />
<br />
* {{Code|tbl}}: Main database table, stored in blocks.<br />
* {{Code|tbli}}: Database directory, organizing the database blocks.<br />
<br />
Some more information on the [[Node Storage|node storage]] is available.<br />
<br />
==Texts: {{Code|txt}}, {{Code|atv}}==<br />
<br />
* {{Code|txt}}: Heap file for text values (document names, string values of texts, comments and processing instructions)<br />
* {{Code|atv}}: Heap file for attribute values.<br />
<br />
==Value Indexes: {{Code|txtl}}, {{Code|txtr}}, {{Code|atvl}}, {{Code|atvr}}==<br />
<br />
'''Text Index:'''<br />
* {{Code|txtl}}: Heap file with ID lists.<br />
* {{Code|txtr}}: Index file with references to ID lists.<br />
The '''Attribute Index''' is contained in the files {{Code|atvl}} and {{Code|atvr}}; it uses the same layout.<br />
<br />
For a more detailed discussion and examples of these file formats please see [[Index File Structure]].<br />
<br />
==Full-Text Fuzzy Index: {{Code|ftxx}}, {{Code|ftxy}}, {{Code|ftxz}}==<br />
<br />
...may soon be reimplemented.<br />
<br />
[[Category:Internals]]</div>James Ballhttps://docs.basex.org/index.php?title=Index_File_Structure&diff=10952Index File Structure2014-08-22T14:49:53Z<p>James Ball: Correction from saying holds list of ids to saying holds list of offsets.</p>
<hr />
<div>This page is provided to help those who are interested in the specific file format of the index files used by BaseX. It was predominantly written to aid research into the reasons for ever increasing file size when using the <code>[[Options#UPDINDEX|UPDINDEX]]</code> option.<br />
<br />
== Attribute Index Files ==<br />
<br />
=== atvr.basex ===<br />
<br />
This file is made up of a series of 5-byte records. There is one record for each attribute value in the index and the records are sorted by ascending order of value. The record itself is big-endian integer value giving the position of the start of the ID list in the atvl.basex file.<br />
<br />
In the example below we have a file for a database with attribute values entered in the order 100, 200, 1 & d. The actual attribute values are not held in this file but because we know the records in this file are ordered by value so we can interpret that the ID list for 1 is at position 8, the ID list for 100 is at position 4, the ID list for 200 is at position 6.<br />
<br />
The bytes of the file (in hex) are:<br />
<br />
<code>00 00 00 00 08</code><code>00 00 00 00 04</code><code>00 00 00 00 06</code><code>00 00 00 00 0A</code><br />
<br />
=== atvl.basex ===<br />
<br />
This file provides a number of details.<br />
*The total number of attribute values<br />
*The number of times each attribute value appears<br />
*The offsets for each occurrence of each attribute value<br />
<br />
In the example below we have the file for the database used in the atvr.basex example above. The first four bytes provide a big-endian integer value of the total number of different attribute values in the index - in this case 4.<br />
<br />
The remainder of the file is made up of ID lists. Each list starts at one of the positions recorded in atvr.basex - in our example there is a list starting at byte position 8 (counting from 0). The first item in the list is a count of the number of attributes with this value - in our case it's 1. The list then holds the locations of the attributes - in our case there is only one attribute and it's at position 8. This means that it is offset 8 positions from the beginning of the database (use the [[Options#Info|INFO STORAGE]] command to view the order).<br />
<br />
<code>00 00 00 04</code><code>[01] 02</code><code>[01] 05</code>'''<code>[01] 08</code>'''<code>[01] 0B</code><br />
<br />
The offset is against the beginning of the database for the first occurrence of the attribute and then against the previous attribute for all the rest in the list. So:<br />
<br />
<code>[03] 0B 02 02</code><br />
<br />
would show that there are three occurrences of the attribute value. The first is offset 11 places from the beginning of the database (ID 11), the second is offset 2 places from that (ID 13) and the third is offset 2 places from the second (ID 15).<br />
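<br />
The relative offsets can be turned back into absolute positions with a running sum. The sketch below assumes that every element of the list fits into a single byte, as in the small examples on this page (larger values use the compressed integers described in the next section); <code>read_id_list</code> is a hypothetical helper name, not a BaseX function.<br />

```python
def read_id_list(data: bytes, start: int) -> list[int]:
    """Decode one ID list, assuming every element is a single byte."""
    count = data[start]                # first element: number of occurrences
    ids = []
    current = 0
    for delta in data[start + 1:start + 1 + count]:
        current += delta               # each entry is relative to the previous one
        ids.append(current)
    return ids


# The three-entry list discussed above.
three = read_id_list(bytes.fromhex("03 0B 02 02"), 0)

# The list at position 8 of the example atvl.basex file.
atvl = bytes.fromhex("00 00 00 04 01 02 01 05 01 08 01 0B")
single = read_id_list(atvl, 8)
```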
<br />
==== Compressed Integers ====<br />
<br />
The example here is from a very small database so the elements of the ID lists are all one byte long. However the ID lists actually use compressed integers (see <code>Num</code> under [[Storage_Layout#Data Types|Storage Layout]] and [https://github.com/BaseXdb/basex/blob/master/basex-core/src/main/java/org/basex/util/Num.java Num.java]). This means that each element in the list can be from one to five bytes in length. The value of the first byte tells you how many bytes the element is and how to interpret the value.<br />
<br />
Values from <code>00</code> to <code>3F</code> are single-byte elements that can be read directly as 0 to 63.<br />
<br />
Values from <code>40</code> to <code>7F</code> are two-byte elements. The value is calculated by subtracting 64 from the integer value of the first byte, multiplying the result by 256 and adding the direct value of the second byte.<br />
<br />
<code>40 A1</code> = ((64 - 64) * 256) + 161 = 161<br />
<br />
<code>41 A1</code> = ((65 - 64) * 256) + 161 = 417<br />
<br />
<code>51 A1</code> = ((81 - 64) * 256) + 161 = 4513<br />
<br />
<code>7F FF</code> = ((127 - 64) * 256) + 255 = 16383<br />
<br />
Values from <code>80</code> to <code>BF</code> are four-byte elements. The value is calculated by subtracting 128 from the integer value of the first byte, multiplying the result by 16,777,216 and adding the direct value of the following three bytes.<br />
<br />
<code>80 11 12 13</code> = ((128 - 128) * 16777216) + 1118739 = 1,118,739<br />
<br />
A value of <code>C0</code> shows a five-byte element. The value of the element is the direct value of the following four bytes.<br />
<br />
<code>C0 11 12 13 14</code> is 286,397,204<br />
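<br />
The four cases above can be collected into one small decoder. This is a sketch written from the rules on this page rather than a port of <code>Num.java</code>; the name <code>read_num</code> is an assumption, and first bytes above <code>C0</code> are simply treated like <code>C0</code> here because the page does not describe them.<br />

```python
def read_num(data: bytes, pos: int = 0) -> tuple[int, int]:
    """Decode one compressed integer; return (value, bytes consumed)."""
    first = data[pos]
    kind = first >> 6                  # the top two bits select the encoding
    if kind == 0:                      # 00-3F: one byte, value 0 to 63
        return first, 1
    if kind == 1:                      # 40-7F: two bytes
        return ((first - 0x40) << 8) | data[pos + 1], 2
    if kind == 2:                      # 80-BF: four bytes
        rest = int.from_bytes(data[pos + 1:pos + 4], "big")
        return ((first - 0x80) << 24) | rest, 4
    # C0: five bytes, the value is held in the four bytes that follow.
    # Assumption: any other first byte in C1-FF is handled the same way.
    return int.from_bytes(data[pos + 1:pos + 5], "big"), 5


value, length = read_num(bytes.fromhex("51 A1"))
```

Returning the number of bytes consumed lets a caller step through an ID list element by element.<br />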
<br />
==== Gaps in the File ====<br />
<br />
When an index is created by using the functions in the GUI, commands or XQuery then the atvl.basex file will be a continuous run of ID lists. However, when the index is being updated automatically because UPDINDEX is set to true then gaps can be created in the file.<br />
<br />
Consider the example we had before where each attribute value had only one occurrence in the database. Now we add a new file and there are now two instances of '''d''' as an attribute value.<br />
<br />
The atvr.basex file may now look like this:<br />
<br />
<span style="background-color:#FFE4C4"><code>00 00 00 00 08</code></span><br />
<span style="background-color:#FFF8DC"><code>00 00 00 00 04</code></span><br />
<span style="background-color:#DEB887"><code>00 00 00 00 06</code></span><br />
<span style="background-color:#F4A460"><code>00 00 00 00 0C</code></span><br />
<br />
The atvl.basex file may now look like this:<br />
<br />
<code>00 00 00 04</code><br />
<span style="background-color:#FFF8DC"><code>[01] 02</code></span><br />
<span style="background-color:#DEB887"><code>[01] 05</code></span><br />
<span style="background-color:#FFE4C4"><code>[01] 08</code></span><br />
<code>[01] 0B</code><br />
<span style="background-color:#F4A460"><code>[02] 0B 03</code></span><br />
<br />
The header tells us that there are 4 attribute values but we can see there are 5 ID lists in the file. One has become orphaned - a new longer list was required to include the newly added attribute and has been appended to the end of the file.<br />
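<br />
One way to locate such orphaned bytes is to mark every byte that is reachable from atvr.basex and report the rest. The sketch below uses the two example files above and assumes single-byte list elements throughout; <code>unreferenced_bytes</code> is a hypothetical helper, not part of BaseX.<br />

```python
def unreferenced_bytes(atvr_offsets: list[int], atvl: bytes) -> list[int]:
    """Return the positions in atvl that no record in atvr leads to."""
    used = set(range(4))               # the 4-byte header is always in use
    for start in atvr_offsets:
        count = atvl[start]            # single-byte count, as in the example
        used.update(range(start, start + 1 + count))
    return [i for i in range(len(atvl)) if i not in used]


atvl = bytes.fromhex("00 00 00 04 01 02 01 05 01 08 01 0B 02 0B 03")
gaps = unreferenced_bytes([8, 4, 6, 12], atvl)
```

In this example the two bytes of the old list for '''d''' (positions 10 and 11) are no longer referenced.<br />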
<br />
In versions of BaseX prior to 8.0 when items are deleted and a shorter list is required it will be updated in place. When items are added and a longer list is required the new list is always added at the end of the file. Over a period of time the file will grow - running the [[Options#OPTIMIZE|OPTIMIZE]] command will recreate the index from scratch and recover the lost space.<br />
<br />
From BaseX 8.0 some optimisations have been applied so that while a database is open a list of free spaces is maintained, and a new list will only be added to the end of the file if there isn't a free space available that is large enough. However, this list of free spaces is lost when the database is closed, and future operations will not be aware of any free space available when the database is opened. This, and the fact that small spaces are unlikely to be filled (single bytes for example), means that the index file may still grow larger than it needs to be. This space can be recovered, as before, by running [[Options#OPTIMIZE|OPTIMIZE]].<br />
<br />
== Value Index Files ==<br />
<br />
These files, txtr.basex and txtl.basex, work in the same way as the attribute index files but with references to the text nodes instead of attributes.</div>James Ballhttps://docs.basex.org/index.php?title=Index_File_Structure&diff=10951Index File Structure2014-08-22T14:48:42Z<p>James Ball: First full draft of the information.</p>
<hr />
<div>This page is provided to help those who are interested in the specific file format of the index files used by BaseX. It was predominantly written to aid research into the reasons for ever increasing file size when using the <code>[[Options#UPDINDEX|UPDINDEX]]</code> option.<br />
<br />
== Attribute Index Files ==<br />
<br />
=== atvr.basex ===<br />
<br />
This file is made up of a series of 5-byte records. There is one record for each attribute value in the index, and the records are sorted in ascending order of value. Each record is a big-endian integer giving the position of the start of the ID list in the atvl.basex file.<br />
<br />
In the example below we have a file for a database with attribute values entered in the order 100, 200, 1 and d. The actual attribute values are not held in this file, but because the records are ordered by value we can infer that the ID list for 1 is at position 8, the ID list for 100 is at position 4, the ID list for 200 is at position 6 and the ID list for d is at position 10.<br />
<br />
The bytes of the file (in hex) are:<br />
<br />
<code>00 00 00 00 08</code><code>00 00 00 00 04</code><code>00 00 00 00 06</code><code>00 00 00 00 0A</code><br />
<br />
=== atvl.basex ===<br />
<br />
This file provides a number of details.<br />
*The total number of attribute values<br />
*The number of times each attribute value appears<br />
*The <code>ID</code> numbers of each occurrence of each attribute value<br />
<br />
In the example below we have the file for the database used in the atvr.basex example above. The first four bytes provide a big-endian integer value of the total number of different attribute values in the index - in this case 4.<br />
<br />
The remainder of the file is made up of ID lists. Each list starts at one of the positions recorded in atvr.basex - in our example there is a list starting at byte position 8 (counting from 0). The first item in the list is a count of the number of attributes with this value - in our case it's 1. The list then holds the locations of the attributes - in our case there is only one attribute and it's at position 8. This means that it is offset 8 positions from the beginning of the database (use the [[Options#Info|INFO STORAGE]] command to view the order).<br />
<br />
<code>00 00 00 04</code><code>[01] 02</code><code>[01] 05</code>'''<code>[01] 08</code>'''<code>[01] 0B</code><br />
<br />
The offset is against the beginning of the database for the first occurrence of the attribute and then against the previous attribute for all the rest in the list. So:<br />
<br />
<code>[03] 0B 02 02</code><br />
<br />
would show that there are three occurrences of the attribute value. The first is offset 11 places from the beginning of the database (ID 11), the second is offset 2 places from that (ID 13) and the third is offset 2 places from the second (ID 15).<br />
<br />
==== Compressed Integers ====<br />
<br />
The example here is from a very small database so the elements of the ID lists are all one byte long. However the ID lists actually use compressed integers (see <code>Num</code> under [[Storage_Layout#Data Types|Storage Layout]] and [https://github.com/BaseXdb/basex/blob/master/basex-core/src/main/java/org/basex/util/Num.java Num.java]). This means that each element in the list can be from one to five bytes in length. The value of the first byte tells you how many bytes the element is and how to interpret the value.<br />
<br />
Values from <code>00</code> to <code>3F</code> are single-byte elements that can be read directly as 0 to 63.<br />
<br />
Values from <code>40</code> to <code>7F</code> are two-byte elements. The value is calculated by subtracting 64 from the integer value of the first byte, multiplying the result by 256 and adding the direct value of the second byte.<br />
<br />
<code>40 A1</code> = ((64 - 64) * 256) + 161 = 161<br />
<br />
<code>41 A1</code> = ((65 - 64) * 256) + 161 = 417<br />
<br />
<code>51 A1</code> = ((81 - 64) * 256) + 161 = 4513<br />
<br />
<code>7F FF</code> = ((127 - 64) * 256) + 255 = 16383<br />
<br />
Values from <code>80</code> to <code>BF</code> are four-byte elements. The value is given by the integer value of the first byte, subtract 128 and multiply by 16,777,216 plus the direct value of the following three bytes.<br />
<br />
<code>80 11 12 13</code> = ((128 - 128) * 16777216) + 1118739 = 1,118,739<br />
<br />
A value of <code>C0</code> shows a five-byte element. The value of the element is the direct value of the following four bytes.<br />
<br />
<code>C0 11 12 13 14</code> is 286,397,204<br />
<br />
==== Gaps in the File ====<br />
<br />
When an index is created by using the functions in the GUI, commands or XQuery then the atvl.basex file will be a continuous run of ID lists. However, when the index is being updated automatically because UPDINDEX is set to true then gaps can be created in the file.<br />
<br />
Consider the example we had before where each attribute value had only one occurrence in the database. Now we add a new file and there are now two instances of '''d''' as an attribute value.<br />
<br />
The atvr.basex file may now look like this:<br />
<br />
<span style="background-color:#FFE4C4"><code>00 00 00 00 08</code></span><br />
<span style="background-color:#FFF8DC"><code>00 00 00 00 04</code></span><br />
<span style="background-color:#DEB887"><code>00 00 00 00 06</code></span><br />
<span style="background-color:#F4A460"><code>00 00 00 00 0C</code></span><br />
<br />
The atvl.basex file may now look like this:<br />
<br />
<code>00 00 00 04</code><br />
<span style="background-color:#FFF8DC"><code>[01] 02</code></span><br />
<span style="background-color:#DEB887"><code>[01] 05</code></span><br />
<span style="background-color:#FFE4C4"><code>[01] 08</code></span><br />
<code>[01] 0B</code><br />
<span style="background-color:#F4A460"><code>[02] 0B 03</code></span><br />
<br />
The header tells us that there are 4 attribute values but we can see there are 5 ID lists in the file. One has become orphaned - a new longer list was required to include the newly added attribute and has been appended to the end of the file.<br />
<br />
In versions of BaseX prior to 8.0 when items are deleted and a shorter list is required it will be updated in place. When items are added and a longer list is required the new list is always added at the end of the file. Over a period of time the file will grow - running the [[Options#OPTIMIZE|OPTIMIZE]] command will recreate the index from scratch and recover the lost space.<br />
<br />
From BaseX 8.0 some optimisations have been applied so that while a database is open a list of free spaces is maintained, and a new list will only be added to the end of the file if there isn't a free space available that is large enough. However, this list of free spaces is lost when the database is closed, and future operations will not be aware of any free space available when the database is opened. This, and the fact that small spaces are unlikely to be filled (single bytes for example), means that the index file may still grow larger than it needs to be. This space can be recovered, as before, by running [[Options#OPTIMIZE|OPTIMIZE]].<br />
<br />
== Value Index Files ==<br />
<br />
These files, txtr.basex and txtl.basex, work in the same way as the attribute index files but with references to the text nodes instead of attributes.</div>James Ballhttps://docs.basex.org/index.php?title=Index_File_Structure&diff=10948Index File Structure2014-08-21T14:24:28Z<p>James Ball: Provided link to UPDINDEX on Options page</p>
<hr />
<div>This page is provided to help those who are interested in the specific file format of the index files used by BaseX. It was predominantly written to aid research into the reasons for ever increasing file size when using the <code>[[Options#UPDINDEX|UPDINDEX]]</code> option.<br />
<br />
== Attribute Index Files ==<br />
<br />
=== atvr.basex ===<br />
<br />
=== atvl.basex ===</div>James Ballhttps://docs.basex.org/index.php?title=Index_File_Structure&diff=10947Index File Structure2014-08-20T15:38:02Z<p>James Ball: First structure for page</p>
<hr />
<div>This page is provided to help those who are interested in the specific file format of the index files used by BaseX. It was predominantly written to aid research into the reasons for ever increasing file size when using the UPDINDEX option.<br />
<br />
== Attribute Index Files ==<br />
<br />
=== atvr.basex ===<br />
<br />
=== atvl.basex ===</div>James Ballhttps://docs.basex.org/index.php?title=XQuery_3.0&diff=10559XQuery 3.02014-04-09T14:22:35Z<p>James Ball: Changed 'oder' to 'or'</p>
<hr />
<div>This article is part of the [[XQuery|XQuery Portal]].<br />
It summarizes the most interesting features of the upcoming [http://www.w3.org/TR/xquery-30/ XQuery 3.0]<br />
and [http://www.w3.org/TR/xpath-30/ XPath 3.0] Recommendations.<br />
All extensions are already available in the latest versions of BaseX.<br />
<br />
==Enhanced FLWOR Expressions==<br />
<br />
Most clauses of FLWOR expressions can now be specified in an arbitrary order: additional {{Code|let}} and {{Code|for}} clauses can be put after a {{Code|where}} clause, and multiple {{Code|where}}, {{Code|order by}} and {{Code|group by}} statements can be used. This means that many nested loops can now be rewritten to a single FLWOR expression.<br />
<br />
'''Example:''' <br />
<pre class="brush:xquery"><br />
for $country in db:open('factbook')//country<br />
where $country/@population > 100000000<br />
let $name := $country/name[1]<br />
for $city in $country//city[population > 1000000]<br />
group by $name<br />
return &lt;country name='{ $name }'&gt;{ $city/name }&lt;/country&gt;<br />
</pre><br />
<br />
A new {{Code|count}} clause enhances the FLWOR expression with a variable that enumerates the iterated tuples.<br />
<br />
<pre class="brush:xquery"><br />
for $n in (1 to 10)[. mod 2 = 1]<br />
count $c<br />
return &lt;number count="{ $c }" number="{ $n }"/&gt;<br />
</pre><br />
<br />
The {{Code|allowing empty}} clause provides functionality similar to outer joins in SQL:<br />
<br />
<pre class="brush:xquery"><br />
for $n allowing empty in ()<br />
return 'empty? ' || empty($n)<br />
</pre><br />
<br />
Window clauses provide a rich set of variable declarations to process sub-sequences of iterated tuples. An example:<br />
<br />
<pre class="brush:xquery"><br />
for tumbling window $w in (2, 4, 6, 8, 10, 12, 14)<br />
start at $s when fn:true()<br />
only end at $e when $e - $s eq 2<br />
return &lt;window&gt;{ $w }&lt;/window&gt;<br />
</pre><br />
<br />
More information on window clauses, and all other enhancements, can be found in the [http://www.w3.org/TR/xquery-30/#id-windows specification].<br />
<br />
==Simple Map Operator==<br />
<br />
The [http://www.w3.org/TR/xquery-30/#id-map-operator simple map] operator {{Code|!}} provides a compact notation for applying the results of a first to a second expression: the resulting items of the first expression are bound to the context item one by one, and the second expression is evaluated for each item. The map operator may be used as replacement for FLWOR expressions:<br />
<br />
'''Example:''' <br />
<pre class="brush:xquery"><br />
(: Simple map notation :)<br />
(1 to 10) ! element node { . },<br />
(: FLWOR notation :)<br />
for $i in 1 to 10<br />
return element node { $i }<br />
</pre><br />
<br />
The map operator is defined as part of a path expression, which may now mix path and map operators. In contrast to the path operator, the results of the map operator will not be made duplicate-free and returned in document order.<br />
<br />
==Group By==<br />
<br />
FLWOR expressions have been extended to include the [http://www.w3.org/TR/xquery-30/#id-group-by group by] clause, which is well-established among relational database systems. <code>group by</code> can be used to apply value-based partitioning to query results:<br />
<br />
'''Example:''' <br />
<pre class="brush:xquery"> <br />
for $ppl in doc('xmark')//people/person <br />
let $ic := $ppl/profile/@income<br />
let $income := if($ic < 30000) then<br />
"challenge" <br />
else if($ic >= 30000 and $ic < 100000) then <br />
"standard" <br />
else if($ic >= 100000) then <br />
"preferred" <br />
else <br />
"na" <br />
group by $income<br />
order by $income<br />
return element { $income } { count($ppl) }<br />
<br />
</pre> <br />
<br />
This query is a rewrite of [http://www.ins.cwi.nl/projects/xmark/Assets/xmlquery.txt Query #20] contained in the [http://www.ins.cwi.nl/projects/xmark XMark Benchmark Suite] to use <code>group by</code>.<br />
The query partitions the customers based on their income. <br />
<br />
'''Result:''' <br />
<pre class="brush:xml"><br />
<challenge>4731</challenge><br />
<na>12677</na><br />
<preferred>314</preferred><br />
<standard>7778</standard><br />
</pre><br />
<br />
In contrast to the relational GROUP BY statement, the XQuery counterpart<br />
concatenates the values of all non-grouping variables that belong to a specific group.<br />
In the context of our example, all nodes in <code>//people/person</code> that belong to the <code>preferred</code> partition are concatenated in <code class="brush:xquery">$ppl</code> after grouping has finished.<br />
You can see this effect by changing the return statement to:<br />
<br />
<pre class="brush:xquery"> <br />
...<br />
return element { $income } { $ppl }<br />
</pre><br />
'''Result:'''<br />
<pre class="brush:xml"><br />
<challenge><br />
<person id="person0"><br />
<name>Kasidit Treweek</name><br />
…<br />
<person id="personX"><br />
…<br />
</challenge><br />
</pre><br />
<br />
==Try/Catch==<br />
<br />
The [http://www.w3.org/TR/xquery-30/#id-try-catch try/catch] construct can be used to handle errors at runtime:<br />
<br />
'''Example:''' <br />
<pre class="brush:xquery"><br />
try {<br />
1 + '2'<br />
} catch err:XPTY0004 {<br />
'Typing error: ' || $err:description<br />
} catch * {<br />
'Error [' || $err:code || ']: ' || $err:description<br />
}<br />
</pre><br />
'''Result:''' <code>Typing error: '+' operator: number expected, xs:string found.</code><br />
<br />
Within the scope of the catch clause, a number of variables are implicitly declared, giving information about the error that occurred:<br />
<br />
* {{Code|$err:code}}: error code<br />
* {{Code|$err:description}}: error message<br />
* {{Code|$err:value}}: value associated with the error (optional)<br />
* {{Code|$err:module}}: URI of the module where the error occurred<br />
* {{Code|$err:line-number}}: line number where the error occurred<br />
* {{Code|$err:column-number}}: column number where the error occurred<br />
* {{Code|$err:additional}}: error stack trace<br />
<br />
==Switch==<br />
<br />
The [http://www.w3.org/TR/xquery-30/#id-switch switch] statement is available in many other programming languages. It chooses one of several expressions to evaluate based on its input value.<br />
<br />
'''Example:''' <br />
<pre class="brush:xquery"><br />
for $fruit in ("Apple", "Pear", "Peach")<br />
return switch ($fruit)<br />
case "Apple" return "red"<br />
case "Pear" return "green"<br />
case "Peach" return "pink"<br />
default return "unknown"<br />
</pre> <br />
'''Result:''' <code>red green pink</code><br />
<br />
==Function Items==<br />
<br />
Among the most distinguishing features added in ''XQuery 3.0'' are ''function items'', also known as ''lambdas'' or ''lambda functions''. They make it possible to abstract over functions and thus write more modular code.<br />
<br />
'''Examples:'''<br />
<br />
Function items can be obtained in three different ways:<br />
<br />
<ul><br />
<li>Declaring a new ''inline function'':<br />
<pre class="brush:xquery">let $f := function($x, $y) { $x + $y }<br />
return $f(17, 25)</pre> <br />
'''Result:''' <code>42</code><br />
</li><br />
<li>Getting the function item of an existing (built-in or user-defined) XQuery function. The arity (number of arguments) has to be specified as there can be more than one function with the same name:<br />
<pre class="brush:xquery">let $f := math:pow#2<br />
return $f(5, 2)</pre> <br />
'''Result:''' <code>25</code><br />
</li><br />
<li>''Partially applying'' another function or function item. This is done by supplying only some of the required arguments, writing the placeholder <code>?</code> in the positions of the arguments left out. The produced function item has one argument for every placeholder.<br />
<pre class="brush:xquery">let $f := fn:substring(?, 1, 3)<br />
return (<br />
$f('foo123'),<br />
$f('bar456')<br />
)</pre> <br />
'''Result:''' <code>foo bar</code><br />
</li><br />
</ul><br />
<br />
Function items can also be passed as arguments to and returned as results from functions. These so-called [[Higher-Order Functions]] like <code>fn:map</code> and <code>fn:fold-left</code> are discussed in more depth on their own Wiki page.<br />
<br />
==Expanded QNames==<br />
<br />
A ''QName'' can now be directly prefixed with the letter "Q" and a namespace URI in the [http://www.jclark.com/xml/xmlns.htm Clark Notation].<br />
<br />
'''Examples:'''<br />
* <code><nowiki>Q{http://www.w3.org/2005/xpath-functions/math}pi()</nowiki></code> returns the number π<br />
* <code>Q{java:java.io.FileOutputStream}new("output.txt")</code> creates a new Java file output stream<br />
<br />
The syntax differed in older versions of the XQuery 3.0 specification, in which the prefixed namespace URI was quoted:<br />
<br />
* <code><nowiki>"http://www.w3.org/2005/xpath-functions/math":pi()</nowiki></code><br />
* <code>"java:java.io.FileOutputStream":new("output")</code><br />
<br />
==Namespace Constructors==<br />
<br />
New namespaces can now be created via so-called 'Computed Namespace Constructors'.<br />
<br />
<pre class="brush:xquery"> <br />
element node { namespace pref { 'http://url.org/' } }<br />
</pre><br />
<br />
==String Concatenations==<br />
<br />
Two vertical bars <code>||</code> (also named ''pipe characters'') can be used to concatenate strings. This operator is a shortcut for the {{Code|fn:concat()}} function.<br />
<br />
<pre class="brush:xquery"> <br />
'Hello' || ' ' || 'Universe'<br />
</pre><br />
<br />
==External Variables==<br />
<br />
Default values can now be attached to external variable declarations. This way, an expression can also be evaluated if its external variables have not been bound to a new value.<br />
<br />
<pre class="brush:xquery"> <br />
declare variable $user external := "admin";<br />
"User:", $user<br />
</pre><br />
<br />
==Serialization==<br />
<br />
[[Serialization|Serialization]] parameters can now be defined within XQuery expressions. Parameters are placed in the query prolog and need to be specified as option declarations, using the <code>output</code> prefix.<br />
<br />
'''Example:''' <br />
<pre class="brush:xquery"><br />
declare namespace output = "http://www.w3.org/2010/xslt-xquery-serialization";<br />
declare option output:omit-xml-declaration "no";<br />
declare option output:method "xhtml";<br />
&lt;html/&gt;<br />
</pre> <br />
'''Result:''' <code>&lt;?xml version="1.0" encoding="UTF-8"?&gt;&lt;html&gt;&lt;/html&gt;</code><br />
<br />
In BaseX, the {{Code|output}} prefix is statically bound and can thus be omitted. Note that all namespaces need to be specified when using external APIs, such as [http://xqj.net/basex/ XQJ].<br />
<br />
==Context Item==<br />
<br />
The context item can now be specified in the prolog of an XQuery expression:<br />
<br />
'''Example:''' <br />
<pre class="brush:xquery"><br />
declare context item := document {<br />
<xml><br />
<text>Hello</text><br />
<text>World</text><br />
</xml><br />
};<br />
<br />
for $t in .//text()<br />
return string-length($t)<br />
</pre> <br />
'''Result:''' <code>5 5</code><br />
<br />
==Annotations==<br />
<br />
XQuery 3.0 introduces annotations to declare properties associated with functions and variables. For instance, a function may be declared %public, %private, or %updating.<br />
<br />
'''Example:''' <br />
<pre class="brush:xquery"><br />
declare %private function local:max($x1, $x2) {<br />
if($x1 > $x2) then $x1 else $x2<br />
};<br />
<br />
local:max(2, 3)<br />
</pre><br />
<br />
==Functions==<br />
<br />
BaseX supports all functions that have been added in Version 3.0 of the [http://www.w3.org/TR/xpath-functions-30/ XQuery Functions and Operators] Working Draft. The new functions are listed below:<br />
<br />
* <code>math:pi()</code>, <code>math:sin()</code>, and many others (see [[Math Module]])<br />
* <code>fn:analyze-string()</code><br />
* <code>fn:available-environment-variables()</code><br />
* <code>fn:element-with-id()</code><br />
* <code>fn:environment-variable()</code><br />
* <code>fn:filter()</code><br />
* <code>fn:fold-left()</code><br />
* <code>fn:fold-right()</code><br />
* <code>fn:format-date()</code><br />
* <code>fn:format-dateTime()</code><br />
* <code>fn:format-integer()</code><br />
* <code>fn:format-number()</code><br />
* <code>fn:format-time()</code><br />
* <code>fn:function-arity()</code><br />
* <code>fn:function-lookup()</code><br />
* <code>fn:function-name()</code><br />
* <code>fn:generate-id()</code><br />
* <code>fn:has-children()</code><br />
* <code>fn:head()</code><br />
* <code>fn:innermost()</code><br />
* <code>fn:map()</code><br />
* <code>fn:map-pairs()</code><br />
* <code>fn:outermost()</code><br />
* <code>fn:parse-xml()</code><br />
* <code>fn:parse-xml-fragment()</code><br />
* <code>fn:path()</code><br />
* <code>fn:serialize()</code><br />
* <code>fn:tail()</code><br />
* <code>fn:unparsed-text()</code><br />
* <code>fn:unparsed-text-available()</code><br />
* <code>fn:unparsed-text-lines()</code><br />
* <code>fn:uri-collection()</code><br />
<br />
New signatures have been added for the following functions:<br />
<br />
* <code>fn:document-uri()</code> with 0 arguments<br />
* <code>fn:string-join()</code> with 1 argument<br />
* <code>fn:node-name()</code> with 0 arguments<br />
* <code>fn:round()</code> with 2 arguments<br />
* <code>fn:data()</code> with 0 arguments<br />
<br />
=Changelog=<br />
<br />
;Version 7.7<br />
<br />
* Added: [[#Enhanced FLWOR Expressions|Enhanced FLWOR Expressions]]<br />
<br />
;Version 7.3<br />
<br />
* Added: [[#Simple Map Operator|Simple Map Operator]]<br />
<br />
;Version 7.2<br />
<br />
* Added: [[#Annotations|Annotations]]<br />
* Updated: [[#Expanded QNames|Expanded QNames]]<br />
<br />
;Version 7.1<br />
<br />
* Added: [[#Expanded QNames|Expanded QNames]], [[#Namespace Constructors|Namespace Constructors]]<br />
<br />
;Version 7.0<br />
<br />
* Added: [[#String Concatenations|String Concatenations]]<br />
<br />
[[Category:XQuery]]</div>James Ballhttps://docs.basex.org/index.php?title=HTML_Module&diff=10542HTML Module2014-03-31T17:36:01Z<p>James Ball: Corrected fetch:content-binary to fetch:binary in section Parsing Binary Input</p>
<hr />
<div>This [[Module Library|XQuery Module]] provides functions for converting HTML to XML. Conversion will only take place if [http://home.ccil.org/~cowan/XML/tagsoup/ TagSoup] is included in the classpath (see [[Parsers#HTML Parser|HTML Parsing]] for more details).<br />
<br />
=Conventions=<br />
<br />
All functions in this module are assigned to the {{Code|http://basex.org/modules/html}} namespace, which is statically bound to the {{Code|html}} prefix.<br/><br />
All errors are assigned to the {{Code|http://basex.org/errors}} namespace, which is statically bound to the {{Code|bxerr}} prefix.<br />
<br />
=Functions=<br />
<br />
==html:parser==<br />
<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Code|'''html:parser'''() as xs:string}}<br /><br />
|-<br />
| '''Summary'''<br />
|Returns the name of the applied HTML parser (currently: {{Code|TagSoup}}). If an ''empty string'' is returned, TagSoup was not found in the classpath, and the input will be treated as well-formed XML.<br /><br />
|}<br />
<br />
==html:parse==<br />
{| width='100%'<br />
|-<br />
| width='120' | '''Signatures'''<br />
|{{Func|html:parse|$input as xs:anyAtomicType|document-node()}}<br />{{Func|html:parse|$input as xs:anyAtomicType, $options as item()|document-node()}}<br /><br />
|-<br />
| '''Summary'''<br />
|Converts the HTML document specified by {{Code|$input}} to XML, and returns a document node:<br/><br />
* The input may either be a string or a binary item (xs:hexBinary, xs:base64Binary).<br />
* If the input is passed on in its binary representation, the HTML parser will try to automatically choose the correct encoding.<br />
<br />
The {{Code|$options}} argument can be used to set [[Parsers#TagSoup Options|TagSoup Options]], which can be specified…<br /><br />
* as children of an {{Code|<html:options/>}} element; e.g.:<br />
<pre class="brush:xml"><br />
<html:options><br />
<html:key1 value='value1'/><br />
...<br />
</html:options><br />
</pre><br />
* as map, which contains all key/value pairs:<br />
<pre class="brush:xquery"><br />
map { "key1" := "value1", ... }<br />
</pre><br />
|-<br />
| '''Errors'''<br />
|{{Error|BXHL0001|#Errors}} the input cannot be converted to XML.<br />
|}<br />
<br />
=Examples=<br />
<br />
===Basic Example===<br />
<br />
The following query converts the specified string to an XML document node.<br />
<br />
;Query:<br />
<pre class="brush:xquery"><br />
html:parse("<html>")<br />
</pre><br />
<br />
;Result:<br />
<pre class="brush:xml"><br />
<html xmlns="http://www.w3.org/1999/xhtml"/><br />
</pre><br />
<br />
===Specifying Options===<br />
<br />
The next query creates an XML document without namespaces:<br />
<br />
;Query:<br />
<pre class="brush:xquery"><br />
html:parse("<a href='ok.html'/>", map { 'nons' := true() })<br />
</pre><br />
<br />
;Result:<br />
<pre class="brush:xml"><br />
<html><br />
<body><br />
<a shape="rect" href="ok.html"/><br />
</body><br />
</html><br />
</pre><br />
<br />
===Parsing Binary Input===<br />
<br />
If the input encoding is unknown, the data to be processed can be passed on in its binary representation.<br />
The HTML parser will automatically try to detect the correct encoding:<br />
<br />
;Query:<br />
<pre class="brush:xquery"><br />
html:parse(fetch:binary("http://en.wikipedia.org"))<br />
</pre><br />
<br />
;Result:<br />
<pre class="brush:xml"><br />
<html xmlns="http://www.w3.org/1999/xhtml" class="client-nojs" dir="ltr" lang="en"><br />
<head><br />
<title>Wikipedia, the free encyclopedia</title><br />
<meta charset="UTF-8"/><br />
...<br />
</pre><br />
<br />
=Errors=<br />
<br />
{| class="wikitable" width="100%"<br />
! width="110"|Code<br />
!Description<br />
|-<br />
|{{Code|BXHL0001}}<br />
|The input cannot be converted to XML.<br />
|}<br />
<br />
=Changelog=<br />
<br />
The module was introduced with Version 7.6.<br />
<br />
[[Category:XQuery]]</div>James Ball