Difference between revisions of "Serialization"
(48 intermediate revisions by 4 users not shown) | |||
Line 1: | Line 1: | ||
This page is part of the [[XQuery|XQuery Portal]]. | This page is part of the [[XQuery|XQuery Portal]]. | ||
− | |||
− | |||
− | |||
− | * | + | Serialization parameters define how XQuery items and XML nodes will be ''serialized'' (i.e., returned to the client or an API, usually in textual form). The official parameters are defined in the [http://www.w3.org/TR/xslt-xquery-serialization-31 W3C XQuery Serialization 3.1] document. In BaseX, they can be specified by: |
− | * | + | |
− | * | + | * including them in the [[XQuery_3.0#Serialization|prolog of the XQuery expression]]; |
− | * | + | * specifying them in the XQuery functions [[File_Module#file:write|file:write()]] or [[XQuery_3.0#Functions|fn:serialize()]]. The serialization parameters are specified as |
− | * | + | ** children of an {{Code|<output:serialization-parameters/>}} element, as defined for the [http://www.w3.org/TR/xpath-functions-30/#func-serialize fn:serialize()] function, or as |
− | * | + | ** map, which contains all key/value pairs: <code>map { "method": "xml", "cdata-section-elements": "div", ... }</code>; |
+ | * using the {{Code|-s}} flag of the BaseX [[Command-Line Options#BaseX Standalone|command-line]] clients; | ||
+ | * setting the {{Option|SERIALIZER}} option before running a query; | ||
+ | * setting the {{Option|EXPORTER}} option before exporting a database; or | ||
+ | * setting them as [[REST#Parameters|REST]] query parameters. | ||
+ | |||
+ | Due to the wide range of ways how parameters can be supplied, we deliberately ignored one rule of the specification, which requires non-official features to be defined in a non-null namespace URI. In the following, we will indicate which features are specific to our implementation. | ||
=Parameters= | =Parameters= | ||
− | The following | + | The following serialization parameters are supported by BaseX (further details can be looked up in the official specification): |
{| class="wikitable sortable" width="100%" | {| class="wikitable sortable" width="100%" | ||
|- valign="top" | |- valign="top" | ||
! width="140" | Parameter | ! width="140" | Parameter | ||
− | ! | + | ! Description |
! Allowed | ! Allowed | ||
! Default | ! Default | ||
|- valign="top" | |- valign="top" | ||
| {{Code|method}} | | {{Code|method}} | ||
− | | Specifies the serialization method | + | | Specifies the serialization method. {{Code|xml}}, {{Code|xhtml}}, {{Code|html}}, {{Code|text}} and {{Code|adaptive}} are part of the official specification. For more details on {{Code|basex}}, {{Code|csv}} and {{Code|json}}, see [[XQuery Extensions#Serialization|XQuery Extensions]]. |
− | + | | {{Code|xml}}, {{Code|xhtml}}, {{Code|html}}, {{Code|text}}, {{Code|json}}, {{Code|adaptive}}, {{Code|csv}}, {{Code|basex}} | |
− | + | | {{Code|basex}} | |
− | |||
− | |||
− | |||
− | |||
− | | {{Code| | ||
|- valign="top" | |- valign="top" | ||
| {{Code|version}} | | {{Code|version}} | ||
| Specifies the version of the serialization method. | | Specifies the version of the serialization method. | ||
− | + | | xml/xhtml: {{Code|1.0}}, {{Code|1.1}}<br/>html: {{Code|4.0}}, {{Code|4.01}}, {{Code|5.0}}<br/> | |
| {{Code|1.0}} | | {{Code|1.0}} | ||
|- valign="top" | |- valign="top" | ||
Line 44: | Line 42: | ||
| {{Code|item-separator}} | | {{Code|item-separator}} | ||
| Determines a string to be used as item separator. If a separator is specified, the default separation of atomic values with single whitespaces will be skipped. | | Determines a string to be used as item separator. If a separator is specified, the default separation of atomic values with single whitespaces will be skipped. | ||
− | | ''arbitrary strings'' | + | | ''arbitrary strings'' |
| ''empty'' | | ''empty'' | ||
|- valign="top" | |- valign="top" | ||
Line 96: | Line 94: | ||
| | | | ||
| {{Code|application/xml}} | | {{Code|application/xml}} | ||
+ | |- valign="top" | ||
+ | | {{Code|parameter-document}} | ||
+ | | Parses the value as XML document with additional serialization parameters (see the [http://www.w3.org/TR/xslt-xquery-serialization-31/#serparams-in-xdm-instance Serialization Specification] for more details). | ||
+ | | | ||
+ | | | ||
|- valign="top" | |- valign="top" | ||
| {{Code|use-character-maps}} | | {{Code|use-character-maps}} | ||
− | | Defines character mappings | + | | Defines character mappings. May only occur in documents parsed with {{Code|parameter-document}}. |
| | | | ||
| | | | ||
Line 113: | Line 116: | ||
|- valign="top" | |- valign="top" | ||
| {{Code|include-content-type}} | | {{Code|include-content-type}} | ||
− | | | + | | Inserts a {{Code|meta}} content-type element into the head element if the result is output as HTML<br />Example: <code><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"></head></code>. The head element must already exist or nothing will be added. Any existing {{Code|meta}} content-type elements will be removed. |
| {{Code|yes}}, {{Code|no}} | | {{Code|yes}}, {{Code|no}} | ||
− | | {{Code| | + | | {{Code|yes}} |
|} | |} | ||
− | BaseX provides additional | + | BaseX provides some additional serialization parameters: |
{| class="wikitable sortable" width="100%" | {| class="wikitable sortable" width="100%" | ||
|- valign="top" | |- valign="top" | ||
! width="140" | Parameter | ! width="140" | Parameter | ||
− | ! | + | ! Description |
! Allowed | ! Allowed | ||
! Default | ! Default | ||
|- valign="top" | |- valign="top" | ||
| {{Code|csv}} | | {{Code|csv}} | ||
− | | Defines the way how data is serialized as CSV. | + | | Defines the way how data is serialized as CSV. |
| see [[CSV Module]] | | see [[CSV Module]] | ||
| | | | ||
|- valign="top" | |- valign="top" | ||
| {{Code|json}} | | {{Code|json}} | ||
− | | Defines the way how data is serialized as JSON. | + | | Defines the way how data is serialized as JSON. |
| see [[JSON Module]] | | see [[JSON Module]] | ||
| | | | ||
− | |||
− | |||
− | |||
− | |||
− | |||
|- valign="top" | |- valign="top" | ||
| {{Code|tabulator}} | | {{Code|tabulator}} | ||
− | | Uses tab characters ({{Code|\t}}) for indenting elements. | + | | Uses tab characters ({{Code|\t}}) instead of spaces for indenting elements. |
| {{Code|yes}}, {{Code|no}} | | {{Code|yes}}, {{Code|no}} | ||
| {{Code|no}} | | {{Code|no}} | ||
Line 151: | Line 149: | ||
| ''positive number'' | | ''positive number'' | ||
| {{Code|2}} | | {{Code|2}} | ||
− | |||
− | |||
− | |||
− | |||
− | |||
|- valign="top" | |- valign="top" | ||
| {{Code|newline}} | | {{Code|newline}} | ||
Line 161: | Line 154: | ||
| {{Code|\n}}, {{Code|\r\n}}, {{Code|\r}} | | {{Code|\n}}, {{Code|\r\n}}, {{Code|\r}} | ||
| ''system dependent'' | | ''system dependent'' | ||
+ | |- valign="top" | ||
+ | | {{Code|limit}} | ||
+ | | Stops serialization after the specified number of bytes has been serialized. If a negative number is specified, everything will be output. | ||
+ | | ''positive number'' | ||
+ | | {{Code|-1}} | ||
+ | |- valign="top" | ||
+ | | {{Code|binary}} | ||
+ | | Indicates if items of binary type are output in their native byte representation. Only applicable to the <code>basex</code> serialization method. | ||
+ | | {{Code|yes}}, {{Code|no}} | ||
+ | | {{Code|yes}} | ||
|} | |} | ||
− | The | + | The {{Code|csv}} and {{Code|json}} parameters are supplied with a list of options. Option names and values are combined with <code>=</code>, several options are separated by <code>,</code>: |
− | |||
<pre class="brush:xquery"> | <pre class="brush:xquery"> | ||
+ | (: The output namespace declaration is optional because it is statically declared in BaseX. :) | ||
+ | declare namespace output = "http://www.w3.org/2010/xslt-xquery-serialization"; | ||
declare option output:method "csv"; | declare option output:method "csv"; | ||
declare option output:csv "header=yes, separator=semicolon"; | declare option output:csv "header=yes, separator=semicolon"; | ||
<csv> | <csv> | ||
<record> | <record> | ||
− | < | + | <Name>John</Name> |
+ | <City>Newton</City> | ||
+ | </record> | ||
+ | <record> | ||
+ | <Name>Jack</Name> | ||
+ | <City>Oldtown</City> | ||
</record> | </record> | ||
</csv> | </csv> | ||
+ | </pre> | ||
+ | |||
+ | If {{Code|fn:serialize}} is called, output-specific parameters can be supplied via nested options: | ||
+ | |||
+ | <pre class="brush:xquery"> | ||
+ | serialize( | ||
+ | <csv> | ||
+ | <record> | ||
+ | <Name>John</Name> | ||
+ | <City>Newton</City> | ||
+ | </record> | ||
+ | <record> | ||
+ | <Name>Jack</Name> | ||
+ | <City>Oldtown</City> | ||
+ | </record> | ||
+ | </csv>, | ||
+ | map { | ||
+ | 'method': 'csv', | ||
+ | 'csv': map { 'header': 'yes', 'separator': ';' } | ||
+ | } | ||
+ | ) | ||
</pre> | </pre> | ||
'''Result''': | '''Result''': | ||
<pre class="brush:xml"> | <pre class="brush:xml"> | ||
− | + | Name;City | |
− | ... | + | John;Newton |
+ | Jack;Oldtown | ||
+ | </pre> | ||
+ | |||
+ | =Character mappings= | ||
+ | |||
+ | Character maps allow a specific character in the instance of the data model to be replaced with a specified string of characters during serialization. The string that is substituted is output "as is," and the serializer performs no checks that the resulting document is well-formed. This may only occur in documents parsed with {{Code|parameter-document}}. If a character is mapped, then it is not subjected to XML or HTML escaping. For details refer to section [https://www.w3.org/TR/2015/CR-xslt-xquery-serialization-31-20151217/#character-maps 11 Character maps] in the [http://www.w3.org/TR/xslt-xquery-serialization-31 W3C XQuery Serialization 3.1] document | ||
+ | |||
+ | This example maps the Unicode U+00A0 NO-BREAK SPACE as &#160; (without the serialization parameter, the Unicode character would be output): | ||
+ | |||
+ | '''Example query''': | ||
+ | <pre class="brush:xquery"> | ||
+ | declare option output:parameter-document "map.xml"; | ||
+ | <x>&#xA0;</x> | ||
+ | </pre> | ||
+ | |||
+ | '''Example parameter-document''': | ||
+ | <pre class="brush:xml"> | ||
+ | <serialization-parameters | ||
+ | xmlns="http://www.w3.org/2010/xslt-xquery-serialization"> | ||
+ | <use-character-maps> | ||
+ | <character-map character="&#160;" map-string="&amp;#160;"/> | ||
+ | </use-character-maps> | ||
+ | </serialization-parameters> | ||
</pre> | </pre> | ||
=Changelog= | =Changelog= | ||
+ | |||
+ | ;Version 9.2 | ||
+ | |||
+ | * Updated: New default value for {{Code|include-content-type}} is {{Code|yes}}. | ||
+ | |||
+ | ;Version 8.4 | ||
+ | |||
+ | * Added: Serialization parameter {{Code|binary}}. | ||
+ | * Updated: New serialization method <code>basex</code>. By default, items of binary type are now output in their native byte representation. The method <code>raw</code> was removed. | ||
+ | |||
+ | ;Version 8.0 | ||
+ | |||
+ | * Added: Support for {{Code|use-character-maps}} and {{Code|parameter-document}}. | ||
+ | * Added: Serialization method {{Code|adaptive}}. | ||
+ | * Updated: {{Code|adaptive}} is new default method (before: {{Code|xml}}). | ||
+ | * Removed: {{Code|format}}, {{Code|wrap-prefix}}, {{Code|wrap-uri}}. | ||
+ | |||
+ | ;Version 7.8.2 | ||
+ | |||
+ | * Added: {{Code|limit}}: Stops serialization after the specified number of bytes has been serialized. | ||
;Version 7.8 | ;Version 7.8 | ||
− | * Added: {{Code|csv}} and {{Code|json}} serialization parameters | + | * Added: {{Code|csv}} and {{Code|json}} serialization parameters. |
− | * Removed: {{Code|separator}} option (use {{Code|item-separator}} instead) | + | * Removed: {{Code|separator}} option (use {{Code|item-separator}} instead). |
;Version 7.7.2 | ;Version 7.7.2 | ||
− | * Added: {{Code|csv}} serialization method | + | * Added: {{Code|csv}} serialization method. |
− | * Added: temporary serialization methods {{Code|csv-header}}, {{Code|csv-separator}}, {{Code|json-unescape}}, {{Code|json-spec}}, {{Code|json-format}} | + | * Added: temporary serialization methods {{Code|csv-header}}, {{Code|csv-separator}}, {{Code|json-unescape}}, {{Code|json-spec}}, {{Code|json-format}}. |
;Version 7.5 | ;Version 7.5 | ||
− | * Added: official {{Code|item-separator}} and {{Code|html-version}} parameter | + | * Added: official {{Code|item-separator}} and {{Code|html-version}} parameter. |
* Updated: <code>method=html5</code> removed; serializers updated with the [http://www.w3.org/TR/2013/WD-xslt-xquery-serialization-30-20130108/ latest version of the specification], using <code>method=html</code> and <code>version=5.0</code>. | * Updated: <code>method=html5</code> removed; serializers updated with the [http://www.w3.org/TR/2013/WD-xslt-xquery-serialization-30-20130108/ latest version of the specification], using <code>method=html</code> and <code>version=5.0</code>. | ||
;Version 7.2 | ;Version 7.2 | ||
− | * Added: {{Code|separator}} parameter | + | * Added: {{Code|separator}} parameter. |
;Version 7.1 | ;Version 7.1 | ||
− | * Added: {{Code|newline}} parameter | + | * Added: {{Code|newline}} parameter. |
;Version 7.0 | ;Version 7.0 | ||
− | * Added: Serialization parameters added to [[REST API]]; JSON/JsonML/raw methods | + | * Added: Serialization parameters added to [[REST API]]; JSON/JsonML/raw methods. |
Revision as of 06:47, 29 November 2019
This page is part of the XQuery Portal.
Serialization parameters define how XQuery items and XML nodes will be serialized (i.e., returned to the client or an API, usually in textual form). The official parameters are defined in the W3C XQuery Serialization 3.1 document. In BaseX, they can be specified by:
- including them in the prolog of the XQuery expression;
- specifying them in the XQuery functions file:write() or fn:serialize(). The serialization parameters are specified as
  - children of an <output:serialization-parameters/> element, as defined for the fn:serialize() function, or as
  - a map, which contains all key/value pairs: map { "method": "xml", "cdata-section-elements": "div", ... };
- using the -s flag of the BaseX command-line clients;
- setting the SERIALIZER option before running a query;
- setting the EXPORTER option before exporting a database; or
- setting them as REST query parameters.
Because parameters can be supplied in so many different ways, we deliberately ignore one rule of the specification, which requires non-official features to be defined in a non-null namespace URI. In the following, we indicate which features are specific to our implementation.
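As a rough sketch of the two forms accepted by fn:serialize(), the following query serializes the same element once with a parameter map and once with an output:serialization-parameters element. The parameters are the ones from the map shown in the list above; the sample element and the variable name are assumptions made for illustration only:

```xquery
declare namespace output = "http://www.w3.org/2010/xslt-xquery-serialization";

(: Hypothetical sample node, not taken from the documentation :)
let $node := <div>a &lt; b</div>
return (
  (: Variant 1: parameters supplied as a map of key/value pairs :)
  serialize($node, map { "method": "xml", "cdata-section-elements": "div" }),
  (: Variant 2: the same parameters supplied as child elements :)
  serialize($node,
    <output:serialization-parameters>
      <output:method value="xml"/>
      <output:cdata-section-elements value="div"/>
    </output:serialization-parameters>
  )
)
```

Both calls are expected to return the same string; the map form is usually shorter, while the element form can be stored in a file and reused.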
Parameters
The following serialization parameters are supported by BaseX (further details can be looked up in the official specification):
Parameter | Description | Allowed | Default |
---|---|---|---|
method | Specifies the serialization method. xml, xhtml, html, text and adaptive are part of the official specification. For more details on basex, csv and json, see XQuery Extensions. | xml, xhtml, html, text, json, adaptive, csv, basex | basex |
version | Specifies the version of the serialization method. | xml/xhtml: 1.0, 1.1; html: 4.0, 4.01, 5.0 | 1.0 |
html-version | Specifies the version of the HTML serialization method. | 4.0, 4.01, 5.0 | 4.0 |
item-separator | Determines a string to be used as item separator. If a separator is specified, the default separation of atomic values with single whitespaces will be skipped. | arbitrary strings | empty |
encoding | Encoding to be used for outputting the data. | all encodings supported by Java | UTF-8 |
indent | Adjusts whitespaces to make the output more readable. | yes, no | yes |
cdata-section-elements | List of elements to be output as CDATA, separated by whitespaces. Example: <text><![CDATA[ <> ]]></text> | | |
omit-xml-declaration | Omits the XML declaration, which is serialized before the actual query result. Example: <?xml version="1.0" encoding="UTF-8"?> | yes, no | yes |
standalone | Prints or omits the "standalone" attribute in the XML declaration. | yes, no, omit | omit |
doctype-system | Introduces the output with a document type declaration and the given system identifier. Example: <!DOCTYPE x SYSTEM "entities.dtd"> | | |
doctype-public | If doctype-system is specified, adds a public identifier. Example: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> | | |
undeclare-prefixes | Undeclares prefixes in XML 1.1. | yes, no | no |
normalization-form | Specifies a normalization form. BaseX supports Form C (NFC). | NFC, none | NFC |
media-type | Specifies the media type. | | application/xml |
parameter-document | Parses the value as an XML document with additional serialization parameters (see the Serialization Specification for more details). | | |
use-character-maps | Defines character mappings. May only occur in documents parsed with parameter-document. | | |
byte-order-mark | Prints a byte-order-mark before starting serialization. | yes, no | no |
escape-uri-attributes | Escapes URI information in certain HTML attributes. Example: <a href="%C3%A4%C3%B6%C3%BC">äöü</a> | yes, no | no |
include-content-type | Inserts a meta content-type element into the head element if the result is output as HTML. Example: <head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"></head>. The head element must already exist or nothing will be added. Any existing meta content-type elements will be removed. | yes, no | yes |
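To illustrate how parameters from this table can be declared in the query prolog, here is a minimal sketch that combines method and item-separator. The sample values and the separator character are assumptions chosen for the example:

```xquery
(: The output prefix is statically declared in BaseX, so no namespace declaration is needed :)
declare option output:method "text";
declare option output:item-separator "|";

(: Three atomic values; the declared separator replaces the default single whitespace :)
("xml", "json", "csv")
```

Declaring the separator in the prolog keeps the setting next to the query it belongs to, whereas the SERIALIZER option would apply to every subsequent query.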
BaseX provides some additional serialization parameters:
Parameter | Description | Allowed | Default |
---|---|---|---|
csv | Defines how data is serialized as CSV. | see CSV Module | |
json | Defines how data is serialized as JSON. | see JSON Module | |
tabulator | Uses tab characters (\t) instead of spaces for indenting elements. | yes, no | no |
indents | Specifies the number of characters to be indented. | positive number | 2 |
newline | Specifies the type of newline to be used as end-of-line marker. | \n, \r\n, \r | system dependent |
limit | Stops serialization after the specified number of bytes has been serialized. If a negative number is specified, everything will be output. | positive number | -1 |
binary | Indicates if items of binary type are output in their native byte representation. Only applicable to the basex serialization method. | yes, no | yes |
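The BaseX-specific indentation parameters can be set in the same way as the official ones. The following sketch, which assumes the xml method and a made-up sample element, widens the indentation from the default of 2 to 4 characters:

```xquery
declare option output:method "xml";
declare option output:indent "yes";
(: BaseX-specific parameter: number of characters per indentation level :)
declare option output:indents "4";

(: Hypothetical sample element used only to make the indentation visible :)
<catalog>
  <entry>
    <title>Serialization</title>
  </entry>
</catalog>
```

Setting tabulator to yes instead would indent with tab characters rather than spaces.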
The csv and json parameters are supplied with a list of options. Option names and values are combined with =, and several options are separated by commas:

```xquery
(: The output namespace declaration is optional because it is statically declared in BaseX. :)
declare namespace output = "http://www.w3.org/2010/xslt-xquery-serialization";
declare option output:method "csv";
declare option output:csv "header=yes, separator=semicolon";
<csv>
  <record>
    <Name>John</Name>
    <City>Newton</City>
  </record>
  <record>
    <Name>Jack</Name>
    <City>Oldtown</City>
  </record>
</csv>
```

If fn:serialize is called, output-specific parameters can be supplied via nested options:

```xquery
serialize(
  <csv>
    <record>
      <Name>John</Name>
      <City>Newton</City>
    </record>
    <record>
      <Name>Jack</Name>
      <City>Oldtown</City>
    </record>
  </csv>,
  map {
    'method': 'csv',
    'csv': map { 'header': 'yes', 'separator': ';' }
  }
)
```

Result:

```
Name;City
John;Newton
Jack;Oldtown
```
Character mappings
Character maps allow a specific character in the instance of the data model to be replaced with a specified string of characters during serialization. The string that is substituted is output "as is," and the serializer performs no checks that the resulting document is well-formed. Character maps may only be specified in documents parsed with parameter-document. If a character is mapped, then it is not subjected to XML or HTML escaping. For details, refer to section 11 Character maps in the W3C XQuery Serialization 3.1 document.

This example maps the Unicode character U+00A0 NO-BREAK SPACE to the string &#160; (without the serialization parameter, the Unicode character itself would be output):
Example query:

```xquery
declare option output:parameter-document "map.xml";
<x>&#xA0;</x>
```

Example parameter-document:

```xml
<serialization-parameters
   xmlns="http://www.w3.org/2010/xslt-xquery-serialization">
  <use-character-maps>
    <character-map character="&#160;" map-string="&amp;#160;"/>
  </use-character-maps>
</serialization-parameters>
```
Changelog
Version 9.2
- Updated: New default value for include-content-type is yes.

Version 8.4
- Added: Serialization parameter binary.
- Updated: New serialization method basex. By default, items of binary type are now output in their native byte representation. The method raw was removed.

Version 8.0
- Added: Support for use-character-maps and parameter-document.
- Added: Serialization method adaptive.
- Updated: adaptive is new default method (before: xml).
- Removed: format, wrap-prefix, wrap-uri.

Version 7.8.2
- Added: limit: Stops serialization after the specified number of bytes has been serialized.

Version 7.8
- Added: csv and json serialization parameters.
- Removed: separator option (use item-separator instead).

Version 7.7.2
- Added: csv serialization method.
- Added: temporary serialization methods csv-header, csv-separator, json-unescape, json-spec, json-format.

Version 7.5
- Added: official item-separator and html-version parameters.
- Updated: method=html5 removed; serializers updated with the latest version of the specification, using method=html and version=5.0.

Version 7.2
- Added: separator parameter.

Version 7.1
- Added: newline parameter.

Version 7.0
- Added: Serialization parameters added to REST API; JSON/JsonML/raw methods.