Serialization

From BaseX Documentation
Revision as of 00:41, 10 January 2013 by CG (talk | contribs) (→‎Changelog)

This page is part of the XQuery Portal. Serialization parameters define how XQuery items and XML nodes are textually output, i.e., serialized. (For input see Parsers.) They have been formalized in the W3C XQuery Serialization 3.0 document. In BaseX, they can be specified in several ways:

Parameters

The following table gives a brief summary of all serialization parameters recognized by BaseX. For details, please refer to the official specification.

Parameter Description Allowed Default Examples
method Specifies the serialization method:
  • xml, xhtml, html, and text are adopted from the official specification.
  • json and jsonml are specific to BaseX and can be used to output XML nodes in the JSON format (see the JSON Module for more details).
  • raw is BaseX-specific as well: Binary data types are output in their raw form, i.e., without modifications. For all other types, the items’ string values are returned. No indentation takes place, and no characters are encoded via entities.
xml, xhtml, html, text, json, jsonml, raw xml method=xml
version Specifies the version of the serialization method. Added with Version 7.5. xml/xhtml: 1.0, 1.1
html: 4.0, 4.01, 5.0
1.0 version=1.0
html-version Specifies the version of the HTML serialization method. Added with Version 7.5. 4.0, 4.01, 5.0 4.0 html-version=5.0
item-separator Determines a string to be used as item separator. If a separator is specified, the default separation of atomic values with a single space will be skipped. arbitrary strings, \n, \r\n, \r empty item-separator=&#xa;
encoding Encoding to be used for outputting the data. all encodings supported by Java UTF-8 encoding=US-ASCII
indent Adjusts whitespace to make the output more readable. yes, no yes indent=no
cdata-section-elements List of elements to be output as CDATA, separated by whitespace.
Example: <text><![CDATA[ <> ]]></text>
cdata-section-elements=text
omit-xml-declaration Omits the XML declaration, which is serialized before the actual query result.
Example: <?xml version="1.0" encoding="UTF-8"?>
yes, no yes omit-xml-declaration=no
standalone Prints or omits the "standalone" attribute in the XML declaration. yes, no, omit omit standalone=yes
doctype-system Introduces the output with a document type declaration and the given system identifier.
Example: <!DOCTYPE x SYSTEM "entities.dtd">
doctype-system=entities.dtd
doctype-public If doctype-system is specified, adds a public identifier.
Example: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
doctype-public=-//W3C//DTD HTML 4.01//EN,
doctype-system=http://www.w3.org/TR/html4/strict.dtd
undeclare-prefixes Undeclares prefixes in XML 1.1. yes, no no undeclare-prefixes=yes
normalization-form Specifies a normalization form. BaseX supports Form C (NFC). NFC, none NFC normalization-form=none
media-type Specifies the media type. application/xml media-type=text/plain
use-character-maps Defines character mappings (not supported).
byte-order-mark Prints a byte-order-mark before starting serialization. yes, no no byte-order-mark=yes
escape-uri-attributes Escapes URI information in certain HTML attributes.
Example: <a href="%C3%A4%C3%B6%C3%BC">äöü</a>
yes, no no escape-uri-attributes=yes, method=html
include-content-type Includes a meta content-type element if the result is output as HTML.
Example: <head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"></head>
yes, no no include-content-type=yes, method=html
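
The effect of some of these parameters can be illustrated outside of BaseX. The following Python sketch is not how BaseX implements serialization; it only mimics the described behavior of item-separator, encoding, byte-order-mark, normalization-form and escape-uri-attributes with the Python standard library. All function names are hypothetical and chosen for this illustration, and the exact form of the character references that BaseX writes may differ.

```python
import codecs
import unicodedata
from urllib.parse import quote


def join_items(items, item_separator=None):
    """Join the string values of atomic items.

    Assumption: without an explicit separator, items are separated
    by a single space, as described for item-separator.
    """
    sep = " " if item_separator is None else item_separator
    return sep.join(str(item) for item in items)


def encode_text(text, encoding="UTF-8", byte_order_mark=False):
    """Encode text in the requested encoding.

    Characters that the target encoding cannot represent are written
    as numeric character references (decimal form in Python).
    """
    data = text.encode(encoding, errors="xmlcharrefreplace")
    if byte_order_mark and encoding.upper() == "UTF-8":
        # Prepend the UTF-8 byte order mark (EF BB BF).
        data = codecs.BOM_UTF8 + data
    return data


def normalize(text, normalization_form="NFC"):
    """Apply Unicode normalization unless the form is 'none'."""
    if normalization_form == "none":
        return text
    return unicodedata.normalize(normalization_form, text)


def escape_uri(value):
    """Percent-encode non-ASCII characters as UTF-8 bytes."""
    return quote(value)


if __name__ == "__main__":
    print(join_items([1, 2, 3]))            # default separation
    print(join_items([1, 2, 3], "\n"))      # like item-separator=&#xa;
    print(encode_text("\u00e4", "US-ASCII"))  # like encoding=US-ASCII
    print(escape_uri("\u00e4\u00f6\u00fc"))   # like escape-uri-attributes=yes
```

The percent-encoded result of escape_uri corresponds to the href value shown in the escape-uri-attributes example of the table.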

BaseX provides some additional, implementation-specific serialization parameters:

Parameter Description Allowed Default Examples
format Turns output formatting on/off, including the conversion of special characters to entities and insertion of item separators. yes, no yes format=no
tabulator Uses tab characters (\t) for indenting elements. yes, no no tabulator=yes
indents Specifies the number of characters to be indented. positive number 2 indents=1, tabulator=yes
wrap-prefix, wrap-uri Specifies a prefix and/or URI for wrapping the query results. wrap-prefix=rest, wrap-uri=http://basex.org/rest
newline Specifies the type of newline to be used as end-of-line marker. \n, \r\n, \r system dependent newline=\r\n
separator Determines the string to be used as item separator (deprecated, replaced with item-separator). \n, \r\n, \r, arbitrary strings single space separator=\n
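
How indents, tabulator and newline interact can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the BaseX serializer: the function name and its keyword arguments are hypothetical and merely mirror the parameter names above, and ElementTree.indent requires Python 3.9 or later.

```python
import xml.etree.ElementTree as ET


def serialize(xml_text, indents=2, tabulator=False, newline="\n"):
    """Re-serialize an XML string with configurable indentation.

    indents   -- number of indentation characters per nesting level
    tabulator -- use tab characters instead of spaces
    newline   -- end-of-line marker written to the output
    """
    root = ET.fromstring(xml_text)
    unit = "\t" if tabulator else " "
    # ET.indent inserts "\n" plus the indentation between elements.
    ET.indent(root, space=unit * indents)
    result = ET.tostring(root, encoding="unicode")
    # Simplification: this also rewrites newlines inside text content.
    return result.replace("\n", newline)


if __name__ == "__main__":
    doc = "<a><b>x</b></a>"
    print(repr(serialize(doc)))
    # Mirrors indents=1, tabulator=yes, newline=\r\n from the table.
    print(repr(serialize(doc, indents=1, tabulator=True, newline="\r\n")))
```

With indents=1 and tabulator=yes, each nesting level is indented by exactly one tab character, which is the combination shown in the Examples column.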

Changelog

Version 7.5
  • Added: official item-separator and html-version parameters
  • Updated: method=html5 removed; serializers updated with the latest version of the specification, using method=html and version=5.0.
Version 7.2
  • Added: separator parameter
Version 7.1
  • Added: newline parameter
Version 7.0
  • Added: serialization parameters in the REST API; JSON/JsonML/raw methods