Difference between revisions of "Serialization"

From BaseX Documentation
Line 129: Line 129:
 
| {{Code|csv}}
 
| {{Code|csv}}
 
| Defines the way how data is serialized as CSV. {{Version|7.8}}
 
| Defines the way how data is serialized as CSV. {{Version|7.8}}
| see [[CSV Module#csv:serialize|csv:serialize]]
+
| see [[CSV Module]]
 
|
 
|
 
|- valign="top"
 
|- valign="top"
 
| {{Code|json}}
 
| {{Code|json}}
 
| Defines the way how data is serialized as JSON. {{Version|7.8}}
 
| Defines the way how data is serialized as JSON. {{Version|7.8}}
| see [[JSON Module#json:serialize|json:serialize]]
+
| see [[JSON Module]]
 
|  
 
|  
 
|- valign="top"
 
|- valign="top"
Line 174: Line 174:
 
</csv>
 
</csv>
 
</pre>
 
</pre>
 
{| class="wikitable sortable" width="100%"
 
|- valign="top"
 
! width="140" | Parameter
 
! width="60%" | Description
 
! Allowed
 
! Default
 
|- valign="top"
 
| {{Code|separator}}
 
| Defines the character which separates the entries of a record in a single line.
 
| {{Code|comma}}, {{Code|semicolon}}, {{Code|colon}}, {{Code|tab}}, {{Code|space}} or a ''single character''
 
| {{Code|comma}}
 
|- valign="top"
 
| {{Code|format}}
 
| Specifies the CSV format of the XML data.
 
| {{Code|direct}}, {{Code|attributes}}
 
| {{Code|direct}}
 
|- valign="top"
 
| {{Code|header}}
 
| Indicates if the original CSV data contained a header row.
 
| {{Code|yes}}, {{Code|no}}
 
| {{Code|no}}
 
|- valign="top"
 
| {{Code|lax}}
 
| Specifies if a lax approach is used to convert QNames to JSON names.
 
| {{Code|yes}}, {{Code|no}}
 
| {{Code|yes}}
 
|- valign="top"
 
| {{Code|json-spec}}
 
| Determines the used JSON specification. {{Version|7.7.2}}
 
| {{Code|RFC4627}}, {{Code|ECMA-262}}, {{Code|liberal}}
 
| {{Code|RFC4627}}
 
|- valign="top"
 
| {{Code|json-unescape}}
 
| Determines whether escape sequences (marked by a backslash) in the input are expanded. {{Version|7.7.2}}
 
| {{Code|yes}}, {{Code|no}}
 
| {{Code|yes}}
 
|- valign="top"
 
| {{Code|json-format}}
 
| Determines the conversion format. {{Version|7.7.2}}
 
| {{Code|json}}, {{Code|jsonml}}
 
| {{Code|json}}
 
|}
 
  
 
=Changelog=
 
=Changelog=

Revision as of 00:41, 18 October 2013

This page is part of the XQuery Portal. Serialization parameters define how XQuery items and XML nodes are textually output, i.e., serialized. (For input, see Parsers.) They have been formalized in the W3C XQuery Serialization 3.0 document. In BaseX, they can be specified in several ways:

Parameters

The following table gives a brief summary of all serialization parameters recognized by BaseX. For details, please refer to the official specification.

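{| class="wikitable sortable" width="100%"
|- valign="top"
! width="140" | Parameter
! width="60%" | Description
! Allowed
! Default
|- valign="top"
| {{Code|method}}
| Specifies the serialization method:
* {{Code|xml}}, {{Code|xhtml}}, {{Code|html}}, and {{Code|text}} are adopted from the official specification.
* {{Code|json}} is specific to BaseX and can be used to output XML nodes as JSON objects (see the JSON Module for more details).
* {{Version|7.7.2}}: {{Code|csv}} is BaseX-specific and can be used to output XML nodes as CSV data (see the CSV Module for more details).
* {{Code|raw}} is BaseX-specific, too: Binary data types are output in their raw form, i.e., without modifications. For all other types, the items’ string values are returned. No indentation takes place, and no characters are encoded via entities.
* {{Version|7.7.2}}: {{Code|jsonml}} is deprecated.
| {{Code|xml}}, {{Code|xhtml}}, {{Code|html}}, {{Code|text}}, {{Code|json}}, {{Code|csv}}, {{Code|raw}}
| {{Code|xml}}
|- valign="top"
| {{Code|version}}
| Specifies the version of the serialization method.
| xml/xhtml: {{Code|1.0}}, {{Code|1.1}}<br/>html: {{Code|4.0}}, {{Code|4.01}}, {{Code|5.0}}
| {{Code|1.0}}
|- valign="top"
| {{Code|html-version}}
| Specifies the version of the HTML serialization method.
| {{Code|4.0}}, {{Code|4.01}}, {{Code|5.0}}
| {{Code|4.0}}
|- valign="top"
| {{Code|item-separator}}
| Determines a string to be used as item separator. If a separator is specified, the default separation of atomic values by single spaces is skipped.
| arbitrary strings, {{Code|\n}}, {{Code|\r\n}}, {{Code|\r}}
| ''empty''
|- valign="top"
| {{Code|encoding}}
| Encoding to be used for outputting the data.
| all encodings supported by Java
| {{Code|UTF-8}}
|- valign="top"
| {{Code|indent}}
| Adjusts whitespace to make the output more readable.
| {{Code|yes}}, {{Code|no}}
| {{Code|yes}}
|- valign="top"
| {{Code|cdata-section-elements}}
| List of elements to be output as CDATA, separated by whitespace.<br/>Example: <code><nowiki><text><![CDATA[ <> ]]></text></nowiki></code>
|
|
|- valign="top"
| {{Code|omit-xml-declaration}}
| Omits the XML declaration, which is serialized before the actual query result.<br/>Example: <code><nowiki><?xml version="1.0" encoding="UTF-8"?></nowiki></code>
| {{Code|yes}}, {{Code|no}}
| {{Code|yes}}
|- valign="top"
| {{Code|standalone}}
| Prints or omits the "standalone" attribute in the XML declaration.
| {{Code|yes}}, {{Code|no}}, {{Code|omit}}
| {{Code|omit}}
|- valign="top"
| {{Code|doctype-system}}
| Introduces the output with a document type declaration and the given system identifier.<br/>Example: <code><nowiki><!DOCTYPE x SYSTEM "entities.dtd"></nowiki></code>
|
|
|- valign="top"
| {{Code|doctype-public}}
| If {{Code|doctype-system}} is specified, adds a public identifier.<br/>Example: <code><nowiki><!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"></nowiki></code>
|
|
|- valign="top"
| {{Code|undeclare-prefixes}}
| Undeclares prefixes in XML 1.1.
| {{Code|yes}}, {{Code|no}}
| {{Code|no}}
|- valign="top"
| {{Code|normalization-form}}
| Specifies a normalization form. BaseX supports Form C (NFC).
| {{Code|NFC}}, {{Code|none}}
| {{Code|NFC}}
|- valign="top"
| {{Code|media-type}}
| Specifies the media type.
|
| {{Code|application/xml}}
|- valign="top"
| {{Code|use-character-maps}}
| Defines character mappings (not supported).
|
|
|- valign="top"
| {{Code|byte-order-mark}}
| Prints a byte-order-mark before starting serialization.
| {{Code|yes}}, {{Code|no}}
| {{Code|no}}
|- valign="top"
| {{Code|escape-uri-attributes}}
| Escapes URI information in certain HTML attributes.<br/>Example: <code><nowiki><a href="%C3%A4%C3%B6%C3%BC">äöü</a></nowiki></code>
| {{Code|yes}}, {{Code|no}}
| {{Code|no}}
|- valign="top"
| {{Code|include-content-type}}
| Includes a meta content-type element if the result is output as HTML.<br/>Example: <code><nowiki><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"></head></nowiki></code>
| {{Code|yes}}, {{Code|no}}
| {{Code|no}}
|}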
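
As a minimal sketch of how a few of the standard parameters combine, the following query passes them to the fn:serialize function of XQuery 3.0 instead of declaring them as options. The element being serialized is made up for illustration, and the namespace URI is the one defined by the W3C serialization specification.

```xquery
declare namespace output = "http://www.w3.org/2010/xslt-xquery-serialization";

(: Serialize a small, made-up element to a string :)
fn:serialize(
  <root><child/></root>,
  <output:serialization-parameters>
    <output:method value="xml"/>
    <output:indent value="no"/>
    <output:omit-xml-declaration value="yes"/>
  </output:serialization-parameters>
)
```

Each child of the parameter element is named after a parameter from the table and carries its setting in the value attribute, so adding or dropping a parameter is a one-line change.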

BaseX provides additional, implementation-specific serialization parameters:

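{| class="wikitable sortable" width="100%"
|- valign="top"
! width="140" | Parameter
! width="60%" | Description
! Allowed
! Default
|- valign="top"
| {{Code|csv}}
| Defines how data is serialized as CSV. {{Version|7.8}}
| see [[CSV Module]]
|
|- valign="top"
| {{Code|json}}
| Defines how data is serialized as JSON. {{Version|7.8}}
| see [[JSON Module]]
|
|- valign="top"
| {{Code|format}}
| Turns output formatting on/off, including the conversion of special characters to entities and insertion of item separators.
| {{Code|yes}}, {{Code|no}}
| {{Code|yes}}
|- valign="top"
| {{Code|tabulator}}
| Uses tab characters ({{Code|\t}}) for indenting elements.
| {{Code|yes}}, {{Code|no}}
| {{Code|no}}
|- valign="top"
| {{Code|indents}}
| Specifies the number of characters to be indented.
| positive number
| {{Code|2}}
|- valign="top"
| {{Code|wrap-prefix}},<br/>{{Code|wrap-uri}}
| Specifies a prefix and/or URI for wrapping the query results.
|
|
|- valign="top"
| {{Code|newline}}
| Specifies the type of newline to be used as end-of-line marker.
| {{Code|\n}}, {{Code|\r\n}}, {{Code|\r}}
| system dependent
|}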
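
The BaseX-specific parameters can presumably be declared in the query prolog in the same way as the standard ones. The following sketch assumes that the values are written exactly as listed in the table (for example the literal string \n for the newline type); the input element is made up for illustration.

```xquery
(: The output prefix is used without a declaration, as in the other example on this page :)
declare option output:indent "yes";
(: Indent nested elements by four characters instead of the default of two :)
declare option output:indents "4";
(: Request \n as end-of-line marker, independent of the operating system :)
declare option output:newline "\n";
<root><child><leaf/></child></root>
```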

The value of the csv or json parameter is a list of CSV/JSON options. Each option name is joined to its value with =, and the pairs are separated by commas:

declare option output:method "csv";
declare option output:csv "header=yes, separator=semicolon";
<csv>
  <record>
    <Text>...bla...</Text>
  </record>
</csv>
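
The json parameter would be used in the same way. The sketch below is an assumption rather than a confirmed example: the option name format and the value jsonml are not listed on this page and should be checked against the JSON Module, and the input element is made up for illustration.

```xquery
declare option output:method "json";
(: Assumed option name and value; see the JSON Module for the supported options :)
declare option output:json "format=jsonml";
<p>hello</p>
```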

Changelog

Version 7.8
  • Added: csv and json serialization parameters
  • Removed: separator option (use item-separator instead)
Version 7.7.2
  • Added: csv serialization method
  • Added: temporary serialization parameters csv-header, csv-separator, json-unescape, json-spec, json-format
Version 7.5
  • Added: official item-separator and html-version parameters
  • Updated: method=html5 removed; serializers updated with the latest version of the specification, using method=html and version=5.0.
Version 7.2
  • Added: separator parameter
Version 7.1
  • Added: newline parameter
Version 7.0
  • Added: serialization parameters in the REST API; JSON/JsonML/raw methods