Conversion Module

Revision as of 15:00, 14 April 2020

This XQuery Module contains functions to convert data between different formats.

Conventions

All functions and errors in this module are assigned to the http://basex.org/modules/convert namespace, which is statically bound to the convert prefix.

Strings

convert:binary-to-string

Signatures convert:binary-to-string($bytes as xs:anyAtomicType) as xs:string
convert:binary-to-string($bytes as xs:anyAtomicType, $encoding as xs:string) as xs:string
convert:binary-to-string($bytes as xs:anyAtomicType, $encoding as xs:string, $fallback as xs:boolean) as xs:string
Summary Converts the specified $bytes (xs:base64Binary, xs:hexBinary) to a string:
  • The UTF-8 default encoding can be overridden with the optional $encoding argument.
  • By default, invalid characters will be rejected. If $fallback is set to true, these characters will be replaced with the Unicode replacement character U+FFFD (�).
Errors string: The input is an invalid XML string, or the wrong encoding has been specified.
encoding: The specified encoding is invalid or not supported.
Examples
  • convert:binary-to-string(xs:hexBinary('48656c6c6f576f726c64')) yields HelloWorld.
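
The decoding rules above can be modeled outside BaseX. The following Python sketch is only an illustration, not BaseX code: the helper name binary_to_string is hypothetical, and Python's "strict" and "replace" error handlers are assumed to stand in for the default behavior and the $fallback option. Unlike BaseX, the sketch does not check whether the decoded characters are valid XML characters.

```python
def binary_to_string(data: bytes, encoding: str = "utf-8", fallback: bool = False) -> str:
    """Decode bytes to text; reject invalid input unless fallback is requested."""
    # "replace" substitutes U+FFFD for undecodable bytes, "strict" raises an error.
    errors = "replace" if fallback else "strict"
    return data.decode(encoding, errors=errors)


hello = bytes.fromhex("48656c6c6f576f726c64")
print(binary_to_string(hello))  # HelloWorld

# 0xFF is never valid in UTF-8, so it is replaced when fallback is enabled.
print(binary_to_string(b"A\xffB", fallback=True) == "A\ufffdB")  # True
```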

convert:string-to-base64

Signatures convert:string-to-base64($string as xs:string) as xs:base64Binary
convert:string-to-base64($string as xs:string, $encoding as xs:string) as xs:base64Binary
Summary Converts the specified $string to an xs:base64Binary item. If the default encoding is chosen, conversion will be cheap, as strings and binaries are both internally represented as byte arrays.
The UTF-8 default encoding can be overridden with the optional $encoding argument.
Errors binary: The input cannot be represented in the specified encoding.
encoding: The specified encoding is invalid or not supported.
Examples
  • string(convert:string-to-base64('HelloWorld')) yields SGVsbG9Xb3JsZA==.

convert:string-to-hex

Signatures convert:string-to-hex($string as xs:string) as xs:hexBinary
convert:string-to-hex($string as xs:string, $encoding as xs:string) as xs:hexBinary
Summary Converts the specified $string to an xs:hexBinary item. If the default encoding is chosen, conversion will be cheap, as strings and binaries are both internally represented as byte arrays.
The UTF-8 default encoding can be overridden with the optional $encoding argument.
Errors binary: The input cannot be represented in the specified encoding.
encoding: The specified encoding is invalid or not supported.
Examples
  • string(convert:string-to-hex('HelloWorld')) yields 48656C6C6F576F726C64.
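
To cross-check the example values of the two string conversions above, the same encodings can be reproduced with the Python standard library. This is a sketch with hypothetical helper names (string_to_base64, string_to_hex); it mirrors the documented results but says nothing about how BaseX implements the functions.

```python
import base64


def string_to_base64(text: str, encoding: str = "utf-8") -> str:
    """Encode text to bytes, then return the Base64 lexical form."""
    return base64.b64encode(text.encode(encoding)).decode("ascii")


def string_to_hex(text: str, encoding: str = "utf-8") -> str:
    """Encode text to bytes, then return the upper-case hexadecimal form."""
    return text.encode(encoding).hex().upper()


print(string_to_base64("HelloWorld"))  # SGVsbG9Xb3JsZA==
print(string_to_hex("HelloWorld"))     # 48656C6C6F576F726C64
```

In this model, a character that cannot be represented in the chosen encoding raises UnicodeEncodeError, and an unknown encoding name raises LookupError; these loosely correspond to the binary and encoding errors listed above.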

Binary Data

convert:integers-to-base64

Signatures convert:integers-to-base64($integers as xs:integer*) as xs:base64Binary
Summary Converts the specified $integers to an item of type xs:base64Binary:
  • Only the lowest 8 bits (the least significant byte) of the supplied integers will be considered.
  • Conversion of byte sequences is very efficient, as items of binary type are internally represented as byte arrays.

convert:integers-to-hex

Signatures convert:integers-to-hex($integers as xs:integer*) as xs:hexBinary
Summary Converts the specified $integers to an item of type xs:hexBinary:
  • Only the lowest 8 bits (the least significant byte) of the supplied integers will be considered.
  • Conversion of byte sequences is very efficient, as items of binary type are internally represented as byte arrays.
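
The truncation to 8 bits can be illustrated with a short Python sketch. It assumes that only the least significant 8 bits of each integer are kept; the helper name integers_to_bytes is hypothetical and the code is not taken from BaseX.

```python
import base64


def integers_to_bytes(integers):
    """Keep the least significant 8 bits of every integer (assumed behavior)."""
    return bytes(i & 0xFF for i in integers)


data = integers_to_bytes([72, 105])
print(data.hex().upper())                      # 4869
print(base64.b64encode(data).decode("ascii"))  # SGk=

# 321 = 256 + 65, so only 65 ("A") remains; -1 becomes 255 (0xFF).
print(integers_to_bytes([321, -1]) == b"A\xff")  # True
```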

convert:binary-to-integers

Signatures convert:binary-to-integers($binary as xs:anyAtomicType) as xs:integer*
Summary Returns the specified $binary (xs:base64Binary, xs:hexBinary) as a sequence of unsigned integers (octets).
Examples
  • convert:binary-to-integers(xs:hexBinary('FF')) yields 255.

convert:binary-to-bytes

Signatures convert:binary-to-bytes($binary as xs:anyAtomicType) as xs:byte*
Summary Returns the specified $binary (xs:base64Binary, xs:hexBinary) as a sequence of bytes. The conversion is very cheap and takes no additional memory, as items of binary type are internally represented as byte arrays.
Examples
  • convert:binary-to-bytes(xs:base64Binary('QmFzZVggaXMgY29vbA==')) yields the sequence (66, 97, 115, 101, 88, 32, 105, 115, 32, 99, 111, 111, 108).
  • convert:binary-to-bytes(xs:hexBinary("4261736558")) yields the sequence (66, 97, 115, 101, 88).
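
The two functions above differ only in the result type: unsigned octets versus xs:byte, which XML Schema defines as a signed 8-bit integer. The Python sketch below models that difference with hypothetical helper names. That octets above 127 appear as negative values in the xs:byte result is an assumption derived from the type definition, not a statement from this page.

```python
def binary_to_integers(data: bytes):
    """Return each octet as an unsigned value (0 to 255)."""
    return list(data)


def binary_to_bytes(data: bytes):
    """Return each octet as a signed 8-bit value (-128 to 127); assumed mapping."""
    return [b - 256 if b > 127 else b for b in data]


print(binary_to_integers(bytes.fromhex("FF")))       # [255]
print(binary_to_bytes(bytes.fromhex("FF")))          # [-1]
print(binary_to_bytes(bytes.fromhex("4261736558")))  # [66, 97, 115, 101, 88]
```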

Numbers

convert:integer-to-base

Signatures convert:integer-to-base($number as xs:integer, $base as xs:integer) as xs:string
Summary Converts $number to a string, using the specified $base, interpreting it as a 64-bit unsigned integer.
The first $base elements of the sequence '0',..,'9','a',..,'z' are used as digits.
Valid bases are 2 to 36.
Errors base: The specified base is not in the range 2-36.
Examples
  • convert:integer-to-base(-1, 16) yields 'ffffffffffffffff'.
  • convert:integer-to-base(22, 5) yields '42'.
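
The unsigned 64-bit interpretation explains why -1 yields sixteen f digits. A minimal Python sketch of the described conversion follows; the function name integer_to_base and the use of ValueError for an invalid base are assumptions made for illustration only.

```python
DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"


def integer_to_base(number: int, base: int) -> str:
    """Render number in the given base, treating it as an unsigned 64-bit value."""
    if not 2 <= base <= 36:
        raise ValueError("base must be in the range 2-36")
    value = number & 0xFFFFFFFFFFFFFFFF  # reinterpret negative values as unsigned
    if value == 0:
        return "0"
    digits = []
    while value:
        value, remainder = divmod(value, base)
        digits.append(DIGITS[remainder])
    return "".join(reversed(digits))


print(integer_to_base(-1, 16))  # ffffffffffffffff
print(integer_to_base(22, 5))   # 42
```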

convert:integer-from-base

Signatures convert:integer-from-base($string as xs:string, $base as xs:integer) as xs:integer
Summary Decodes an integer from $string, using the specified $base.
The first $base elements of the sequence '0',..,'9','a',..,'z' are allowed as digits; case does not matter.
Valid bases are 2 to 36.
If the supplied string contains more than 64 bits of information, the result will be truncated.
Errors base: The specified base is not in the range 2-36.
integer: The specified digit is not valid for the given range.
Examples
  • convert:integer-from-base('ffffffffffffffff', 16) yields -1.
  • convert:integer-from-base('CAFEBABE', 16) yields 3405691582.
  • convert:integer-from-base('42', 5) yields 22.
  • convert:integer-from-base(convert:integer-to-base(123, 7), 7) yields 123.
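
The reverse direction can be sketched in the same way. The Python model below assumes that truncation keeps the lowest 64 bits and that the result is read as a signed 64-bit value, which is consistent with the first example above but is not spelled out on this page. The function name and the ValueError exceptions are hypothetical.

```python
DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"
MASK_64 = 0xFFFFFFFFFFFFFFFF


def integer_from_base(text: str, base: int) -> int:
    """Parse text in the given base; keep 64 bits and read them as signed."""
    if not 2 <= base <= 36:
        raise ValueError("base must be in the range 2-36")
    value = 0
    for ch in text.lower():  # case does not matter
        digit = DIGITS.find(ch)
        if digit < 0 or digit >= base:
            raise ValueError(f"invalid digit for base {base}: {ch!r}")
        value = (value * base + digit) & MASK_64  # assumed truncation to 64 bits
    # Reinterpret the unsigned 64-bit pattern as a signed integer.
    return value - (1 << 64) if value >= (1 << 63) else value


print(integer_from_base("ffffffffffffffff", 16))  # -1
print(integer_from_base("CAFEBABE", 16))          # 3405691582
print(integer_from_base("42", 5))                 # 22
```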

Dates and Durations

convert:integer-to-dateTime

Signatures convert:integer-to-dateTime($milliseconds as xs:integer) as xs:dateTime
Summary Converts the specified number of $milliseconds since 1 Jan 1970 to an item of type xs:dateTime.
Examples
  • convert:integer-to-dateTime(0) yields 1970-01-01T00:00:00Z.
  • convert:integer-to-dateTime(1234567890123) yields 2009-02-13T23:31:30.123Z.
  • convert:integer-to-dateTime(prof:current-ms()) returns the current time (in milliseconds since 1 Jan 1970) as an xs:dateTime item.

convert:dateTime-to-integer

Signatures convert:dateTime-to-integer($dateTime as xs:dateTime) as xs:integer
Summary Converts the specified $dateTime item to the number of milliseconds since 1 Jan 1970.
Examples
  • convert:dateTime-to-integer(xs:dateTime('1970-01-01T00:00:00Z')) yields 0.
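
Both date conversions are plain offsets from the epoch 1970-01-01T00:00:00Z. The following Python sketch reproduces the example values with the standard library; the helper names are hypothetical, and the output uses Python's "+00:00" suffix rather than the "Z" shown above.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def integer_to_datetime(milliseconds: int) -> datetime:
    """Add the given number of milliseconds to the epoch."""
    return EPOCH + timedelta(milliseconds=milliseconds)


def datetime_to_integer(moment: datetime) -> int:
    """Return the whole milliseconds elapsed since the epoch."""
    return (moment - EPOCH) // timedelta(milliseconds=1)


stamp = integer_to_datetime(1234567890123)
print(stamp.isoformat(timespec="milliseconds"))  # 2009-02-13T23:31:30.123+00:00
print(datetime_to_integer(stamp))                # 1234567890123
```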

convert:integer-to-dayTime

Signatures convert:integer-to-dayTime($milliseconds as xs:integer) as xs:dayTimeDuration
Summary Converts the specified number of $milliseconds to an item of type xs:dayTimeDuration.
Examples
  • convert:integer-to-dayTime(1234) yields PT1.234S.

convert:dayTime-to-integer

Signatures convert:dayTime-to-integer($dayTime as xs:dayTimeDuration) as xs:integer
Summary Converts the specified $dayTime duration to milliseconds represented by an integer.
Examples
  • convert:dayTime-to-integer(xs:dayTimeDuration('PT1S')) yields 1000.

Keys

convert:encode-key

Signatures convert:encode-key($key as xs:string) as xs:string
convert:encode-key($key as xs:string, $lax as xs:boolean) as xs:string
Summary Encodes the specified $key (with the optional $lax conversion method) to a valid NCName representation, which can be used to create an element node:
  • An empty string is converted to a single underscore (_).
  • Existing underscores are rewritten to two underscores (__).
  • Characters that are not valid NCName characters are rewritten to an underscore followed by the character’s four-digit hexadecimal Unicode code point. For example, the question mark ? is transformed to _003f.
  • If lax conversion is chosen, invalid characters are replaced with underscores or (when invalid as first character of an element name) prefixed with an underscore. The resulting string may be more readable, but it cannot necessarily be converted back to the original form.

This encoding is employed by the direct conversion format in the JSON Module and the CSV Module.

Examples
  • element { convert:encode-key("!") } { } creates a new element with an encoded name: <_0021/>.
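
The encoding rules listed above can be traced with a small Python model. It is a sketch under several assumptions and not the BaseX implementation: NCName validity is approximated with str.isalpha and str.isalnum plus "-" and ".", existing underscores are assumed to be kept unchanged in lax mode, and characters beyond U+FFFF are not handled. All function names are hypothetical.

```python
def is_name_start(ch: str) -> bool:
    # Approximation of an NCName start character (underscore is handled separately).
    return ch.isalpha()


def is_name_char(ch: str) -> bool:
    # Approximation of an NCName character.
    return ch.isalnum() or ch in "-."


def encode_key(key: str, lax: bool = False) -> str:
    """Encode key to an NCName-like string, following the rules listed above."""
    if key == "":
        return "_"
    parts = []
    for index, ch in enumerate(key):
        first = index == 0
        valid = is_name_start(ch) if first else is_name_char(ch)
        if lax:
            if ch == "_" or valid:
                parts.append(ch)          # assumed: underscores stay as they are
            elif first and is_name_char(ch):
                parts.append("_" + ch)    # invalid only as first character: prefix
            else:
                parts.append("_")         # invalid everywhere: replace
        elif ch == "_":
            parts.append("__")
        elif valid:
            parts.append(ch)
        else:
            parts.append("_%04x" % ord(ch))
    return "".join(parts)


print(encode_key("!"))                     # _0021
print(encode_key("first name"))            # first_0020name
print(encode_key("first name", lax=True))  # first_name
```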

convert:decode-key

Signatures convert:decode-key($key as xs:string) as xs:string
convert:decode-key($key as xs:string, $lax as xs:boolean) as xs:string
Summary Decodes the specified $key (with the optional $lax conversion method) to the original string representation.
Keys supplied to this function are usually element names from documents that have been created with the JSON Module or CSV Module.
Examples
  • convert:decode-key('_0021') yields !.
  • json:doc("doc.json")//* ! convert:decode-key(name()) yields the original string representation of all names of a JSON document.
Errors key: The specified key cannot be decoded to its original representation.
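
Decoding reverses the strict encoding rules of convert:encode-key: a single underscore stands for the empty string, two underscores for one underscore, and an underscore followed by four hexadecimal digits for the corresponding character. The Python sketch below models only this strict form; the $lax variant is not covered, the function name is hypothetical, and ValueError stands in for the key error.

```python
import string


def decode_key(key: str) -> str:
    """Reverse the strict key encoding; raise ValueError if the key is malformed."""
    if key == "_":
        return ""
    parts = []
    index = 0
    while index < len(key):
        ch = key[index]
        if ch != "_":
            parts.append(ch)
            index += 1
        elif key.startswith("__", index):
            parts.append("_")  # escaped underscore
            index += 2
        else:
            code = key[index + 1:index + 5]
            if len(code) != 4 or any(c not in string.hexdigits for c in code):
                raise ValueError(f"cannot decode key: {key!r}")
            parts.append(chr(int(code, 16)))
            index += 5
    return "".join(parts)


print(decode_key("_0021"))           # !
print(decode_key("first_0020name"))  # first name
```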

Errors

Code Description
base The specified base is not in the range 2-36.
binary The input cannot be converted to a binary representation.
encoding The specified encoding is invalid or not supported.
integer The specified digit is not valid for the given range.
key The specified key cannot be decoded to its original representation.
string The input is an invalid XML string, or the wrong encoding has been specified.

Changelog

Version 9.4
Version 9.0
Version 8.5
Version 7.5

The module was introduced with Version 7.3. Some of the functions have been adopted from the obsolete Utility Module.