Difference between revisions of "CSV Module"
Revision as of 23:43, 17 October 2013
This XQuery Module contains functions to parse and serialize CSV data. CSV (comma-separated values) is a popular representation for tabular data, exported e.g. from Excel.
Conventions
All functions in this module are assigned to the http://basex.org/modules/csv namespace, which is statically bound to the csv prefix.
All errors are assigned to the http://basex.org/errors namespace, which is statically bound to the bxerr prefix.
Rules
Version 7.7.2: the conversion rules have been updated and aligned with the JSON parser.
CSV is converted to XML as follows:
- The resulting XML document has a <csv/> root element.
- Rows are represented via <record/> elements.
- Fields are represented via <entry/> elements. The value of a field is represented as a text node.
- If the header option is set to true, the first text line is parsed as table header, and the entry elements are replaced with the field names:
  - Empty names are represented by a single underscore (_), and characters that are not valid in element names are replaced with underscores.
  - If the lax option is set to false, invalid characters will be rewritten to an underscore and the character’s four-digit Unicode code point, and underscores will be represented as two underscores (__). The resulting element names may be less readable, but can always be converted back to the original field names.
- If format is set to attributes, field names will be stored in name attributes.
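The naming rules for header fields can be illustrated with a short Python sketch. It is not the BaseX implementation: the function names encode_name and decode_name are hypothetical, the set of characters treated as valid in element names is reduced to ASCII letters, digits, dots and hyphens, and the four-digit code is assumed to be lowercase hexadecimal.

```python
import re

# Simplified assumption: only these ASCII characters count as valid name characters.
# Real XML names allow many more characters and restrict the first character.
_VALID = re.compile(r"[A-Za-z0-9.\-]")


def encode_name(name: str, lax: bool = True) -> str:
    """Turn a CSV header field into an element name (illustrative only)."""
    if name == "":
        return "_"  # empty names become a single underscore
    out = []
    for ch in name:
        if _VALID.fullmatch(ch):
            out.append(ch)
        elif lax:
            out.append("_")  # lax: underscores stay, invalid characters collapse to "_"
        elif ch == "_":
            out.append("__")  # non-lax: underscores are doubled
        else:
            out.append("_%04x" % ord(ch))  # non-lax: "_" plus four hex digits (assumed)
    return "".join(out)


def decode_name(encoded: str) -> str:
    """Invert the non-lax encoding produced by encode_name(..., lax=False)."""
    if encoded == "_":
        return ""
    return re.sub(
        r"__|_([0-9a-f]{4})",
        lambda m: "_" if m.group(1) is None else chr(int(m.group(1), 16)),
        encoded,
    )


print(encode_name("First Name"))             # lax mode
print(encode_name("First Name", lax=False))  # reversible mode
```

In lax mode, "First Name" and "First_Name" end up with the same element name, which is why only the non-lax form can be converted back to the original field name.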
If the CSV parser is selected in the Database Creation dialog of the GUI, a simple example is displayed to show the effects of the available options.
Functions
csv:parse
Signatures | csv:parse($input as xs:string) as element(csv)
csv:parse($input as xs:string, $options as item()) as element(csv)
|
Summary | Converts the CSV data specified by $input to XML, and returns the result as an element(csv) value.
The $options argument can be used to control the way the input is converted. The following options are available:
Options can be specified either as children of a <csv:options/> element or as a map:
<csv:options> <csv:separator value=';'/> ... </csv:options>
{ 'separator' : ';', ... } |
Errors | BXCS0001: the input cannot be converted.
BXCS0003: the specified separator must be a single character.
|
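As a rough illustration of the result that csv:parse returns, the following Python sketch applies the conversion rules from the Rules section. It is not BaseX code: the function parse_csv and its keyword arguments are hypothetical stand-ins for the module's options, header names are assumed to be valid element names already, and it is an assumption that format=attributes places the name attribute on entry elements and only takes effect together with a header line.

```python
import csv
import io
import xml.etree.ElementTree as ET


def parse_csv(text, separator=",", header=False, attributes=False):
    """Convert CSV text to a <csv/> element (illustrative sketch)."""
    if len(separator) != 1:
        # Mirrors the condition described for BXCS0003.
        raise ValueError("the specified separator must be a single character")
    rows = list(csv.reader(io.StringIO(text, newline=""), delimiter=separator))
    # With header=True, the first line supplies the field names.
    names = rows.pop(0) if header and rows else None
    root = ET.Element("csv")
    for row in rows:
        record = ET.SubElement(root, "record")
        for i, value in enumerate(row):
            name = names[i] if names and i < len(names) else None
            if name is not None and not attributes:
                # Field name replaces <entry/>; empty names become "_".
                field = ET.SubElement(record, name or "_")
            else:
                field = ET.SubElement(record, "entry")
                if name is not None:
                    field.set("name", name)  # assumed attribute placement
            field.text = value
    return root


doc = parse_csv("Name;City\nAlice;Berlin\n", separator=";", header=True)
print(ET.tostring(doc, encoding="unicode"))
```

Calling the sketch without header=True keeps every line as a record of entry elements, including the first one.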
csv:serialize
Signatures | csv:serialize($input as node(), $options as item()) as xs:string
|
Summary | Serializes the node specified by $input as CSV data, and returns the result as xs:string.
XML documents can also be serialized as CSV if the Serialization Option method is set to csv.
The $options argument can be used to control the way the node is serialized. The following options are available:
Options can be specified either as children of a <csv:options/> element or as a map:
<csv:options> <csv:separator value=';'/> ... </csv:options>
{ 'separator' : ';', ... } |
Errors | BXCS0002: the input cannot be serialized.
BXCS0003: the specified separator must be a single character.
|
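The reverse direction can be sketched in the same way. The Python function serialize_csv below is hypothetical and only mirrors the idea behind csv:serialize: each record element becomes one line, and each child element becomes one field. How BaseX quotes fields that contain the separator is not stated on this page, so the minimal quoting of Python's csv module is an assumption, and no header line is written.

```python
import csv
import io
import xml.etree.ElementTree as ET


def serialize_csv(root, separator=","):
    """Serialize a <csv/> element to CSV text (illustrative sketch)."""
    if len(separator) != 1:
        # Mirrors the condition described for BXCS0003.
        raise ValueError("the specified separator must be a single character")
    out = io.StringIO()
    writer = csv.writer(out, delimiter=separator, lineterminator="\n")
    for record in root.findall("record"):
        # One field per child element; missing text is written as an empty field.
        writer.writerow([field.text or "" for field in record])
    return out.getvalue()


doc = ET.fromstring(
    "<csv>"
    "<record><entry>Name</entry><entry>City</entry></record>"
    "<record><entry>Alice</entry><entry>Berlin; Mitte</entry></record>"
    "</csv>"
)
print(serialize_csv(doc, separator=";"))
```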
Errors
Code | Description |
---|---|
BXCS0001 | The input cannot be converted. |
BXCS0002 | The node cannot be serialized. |
BXCS0003 | The specified separator must be a single character. |
Changelog
The module was introduced with Version 7.7.2.