CSV Module

From BaseX Documentation

Revision as of 19:17, 24 September 2013

This XQuery Module contains functions to parse and serialize CSV data. CSV (comma-separated values) is a popular representation for tabular data, exported e.g. from Excel.

Conventions

All functions in this module are assigned to the http://basex.org/modules/csv namespace, which is statically bound to the csv prefix.
All errors are assigned to the http://basex.org/errors namespace, which is statically bound to the bxerr prefix.

Rules

Version 7.7.2: the conversion rules have been updated and aligned with the JSON parser.

The conversion of CSV data is based on the following rules:

  1. The resulting document has a <csv/> root node.
  2. Rows are represented via <record/> nodes.
  3. Fields are represented via <entry/> nodes or, if the CSV header is parsed, via nodes named after the corresponding column. Column names are converted as follows:
    1. Empty field names are represented by a single underscore (<_>...</_>).
    2. Underscore characters are rewritten to two underscores (__).
    3. A character that cannot be represented as an NCName character is rewritten to an underscore followed by its four-digit Unicode code point.
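
The rules above determine how a parsed result can be navigated. The following is a minimal sketch, assuming no header is parsed, so that every field is an entry element; the sample data is made up for illustration:

```xquery
(: Sample input with two rows; &#10; is a newline character reference :)
let $input := "Alice,Berlin&#10;Bob,Paris"
(: Without options, no header is parsed (the default is false) :)
let $csv := csv:parse($input)
(: Rule 2: one <record/> per row; rule 3: one <entry/> per field :)
for $record in $csv/record
return string-join($record/entry, ' | ')
```

Because csv:parse returns the csv element itself, the path starts at record rather than at csv/record.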

Functions

csv:parse

Signatures csv:parse($input as xs:string) as element(csv)
csv:parse($input as xs:string, $options as item()) as element(csv)
Summary Converts the CSV data specified by $input to XML, and returns the result as element(csv) value.
The $options argument can be used to control the way the input is converted. The following options are available:
  • separator defines the character which separates columns in a row. By default, this is a comma (,).
  • header specifies if the input contains a header row. The default value is false.

Options can either be specified

  • as children of a <csv:options/> element; e.g.:
<csv:options>
  <csv:separator value=';'/>
  ...
</csv:options>
  • or as a map, which contains all key/value pairs:
{ 'separator' : ';', ... }
Errors BXCS0001: the input cannot be converted.
BXCS0003: the specified separator must be a single character.
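
A call with both options might look like the following sketch. The csv:separator element is taken from the example above; the csv:header element is an assumption, formed by analogy with it, and the sample data is made up:

```xquery
(: Semicolon-separated input whose first row holds the column names :)
let $input := "Name;City&#10;Alice;Berlin"
return csv:parse(
  $input,
  <csv:options>
    <csv:separator value=';'/>
    <!-- assumed by analogy with csv:separator -->
    <csv:header value='true'/>
  </csv:options>
)
```

With the header parsed, the fields of each record should be named after the columns (here Name and City) instead of entry.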

csv:serialize

Signatures csv:serialize($input as node(), $options as item()) as xs:string
Summary Serializes the node specified by $input as CSV data, and returns the result as xs:string.
XML documents can also be serialized as CSV if the Serialization Option "method" is set to "csv".
The $options argument can be used to control the way the node is serialized. The following options are available:
  • separator defines the character which separates columns in a row. By default, this is a comma (,).
  • header specifies if the input element names are to be interpreted as header names. The default value is false.

Options can either be specified

  • as children of a <csv:options/> element; e.g.:
<csv:options>
  <csv:separator value=';'/>
  ...
</csv:options>
  • or as a map, which contains all key/value pairs:
{ 'separator' : ';', ... }
Errors BXCS0002: the input cannot be serialized.
BXCS0003: the specified separator must be a single character.
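
Serialization can be sketched in the same way. The input below follows the structure described under Rules (a csv root, record rows, entry fields); the data itself is made up for illustration:

```xquery
(: A node in the shape that csv:parse produces when no header is parsed :)
let $csv :=
  <csv>
    <record><entry>Alice</entry><entry>Berlin</entry></record>
    <record><entry>Bob</entry><entry>Paris</entry></record>
  </csv>
(: Write the fields of each record separated by semicolons :)
return csv:serialize(
  $csv,
  <csv:options>
    <csv:separator value=';'/>
  </csv:options>
)
```

The result is a single xs:string with the fields of each record separated by the chosen character.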

Errors

Code Description
BXCS0001 The input cannot be converted.
BXCS0002 The node cannot be serialized.
BXCS0003 The specified separator must be a single character.
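
Since all errors are bound to the bxerr prefix, they can be caught by name with an XQuery 3.0 try/catch expression. A minimal sketch, assuming that a two-character separator triggers BXCS0003 as described above:

```xquery
try {
  (: ';;' is not a single character, so this call is expected to fail :)
  csv:parse(
    "a;;b",
    <csv:options>
      <csv:separator value=';;'/>
    </csv:options>
  )
} catch bxerr:BXCS0003 {
  (: $err:description is implicitly available inside a catch clause :)
  concat("Invalid separator: ", $err:description)
}
```

Catching the specific code rather than a wildcard lets other errors, such as BXCS0001 for unconvertible input, still propagate to the caller.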

Changelog

The module was introduced with Version 7.7.2.