Difference between revisions of "CSV Module"
Line 26: | Line 26: | ||
==csv:parse== | ==csv:parse== | ||
− | {{Version|7.8}}: the return type has been changed from {{Code|element(<csv>)}} to {{Code|document-node(element(<csv>))}} | + | {{Version|7.8}}: the return type has been changed from {{Code|element(<csv>)}} to {{Code|document-node(element(<csv>))}}, and the {{Code|format}} and {{Code|lax}} options have been added. |
{| width='100%' | {| width='100%' | ||
Line 35: | Line 35: | ||
| '''Summary''' | | '''Summary''' | ||
|Converts the CSV data specified by {{Code|$input}} to XML, and returns the result as {{Code|<csv/>}} value.<br/>The {{Code|$options}} argument can be used to control the way the input is converted. The following options are available: | |Converts the CSV data specified by {{Code|$input}} to XML, and returns the result as {{Code|<csv/>}} value.<br/>The {{Code|$options}} argument can be used to control the way the input is converted. The following options are available: | ||
− | * {{Code|separator}} defines the character which separates columns in a row. | + | * {{Code|separator}} defines the character which separates columns in a row. Allowed values are {{Code|comma}}, {{Code|semicolon}}, {{Code|colon}}, {{Code|tab}}, {{Code|space}} and single characters. The default is {{Code|comma}}. |
− | * {{Code|header}} specifies if the input | + | * {{Code|header}} specifies if the first line of the input is parsed as table header. Allowed values are {{Code|yes}} and {{Code|no}}; the default is {{Code|no}}. |
Options can either be specified<br /> | Options can either be specified<br /> | ||
* as children of an {{Code|<csv:options/>}} element; e.g.: | * as children of an {{Code|<csv:options/>}} element; e.g.: | ||
Line 47: | Line 47: | ||
* or as map, which contains all key/value pairs: | * or as map, which contains all key/value pairs: | ||
<pre class="brush:xquery"> | <pre class="brush:xquery"> | ||
− | { 'separator' : ';', ... } | + | { 'separator': ';', ... } |
</pre> | </pre> | ||
|- | |- |
Revision as of 23:49, 17 October 2013
This XQuery Module contains a single function to parse CSV input. CSV (comma-separated values) is a popular representation for tabular data, exported e.g. from Excel.
Conventions
All functions in this module are assigned to the http://basex.org/modules/csv namespace, which is statically bound to the csv prefix.
All errors are assigned to the http://basex.org/errors namespace, which is statically bound to the bxerr prefix.
Rules
Version 7.7.2: the conversion rules have been updated and aligned with the JSON parser.
CSV is converted to XML as follows:
- The resulting XML document has a <csv/> root element.
- Rows are represented via <record/> elements.
- Fields are represented via <entry/> elements. The value of a field is represented as a text node.
- If the header option is set to true, the first text line is parsed as table header, and the entry elements are replaced with the field names:
  - Empty names are represented by a single underscore (_), and characters that are not valid in element names are replaced with underscores.
  - If the lax option is set to false, invalid characters will be rewritten to an underscore and the character’s four-digit Unicode, and underscores will be represented as two underscores (__). The resulting element names may be less readable, but can always be converted back to the original field names.
- If format is set to attributes, field names will be stored in name attributes.
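The conversion rules above can be illustrated with a minimal sketch. The input string is made up for illustration, and the result shown in the comment is what the listed rules suggest for the default options, not output copied from a real run.

```xquery
(: Minimal sketch: parse two CSV lines with the default options.
   The sample data is hypothetical. :)
let $input := "Name,City&#xA;Alice,Berlin"
return csv:parse($input)
(: With no header parsing, the rules above suggest a result of this shape:
   <csv>
     <record><entry>Name</entry><entry>City</entry></record>
     <record><entry>Alice</entry><entry>Berlin</entry></record>
   </csv> :)
```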
If the CSV parser is selected in the Database Creation dialog of the GUI, a simple example is displayed to show the effects of the available options.
Functions
csv:parse
Version 7.8: the return type has been changed from element(<csv>) to document-node(element(<csv>)), and the format and lax options have been added.
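Signatures | csv:parse($input as xs:string) as document-node(element(csv))
csv:parse($input as xs:string, $options as item()) as document-node(element(csv)) |
Summary | Converts the CSV data specified by $input to XML, and returns the result as a document node with a <csv/> root element.
The $options argument can be used to control the way the input is converted. The following options are available:
- separator defines the character which separates columns in a row. Allowed values are comma, semicolon, colon, tab, space and single characters. The default is comma.
- header specifies if the first line of the input is parsed as table header. Allowed values are yes and no; the default is no.
Options can either be specified
- as children of a <csv:options/> element; e.g.: <csv:options> <csv:separator value=';'/> ... </csv:options>
- or as a map, which contains all key/value pairs: { 'separator': ';', ... } |
Errors | BXCS0001: the input cannot be converted.
BXCS0003: the specified separator must be a single character. |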
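As a hedged sketch of how the options might be passed, the following call uses the map notation shown on this page. The sample data is hypothetical, the option values are taken from the list of allowed values above, and newer XQuery processors may require the `map { ... }` constructor instead of the bare braces.

```xquery
(: Sketch: semicolon-separated input whose first line is treated as header.
   The input string is an assumption for illustration only. :)
let $input := "Name;City&#xA;Alice;Berlin"
return csv:parse($input, { 'separator': ';', 'header': 'yes' })
(: Following the conversion rules, the entry elements should be replaced by
   the field names, e.g. <Name>Alice</Name> and <City>Berlin</City>. :)
```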
csv:serialize
Signatures | csv:serialize($input as node(), $options as item()) as xs:string |
Summary | Serializes the node specified by $input as CSV data, and returns the result as xs:string.
XML documents can also be serialized as CSV if the Serialization Option method is set to csv.
The $options argument can be used to control the way the node is serialized. The following options are available:
Options can either be specified
- as children of a <csv:options/> element; e.g.: <csv:options> <csv:separator value=';'/> ... </csv:options>
- or as a map, which contains all key/value pairs: { 'separator': ';', ... } |
Errors | BXCS0002: the input cannot be serialized.
BXCS0003: the specified separator must be a single character. |
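The reverse direction can be sketched in the same way. The node below follows the <csv/>, <record/> and <entry/> structure described under Rules; the data itself is made up, and the exact output (line endings, quoting) is not shown because it depends on the implementation.

```xquery
(: Sketch: serialize a hand-written <csv/> node with a semicolon separator.
   The element content is hypothetical sample data. :)
let $csv :=
  <csv>
    <record><entry>Name</entry><entry>City</entry></record>
    <record><entry>Alice</entry><entry>Berlin</entry></record>
  </csv>
return csv:serialize($csv, { 'separator': ';' })
```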
Errors
Code | Description |
---|---|
BXCS0001 | The input cannot be converted. |
BXCS0002 | The node cannot be serialized. |
BXCS0003 | The specified separator must be a single character. |
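The error codes can be caught with a standard XQuery 3.0 try/catch expression, using the statically bound bxerr prefix. The sketch below assumes that a two-character separator is rejected with BXCS0003; that trigger is an assumption, not something stated on this page.

```xquery
(: Sketch: catch the separator error raised by csv:parse.
   The invalid separator 'ab' is an assumed way to provoke BXCS0003. :)
try {
  csv:parse("a,b", { 'separator': 'ab' })
} catch bxerr:BXCS0003 {
  (: $err:description is provided implicitly inside the catch clause :)
  'Invalid separator: ' || $err:description
}
```

Any other error is not matched by the catch clause and is propagated unchanged.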
Changelog
The module was introduced with Version 7.7.2.