1,232 bytes added ,  16:18, 11 February 2017
=Conventions=
All functions in this module are assigned to the <code><nowiki>http://basex.org/modules/csv</nowiki></code> namespace, which is statically bound to the {{Code|csv}} prefix.<br/>All errors are assigned to the <code><nowiki>http://basex.org/errors</nowiki></code> namespace, which is statically bound to the {{Code|bxerr}} prefix.
==Conversion==
* Fields are represented via {{Code|<entry>}} elements. The value of a field is represented as a text node.
* If the {{Code|header}} option is set to {{Code|true}}, the first text line is parsed as the table header, and the {{Code|<entry>}} elements are replaced with the field names:
** Empty names are represented by a single underscore ({{Code|_}}), and characters that are not valid in element names are replaced with underscores or (when invalid as first character of an element name) prefixed with an underscore.
** If the {{Code|lax}} option is set to {{Code|false}}, invalid characters will be rewritten to an underscore and the character’s four-digit Unicode, and underscores will be represented as two underscores ({{Code|__}}). The resulting element names may be less readable, but can always be converted back to the original field names.
* If {{Code|format}} is set to {{Code|attributes}}, field names will be stored in name attributes.
 
In the Database Creation dialog of the GUI, when the CSV parser is selected, the ''Parsing'' tab demonstrates the conversion of CSV to XML and the effects of the single conversion options.
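The same conversion can also be tried out in a query. The following is a minimal sketch; the two-line input string is made up for illustration, and {{Code|out:nl()}} is only used to build the line break:
<pre class="brush:xquery">
(: Illustrative input; the values are made up :)
let $text := "Name,City" || out:nl() || "John,Newton"
return (
  (: without options: all fields are represented as <entry> elements :)
  csv:parse($text),
  (: with header: the <entry> elements are replaced with the field names :)
  csv:parse($text, map { 'header': true() })
)
</pre>
The first call is expected to yield {{Code|<entry>}} elements for all four values, whereas the second call should name the elements of the remaining data line {{Code|<Name>}} and {{Code|<City>}}.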
===Map===
* By default, all entries of a record are represented in a sequence.
* If the {{Code|header}} option is set to {{Code|true}}, a map is created, which contains all field names and their values.
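
A minimal sketch of how a single value might be looked up in the map representation. It assumes that the resulting map uses the record position as key and, with {{Code|header}} enabled, a nested map of field names as value; the input string is made up for illustration:
<pre class="brush:xquery">
(: Illustrative input; the values are made up :)
let $text := "Name;City" || out:nl() || "John;Newton"
let $map := csv:parse($text, map {
  'separator': ';',
  'format': 'map',
  'header': true()
})
(: assumption: key 1 addresses the first record, 'City' its field :)
return $map(1)('City')
</pre>
Under this assumption, the query returns the {{Code|City}} value of the first record.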
 
==Options==
{{Mark|Updated with Version 8.6}}: improved Excel compatibility.

In the following table, all available options are listed. The Excel column indicates the preferred options for data that is to be imported into Excel or has been exported from it.
{| class="wikitable sortable" width="100%"
! Option
! Description
! Allowed
! Default
! Excel
|- valign="top"
| {{Code|separator}}
| Defines the character that separates the values of a single record.
| {{Code|comma}}, {{Code|semicolon}}, {{Code|colon}}, {{Code|tab}}, {{Code|space}} or a ''single character''
| {{Code|comma}}
| {{Code|semicolon}}
|- valign="top"
| {{Code|header}}
| Indicates whether the first text line is parsed as the table header. If enabled, the {{Code|<entry>}} elements are replaced with the field names.
| {{Code|yes}}, {{Code|no}}
| {{Code|no}}
|
|- valign="top"
| {{Code|format}}
| Specifies the representation of the data: with {{Code|direct}}, fields are represented as elements; with {{Code|attributes}}, field names are stored in name attributes; with {{Code|map}}, the data is converted to an XQuery map.
| {{Code|direct}}, {{Code|attributes}}, {{Code|map}}
| {{Code|direct}}
|
|- valign="top"
| {{Code|lax}}
| Specifies how field names are converted to element names: if enabled, characters that are invalid in element names are replaced with underscores; if disabled, they are encoded such that the original field names can be restored.
| {{Code|yes}}, {{Code|no}}
| {{Code|yes}}
| {{Code|no}}
|- valign="top"
| {{Code|quotes}}
| Specifies how quotes are parsed:
* Parsing: If the option is enabled, quotes at the start and end of a value will be treated as control characters. Separators and newlines within the quotes will be adopted without change.
* Serialization: If the option is enabled, the value will be wrapped with quotes. A quote character in the value will be encoded according to the rules of the {{Code|backslashes}} option.
| {{Code|yes}}, {{Code|no}}
| {{Code|yes}}
| {{Code|yes}}
|- valign="top"
| {{Code|backslashes}}
| Specifies how quotes and other characters are escaped:
* Parsing: If the option is enabled, {{Code|\r}}, {{Code|\n}} and {{Code|\t}} will be replaced with the corresponding control characters. All other escaped characters will be adopted as literals (e.g.: {{Code|\"}} → {{Code|"}}). If the option is disabled, two consecutive quotes will be replaced with a single quote (unless {{Code|quotes}} is enabled and the quote is the first or last character of a value).
* Serialization: If the option is enabled, {{Code|\r}}, {{Code|\n}}, {{Code|\t}}, {{Code|"}} and the separator character will be encoded with a backslash. If the option is disabled, quotes will be duplicated.
| {{Code|yes}}, {{Code|no}}
| {{Code|no}}
| {{Code|no}}
|}
 
The CSV function signatures provide an {{Code|$options}} argument. Options can either be specified
* as children of an {{Code|<csv:options/>}} element; e.g.:
<pre class="brush:xml">
<csv:options>
<csv:separator value=';'/>
...
</csv:options>
</pre>
* or as a map, which contains all key/value pairs:
<pre class="brush:xquery">
map { 'separator': ';', ... }
</pre>
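As a sketch of how the Excel column of the options table might be put to use, the following query parses a file with the values that are listed there as preferred for Excel. The file name {{Code|excel-export.csv}} is hypothetical, and enabling {{Code|header}} is an additional assumption about the data:
<pre class="brush:xquery">
let $options := map {
  'separator': 'semicolon',  (: Excel column: semicolon instead of comma :)
  'lax': false(),            (: Excel column: no :)
  'quotes': true(),          (: Excel column: yes :)
  'backslashes': false(),    (: Excel column: no :)
  'header': true()           (: assumption: the first line contains field names :)
}
(: hypothetical file name :)
return csv:parse(file:read-text('excel-export.csv'), $options)
</pre>
Passing the same map to [[#csv:serialize|csv:serialize]] should keep parsing and serialization consistent with each other.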
=Functions=
==csv:parse==
 
{{Version|7.8}}: [[#csv:parse|csv:parse]] now returns a document node instead of an element, or an XQuery map if {{Code|format}} is set to {{Code|map}}. New options have been added.
{| width='100%'
|-
| width='120' | '''Signatures'''
|{{Func|csv:parse|$input as xs:string|document-node(element(csv))}}<br/>{{Func|csv:parse|$input as xs:string, $options as map(xs:string, item())|item()}}
|-
| '''Summary'''
|Converts the CSV data specified by {{Code|$input}} to a document node, or to an XQuery map if {{Code|format}} is set to {{Code|map}}.<br/>The {{Code|$options}} argument can be used to control the way the input is converted.
|}
==csv:serialize==
 
{| width='100%'
|-
| width='120' | '''Signatures'''
|{{Func|csv:serialize|$input as node()|xs:string}}<br/>{{Func|csv:serialize|$input as node(), $options as map(xs:string, item())|xs:string}}
|-
| '''Summary'''
|Serializes the node specified by {{Code|$input}} as CSV data, and returns the result as {{Code|xs:string}}. Items can also be serialized as CSV if the [[Serialization|Serialization Parameter]] {{Code|method}} is set to {{Code|csv}}.<br/>The {{Code|$options}} argument can be used to control the way the input is serialized.
|-
| '''Errors'''
|}
<pre class="brush:xquery">
let $text := file:read-text('addressbook.csv')
return csv:parse($text, map { 'header': true() })
</pre>
'''Query:'''
<pre class="brush:xquery">
let $options := map { 'lax': false() }
let $input := file:read-text('some-data.csv')
let $output := $input => csv:parse($options) => csv:serialize($options)
return $input eq $output
</pre>
<pre class="brush:xquery">
let $text := "Name;City" || out:nl() || "John;Newton" || out:nl() || "Jack;Oldtown"
let $options := map { 'separator': ';', 'format': 'map', 'header': true() }
return csv:parse($text, $options)
</pre>
'''Result:'''
<pre class="brush:xquery">
map {
  1: map {
    "City": "Newton",
    "Name": "John"
  },
  2: map {
    "City": "Oldtown",
    "Name": "Jack"
  }
}
</pre>
=Changelog=
 
;Version 8.6
 
* Updated: [[#Options|Options]]: improved Excel compatibility
 
;Version 8.0
 
* Added: {{Code|backslashes}} option
;Version 7.8
The module was introduced with Version 7.7.2.
 
[[Category:XQuery]]