347 bytes removed ,  15:54, 26 May 2012
This [[Module Library|XQuery Module]] extends the [http://www.w3.org/TR/xpath-full-text-10 W3C Full Text Recommendation] with some useful functions: the index can be directly accessed, full-text results can be marked with additional elements, or the relevant parts can be extracted. Moreover, the score value, which is generated by the {{Code|contains text}} expression, can be explicitly requested from items.
=Conventions=
All functions in this module are assigned to the {{Code|http://basex.org/modules/ft}} namespace, which is statically bound to the {{Code|ft}} prefix.<br/>All errors are assigned to the {{Code|http://basex.org/errors}} namespace, which is statically bound to the {{Code|bxerr}} prefix.
=Functions=
|-
| width='90' | '''Signatures'''
|{{Func|ft:search|$db as item(), $terms as item()*|text()*}}<br/>{{Func|ft:search|$db as item(), $terms as item()*, $options as item()|text()*}}
|-
| '''Summary'''
|Returns all text nodes from the full-text index of the [[Database Module#Database Nodes|database node]] {{Code|$db}} that contain the specified {{Code|$terms}}.<br/>The options used for building the full-text index will also be applied to the search terms. As an example, if the index terms have been stemmed, the search string will be stemmed as well.
The {{Code|$options}} argument can be used to overwrite the default full-text options. It can be specified as
* {{Code|element(options)}}: {{Code|&lt;options/&gt;}} must be used as root element, and the parameters are specified as child nodes, with the element name representing the key and the text node representing the value:<br />
<pre class="brush:xml">
<options>
</options>
</pre>
* [[Map Module|map structure]]: all parameters can be directly represented as key/value pairs:<br />{{Code|map { "key" := "value", ... }}}<br/>This variant is more compact, but please note that the W3C's specification of maps in XQuery is still work in progress.
The following keys are supported:
* {{Code|mode}}: determines the search mode (also called [http://www.w3.org/TR/xpath-full-text-10/#ftwords AnyAllOption]). Allowed values are {{Code|any}}, {{Code|any word}}, {{Code|all}}, {{Code|all words}}, and {{Code|phrase}}. {{Code|any}} is the default search mode.
|-
| '''Errors'''
|{{Error|BXDB0004|Database Module#Errors}} the full-text index is not available.<br/>{{Error|BXFT0001|#Errors}} both fuzzy and wildcard querying was selected.
|-
| '''Examples'''
|
* {{Code|ft:search("DB", "QUERY")}} returns all text nodes of the database {{Code|DB}} that contain the term {{Code|QUERY}}.
* {{Code|ft:search("DB", (2010,2011), map { 'mode':='all' })}}<br/>returns all text nodes of the database {{Code|DB}} that contain the numbers {{Code|2010}} and {{Code|2011}}.
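* As a further sketch, the {{Code|mode}} key can be combined with a multi-word search term. Assuming the same placeholder database {{Code|DB}}, the following call should return the text nodes that contain both {{Code|hello}} and {{Code|world}}, in any order:
<pre class="brush:xquery">
(: 'all words' requires every single word of the search terms to occur :)
ft:search("DB", "hello world", map { 'mode':='all words' })
</pre>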
* The last example iterates over five databases and returns all elements containing terms similar to {{Code|Hello World}} in the text nodes:
<pre class="brush:xquery">
|-
| width='90' | '''Signatures'''
|{{Func|ft:mark|$nodes as node()*|node()*}}<br />{{Func|ft:mark|$nodes as node()*, $tag as xs:string|node()*}}
|-
| '''Summary'''
|Puts a marker element around the resulting {{Code|$nodes}} of a full-text index request.<br />The default tag name of the marker element is {{Code|mark}}. An alternative tag name can be chosen via the optional {{Code|$tag}} argument.<br />Note that the XML node to be transformed must be an internal "database" node. The {{Code|transform}} expression can be used to apply the method to a main-memory fragment (see example).
|-
| '''Examples'''
|
* The following query returns {{Code|&lt;XML&gt;&lt;mark&gt;hello&lt;/mark&gt; world&lt;/XML&gt;}}, if one text node of the database {{Code|DB}} has the value "hello world":
<pre class="brush:xquery">
ft:mark(db:open('DB')//*[text() contains text 'hello'])
</pre>
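* As a sketch of the two-argument signature, the following variant should wrap each match in a {{Code|b}} element instead of {{Code|mark}}. The database name {{Code|DB}} is the same placeholder as in the previous example:
<pre class="brush:xquery">
(: same query as above, with 'b' passed as the marker tag name :)
ft:mark(db:open('DB')//*[text() contains text 'hello'], 'b')
</pre>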
* The following expression returns {{Code|&lt;p&gt;&lt;b&gt;word&lt;/b&gt;&lt;/p&gt;}}:
<pre class="brush:xquery">
copy $p := &lt;p&gt;word&lt;/p&gt;
|-
| width='90' | '''Signatures'''
|{{Func|ft:extract|$nodes as node()*|node()*}}<br />{{Func|ft:extract|$nodes as node()*, $tag as xs:string|node()*}}<br />{{Func|ft:extract|$nodes as node()*, $tag as xs:string, $length as xs:integer|node()*}}
|-
| '''Summary'''
|Extracts and returns relevant parts of full-text results. It puts a marker element around the resulting {{Code|$nodes}} of a full-text index request and chops irrelevant sections of the result.<br />The default tag name of the marker element is {{Code|mark}}. An alternative tag name can be chosen via the optional {{Code|$tag}} argument.<br />The default length of the returned text is {{Code|150}} characters. An alternative length can be specified via the optional {{Code|$length}} argument. Note that the effective text length may differ from the specified length due to formatting and readability issues.
|-
| '''Examples'''
|
* The following query may return {{Code|&lt;XML&gt;...&lt;b&gt;hello&lt;/b&gt;...&lt;/XML&gt;}} if a text node of the database {{Code|DB}} contains the string "hello world":
<pre class="brush:xquery">
ft:extract(db:open('DB')//*[text() contains text 'hello'], 'b', 1)
|-
| width='90' | '''Signatures'''
|{{Func|ft:count|$nodes as node()*|xs:integer}}
|-
| '''Summary'''
| '''Examples'''
|
* {{Code|ft:count(//*[text() contains text 'QUERY'])}} returns the {{Code|xs:integer}} value {{Code|2}} if a document contains two occurrences of the string "QUERY".
|}
|-
| width='90' | '''Signatures'''
|{{Func|ft:score|$item as item()*|xs:double*}}
|-
| '''Summary'''
|Returns the score values (0.0 - 1.0) that have been attached to the specified items. {{Code|0}} is returned as value if no score was attached.
|-
| '''Examples'''
|
* {{Code|ft:score('a' contains text 'a')}} returns the {{Code|xs:double}} value {{Code|1}}.
|}
|-
| width='90' | '''Signatures'''
|{{Func|ft:tokens|$db as item()|element(value)*}}<br/>{{Func|ft:tokens|$db as item(), $prefix as xs:string|element(value)*}}
|-
| '''Summary'''
|Returns all full-text tokens stored in the index of the [[Database Module#Database Nodes|database node]] {{Code|$db}}, along with their numbers of occurrences.<br/>If {{Code|$prefix}} is specified, the returned nodes will be restricted to the tokens starting with that prefix. The prefix will be tokenized according to the full-text options used for creating the index.
|-
| '''Errors'''
|{{Error|BXDB0004|Database Module#Errors}} the full-text index is not available.
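|-
| '''Examples'''
|
* A minimal sketch, assuming a database named {{Code|DB}} (a placeholder) for which a full-text index has been created. The first call should list all indexed tokens, the second only those that start with the prefix {{Code|hel}}:
<pre class="brush:xquery">
(: all tokens of the full-text index, with their numbers of occurrences :)
ft:tokens("DB"),
(: only the tokens that start with the tokenized prefix "hel" :)
ft:tokens("DB", "hel")
</pre>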
|}
|-
| width='90' | '''Signatures'''
|{{Func|ft:tokenize|$input as xs:string|xs:string*}}
|-
| '''Summary'''
| '''Examples'''
|
* {{Code|ft:tokenize("No Doubt")}} returns the two strings {{Code|no}} and {{Code|doubt}}.
* {{Code|declare ft-option using stemming; ft:tokenize("GIFTS")}} returns a single string {{Code|gift}}.
|}
! width="95%"|Description
|-
|{{Code|BXFT0001}}
|Both wildcards and fuzzy search have been specified as search options.
|}