Difference between revisions of "Full-Text Module"
Line 15: | Line 15: | ||
|Returns all text nodes from the full-text index of the database <code>[[Database Module#Database Argument|$db]]</code> that contain the specified {{Mono|$terms}}.<br/>The options used for building the full-text will also be applied to the search terms. As an example, if the index terms have been stemmed, the search string will be stemmed as well. | |Returns all text nodes from the full-text index of the database <code>[[Database Module#Database Argument|$db]]</code> that contain the specified {{Mono|$terms}}.<br/>The options used for building the full-text will also be applied to the search terms. As an example, if the index terms have been stemmed, the search string will be stemmed as well. | ||
The {{Mono|$options}} argument can be used to overwrite the default full-text options. It can be specified as | The {{Mono|$options}} argument can be used to overwrite the default full-text options. It can be specified as | ||
− | * {{Mono|element( | + | * {{Mono|element(options)}}: <code><options/></code> must be used as root element, and the parameters are specified as child nodes, with the element name representing the key and the text node representing the value:<br /> |
<pre class="brush:xml"> | <pre class="brush:xml"> | ||
− | < | + | <options> |
<key>value</key> | <key>value</key> | ||
... | ... | ||
− | </ | + | </options> |
</pre> | </pre> | ||
* [[Map Module|map structure]]: all parameters can be directly represented as key/value pairs:<br /><code>map { "key" := "value", ... </code>}<br/>This variant is more compact, but please note that the W3C’s specification of maps in XQuery is still work in progress. | * [[Map Module|map structure]]: all parameters can be directly represented as key/value pairs:<br /><code>map { "key" := "value", ... </code>}<br/>This variant is more compact, but please note that the W3C’s specification of maps in XQuery is still work in progress. | ||
Line 40: | Line 40: | ||
let $fuzzy := true() | let $fuzzy := true() | ||
let $options := | let $options := | ||
− | < | + | <options> |
<fuzzy>{ $fuzzy }</fuzzy> | <fuzzy>{ $fuzzy }</fuzzy> | ||
− | </ | + | </options> |
for $db in 1 to 3 | for $db in 1 to 3 | ||
let $dbname := 'DB' || $db | let $dbname := 'DB' || $db |
Revision as of 18:37, 18 May 2012
This XQuery Module extends the W3C Full Text Recommendation with some useful functions: The index can be directly accessed, full-text results can be marked with additional elements, or the relevant parts can be extracted. Moreover, the score value, which is generated by the contains text expression, can be explicitly requested from items. All functions are introduced with the ft: prefix, which is linked to the statically declared http://basex.org/modules/ft namespace.
Functions
ft:search
Updated with Version 7.2: second argument generalized, third parameter added.
Signatures | ft:search($db as item(), $terms as item()*) as text()* ft:search($db as item(), $terms as item()*, $options as item()) as text()*
|
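Summary | Returns all text nodes from the full-text index of the database $db that contain the specified $terms. The options used for building the full-text index will also be applied to the search terms. As an example, if the index terms have been stemmed, the search string will be stemmed as well. The $options argument can be used to overwrite the default full-text options. It can be specified as
- element(options): <options/> must be used as root element, and the parameters are specified as child nodes, with the element name representing the key and the text node representing the value: <options> <key>value</key> ... </options>
- map structure: all parameters can be directly represented as key/value pairs: map { "key" := "value", ... }. This variant is more compact, but please note that the W3C's specification of maps in XQuery is still work in progress.
The following keys are supported: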
|
Errors | BASX0001 is raised if the full-text index is not available, or if the selected option is not supported by the existing index. BASX0002 is raised if a referenced node is not stored in a database (i.e., references a main-memory XML fragment). BASX0021 is raised if the specified full-text option is unknown. BASX0022 is raised if both fuzzy and wildcard querying have been selected. |
Examples |
let $terms := "Hello Worlds"
let $fuzzy := true()
let $options :=
  <options>
    <fuzzy>{ $fuzzy }</fuzzy>
  </options>
for $db in 1 to 3
let $dbname := 'DB' || $db
return ft:search($dbname, $terms, $options)/.. |
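The map variant of $options described in the summary can be sketched as follows. This is a minimal illustration, assuming a hypothetical database named DB with a full-text index, and using the ":=" map syntax given for this revision; later drafts of the map specification may use a different separator.

```xquery
(: Assumption: a database "DB" with a full-text index exists. :)
(: Same "fuzzy" option as in the element(options) example, expressed as a map. :)
let $options := map { "fuzzy" := true() }
return ft:search("DB", "Hello Worlds", $options)/..
```

As in the example above, the trailing /.. step moves from the matching text nodes to their parent elements.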
ft:mark
Signatures | ft:mark($nodes as node()*) as node()* ft:mark($nodes as node()*, $tag as xs:string) as node()*
|
Summary | Puts a marker element around the resulting $nodes of a full-text index request. The default tag name of the marker element is mark. An alternative tag name can be chosen via the optional $tag argument. Note that the XML node to be transformed must be an internal "database" node. The transform expression can be used to apply the method to a main-memory fragment (see example).
|
Errors | BASX0002 is raised if a referenced node is not stored in a database (i.e., references a main-memory XML fragment). FOCA0002 is raised if $tag is not a valid QName.
|
Examples |
ft:mark(db:open('DB')//*[text() contains text 'hello'])
copy $p := <p>word</p> modify () return ft:mark($p[text() contains text 'word'], 'b') |
ft:extract
Signatures | ft:extract($nodes as node()*) as node()* ft:extract($nodes as node()*, $tag as xs:string) as node()* ft:extract($nodes as node()*, $tag as xs:string, $length as xs:integer) as node()*
|
Summary | Extracts and returns relevant parts of full-text results. It puts a marker element around the resulting $nodes of a full-text index request and chops irrelevant sections of the result. The default tag name of the marker element is mark. An alternative tag name can be chosen via the optional $tag argument. The default length of the returned text is 150 characters. An alternative length can be specified via the optional $length argument. Note that the effective text length may differ from the specified length due to formatting and readability issues.
|
Errors | BASX0002 is raised if a referenced node is not stored in a database (i.e., references a main-memory XML fragment). FOCA0002 is raised if $tag is not a valid QName.
|
Examples |
ft:extract(db:open('DB')//*[text() contains text 'hello'], 'b', 1) |
ft:count
Signatures | ft:count($nodes as node()*) as xs:integer
|
Summary | Returns the number of occurrences of the search terms specified in a full-text expression. |
Errors | BASX0002 is raised if a referenced node is not stored in a database (i.e., references a main-memory XML fragment). |
Examples |
|
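The Examples entry for ft:count is empty in this revision. A minimal sketch, assuming a hypothetical database DB with a full-text index (the nodes must be database nodes, otherwise BASX0002 is raised):

```xquery
(: Assumption: "DB" is a hypothetical database with a full-text index. :)
(: Counts the occurrences of the search term in the matching nodes. :)
ft:count(db:open('DB')//*[text() contains text 'hello'])
```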
ft:score
Signatures | ft:score($item as item()*) as xs:double*
|
Summary | Returns the score values (0.0 - 1.0) that have been attached to the specified items. 0 is returned as value if no score was attached.
|
Examples |
|
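The Examples entry for ft:score is empty in this revision. A minimal sketch of requesting the score that a contains text expression attaches to its result; the input strings are arbitrary illustrations:

```xquery
(: The boolean result of "contains text" carries a score, :)
(: which ft:score makes explicit as an xs:double. :)
ft:score('hello world' contains text 'hello')
```

According to the summary, the returned value lies between 0.0 and 1.0, and 0 is returned for items without an attached score.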
ft:tokens
Signatures | ft:tokens($db as item()) as element(value)* ft:tokens($db as item(), $prefix as xs:string) as element(value)*
|
Summary | Returns all full-text tokens stored in the index of the database $db, along with their numbers of occurrences. $db may either be an xs:string, denoting the database name, or a node stored in the database. If $prefix is specified, the returned nodes will be refined to the strings starting with that prefix. The prefix will be tokenized according to the full-text options used for creating the index.
|
Errors | BASX0001 is raised if the full-text index is not available. BASX0002 is raised if $db references a node that is not stored in a database (i.e., references a main-memory XML fragment). BASX0003 is raised if the addressed database cannot be opened. |
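The two signatures of ft:tokens can be illustrated with a short sketch. It assumes a hypothetical database DB with a full-text index; the prefix 'hel' is an arbitrary choice.

```xquery
(: Assumption: "DB" is a hypothetical database with a full-text index. :)
(
  (: all tokens stored in the full-text index :)
  ft:tokens('DB'),
  (: only tokens starting with the (tokenized) prefix "hel" :)
  ft:tokens('DB', 'hel')
)
```

Both calls return value elements as stated in the signatures; the exact representation of the occurrence counts is not shown here because this revision does not specify it.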
ft:tokenize
Signatures | ft:tokenize($input as xs:string) as xs:string*
|
Summary | Tokenizes the given $input string, using the current default full-text options.
|
Examples |
|
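The Examples entry for ft:tokenize is empty in this revision. A minimal sketch; the input string is an arbitrary illustration, and the resulting tokens depend on the current default full-text options (for instance case sensitivity and stemming):

```xquery
(: Splits the input string into full-text tokens, :)
(: using the current default full-text options. :)
ft:tokenize("No Doubt")
```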
Changelog
Version 7.2
- Updated: ft:search (second argument generalized, third parameter added)
Version 7.1
- Added: ft:tokens, ft:tokenize