Changes

231 bytes added ,  11:33, 2 July 2020
m
Text replacement - "[http://www.w3.org/TR/xpath" to "[https://www.w3.org/TR/xpath"
This [[Module Library|XQuery Module]] extends the [https://www.w3.org/TR/xpath-full-text-10 W3C Full Text Recommendation] with some useful functions: the index can be accessed directly, full-text results can be marked with additional elements, and the relevant parts can be extracted. Moreover, the score value, which is generated by the {{Code|contains text}} expression, can be explicitly requested from items.
=Conventions=
 
{{Mark|Updated with Version 9.0}}:
All functions and errors in this module are assigned to the <code><nowiki>http://basex.org/modules/ft</nowiki></code> namespace, which is statically bound to the {{Code|ft}} prefix.<br/>
|-
| width='120' | '''Signatures'''
|{{Func|ft:search|$db as xs:string, $terms as item()*|text()*}}<br/>{{Func|ft:search|$db as xs:string, $terms as item()*, $options as map(xs:string, item()*)?|text()*}}
|-
| '''Summary'''
* Return all text nodes of the database {{Code|DB}} that contain the numbers {{Code|2010}} and {{Code|2020}}:<br/><code>ft:search("DB", ("2010", "2020"), map { 'mode': 'all' })</code>
* Return text nodes that contain the terms {{Code|A}} and {{Code|B}} in a distance of at most 5 words:
<syntaxhighlight lang="xquery">
ft:search("db", ("A", "B"), map {
  "mode": "all words",
  "distance": map {
    "max": "5",
    "unit": "words"
  }
})
</syntaxhighlight>
* Iterate over three databases and return all elements containing terms similar to {{Code|Hello World}} in the text nodes:
<syntaxhighlight lang="xquery">
let $terms := "Hello Worlds"
let $fuzzy := true()
for $db in 1 to 3
let $dbname := 'DB' || $db
return ft:search($dbname, $terms, map { 'fuzzy': $fuzzy })/..
</syntaxhighlight>
|}
|-
| width='120' | '''Signatures'''
|{{Func|ft:contains|$input as item()*, $terms as item()*|xs:boolean}}<br/>{{Func|ft:contains|$input as item()*, $terms as item()*, $options as map(xs:string, item()*)?|xs:boolean}}
|-
| '''Summary'''
|
* Checks if {{Code|jack}} or {{Code|john}} occurs in the input string {{Code|John Doe}}:
<syntaxhighlight lang="xquery">
ft:contains("John Doe", ("jack", "john"), map { "mode": "any" })
</syntaxhighlight>
* Calls the function with stemming turned on and off:
<syntaxhighlight lang="xquery">
(true(), false()) ! ft:contains("Häuser", "Haus", map { 'stemming': ., 'language': 'de' })
</syntaxhighlight>
|}
==ft:mark==
 
{| width='100%'
|-
|-
| '''Summary'''
|Puts a marker element around the resulting {{Code|$nodes}} of a full-text index request.<br />The default name of the marker element is {{Code|mark}}. An alternative name can be chosen via the optional {{Code|$name}} argument.<br />Please note that:
* The full-text expression that computes the token positions must be specified as argument of the <code>ft:mark()</code> function, as all position information is lost in subsequent processing steps. You may need to specify more than one full-text expression if you want to use the function in a FLWOR expression, as shown in Example 2.
* The supplied node must be a [[Database Module#Database Node|Database Node]]. As shown in Example 3, {{Code|update}} or {{Code|transform}} can be utilized to convert a main-memory fragment to the required internal representation.
|-
| '''Examples'''
|'''Example 1''': The following query returns {{Code|&lt;XML&gt;&lt;mark&gt;hello&lt;/mark&gt; world&lt;/XML&gt;}}, if one text node of the database {{Code|DB}} has the value "hello world":
<syntaxhighlight lang="xquery">
ft:mark(db:open('DB')//*[text() contains text 'hello'])
</syntaxhighlight>
'''Example 2''': The following expression loops through the first ten full-text results and marks the results in a second expression:
<syntaxhighlight lang="xquery">
let $start := 1
let $end := 10
let $term := 'welcome'
for $ft in (db:open('DB')//*[text() contains text { $term }])[position() = $start to $end]
return element hit {
  ft:mark($ft[text() contains text { $term }])
}
</syntaxhighlight>
'''Example 3''': The following expression returns {{Code|&lt;xml&gt;hello &lt;b&gt;world&lt;/b&gt;&lt;/xml&gt;}}:
<syntaxhighlight lang="xquery">
copy $p := <xml>hello world</xml>
modify ()
return ft:mark($p[text() contains text 'world'], 'b')
</syntaxhighlight>
|}
==ft:extract==
 
{| width='100%'
|-
|
* The following query may return {{Code|&lt;XML&gt;...&lt;b&gt;hello&lt;/b&gt;...&lt;/XML&gt;}} if a text node of the database {{Code|DB}} contains the string "hello world":
<syntaxhighlight lang="xquery">
ft:extract(db:open('DB')//*[text() contains text 'hello'], 'b', 1)
</syntaxhighlight>
|}
==ft:score==
 
{| width='100%'
|-
| '''Examples'''
|Returns the number of occurrences for a single, specific index entry:
<syntaxhighlight lang="xquery">
(: the term to look up is supplied from outside the query :)
declare variable $term external;
let $token := ft:tokenize($term)
return number(ft:tokens('db', $token)[. = $token]/@count)
</syntaxhighlight>
|}
==ft:tokenize==
 
{| width='100%'
|-
| width='120' | '''Signatures'''
|{{Func|ft:tokenize|$string as xs:string?|xs:string*}}<br/>{{Func|ft:tokenize|$string as xs:string?, $options as map(xs:string, item()*)?|xs:string*}}
|-
| '''Summary'''
|Tokenizes the given {{Code|$string}}, using the current default full-text options or the {{Code|$options}} specified as second argument, and returns a sequence with the tokenized string. The following options are available:
* {{Code|case}}: determines how character case is processed. Allowed values are {{Code|insensitive}}, {{Code|sensitive}}, {{Code|upper}} and {{Code|lower}}. By default, search is case-insensitive.
* {{Code|diacritics}}: determines how diacritical characters are processed. Allowed values are {{Code|insensitive}} and {{Code|sensitive}}. By default, search is diacritics-insensitive.
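A minimal sketch of how these two options might be passed to {{Code|ft:tokenize}}. The input strings are illustrative only, and the results noted in the comments assume that all other full-text options are left at their defaults:
<syntaxhighlight lang="xquery">
(: default options, case-insensitive; expected tokens: "no", "doubt" :)
ft:tokenize("No Doubt"),
(: diacritics are preserved; expected token: "école" :)
ft:tokenize("École", map { 'diacritics': 'sensitive' })
</syntaxhighlight>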
==ft:normalize==
 
{| width='100%'
|-
| width='120' | '''Signatures'''
|{{Func|ft:normalize|$string as xs:string?|xs:string*}}<br/>{{Func|ft:normalize|$string as xs:string?, $options as map(xs:string, item()*)?|xs:string*}}
|-
| '''Summary'''
|Normalizes the given {{Code|$string}}, using the current default full-text options or the {{Code|$options}} specified as second argument. The function expects the same arguments as [[#ft:tokenize|ft:tokenize]].
|-
| '''Examples'''
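|A minimal sketch, assuming that all full-text options other than the one shown are left at their defaults; the input string is illustrative only:
<syntaxhighlight lang="xquery">
(: case is preserved, diacritics are removed; expected result: "Hauser am Meer" :)
ft:normalize("Häuser am Meer", map { 'case': 'sensitive' })
</syntaxhighlight>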
=Errors=
 
{{Mark|Updated with Version 9.0}}:
{| class="wikitable" width="100%"
=Changelog=
 
; Version 9.1
* Updated: [[#ft:tokenize|ft:tokenize]] and [[#ft:normalize|ft:normalize]] can be called with empty sequence.
;Version 9.0
* Updated: error codes updated; errors now use the module namespace
;Version 8.0