Changes

467 bytes added, 10:41, 25 April 2022
</syntaxhighlight>
Please note that scoring propagation was removed with Version 9.5. Scoring is still supported within full-text expressions, by {{Function|Full-Text|ft:search}}, and by simple predicate tests that can be rewritten to {{Function|Full-Text|ft:search}}:
<syntaxhighlight lang="xquery">
let $string := 'a b'
return ft:score($string contains text 'a' ftand 'b'),

for $n score $s in ft:search('factbook', 'orthodox')
order by $s descending
return $s || ': ' || $n,

for $n score $s in db:open('factbook')//text()[. contains text 'orthodox']
order by $s descending
return $s || ': ' || $n
</syntaxhighlight>
==Thesaurus==
One or more thesaurus files can be specified in a full-text expression. The following query returns {{Code|false}}:
<syntaxhighlight lang="xquery">
'hardware' contains text 'computers'
  using thesaurus default
</syntaxhighlight>
If a thesaurus is employed…
<syntaxhighlight lang="xml">
<thesaurus xmlns="http://www.w3.org/2007/xqftts/thesaurus">
  <entry>
    <term>computers</term>
    <synonym>
      <term>hardware</term>
      <relationship>NT</relationship>
    </synonym>
  </entry>
</thesaurus>
</syntaxhighlight>
…the result will be {{Code|true}}:
<syntaxhighlight lang="xquery">
'hardware' contains text 'computers'
  using thesaurus at 'thesaurus.xml'
</syntaxhighlight>
Thesaurus files must comply with the [https://dev.w3.org/2007/xpath-full-text-10-test-suite/TestSuiteStagingArea/TestSources/thesaurus.xsd XSD Schema] of the XQFT Test Suite (but the namespace can be omitted). Apart from the relationships defined in [https://www.iso.org/standard/7776.html ISO 2788] (NT: narrower term, RT: related term, etc.), custom relationships can be used. The type of relationship and the level depth can be specified as well:
<syntaxhighlight lang="xquery">
(: BT: find broader terms; NT means narrower term :)
'computers' contains text 'hardware'
  using thesaurus at 'x.xml' relationship 'BT' from 1 to 10 levels
</syntaxhighlight>
More details can be found in the [https://www.w3.org/TR/xpath-full-text-10/#ftthesaurusoption specification].
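Since thesaurus files are expected to comply with the XSD Schema of the XQFT Test Suite, a file can be checked before it is referenced in a query. The following is a minimal sketch that uses {{Code|validate:xsd-info}} from the Validation Module. The paths {{Code|thesaurus.xml}} and {{Code|thesaurus.xsd}} are assumptions: they stand for your thesaurus file and a local copy of the schema.

<syntaxhighlight lang="xquery">
(: Hypothetical local paths; adjust them to your setup :)
let $thesaurus := 'thesaurus.xml'
let $schema    := 'thesaurus.xsd'
(: validate:xsd-info returns one string per problem, or an empty sequence :)
let $problems  := validate:xsd-info($thesaurus, $schema)
return if (empty($problems))
  then 'Thesaurus file is valid.'
  else $problems
</syntaxhighlight>

A file that omits the namespace is likely to be reported as invalid by such a strict check, even though it is accepted as a thesaurus.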
==Fuzzy Querying==
</syntaxhighlight>
Fuzzy search is based on the Levenshtein distance. The maximum number of allowed errors is calculated by dividing the token length of a specified query term by 4, preserving a minimum of 1 error. The query above yields two results as there is no error between the query term “house” and the text node “house”, and one error between “house” and “hous”. A user-defined value can be set globally via the {{Option|LSERROR}} option (default: <code>SET LSERROR 0</code>) or via an additional argument:
<syntaxhighlight lang="xquery">
//a[text() contains text 'house' using fuzzy 3 errors]
</syntaxhighlight>
Fuzzy search is also supported by the full-text index.
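The default error limit can be illustrated with a small calculation. This is a minimal sketch, assuming that the division by 4 is an integer division and that at least one error is always allowed; {{Code|local:max-errors}} is a hypothetical helper and not part of BaseX.

<syntaxhighlight lang="xquery">
(: Hypothetical helper: token length divided by 4, but never less than 1 :)
declare function local:max-errors($term as xs:string) as xs:integer {
  max((1, string-length($term) idiv 4))
};

for $term in ('hous', 'house', 'computers')
return $term || ': ' || local:max-errors($term)
(: under these assumptions: 'hous: 1', 'house: 1', 'computers: 2' :)
</syntaxhighlight>

Under these assumptions, a term needs at least eight characters before a second error is tolerated, which is why an explicit value such as {{Code|using fuzzy 3 errors}} can be useful for short terms.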
=Mixed Content=
=Changelog=
 
; Version 9.6:
* Updated: [[#Fuzzy_Querying|Fuzzy Querying]]: Specify Levenshtein error

; Version 9.5:
* Removed: Scoring propagation.

; Version 9.2:
* Added: Arabic stemmer.

; Version 8.0:
* Updated: [[#Scoring|Scores]] will be propagated by the {{Code|and}} and {{Code|or}} expressions and in predicates.

; Version 7.7:
* Added: [[#Collations|Collations]] support.

; Version 7.3:
* Removed: Trie index, which was specialized on wildcard queries. The fuzzy index now supports both wildcard and fuzzy queries.
* Removed: TF/IDF scoring was discarded in favor of the internal scoring model.