Changes

Full-Text (edit)

Revision as of 12:21, 1 June 2021

1,039 bytes added, 12:21, 1 June 2021

let $string := 'a b'

return ft:score($string contains text 'a' and $string contains text 'b'),

for $n score $s in db:open('factbook')//religions[text() contains text 'orthodox']

order by $s descending
return $s || ': ' || $n

</syntaxhighlight>

Scoring is still supported within full-text expressions, by {{Function|Full-Text|ft:search}}, and by simple predicate tests that can be rewritten to {{Function|Full-Text|ft:search}}:

let $string := 'a b'

return ft:score($string contains text 'a' ftand 'b'),

for $n score $s in ft:search('factbook', 'orthodox')

order by $s descending
return $s || ': ' || $n,

for $n score $s in db:open('factbook')//text()[. contains text 'orthodox']
order by $s descending
return $s || ': ' || $n

</syntaxhighlight>

==Thesaurus==

One or more thesaurus files can be specified in a full-text expression. The following query returns {{Code|false}}:

'hardware' contains text 'computers'

using thesaurus default

</syntaxhighlight>

If a thesaurus is employed…

<syntaxhighlight lang="xml">
<thesaurus xmlns="http://www.w3.org/2007/xqftts/thesaurus">
  <entry>
    <term>computers</term>
    <synonym>
      <term>hardware</term>
      <relationship>NT</relationship>
    </synonym>
  </entry>
</thesaurus>
</syntaxhighlight>

…the result will be {{Code|true}}:

'hardware' contains text 'computers'

using thesaurus at 'thesaurus.xml'

</syntaxhighlight>

Thesaurus files must comply with the [https://dev.w3.org/2007/xpath-full-text-10-test-suite/TestSuiteStagingArea/TestSources/thesaurus.xsd XSD Schema] of the XQFT Test Suite (but the namespace can be omitted). Apart from the relationships defined in [https://www.iso.org/standard/7776.html ISO 2788] (NT: narrower term, RT: related term, etc.), custom relationships can be used.

The type of relationship and the level depth can be specified as well:

(: BT: find broader terms; NT means narrower term :)

'computers' contains text 'hardware'

using thesaurus at 'x.xml' relationship 'BT' from 1 to 10 levels

</syntaxhighlight>
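The lookup that a thesaurus enables can be pictured with a small Python sketch. It is only an illustration of the file structure and of filtering by relationship, not a description of how BaseX evaluates the option: the function name <code>expand</code> and the embedded sample file are assumptions made for this example, and level depth is deliberately ignored.

```python
import xml.etree.ElementTree as ET

# Namespace used by the XQFT Test Suite thesaurus format.
NS = {"t": "http://www.w3.org/2007/xqftts/thesaurus"}

# Sample thesaurus: "hardware" is listed as a narrower term (NT) of "computers".
THESAURUS_XML = """
<thesaurus xmlns="http://www.w3.org/2007/xqftts/thesaurus">
  <entry>
    <term>computers</term>
    <synonym>
      <term>hardware</term>
      <relationship>NT</relationship>
    </synonym>
  </entry>
</thesaurus>
"""


def expand(term, xml_text=THESAURUS_XML, relationship=None):
    """Return the term plus the synonyms listed in its own entry.

    Hypothetical helper: if `relationship` is given, only synonyms with
    exactly that relationship are added. Levels are not modelled.
    """
    root = ET.fromstring(xml_text.strip())
    terms = {term}
    for entry in root.findall("t:entry", NS):
        if entry.findtext("t:term", namespaces=NS) != term:
            continue
        for synonym in entry.findall("t:synonym", NS):
            rel = synonym.findtext("t:relationship", namespaces=NS)
            if relationship is None or rel == relationship:
                terms.add(synonym.findtext("t:term", namespaces=NS))
    return terms


if __name__ == "__main__":
    print(sorted(expand("computers")))
    print(sorted(expand("computers", relationship="BT")))
```

In this simplified model, the query term "computers" is expanded to also match "hardware", which mirrors why the earlier query with a thesaurus returns {{Code|true}}.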

More details can be found in the [https://www.w3.org/TR/xpath-full-text-10/#ftthesaurusoption specification].

==Fuzzy Querying==

</syntaxhighlight>

Fuzzy search is based on the Levenshtein distance. The maximum number of allowed errors is calculated by dividing the token length of a specified query term by 4, preserving a minimum of 1 error. The query above yields two results, as there is no error between the query term “house” and the text node “house”, and one error between “house” and “hous”.

A user-defined value can be adjusted globally via the {{Option|LSERROR}} option or, since {{Version|9.6}}, via an additional argument:

<syntaxhighlight lang="xquery">
//a[text() contains text 'house' using fuzzy 3 errors]
</syntaxhighlight>
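The error rule described in this section can be sketched in Python. This is a minimal illustration, not BaseX code: the function names are hypothetical, and the use of integer division for "dividing by 4" is an assumption.

```python
from typing import Optional


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def max_errors(term: str, errors: Optional[int] = None) -> int:
    """Allowed errors: explicit value if given, else length / 4, at least 1.

    Assumption: the division is rounded down (integer division).
    """
    if errors is not None:
        return errors
    return max(1, len(term) // 4)


def fuzzy_match(term: str, token: str, errors: Optional[int] = None) -> bool:
    """True if the token lies within the allowed distance of the query term."""
    return levenshtein(term, token) <= max_errors(term, errors)


if __name__ == "__main__":
    for token in ("house", "hous", "horses"):
        print(token, fuzzy_match("house", token))
```

With the default rule, the five-letter term "house" tolerates one error, so "house" and "hous" match while "horses" (two edits away) does not; passing an explicit value such as <code>errors=3</code> corresponds to the user-defined setting.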

=Mixed Content=

=Changelog=

; Version 9.6:

* Updated: [[#Fuzzy_Querying|Fuzzy Querying]]: Specify Levenshtein error

; Version 9.5:

* Removed: Scoring propagation.

; Version 9.2:

* Added: Arabic stemmer.

; Version 8.0:

* Updated: [[#Scoring|Scores]] will be propagated by the {{Code|and}} and {{Code|or}} expressions and in predicates.

; Version 7.7:

* Added: [[#Collations|Collations]] support.

; Version 7.3:

* Removed: Trie index, which was specialized on wildcard queries. The fuzzy index now supports both wildcard and fuzzy queries.

* Removed: TF/IDF scoring was discarded in favor of the internal scoring model.

CG (Bureaucrats, editor, reviewer, Administrators; 13,550 edits)
