Changes

Jump to navigation Jump to search
79 bytes removed ,  16:58, 10 April 2019
no edit summary
</pre>
Fuzzy search is based on the Levenshtein distance. The maximum number of allowed errors is calculated by dividing the token length of a specified query term by 4, preserving a minimum of 1 errors. A static error distance can be set by adjusting the <code>[[Options#LSERROR{{Option|LSERROR]]</code> property }} option (default: <code>SET LSERROR 0</code>). The query above yields two results as there is no error between the query term “house” and the text node “house”, and one error between “house” and “hous”.
Fuzzy search is also supported by the full-text index.
To enable this kind of searches, it is recommendable to:
* Turn off ''whitespace chopping'' when importing XML documents. This can be done by setting the option <code>[[Options#CHOP{{Option|CHOP]]</code> }} to <code>OFF</code>. This can also be done in the GUI if a new database is created (''Database'' → ''New…'' → ''Parsing'' → ''Chop Whitespaces'').* Turn off automatic indentation by assigning <code>indent=no</code> to the <code>[[Options#SERIALIZER{{Option|SERIALIZER]]</code> }} option.
A query such as <code>//p[. contains text 'real text']</code> will then match the example paragraph above. However, the full-text index will '''not''' be used in this query, so it may take a long time. The full-text index would be used for the query <code>//p[text() contains text 'real text']</code>, but this query will not find the example paragraph, because the matching text is split over two text nodes.
Bureaucrats, editor, reviewer, Administrators
13,550

edits

Navigation menu