Difference between revisions of "String Module"
(6 intermediate revisions by the same user not shown) | |||
Line 3: | Line 3: | ||
=Conventions= | =Conventions= | ||
− | All functions in this module and errors are assigned to the <code><nowiki>http://basex.org/modules/strings</nowiki></code> namespace, which is statically bound to the {{Code|strings}} prefix.<br/> | + | All functions and errors in this module and errors are assigned to the <code><nowiki>http://basex.org/modules/strings</nowiki></code> namespace, which is statically bound to the {{Code|strings}} prefix.<br/> |
=Functions= | =Functions= | ||
Line 15: | Line 15: | ||
|- | |- | ||
| '''Summary''' | | '''Summary''' | ||
− | |Computes the [https://en.wikipedia.org/wiki/Damerau%E2%80%93Levenshtein_distance Damerau-Levenshtein Distance] for two strings and returns a double value ({{Code|0.0}} - {{Code|1.0}}). The | + | |Computes the [https://en.wikipedia.org/wiki/Damerau%E2%80%93Levenshtein_distance Damerau-Levenshtein Distance] for two strings and returns a double value ({{Code|0.0}} - {{Code|1.0}}). The returned value is computed as follows:<br/> |
− | * <code>1.0 | + | * <code>1.0</code> – distance / max(length of strings) |
* <code>1.0</code> is returned if the strings are equal; <code>0.0</code> is returned if the strings are too different. | * <code>1.0</code> is returned if the strings are equal; <code>0.0</code> is returned if the strings are too different. | ||
|- | |- | ||
Line 25: | Line 25: | ||
* In the following query, the input is first normalized (words are stemmed, converted to lower case, and diacritics are removed). It returns {{Code|1}}: | * In the following query, the input is first normalized (words are stemmed, converted to lower case, and diacritics are removed). It returns {{Code|1}}: | ||
<pre class="brush:xquery"> | <pre class="brush:xquery"> | ||
− | let $norm := | + | let $norm := ft:normalize(?, map { 'stemming': true() }) |
return strings:levenshtein($norm("HOUSES"), $norm("house")) | return strings:levenshtein($norm("HOUSES"), $norm("house")) | ||
</pre> | </pre> | ||
Line 54: | Line 54: | ||
|- | |- | ||
| '''Summary''' | | '''Summary''' | ||
− | |Computes the [https://de.wikipedia.org/wiki/K%C3%B6lner_Phonetik Kölner Phonetik] value for the specified string | + | |Computes the [https://de.wikipedia.org/wiki/K%C3%B6lner_Phonetik Kölner Phonetik] value for the specified string. Similar to Soundex, the algorithm is used to find similarly pronounced words, but for the German language. As the first returned digit can be {{Code|0}}, the value is returned as string. |
|- | |- | ||
| '''Examples''' | | '''Examples''' | ||
Line 65: | Line 65: | ||
The Module was introduced with Version 8.3. | The Module was introduced with Version 8.3. | ||
− | |||
− |
Revision as of 17:52, 21 November 2017
This XQuery Module contains functions for string computations.
Conventions
All functions and errors in this module are assigned to the http://basex.org/modules/strings namespace, which is statically bound to the strings prefix.
Functions
strings:levenshtein
Signatures | strings:levenshtein($string1 as xs:string, $string2 as xs:string) as xs:double |
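Summary | Computes the Damerau-Levenshtein Distance for two strings and returns a double value (0.0 - 1.0). The returned value is computed as follows:
* 1.0 - distance / max(length of strings)
* 1.0 is returned if the strings are equal; 0.0 is returned if the strings are too different.
|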
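Examples |
* In the following query, the input is first normalized (words are stemmed, converted to lower case, and diacritics are removed). It returns 1:
let $norm := ft:normalize(?, map { 'stemming': true() })
return strings:levenshtein($norm("HOUSES"), $norm("house"))
|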
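The scoring rule in the summary (1.0 minus the distance divided by the length of the longer string) can be sketched outside of BaseX. The Python below is only an illustration under stated assumptions: the names osa_distance and similarity are hypothetical, the distance is the restricted "optimal string alignment" variant of Damerau-Levenshtein, and no cutoff for strings that are "too different" is applied, because the page does not say which variant or cutoff BaseX uses.

```python
def osa_distance(a: str, b: str) -> int:
    """Restricted Damerau-Levenshtein (optimal string alignment) distance.

    Assumption: this variant is chosen for brevity; BaseX may use another one.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # d[i][j] holds the number of edits needed to turn a[:i] into b[:j].
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Transposition of two adjacent characters counts as one edit.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def similarity(a: str, b: str) -> float:
    """Apply the formula from the summary: 1.0 - distance / max(length)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are treated as equal
    return 1.0 - osa_distance(a, b) / longest


if __name__ == "__main__":
    print(similarity("house", "house"))
    print(similarity("abcd", "abdc"))
```

With this sketch, identical strings score 1.0, and a single swapped pair of neighbouring characters in a four-character string costs one edit, giving 0.75.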
strings:soundex
Signatures | strings:soundex($string as xs:string) as xs:string |
Summary | Computes the Soundex value for the specified string. The algorithm can be used to find and index English words with similar pronunciation. |
Examples |
|
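To show what a Soundex value looks like, the following Python sketch implements the commonly described American Soundex rules: keep the first letter, encode the remaining consonants as digits, skip repeated codes, and pad to four characters. It is an illustration only. The function name soundex and the handling of non-ASCII input (such characters are dropped) are assumptions, and BaseX's implementation may differ in edge cases.

```python
# Digit codes for consonants; vowels, H, W and Y carry no code.
CODES = {}
for group, digit in (
    ("BFPV", "1"),
    ("CGJKQSXZ", "2"),
    ("DT", "3"),
    ("L", "4"),
    ("MN", "5"),
    ("R", "6"),
):
    for letter in group:
        CODES[letter] = digit


def soundex(word: str) -> str:
    """Return a four-character Soundex key, or "" if no ASCII letter is present."""
    letters = [c for c in word.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    result = letters[0]
    previous = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in ("H", "W"):
            continue  # H and W do not separate two letters with the same code
        code = CODES.get(ch, "")
        if code and code != previous:
            result += code
        previous = code  # a vowel resets this, so a later repeat is coded again
    return (result + "000")[:4]


if __name__ == "__main__":
    for name in ("Robert", "Rupert", "Ashcraft"):
        print(name, soundex(name))
```

Names that sound alike, such as "Robert" and "Rupert", end up with the same key (R163 in this sketch), which is what makes the value usable as an index for similarly pronounced words.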
strings:cologne-phonetic
Signatures | strings:cologne-phonetic($string as xs:string) as xs:string |
Summary | Computes the Kölner Phonetik value for the specified string. Similar to Soundex, the algorithm is used to find similarly pronounced words, but for the German language. As the first returned digit can be 0, the value is returned as a string.
|
Examples |
|
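The Kölner Phonetik rules can be sketched in the same way. The Python below follows the rule table as it is commonly published: each letter gets a context-dependent digit, consecutive duplicate digits are collapsed, and every 0 except a leading one is removed. The function name cologne_phonetic and the folding of umlauts to their base vowels are assumptions made for this sketch; it is not BaseX's implementation and may differ from it in edge cases.

```python
UMLAUTS = str.maketrans({"Ä": "A", "Ö": "O", "Ü": "U"})


def cologne_phonetic(word: str) -> str:
    """Return the Kölner Phonetik digit string for a word (illustrative sketch)."""
    # upper() already turns "ß" into "SS"; everything outside A-Z is dropped.
    letters = [c for c in word.upper().translate(UMLAUTS) if "A" <= c <= "Z"]
    codes = []
    for i, ch in enumerate(letters):
        prev = letters[i - 1] if i > 0 else ""
        nxt = letters[i + 1] if i + 1 < len(letters) else ""
        if ch in set("AEIJOUY"):
            code = "0"
        elif ch == "H":
            code = ""  # H is ignored
        elif ch == "B":
            code = "1"
        elif ch == "P":
            code = "3" if nxt == "H" else "1"
        elif ch in set("DT"):
            code = "8" if nxt in set("CSZ") else "2"
        elif ch in set("FVW"):
            code = "3"
        elif ch in set("GKQ"):
            code = "4"
        elif ch == "C":
            if i == 0:
                code = "4" if nxt in set("AHKLOQRUX") else "8"
            elif prev in set("SZ"):
                code = "8"
            else:
                code = "4" if nxt in set("AHKOQUX") else "8"
        elif ch == "X":
            code = "8" if prev in set("CKQ") else "48"
        elif ch == "L":
            code = "5"
        elif ch in set("MN"):
            code = "6"
        elif ch == "R":
            code = "7"
        else:
            code = "8"  # S and Z
        codes.append(code)
    # Collapse runs of the same digit.
    collapsed = ""
    for digit in "".join(codes):
        if not collapsed or collapsed[-1] != digit:
            collapsed += digit
    # Remove every 0 except one in the first position.
    return collapsed[:1] + collapsed[1:].replace("0", "")


if __name__ == "__main__":
    for name in ("Wikipedia", "Otto"):
        print(name, cologne_phonetic(name))
```

In this sketch "Otto" yields 02. The leading zero would be lost in a numeric type, which illustrates why the value is returned as a string.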
Changelog
The Module was introduced with Version 8.3.