==Existing Indexes==
<p>Indexes can speed up queries by orders of magnitude.
Currently, four indexes exist:</p>
<ul>
<li> <b>Text Index</b>: This index speeds up text comparisons in predicates.</li>
<li> <b>Attribute Index</b>: This index speeds up attribute value comparisons in predicates.</li>
<li> <b>Full-Text Index</b>: Full-text queries are sped up by this index.</li>
<li> <b>Path Summary</b>: This index speeds up the resolution of location paths.</li>
</ul>
==Examples of using the indexes==
<p>Here are some examples of queries that are rewritten for index access:</p>
==Text Index:==
<ul>
<li><code>//node()[text() = 'Usability']</code></li>
<li><code>//div[p = 'Usability' or p = 'Testing']</code></li>
<li><code>path/to/relevant[text() = 'Usability Testing']/and/so/on</code></li>
</ul>
==Attribute Index:==
<ul>
<li><code>//node()[@align = 'right']</code></li>
<li><code>descendant::elem[@id = '1']</code></li>
<li><code>range/query[@id >= 1 and @id <= 5]</code></li>
</ul>
==Full-Text Index:==
<ul>
<li><code>//node[text() contains text 'Usability']</code></li>
<li><code>//node[text() contains text 'Usebiliti' using fuzzy]</code></li>
<li><code>//book[chapter contains text ('web' ftor 'WWW' using no stemming)
ftand 'diversity' using stemming distance at most 5 words]</code></li>
</ul>
<p>The full-text index is optimized to support all features of the XQuery Full Text
Recommendation.</p>
<p>BaseX extends the specification by offering a fuzzy match option.
Fuzzy search is based on the Levenshtein algorithm; the longer
query terms are, the more errors will be tolerated.</p>
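<p>The length-dependent tolerance can be illustrated with a small Python sketch. It is not the BaseX implementation: the <code>max_errors</code> rule (one tolerated error per four characters) and all function names are assumptions chosen for illustration only.</p>

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insert, delete, substitute each cost 1)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete a character from a
                            curr[j - 1] + 1,      # insert a character into a
                            prev[j - 1] + cost))  # substitute (or keep) a character
        prev = curr
    return prev[-1]


def max_errors(term: str) -> int:
    # Hypothetical rule: longer terms tolerate more errors.
    return len(term) // 4


def fuzzy_match(query: str, token: str) -> bool:
    """True if token is within the tolerated edit distance of query."""
    return levenshtein(query.lower(), token.lower()) <= max_errors(query)
```

<p>Under this assumed rule, <code>fuzzy_match('Usebiliti', 'Usability')</code> succeeds because the nine-character query tolerates two errors, whereas a three-character term tolerates none.</p>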
<p>The default "Case Sensitivity", "Stemming" and "Diacritics" options
are taken into account when the index is created. Consequently, all queries
that use the default index options will be sped up.</p>
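<p>Why the options matter at creation time can be sketched in Python: tokens are normalized once when the index is built, so only queries that normalize their terms in the same way can use it. The sketch assumes case-insensitive and diacritics-insensitive defaults and omits stemming; the names <code>normalize</code>, <code>build_index</code> and <code>lookup</code> are hypothetical and do not reflect BaseX internals.</p>

```python
import unicodedata


def normalize(token: str) -> str:
    """Fold case and strip diacritics (the defaults assumed for this sketch)."""
    decomposed = unicodedata.normalize("NFD", token)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.lower()


def build_index(tokens):
    """Map each normalized token to the positions at which it occurs."""
    index = {}
    for pos, tok in enumerate(tokens):
        index.setdefault(normalize(tok), []).append(pos)
    return index


def lookup(index, term, use_defaults=True):
    """Return matching positions, or None if the index cannot answer the query."""
    if not use_defaults:
        # Differing options: the stored tokens no longer correspond to the
        # query terms, so the caller has to fall back to a sequential scan.
        return None
    return index.get(normalize(term), [])
```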
==Index data structures==
<ul>
<li><b>Text/Attribute Index</b><br/>
Both the text and attribute index are based on a balanced B-Tree
and support exact matches and range queries.</li>
<li><b>Full-Text Index (Standard)</b><br/>
The standard full-text index is implemented as a sorted array
structure. It is optimized for simple and fuzzy searches.</li>
<li><b>Full-Text Index (Wildcards enabled)</b><br/>
A second full-text index is implemented as a compressed trie.
It needs slightly more memory than the standard full-text index,
but it supports more features, such as full wildcard search.
</li>
</ul>
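<p>How an ordered structure answers both exact matches and range queries can be sketched with a sorted Python list and binary search. The list merely stands in for an ordered index; it is neither a B-tree nor the BaseX implementation, and the class <code>SortedIndex</code> and its methods are hypothetical.</p>

```python
from bisect import bisect_left, bisect_right


class SortedIndex:
    """Toy ordered index over (value, node_id) pairs."""

    def __init__(self, entries):
        # Sort once at build time; lookups then use binary search.
        self._entries = sorted(entries)
        self._keys = [value for value, _ in self._entries]

    def exact(self, value):
        """Node ids whose indexed value equals the given value."""
        lo = bisect_left(self._keys, value)
        hi = bisect_right(self._keys, value)
        return [node_id for _, node_id in self._entries[lo:hi]]

    def range(self, low, high):
        """Node ids whose indexed value lies in [low, high], both inclusive."""
        lo = bisect_left(self._keys, low)
        hi = bisect_right(self._keys, high)
        return [node_id for _, node_id in self._entries[lo:hi]]
```

<p>A predicate such as <code>@id = '1'</code> corresponds to <code>exact</code>, and a two-sided comparison on <code>@id</code> corresponds to <code>range</code>. Both locate their result with two binary searches instead of visiting every node.</p>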