Changes

1,041 bytes added, 15:56, 14 February 2017
==Name Index==
The name index contains references to the names of all elements and attributes in a database. It contains some basic statistical information, such as the number of occurrences of a name.
The name index is applied, for example, to discard location steps that will never yield results:
<pre class="brush:xquery">
(: will be rewritten to an empty sequence :)
/non-existing-name
</pre>
The contents of the name indexes can be directly accessed with the XQuery functions [[Index Module#index:element-names|index:element-names]] and [[Index Module#index:attribute-names|index:attribute-names]].
 
If a database is updated, new names will be added incrementally, but the statistical information will become outdated.
==Path Index==
The path index (also called ''path summary'' or ''data guide'') stores all distinct paths of the documents in the database. It contains additional statistical information, such as the number of occurrences of a path, its distinct string values, and the minimum/maximum of numeric values. The maximum number of distinct values to store per name can be changed via {{Option|MAXCATS}}.
Since {{Version|8.6}}, the distinct values are also stored for elements and attributes of numeric type.
Various queries will be evaluated much faster if an up-to-date path index is available (as can be observed when opening the [[GUI#Visualizations|Info View]]):
* Descendant steps will be rewritten to multiple child steps. Child steps are evaluated faster, as fewer nodes have to be traversed:
<pre class="brush:xquery">
</pre>
* The {{Code|fn:count}} function will be pre-evaluated by looking up the number in the index:
<pre class="brush:xquery">
count( doc('factbook')//country )
</pre>
* The distinct values of elements or attributes can be looked up in the index as well:
<pre class="brush:xquery">
distinct-values( db:open('factbook')//religions )
</pre>
The contents of the path index can be directly accessed with the XQuery function [[Index Module#index:facets|index:facets]].
 
If a database is updated, the statistics in the path index will be invalidated.
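As a rough sketch of how these optimizations might be observed, the query below first lists the path index and then evaluates a descendant step next to its child-step equivalent. It assumes a database named {{Code|factbook}} in which all {{Code|province}} elements are located at {{Code|/mondial/country/province}}; this structure is an assumption for illustration and is not stated above.

<pre class="brush:xquery">
(: list the paths and statistics stored in the path index :)
index:facets('factbook'),
(: descendant step, which may be rewritten with an up-to-date path index :)
count( db:open('factbook')//province ),
(: equivalent child steps (assumed document structure) :)
count( db:open('factbook')/mondial/country/province )
</pre>

If the assumed structure holds, both {{Code|count}} expressions return the same number. The Info View can be used to check which rewritings were actually applied.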
==Document Index==
Matching text nodes can be directly requested from the index with the XQuery function {{Function|Database|db:text}}. The index contents can be accessed via {{Function|Index|index:text}}.
The {{Option|UPDINDEX}} option can be activated to keep this index up-to-date, for example:
<pre class="brush:xquery">
db:optimize(
  'mydb',
  true(),
  map { 'updindex': true(), 'textindex': true(), 'textinclude': 'id' }
)
</pre>
===Range Queries===
Attribute nodes can directly be retrieved from the index with the XQuery functions {{Function|Database|db:attribute}} and {{Function|Database|db:attribute-range}}. The index contents can be accessed with {{Function|Index|index:attributes}}.
The {{Option|UPDINDEX}} option can be activated to keep this index up-to-date.
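A minimal sketch of direct index access, assuming a database named {{Code|mydb}} with an existing attribute index. The values {{Code|id0}} and {{Code|id9}} are hypothetical placeholders.

<pre class="brush:xquery">
(: exact lookup: attribute nodes with the string value 'id0' :)
db:attribute('mydb', 'id0'),
(: range lookup: attribute nodes with values between 'id0' and 'id9' :)
db:attribute-range('mydb', 'id0', 'id9'),
(: list the entries of the attribute index :)
index:attributes('mydb')
</pre>

The range lookup compares the bounds as strings. The functions raise an error if the attribute index does not exist.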
==Token Index==
The index provides support for the following full-text features (the values can be changed in the GUI or via the {{Command|SET}} command):
* '''Stemming''': tokens are stemmed before being indexed (see option: {{Option|STEMMING}})
* '''Case Sensitive''': tokens are indexed in case-sensitive mode (see option: {{Option|CASESENS}})
* '''Diacritics''': diacritics are indexed as well (see option: {{Option|DIACRITICS}})
* '''Stopword List''': a stop word list can be defined to reduce the number of indexed tokens (see option: {{Option|STOPWORDS}})
* '''Language''': see [[Full-Text#Languages|Languages]] for more details (see option: {{Option|LANGUAGE}})
The options that have been used for creating the full-text index will also be applied to the optimized full-text queries. However, the defaults can be overridden if you supply options in your query. For example, if words were stemmed in the index, and if the query can be rewritten for index access, the query terms will be stemmed as well, unless stemming is explicitly disabled. This is demonstrated in the following [[Commands#Command_Scripts|Command Script]]:
=Updates=
Generally, update operations are very fast in BaseX. By default, the index structures will be invalidated by updates; as a result, queries that benefit from index structures may slow down after updates. There are different alternatives to cope with this:
* After the execution of one or more update operations, the {{Command|OPTIMIZE}} command or the {{Function|Database|db:optimize}} function can be called to rebuild the index structures.
* The {{Option|UPDINDEX}} option can be activated before creating or optimizing the database. As a result, the text, attribute and token indexes will be incrementally updated after each database update. Please note that incremental updates are not available for the full-text index and database statistics. This also explains why the UPTODATE flag, which is e.g. displayed via {{Command|INFO DB}} or {{Function|Database|db:info}}, will be set to {{Code|false}} until the database is optimized again. While the flag is {{Code|false}}, various optimizations won’t be triggered; for example, {{Code|count(//item)}} can be extremely fast if all metadata is up-to-date.
* The {{Option|AUTOOPTIMIZE}} option can be enabled before creating or optimizing the database. All outdated index structures and statistics will then be recreated after each database update. This option should only be used for small and medium-sized databases.
* Both options can be used side by side: {{Option|UPDINDEX}} will take care that the value index structures will be updated as part of the actual update operation. {{Option|AUTOOPTIMIZE}} will update the remaining data structures (full-text index, database statistics).
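
The first alternative could look like the following sketch, which assumes a database named {{Code|mydb}}. The inserted {{Code|item}} element is only a placeholder update.

<pre class="brush:xquery">
(: placeholder update that invalidates the index structures :)
insert node <item/> into db:open('mydb')/*,
(: rebuild outdated index structures and statistics in the same query :)
db:optimize('mydb')
</pre>

Both expressions are updating, so they can be combined in one query. Running {{Code|db:info('mydb')}} as a separate query afterwards shows the UPTODATE flag.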
=Changelog=
 
;Version 8.4
 
* Updated: [[#Name Index|Name Index]], [[#Path Index|Path Index]]
;Version 8.4