460 bytes removed ,  16:55, 10 April 2019
This page is provided to help those who are interested in the specific file format of the index files used by BaseX. It was predominantly written to aid research into the reasons for the ever-increasing file size when using the {{Option|UPDINDEX}} option.
== Attribute Index Files ==
<span style="background-color:#F4A460"><code>[02] 0B 03</code></span>
The header tells us that there are 4 attribute values, but we can see that there are 5 ID lists in the file. One has become orphaned: a new, longer list was required to include the newly added attribute, and it has been appended to the end of the file.
In versions of BaseX prior to 8.0, when items are deleted and a shorter list is required, the list is updated in place. When items are added and a longer list is required, the new list is always added at the end of the file. Over a period of time the file will grow; running the [[Commands#OPTIMIZE|OPTIMIZE]] command will recreate the index from scratch and recover the lost space.

From BaseX 8.0, some optimisations have been applied: while a database is open, a list of free spaces is maintained, and a new list will only be added to the end of the file if there isn't a free space available that is large enough. However, this list of free spaces is lost when the database is closed, so operations performed after the database is reopened will not be aware of any free space left by earlier sessions. This, and the fact that small spaces (single bytes, for example) are unlikely to be filled, mean that the index file may still grow larger than it needs to be. This space can be recovered, as before, by running [[Commands#OPTIMIZE|OPTIMIZE]].
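The growth behaviour described above can be illustrated with a small simulation. This is only a sketch of the allocation policy as this page describes it, not BaseX's actual implementation: the class <code>IndexFile</code>, its methods and the <code>reuse_free_space</code> flag are hypothetical names invented for the example, and it only tracks offsets and lengths rather than real bytes.

```python
class IndexFile:
    """Toy model of an index file that stores variable-length ID lists.

    Hypothetical sketch: models only the allocation policy described
    on this page, not the real BaseX file format or code.
    """

    def __init__(self, reuse_free_space):
        self.size = 0                  # current file length in bytes
        self.lists = {}                # key -> (offset, length) of its ID list
        self.free = []                 # in-memory list of (offset, length) gaps
        self.reuse = reuse_free_space  # False: pre-8.0 policy, True: 8.0+ policy

    def _allocate(self, length):
        """Return an offset for a new list of the given length."""
        if self.reuse:
            for i, (off, gap) in enumerate(self.free):
                if gap >= length:
                    if gap == length:
                        del self.free[i]
                    else:
                        # Keep the unused tail of the gap as a smaller gap.
                        self.free[i] = (off + length, gap - length)
                    return off
        # No suitable gap (or gaps are not tracked): append to the file.
        off = self.size
        self.size += length
        return off

    def write(self, key, length):
        """Store or replace the ID list for key."""
        old = self.lists.get(key)
        if old is not None and length <= old[1]:
            # Shorter (or equal) list: updated in place.
            self.lists[key] = (old[0], length)
            return
        self.lists[key] = (self._allocate(length), length)
        if old is not None and self.reuse:
            # The old, shorter list is orphaned; remember it as free space.
            self.free.append(old)

    def reopen(self):
        """Closing the database discards the in-memory free list."""
        self.free = []


if __name__ == "__main__":
    f = IndexFile(reuse_free_space=True)
    f.write("a", 4)
    f.write("b", 4)
    f.write("a", 6)   # longer list: appended, old 4-byte slot becomes free
    f.write("c", 3)   # fits into the freed slot, file does not grow
    f.reopen()        # the remaining 1-byte gap is forgotten
    f.write("d", 1)   # appended at the end although a gap existed
    print(f.size, f.lists)
```

With <code>reuse_free_space=False</code> every longer list is appended, so the same sequence of writes produces a larger file. In both modes the orphaned space is only fully recovered by rebuilding the index, which is what [[Commands#OPTIMIZE|OPTIMIZE]] does.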
== Value Index Files ==
These files, <code>txtr.basex</code> and <code>txtl.basex</code>, work in the same way as the attribute index files, but with references to text nodes instead of attributes.