Difference between revisions of "Storage Layout"
Revision as of 18:59, 26 October 2011
Data Types
* Num: compressed integer (1-5 bytes)
* Token: length (Num) followed by the bytes of the UTF-8 representation
* Double: number, stored as a Token
* Boolean: boolean (1 byte, 00 or 01)
* TokenSet: key array (Tokens), next/bucket/size arrays (Nums)
* Nums, Tokens and Doubles are arrays of values, each prefixed with the number of entries (Num)
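
The types above can be read with a few small helpers. The following is a minimal sketch, not the BaseX implementation: the class name `StorageTypes` and its method names are hypothetical, and the bit layout of Num (the two high bits of the first byte selecting a total length of 1, 2, 4 or 5 bytes) is an assumption, since the text only states that a Num takes 1-5 bytes. It also assumes that a Double token holds the decimal string form of the number.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical readers for the primitive storage types (sketch only). */
final class StorageTypes {
  private StorageTypes() { }

  /** Reads one unsigned byte and fails at the end of the stream. */
  static int readByte(InputStream in) throws IOException {
    int b = in.read();
    if (b < 0) throw new EOFException();
    return b;
  }

  /**
   * Reads a Num. ASSUMED layout: the two high bits of the first byte
   * give the total length (00: 1 byte, 01: 2 bytes, 10: 4 bytes, 11: 5 bytes).
   */
  static int readNum(InputStream in) throws IOException {
    int v = readByte(in);
    switch (v >>> 6) {
      case 0: return v;
      case 1: return (v & 0x3F) << 8 | readByte(in);
      case 2: return (v & 0x3F) << 24 | readByte(in) << 16
          | readByte(in) << 8 | readByte(in);
      default: return readByte(in) << 24 | readByte(in) << 16
          | readByte(in) << 8 | readByte(in);
    }
  }

  /** Reads a Token: length (Num), then that many UTF-8 bytes. */
  static String readToken(InputStream in) throws IOException {
    byte[] bytes = new byte[readNum(in)];
    for (int i = 0; i < bytes.length; i++) bytes[i] = (byte) readByte(in);
    return new String(bytes, StandardCharsets.UTF_8);
  }

  /** Reads a Double, ASSUMING the token holds its decimal string form. */
  static double readDouble(InputStream in) throws IOException {
    return Double.parseDouble(readToken(in));
  }

  /** Reads a Boolean: a single byte, 00 or 01. */
  static boolean readBoolean(InputStream in) throws IOException {
    return readByte(in) == 1;
  }

  /** Reads Nums: number of entries (Num), then the entries themselves. */
  static int[] readNums(InputStream in) throws IOException {
    int[] values = new int[readNum(in)];
    for (int i = 0; i < values.length; i++) values[i] = readNum(in);
    return values;
  }

  public static void main(String[] args) throws IOException {
    // Token "hi", Boolean true, then the two-byte Num 0x41 0x00.
    byte[] data = {0x02, 'h', 'i', 0x01, 0x41, 0x00};
    InputStream in = new ByteArrayInputStream(data);
    System.out.println(readToken(in));   // hi
    System.out.println(readBoolean(in)); // true
    System.out.println(readNum(in));     // 256
  }
}
```

Under the assumed layout, small values such as name references or child counts occupy a single byte, which is the point of a compressed integer type.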
inf.basex
Contents: Meta information on a database and main memory indexes.
| Description | Format | Method |
|---|---|---|
| 1. Meta Data | Key/value pairs, suffixed by empty key (Token/Token):<br />• PERM → User Permissions | DiskData()<br />MetaData()<br />Users() |
| 2. Main memory indexes | Key/value pairs, suffixed by empty key (Token/Token):<br />• TAGS → Tag Index<br />• ATTS → Attribute Index<br />• PATH → Path Index<br />• NS → Namespaces<br />• DOCS → Document Index | DiskData() |
| 2.1. Name Index<br />Tag/attribute names | 1. Token set, storing all names (TokenSet)<br />2. One StatsKey instance per entry:<br />2.1. Content kind (Num):<br />2.1.1. Number: min/max (Doubles)<br />2.1.2. Category: number of entries (Num), entries (Tokens)<br />2.2. Number of entries (Num)<br />2.3. Leaf flag (Boolean)<br />2.4. Maximum text length (Double; legacy, could be Num) | Names()<br />TokenSet.read()<br />StatsKey() |
| 2.2. Path Index | 1. Flag for path definition (Boolean, always true; legacy)<br />2. PathNode:<br />2.1. Name reference (Num)<br />2.2. Node kind (Num)<br />2.3. Number of occurrences (Num)<br />2.4. Number of children (Num)<br />2.5. Double; legacy, can be reused or discarded<br />2.6. Recursive generation of child nodes (→ 2) | PathSummary()<br />PathNode() |
| 2.3. Namespaces | 1. Token set, storing prefixes (TokenSet)<br />2. Token set, storing URIs (TokenSet)<br />3. NSNode:<br />3.1. pre value (Num)<br />3.2. References to prefix/URI pairs (Nums)<br />3.3. Number of children (Num)<br />3.4. Recursive generation of child nodes (→ 3) | Namespaces()<br />NSNode() |
| 2.4. Document Index | Array of integers, representing the distances between all document pre values (Nums) | DocIndex() |
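
The Document Index stores distances rather than absolute pre values. A minimal sketch of the conversion in both directions is given below. The class `DocIndexSketch` and its methods are hypothetical, and it assumes that the first distance is measured from pre value 0, which the table does not state explicitly.

```java
import java.util.Arrays;

/** Hypothetical helper for the distance encoding of document pre values. */
final class DocIndexSketch {
  private DocIndexSketch() { }

  /** Turns stored distances into absolute pre values with a running sum. */
  static int[] toPreValues(int[] distances) {
    int[] pres = new int[distances.length];
    int pre = 0; // ASSUMPTION: the first distance is relative to pre value 0
    for (int i = 0; i < distances.length; i++) {
      pre += distances[i];
      pres[i] = pre;
    }
    return pres;
  }

  /** Inverse operation: turns ascending pre values into distances. */
  static int[] toDistances(int[] pres) {
    int[] distances = new int[pres.length];
    int previous = 0;
    for (int i = 0; i < pres.length; i++) {
      distances[i] = pres[i] - previous;
      previous = pres[i];
    }
    return distances;
  }

  public static void main(String[] args) {
    int[] distances = {0, 12, 7, 30};
    System.out.println(Arrays.toString(toPreValues(distances))); // [0, 12, 19, 49]
  }
}
```

A plausible reason for this encoding is that distances between neighbouring documents are usually much smaller than absolute pre values, so they fit into shorter Num values.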
(tbl|tbli).basex
Contents: Main database table and directory.
txt.basex
Contents: Heap file with text values (document names, string values of texts, comments and processing instructions).
atv.basex
Contents: Heap file with attribute values.
(txtl|txtr).basex
Contents: Value index for texts.
(atvl|atvr).basex
Contents: Value index for attributes.
(ftxa|ftxb|ftxc).basex
Contents: Trie full-text index.
(ftxx|ftxy|ftxz).basex
Contents: Fuzzy full-text index.