Node Storage

From BaseX Documentation
Revision as of 00:48, 9 December 2011 by Dimitar (talk | contribs) (Added table storage description as described in the mailing list (W.I.P.))
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to navigation Jump to search

BaseX logically splits the tbl.basex file into blocks with length 4096 bytes, i.e. each block can have max 256 records each with length 16 bytes. The records within a block are sorted by their pre valu(which, therefore, can be implicitly determined and need not be saved).

For each block BaseX stores in a separate file (tbli.basex) the smallest pre value within that block (and since the records are sorted, that will be the pre value of the first record stored in the block). These will be referred as fpre from now on.

The physical address of each block is stored in tbli.basex, too.

Since these two maps will not grow excessively large, but are accessed resp. changed on each read resp. write operation, they are kept in main memory and flushed to disk on closing the database.

A newly created database which has 256 + 10 records, will occupy the first two blocks with physical addresses 0 and 4096. The corresponding fpre's will be 0 and 256.

If a record with pre = 12 is to be inserted, it needs to be stored in the first block, which is, however, full. In this case, a new block with physical address 8192 will be allocated, the records with pre values from 12 to 255 will be copied to the new block, the new record will be stored in the old block at pre = 12, and the two maps will look like this:

fpre's = 0, 13, 257 addr's = 0, 8192, 4096

Basically, the old records remain in the first block, but they will not be read, since the fpre's array says that only 13 records are stored in the first block. This causes redundant storage of the records with old pres from 13 to 255.

Additionally to these two maps (fpre's and addr's), BaseX maintains a bit map (which is also stored in tbli.basex) which reflects which physical blocks are free and which not, so that when a new block is needed, an already free one will be reused.