Binary Data

From BaseX Documentation

This article is part of the Advanced User's Guide.

The BaseX store also provides support for binary resources. A database may contain both XML documents and binary files, which are handled in a uniform way: A unique database path serves as key, and the contents can be retrieved via database commands, XQuery, or the various APIs.

Storage

XML documents are stored in a proprietary format to speed up XPath axis traversals and update operations, and binary files are stored unchanged in a dedicated subdirectory (called raw). Several reasons exist for using the traditional file system as storage:

  • Good Performance: The file system generally performs very well when it comes to the retrieval and update of binary files.
  • Key/Value Stores: We do not want to compete with existing key/value database solutions.
  • Our Focus: Our main focus is the efficient storage of hierarchical data structures and file formats such as XML or, increasingly, JSON. Efficient storage of arbitrary binary resources would introduce many new challenges that would distract us from more pressing tasks.

For some use cases, the chosen database design may bring along certain limitations:

  • Performance Limits: Most file systems cannot efficiently handle thousands or millions of binary resources in a single directory. The same problem arises if a large number of XML documents need to be imported into or exported from a BaseX database. The general way to avoid this bottleneck is to distribute the relevant binaries across additional subdirectories.
  • Keys: If you want to use arbitrary keys for XML and binary resources that are not supported by the underlying file system, you may add an XML document to your database that contains all key/path mappings.

In the latter case, a key/value store might be the better option anyway.
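
Both workarounds can be sketched outside the database. The following Python snippet is only an illustration and not part of BaseX: the function names `sharded_path` and `build_key_map`, the element names `mappings` and `entry`, and the choice of SHA-256 are all assumptions made for this example. It derives a file-system-safe path with nested subdirectories from an arbitrary key, and it builds an XML document that records the key/path mappings.

```python
import hashlib
import xml.etree.ElementTree as ET


def sharded_path(key: str, levels: int = 2, width: int = 2) -> str:
    """Derive a file-system-safe path from an arbitrary key (hypothetical helper).

    The key is hashed, and the leading hex digits of the digest become
    nested subdirectory names, so resources are spread over many small
    directories instead of piling up in a single one.
    """
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    # With levels=2 and width=2 there are 256 * 256 possible leaf directories.
    dirs = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return "/".join(dirs + [digest])


def build_key_map(keys) -> ET.Element:
    """Build an XML element that maps each original key to its derived path."""
    root = ET.Element("mappings")  # element names are illustrative assumptions
    for key in keys:
        entry = ET.SubElement(root, "entry", {"key": key})
        entry.text = sharded_path(key)
    return root


if __name__ == "__main__":
    # Keys with characters that many file systems reject in file names.
    mapping = build_key_map(["report 2024/Q1?.pdf", "logo:final*.png"])
    print(ET.tostring(mapping, encoding="unicode"))
```

The derived paths could then serve as database paths when the binaries are stored, and the serialized mapping could be added to the same database as an ordinary XML document, so that the original key is looked up first and the binary is retrieved by its path afterwards.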

Usage

More information on how to store, retrieve, update, and export binary data can be found in the general Database documentation.