Binary Data
The BaseX store provides support for binary resources. A database can contain both XML documents and binary files, which are handled in a consistent manner: A unique database path serves as the key, and the contents can be retrieved using database commands, XQuery, or the various APIs.
Storage
XML documents are stored in a proprietary format to speed up XPath axis traversals and update operations, and binary files are stored unchanged in a dedicated subdirectory (called raw). Several reasons exist for using the traditional file system as storage:
- Good Performance: The file system generally performs very well when it comes to the retrieval and update of binary files.
- Key/Value Stores: We do not want to compete with existing key/value database solutions.
- Our Focus: Our main focus is the efficient storage of hierarchical data structures and file formats such as XML or (more and more) JSON. The efficient storage of arbitrary binary resources would introduce many new challenges that would distract us from more pressing tasks.
For some use cases, the chosen database design may bring along certain limitations:
- Performance Limits: Most file systems are not capable of efficiently handling thousands or millions of binary resources in a single directory. The same problem arises if a large number of XML documents needs to be imported into or exported from a BaseX database. The general solution to avoid this bottleneck is to distribute the relevant binaries across additional subdirectories.
- Keys: If you want to use arbitrary keys for XML and binary resources, and these keys are not supported by the underlying file system, you can add an XML document to your database that contains all key/path mappings.
In the latter case, a key/value store might be the better option anyway.
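The two workarounds mentioned above (spreading binaries over subdirectories and keeping a key/path mapping document) can be sketched as follows. This is a minimal illustration in Python, not part of BaseX: the function names `shard_path`, `build_mapping` and `lookup`, the `mappings`/`entry` element names, the `.bin` extension, and the choice of a SHA-256 digest for deriving directory names are all assumptions made for this example.

```python
import hashlib
import xml.etree.ElementTree as ET


def shard_path(key: str, ext: str = "bin", levels: int = 2) -> str:
    """Map an arbitrary key to a file-system-safe path spread over subdirectories.

    Hypothetical helper: the digest-based layout is an assumption of this sketch.
    """
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    # Use the leading hex pairs as directory names, e.g. "ab/cd/<digest>.bin",
    # so that no single directory has to hold all resources.
    parts = [digest[2 * i:2 * i + 2] for i in range(levels)]
    return "/".join(parts + [f"{digest}.{ext}"])


def build_mapping(keys) -> ET.Element:
    """Build an XML element that records one key/path pair per entry."""
    root = ET.Element("mappings")
    for key in keys:
        entry = ET.SubElement(root, "entry", {"path": shard_path(key)})
        entry.text = key
    return root


def lookup(mapping: ET.Element, key: str):
    """Return the stored path for a key, or None if the key is unknown."""
    for entry in mapping.iter("entry"):
        if entry.text == key:
            return entry.get("path")
    return None


if __name__ == "__main__":
    mapping = build_mapping(["invoice: 2024/03 #17?", "photo*final*.png"])
    # The serialized document could be added to the database like any other XML file.
    print(ET.tostring(mapping, encoding="unicode"))
    print(lookup(mapping, "photo*final*.png"))
```

The resulting paths contain only hexadecimal characters, slashes and a fixed extension, so they should be acceptable to common file systems, while the original keys survive in the mapping document. Keys containing characters that are not allowed in XML (such as most control characters) would need additional escaping, which this sketch does not handle.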
Usage
More information on how to store, retrieve, update, and export binary data can be found in the general Databases documentation.