BaseX 10

From BaseX Documentation
Revision as of 12:34, 14 July 2022 by CG

After 15 years of continuous development, the first double-digit version of BaseX sees the light of day.

We have taken the version jump as an opportunity to perform some major refactorings of BaseX, both under the hood and on API and XQuery level.

Prerequisites

BaseX 10 requires Java 11 or later to run. Databases created with BaseX 10 are backward compatible and can still be opened with BaseX 9.

Storage

Whitespaces

With BaseX 10, all whitespaces are now preserved when importing XML resources, unless whitespace stripping is enabled.

The notorious CHOP option was removed to prevent conflicting behavior caused by earlier installations. It was replaced by a new STRIPWS option, which defaults to false. In addition, the new default of the serialization parameter indent is no.

Please be warned that the new default can throw off existing applications. If you want to restore the old behavior, you should assign the following values in your .basex configuration file, or the web.xml file of your Web Application:

<syntaxhighlight lang="xquery">
STRIPWS = true
SERIALIZER = indent=yes
</syntaxhighlight>

In addition, databases may considerably increase in size, as whitespaces used for indenting an XML document will be interpreted and stored as additional text nodes. If your XML resources are structured and have no mixed content, it is advisable to enable whitespace stripping when importing them to a database.
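The effect of indentation on the number of stored nodes can be made visible with a small query. The following is a minimal sketch: the sample document is hypothetical, and it assumes that whitespace is preserved when the string is parsed (the new default).

<syntaxhighlight lang="xquery">
(: Hypothetical sample document, indented with newlines and spaces :)
let $xml := "<a>&#10;  <b/>&#10;  <c/>&#10;</a>"
let $doc := parse-xml($xml)
(: Count the whitespace-only text nodes that result from the indentation :)
return count($doc//text()[not(normalize-space())])
</syntaxhighlight>

When whitespace is preserved, the three indentation runs (before b, between b and c, after c) show up as three text nodes.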

Backups

The Backup Commands and Backup Functions were enhanced to back up general data: registered users, scheduled services, key-value stores.

Applications

The default port for web applications has been changed from 8984 to 8080.

If BaseX is newly deployed, the admin user can only be used after a custom password has been assigned, e.g., via the PASSWORD command.

The IGNOREHOSTNAME option was dropped and merged with IGNORECERT.

Graphical User Interface

The graphical user interface of BaseX has been revised and made more consistent. With JEP 263, Java 9 introduced support for HiDPI graphics on Windows and Linux, so the old icons were replaced by scalable ones.

XQuery

Compilation

Several internal steps are performed when a query is executed (see XQuery Optimizations for more details):

  1. The query is parsed, i.e., the original query string is transformed to an executable tree representation.
  2. External values that are passed on by APIs are bound to variables and the query context. External values can be names of databases, or contribute to a name that will be assembled later on in the query.
  3. The query is compiled and evaluated.

The transaction manager gathers the names of the databases that will be accessed by a query. If it is not possible to uniquely identify all databases that may be opened by the query, global locking will be applied, and all databases will be locked. Detection can fail if the names of databases depend on external input. It can also fail if a query is too complex to associate character strings with database operations.

With BaseX 10, compilation has been split into two separate steps:

  1. Compilation of logical, context-independent operations. External values are bound to the query, and deterministic code is rewritten, simplified and pre-evaluated.
  2. Optimization of physical, context-based operations. Addressed databases are opened and checked for available indexes; current date/time is retrieved. The resulting code is further rewritten and optimized.

Lock detection will be performed after the first step, and the code resulting from this step offers much more insight into which specific databases need to be locked. As a result, local locks can be applied to many more queries than before, and many queries can now run in parallel. An example:

<syntaxhighlight lang="xquery">
declare variable $n external;
db:open('names-' || $n)
</syntaxhighlight>

After the query has been parsed, a user-specific value (e.g., 123) will be bound to $n. The variable will be inlined by the compiler, and the argument of db:open will be pre-evaluated to names-123. It is then easy for the lock detector to collect the name of the database that needs to be read-locked before the query is eventually executed.
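The binding step can be tried out without a database. The following sketch uses xquery:eval from the XQuery Module to bind a value to an external variable; the value 123 is only an example, and the database access is left out so that just the string assembled from the external value is returned.

<syntaxhighlight lang="xquery">
(: Bind 123 to the external variable $n and assemble the database name :)
xquery:eval(
  "declare variable $n external; 'names-' || $n",
  map { 'n': 123 }
)
</syntaxhighlight>

The result is the string names-123, which is the name the lock detector would see once the variable has been inlined.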

Another positive side effect of two-step compilation is that production environments get faster in general: Queries can be compiled in parallel, and only the optimization and evaluation of a query may need to be delayed by locking.

Main-Memory Updates

XQuery Update provides constructs to update XML nodes in main memory. The data structures for in-memory representations of XML resources have been revised, such that updates can be performed orders of magnitude faster than before. With BaseX 9.7.3, the following query runs for 6-7 minutes, whereas it can now be computed in 3 seconds:

<syntaxhighlight lang="xquery">
<x>{
  (1 to 1000000) ! <y/>
}</x> update {
  y ! (insert node <z/> into .)
}
</syntaxhighlight>
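To check what the update produces, a scaled-down variant of the query above can count the inserted nodes. This is a sketch for illustration only: the size of 1000 and the variable name $result are arbitrary choices.

<syntaxhighlight lang="xquery">
(: Build a small in-memory tree and insert one z element into each y :)
let $result := (
  <x>{ (1 to 1000) ! <y/> }</x> update {
    y ! (insert node <z/> into .)
  }
)
(: Every y element should now have exactly one z child :)
return count($result/y/z)
</syntaxhighlight>

The query returns 1000, one z element per y element.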

Modules

  • The new Store Module provides a persistent main-memory key-value store for speeding up operations on frequently accessed data.
  • The Job Module and String Module have been renamed (before: Jobs/Strings Module).
  • Various functions in the Fetch Module and Database Module have been renamed.
  • The deprecated ZIP Module has been removed.
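As a rough sketch of how the key-value store from the first item might be used: the function names store:put and store:get are assumed here from the Store Module and should be checked against its documentation, and the key and value are hypothetical.

<syntaxhighlight lang="xquery">
(: Assumed Store Module functions: store:put($key, $value), store:get($key) :)
store:put('greeting', 'hello'),
(: Look up the value that was stored under the same key :)
store:get('greeting')
</syntaxhighlight>

Values stay in main memory between queries, so a later query can retrieve them without recomputing them.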

Catalogs

From early on, catalog resolvers had been neglected both in BaseX and Java. This has changed: The new XML Catalog API from Java is now universally used to resolve references to external resources. As an alternative, Norman Walsh’s Enhanced XML Resolver will be used if it is found in the classpath.

The option for supplying the XML catalog was renamed from CATFILE to CATALOG. See Catalog Resolver for more details.

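By default, the Java catalog API raises an error if a reference has no matching catalog entry. To let processing continue instead, the javax.xml.catalog.resolve property can be set to continue, either as a JVM argument or within XQuery:

<syntaxhighlight lang="xquery">
-Djavax.xml.catalog.resolve=continue
Q{java:System}setProperty('javax.xml.catalog.resolve', 'continue')
</syntaxhighlight>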