
BaseX 10

After 15 years of continuous development, the first double-digit version of BaseX sees the light of day.

We have taken the version jump as an opportunity to perform some major refactorings of BaseX, both under the hood and on API and XQuery level. Before migrating your projects to the new version, some adjustments may be required, so please read this article carefully.

Prerequisites

BaseX 10 requires Java 11 or later to run. Databases created with the new version are backward compatible and can still be opened with BaseX 9.

Migrating Applications

The following modifications might be relevant when migrating existing applications:

The default ports for web applications have been changed from 8984/8985 to 8080/8081.
The default admin password has been removed. The admin user can only be used if a password has been assigned, e.g., via the PASSWORD command (see Database Server or Web Application for more details).
The conventions for functions in Clients in other programming languages were revised.
The IGNOREHOSTNAME option was dropped and merged with IGNORECERT.

Storage

Whitespace

Whitespace is now preserved when importing XML resources, unless whitespace stripping is enabled.

The notorious CHOP option was removed to prevent conflicting behavior caused by earlier installations. It was replaced by a new STRIPWS option, which defaults to false. In addition, the new default of the serialization parameter indent is no.

Please be warned that the new default can throw off existing applications. If you want to restore the old behavior, add the following lines to your .basex configuration file…

# Local Options
STRIPWS = true
SERIALIZER = indent=yes

…or add context parameters in the web.xml file of your Web Application:

<context-param>
  <param-name>org.basex.stripws</param-name>
  <param-value>true</param-value>
</context-param>
<context-param>
  <param-name>org.basex.serializer</param-name>
  <param-value>indent=yes</param-value>
</context-param>

In the GUI editor, a shortcut and an icon were added to switch result indentation on and off.
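
Indentation can also be requested for a single query instead of globally. The following is a minimal sketch that uses the standard serialization option declaration; the sample element is made up for illustration:

```xquery
(: The output prefix is predeclared in BaseX; it is declared here for portability. :)
declare namespace output = 'http://www.w3.org/2010/xslt-xquery-serialization';
(: Enable indentation for the result of this query only. :)
declare option output:indent 'yes';
<xml><child>text</child></xml>
```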

In addition, databases may considerably increase in size, as whitespace used for indenting an XML document will be interpreted and stored as additional text nodes. If your XML resources are structured and have no mixed content, it is advisable to enable whitespace stripping when importing them to a database.
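
Whitespace stripping can presumably also be enabled for a single import by passing the option to db:create. This is a sketch only: the database name and the input path are hypothetical, and it assumes that STRIPWS is accepted in lower case like the other parsing options:

```xquery
(: 'articles' and the input path are placeholders, not part of any real setup. :)
db:create(
  'articles',
  '/path/to/articles.xml',
  'articles.xml',
  (: assumption: parsing options are supplied in lower case :)
  map { 'stripws': true() }
)
```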

Note that STRIPWS only applies to stored XML and not to XML constructed within XQuery source files. Within XQuery code, the standard boundary-space setting applies. This defaults to strip. To preserve XML whitespace in XQuery modules, add the following to the prolog:

declare boundary-space preserve; 
<xml> </xml>
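
The effect of the declaration can be checked by measuring the string value of the constructed element. In this small sketch, the query returns 1 because the space between the tags is kept; without the declaration, the default strip setting removes it and the result is 0:

```xquery
declare boundary-space preserve;
(: The single space between the tags is preserved as a text node. :)
string-length(<xml> </xml>)
```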

Value Resources

In addition to XML and binary resources, a third resource type has been added: XQuery values (atomic items and nodes, sequences, maps, arrays) can now be stored in databases as well. The db:put-value and db:get-value functions can be used to store and retrieve values.

The new feature can e.g. be used to store maps in a database:

db:put-value(
  'factbook',
  map:merge(
    for $country in db:get('factbook')//country
    return map:entry($country/@name, $country//city/name ! string())
  ),
  'cities'
)

…and use them as an index later on:

let $cities := db:get-value('factbook', 'cities')
for $country in ('Japan', 'Indonesia', 'Malaysia')
return $country || ': ' || string-join($cities?($country), ', ')
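
Since db:put-value is an updating function, the stored value only becomes visible after the storing query has finished. The following sketch assumes that the cities value from the example above has already been stored; it inspects the resource with db:type and counts the map entries:

```xquery
(: Assumes 'cities' was stored in 'factbook' by an earlier query. :)
db:type('factbook', 'cities'),
map:size(db:get-value('factbook', 'cities'))
```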

Backups

The Commands and Backup Functions were enhanced to back up general data: registered users, scheduled services, key-value stores.

XQuery

Compilation

The compilation has been split up into multiple steps to improve locking.

So far, several internal steps were already performed when executing a query (see XQuery Optimizations for more details):

The query is parsed, i.e., the original query string is transformed to an executable tree representation.
External values that are passed on by APIs are bound to variables and the query context. External values can be names of databases, or contribute to a name that will later on be constructed in the query.
The query is compiled and evaluated.

The transaction manager gathers the names of the databases that will be accessed by a query. If it is not possible to uniquely identify all databases that may be opened by the query, global locking will be applied, and all databases will be locked. Detection can fail if the names of databases depend on external input. It can also fail if a query is too complex to associate character strings with database operations.

The compilation phase now comprises two separate steps:

Compilation of logical, context-independent (static) operations. External values are bound to the query, and deterministic code is rewritten, simplified and pre-evaluated.
Optimization of physical, context-based (dynamic) operations. Databases are opened and checked for available indexes; current date/time is retrieved. The resulting code is further rewritten and optimized.

Lock detection will be performed after the first step, and the code resulting from this step offers much more insight into which specific databases need to be locked. As a result, local locks can be applied to many more queries than before, and many queries can now run in parallel. An example:

declare variable $n external;
db:get('names-' || $n)

After the query has been parsed, a user-specific value (e.g., 123) will be bound to $n. The variable will be inlined by the compiler, and the argument of db:get will be pre-evaluated to names-123. It is then easy for the lock detector to collect the name of the database that needs to be read-locked before the query is eventually executed.
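
By contrast, here is a sketch of a query in which the database names only become known while the query is evaluated. The registry database and its structure are hypothetical; under the rules described above, such a query would presumably still fall back to global locking:

```xquery
(: 'registry' and its database/@name structure are made up for illustration. :)
for $name in db:get('registry')//database/@name
(: The argument depends on stored data, so it cannot be pre-evaluated statically. :)
return db:get($name)
```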

Another positive side effect of two-step compilation is that production environments get faster in general: Queries can be compiled in parallel, and it’s only the optimization and evaluation of a query that may need to be delayed by locking.

Main-Memory Updates

XQuery Update provides constructs to update XML nodes in main memory. The data structures for in-memory representations of XML resources have been revised, such that updates can be performed orders of magnitude faster than before. With BaseX 9.x, the following query runs for several minutes, whereas it can now be computed in a few seconds:

<xml>{
  (1 to 1000000) ! <child/>
}</xml> update {
  for $child at $pos in child
  return insert node text { $pos } into $child
}
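
The update expression used above is a BaseX shorthand. For comparison, a scaled-down sketch of the same main-memory update written with the standard copy/modify/return expression of XQuery Update; five children are used instead of a million to keep the result readable:

```xquery
copy $xml := <xml>{ (1 to 5) ! <child/> }</xml>
modify (
  (: Insert the position of each child as its text content. :)
  for $child at $pos in $xml/child
  return insert node text { $pos } into $child
)
return $xml
```

The result is an xml element whose five child elements contain the numbers 1 to 5.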

Key-Value Store

With the new Store Functions, values can be organized in a persistent main-memory key-value store. The store allows you to speed up access to frequently accessed data.

Store data

let $email := map:merge(
  for $address in db:get('addressbook')//address
  return map:entry($address/name, $address/email)
)
return store:put('emails', $email)

Retrieve data

let $name := 'Richard David James'
return store:get('emails')($name)

The store is persistent: Its contents are written to disk if BaseX is shut down, and retrieved again after a restart.
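
A minimal sketch of the life cycle of a single entry, assuming the store:put, store:get and store:remove functions of the Store Functions; the key counter is made up for illustration:

```xquery
(: 'counter' is a hypothetical key. :)
store:put('counter', 1),     (: add the entry :)
store:get('counter'),        (: read it back :)
store:remove('counter')      (: drop it again :)
```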

Functions

All functions, excluding the File Functions, now consistently resolve relative URI references against the static base URI, and not the current working directory.
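
The resolution against the static base URI can be made visible with standard functions. In this sketch, the base URI is an arbitrary example value; the query returns file:///data/project/input/doc.xml:

```xquery
(: The base URI is an example value, not a BaseX default. :)
declare base-uri 'file:///data/project/';
(: Relative references are resolved against the static base URI. :)
resolve-uri('input/doc.xml', static-base-uri())
```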

Various functions have been revised, added, renamed or removed:

Description	BaseX 10	BaseX 9
Retrieve XML resources	`db:get`	`db:open`
Retrieve nodes with specified pre values	`db:get-pre`	`db:open-pre`
Retrieve nodes with specified IDs	`db:get-id`	`db:open-id`
Retrieve binary resources	`db:get-binary`	`db:retrieve`
Retrieve value resources	`db:get-value`	new
Add or replace resource	`db:put`, arguments swapped!	`db:replace`
Add or replace binary resource	`db:put-binary`, arguments swapped!	`db:store`
Add or replace value resource	`db:put-value`	new
Get resource type	`db:type`	`db:is-raw`, `db:is-xml`
Fetch XML document	`fetch:doc`	`fetch:xml`
Convert binary data to XML	`fetch:binary-doc`	`fetch:xml-binary`
Functions: Process Geo data	removed	Geo Functions
XQuery jobs	Job Functions	Jobs Functions
Return variable bindings of a job	`job:bindings`	new
Stop or remove a job	`job:remove`	`jobs:stop`
Functions: Main-memory key-value store	Store Functions	new
Functions: String computations	String Functions	Strings Functions
Format string	`string:format`	`out:format`
Return control characters	`string:cr`, `string:nl`, `string:tab`	`out:cr`, `out:nl`, `out:tab`
Functions: Process ZIP files	removed	ZIP Functions
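
The swapped arguments are the change most likely to go unnoticed during migration. A sketch of a migrated call follows; the resource path and the element are hypothetical. In BaseX 9, the path was the second argument and the input the third; with db:put, the input comes before the path:

```xquery
(: BaseX 9 equivalent:
   db:replace('factbook', 'notes/japan.xml', <note country='Japan'>Island nation</note>) :)
db:put(
  'factbook',
  <note country='Japan'>Island nation</note>,  (: input is now the second argument :)
  'notes/japan.xml'                            (: path is now the third argument :)
)
```

The stored document can then be retrieved in a subsequent query with db:get('factbook', 'notes/japan.xml'), which replaces the former db:open call.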

Commands

The following commands have been revised:

Description	BaseX 10	BaseX 9
List directories and resources.	`DIR`	new
Retrieve single XML document	`GET`	new
Retrieve binary resource	`BINARY GET`	`RETRIEVE`
Add or replace resources	`PUT`	`REPLACE`
Store binary resource	`BINARY PUT`	`STORE`
Returns current option values	`SHOW OPTIONS`	`GET`
Lists jobs	removed	`JOBS LIST`
Returns a job result	removed	`JOBS RESULT`
Stops a job	removed	`JOBS STOP`

HTTP Requests

HTTP requests in BaseX are now based on the new Java HTTP Client. This client provides a better overall performance, uses internal connection pools and follows redirects across different protocols (http, https).

HTTP operations are, among others, performed by:

the HTTP Client Functions;
the Fetch Functions, Database Functions, Validation Functions, XSLT Functions or Repository Functions;
fn:doc and fn:collection;
the CREATE DB and REPO INSTALL commands.
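
A minimal sketch of a request that is sent through the HTTP Client Functions; the target URL is a placeholder. The result consists of an element that describes the response, followed by the response body:

```xquery
(: The http prefix is predeclared in BaseX; it is declared here for clarity. :)
declare namespace http = 'http://expath.org/ns/http-client';
(: https://example.org/ is a placeholder address. :)
http:send-request(
  <http:request method='get' href='https://example.org/'/>
)
```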

Catalogs

From early on, catalog resolvers had been neglected both in BaseX and Java. This has changed: The new XML Catalog API from Java is universally used to resolve references to external resources. As an alternative, Norman Walsh’s Enhanced XML Resolver is utilized if it is found in the classpath.

The option for supplying the XML catalog was renamed from CATFILE to CATALOG. See Catalog Resolver for more details.

Graphical User Interface

The graphical user interface of BaseX has been revised and made more consistent.

The icons were replaced by scalable ones, building upon the HiDPI graphics support for Windows and Linux.

REST

Results in the rest namespace are now returned without prefix:

<!-- before -->
<rest:databases xmlns:rest="http://basex.org/rest"/>

<!-- now -->
<databases xmlns="http://basex.org/rest"/>
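
Only the prefix has changed; the namespace URI is the same. Code that matches results by qualified name should therefore be unaffected, whereas code that compares serialized strings may need to be adjusted. The following sketch checks both forms against the same element test and returns true three times:

```xquery
declare namespace rest = 'http://basex.org/rest';
let $before := <rest:databases xmlns:rest="http://basex.org/rest"/>
let $now := <databases xmlns="http://basex.org/rest"/>
return (
  (: Both elements have the same expanded name. :)
  $before instance of element(rest:databases),
  $now instance of element(rest:databases),
  (: Prefixes are ignored when nodes are compared. :)
  deep-equal($before, $now)
)
```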

When listing the resources of a database, dir elements are returned for resources that are located in subdirectories. See REST for more details.
