Transaction Management

From BaseX Documentation

Revision as of 11:17, 11 December 2015

This article is part of the Advanced User's Guide. The BaseX client-server architecture offers ACID-safe transactions, with multiple readers and writers. This article describes how transactions are managed.

Introduction

In a nutshell, a transaction is equal to a command or query. So each command or query sent to the server becomes a transaction.

Incoming requests are parsed and checked for errors on the server. If the command or query is not correct, the request will not be executed, and the user will receive an error message. Otherwise, the request becomes a transaction and is passed to the transaction monitor.

Please note that:

  • Locks cannot be synchronized between BaseX instances that run in different JVMs. If concurrent write operations are to be performed, we generally recommend working with the client/server or the HTTP architecture.
  • An unexpected abort of the server during a transaction, caused by a hardware failure or power cut, may lead to an inconsistent database state if a transaction was active at shutdown time. It is therefore advisable to use the BACKUP command to back up your database regularly. If the worst case occurs, you can try the INSPECT command to check whether your database has obvious inconsistencies, and use RESTORE to restore the most recent backup of the database.

XQuery Update

Many update operations are triggered by XQuery Update expressions. When executing an updating query, all update operations of the query are stored in a pending update list. They will be executed all at once, so the database is updated atomically. If any of the update sub-operations is erroneous, the overall transaction will be aborted.
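
The all-or-nothing behaviour described above can be illustrated with a small Python model. This is only a sketch and not how BaseX is implemented: the Database class, the UpdateError exception and the (operation, key, value) tuples are hypothetical names invented for this example, and a plain dictionary stands in for the database.

```python
class UpdateError(Exception):
    """Raised when one of the pending updates cannot be applied."""


class Database:
    """Hypothetical stand-in for a database: a dict of key/value pairs."""

    def __init__(self, nodes):
        self.nodes = dict(nodes)

    def apply(self, pending_updates):
        """Apply all pending updates, or none of them.

        `pending_updates` is a list of (operation, key, value) tuples.
        """
        # Work on a copy, so the visible state is untouched if anything fails.
        working_copy = dict(self.nodes)
        for operation, key, value in pending_updates:
            if operation == "insert":
                if key in working_copy:
                    raise UpdateError(f"cannot insert: {key!r} already exists")
                working_copy[key] = value
            elif operation == "delete":
                if key not in working_copy:
                    raise UpdateError(f"cannot delete: {key!r} does not exist")
                del working_copy[key]
            else:
                raise UpdateError(f"unknown operation: {operation!r}")
        # All operations succeeded: publish the new state in a single step.
        self.nodes = working_copy
```

If any operation in the list raises UpdateError, the assignment in the last line is never reached, so the visible state is exactly what it was before the call.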

Concurrency Control

BaseX supports multiple concurrent read operations and a single write operation at a time (using preclaiming and starvation-free two-phase locking). This means that read transactions are executed in parallel. If an updating transaction comes in, it will be queued and executed after all previous read transactions have been executed. Subsequent operations will also be queued until the updating transaction has completed.
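
The queueing order described above can be sketched as follows. This is a simplified, single-threaded model in Python and not the BaseX transaction monitor: the schedule function and the request names are assumptions made for illustration, and the PARALLEL limit is ignored.

```python
def schedule(requests):
    """Group queued requests into batches that may run concurrently.

    `requests` is a list of (name, kind) tuples in arrival order, where
    kind is "read" or "write". Consecutive reads share one batch; every
    write gets a batch of its own, and later requests wait behind it.
    """
    batches = []
    current_reads = []
    for name, kind in requests:
        if kind == "read":
            current_reads.append(name)
        elif kind == "write":
            # The write has to wait for all reads that arrived before it.
            if current_reads:
                batches.append(current_reads)
                current_reads = []
            batches.append([name])
        else:
            raise ValueError(f"unknown request kind: {kind!r}")
    if current_reads:
        batches.append(current_reads)
    return batches


queue = [("q1", "read"), ("q2", "read"), ("u1", "write"),
         ("q3", "read"), ("q4", "read")]
print(schedule(queue))  # [['q1', 'q2'], ['u1'], ['q3', 'q4']]
```

Because requests are served in arrival order, a writer cannot be overtaken indefinitely by later readers, which is the intuition behind "starvation-free" in this model.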

Each database has its own queue: An update on database A will not block operations on database B. This is under the premise that it can be statically determined (i.e., before the transaction is evaluated) which databases will be accessed by a transaction:

OPEN db; ADD factbook.xml; CLOSE
XQUERY insert node <a/> into db:open('db')/*

In the following example, all databases will be blocked, because the name of the second database, which will be opened in the query, will only be known after having opened the first database:

let $db1 := db:open('catalog')//db-name[@id = '123']
let $db2 := db:open($db1)
return delete node $db2//text()
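
The difference between the two cases can be modelled with lock sets that are fixed before a transaction starts. The following Python sketch is an illustration under stated assumptions, not BaseX code: the GLOBAL marker, the (databases, is_update) pairs and the function names are hypothetical.

```python
GLOBAL = None  # hypothetical marker: database names are only known at runtime


def overlaps(dbs_a, dbs_b):
    """Return True if two lock sets touch at least one common database."""
    if dbs_a is GLOBAL and dbs_b is GLOBAL:
        return True
    if dbs_a is GLOBAL:
        return len(dbs_b) > 0
    if dbs_b is GLOBAL:
        return len(dbs_a) > 0
    return bool(dbs_a & dbs_b)


def conflicts(tx_a, tx_b):
    """Return True if two transactions must not run at the same time.

    Each transaction is a (databases, is_update) pair. `databases` is a
    set of database names, or GLOBAL if they cannot be determined statically.
    """
    dbs_a, update_a = tx_a
    dbs_b, update_b = tx_b
    if not (update_a or update_b):
        return False  # two readers never block each other
    return overlaps(dbs_a, dbs_b)


update_db = (frozenset({"db"}), True)             # names known before evaluation
read_factbook = (frozenset({"factbook"}), False)
dynamic_delete = (GLOBAL, True)                   # name computed inside the query

print(conflicts(update_db, read_factbook))       # False
print(conflicts(dynamic_delete, read_factbook))  # True
```

In this model, the update on db leaves readers of factbook alone, while the query with the computed database name blocks them, because its lock set has to cover every database.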

The maximum number of parallel transactions can be adjusted with the PARALLEL option.

External Side Effects

Access to external resources (files on hard disk, HTTP requests, ...) is not controlled by the transaction monitor of BaseX unless specified by the user.

XQuery Locking Options

Custom locks can be acquired by setting the BaseX-specific XQuery options query:read-lock and query:write-lock. Multiple option declarations may occur in the prolog of a query, and multiple values can also be separated with commas in a single declaration. These locks live in a different namespace from database names: the lock value factbook will not lock a database named factbook.

These option declarations will put read locks on foo, bar and batz and a write lock on quix:

declare option query:read-lock "foo,bar";
declare option query:read-lock "batz";
declare option query:write-lock "quix";
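
How the values of several declarations combine can be sketched in Python. This is only an illustration of the merging rule described above: the collect_locks function is a hypothetical helper and is not part of BaseX.

```python
def collect_locks(declarations):
    """Merge (option_name, value) pairs into read and write lock sets."""
    locks = {"read-lock": set(), "write-lock": set()}
    for name, value in declarations:
        # Accept the option name with or without the "query:" prefix.
        key = name.split(":")[-1]
        if key not in locks:
            raise ValueError(f"not a locking option: {name!r}")
        # A single declaration may carry several comma-separated lock names.
        locks[key].update(part.strip() for part in value.split(",") if part.strip())
    return locks["read-lock"], locks["write-lock"]


read_locks, write_locks = collect_locks([
    ("query:read-lock", "foo,bar"),
    ("query:read-lock", "batz"),
    ("query:write-lock", "quix"),
])
print(sorted(read_locks), sorted(write_locks))  # ['bar', 'batz', 'foo'] ['quix']
```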

Java Modules

Locks can also be acquired on Java functions which are imported and invoked from an XQuery expression. It is advisable to explicitly lock Java code whenever it performs sensitive read and write operations.

Limitations

Commands

Database locking works with all commands unless the glob syntax is used, such as in the following command call:

  • DROP DB new*: drop all databases starting with "new"

XQuery

As XQuery is a very powerful language, deciding which databases will be accessed by a query is non-trivial. Optimization is work in progress. The current identification of which databases to lock is limited to queries that access the currently opened database, XQuery functions that explicitly specify a database, and expressions that address no database at all.

Some examples of queries for which database locking is enabled; all of them can be executed in parallel:

  • //item, read-locking of the database opened by a client
  • doc('factbook'), read-locking of "factbook"
  • collection('db/path/to/docs'), read-locking of "db"
  • fn:sum(1 to 100), locking nothing at all
  • delete nodes doc('test')//*[string-length(local-name(.)) > 5], write-locking of "test"

Some examples of queries that are not yet supported by database locking:

  • let $db := 'factbook' return doc($db), will read-lock globally: referencing database names isn’t supported yet
  • for $db in ('factbook') return doc($db), will read-lock globally
  • doc(doc('test')/reference/text()), will read-lock globally
  • let $db := 'test' return insert nodes <test/> into doc($db), will write-lock globally

A list of all locked databases is output if QUERYINFO is set to true. If you think that too much is locked, please let us know on our mailing list and include some example code.

GUI

Database locking is currently disabled if the BaseX GUI is used.

Process Locking

To enable locking at the global (process) level, the option GLOBALLOCK can be set to true. This can be done, for example, by editing your .basex file (see Options for more details). If process locking is active, a process that performs write operations will queue all other operations.

File-System Locks

Update Operations

For the duration of a database update, a locking file upd.basex will reside in that database directory. If the update fails for some unexpected reason, or if the process is killed ungracefully, this file may not be deleted. In this case, the database can no longer be opened using the default commands, and the message "Database ... is being updated, or update was not completed" will be shown instead. If the locking file is manually removed, you may be able to reopen the database, but you should be aware that the database may have been corrupted by the interrupted update process, and you should revert to the most recent database backup.
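
The marker-file protocol described above can be sketched in Python. Only the file name upd.basex is taken from the text; the functions, the exception and the directory layout are assumptions made for illustration and do not reflect the actual BaseX implementation.

```python
import os
import tempfile

LOCK_FILE = "upd.basex"  # file name taken from the text above


class UpdateIncomplete(Exception):
    """Raised when a database directory still contains the update marker."""


def open_database(db_dir):
    """Refuse to open a database whose last update did not finish."""
    if os.path.exists(os.path.join(db_dir, LOCK_FILE)):
        raise UpdateIncomplete(
            "Database is being updated, or update was not completed")
    return db_dir


def update_database(db_dir, write_changes):
    """Run `write_changes(db_dir)` while the marker file exists."""
    marker = os.path.join(db_dir, LOCK_FILE)
    # Mode "x" fails if the marker already exists, i.e. an update is pending.
    with open(marker, "x"):
        pass
    # If this call raises (or the process dies), the marker stays behind.
    write_changes(db_dir)
    # Only a completed update removes the marker again.
    os.remove(marker)
```

The marker is deliberately not removed in an error handler: a leftover file is the signal that the data may be in an inconsistent state.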

Database Locks

To avoid database corruption caused by write operations running in different JVMs, a shared lock is requested on the database table file (tbl.basex) whenever a database is opened. If an update operation is triggered and no exclusive lock can be acquired, it will be rejected with the message "Database ... is opened by another process."

As the standalone versions of BaseX (command-line, GUI) cannot be synchronized with other BaseX instances, we generally recommend working with the client/server architecture if concurrent write operations are to be performed.

Changelog

Version 7.8
Version 7.6
  • Added: database locking introduced, replacing process locking.
Version 7.2.1
  • Updated: pin files replaced with shared/exclusive filesystem locking.
Version 7.2
  • Added: pin files to mark open databases.
Version 7.1
  • Added: update lock files.