Job Module

From BaseX Documentation
Revision as of 17:45, 3 April 2016 by CG (talk | contribs)

This XQuery Module provides functions for evaluating XQuery expressions in separate threads. Query execution can be parallelized, or it can be postponed and run asynchronously.

Conventions

All functions in this module are assigned to the http://basex.org/modules/async namespace, which is statically bound to the async prefix. Errors will be bound to the same prefix.

Parallelized Execution

Parallel query execution is advisable if you have several time-consuming calls that cannot be sped up by rewriting the code, for example calls to external URLs. If you parallelize local data reads (such as database access), a single-threaded query will usually be faster, because parallel access to disk data often leads to random access patterns, which hard disks and SSDs can hardly optimize.

async:fork-join

Signatures async:fork-join($functions as function(*)*) as item()*
async:fork-join($functions as function(*)*, $options as map(xs:string, xs:string)) as item()*
Summary This function executes the supplied (non-updating) functions in parallel. The following $options are available:
  • threads: maximum number of parallel threads (default: number of available cores)
  • thread-size: number of functions to be evaluated by each thread (default: 1)
Examples
  • The following query runs two sleep calls in parallel; it will finish in about 1 second if your system has at least 2 cores:
async:fork-join(
  for $i in 1 to 2
  return function() { prof:sleep(1000) }
)
  • In the following query, up to two URLs will be requested in parallel:
let $funcs :=
  for $segment in 1 to 4
  let $url := 'http://url.com/path' || $segment
  return function() { http:send-request((), $url) }
return async:fork-join($funcs, map { 'threads': 2 })
Errors unexpected: an unexpected error occurred while running a query or function in a separate thread.
out-of-range: a supplied option is out of range.
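The two options can also be combined. The following is a minimal sketch, assuming option values may be passed as in the example above; the sleep calls are only stand-ins for slow operations, and how functions are distributed to threads is inferred from the option descriptions alone:

```xquery
(: Sketch: four placeholder tasks, at most two threads,
   two functions assigned to each thread via thread-size. :)
let $funcs :=
  for $i in 1 to 4
  return function() { prof:sleep(500) }
return async:fork-join($funcs, map { 'threads': 2, 'thread-size': 2 })
```

A larger thread-size may be worth trying when there are many short functions, as fewer threads then need to be managed.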

Asynchronous Execution

Asynchronous query execution is advisable if a client does not, or cannot, wait until a request is fully processed. This is the case, for example, with web browsers, which usually cancel a request after a certain timeout. In such cases, you can use asynchronous execution to trigger a server-side process that runs the time-consuming task, and fetch the result later once it is available.

async:eval

Signatures async:eval($query as xs:string) as xs:string
async:eval($query as xs:string, $bindings as map(*)) as xs:string
async:eval($query as xs:string, $bindings as map(*), $options as map(xs:string, xs:string)) as xs:string
Summary Prepares the supplied $query string for asynchronous execution and returns a query id. The query will be queued as described in the article on Transaction Management, and the result will be cached in main memory until it is fetched via async:result, or until ASYNCTIMEOUT is exceeded. Queries may be updating.
Variables and context items can be declared via $bindings (see xquery:eval for more details). The $options parameter contains evaluation options:
  • cache: indicates if the query result will be cached or ignored (default: true). If the result is not cached, the query id is discarded immediately after query execution as well.
  • base-uri: sets the base-uri property for the query. This URI will be used when relative URIs are resolved by functions such as fn:doc (default: empty string).
Examples
  • async:eval("1+3") returns a query id, e.g. Query-abc. The result can be retrieved via a second query in the same BaseX context: async:result("Query-abc")
  • The following RESTXQ function will return the id of the query thread, which evaluates the query that has been specified in the body of a POST request:
declare %rest:POST("{$query}") %rest:path('/eval') function local:eval($query) {
  async:eval($query)
};
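The $bindings parameter can be illustrated with a small sketch. It assumes that bindings follow the xquery:eval conventions referred to above; the variable name n and the query string are made up for illustration:

```xquery
(: Sketch: bind the external variable $n and start the query asynchronously.
   The returned string is the query id to be passed to async:result later. :)
let $query := "declare variable $n external; sum(1 to $n)"
return async:eval($query, map { 'n': 1000 })
```

Passing values as bindings rather than concatenating them into the query string avoids quoting problems in the generated query.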

async:result

Signatures async:result($id as xs:string) as item()*
Summary Returns the result of an asynchronously executed query with the specified query $id:
  • Results can only be retrieved once. After retrieval, the cached result will be discarded.
  • If the query raised an error, the error will be raised instead.
Errors is-running: the query is still running.
unknown: the supplied query id is unknown. The query result may already have been retrieved, or query execution may have been stopped.
Examples
  • The following RESTXQ function will either return the result of a previously started query or an error:
declare %rest:path('/result/{$id}') function local:result($id) {
  async:result($id)
};
  • The following query demonstrates how the result of an asynchronously executed query can be returned within a single query. Note that this is not how these functions are typically used in practice:
let $query := async:eval('(1 to 10000000)[. = 1]')
return (
  hof:until(
    function($result) { async:finished($query) },
    function($curr) { prof:sleep(10) },
    ()
  ),
  async:result($query)
)
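The errors listed above can be handled with try/catch. The following is a hedged sketch: it assumes the error codes can be caught as async:is-running and async:unknown (based on the statement that errors are bound to the async prefix), and the function name and status elements are hypothetical:

```xquery
(: Sketch: return the result if available, otherwise a status element.
   The error QNames are assumed from the module's conventions. :)
declare function local:result-or-status($id as xs:string) as item()* {
  try {
    async:result($id)
  } catch async:is-running {
    <status id="{ $id }">running</status>
  } catch async:unknown {
    <status id="{ $id }">unknown</status>
  }
};

local:result-or-status(async:eval('1 + 3'))
```

Errors raised by the evaluated query itself are not caught here and will still be propagated to the caller.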

async:finished

Signatures async:finished($id as xs:string) as xs:boolean
Summary Indicates if the evaluation of a query with the specified query $id has finished. If false is returned, the query is still running. An error will be raised if the query result was not cached or has already been retrieved.
Errors unknown: the supplied query id is unknown. The query result may already have been retrieved, or query execution may have been stopped.
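A client can check the state of a query before fetching its result. The following is a minimal sketch with a hypothetical function name and a made-up placeholder element; it checks once and does not wait:

```xquery
(: Sketch: fetch the result only if the query has finished;
   otherwise return a placeholder element. :)
declare function local:poll($id as xs:string) as item()* {
  if(async:finished($id)) then (
    async:result($id)
  ) else (
    <running id="{ $id }"/>
  )
};

local:poll(async:eval('(1 to 10000000)[. = 1]'))
```

Checking first avoids the is-running error of async:result; an unknown id will still raise an error.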

async:stop

Signatures async:stop($id as xs:string) as empty-sequence()
Summary Cancels the execution of a query with the specified query $id, or drops the query result if it has already been executed.
Errors unknown: the supplied query id is unknown. The query result may already have been retrieved, or query execution may have been stopped.
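As a short sketch of cancellation, the following query starts a long-running placeholder query and stops it right away; the sleep duration is arbitrary and only serves as an example:

```xquery
(: Sketch: start a query that would sleep for a minute,
   then cancel it by its id. :)
let $id := async:eval('prof:sleep(60000)')
return async:stop($id)
```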

async:ids

Signatures async:ids() as xs:string*
Summary Returns the ids of all queries that are either being executed asynchronously, or that have been executed and whose results are cached.
Examples
  • async:ids() ! async:stop(.) stops and invalidates all asynchronous queries and results.
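The ids can also be used to get an overview of all asynchronous queries. The following sketch uses made-up element and attribute names; the catch clause is an assumption to cover ids that become invalid between the two calls:

```xquery
(: Sketch: list all known query ids together with their state. :)
for $id in async:ids()
return try {
  <query id="{ $id }" finished="{ async:finished($id) }"/>
} catch * {
  (: the id may have been retrieved or stopped in the meantime :)
  <query id="{ $id }" finished="unknown"/>
}
```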

Errors

Code          Description
unexpected    An unexpected error occurred while running a query or function in a separate thread.
out-of-range  The supplied option is out of range.
unknown       The supplied query id is unknown or not available anymore.
is-running    A query is still running.

Changelog

The module was introduced with Version 8.5.