Changes

7,145 bytes added ,  13:45, 6 May 2021
This [[Module Library|XQuery Module]] provides functions for organizing scheduled, queued, running and cached jobs. Jobs can be commands, queries, client or HTTP requests.
=Conventions=
All functions in this module are assigned to the <code><nowiki>http://basex.org/modules/jobs</nowiki></code> namespace, which is statically bound to the {{Code|jobs}} prefix. Errors will be bound to the same prefix.
=Services=
A job can be registered as ''service'' by supplying the {{Code|service}} option to {{Function|Jobs|jobs:eval}}:

<syntaxhighlight lang="xquery">
(: register job as service; will be run every day at 1 am :)
jobs:eval('db:drop("tmp")', (), map { 'id':'cleanup', 'start':'01:00:00', 'interval':'P1D', 'service': true() }),

(: list registered services :)
jobs:services(),
(: result: <job base-uri="..." id="cleanup" interval="P1D" start="01:00:00">db:drop("tmp")</job> :)

(: unregister job :)
jobs:stop('cleanup', map { 'service': true() })
</syntaxhighlight>

'''Some more notes:'''

* All job services will be scheduled for evaluation when the BaseX server or BaseX HTTP server is started.
* If a job service is outdated (e.g. because a supplied end time has been exceeded), it will be removed from the jobs file at startup time.
* The job definitions are stored in a {{Code|jobs.xml}} file in the database directory. It can also be edited manually.

=Basic Functions=

==jobs:current==

{| width='100%'
|-
| width='120' | '''Signatures'''
|{{Func|jobs:current||xs:string}}
|-
| '''Summary'''
|Returns the id of the current job.
|}

==jobs:list==

{| width='100%'
|-
| width='120' | '''Signatures'''
|{{Func|jobs:list||xs:string*}}
|-
| '''Summary'''
|Returns the ids of all jobs that are currently registered. The list includes scheduled, queued, running, stopped, and finished jobs with cached results.
|-
| '''Examples'''
|<code>jobs:list()</code> returns the same job id as {{Function|Jobs|jobs:current}} if no other job is registered.
|}
==jobs:list-details==

{| width='100%'
|-
| width='120' | '''Signatures'''
|{{Func|jobs:list-details||element(job)*}}<br />{{Func|jobs:list-details|$id as xs:string|element(job)*}}
|-
| '''Summary'''
|Returns information on all jobs that are currently registered, or on a job with the specified {{Code|$id}} (or an empty sequence if this job is not found). The list includes scheduled, queued, running, and cached jobs. A string representation of the job, or its URI, will be returned as value. The returned elements have additional attributes:
* <code>id</code>: job id
* <code>type</code>: type of the job (command, query, REST, RESTXQ, etc.)
* <code>state</code>: current state of the job: <code>scheduled</code>, <code>queued</code>, <code>running</code>, <code>cached</code>
* <code>user</code>: user who started the job
* <code>duration</code>: evaluation time (included if a job is running or if the result was cached)
* <code>start</code>: next start of job (included if a job will be executed repeatedly)
* <code>time</code>: time when job was registered
|-
| '''Examples'''
| <code>jobs:list-details()</code> returns information on the currently running job and possibly others:
<syntaxhighlight lang="xml">
<job id="job1" type="XQuery" state="running" user="admin" duration="PT0.001S">
  XQUERY jobs:list-details()
</job>
</syntaxhighlight>
|}

==jobs:finished==

{| width='100%'
|-
| width='120' | '''Signatures'''
|{{Func|jobs:finished|$id as xs:string|xs:boolean}}
|-
| '''Summary'''
|Indicates if the evaluation of an already running job with the specified {{Code|$id}} has finished. As the ids of finished jobs will usually be discarded, unless caching is enabled, the function will also return <code>true</code> for unknown jobs.
* <code>false</code> indicates that the job is scheduled, queued, or currently running.
* <code>true</code> will be returned if the job has finished, or if the id is unknown (because the ids of finished jobs are not cached).
|}
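
A non-blocking status check could look as follows. This is a minimal sketch that only uses functions described on this page; the two-second sleep, the string <code>"done"</code> and the returned message are illustrative assumptions, not part of the module:

<syntaxhighlight lang="xquery">
(: start a cached job that sleeps for two seconds (illustrative workload) :)
let $id := jobs:eval('prof:sleep(2000), "done"', (), map { 'cache': true() })
return if(jobs:finished($id)) then (
  (: the job has finished: pick up the cached result :)
  jobs:result($id)
) else (
  (: the job is still scheduled, queued or running :)
  'Job ' || $id || ' has not finished yet'
)
</syntaxhighlight>

As {{Code|jobs:finished}} also returns <code>true</code> for unknown ids, a check of this kind is only reliable if the result of the job is cached.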
==jobs:services==

{| width='100%'
|-
| width='120' | '''Signatures'''
|{{Func|jobs:services||element(job)*}}
|-
| '''Summary'''
|Returns a list of all jobs that have been persistently registered as [[#Services|Services]].
|-
| '''Errors'''
|{{Error|service|#Errors}} Registered services cannot be parsed.<br/>
|}
=Execution=

There are cases in which a client does not, or cannot, wait until a request is fully processed. The client may be a browser, which sends an HTTP request to the server in order to start another time-consuming query job. The functions in this section allow you to register a new query job from a running query. Jobs can be executed immediately (i.e., as soon as the [[Transaction Management#Concurrency Control|Concurrency Control]] allows it) or scheduled for repeated execution. Each registered job gets a job id, and the id can be used to retrieve a query result, stop a job, or wait for its termination.

==jobs:eval==

{{Mark|Updated with Version 9.5}}: Integers added as valid start and end times.

{| width='100%'
|-
| width='120' | '''Signatures'''
|{{Func|jobs:eval|$query as xs:anyAtomicItem|xs:string}}<br />{{Func|jobs:eval|$query as xs:anyAtomicItem, $bindings as map(*)?|xs:string}}<br />{{Func|jobs:eval|$query as xs:anyAtomicItem, $bindings as map(*)?, $options as map(*)?|xs:string}}<br />
|-
| '''Summary'''
|Schedules the evaluation of the supplied {{Code|$query}} and returns a query id. The query will be queued, and the result will optionally be cached. Queries can be updating. The query can be a URI or a string, and variables and context items can be declared via {{Code|$bindings}} (see [[XQuery Module#xquery:eval|xquery:eval]] for more details). The following {{Code|$options}} can be supplied:
* {{Code|cache}}: indicates if the query result will be cached or ignored (default: <code>false</code>):
** The result will be cached in main-memory until it is fetched via [[#jobs:result|jobs:result]], or until {{Option|CACHETIMEOUT}} is exceeded.
** If the query raises an error, it will be cached and returned instead.
* {{Code|start}}: a dayTimeDuration, time, dateTime or integer can be specified to delay the execution of the query:
** If a dayTimeDuration is specified, the query will be queued after the specified duration has passed. Examples of valid values are: <code>P1D</code> (1 day), <code>PT5M</code> (5 minutes), <code>PT0.1S</code> (100 ms). An error will be raised if a negative value is specified.
** If a dateTime is specified, the query will be executed at this date. An example of a valid value is <code>2018-12-31T23:59:59</code> (New Year's Eve 2018, close to midnight). An error will be raised if the specified time lies in the past.
** If a time is specified, the query will be executed at this time of the day. Examples of valid times are: <code>02:00:00</code> (2am local time), <code>12:00:00Z</code> (noon, UTC). If the time lies in the past, the query will be executed the next day.
** An integer will be interpreted as minutes. If the specified number exceeds the minutes of the current hour, the query will be executed one hour later.
* {{Code|interval}}: a dayTimeDuration string can be specified to execute the query periodically. An error is raised if the specified interval is less than one second (<code>PT1S</code>). If the next scheduled call is due, and if a query with the same id is still running, it will be skipped.
* {{Code|end}}: scheduling can be stopped after a given time or duration. The string format is the same as for {{Code|start}}. An error is raised if the resulting end time is smaller than the start time.
* {{Code|base-uri}}: sets the [https://www.w3.org/TR/xquery-31/#dt-static-base-uri base-uri property] for the query. This URI will be used when resolving relative URIs, such as with {{Code|fn:doc}}.
* {{Code|id}}: sets a custom job id. The id must not start with the standard <code>job</code> prefix, and it can only be assigned if no job with the same name exists.
* {{Code|service}}: additionally registers the job as [[#Services|service]]. Registered services must have no variable bindings.
* {{Code|log}}: writes the specified string to the [[Logging|database logs]]. Two log entries are stored, one at the beginning and another one after the execution of the job.
|-
| '''Errors'''
|{{Error|overflow|#Errors}} Query execution is rejected, because too many jobs are queued or being executed. {{Option|CACHETIMEOUT}} can be decreased if the default setting is too restrictive.<br/>{{Error|range|#Errors}} A specified time or duration is out of range.<br/>{{Error|id|#Errors}} The specified id is invalid or has already been assigned.<br/>{{Error|options|#Errors}} The specified options are conflicting.
|-
| '''Examples'''
|
* Cache query result. The returned id can be used to pick up the result with [[#jobs:result|jobs:result]]:
<syntaxhighlight lang="xquery">
jobs:eval("1+3", (), map { 'cache': true() })
</syntaxhighlight>
* A happy birthday mail will be sent at the given date:
<syntaxhighlight lang="xquery">
jobs:eval("import module namespace mail='mail'; mail:send('Happy birthday!')",
  (), map { 'start': '2018-09-01T06:00:00' })
</syntaxhighlight>
* The following [[RESTXQ]] functions can be called to execute a query at 2 am every day. An id will be returned by the first function, which can be used to stop the scheduler via the second function:
<syntaxhighlight lang="xquery">
declare
  %rest:POST("{$query}")
  %rest:path('/start-scheduling')
function local:start($query) {
  jobs:eval($query, (), map { 'start': '02:00:00', 'interval': 'P1D' })
};

declare
  %rest:path('/stop-scheduling/{$id}')
function local:stop($id) {
  jobs:stop($id)
};
</syntaxhighlight>
* Query execution is scheduled for every second, and for 10 seconds in total. As the query itself will take 1.5 seconds, it will only be executed every second time:
<syntaxhighlight lang="xquery">
jobs:eval("prof:sleep(1500)", (), map { 'interval': 'PT1S', 'end': 'PT10S' })
</syntaxhighlight>
* The query in the specified file will be evaluated once:
<syntaxhighlight lang="xquery">
jobs:eval(xs:anyURI('cleanup.xq'))
</syntaxhighlight>
* The following expression, if stored in a file, will be evaluated every 5 seconds:
<syntaxhighlight lang="xquery">
jobs:eval(
  static-base-uri(),
  map { },
  map { 'start': 'PT5S' }
)
</syntaxhighlight>
|}
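
Variable bindings can be combined with the {{Code|cache}} option. The following sketch assumes that bindings are supplied in the map format described for [[XQuery Module#xquery:eval|xquery:eval]]; the variable name <code>n</code> and its value are made up for illustration:

<syntaxhighlight lang="xquery">
jobs:eval(
  (: the external variable is declared in the query string :)
  'declare variable $n external; sum(1 to $n)',
  (: hypothetical binding: assign 1000 to $n :)
  map { 'n': 1000 },
  (: keep the result until it is fetched via jobs:result :)
  map { 'cache': true() }
)
</syntaxhighlight>

A job of this kind cannot be registered as service, because registered services must have no variable bindings.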
==jobs:result==

{| width='100%'
|-
| width='120' | '''Signatures'''
|{{Func|jobs:result|$id as xs:string|item()*}}
|-
| '''Summary'''
|Returns the cached result of a job with the specified job {{Code|$id}}:
* Results can only be retrieved once. After retrieval, the cached result will be dropped.
* If the original job has raised an error, the cached error will be raised instead.
|-
| '''Errors'''
|{{Error|running|#Errors}} the job is still running.<br/>{{Error|unknown|#Errors}} the supplied id is unknown, or the result has already been retrieved.<br/>
|-
| '''Examples'''
|
* The following [[RESTXQ]] function will either return the result of a previously started job or raise an error:
<syntaxhighlight lang="xquery">
declare
  %rest:path('/result/{$id}')
function local:result($id) {
  jobs:result($id)
};
</syntaxhighlight>
* The following query demonstrates how the results of an executed query can be returned within the same query (see below why you should avoid this pattern in practice):
<syntaxhighlight lang="xquery">
let $query := jobs:eval('(1 to 10000000)[. = 1]', map { }, map { 'cache': true() })
return (
  jobs:wait($query),
  jobs:result($query)
)
</syntaxhighlight>
Queries of this kind can cause deadlocks! If the original query and the new query perform updates on the same database, the second query will only be run after the first one has been executed, and the first query will wait for the second query forever. You should resort to [[XQuery Module#xquery:fork-join|xquery:fork-join]] if you want to have full control over parallel query execution.
|}
==jobs:stop==

{| width='100%'
|-
| width='120' | '''Signatures'''
|{{Func|jobs:stop|$id as xs:string|empty-sequence()}}
|-
| '''Summary'''
|Triggers the cancelation of a job with the specified {{Code|$id}}, drops the cached result of a query, or cancels a scheduled job. Unknown ids are ignored. All jobs are gracefully stopped; it is up to the process to decide when it is safe to shut down. The following {{Code|$options}} can be supplied:
* {{Code|service}}: additionally removes the job from the [[#Services|job services]] list.
|-
| '''Examples'''
|<code>jobs:list()[. != jobs:current()] ! jobs:stop(.)</code> stops and discards all jobs except for the current one.
|}
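
Cached results can also be dropped selectively. A minimal sketch, assuming that the <code>state</code> attribute returned by {{Function|Jobs|jobs:list-details}} has the value <code>cached</code> for finished jobs with a stored result:

<syntaxhighlight lang="xquery">
(: select all jobs whose result is cached, and drop these results :)
for $job in jobs:list-details()[@state = 'cached']
return jobs:stop($job/@id)
</syntaxhighlight>

Scheduled, queued and running jobs are not affected, as only jobs in the <code>cached</code> state are selected.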
==jobs:wait==

{| width='100%'
|-
| width='120' | '''Signatures'''
|{{Func|jobs:wait|$id as xs:string|empty-sequence()}}
|-
| '''Summary'''
|Waits for the completion of a job with the specified {{Code|$id}}:
* The function will terminate immediately if the job id is unknown. This is the case if a future job has not been queued yet, or if the id has already been discarded after job evaluation.
* If the function is called with the id of a queued job, or repeatedly executed job, it may stall and never terminate.
|-
| '''Errors'''
|{{Error|self|#Errors}} The current job is addressed.<br/>
|}

=Errors=

{| class="wikitable" width="100%"
! width="110"|Code
|Description
|-
|{{Code|options}}
| The specified options are conflicting.
|-
|{{Code|id}}
| The specified id is invalid or has already been assigned.
|-
|{{Code|overflow}}
| Too many queries or query results are queued.
|-
|{{Code|range}}
| A specified time or duration is out of range.
|-
|{{Code|running}}
| A query is still running.
|-
|{{Code|self}}
| The current job cannot be addressed.
|-
|{{Code|service}}
| Registered services cannot be parsed, added or removed.
|-
|{{Code|unknown}}
| The supplied query id is unknown or not available anymore.
|}
=Changelog=
 
;Version 9.5
* Updated: {{Function|Jobs|jobs:eval}}: integers added as valid start and end times.
 
;Version 9.4
* Updated: {{Function|Jobs|jobs:eval}}: option added for writing log entries.
* Updated: {{Function|Jobs|jobs:list-details}}: interval added.
 
;Version 9.2
* Deleted: jobs:invoke (merged with {{Function|Jobs|jobs:eval}})
 
;Version 9.1
* Updated: {{Function|Jobs|jobs:list-details}}: registration time added.
 
;Version 9.0
* Added: {{Function|Jobs|jobs:invoke}}, [[#Services|Services]]
 
;Version 8.6
* Updated: {{Function|Jobs|jobs:eval}}: <code>id</code> option added.
The module was introduced with Version 8.5.