Changes

7,966 bytes added ,  15:31, 13 May 2022
This [[Module Library|XQuery Module]] provides functions for organizing scheduled, queued, running and cached jobs. Jobs can be commands, queries, client or HTTP requests.
=Conventions=
All functions in this module are assigned to the <code><nowiki>http://basex.org/modules/jobs</nowiki></code> namespace, which is statically bound to the {{Code|jobs}} prefix. Errors will be bound to the same prefix.
=Services=
A job can be registered as ''service'' by supplying the {{Code|service}} option to {{Function|Jobs|jobs:eval}}:

<syntaxhighlight lang="xquery">
(: register job as service; will be run every day at 1 am :)
jobs:eval('db:drop("tmp")', (), map { 'id':'cleanup', 'start':'01:00:00', 'interval':'P1D', 'service': true() }),

(: list registered services :)
jobs:services(),
(: result: <job base-uri="..." id="cleanup" interval="P1D" start="01:00:00">db:drop("tmp")</job> :)

(: unregister job :)
jobs:stop('cleanup', map { 'service': true() })
</syntaxhighlight>

'''Some more notes:'''

* All job services will be scheduled for evaluation when the BaseX server or BaseX HTTP server is started.
* If a job service is outdated (e.g. because a supplied end time has been exceeded), it will be removed from the jobs file at startup time.
* The job definitions are stored in a {{Code|jobs.xml}} file in the database directory. It can also be edited manually.

=Executing Jobs=

There are cases in which a client does not, or cannot, wait until a request is fully processed. The client may be a browser, which sends an HTTP request to the server in order to start another time-consuming query job. The functions in this section allow you to register a new query job from a running query. Jobs can be executed immediately (i.e., as soon as the [[Transaction Management#Concurrency Control|Concurrency Control]] allows it) or scheduled for repeated execution. Each registered job gets a job id, and the id can be used to retrieve a query result, stop a job, or wait for its termination.

==jobs:eval==

{| width='100%'
|-
| width='120' | '''Signatures'''
|{{Func|jobs:eval|$query as xs:anyAtomicItem|xs:string}}<br/>{{Func|jobs:eval|$query as xs:anyAtomicItem, $bindings as map(*)?|xs:string}}<br/>{{Func|jobs:eval|$query as xs:anyAtomicItem, $bindings as map(*)?, $options as map(*)?|xs:string}}<br/>
|-
| '''Summary'''
|Schedules the evaluation of the supplied {{Code|$query}} ({{Code|xs:string}}, or of type {{Code|xs:anyURI}}, pointing to a resource), and returns a query id. The query will be queued, and the result will optionally be cached. Queries can be updating. Variables and the context value can be declared via {{Code|$bindings}} (see [[XQuery Module#xquery:eval|xquery:eval]] for more details). The following {{Code|$options}} can be supplied:
* {{Code|cache}}: indicates if the query result will be cached or ignored (default: <code>false</code>):
** The result will be cached in main-memory until it is fetched via [[#jobs:result|jobs:result]], or until {{Option|CACHETIMEOUT}} is exceeded.
** If the query raises an error, it will be cached and returned instead.
* {{Code|start}}: a dayTimeDuration, time, dateTime or integer can be specified to delay the execution of the query:
** If a dayTimeDuration is specified, the query will be queued after the specified duration has passed. Examples for valid values are: <code>P1D</code> (1 day), <code>PT5M</code> (5 minutes), <code>PT0.1S</code> (100 ms). An error will be raised if a negative value is specified.
** If a dateTime is specified, the query will be executed at this date. Examples for valid values are: <code>2018-12-31T23:59:59</code> (New Year's Eve 2018, close to midnight). An error will be raised if the specified time lies in the past.
** If a time is specified, the query will be executed at this time of the day. Examples for valid times are: <code>02:00:00</code> (2 am local time), <code>12:00:00Z</code> (noon, UTC). If the time lies in the past, the query will be executed the next day.
** An integer will be interpreted as minutes. If the specified number is greater than the elapsed minutes of the current hour, the query will be executed one hour later.
* {{Code|interval}}: a dayTimeDuration string can be specified to execute the query periodically. An error is raised if the specified interval is less than one second (<code>PT1S</code>). If the next scheduled call is due, and if a query with the same id is still running, it will be skipped.
* {{Code|end}}: scheduling can be stopped after a given time or duration. The string format is the same as for {{Code|start}}. An error is raised if the resulting end time is smaller than the start time.
* {{Code|base-uri}}: sets the [https://www.w3.org/TR/xquery-31/#dt-static-base-uri base-uri property] for the query. This URI will be used when resolving relative URIs, such as with {{Code|fn:doc}}.
* {{Code|id}}: sets a custom job id. The id must not start with the standard <code>job</code> prefix, and it can only be assigned if no job with the same name exists.
* {{Code|service}}: additionally registers the job as [[#Services|service]]. Registered services must have no variable bindings.
* {{Code|log}}: writes the specified string to the [[Logging|database logs]]. Two log entries are stored, one at the beginning and another one after the execution of the job.
|-
| '''Errors'''
|{{Error|overflow|#Errors}} Query execution is rejected, because too many jobs are queued or being executed. {{Option|CACHETIMEOUT}} can be decreased if the default setting is too restrictive.<br/>{{Error|range|#Errors}} A specified time or duration is out of range.<br/>{{Error|id|#Errors}} The specified id is invalid or has already been assigned.<br/>{{Error|options|#Errors}} The specified options are conflicting.
|-
| '''Examples'''
|
* Cache query result. The returned id can be used to pick up the result with [[#jobs:result|jobs:result]]:
<syntaxhighlight lang="xquery">
jobs:eval("1+3", (), map { 'cache': true() })
</syntaxhighlight>
* A happy birthday mail will be sent at the given date:
<syntaxhighlight lang="xquery">
jobs:eval("import module namespace mail='mail'; mail:send('Happy birthday!')",
  (), map { 'start': '2018-09-01T06:00:00' })
</syntaxhighlight>
* The following [[RESTXQ]] functions can be called to execute a query at 2 am every day. An id will be returned by the first function, which can be used to stop the scheduler via the second function:
<syntaxhighlight lang="xquery">
declare
  %rest:POST("{$query}")
  %rest:path('/start-scheduling')
function local:start($query) {
  jobs:eval($query, (), map { 'start': '02:00:00', 'interval': 'P1D' })
};

declare
  %rest:path('/stop-scheduling/{$id}')
function local:stop($id) {
  jobs:stop($id)
};
</syntaxhighlight>
* Query execution is scheduled for every second, and for 10 seconds in total. As the query itself will take 1.5 seconds, it will only be executed every second time:
<syntaxhighlight lang="xquery">
jobs:eval("prof:sleep(1500)", (), map { 'interval': 'PT1S', 'end': 'PT10S' })
</syntaxhighlight>
* The query in the specified file will be evaluated once:
<syntaxhighlight lang="xquery">
jobs:eval(xs:anyURI('cleanup.xq'))
</syntaxhighlight>
* The following expression, if stored in a file, will be evaluated every 5 seconds:
<syntaxhighlight lang="xquery">
jobs:eval(
  static-base-uri(),
  map { },
  map { 'start': 'PT5S' }
)
</syntaxhighlight>
|}
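
The {{Code|$bindings}} and {{Code|$options}} arguments described above can also be combined. The following is a minimal sketch, not an official example: the database name {{Code|factbook}}, the custom job id and the log string are placeholders chosen for illustration, and the call assumes that such a database exists.

<syntaxhighlight lang="xquery">
(: assumption: a database named 'factbook' exists; all names are illustrative :)
jobs:eval(
  (: the external variable receives its value from the bindings map :)
  'declare variable $db external; db:optimize($db)',
  map { 'db': 'factbook' },
  (: custom id (must not start with 'job') and a string for the database logs :)
  map { 'id': 'optimize-factbook', 'log': 'optimize factbook' }
)
</syntaxhighlight>

Because a custom id is assigned, a second call with the same id should be rejected with the {{Code|id}} error as long as the first job is still registered.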
==jobs:result==

{{Mark|Updated with Version 9.7:}} Return empty sequence if no result is cached.

{| width='100%'
|-
| width='120' | '''Signatures'''
|{{Func|jobs:result|$id as xs:string|item()*}}
|-
| '''Summary'''
|Returns the cached result of a job with the specified job {{Code|$id}}:
* If the original job has raised an error, the cached error will be raised instead.
* Results can only be retrieved once. After retrieval, the cached result will be dropped.
* If the result has already been retrieved, or if it has not been cached, an empty sequence is returned.
|-
| '''Examples'''
|
* The following [[RESTXQ]] function will either return the result of a previously started job or raise an error:
<syntaxhighlight lang="xquery">
declare
  %rest:path('/result/{$id}')
function local:result($id) {
  jobs:result($id)
};
</syntaxhighlight>
* The following query demonstrates how the results of an executed query can be returned within the same query (see below why you should avoid this pattern in practice):
<syntaxhighlight lang="xquery">
let $query := jobs:eval('(1 to 10000000)[. = 1]', map { }, map { 'cache': true() })
return (
  jobs:wait($query),
  jobs:result($query)
)
</syntaxhighlight>
Queries of this kind can cause deadlocks! If the original query and the new query perform updates on the same database, the second query will only be run after the first one has been executed, and the first query will wait for the second query forever. You should resort to [[XQuery Module#xquery:fork-join|xquery:fork-join]] if you want to have full control on parallel query execution.
|}

==jobs:stop==

{| width='100%'
|-
| width='120' | '''Signatures'''
|{{Func|jobs:stop|$id as xs:string|empty-sequence()}}
|-
| '''Summary'''
|Triggers the cancelation of a job with the specified {{Code|$id}}, drops the cached result of a query, or cancels a scheduled job. Unknown ids are ignored. All jobs are gracefully stopped; it is up to the process to decide when it is safe to shut down. The following {{Code|$options}} can be supplied:
* {{Code|service}}: additionally removes the job from the [[#Services|job services]] list.
|-
| '''Examples'''
|
* <code>jobs:list()[. != jobs:current()] ! jobs:stop(.)</code> stops and discards all jobs except for the current one.
|}
==jobs:wait==

{| width='100%'
|-
| width='120' | '''Signatures'''
|{{Func|jobs:wait|$id as xs:string|empty-sequence()}}
|-
| '''Summary'''
|Waits for the completion of a job with the specified {{Code|$id}}:
* The function will terminate immediately if the job id is unknown. This is the case if a future job has not been queued yet, or if the id has already been discarded after job evaluation.
* If the function is called with the id of a queued job, or repeatedly executed job, it may stall and never terminate.
|-
| '''Errors'''
|{{Error|self|#Errors}} The current job is addressed.<br/>
|}

=Listing Jobs=

==jobs:current==

{| width='100%'
|-
| width='120' | '''Signatures'''
|{{Func|jobs:current||xs:string}}
|-
| '''Summary'''
|Returns the id of the current job.
|}

==jobs:list==

{| width='100%'
|-
| width='120' | '''Signatures'''
|{{Func|jobs:list||xs:string*}}
|-
| '''Summary'''
|Returns the ids of all jobs that are currently registered. The list includes scheduled, queued, running, stopped, and finished jobs with cached results.
|-
| '''Examples'''
|
* <code>jobs:list()</code> returns the same job id as {{Function|Jobs|jobs:current}} if no other job is registered.
|}
==jobs:list-details==

{| width='100%'
|-
| width='120' | '''Signatures'''
|{{Func|jobs:list-details||element(job)*}}<br/>{{Func|jobs:list-details|$id as xs:string|element(job)*}}
|-
| '''Summary'''
|Returns information on all jobs that are currently registered, or on a job with the specified {{Code|$id}} (or an empty sequence if this job is not found). The list includes scheduled, queued, running jobs, and cached jobs. A string representation of the job, or its URI, will be returned as value. The returned elements have additional attributes:
* <code>id</code>: job id
* <code>type</code>: type of the job (command, query, REST, RESTXQ, etc.)
* <code>state</code>: current state of the job: <code>scheduled</code>, <code>queued</code>, <code>running</code>, <code>cached</code>
* <code>user</code>: user who started the job
* <code>duration</code>: evaluation time (included if a job is running or if the result was cached)
* <code>start</code>: next start of job (included if a job will be executed repeatedly)
* <code>time</code>: time when job was registered
|-
| '''Examples'''
|
* <code>jobs:list-details()</code> returns information on the currently running job and possibly others:
<syntaxhighlight lang="xml">
<job id="job1" type="XQuery" state="running" user="admin" duration="PT0.001S">
  XQUERY jobs:list-details()
</job>
</syntaxhighlight>
|}
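
The attributes of the returned {{Code|job}} elements lend themselves to simple filtering. As a minimal sketch that relies only on the {{Code|state}} and {{Code|id}} attributes documented for this function, the following query lists the ids of all jobs whose results are currently cached:

<syntaxhighlight lang="xquery">
(: collect the ids of all jobs in state 'cached' :)
for $job in jobs:list-details()
where $job/@state = 'cached'
return string($job/@id)
</syntaxhighlight>

The same pattern should work for the other documented states (<code>scheduled</code>, <code>queued</code>, <code>running</code>).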
==jobs:bindings==

{{Mark|Introduced with Version 10.0}}

{| width='100%'
|-
| width='120' | '''Signatures'''
|{{Func|jobs:bindings|$id as xs:string|map(*)}}
|-
| '''Summary'''
|Returns the variable bindings of an existing job with the specified {{Code|$id}}. If no variables have been bound to this job, an empty map is returned.
|}

==jobs:finished==

{| width='100%'
|-
| width='120' | '''Signatures'''
|{{Func|jobs:finished|$id as xs:string|xs:boolean}}
|-
| '''Summary'''
|Indicates if the evaluation of an already running job with the specified {{Code|$id}} has finished. As the ids of finished jobs will usually be discarded, unless caching is enabled, the function will also return <code>true</code> for unknown jobs.
* <code>false</code> indicates that the job id is scheduled, queued, or currently running.
* <code>true</code> will be returned if the job has either finished, or if the id is unknown (because the ids of all finished jobs will not be cached).
|}
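
{{Code|jobs:bindings}} and {{Code|jobs:finished}} can both be used to inspect a registered job. The following is a sketch under stated assumptions: the variable name {{Code|$n}} and its value are illustrative, {{Code|jobs:bindings}} requires Version 10.0, and the {{Code|cache}} option is enabled so that the job id is not discarded right after evaluation.

<syntaxhighlight lang="xquery">
(: the variable name $n and its value are illustrative :)
let $id := jobs:eval(
  'declare variable $n external; sum(1 to $n)',
  map { 'n': 1000 },
  map { 'cache': true() }
)
return (
  (: variable bindings that were supplied when the job was registered :)
  jobs:bindings($id),
  (: false while the job is scheduled, queued or running; true afterwards :)
  jobs:finished($id)
)
</syntaxhighlight>

Whether the second value is <code>true</code> or <code>false</code> depends on how far the job has progressed when the function is called.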
==jobs:services==

{| width='100%'
|-
| width='120' | '''Signatures'''
|{{Func|jobs:services||element(job)*}}
|-
| '''Summary'''
|Returns a list of all jobs that have been persistently registered as [[#Services|Services]].
|-
| '''Errors'''
|{{Error|service|#Errors}} Registered services cannot be parsed.<br/>
|}

=Errors=

{| class="wikitable" width="100%"
! width="110"|Code
|Description
|-
|{{Code|options}}
|The specified options are conflicting.
|-
|{{Code|id}}
|The specified id is invalid or has already been assigned.
|-
|{{Code|overflow}}
|Too many queries or query results are queued.
|-
|{{Code|range}}
|A specified time or duration is out of range.
|-
|{{Code|running}}
|A query is still running.
|-
|{{Code|self}}
|The current job cannot be addressed.
|-
|{{Code|service}}
|Registered services cannot be parsed, added or removed.
|}
=Changelog=
 
;Version 10.0
* Added: {{Function|Jobs|jobs:bindings}}
 
;Version 9.7
* Updated: {{Function|Jobs|jobs:result}}: return empty sequence if no result is cached.
 
;Version 9.5
* Updated: {{Function|Jobs|jobs:eval}}: integers added as valid start and end times.
 
;Version 9.4
* Updated: {{Function|Jobs|jobs:eval}}: option added for writing log entries.
* Updated: {{Function|Jobs|jobs:list-details}}: interval added.
 
;Version 9.2
* Deleted: jobs:invoke (merged with {{Function|Jobs|jobs:eval}})
 
;Version 9.1
* Updated: {{Function|Jobs|jobs:list-details}}: registration time added.
 
;Version 9.0
* Added: {{Function|Jobs|jobs:invoke}}, [[#Services|Services]]
 
;Version 8.6
* Updated: {{Function|Jobs|jobs:eval}}: <code>id</code> option added.
The module was introduced with Version 8.5.