Difference between revisions of "Lazy Module"

From BaseX Documentation
m (Text replace - "\[\[Category:XQuery\]\]" to "")
(10 intermediate revisions by 2 users not shown)
Line 1: Line 1:
This [[Module Library|XQuery Module]] contains functions for handling ''streamable'' items.
+
This [[Module Library|XQuery Module]] contains functions for handling ''lazy'' items.
  
In contrast to standard XQuery items, a streamable item contains only a reference to the actual data. The data itself will be retrieved if it is requested by an expression, or if the item is to be serialized. Hence, a streamable item only uses a few bytes, and no additional memory is occupied during serialization.
+
In contrast to standard XQuery items, a lazy item contains a reference to the actual data, and the data itself will only be retrieved if it is requested. Hence, possible errors will be postponed, and no memory will be occupied by a lazy item as long as its content has not been requested yet.
  
The following BaseX functions return streamable items:
+
The following BaseX functions return lazy items:
  
* Streamable Base64 binaries:
+
* Lazy Base64 binaries:
 
** <code>[[Database Module#db:retrieve|db:retrieve]]</code>
 
** <code>[[Database Module#db:retrieve|db:retrieve]]</code>
 
** <code>[[Fetch Module#fetch:binary|fetch:binary]]</code>
 
** <code>[[Fetch Module#fetch:binary|fetch:binary]]</code>
 
** <code>[[File Module#file:read-binary|file:read-binary]]</code>
 
** <code>[[File Module#file:read-binary|file:read-binary]]</code>
  
* Streamable strings:
+
* Lazy strings:
 
** <code>[[Fetch Module#fetch:text|fetch:text]]</code>
 
** <code>[[Fetch Module#fetch:text|fetch:text]]</code>
 
** <code>[[File Module#file:read-text|file:read-text]]</code>
 
** <code>[[File Module#file:read-text|file:read-text]]</code>
  
Some functions are capable of consuming items in a ''streamable'' fashion: data will never be cached, but instead passed on to another target (file, the calling expression, etc.). The following streaming functions are currently available:
+
Some functions are capable of consuming the contents of lazy items in a ''streamable'' fashion: data will not be cached, but instead passed on to another target (file, the calling expression, etc.). The following streaming functions are currently available:
  
* <code>[[Conversion Module#convert:binary-to-bytes|convert:binary-to-bytes]]</code>
+
* [[Archive Module]] (most functions)
* <code>[[Database Module#db:store|db:store]]</code>
+
* Conversion Module: <code>[[Conversion Module#convert:binary-to-bytes|convert:binary-to-bytes]]</code>, <code>[[Conversion Module#convert:binary-to-string|convert:binary-to-string]]</code>
* <code>[[File Module#file:write-binary|file:write-binary]]</code>
+
* Database Module: <code>[[Database Module#db:store|db:store]]</code>
* <code>[[Fetch Module#file:write-text|file:write-text]]</code>
+
* File Module: <code>[[File Module#file:write-binary|file:write-binary]]</code>, <code>[[File Module#file:write-text|file:write-text]]</code> (if no encoding is specified)
 +
* [[Hashing Module]] (all functions)
  
 
The XQuery expression below serves as an example on how large files can be downloaded and written to a file with constant memory consumption:
 
The XQuery expression below serves as an example on how large files can be downloaded and written to a file with constant memory consumption:
  
<pre class="brush:xquery">
+
<syntaxhighlight lang="xquery">
 
file:write-binary('output.data', fetch:binary('http://files.basex.org/xml/xmark111mb.zip'))
 
file:write-binary('output.data', fetch:binary('http://files.basex.org/xml/xmark111mb.zip'))
</pre>
+
</syntaxhighlight>
 +
 
 +
If lazy items are serialized, they will be streamed as well.
  
 
=Conventions=
 
=Conventions=
  
All functions in this module are assigned to the <code><nowiki>http://basex.org/modules/stream</nowiki></code> namespace, which is statically bound to the {{Code|stream}} prefix.<br/>
+
All functions and errors in this module are assigned to the <code><nowiki>http://basex.org/modules/lazy</nowiki></code> namespace, which is statically bound to the {{Code|lazy}} prefix.<br/>
All errors are assigned to the <code><nowiki>http://basex.org/errors</nowiki></code> namespace, which is statically bound to the {{Code|bxerr}} prefix.
 
  
 
=Functions=
 
=Functions=
  
==stream:materialize==
+
==lazy:cache==
  
 
{| width='100%'
 
{| width='100%'
 
|-
 
|-
 
| width='120' | '''Signatures'''
 
| width='120' | '''Signatures'''
|{{Func|stream:materialize|$value as item()*|item()*}}
+
|{{Func|lazy:cache|$items as item()*|item()*}}<br/>{{Func|lazy:cache|$items as item()*, $lazy as xs:boolean|item()*}}
 
|-
 
|-
 
| '''Summary'''
 
| '''Summary'''
|Returns a materialized instance of the specified {{Code|$value}}:<br />
+
|Caches the data of lazy {{Code|$items}} in a sequence:<br />
* if an item is streamable, its value will be retrieved, and a new item containing the value will be returned.
+
* data of lazy items will be retrieved and cached inside the item.
* other, non-streamable items will simply be passed through.
+
* non-lazy items, or lazy items with cached data, will simply be passed through.
Materialization is advisable if a value is to be processed more than once, and is expensive to retrieve. It is mandatory whenever a value is invalidated before it is requested (see the example below).
+
* If {{Code|$lazy}} is set to {{Code|true()}}, caching will be deferred until the data is eventually requested. Streaming will be disabled: Data will always be cached before a stream is returned.
 +
Caching is advisable if an item will be processed more than once, or if the data may not be available anymore at a later stage.
 
|-
 
|-
 
| '''Example'''
 
| '''Example'''
|In the following example, a file will be deleted before its content is returned. To avoid a "file not found" error, the content will first be materialized:
+
|In the following example, a file will be deleted before its content is returned. To avoid a “file not found” error when serializing the result, the content must be cached:
<pre class="brush:xquery">
+
<syntaxhighlight lang="xquery">
 
let $file := 'data.txt'
 
let $file := 'data.txt'
let $data := stream:materialize(file:read-text($file))
+
let $text := lazy:cache(file:read-text($file))
return (file:delete($file), $data)
+
return (file:delete($file), $text)
</pre>
+
</syntaxhighlight>
 
|}
 
|}
  
==stream:is-streamable==
+
==lazy:is-lazy==
 +
 
 
{| width='100%'
 
{| width='100%'
 
|-
 
|-
 
| width='120' | '''Signatures'''
 
| width='120' | '''Signatures'''
|{{Func|stream:is-streamable|$item as item()|item()}}
+
|{{Func|lazy:is-lazy|$item as item()|xs:boolean}}
 
|-
 
|-
 
| '''Summary'''
 
| '''Summary'''
|Checks whether the specified {{Code|$item}} is streamable.  
+
|Checks whether the specified {{Code|$item}} is lazy.
 +
|}
 +
 
 +
==lazy:is-cached==
 +
 
 +
{| width='100%'
 +
|-
 +
| width='120' | '''Signatures'''
 +
|{{Func|lazy:is-cached|$item as item()|xs:boolean}}
 +
|-
 +
| '''Summary'''
 +
|Checks whether the contents of the specified {{Code|$item}} are cached. The function will always return {{Code|true}} for non-lazy items.
 
|}
 
|}
  
 
=Changelog=
 
=Changelog=
 +
 +
;Version 9.1
 +
 +
* Updated: [[#lazy:cache|lazy:cache]]: {{Code|$lazy}} argument added; support for sequences.
 +
 +
;Version 9.0
 +
 +
* Updated: Renamed from Streaming Module to Lazy Module.
 +
* Added: [[#lazy:is-cached|lazy:is-cached]]
  
 
;Version 8.0
 
;Version 8.0
* Update: [[#stream:materialize|stream:materialize]] extended to sequences.
+
 
 +
* Updated: [[#stream:materialize|stream:materialize]] extended to sequences.
  
 
This module was introduced with Version 7.7.
 
This module was introduced with Version 7.7.

Revision as of 15:20, 27 February 2020

This XQuery Module contains functions for handling lazy items.

In contrast to standard XQuery items, a lazy item contains a reference to the actual data, and the data itself is retrieved only when it is requested. Hence, possible errors are postponed, and a lazy item occupies no memory for its content until that content is requested.

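The following BaseX functions return lazy items:

  • Lazy Base64 binaries:
      ◦ db:retrieve
      ◦ fetch:binary
      ◦ file:read-binary
  • Lazy strings:
      ◦ fetch:text
      ◦ file:read-text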

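Some functions are capable of consuming the contents of lazy items in a streamable fashion: data will not be cached, but instead passed on to another target (file, the calling expression, etc.). The following streaming functions are currently available:

  • Archive Module (most functions)
  • Conversion Module: convert:binary-to-bytes, convert:binary-to-string
  • Database Module: db:store
  • File Module: file:write-binary, file:write-text (if no encoding is specified)
  • Hashing Module (all functions)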

The XQuery expression below shows how a large file can be downloaded and written to disk with constant memory consumption:

<syntaxhighlight lang="xquery">
file:write-binary('output.data', fetch:binary('http://files.basex.org/xml/xmark111mb.zip'))
</syntaxhighlight>

If lazy items are serialized, they will be streamed as well.
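
As a further sketch of a streaming consumer, the Hashing Module can compute a digest of a lazy binary without holding the whole file in memory. The file name large.bin is a hypothetical placeholder, and the cast to xs:hexBinary is only there to make the digest readable:

<syntaxhighlight lang="xquery">
(: Assumption: 'large.bin' is a hypothetical local file. :)
(: file:read-binary returns a lazy Base64 binary; hash:sha256 consumes it as a stream. :)
xs:hexBinary(hash:sha256(file:read-binary('large.bin')))
</syntaxhighlight>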

Conventions

All functions and errors in this module are assigned to the http://basex.org/modules/lazy namespace, which is statically bound to the lazy prefix.

Functions

lazy:cache

Signatures lazy:cache($items as item()*) as item()*
lazy:cache($items as item()*, $lazy as xs:boolean) as item()*
Summary Caches the data of lazy $items in a sequence:
  • data of lazy items will be retrieved and cached inside the item.
  • non-lazy items, or lazy items with cached data, will simply be passed through.
  • If $lazy is set to true(), caching will be deferred until the data is eventually requested. Streaming will be disabled: Data will always be cached before a stream is returned.

Caching is advisable if an item will be processed more than once, or if the data may not be available anymore at a later stage.

Example In the following example, a file will be deleted before its content is returned. To avoid a “file not found” error when serializing the result, the content must be cached:

<syntaxhighlight lang="xquery">
let $file := 'data.txt'
let $text := lazy:cache(file:read-text($file))
return (file:delete($file), $text)
</syntaxhighlight>
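
The two-argument signature can be sketched in a similar way. Assuming data.txt is a hypothetical file that exists when the query runs, the content should be cached once it is first requested, so that the second access does not need to read the file again:

<syntaxhighlight lang="xquery">
(: Assumption: 'data.txt' is a hypothetical existing text file. :)
let $file := 'data.txt'
(: $lazy = true(): caching is deferred until the data is requested :)
let $text := lazy:cache(file:read-text($file), true())
return (
  string-length($text),  (: requests the data :)
  upper-case($text)      (: requests the same item again :)
)
</syntaxhighlight>

Deferred caching may be a reasonable choice when it is not known in advance whether the item will be used at all, at the price of giving up streaming for that item.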

lazy:is-lazy

Signatures lazy:is-lazy($item as item()) as xs:boolean
Summary Checks whether the specified $item is lazy.

lazy:is-cached

Signatures lazy:is-cached($item as item()) as xs:boolean
Summary Checks whether the contents of the specified $item are cached. The function will always return true for non-lazy items.
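
The two check functions can be combined to inspect an item before and after caching. This is only a sketch, assuming that data.txt is a hypothetical existing file; no results are shown, since they depend on how BaseX evaluates the expression:

<syntaxhighlight lang="xquery">
(: Assumption: 'data.txt' is a hypothetical existing text file. :)
let $lazy   := file:read-text('data.txt')
let $cached := lazy:cache(file:read-text('data.txt'))
return (
  lazy:is-lazy($lazy),            (: is the freshly read string a lazy item? :)
  lazy:is-cached($cached),        (: have the contents been cached by lazy:cache? :)
  lazy:is-cached('plain string')  (: non-lazy item: true, according to the summary above :)
)
</syntaxhighlight>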

Changelog

Version 9.1
  • Updated: lazy:cache: $lazy argument added; support for sequences.
Version 9.0
  • Updated: Renamed from Streaming Module to Lazy Module.
  • Added: lazy:is-cached
Version 8.0
  • Updated: stream:materialize extended to sequences.

This module was introduced with Version 7.7.