Difference between revisions of "Lazy Module"

From BaseX Documentation
(29 intermediate revisions by 2 users not shown)
Line 1: Line 1:
This [[Module Library|XQuery Module]] contains functions for handling ''streamable'' items.
+
This [[Module Library|XQuery Module]] contains functions for handling ''lazy'' items.
  
In contrast to conventional XQuery items, streamable items may take up much less space, because they only contain a reference to the actual data. The data will only be retrieved if it is required, e.g. if it needs to be serialized or processed by another expression.
+
In contrast to standard XQuery items, a lazy item contains a reference to the actual data, and the data itself will only be retrieved if it is requested. Hence, possible errors will be postponed, and no memory will be occupied by a lazy item as long as its content has not been requested yet.
  
Currently, the following BaseX functions return streamable items:
+
The following BaseX functions return lazy items:
  
* <code>[[Database Module#db:retrieve|db:retrieve]]</code>
+
* Lazy Base64 binaries:
* <code>[[Fetch Module#fetch:binary|fetch:binary]]</code>
+
** <code>[[Database Module#db:retrieve|db:retrieve]]</code>
* <code>[[Fetch Module#fetch:text|fetch:text]]</code>
+
** <code>[[Fetch Module#fetch:binary|fetch:binary]]</code>
* <code>[[File Module#file:read-binary|file:read-binary]]</code>
+
** <code>[[File Module#file:read-binary|file:read-binary]]</code>
* <code>[[File Module#file:read-text|file:read-text]]</code>
 
  
The following functions are capable of processing streamed input:
+
* Lazy strings:
 +
** <code>[[Fetch Module#fetch:text|fetch:text]]</code>
 +
** <code>[[File Module#file:read-text|file:read-text]]</code>
  
* <code>[[Conversion Module#convert:binary-to-bytes|convert:binary-to-bytes]]</code>
+
Some functions are capable of consuming the contents of lazy items in a ''streamable'' fashion: data will not be cached, but instead passed on to another target (file, the calling expression, etc.). The following streaming functions are currently available:
* <code>[[Database Module#db:store|db:store]]</code>
+
 
* <code>[[Fetch Module#file:write-text|file:write-text]]</code>
+
* [[Archive Module]] (most functions)
* <code>[[File Module#file:write-binary|file:write-binary]]</code>
+
* Conversion Module: <code>[[Conversion Module#convert:binary-to-bytes|convert:binary-to-bytes]]</code>, <code>[[Conversion Module#convert:binary-to-string|convert:binary-to-string]]</code>
 +
* Database Module: <code>[[Database Module#db:store|db:store]]</code>
 +
* File Module: <code>[[File Module#file:write-binary|file:write-binary]]</code>, <code>[[File Module#file:write-text|file:write-text]] </code> (if no encoding is specified)
 +
* [[Hashing Module]] (all functions)
 +
 
 +
The XQuery expression below serves as an example on how large files can be downloaded and written to a file with constant memory consumption:
 +
 
 +
<pre class="brush:xquery">
 +
file:write-binary('output.data', fetch:binary('http://files.basex.org/xml/xmark111mb.zip'))
 +
</pre>
 +
 
 +
If lazy items are serialized, they will be streamed as well.
  
 
=Conventions=
 
=Conventions=
  
All functions in this module are assigned to the {{Code|http://basex.org/modules/stream}} namespace, which is statically bound to the {{Code|stream}} prefix.<br/>
+
All functions and errors in this module are assigned to the <code><nowiki>http://basex.org/modules/lazy</nowiki></code> namespace, which is statically bound to the {{Code|lazy}} prefix.<br/>
All errors are assigned to the {{Code|http://basex.org/errors}} namespace, which is statically bound to the {{Code|bxerr}} prefix.
 
  
 
=Functions=
 
=Functions=
  
==stream:materialize==
+
==lazy:cache==
 +
 
 
{| width='100%'
 
{| width='100%'
 
|-
 
|-
| width='90' | '''Signatures'''
+
| width='120' | '''Signatures'''
|{{Func|stream:materialize|$item as item()|item()}}
+
|{{Func|lazy:cache|$items as item()*|item()*}}<br/>{{Func|lazy:cache|$items as item()*, $lazy as xs:boolean|item()*}}
 
|-
 
|-
 
| '''Summary'''
 
| '''Summary'''
|Returns a materialized instance of the specified {{Code|$item}}.<br />If an item is streamable, its content will be retrieved, and a new item containing its data will be returned. Other, non-streamable items will simply be returned.
+
|Caches the data of lazy {{Code|$items}} in a sequence:<br />
 +
* data of lazy items will be retrieved and cached inside the item.
 +
* non-lazy items, or lazy items with cached data, will simply be passed through.
 +
* If {{Code|$lazy}} is set to {{Code|true()}}, caching will be deferred until the data is eventually requested. Streaming will be disabled: Data will always be cached before a stream is returned.
 +
Caching is advisable if an item will be processed more than once, or if the data may not be available anymore at a later stage.
 +
|-
 +
| '''Example'''
 +
|In the following example, a file will be deleted before its content is returned. To avoid a “file not found” error when serializing the result, the content must be cached:
 +
<pre class="brush:xquery">
 +
let $file := 'data.txt'
 +
let $text := lazy:cache(file:read-text($file))
 +
return (file:delete($file), $text)
 +
</pre>
 
|}
 
|}
  
==stream:is-streamable==
+
==lazy:is-lazy==
 +
 
 
{| width='100%'
 
{| width='100%'
 
|-
 
|-
| width='90' | '''Signatures'''
+
| width='120' | '''Signatures'''
|{{Func|stream:is-streamable|$item as item()|item()}}
+
|{{Func|lazy:is-lazy|$item as item()|xs:boolean}}
 
|-
 
|-
 
| '''Summary'''
 
| '''Summary'''
|Checks if the specified {{Code|$item}} is streamable.
+
|Checks whether the specified {{Code|$item}} is lazy.
 +
|}
 +
 
 +
==lazy:is-cached==
 +
 
 +
{| width='100%'
 +
|-
 +
| width='120' | '''Signatures'''
 +
|{{Func|lazy:is-cached|$item as item()|xs:boolean}}
 +
|-
 +
| '''Summary'''
 +
|Checks whether the contents of the specified {{Code|$item}} are cached. The function will always return {{Code|true}} for non-lazy items.
 
|}
 
|}
  
 
=Changelog=
 
=Changelog=
 +
 +
;Version 9.1
 +
 +
* Updated: [[#lazy:cache|lazy:cache]]: {{Code|$lazy}} argument added; support for sequences.
 +
 +
;Version 9.0
 +
 +
* Updated: Renamed from Streaming Module to Lazy Module.
 +
* Added: [[#lazy:is-cached|lazy:is-cached]]
 +
 +
;Version 8.0
 +
 +
* Updated: [[#stream:materialize|stream:materialize]] extended to sequences.
  
 
This module was introduced with Version 7.7.
 
This module was introduced with Version 7.7.
 
[[Category:XQuery]]
 

Revision as of 13:40, 16 April 2019

This XQuery Module contains functions for handling lazy items.

In contrast to standard XQuery items, a lazy item contains a reference to the actual data, and the data itself will only be retrieved if it is requested. Hence, possible errors will be postponed, and no memory will be occupied by a lazy item as long as its content has not been requested yet.

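The following BaseX functions return lazy items:

  • Lazy Base64 binaries: db:retrieve, fetch:binary, file:read-binary
  • Lazy strings: fetch:text, file:read-text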

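Some functions are capable of consuming the contents of lazy items in a streamable fashion: data will not be cached, but instead passed on to another target (a file, the calling expression, etc.). The following streaming functions are currently available:

  • Archive Module (most functions)
  • Conversion Module: convert:binary-to-bytes, convert:binary-to-string
  • Database Module: db:store
  • File Module: file:write-binary, file:write-text (if no encoding is specified)
  • Hashing Module (all functions)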

The XQuery expression below shows how a large file can be downloaded and written to disk with constant memory consumption:

file:write-binary('output.data', fetch:binary('http://files.basex.org/xml/xmark111mb.zip'))

If lazy items are serialized, they will be streamed as well.
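
The same streaming behavior should apply to the Hashing Module, which this page lists among the streaming consumers. The following is a minimal sketch, assuming a file named large.bin (a hypothetical name) exists in the working directory; it is meant to illustrate the idea rather than to document guaranteed behavior:

```xquery
(: Hypothetical file name; replace it with an existing file. :)
let $path := 'large.bin'
(: file:read-binary is listed above as returning a lazy item,
   so no data should be read at this point. :)
let $data := file:read-binary($path)
(: hash:md5 consumes the lazy item; the result is cast to
   xs:hexBinary for a more readable checksum. :)
return xs:hexBinary(hash:md5($data))
```

Because the hash function consumes its input as a stream, the file contents should not need to be held in memory as a whole.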

Conventions

All functions and errors in this module are assigned to the http://basex.org/modules/lazy namespace, which is statically bound to the lazy prefix.

Functions

lazy:cache

Signatures lazy:cache($items as item()*) as item()*
lazy:cache($items as item()*, $lazy as xs:boolean) as item()*
Summary Caches the data of lazy $items in a sequence:
  • Data of lazy items will be retrieved and cached inside the item.
  • Non-lazy items, and lazy items whose data is already cached, will simply be passed through.
  • If $lazy is set to true(), caching will be deferred until the data is eventually requested. Streaming will be disabled: data will always be cached before a stream is returned.

Caching is advisable if an item will be processed more than once, or if the data may no longer be available at a later stage.

Example In the following example, a file will be deleted before its content is returned. To avoid a “file not found” error when serializing the result, the content must be cached:
let $file := 'data.txt'
let $text := lazy:cache(file:read-text($file))
return (file:delete($file), $text)
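
The second signature can be sketched in a similar way. The example below assumes a file named data.bin (a hypothetical name) and processes its content twice; with $lazy set to true(), the data should only be read and cached once it is first requested, and the second hash function can then work on the cached copy:

```xquery
(: Hypothetical file name; replace it with an existing file. :)
let $data := lazy:cache(file:read-binary('data.bin'), true())
return (
  (: The first access is expected to trigger reading and caching. :)
  hash:md5($data),
  (: The second access can reuse the cached data. :)
  hash:sha256($data)
)
```

Without caching, a lazy item that is processed twice would have to retrieve its data twice from the original source.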

lazy:is-lazy

Signatures lazy:is-lazy($item as item()) as xs:boolean
Summary Checks whether the specified $item is lazy.
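
A minimal sketch of how the check might be used, assuming a file named data.txt (a hypothetical name) exists. Going by the list of functions at the top of this page, the first call would be expected to report a lazy item and the second a non-lazy one, but the exact result may depend on the BaseX version and its optimizations:

```xquery
(
  (: file:read-text is listed above as returning a lazy string. :)
  lazy:is-lazy(file:read-text('data.txt')),
  (: A string literal is a standard, non-lazy item. :)
  lazy:is-lazy('plain string')
)
```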

lazy:is-cached

Signatures lazy:is-cached($item as item()) as xs:boolean
Summary Checks whether the contents of the specified $item are cached. The function will always return true for non-lazy items.
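
The following sketch combines the check with lazy:cache. It assumes a file named data.txt (a hypothetical name) exists; per the descriptions on this page, both items in the result would be expected to count as cached, the first because lazy:cache was applied and the second because it is not lazy:

```xquery
(: Hypothetical file name; replace it with an existing file. :)
let $text := file:read-text('data.txt')
(: Retrieve the data and cache it inside the item. :)
let $cached := lazy:cache($text)
return (
  lazy:is-cached($cached),
  (: An integer is a non-lazy item. :)
  lazy:is-cached(42)
)
```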

Changelog

Version 9.1
  • Updated: lazy:cache: $lazy argument added; support for sequences.
Version 9.0
  • Updated: Renamed from Streaming Module to Lazy Module.
  • Added: lazy:is-cached
Version 8.0
  • Updated: stream:materialize extended to sequences.

This module was introduced with Version 7.7.