3,811 bytes added ,  16:21, 27 February 2020
This [[Module Library|XQuery Module]] provides simple functions to fetch the content of resources identified by URIs. Resources can be stored locally or remotely and may use, for example, the {{Code|file://}} or {{Code|http://}} scheme. If more control over HTTP requests is required, the [[HTTP Client Module]] can be used. With the [[HTML Module]], retrieved HTML documents can be converted to XML.
The module was initially inspired by [http://www.zorba-xquery.com/html/modules/zorba/io/fetch Zorba’s Fetch Module].

=Conventions=

All functions and errors in this module are assigned to the <code><nowiki>http://basex.org/modules/fetch</nowiki></code> namespace, which is statically bound to the {{Code|fetch}} prefix.<br/>
URI arguments can be URLs or point to local files. Relative file paths will be resolved against the ''current working directory'' (for more details, have a look at the [[File Module#File Paths|File Module]]).
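
The resolution of relative paths can be illustrated with a minimal sketch. The file name {{Code|data/notes.txt}} is hypothetical and assumed to exist below the current working directory; <code>file:resolve-path</code> from the File Module is used here to build the absolute counterpart:

<syntaxhighlight lang="xquery">
(: "data/notes.txt" is a placeholder for a file below the working directory :)
let $relative := fetch:text("data/notes.txt")
(: the same file, addressed via its absolute path :)
let $absolute := fetch:text(file:resolve-path("data/notes.txt"))
return $relative = $absolute
</syntaxhighlight>

Under these assumptions, both calls should address the same file and return the same text.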
=Functions=
 
==fetch:binary==
 
{| width='100%'
|-
| width='120' | '''Signatures'''
|{{Func|fetch:binary|$uri as xs:string|xs:base64Binary}}<br/>
|-
| '''Summary'''
|Fetches the resource referred to by the given URI and returns it as [[Lazy Module|lazy]] {{Code|xs:base64Binary}} item.
|-
| '''Errors'''
|{{Error|open|#Errors}} the URI could not be resolved, or the resource could not be retrieved.
|-
| '''Examples'''
|
* <code><nowiki>fetch:binary("http://images.trulia.com/blogimg/c/5/f/4/679932_1298401950553_o.jpg")</nowiki></code> returns the addressed image.
* <code><nowiki>lazy:cache(fetch:binary("http://en.wikipedia.org"))</nowiki></code> enforces the fetch operation (otherwise, it will be delayed until requested first).
|}
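
As a sketch of how the lazy result might be put to use, the following query writes a fetched resource to a local file. Both the URL and the target file name are placeholders and not part of the original examples:

<syntaxhighlight lang="xquery">
(: placeholder URL and target path; adjust both to your environment :)
let $uri    := "http://example.org/image.jpg"
let $target := "image.jpg"
(: the lazy item is only consumed when it is written :)
return file:write-binary($target, fetch:binary($uri))
</syntaxhighlight>

Because the item is lazy, the content should not need to be fully materialized before it is passed on to <code>file:write-binary</code>.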
==fetch:text==
 
{| width='100%'
|-
| width='120' | '''Signatures'''
|{{Func|fetch:text|$uri as xs:string|xs:string}}<br/>{{Func|fetch:text|$uri as xs:string, $encoding as xs:string|xs:string}}<br/>{{Func|fetch:text|$uri as xs:string, $encoding as xs:string, $fallback as xs:boolean|xs:string}}<br/>
|-
| '''Summary'''
|Fetches the resource referred to by the given {{Code|$uri}} and returns it as [[Lazy Module|lazy]] {{Code|xs:string}} item:
* The UTF-8 default encoding can be overwritten with the optional {{Code|$encoding}} argument.
* By default, invalid characters will be rejected. If {{Code|$fallback}} is set to true, these characters will be replaced with the Unicode replacement character <code>FFFD</code> (&#xFFFD;).
|-
| '''Errors'''
|{{Error|open|#Errors}} the URI could not be resolved, or the resource could not be retrieved. Invalid XML characters will be ignored if the <code>[[Options#CHECKSTRINGS|CHECKSTRINGS]]</code> option is turned off.<br/>{{Error|encoding|#Errors}} the specified encoding is not supported, or unknown.
|-
| '''Examples'''
|
* <code><nowiki>fetch:text("http://en.wikipedia.org")</nowiki></code> returns a string representation of the English Wikipedia main HTML page.
* <code><nowiki>fetch:text("http://www.bbc.com","US-ASCII",true())</nowiki></code> returns the BBC homepage in US-ASCII with all non-US-ASCII characters replaced with &#xFFFD;.
* <code><nowiki>lazy:cache(fetch:text("http://en.wikipedia.org"))</nowiki></code> enforces the fetch operation (otherwise, it will be delayed until requested first).
|}
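
The effect of the {{Code|$fallback}} argument might be inspected with a sketch like the following, which reuses the BBC example and counts how many characters were replaced. The counting approach is an illustration and not part of the module:

<syntaxhighlight lang="xquery">
(: read the page as US-ASCII and replace invalid characters with U+FFFD :)
let $text := fetch:text("http://www.bbc.com", "US-ASCII", true())
(: 65533 is the decimal codepoint of the replacement character U+FFFD :)
return count(string-to-codepoints($text)[. = 65533])
</syntaxhighlight>

A result of {{Code|0}} would indicate that no character had to be replaced.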
==fetch:xml==

{| width='100%'
|-
| width='120' | '''Signatures'''
|{{Func|fetch:xml|$uri as xs:string|document-node()}}<br/>{{Func|fetch:xml|$uri as xs:string, $options as map(*)?|document-node()}}
|-
| '''Summary'''
|Fetches the resource referred to by the given {{Code|$uri}} and returns it as document node.<br/>The {{Code|$options}} argument can be used to change the parsing behavior. Allowed options are all [[Options#Parsing|parsing]] and [[Options#XML Parsing|XML parsing]] options in lower case.<br/>The function is different to <code>fn:doc</code> in various aspects:
* As it is non-deterministic, a new document node will be created by each call of this function.
* A document created by this function will be garbage-collected as soon as it is not referenced anymore.
* URIs will not be resolved against existing databases. As a result, it will not trigger any locks (see [[Transaction Management#Limitations|limitations of database locking]] for more details).
|-
| '''Errors'''
|{{Error|open|#Errors}} the URI could not be resolved, or the resource could not be retrieved.
|-
| '''Examples'''
|
* Retrieve an XML representation of the English Wikipedia main HTML page, chop all whitespace nodes:
<syntaxhighlight lang="xquery">
fetch:xml("http://en.wikipedia.org", map { 'chop': true() })
</syntaxhighlight>
* Return a document located in the current base directory:
<syntaxhighlight lang="xquery">
fetch:xml(file:base-dir() || "example.xml")
</syntaxhighlight>
* Return a web page as XML, preserve namespaces:
<syntaxhighlight lang="xquery">
fetch:xml(
  'http://basex.org/',
  map {
    'parser': 'html',
    'htmlparser': map { 'nons': false() }
  }
)
</syntaxhighlight>
|}

==fetch:xml-binary==

{| width='100%'
|-
| width='120' | '''Signatures'''
|{{Func|fetch:xml-binary|$data as xs:base64Binary|document-node()}}<br/>{{Func|fetch:xml-binary|$data as xs:base64Binary, $options as map(*)?|document-node()}}
|-
| '''Summary'''
|Parses binary {{Code|$data}} and returns it as document node.<br/>In contrast to <code>fn:parse-xml</code>, which expects an XQuery string, the input of this function can be arbitrarily encoded. The encoding will be derived from the XML declaration or (in case of UTF16 or UTF32) from the first bytes of the input.<br/>The {{Code|$options}} argument can be used to change the parsing behavior. Allowed options are all [[Options#Parsing|parsing]] and [[Options#XML Parsing|XML parsing]] options in lower case.
|-
| '''Examples'''
|
* Retrieves file input as binary data and parses it as XML:
<syntaxhighlight lang="xquery">
fetch:xml-binary(file:read-binary('doc.xml'))
</syntaxhighlight>
* Encodes a string as CP1252 and parses it as XML. The input and the string {{Code|touché}} will be correctly decoded because of the XML declaration:
<syntaxhighlight lang="xquery">
fetch:xml-binary(convert:string-to-base64(
  "<?xml version='1.0' encoding='CP1252'?><xml>touché</xml>",
  "CP1252"
))
</syntaxhighlight>
* Encodes a string as UTF16 and parses it as XML. The document will be correctly decoded, as the first bytes of the data indicate that the input must be UTF16:
<syntaxhighlight lang="xquery">
fetch:xml-binary(convert:string-to-base64("<xml/>", "UTF16"))
</syntaxhighlight>
|}
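
The function can also be combined with [[#fetch:binary|fetch:binary]]. The following sketch assumes that a file named {{Code|example.xml}} exists in the current base directory (as in the <code>fetch:xml</code> example) and passes a parsing option via the {{Code|$options}} argument:

<syntaxhighlight lang="xquery">
(: "example.xml" is assumed to exist in the current base directory :)
let $bytes := fetch:binary(file:base-dir() || "example.xml")
(: the encoding is derived from the XML declaration or the first bytes :)
return fetch:xml-binary($bytes, map { 'chop': true() })
</syntaxhighlight>

This two-step form may be useful if the binary data is needed for other purposes before it is parsed.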
==fetch:content-type==
 
{| width='100%'
|-
| width='120' | '''Signatures'''
|{{Func|fetch:content-type|$uri as xs:string|xs:string}}<br/>
|-
| '''Summary'''
|Returns the content-type (also called mime-type) of the resource specified by {{Code|$uri}}:
* If a remote resource is addressed, the request header will be evaluated.
* If the addressed resource is locally stored, the content-type will be guessed based on the file extension.
|-
| '''Errors'''
|{{Error|open|#Errors}} the URI could not be resolved, or the resource could not be retrieved.
|-
| '''Examples'''
|
|}

=Errors=
{| class="wikitable" width="100%"
! width="110"|Code
! Description
|-
|{{Code|encoding}}
|The specified encoding is not supported, or unknown.
|-
|{{Code|open}}
|The URI could not be resolved, or the resource could not be retrieved.
|}
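
Since the errors are assigned to the module namespace, they can be caught by name in a <code>try</code>/<code>catch</code> expression. The following is a minimal sketch; the URL is a placeholder that is not expected to resolve, and the {{Code|error}} element is only an illustration:

<syntaxhighlight lang="xquery">
(: placeholder URL that is assumed not to be resolvable :)
let $uri := "http://example.invalid/missing.xml"
return try {
  fetch:xml($uri)
} catch fetch:open {
  (: $err:description holds the message of the caught error :)
  <error uri="{ $uri }">{ $err:description }</error>
}
</syntaxhighlight>

For the lazy functions <code>fetch:text</code> and <code>fetch:binary</code>, the call may need to be wrapped in <code>lazy:cache</code> inside the <code>try</code> block, as the fetch operation is otherwise delayed until the result is requested.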
=Changelog=
 
;Version 9.0
 
* Added: [[#fetch:xml-binary|fetch:xml-binary]]
* Updated: error codes updated; errors now use the module namespace
 
;Version 8.5
 
* Updated: [[#fetch:text|fetch:text]]: <code>$fallback</code> argument added.
 
;Version 8.0
 
* Added: [[#fetch:xml|fetch:xml]]

The module was introduced with Version 7.6.
 
[[Category:XQuery]]