Difference between revisions of "Fetch Module"

From BaseX Documentation
Line 3: Line 3:
 
=Conventions=
 
=Conventions=
  
All functions in this module are assigned to the <code><nowiki>http://basex.org/modules/fetch</nowiki></code> namespace, which is statically bound to the {{Code|fetch}} prefix.<br/>
+
{{Mark|Updated with Version 9.0}}:
All errors are assigned to the <code><nowiki>http://basex.org/errors</nowiki></code> namespace, which is statically bound to the {{Code|bxerr}} prefix.
+
 
 +
All functions and errors in this module are assigned to the <code><nowiki>http://basex.org/modules/fetch</nowiki></code> namespace, which is statically bound to the {{Code|fetch}} prefix.<br/>
  
 
URI arguments can be URLs or point to local files. Relative file paths will be resolved against the ''current working directory'' (for more details, have a look at the [[File Module#File Paths|File Module]]).
 
URI arguments can be URLs or point to local files. Relative file paths will be resolved against the ''current working directory'' (for more details, have a look at the [[File Module#File Paths|File Module]]).
Line 21: Line 22:
 
|-
 
|-
 
| '''Errors'''
 
| '''Errors'''
|{{Error|BXFE0001|XQuery Errors#Functions Errors}} the URI could not be resolved, or the resource could not be retrieved.
+
|{{Error|open|XQuery Errors#Functions Errors}} the URI could not be resolved, or the resource could not be retrieved.
 
|-
 
|-
 
| '''Examples'''
 
| '''Examples'''
Line 42: Line 43:
 
|-
 
|-
 
| '''Errors'''
 
| '''Errors'''
|{{Error|BXFE0001|XQuery Errors#Functions Errors}} the URI could not be resolved, or the resource could not be retrieved.<br/>{{Error|BXFE0002|XQuery Errors#Functions Errors}} the specified encoding is not supported, or unknown.
+
|{{Error|open|XQuery Errors#Functions Errors}} the URI could not be resolved, or the resource could not be retrieved.<br/>{{Error|encoding|XQuery Errors#Functions Errors}} the specified encoding is not supported, or unknown.
 
|-
 
|-
 
| '''Examples'''
 
| '''Examples'''
Line 62: Line 63:
 
|-
 
|-
 
| '''Errors'''
 
| '''Errors'''
|{{Error|BXFE0001|XQuery Errors#Functions Errors}} the URI could not be resolved, or the resource could not be retrieved.
+
|{{Error|open|XQuery Errors#Functions Errors}} the URI could not be resolved, or the resource could not be retrieved.
 
|-
 
|-
 
| '''Examples'''
 
| '''Examples'''
Line 120: Line 121:
 
|-
 
|-
 
| '''Errors'''
 
| '''Errors'''
|{{Error|BXFE0001|XQuery Errors#Functions Errors}} the URI could not be resolved, or the resource could not be retrieved.
+
|{{Error|open|XQuery Errors#Functions Errors}} the URI could not be resolved, or the resource could not be retrieved.
 
|-
 
|-
 
| '''Examples'''
 
| '''Examples'''
Line 128: Line 129:
  
 
=Errors=
 
=Errors=
 +
 +
{{Mark|Updated with Version 9.0}}:
  
 
{| class="wikitable" width="100%"
 
{| class="wikitable" width="100%"
Line 133: Line 136:
 
|Description
 
|Description
 
|-
 
|-
|{{Code|BXFE0001}}
+
|{{Code|encoding}}
 +
|The specified encoding is not supported, or unknown.
 +
|-
 +
|{{Code|open}}
 
|The URI could not be resolved, or the resource could not be retrieved.
 
|The URI could not be resolved, or the resource could not be retrieved.
|-
 
|{{Code|BXFE0002}}
 
|The specified encoding is not supported, or unknown.
 
 
|}
 
|}
  
Line 145: Line 148:
  
 
* Added: [[#fetch:xml-binary|fetch:xml-binary]]
 
* Added: [[#fetch:xml-binary|fetch:xml-binary]]
 +
* Updated: error codes updated; errors now use the module namespace
  
 
;Version 8.5
 
;Version 8.5

Revision as of 12:03, 21 November 2017

This XQuery Module provides simple functions to fetch the content of resources identified by URIs. Resources can be stored locally or remotely and can be addressed, for example, with the file:// or http:// scheme. If more control over HTTP requests is required, the HTTP Module can be used. With the HTML Module, retrieved HTML documents can be converted to XML.
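
The combination with the HTML Module can be sketched as follows. This is a minimal example, not a definitive recipe: it assumes that html:parse accepts binary input and that the TagSoup parser is available on the classpath, as described in the HTML Module documentation. The URI is the one used in the examples further down.

```xquery
(: Sketch: fetch a page as binary data and convert it to XML.
   Assumption: TagSoup is on the classpath; without it, the
   conversion of real-world HTML is likely to fail. :)
let $uri := "http://en.wikipedia.org"
return html:parse(fetch:binary($uri))
```

Passing binary data rather than a string leaves the detection of the encoding to the parser.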

Conventions

Updated with Version 9.0:

All functions and errors in this module are assigned to the http://basex.org/modules/fetch namespace, which is statically bound to the fetch prefix.

URI arguments can be URLs or point to local files. Relative file paths will be resolved against the current working directory (for more details, have a look at the File Module).
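
The following sketch illustrates the path resolution described above. The file name notes.txt is hypothetical, and the example assumes that the file exists in both directories and that file:current-dir() and file:base-dir() from the File Module return paths with a trailing directory separator.

```xquery
(: Sketch: three ways to address a local file named notes.txt
   (hypothetical file name). :)
(
  (: relative path, resolved against the current working directory :)
  fetch:text("notes.txt"),
  (: the same location, spelled out explicitly :)
  fetch:text(file:current-dir() || "notes.txt"),
  (: a file next to the query file, which may be a different directory :)
  fetch:text(file:base-dir() || "notes.txt")
)
```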

Functions

fetch:binary

Signatures fetch:binary($uri as xs:string) as xs:base64Binary
Summary Fetches the resource referred to by the given URI and returns it as streamable xs:base64Binary.
Errors open: the URI could not be resolved, or the resource could not be retrieved.
Examples
  • fetch:binary("http://images.trulia.com/blogimg/c/5/f/4/679932_1298401950553_o.jpg") returns the addressed image.
  • stream:materialize(fetch:binary("http://en.wikipedia.org")) returns a materialized representation of the streamable result.
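
A common use of fetch:binary is to store a remote resource locally. The sketch below combines it with file:create-temp-file and file:write-binary from the File Module; the URI is a placeholder and not taken from this page.

```xquery
(: Sketch: download a resource and write it to a temporary file.
   The URI is a placeholder; replace it with a reachable address. :)
let $uri    := "http://example.org/image.jpg"
let $target := file:create-temp-file("fetch", ".jpg")
return (
  file:write-binary($target, fetch:binary($uri)),
  (: return the path of the written file :)
  $target
)
```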

fetch:text

Signatures fetch:text($uri as xs:string) as xs:string
fetch:text($uri as xs:string, $encoding as xs:string) as xs:string
fetch:text($uri as xs:string, $encoding as xs:string, $fallback as xs:boolean) as xs:string
Summary Fetches the resource referred to by the given $uri and returns it as streamable xs:string:
  • The UTF-8 default encoding can be overridden with the optional $encoding argument.
  • By default, invalid characters will be rejected. If $fallback is set to true, these characters will be replaced with the Unicode replacement character FFFD (�).
Errors open: the URI could not be resolved, or the resource could not be retrieved.
encoding: the specified encoding is not supported, or unknown.
Examples
  • fetch:text("http://en.wikipedia.org") returns a string representation of the English Wikipedia main HTML page.
  • fetch:text("http://www.bbc.com","US-ASCII",true()) returns the BBC homepage in US-ASCII with all non-US-ASCII characters replaced with �.
  • stream:materialize(fetch:text("http://en.wikipedia.org")) returns a materialized representation of the streamable result.
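
As a further sketch, the $encoding argument is useful when a text resource is not stored as UTF-8. The example assumes a CSV file in ISO-8859-1 at a placeholder URI and passes the fetched string to csv:parse from the CSV Module.

```xquery
(: Sketch: fetch a CSV file with a non-default encoding and parse it.
   The URI and the encoding of the remote file are assumptions. :)
let $uri  := "http://example.org/data.csv"
let $text := fetch:text($uri, "ISO-8859-1")
return csv:parse($text, map { 'header': true() })
```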

fetch:xml

Signatures fetch:xml($uri as xs:string) as document-node()
fetch:xml($uri as xs:string, $options as map(*)) as document-node()
Summary Fetches the resource referred to by the given $uri and returns it as an XML document node.
In contrast to fn:doc, each function call returns a different document node. As a consequence, document instances created by this function will not be kept in memory until the end of query evaluation.
The $options argument can be used to change the parsing behavior. Allowed options are all parsing and XML parsing options in lower case.
Errors open: the URI could not be resolved, or the resource could not be retrieved.
Examples
  • Retrieve an XML representation of the English Wikipedia main HTML page, chop all whitespace nodes:
fetch:xml("http://en.wikipedia.org", map { 'chop': true() })
  • Return a document located in the current base directory:
fetch:xml(file:base-dir() || "example.xml")
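
The difference to fn:doc described in the summary can be made visible with a node identity comparison. This is a sketch under the assumption that example.xml exists in the base directory: going by the summary, the first comparison should succeed, because fn:doc returns the same node for the same URI, and the second should not.

```xquery
(: Sketch: compare node identity for repeated calls.
   example.xml is assumed to exist in the base directory. :)
let $uri := file:base-dir() || "example.xml"
return (
  (: fn:doc is stable within a query :)
  doc($uri) is doc($uri),
  (: each fetch:xml call creates a new document node :)
  fetch:xml($uri) is fetch:xml($uri)
)
```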

fetch:xml-binary

Introduced with Version 9.0:

Signatures fetch:xml-binary($data as xs:base64Binary) as document-node()
fetch:xml-binary($data as xs:base64Binary, $options as map(*)) as document-node()
Summary Parses binary $data and returns it as an XML document node.
In contrast to fn:parse-xml, which expects an XQuery string, the input of this function can be arbitrarily encoded. The encoding will be derived from the XML declaration or (in case of UTF16 or UTF32) from the first bytes of the input.
The $options argument can be used to change the parsing behavior. Allowed options are all parsing and XML parsing options in lower case.
Examples
  • Retrieves file input as binary data and parses it as XML:
fetch:xml-binary(file:read-binary('doc.xml'))
  • Encodes a string as CP1252 and parses it as XML. The input and the string touché will be correctly decoded because of the XML declaration:
fetch:xml-binary(convert:string-to-base64(
  "<?xml version='1.0' encoding='CP1252'?><xml>touché</xml>",
  "CP1252"
))
  • Encodes a string as UTF16 and parses it as XML. The document will be correctly decoded, as the first bytes of the data indicate that the input must be UTF16:
fetch:xml-binary(convert:string-to-base64("<xml/>", "UTF16"))

fetch:content-type

Signatures fetch:content-type($uri as xs:string) as xs:string
Summary Returns the content-type (also called mime-type) of the resource specified by $uri:
  • If a remote resource is addressed, the Content-Type header of the response will be evaluated.
  • If the addressed resource is locally stored, the content-type will be guessed based on the file extension.
Errors open: the URI could not be resolved, or the resource could not be retrieved.
Examples
  • fetch:content-type("http://docs.basex.org/skins/vector/images/wiki.png") returns image/png.
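
The content type can be used to decide how a resource should be retrieved. The following sketch defines a hypothetical helper function local:load, which is not part of the module. It uses substring checks so that a returned value with additional parameters would still match.

```xquery
(: Sketch: choose a fetch function based on the content type.
   local:load is a hypothetical helper, not part of the module. :)
declare function local:load($uri as xs:string) as item() {
  let $type := fetch:content-type($uri)
  return
    if (contains($type, "xml")) then fetch:xml($uri)
    else if (starts-with($type, "text/")) then fetch:text($uri)
    else fetch:binary($uri)
};

local:load("http://docs.basex.org/skins/vector/images/wiki.png")
```

Note that this approach addresses the resource twice: once to determine the content type and once to retrieve the content.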

Errors

Updated with Version 9.0:

Code Description
encoding The specified encoding is not supported, or unknown.
open The URI could not be resolved, or the resource could not be retrieved.
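
Since Version 9.0, the errors can be caught by their names in the module namespace. A minimal sketch, assuming Version 9.0 or later and a placeholder URI:

```xquery
(: Sketch: handle the two errors of this module separately.
   The URI is a placeholder. :)
let $uri := "http://example.org/missing.txt"
return try {
  fetch:text($uri, "UTF-8")
} catch fetch:open {
  "Could not retrieve " || $uri || ": " || $err:description
} catch fetch:encoding {
  "Unsupported encoding: " || $err:description
}
```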

Changelog

Version 9.0
  • Added: fetch:xml-binary
  • Updated: error codes updated; errors now use the module namespace
Version 8.5
Version 8.0

The module was introduced with Version 7.6.