Fetch Module
Revision as of 18:30, 1 December 2023
This XQuery Module provides simple functions to fetch the content of resources identified by URIs. Resources can be stored locally or remotely and use, for example, the file:// or http:// scheme. If more control over HTTP requests is required, the HTTP Client Module can be used. With the HTML Module, retrieved HTML documents can be converted to XML.
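As a rough sketch of the last point, fetched bytes can be handed to the HTML Module. This assumes that html:parse accepts a binary item and that an HTML parser is available on the classpath; the URL is a placeholder, not a resource known to exist.

<syntaxhighlight lang="xquery">
(: fetch a page as binary data and let the HTML Module convert it to XML :)
html:parse(fetch:binary('http://example.org/index.html'))
</syntaxhighlight>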
Conventions
All functions and errors in this module are assigned to the http://basex.org/modules/fetch namespace, which is statically bound to the fetch prefix.
URI arguments can be URLs or point to local files. Relative file paths will be resolved against the current working directory (for more details, have a look at the File Module).
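As a brief illustration of these conventions, the following sketch passes a remote URL, a file:// URI and a relative path to fetch:text. The host and file names are placeholders chosen for this example.

<syntaxhighlight lang="xquery">
(: remote resource, addressed by a URL :)
fetch:text('http://example.org/data.txt'),
(: local resource, addressed by a file:// URI :)
fetch:text('file:///tmp/data.txt'),
(: relative path, resolved against the current working directory :)
fetch:text('data.txt')
</syntaxhighlight>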
Functions
fetch:binary
Signature | fetch:binary( $href as xs:string ) as xs:base64Binary |
Summary | Fetches the resource referred to by the given href string and returns it as a lazy xs:base64Binary item.
|
Errors | open : the URI could not be resolved, or the resource could not be retrieved.
|
Examples |
|
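A minimal sketch of how fetch:binary might be used: the remote resource is fetched as binary data and written to a local file with file:write-binary from the File Module. The URL and the target file name are hypothetical.

<syntaxhighlight lang="xquery">
(: download a remote resource and store it unchanged in a local file :)
file:write-binary(
  'logo.png',
  fetch:binary('http://example.org/logo.png')
)
</syntaxhighlight>

Since the item is returned lazily, the data is expected to be read only once it is actually consumed, here by the write operation.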
fetch:text
Signature | fetch:text( $href as xs:string, $encoding as xs:string := (), $fallback as xs:boolean? := false() ) as xs:string |
Summary | Fetches the resource referred to by the given href string and returns it as a lazy xs:string item.
|
Errors | open : the URI could not be resolved, or the resource could not be retrieved.
encoding : the specified encoding is not supported, or unknown.
|
Examples |
|
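The following sketch shows the three call variants allowed by the signature. The resource names are placeholders, and the comment on $fallback reflects an assumption about its effect (replacing invalid characters instead of rejecting the input) that should be checked against the BaseX version in use.

<syntaxhighlight lang="xquery">
(: default call, no explicit encoding :)
fetch:text('http://example.org/page.txt'),
(: explicit encoding for a legacy file :)
fetch:text('legacy.txt', 'ISO-8859-1'),
(: explicit encoding, fallback enabled (assumed to tolerate invalid characters) :)
fetch:text('legacy.txt', 'ISO-8859-1', true())
</syntaxhighlight>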
fetch:doc
Signature | fetch:doc( $href as xs:string, $options as map(*)? := map { } ) as document-node() |
Summary | Fetches the resource referred to by the given href string and returns it as a document node. The $options argument can be used to change the parsing behavior. Allowed options are all parsing and XML parsing options in lower case. The function differs from fn:doc in various aspects.
|
Errors | open : the URI could not be resolved, or the resource could not be retrieved.
|
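Examples |
- Fetch a web page as XML with the stripws parsing option enabled:
<syntaxhighlight lang="xquery">
fetch:doc("http://en.wikipedia.org", map { 'stripws': true() })
</syntaxhighlight>
- Return a web page as XML, preserve namespaces:
<syntaxhighlight lang="xquery">
fetch:doc(
  'http://basex.org/',
  map {
    'parser': 'html',
    'htmlparser': map { 'nons': false() }
  }
)
</syntaxhighlight>
|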
fetch:binary-doc
Signature | fetch:binary-doc( $input as xs:anyAtomicType, $options as map(*)? := map { } ) as document-node() |
Summary | Converts the specified $input (xs:base64Binary, xs:hexBinary) to XML and returns it as a document node. In contrast to fn:parse-xml, which expects a string, the input can be arbitrarily encoded. The encoding will be derived from the XML declaration or (in case of UTF-16 or UTF-32) from the first bytes of the input. The $options argument can be used to change the parsing behavior. Allowed options are all parsing and XML parsing options in lower case.
|
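Examples |
- Reads a local file as binary data and parses it as XML:
<syntaxhighlight lang="xquery">
fetch:binary-doc(file:read-binary('doc.xml'))
</syntaxhighlight>
- Encodes a string as CP1252 and parses it as XML. The input and the string touché will be correctly decoded because of the XML declaration:
<syntaxhighlight lang="xquery">
fetch:binary-doc(convert:string-to-base64(
  "<?xml version='1.0' encoding='CP1252'?><xml>touché</xml>",
  "CP1252"
))
</syntaxhighlight>
- Encodes a string as UTF-16 and parses it as XML. The document will be correctly decoded, as the first bytes of the data indicate that the input must be UTF-16:
<syntaxhighlight lang="xquery">
fetch:binary-doc(convert:string-to-base64("<xml/>", "UTF16"))
</syntaxhighlight>
|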
Errors | open : the input could not be parsed.
|
fetch:content-type
Signature | fetch:content-type( $href as xs:string ) as xs:string |
Summary | Returns the content-type (also called mime-type) of the resource specified by the given href string.
|
Errors | open : the URI could not be resolved, or the resource could not be retrieved.
|
Examples |
|
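One possible use of fetch:content-type, sketched under the assumption that the returned string is a plain media type such as text/plain or application/xml: the content type decides whether the resource is fetched as text or as binary data. The URL is a placeholder.

<syntaxhighlight lang="xquery">
let $href := 'http://example.org/feed'
let $type := fetch:content-type($href)
return if (starts-with($type, 'text/') or contains($type, 'xml')) then (
  (: textual content: return it as a string :)
  fetch:text($href)
) else (
  (: anything else: return the raw bytes :)
  fetch:binary($href)
)
</syntaxhighlight>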
Errors
Code | Description |
---|---|
encoding | The specified encoding is not supported, or unknown. |
open | The URI could not be resolved, or the resource could not be retrieved. |
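Because the errors are assigned to the module namespace, they can be caught by name with the fetch prefix. The following is a minimal sketch; the file name is hypothetical, and the messages are only illustrative.

<syntaxhighlight lang="xquery">
try {
  fetch:text('missing.txt', 'UTF-8')
} catch fetch:open {
  (: the URI could not be resolved, or the resource could not be retrieved :)
  'Resource not available: ' || $err:description
} catch fetch:encoding {
  (: the specified encoding is not supported, or unknown :)
  'Unsupported encoding: ' || $err:description
}
</syntaxhighlight>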
Changelog
- Version 10.0
  - Updated: fetch:doc renamed (before: fetch:xml).
  - Updated: fetch:binary-doc renamed (before: fetch:xml-binary).
- Version 9.0
  - Added: fetch:xml-binary
  - Updated: error codes updated; errors now use the module namespace
- Version 8.5
  - Updated: fetch:text: $fallback argument added.
- Version 8.0
  - Added: fetch:xml
The module was introduced with Version 7.6.