Difference between revisions of "HTTP Client Module"

From BaseX Documentation

Revision as of 15:57, 4 August 2022

This XQuery Module contains a single function to send HTTP requests and handle HTTP responses. The function send-request is based on the EXPath HTTP Client Module. It gives full control over the available request and response parameters. For simple GET requests, the Fetch Module may be sufficient.

If <http:header name="Accept-Encoding" value="gzip"/> is specified and if the addressed web server provides support for the gzip compression algorithm, the response will automatically be decompressed.
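As a minimal sketch of the above, the following query asks for a compressed response by adding the header to the request. The target URL is chosen for illustration only, and whether the response is actually compressed depends on the server:

<syntaxhighlight lang="xquery">
(: request gzip compression; the response is decompressed automatically :)
http:send-request(
  <http:request method='get'>
    <http:header name='Accept-Encoding' value='gzip'/>
  </http:request>,
  'http://basex.org'
)
</syntaxhighlight>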

Updated with Version 10: The implementation of the module is now based on the new Java HTTP Client, which provides better overall performance, uses internal connection pools and follows redirects across different protocols (http, https).

Conventions

All functions in this module are assigned to the http://expath.org/ns/http-client namespace, which is statically bound to the http prefix.
All errors are assigned to the http://expath.org/ns/error namespace, which is statically bound to the experr prefix.
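
A sketch of how the error namespace might be used: the query below catches a timeout error via the experr prefix. The URL is hypothetical, and the timeout of 1 second is chosen for illustration only.

<syntaxhighlight lang="xquery">
try {
  (: hypothetical endpoint that responds slowly :)
  http:send-request(
    <http:request method='get' timeout='1'/>,
    'http://localhost:8080/slow'
  )
} catch experr:HC0006 {
  (: HC0006: a timeout occurred waiting for the response :)
  'Request timed out: ' || $err:description
}
</syntaxhighlight>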

Functions

http:send-request

Signatures http:send-request($request as element(http:request)) as item()+
http:send-request($request as element(http:request)?, $href as xs:string?) as item()+
http:send-request($request as element(http:request)?, $href as xs:string?, $bodies as item()*) as item()+
Summary Sends an HTTP request and interprets the corresponding response:
  • $request contains the parameters of the HTTP request such as HTTP method and headers.
  • In addition to this it can also contain the URI to which the request will be sent and the body of the HTTP method.
  • If the URI is not given with the parameter $href, its value in $request is used instead.
  • The request body can also be supplied via the $bodies parameter.
  • Certificate verification can be globally disabled via the IGNORECERT option.

Notes:

  • Both basic and digest authentication are supported.
  • While the contents of the request can be supplied as child of the http:body element, it is faster and safer to pass them on via the third argument.
  • For further information, please check out the EXPath specification.
Errors HC0001: an HTTP error occurred.
HC0002: error parsing the entity content as XML or HTML.
HC0003: with a multipart response, the override-media-type must be either a multipart media type or application/octet-stream.
HC0004: the src attribute on the body element is mutually exclusive with all other attributes (except the media-type).
HC0005: the request element is not valid.
HC0006: a timeout occurred waiting for the response.
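
A sketch of how the result sequence can be processed, assuming the target (chosen for illustration only) is reachable: the first item is the http:response element, and any following items are the response bodies.

<syntaxhighlight lang="xquery">
let $response := http:send-request(
  <http:request method='get'/>,
  'http://basex.org'
)
return (
  (: status code taken from the http:response element :)
  $response[1]/@status/string(),
  (: number of bodies that follow the response element :)
  count($response) - 1
)
</syntaxhighlight>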

Examples

Status Only

Simple GET request. As the attribute status-only is set to true, only the response element is returned.

Query:

<syntaxhighlight lang="xquery">http:send-request(<http:request method='get' status-only='true'/>, 'http://basex.org')</syntaxhighlight>

Result:

<syntaxhighlight lang="xml"><http:response status="200" message="OK">
  <http:header name="Date" value="Mon, 14 Mar 2011 20:55:53 GMT"/>
  <http:header name="Content-Length" value="12671"/>
  <http:header name="Expires" value="Mon, 14 Mar 2011 20:57:23 GMT"/>
  <http:header name="Set-Cookie" value="fe_typo_user=d10c9552f9a784d1a73f8b6ebdf5ce63; path=/"/>
  <http:header name="Connection" value="close"/>
  <http:header name="Content-Type" value="text/html; charset=utf-8"/>
  <http:header name="Server" value="Apache/2.2.16"/>
  <http:header name="X-Powered-By" value="PHP/5.3.5"/>
  <http:header name="Cache-Control" value="max-age=90"/>
  <http:body media-type="text/html; charset=utf-8"/>
</http:response></syntaxhighlight>

Google Homepage

Retrieve the Google search home page with a timeout of 10 seconds. In order to parse HTML, TagSoup must be contained in the class path.

Query:

<syntaxhighlight lang="xquery">http:send-request(<http:request method='get' href='http://www.google.com' timeout='10'/>)</syntaxhighlight>

Result:

<syntaxhighlight lang="xml">
<http:response status="200" message="OK">
  <http:header name="Date" value="Mon, 14 Mar 2011 22:03:25 GMT"/>
  <http:header name="Transfer-Encoding" value="chunked"/>
  <http:header name="Expires" value="-1"/>
  <http:header name="X-XSS-Protection" value="1; mode=block"/>
  <http:header name="Set-Cookie" value="...; expires=Tue, 13-Sep-2011 22:03:25 GMT; path=/; domain=.google.ch; HttpOnly"/>
  <http:header name="Content-Type" value="text/html; charset=ISO-8859-1"/>
  <http:header name="Server" value="gws"/>
  <http:header name="Cache-Control" value="private, max-age=0"/>
  <http:body media-type="text/html; charset=ISO-8859-1"/>
</http:response>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
    <title>Google</title>
    ...
  </body>
</html>
</syntaxhighlight>

The response content type can also be overwritten in order to retrieve HTML pages and other textual data as plain string (using text/plain) or in its binary representation (using application/octet-stream). With the http:header element, a custom user agent can be set. See the following example:

Query:

<syntaxhighlight lang="xquery">
let $binary := http:send-request(
  <http:request method='get'
     override-media-type='application/octet-stream'
     href='http://www.google.com'>
    <http:header name="User-Agent" value="Opera"/>
  </http:request>
)[2]
return try {
  html:parse($binary)
} catch * {
  'Conversion to XML failed: ' || $err:description
}
</syntaxhighlight>

SVG Data

Retrieves a resource whose content type ends with +xml (e.g. image/svg+xml); such content is returned as XML.

Query:

<syntaxhighlight lang="xquery">http:send-request(<http:request method='get'/>, 'http://upload.wikimedia.org/wikipedia/commons/6/6b/Bitmap_VS_SVG.svg')</syntaxhighlight>

Result:

<syntaxhighlight lang="xml"><http:response status="200" message="OK">
  <http:header name="ETag" value="W/&quot;11b6d-4ba15ed4&quot;"/>
  <http:header name="Age" value="9260"/>
  <http:header name="Date" value="Mon, 14 Mar 2011 19:17:10 GMT"/>
  <http:header name="Content-Length" value="72557"/>
  <http:header name="Last-Modified" value="Wed, 17 Mar 2010 22:59:32 GMT"/>
  <http:header name="Content-Type" value="image/svg+xml"/>
  <http:header name="X-Cache-Lookup" value="MISS from knsq22.knams.wikimedia.org:80"/>
  <http:header name="Connection" value="keep-alive"/>
  <http:header name="Server" value="Sun-Java-System-Web-Server/7.0"/>
  <http:header name="X-Cache" value="MISS from knsq22.knams.wikimedia.org"/>
  <http:body media-type="image/svg+xml"/>
</http:response>
<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1" width="1063" height="638">
  <defs>
    <linearGradient id="lg0">
      <stop stop-color="#3333ff" offset="0"/>
      <stop stop-color="#3f3fff" stop-opacity="0" offset="1"/>
    </linearGradient>
    ...
</svg></syntaxhighlight>

POST Request

POST request to the BaseX REST Service, specifying a username and password.

Query:

<syntaxhighlight lang="xquery">
http:send-request(
  <http:request method='post' username='admin' password='admin'>
    <http:body media-type='application/xml'/>
  </http:request>,
  'http://localhost:8080/rest',
  <query>
    <text>
      <html>{
        for $i in 1 to 3
        return <div>Section {$i }</div>
      }</html>
    </text>
  </query>
)
</syntaxhighlight>

Result:

<syntaxhighlight lang="xml">
<http:response xmlns:http="http://expath.org/ns/http-client" status="200" message="OK">
  <http:header name="Content-Length" value="135"/>
  <http:header name="Content-Type" value="application/xml"/>
  <http:header name="Server" value="Jetty(6.1.26)"/>
  <http:body media-type="application/xml"/>
</http:response>
<html>
  <div>Section 1</div>
  <div>Section 2</div>
  <div>Section 3</div>
</html>
</syntaxhighlight>

File Upload

Performs an HTML file upload. In the RESTXQ code, the uploaded file is written to the temporary directory:

Query:

<syntaxhighlight lang="xquery">
let $path := 'file-to-be.uploaded'
return http:send-request(
  <http:request method='POST'>
    <http:multipart media-type='multipart/form-data'>
      <http:header name='content-disposition'
        value='form-data; name="files"; filename="{ file:name($path) }"'/>
      <http:body media-type='application/octet-stream'/>
    </http:multipart>
  </http:request>,
  'http://localhost:8080/write-to-temp',
  file:read-binary($path)
)
</syntaxhighlight>

RESTXQ service:

<syntaxhighlight lang="xquery">
declare
  %rest:POST
  %rest:path('/write-to-temp')
  %rest:form-param('files', '{$files}')
function dba:file-upload(
  $files  as map(xs:string, xs:base64Binary)
) as empty-sequence() {
  (: write each uploaded file to the temporary directory :)
  map:for-each($files, function($file, $content) {
    file:write-binary(file:temp-dir() || $file, $content)
  })
};
</syntaxhighlight>

Errors

Code Description
HC0001 An HTTP error occurred.
HC0002 Error parsing the entity content as XML or HTML.
HC0003 With a multipart response, the override-media-type must be either a multipart media type or application/octet-stream.
HC0004 The src attribute on the body element is mutually exclusive with all other attributes (except the media-type).
HC0005 The request element is not valid.
HC0006 A timeout occurred waiting for the response.

Changelog

Version 10.0
  • Updated: Implementation based on the new Java HTTP Client.
Version 9.0
  • Updated: support for gzipped content encoding
Version 8.0
  • Added: digest authentication
Version 7.6
  • Updated: http:send-request: HC0002 is raised if the input cannot be parsed or converted to the final data type.
  • Updated: errors are using text/plain as media-type.