Difference between revisions of "HTTP Client Module"

From BaseX Documentation
Latest revision as of 17:38, 1 December 2023

This XQuery Module contains a single function to send HTTP requests and handle HTTP responses. The function send-request is based on the EXPath HTTP Client Module. It gives full control over the available request and response parameters. For simple GET requests, the Fetch Module may be sufficient.
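As a rough illustration of the simple case mentioned above, a plain GET request that only needs the response body could be written with the Fetch Module instead. This is a minimal sketch; the URL is reused from the examples on this page, and no headers or status information are returned:

```xquery
(: Returns the response body as a string, without status or headers :)
fetch:text('http://basex.org')
```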

If <http:header name="Accept-Encoding" value="gzip"/> is specified and if the addressed web server provides support for the gzip compression algorithm, the response will automatically be decompressed.
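A request that asks for a compressed response might look as follows. This is only a sketch: the URL is taken from the examples on this page, and whether the server actually compresses its response depends on its configuration:

```xquery
(: The Accept-Encoding header signals that gzip-compressed responses are accepted;
   if the server compresses the response, it is decompressed automatically :)
http:send-request(
  <http:request method='get' href='http://basex.org'>
    <http:header name='Accept-Encoding' value='gzip'/>
  </http:request>
)
```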

Please note that BaseX provides extensions to the specification:

  • Since Version 11, csv, json and html parser options can be supplied to influence the conversion of the response.

Since BaseX 10, the module is based on the Java HTTP Client, which provides better overall performance, uses internal connection pools and follows redirects across different protocols (http, https).

Conventions

All functions in this module are assigned to the http://expath.org/ns/http-client namespace, which is statically bound to the http prefix.
All errors are assigned to the http://expath.org/ns/error namespace, which is statically bound to the experr prefix.

Functions

http:send-request

Updated with Version 11: csv, json, html and text attributes added.

Signature
http:send-request(
  $request  as element(http:request)?,
  $href     as xs:string?              := (),
  $bodies   as item()*                 := ()
) as item()+
Summary
Sends an HTTP request and interprets the corresponding response:
  • $request contains an <http:request/> element with a method attribute, an href attribute with the target URI, and optional header and body elements.
  • The request is either sent to the URI of the $href argument or (if empty) to the URI supplied via the href attribute.
  • In addition to the attributes of the official specification, csv, json, html and text attributes can be supplied to define how to convert the response body (see Response Conversion for an example).

Notes:

  • Both basic and digest authentication are supported.
  • While the contents of the request can be supplied as a child of the http:body element, it is faster and safer to pass them via the third argument.
  • Certificate verification can be globally disabled via the IGNORECERT option.
  • For further information, please check out the EXPath specification.
Errors
HC0001: an HTTP error occurred.
HC0002: error parsing the entity content as XML or HTML.
HC0003: with a multipart response, the override-media-type must be either a multipart media type or application/octet-stream.
HC0004: the src attribute on the body element is mutually exclusive with all other attributes (except the media-type).
HC0005: the request element is not valid.
HC0006: a timeout occurred waiting for the response.
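The errors listed above can be caught with a try/catch expression via the experr prefix described under Conventions. The following is a minimal sketch; the endpoint and the timeout value are hypothetical and only serve as an illustration:

```xquery
(: Hypothetical endpoint, chosen only for illustration :)
let $url := 'http://localhost:8080/slow-service'
return try {
  http:send-request(
    <http:request method='get' timeout='5'/>,
    $url
  )
} catch experr:HC0006 {
  (: raised if no response arrives within the timeout :)
  'Timeout: ' || $err:description
} catch experr:HC0001 {
  (: raised for other HTTP errors :)
  'HTTP error: ' || $err:description
}
```

Catching the specific codes keeps other errors, such as an invalid request element (HC0005), visible to the caller.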

Examples

Status Only

Simple GET request. As the attribute status-only is set to true, only the response element is returned.

Query:

http:send-request(<http:request method='get' status-only='true'/>, 'http://basex.org')

Result:

<http:response status="200" message="OK">
  <http:header name="Date" value="Mon, 14 Mar 2011 20:55:53 GMT"/>
  <http:header name="Content-Length" value="12671"/>
  <http:header name="Expires" value="Mon, 14 Mar 2011 20:57:23 GMT"/>
  <http:header name="Set-Cookie" value="fe_typo_user=d10c9552f9a784d1a73f8b6ebdf5ce63; path=/"/>
  <http:header name="Connection" value="close"/>
  <http:header name="Content-Type" value="text/html; charset=utf-8"/>
  <http:header name="Server" value="Apache/2.2.16"/>
  <http:header name="X-Powered-By" value="PHP/5.3.5"/>
  <http:header name="Cache-Control" value="max-age=90"/>
  <http:body media-type="text/html; charset=utf-8"/>
</http:response>
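Since the result is an ordinary element, its status and headers can be addressed with path expressions. The following sketch assumes the same request as above; the header lookup is written case-insensitively because servers may spell header names differently:

```xquery
let $response := http:send-request(
  <http:request method='get' status-only='true'/>,
  'http://basex.org'
)
return (
  (: numeric status code of the response :)
  xs:integer($response/@status),
  (: value of the Content-Type header, if present :)
  $response/http:header[lower-case(@name) = 'content-type']/@value/string()
)
```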

Google Homepage

Retrieve the Google search home page with a timeout of 10 seconds. In order to parse HTML, TagSoup must be contained in the class path.

Query:

http:send-request(<http:request method='get' href='http://www.google.com' timeout='10'/>)

Result:

<http:response status="200" message="OK">
  <http:header name="Date" value="Mon, 14 Mar 2011 22:03:25 GMT"/>
  <http:header name="Transfer-Encoding" value="chunked"/>
  <http:header name="Expires" value="-1"/>
  <http:header name="X-XSS-Protection" value="1; mode=block"/>
  <http:header name="Set-Cookie" value="...; expires=Tue, 13-Sep-2011 22:03:25 GMT; path=/; domain=.google.ch; HttpOnly"/>
  <http:header name="Content-Type" value="text/html; charset=ISO-8859-1"/>
  <http:header name="Server" value="gws"/>
  <http:header name="Cache-Control" value="private, max-age=0"/>
  <http:body media-type="text/html; charset=ISO-8859-1"/>
</http:response>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
    <title>Google</title>
    ...
  </body>
</html>

The response content type can also be overridden in order to retrieve HTML pages and other textual data as a plain string (using text/plain) or in their binary representation (using application/octet-stream). With the http:header element, a custom user agent can be set. See the following example:

Query:

let $binary := http:send-request(
  <http:request method='get'
     override-media-type='application/octet-stream'
     href='http://www.google.com'>
    <http:header name="User-Agent" value="Opera"/>
  </http:request>
)[2]
return try {
  html:parse($binary)
} catch * {
  'Conversion to XML failed: ' || $err:description
}
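The text/plain variant mentioned above can be sketched in the same way. The second item of the result is then expected to be a string rather than a parsed document:

```xquery
let $text := http:send-request(
  <http:request method='get'
     override-media-type='text/plain'
     href='http://www.google.com'/>
)[2]
(: the body is handled as a plain string, so string functions can be applied :)
return string-length($text)
```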

SVG Data

If the content type of a response ends with +xml, e.g. image/svg+xml, the body is parsed as XML.

Query:

http:send-request(<http:request method='get'/>, 'http://upload.wikimedia.org/wikipedia/commons/6/6b/Bitmap_VS_SVG.svg')

Result:

<http:response status="200" message="OK">
  <http:header name="ETag" value="W/&quot;11b6d-4ba15ed4&quot;"/>
  <http:header name="Age" value="9260"/>
  <http:header name="Date" value="Mon, 14 Mar 2011 19:17:10 GMT"/>
  <http:header name="Content-Length" value="72557"/>
  <http:header name="Last-Modified" value="Wed, 17 Mar 2010 22:59:32 GMT"/>
  <http:header name="Content-Type" value="image/svg+xml"/>
  <http:header name="X-Cache-Lookup" value="MISS from knsq22.knams.wikimedia.org:80"/>
  <http:header name="Connection" value="keep-alive"/>
  <http:header name="Server" value="Sun-Java-System-Web-Server/7.0"/>
  <http:header name="X-Cache" value="MISS from knsq22.knams.wikimedia.org"/>
  <http:body media-type="image/svg+xml"/>
</http:response>
<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1" width="1063" height="638">
  <defs>
    <linearGradient id="lg0">
      <stop stop-color="#3333ff" offset="0"/>
      <stop stop-color="#3f3fff" stop-opacity="0" offset="1"/>
    </linearGradient>
    ...
</svg>
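Because the body is returned as XML, it can be queried directly. The following sketch reads the width attribute of the root element; the svg prefix is declared here for illustration, and the descendant-or-self step is used so that the query works whether the body is returned as a document node or as an element:

```xquery
declare namespace svg = 'http://www.w3.org/2000/svg';

let $body := http:send-request(
  <http:request method='get'/>,
  'http://upload.wikimedia.org/wikipedia/commons/6/6b/Bitmap_VS_SVG.svg'
)[2]
return $body/descendant-or-self::svg:svg/@width/string()
```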

POST Request

POST request to the BaseX REST Service, specifying a username and password.

Query:

http:send-request(
  <http:request method='post' username='admin' password='admin'>
    <http:body media-type='application/xml'/>
  </http:request>,
  'http://localhost:8080/rest',
  <query>
    <text>
      <html>{
        for $i in 1 to 3
        return <div>Section { $i }</div>
      }</html>
    </text>
  </query>
)

Result:

<http:response xmlns:http="http://expath.org/ns/http-client" status="200" message="OK">
  <http:header name="Content-Length" value="135"/>
  <http:header name="Content-Type" value="application/xml"/>
  <http:header name="Server" value="Jetty(6.1.26)"/>
  <http:body media-type="application/xml"/>
</http:response>
<html>
  <div>Section 1</div>
  <div>Section 2</div>
  <div>Section 3</div>
</html>
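As mentioned in the notes on http:send-request, the body can alternatively be supplied as a child of the http:body element, although passing it as the third argument is the recommended approach. A sketch of the inline variant, assuming the same local REST service and credentials as above and a simpler query string:

```xquery
http:send-request(
  <http:request method='post' username='admin' password='admin'>
    <http:body media-type='application/xml'>
      <!-- the request body is embedded directly in the request element -->
      <query>
        <text>1 + 2</text>
      </query>
    </http:body>
  </http:request>,
  'http://localhost:8080/rest'
)
```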

File Upload

Performs an HTML file upload. In the RESTXQ code, the uploaded file is written to the temporary directory:

Query:

let $path := 'file-to-be.uploaded'
return http:send-request(
  <http:request method='POST'>
    <http:multipart media-type='multipart/form-data'>
      <http:header name='content-disposition'
        value='form-data; name="files"; filename="{ file:name($path) }"'/>
      <http:body media-type='application/octet-stream'/>
    </http:multipart>
  </http:request>,
  'http://localhost:8080/write-to-temp',
  file:read-binary($path)
)

RESTXQ service:

declare
  %rest:POST
  %rest:path('/write-to-temp')
  %rest:form-param('files', '{$files}')
function dba:file-upload(
  $files  as map(xs:string, xs:base64Binary)
) as empty-sequence() {
  map:for-each($files, function($file, $content) {
    file:write-binary(file:temp-dir() || $file, $content)
  })
};

Response Conversion

CSV, JSON and HTML responses are automatically converted to an XML representation. The target format can be influenced by supplying csv, json and html attributes:

Query:

http:send-request(<http:request method='GET' href='http://localhost:8080/json' json='format=xquery,lax=true'/>)

Result:

map { "abcde": 12345 }

Without the json attribute, the response body is converted to the default XML representation:

<json type="object">
  <abcde>12345</abcde>
</json>

RESTXQ service:

declare
  %rest:path('json')
  %output:method('json')
function local:json() {
  map { 'abcde': 12345 }
};

See the CSV Module, JSON Module and HTML Module for a list of the available options.
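With format=xquery, the converted body is a map and its entries can be read with the lookup operator. A minimal sketch, assuming the RESTXQ service shown above is running at the same address:

```xquery
let $result := http:send-request(
  <http:request method='GET'
    href='http://localhost:8080/json'
    json='format=xquery,lax=true'/>
)
(: the first item is the http:response element, the second the converted body :)
return $result[2]?abcde
```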

Errors

Code Description
HC0001 An HTTP error occurred.
HC0002 Error parsing the entity content as XML or HTML.
HC0003 With a multipart response, the override-media-type must be either a multipart media type or application/octet-stream.
HC0004 The src attribute on the body element is mutually exclusive with all other attributes (except the media-type).
HC0005 The request element is not valid.
HC0006 A timeout occurred waiting for the response.

Changelog

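Version 11.0
  • Updated: http:send-request: csv, json, html and text attributes added.
Version 10.0
  • Updated: Implementation based on the new Java HTTP Client.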
Version 9.0
  • Updated: support for gzipped content encoding
Version 8.0
  • Added: digest authentication
Version 7.6
  • Updated: http:send-request: HC0002 is raised if the input cannot be parsed or converted to the final data type.
  • Updated: errors are using text/plain as media-type.