Changes

315 bytes removed ,  13:31, 28 June 2019
m
Made link to Wikipedia HTTPS for binary example - as HTTP returns nothing
This [[Module Library|XQuery Module]] provides functions for converting HTML to XML. Conversion will only take place if [http://home.ccil.org/~cowan/XML/tagsoup/ TagSoup] is included in the classpath (see [[Parsers#HTML Parser|HTML Parsing]] for more details).
=Conventions=
All functions and errors in this module are assigned to the <code><nowiki>http://basex.org/modules/html</nowiki></code> namespace, which is statically bound to the {{Code|html}} prefix.
=Functions=
{| width='100%'
|-
| width='120' | '''Signatures'''
|{{Code|'''html:parser'''() as xs:string}}
|}
{| width='100%'
|-
| width='120' | '''Signatures'''
|{{Func|html:parse|$input as xs:anyAtomicType|document-node()}}<br />{{Func|html:parse|$input as xs:anyAtomicType, $options as map(*)?|document-node()}}
|-
| '''Summary'''
|Converts the HTML input specified by {{Code|$input}} to XML and returns a document node.
* If the input is passed on in its binary representation, the HTML parser will try to automatically choose the correct encoding.
The {{Code|$options}} argument can be used to set [[Parsers#TagSoup Options|TagSoup Options]], which are specified as a map containing all key/value pairs:<pre class="brush:xml">map { "key1": "value1", ... }</pre>
|-
| '''Errors'''
|{{Error|parse|#Errors}} the input cannot be converted to XML.
|}
==Examples==
===Basic Example===
The following query converts the specified string to an XML document node.
;Query:
<pre class="brush:xquery">
html:parse("<html></html>")
</pre>
===Specifying Options===
The next query creates an XML document with namespaces:
;Query:
<pre class="brush:xquery">
html:parse("<a href='ok.html'/>", map { 'nons': false() })
</pre>
;Result:
<pre class="brush:xml">
<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <a shape="rect" href="ok.html"/>
  </body>
</html>
</pre>
===Parsing Binary Input===
If the input encoding is unknown, the data to be processed can be passed on in its binary representation. The HTML parser will automatically try to detect the correct encoding:
;Query:
<pre class="brush:xquery">
html:parse(fetch:content-binary("https://en.wikipedia.org"))
</pre>
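===Handling Conversion Errors===
The following query is a minimal sketch of how a failed conversion might be caught. It assumes that the error is raised as {{Code|html:parse}} in the module namespace, as the Version 9.0 entry in the Changelog indicates, and that {{Code|html:parser()}} returns the name of the active parser, which its signature suggests. The input string and the {{Code|<error/>}} wrapper element are illustrative only.
;Query:
<pre class="brush:xquery">
(: illustrative input; any string or binary item can be supplied :)
let $input := "<p>Hello<br>World"
return try {
  html:parse($input)
} catch html:parse {
  (: $err:description is implicitly bound inside the catch clause :)
  <error parser="{ html:parser() }">{ $err:description }</error>
}
</pre>
If the conversion succeeds, the query returns the parsed document node. Otherwise, it returns an {{Code|<error/>}} element with the error description.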
=Errors=
{| class="wikitable" width="100%"
! width="110"|Code
! Description
|-
|{{Code|parse}}
|The input cannot be converted to XML.
|}
=Changelog=
;Version 9.0
* Updated: error codes updated; errors now use the module namespace

The module was introduced with Version 7.6.

[[Category:XQuery]]