Changes

2,925 bytes added, 02:04, 29 January 2020
URI rewrites from within XSLT transformations
This article is part of the [[Advanced User's Guide]]. It explains how to map external system IDs (DTD declaration locations) and URIs to local resources when parsing and transforming XML data.
==Introduction==
</pre>
Fetching <code>xhtml1-strict.dtd</code> from the W3C’s server involves network traffic. For single files this may be tolerable, but importing large collections benefits from caching these resources. Depending on the remote server, caching DTDs locally can yield significant speed improvements.
To address these issues, the [https://www.oasis-open.org/committees/download.php/14809/xml-catalogs.html XML Catalogs Standard] defines an entity catalog that maps both external identifiers and arbitrary URI references to URI references.
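
As a minimal sketch of such a catalog, the following file maps the XHTML 1.0 Strict DTD to a local copy, once by its system identifier and once by its public identifier. The local path <code>dtd/xhtml1-strict.dtd</code> is an assumption chosen for illustration; relative URIs in a catalog are resolved against the location of the catalog file itself.

<pre class="brush:xml">
<?xml version="1.0"?>
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <!-- Map the remote system ID to a local file (path is hypothetical) -->
  <system systemId="http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"
          uri="dtd/xhtml1-strict.dtd"/>
  <!-- Map the public ID to the same local file -->
  <public publicId="-//W3C//DTD XHTML 1.0 Strict//EN"
          uri="dtd/xhtml1-strict.dtd"/>
</catalog>
</pre>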
 
Another application for XML catalogs is to provide local resources for reusable XSLT stylesheet libraries that are imported from a canonical location. This is described in greater detail in the following section.
==Usage==
 
===System ID (DTD Location) Rewrites===
BaseX relies on the Apache-maintained [http://xml.apache.org/commons XML Commons Resolver]. The ''xml-resolver-1.2.jar'' library is included in the full distributions of BaseX. If the resolver is not found in the classpath, and if Java 8 is used, Java’s built-in resolver will be applied (via <code>com.sun.org.apache.xml.internal.resolver.*</code>).
The catalog file ''etc/w3-catalog.xml'' in the full distributions can be used out of the box. It defines rewrite rules for some common W3C DTD files.
 
===URI Rewrites===
 
Consider a library of reusable XSLT stylesheets. For performance reasons, this library will be cached locally. However, the import URI for a given stylesheet should always be the same, regardless of the relative or absolute path at which the local copy happens to be stored. Example:
 
<pre class="brush:xml">
<xsl:import href="http://acme.com/xsltlib/acme2html/1.0/acme2html.xsl"/>
</pre>
 
The XSLT stylesheet might not even be available from this location. The URI serves as a canonical location identifier for this XSLT stylesheet. A local copy of the <code>acme2html/1.0/</code> directory is expected to reside somewhere, and the location of this directory relative to the local XML catalog file is specified in an entry in this catalog, like this:
 
<pre class="brush:xml">
<rewriteURI uriStartString="http://acme.com/xsltlib/acme2html/1.0/" rewritePrefix="../acmehtml10/"/>
</pre>
 
This way, XSLT import URIs don’t have to be adjusted for the relative or absolute locations of the XSLT library’s local copy.
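
For context, a complete catalog file containing such an entry might look like the following sketch. The directory layout is an assumption: the catalog is taken to live in a directory next to <code>acmehtml10/</code>. With this entry, the resolver replaces the matching prefix of the import URI, so <code>http://acme.com/xsltlib/acme2html/1.0/acme2html.xsl</code> is looked up as <code>../acmehtml10/acme2html.xsl</code> relative to the catalog file.

<pre class="brush:xml">
<?xml version="1.0"?>
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <!-- Every URI starting with the canonical prefix is redirected
       to the local copy (relative to this catalog file) -->
  <rewriteURI uriStartString="http://acme.com/xsltlib/acme2html/1.0/"
              rewritePrefix="../acmehtml10/"/>
</catalog>
</pre>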
 
The same URI rewriting works for resources retrieved by the <code>doc()</code> function from within an XSLT stylesheet. See [[XSLT Module]] for details on how to invoke XSLT stylesheets from within BaseX.
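
The following sketch illustrates a <code>doc()</code> call that would be subject to the same rewriting. It assumes an XSLT 2.0 processor (<code>doc()</code> is not available in XSLT 1.0) and a catalog entry that rewrites the <code>http://acme.com/xsltlib/acme2html/1.0/</code> prefix; the file name <code>labels.xml</code> and its <code>label</code> elements are hypothetical.

<pre class="brush:xml">
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical lookup file, addressed by its canonical URI;
       the catalog resolver is expected to map it to a local copy -->
  <xsl:variable name="labels"
      select="doc('http://acme.com/xsltlib/acme2html/1.0/labels.xml')"/>

  <xsl:template match="/">
    <labels count="{count($labels//label)}"/>
  </xsl:template>

</xsl:stylesheet>
</pre>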
 
NOTE: This URI rewriting is currently restricted to XSLT stylesheets. It has not yet been enabled for the XQuery function <code>doc()</code> or for XSD schema locations.
===GUI Mode===
The runtime properties of the catalog resolver can be changed by setting system properties, or adding a ''CatalogManager.properties'' file to the classpath. By default, and if the system property {{Code|xml.catalog.ignoreMissing}} is not assigned, no warnings will be output to standard error if the properties file or resources linked from that file are not found. See [https://xerces.apache.org/xml-commons/components/resolver/resolver-article.html#ctrlresolver Controlling the Catalog Resolver] for more information.
 
When using a catalog within an XQuery Module, the global <code>db:catfile</code> option may not be set in this module. You can set it via pragma instead:
 
<pre class="brush:xquery">
(# db:catfile xmlcatalog/catalog.xml #) {
xslt:transform(db:open('acme_content')[1], '../acmecustom/acmehtml.xsl')
}
</pre>
 
It is assumed that this stylesheet <code>../acmecustom/acmehtml.xsl</code> (location relative to the current XQuery script or module) imports <code>acme2html/1.0/acme2html.xsl</code> by its canonical URI that will be resolved to a local URI by the catalog resolver.
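
A customization stylesheet of this kind could be sketched as follows. Only the import URI is taken from the example above; the overriding template and the <code>title</code> element it matches are hypothetical and merely indicate where local customizations would go.

<pre class="brush:xml">
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Canonical URI; resolved to the local copy via the catalog -->
  <xsl:import href="http://acme.com/xsltlib/acme2html/1.0/acme2html.xsl"/>

  <!-- Hypothetical local override of a template from the library -->
  <xsl:template match="title">
    <h1 class="custom">
      <xsl:apply-templates/>
    </h1>
  </xsl:template>

</xsl:stylesheet>
</pre>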
 
Please note that catalog-based URI rewriting does not yet work for URIs accessed from XQuery. You therefore cannot pass a canonical location that needs to be catalog-resolved as the second argument of <code>xslt:transform()</code>.
 
The catalog location in the pragma can be given relative to the current working directory (the directory that is returned by <code>file:current-dir()</code>) or as an absolute operating system path. The catalog location in the pragma is not an XQuery expression; no concatenation or other operations may occur in the pragma, and the location string must not be surrounded by quotes.
==Links==