Difference between revisions of "Catalog Resolver"
Revision as of 12:54, 24 January 2011
Overview
XML documents often rely on Document Type Definitions (DTDs). When a document is parsed, its elements and entities can in principle be checked for validity against that DTD; BaseX currently uses the DTD only for entity resolution.
XHTML, for example, declares its document type with the following line:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
Fetching xhtml1-strict.dtd involves network traffic. For single files this may be tolerable, but importing large collections can benefit from caching these resources locally.
Depending on your connection, this may speed up imports significantly.
XML Entity and URI Resolvers in BaseX
BaseX comes with a default URI resolver that is usable out of the box.
To enable entity resolving, you have to provide a valid XML Catalog file. A simple working example for XHTML might look like this:
<?xml version="1.0"?>
<catalog prefer="system" xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <rewriteSystem systemIdStartString="http://www.w3.org/TR/xhtml1/DTD/"
                 rewritePrefix="file:///path/to/dtds/" />
</catalog>
This rewrites all system identifiers starting with http://www.w3.org/TR/xhtml1/DTD/ so that they start with file:///path/to/dtds/ instead.
The XHTML DTD xhtml1-strict.dtd and all resources it links to will now be loaded from the specified path.
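The prefix rewriting described above can be illustrated with a minimal Java sketch. It is not how BaseX or any catalog resolver is implemented; the class RewriteSystemSketch and the method rewriteSystem are hypothetical names chosen for this example, and only a single rewrite rule is handled (a real catalog may hold several rules and applies the longest matching prefix).

```java
public class RewriteSystemSketch {

    /**
     * Illustrative only: applies a single rewriteSystem rule.
     * Returns the rewritten system identifier, or the original one
     * if it does not start with the given prefix.
     */
    public static String rewriteSystem(String systemId, String startString, String rewritePrefix) {
        if (systemId.startsWith(startString)) {
            // Replace the matching prefix and keep the rest of the identifier.
            return rewritePrefix + systemId.substring(startString.length());
        }
        return systemId;
    }

    public static void main(String[] args) {
        // Values taken from the catalog example above.
        String start = "http://www.w3.org/TR/xhtml1/DTD/";
        String prefix = "file:///path/to/dtds/";

        // The DTD itself and an entity file it references are both redirected.
        System.out.println(rewriteSystem(start + "xhtml1-strict.dtd", start, prefix));
        System.out.println(rewriteSystem(start + "xhtml-lat1.ent", start, prefix));

        // An identifier outside the prefix is left untouched (example.org is a placeholder).
        System.out.println(rewriteSystem("http://example.org/other.dtd", start, prefix));
    }
}
```

Because the rule matches on the prefix only, the entity files referenced by the DTD are redirected as well, as long as they are stored under the same local directory.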
GUI Mode
When running BaseX in GUI mode, simply provide the path to your XML Catalog file in the Parsing tab of the Database Creation dialog.
Console & Server Mode
To enable entity resolving in console mode, set the following option:
SET CATFILE [path]
Now entity resolving is active for the current session. All subsequent ADD
commands will use the catalog file to resolve entities.
The paths to your catalog file and to the actual entities may be either absolute or relative to the current working directory.
Please note that entity resolving only works with the option SET INTPARSE false. INTPARSE is set to false by default.
Using the internal parser lets you specify manually whether you want to parse DTDs and entities or not.
Using other Resolvers
In some cases you may not want to use the built-in resolver that Java provides by default (via com.sun.org.apache.xml.internal.resolver.*).
BaseX supports the Apache-maintained XML Commons Resolver (http://xml.apache.org/commons), which is available for download at http://xerces.apache.org/mirrors.cgi.
To use it, add resolver.jar to the classpath when starting BaseX:
java -cp basex.jar:resolver.jar org.basex.BaseXServer