Difference between revisions of "Java Bindings"

From BaseX Documentation
 
(75 intermediate revisions by the same user not shown)
Line 3: Line 3:
 
The Java Binding feature is an extensibility mechanism which enables developers
 
The Java Binding feature is an extensibility mechanism which enables developers
 
to directly access Java variables and execute code from XQuery. Addressed Java code must either be contained in the Java classpath, or it must be located in the [[Repository]].
 
to directly access Java variables and execute code from XQuery. Addressed Java code must either be contained in the Java classpath, or it must be located in the [[Repository]].
 +
 +
Please bear in mind that the execution of Java code may cause side effects that conflict with the functional nature of XQuery, or may introduce new security risks to your project.
 +
 +
{{Mark|Updated with Version 9.6:}}
 +
* With the middle dot notation, three adjacent dots can be used to specify array types.
 +
* The path to the standard package {{Code|java.lang.}} can now be omitted.
 +
* Java objects are now wrapped into function items.
 +
* Results of constructor calls are always returned as function items.
 +
* A new option {{Option|WRAPJAVA}} was added to control how Java values are converted to XQuery.
 +
* The Mapping rules were refined and unified. The most important changes:
 +
** {{Code|array(*)}} type added.
 +
** {{Code|xs:integer}} values are converted to {{Code|long}} values.
 +
** {{Code|xs:unsignedShort}} values are converted to {{Code|char}} values.
 +
* All error messages were revised and improved.
  
 
=Identification=
 
=Identification=
 
{{Mark|Updated with Version 8.4}}: address Java functions with specific types
 
  
 
==Classes==
 
==Classes==
Line 12: Line 24:
 
A Java class is identified by a namespace URI. The original URI is rewritten as follows:
 
A Java class is identified by a namespace URI. The original URI is rewritten as follows:
  
# The [[Repository#URI_Rewriting|URI Rewriting]] steps are applied to the URI.
+
# The [[#URI Rewriting|URI Rewriting]] steps are applied to the URI.
 
# Slashes in the resulting URI are replaced with dots.
 
# Slashes in the resulting URI are replaced with dots.
# The last path segment of the URI is capitalized and rewritten to [https://en.wikipedia.org/wiki/CamelCase camel case].
+
# The last path segment of the URI is capitalized and rewritten to [https://en.wikipedia.org/wiki/CamelCase CamelCase].
  
The normalization steps are skipped if the URI is prefixed with {{Code|java:}}. See the following examples:
+
The normalization steps are skipped if the URI is prefixed with {{Code|java:}}. The path to the standard package {{Code|java.lang.}} can be omitted:
  
 
* <code><nowiki>http://basex.org/modules/meta-data</nowiki></code> → <code>org.basex.modules.MetaData</code>
 
* <code><nowiki>http://basex.org/modules/meta-data</nowiki></code> → <code>org.basex.modules.MetaData</code>
 
* <code>java:java.lang.String</code> → <code>java.lang.String</code>
 
* <code>java:java.lang.String</code> → <code>java.lang.String</code>
 +
* <code>StringBuilder</code> → <code>java.lang.StringBuilder</code>
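The last two normalization steps can be sketched in Java. This is only an illustration of the rules listed above, not the BaseX implementation: the names <code>ClassNameSketch</code> and <code>toClassName</code> are hypothetical, the input is assumed to be a path that has already passed the URI Rewriting step, and splitting the last segment at dashes is an assumption derived from the <code>meta-data</code> example.

```java
class ClassNameSketch {
  // Hypothetical helper: converts an already rewritten URI path to a class name.
  static String toClassName(String path) {
    // Slashes in the path are replaced with dots
    String dotted = path.replace('/', '.');
    int last = dotted.lastIndexOf('.');
    String pkg = dotted.substring(0, last + 1);
    String name = dotted.substring(last + 1);
    // The last segment is capitalized and rewritten to CamelCase
    // (assumption: dashes separate the words of the segment)
    StringBuilder sb = new StringBuilder(pkg);
    for (String part : name.split("-")) {
      if (part.isEmpty()) continue;
      sb.append(Character.toUpperCase(part.charAt(0))).append(part.substring(1));
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    System.out.println(toClassName("org/basex/modules/meta-data"));
  }
}
```

With the path from the first example, the sketch yields <code>org.basex.modules.MetaData</code>.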
  
 
==Functions and Variables==
 
==Functions and Variables==
  
Java functions and variables can be referenced and evaluated by the existing XQuery function syntax:
+
Java constructors, functions and variables can be referenced and evaluated by the existing XQuery function syntax:
  
 
* The namespace of the function name identifies the Java class.
 
* The namespace of the function name identifies the Java class.
 
* The local part of the name, which is rewritten to camel case, identifies a variable or function of that class.
 
* The local part of the name, which is rewritten to camel case, identifies a variable or function of that class.
* The middle dot character (<code>[http://www.fileformat.info/info/unicode/char/b7/index.htm ·/&amp;#xB7;]</code>) is a valid character in XQuery names, but not in Java. It can be used to append exact Java parameter types to the function name. Class types must be referenced by their full path.
+
* The middle dot character <code>[https://www.fileformat.info/info/unicode/char/b7/index.htm ·]</code> (<code>&amp;#xB7;</code>, a valid character in XQuery names, but not in Java) can be used to append exact Java parameter types to the function name. Class types must be referenced by their full path. Three adjacent dots can be used to address an array argument.
  
 
{| class="wikitable"
 
{| class="wikitable"
 
|- valign="top"
 
|- valign="top"
! Type
+
! Addressed code
 
! XQuery
 
! XQuery
 
! Java
 
! Java
 
|- valign="top"
 
|- valign="top"
 
| Variable
 
| Variable
| <code>Q{java.lang.Integer}MIN_VALUE</code>
+
| <code>Q{Integer}MIN_VALUE()</code>
| <code>[https://docs.oracle.com/javase/8/docs/api/java/lang/Integer.html#MAX_VALUE Integer.MIN_VALUE]</code>
+
| <code>Integer.MIN_VALUE</code>
 
|- valign="top"
 
|- valign="top"
 
| Function
 
| Function
| <code>Q{java.lang.Object}hash-code()</code>
+
| <code>Q{Object}hash-code($object)</code>
| <code>[https://docs.oracle.com/javase/8/docs/api/java/lang/Object.html#hashCode() object.hashCode()]</code>
+
| <code>object.hashCode()</code>
 
|- valign="top"
 
|- valign="top"
| Function with types
+
| Function with argument
| <code>Q{java.lang.String}split·java.lang.String·int(';', 3)</code>
+
| <code>Q{String}split·String·int($string, ';', xs:int(3))</code>
| <code>[https://docs.oracle.com/javase/8/docs/api/java/lang/String.html#split-java.lang.String-int- string.split(";", 3)]</code>
+
| <code>string.split(";", 3)</code>
 +
|- valign="top"
 +
| Constructor with array argument
 +
| <code>Q{String}new·byte...(xs:hexBinary('414243'))</code>
 +
| <code>new String(new byte[] { 0x41, 0x42, 0x43 })</code>
 
|}
 
|}
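For comparison, the Java side of some table rows can be run on its own. The following sketch uses only standard Java classes; the class and method names are hypothetical, and the charset is named explicitly, whereas <code>new String(byte[])</code> uses the platform default. Note that <code>xs:hexBinary('414243')</code> denotes the bytes 0x41, 0x42 and 0x43 (decimal 65, 66, 67).

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class AddressedCodeSketch {
  // Constructor with array argument: the bytes 0x41, 0x42, 0x43 spell "ABC"
  static String fromHex() {
    byte[] bytes = { 0x41, 0x42, 0x43 };
    return new String(bytes, StandardCharsets.US_ASCII);
  }

  // Function with arguments: a limit of 3 yields at most three parts,
  // and the last part keeps the remaining input
  static String[] splitLimited() {
    return "a;b;c;d".split(";", 3);
  }

  public static void main(String[] args) {
    // Variable: static field of java.lang.Integer
    System.out.println(Integer.MIN_VALUE);
    System.out.println(fromHex());
    System.out.println(Arrays.toString(splitLimited()));
  }
}
```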
  
As XQuery and Java have different type systems, XQuery arguments are converted to equivalent Java values, and the result of a Java function is converted back to an XQuery value (see [[#Data Types|Data Types]]).
+
As XQuery and Java have different type systems, XQuery arguments must be converted to equivalent Java values, and the result of a Java function is converted back to an XQuery value (see [[#Data Types|Data Types]]).
  
If a Java function is not found, XQuery values may need to be cast to the target type. For example, if a Java function expects a primitive {{Code|int}} value, you will need to convert your XQuery integers to {{Code|xs:int}}.
+
If the Java function you want to address is not detected, you may need to cast your values to the target type. For example, if a Java function expects a primitive {{Code|int}} value, you will need to convert your XQuery integers to {{Code|xs:int}}.
  
 
=Namespace Declarations=
 
=Namespace Declarations=
  
In the following example, Java’s {{Code|Math}} class is referenced. When executed, the query returns the cosine of an angle by calling the static method {{Code|cos()}}, and the value of π by addressing the static variable via {{Code|PI()}}:
+
In the following example, the Java {{Code|Math}} class is referenced. When executed, the query returns the cosine of an angle by calling the static method {{Code|cos()}}, and the value of π by addressing the static variable via {{Code|PI()}}:
  
<pre class="brush:xquery">
+
<syntaxhighlight lang="xquery">
 
declare namespace math = "java:java.lang.Math";
 
declare namespace math = "java:java.lang.Math";
 
math:cos(xs:double(0)), math:PI()
 
math:cos(xs:double(0)), math:PI()
</pre>
+
</syntaxhighlight>
  
With the [[XQuery 3.0#Expanded QNames|Expanded QName]] notation of XQuery 3.0,
+
With the [[XQuery 3.0#Expanded QNames|Expanded QName]] notation of XQuery 3.0, the namespace can directly be embedded in the function call:
the namespace can directly be embedded in the function call:
 
  
<pre class="brush:xquery">
+
<syntaxhighlight lang="xquery">
 
Q{java:java.lang.Math}cos(xs:double(0))
 
Q{java:java.lang.Math}cos(xs:double(0))
</pre>
+
</syntaxhighlight>
  
 
The constructor of a class can be invoked by calling the virtual function {{Code|new()}}. Instance methods can then be called by passing on the resulting Java object as first argument. In the following example, 256 bytes are written to the file {{Code|output.txt}}. First, a new {{Code|FileWriter}} instance is created, and its {{Code|write()}} function is called in the next step:
 
The constructor of a class can be invoked by calling the virtual function {{Code|new()}}. Instance methods can then be called by passing on the resulting Java object as first argument. In the following example, 256 bytes are written to the file {{Code|output.txt}}. First, a new {{Code|FileWriter}} instance is created, and its {{Code|write()}} function is called in the next step:
  
<pre class="brush:xquery">declare namespace fw = "java.io.FileWriter";
+
<syntaxhighlight lang="xquery">
 +
declare namespace fw = 'java:java.io.FileWriter';
 
let $file := fw:new('output.txt')
 
let $file := fw:new('output.txt')
 
return (
 
return (
Line 77: Line 94:
 
   fw:close($file)
 
   fw:close($file)
 
)
 
)
</pre>
+
</syntaxhighlight>
  
If the result of a Java call contains invalid XML characters, it will be rejected. The validity check can be disabled by setting the [[Options#CHECKSTRINGS|CHECKSTRINGS]] option to false. The following query writes a file with a single 00-byte, which will then be successfully read via Java functions:
+
If the result of a Java call contains invalid XML characters, it will be rejected. The validity check can be disabled by setting {{Option|CHECKSTRINGS}} to false. In the example below, a file with a single {{Code|00}} byte is written, and this file will then be accessed via Java functions:
  
<pre class="brush:xquery">
+
<syntaxhighlight lang="xquery">
declare namespace br = 'java.io.BufferedReader';
+
declare namespace br = 'java:java.io.BufferedReader';
declare namespace fr = 'java.io.FileReader';
+
declare namespace fr = 'java:java.io.FileReader';
  
 
declare option db:checkstrings 'false';
 
declare option db:checkstrings 'false';
  
 +
(: write file :)
 
file:write-binary('00.bin', xs:hexBinary('00')),
 
file:write-binary('00.bin', xs:hexBinary('00')),
br:new(fr:new('00.bin')) ! (br:readLine(.), br:close(.))
+
(: read file :)
</pre>
+
let $br := br:new(fr:new('00.bin'))
 +
return (
 +
  br:readLine($br),  
 +
  br:close($br)
 +
)
 +
</syntaxhighlight>
 +
 
 +
The option can also be specified via a pragma:
  
Note that Java code cannot be pre-compiled, and will as such be evaluated slower than optimized XQuery code.
+
<syntaxhighlight lang="xquery">
 +
(# db:checkstrings #) {
 +
  br:new(fr:new('00.bin')) ! (br:readLine(.), br:close(.))
 +
}
 +
</syntaxhighlight>
  
 
=Module Imports=
 
=Module Imports=
  
Java code can also be integrated by ''importing'' classes as modules. A new instance of the addressed class is created, which can then be accessed in the query body.
+
Java classes can also be instantiated by ''importing'' them as modules: a new instance of the addressed class will be constructed, which can then be referenced in the query body.
  
The following, side-effecting example returns the number of distinct values added to a hash set (the boolean values returned by {{Code|set:add()}} will be swallowed):
+
In the (side-effecting) example below, a HashSet instance is created, values are added, and the size of the set is returned. As {{Code|set:add()}} returns boolean values, {{Function|Profiling|prof:void}} is used to swallow the values:
  
<pre class="brush:xquery">
+
<syntaxhighlight lang="xquery">
import module namespace set = "java.util.HashSet";
+
import module namespace set = "java:java.util.HashSet";
 
prof:void(
 
prof:void(
 
   for $s in ("one", "two", "one")
 
   for $s in ("one", "two", "one")
Line 106: Line 135:
 
),
 
),
 
set:size()
 
set:size()
</pre>
+
</syntaxhighlight>
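For readers who want to relate the query to plain Java, the same steps can be written as follows. This is a sketch with hypothetical names (<code>HashSetSketch</code>, <code>distinctCount</code>); it only mirrors what the imported <code>HashSet</code> instance does and says nothing about how BaseX dispatches the calls.

```java
import java.util.HashSet;
import java.util.Set;

class HashSetSketch {
  // Adds all values to a new set and returns the number of distinct values.
  static int distinctCount(String... values) {
    // Corresponds to the instance created by the module import (no-argument constructor)
    Set<String> set = new HashSet<>();
    for (String value : values) {
      // add() returns a boolean, which is ignored here
      set.add(value);
    }
    return set.size();
  }

  public static void main(String[] args) {
    System.out.println(distinctCount("one", "two", "one"));
  }
}
```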
 +
 
 +
The execution of imported classes is more efficient than the execution of instances that have been created via {{Code|new()}}. In turn, no arguments can be supplied in the import statement, and the construction will only be successful if the class can be instantiated without arguments.
 +
 
 +
=Integration=
  
The advantage of this approach is that imported code is executed faster than instances created at runtime via {{Code|new()}}. A drawback is that no arguments can be passed on to the class constructor. As a consequence, the import only works if the class provides a constructor with no arguments.
+
Java classes can be coupled more closely to BaseX. If a class inherits the abstract [https://github.com/BaseXdb/basex/blob/master/basex-core/src/main/java/org/basex/query/QueryModule.java QueryModule] class, the two variables [https://github.com/BaseXdb/basex/blob/master/basex-core/src/main/java/org/basex/query/QueryContext.java queryContext] and [https://github.com/BaseXdb/basex/blob/master/basex-core/src/main/java/org/basex/query/StaticContext.java staticContext] become available, which provide access to the global and static context of a query.
  
=Context-Awareness=
+
The [https://github.com/BaseXdb/basex/blob/master/basex-core/src/main/java/org/basex/query/QueryResource.java QueryResource] interface can be implemented to enforce finalizing operations, such as the closing of opened connections or resources in a module. Its {{Code|close()}} method will be called after the XQuery expression has been fully evaluated.
  
Java classes can be coupled more closely to the BaseX core library.
+
==Annotations==
If a class inherits the abstract [https://github.com/BaseXdb/basex/blob/master/basex-core/src/main/java/org/basex/query/QueryModule.java QueryModule] class, the two variables [https://github.com/BaseXdb/basex/blob/master/basex-core/src/main/java/org/basex/query/QueryContext.java queryContext] and [https://github.com/BaseXdb/basex/blob/master/basex-core/src/main/java/org/basex/query/StaticContext.java staticContext] get available, which provide access to the global and static context of a query. Additionally, the default properties of functions can be changed via annotations:
 
  
* Java functions can only be executed by users with [[User_Management|Admin permissions]]. You may annotate a function with {{Code|@Requires(<Permission>)}} to also make it accessible to users with less privileges.
+
The internal properties of functions can be assigned via annotations:
* Java code is treated as ''non-deterministic'', as its behavior cannot be predicted by the XQuery processor. You may annotate a function as {{Code|@Deterministic}} if you know that it will have no side-effects and will always yield the same result.
+
 
 +
* Java functions can only be executed by users with [[User_Management|Admin permissions]]. You can annotate a function with {{Code|@Requires(<Permission>)}} to also make it accessible to users with fewer privileges.
 +
* Java code is treated as ''non-deterministic'', as its behavior cannot be predicted by the XQuery processor. You may annotate a function as {{Code|@Deterministic}} if you know that it will have no side effects and will always yield the same result.
 
* Java code is treated as ''context-independent''. If a function accesses the query context, it should be annotated as {{Code|@ContextDependent}}
 
* Java code is treated as ''context-independent''. If a function accesses the query context, it should be annotated as {{Code|@ContextDependent}}
 
* Java code is treated as ''focus-independent''. If a function accesses the current context item, position or size, it should be annotated as {{Code|@FocusDependent}}
 
* Java code is treated as ''focus-independent''. If a function accesses the current context item, position or size, it should be annotated as {{Code|@FocusDependent}}
  
The [https://github.com/BaseXdb/basex/blob/master/basex-core/src/main/java/org/basex/query/QueryResource.java QueryResource] interface can be implemented to enforce finalizing operations, such as the closing of opened connections or resources in a module. Its {{Code|close()}} method will be called after a query has been fully evaluated.
+
In the following code, information from the static query context is returned by the first function, and a query exception is raised by the second function:
  
The following XQuery code invokes two Java methods. The first Java function retrieves information from the static query context, and the second one throws a query exception:
+
<syntaxhighlight lang="xquery">
 
 
<pre class="brush:xquery">
 
 
import module namespace context = 'org.basex.examples.query.ContextModule';
 
import module namespace context = 'org.basex.examples.query.ContextModule';
  
Line 130: Line 162:
 
   context:user()
 
   context:user()
 
},
 
},
element to-int {
+
try {
  try { context:to-int('abc') }
+
  element to-int { context:to-int('abc') }
   catch * { 'Error in line', $err:line-number }
+
} catch basex:error {
 +
   element error { $err:description }
 
}
 
}
</pre>
+
</syntaxhighlight>
  
 
The imported Java class is shown below:
 
The imported Java class is shown below:
  
<pre class="brush:java">
+
<syntaxhighlight lang="java">
 
package org.basex.examples.query;
 
package org.basex.examples.query;
  
Line 151: Line 184:
 
public class ContextModule extends QueryModule implements QueryResource {
 
public class ContextModule extends QueryModule implements QueryResource {
 
   /**
 
   /**
   * Returns the name of the logged in user.
+
   * Returns the name of the logged-in user.
   * @return user
+
   * @return user string
 
   */
 
   */
 
   @Requires(Permission.NONE)
 
   @Requires(Permission.NONE)
Line 163: Line 196:
 
   /**
 
   /**
 
   * Converts the specified string to an integer.
 
   * Converts the specified string to an integer.
   * @param value string representation
+
   * @param value string to be converted
   * @return integer
+
   * @return resulting integer
 
   * @throws QueryException query exception
 
   * @throws QueryException query exception
 
   */
 
   */
Line 173: Line 206:
 
       return Integer.parseInt(value);
 
       return Integer.parseInt(value);
 
     } catch(NumberFormatException ex) {
 
     } catch(NumberFormatException ex) {
       throw new QueryException(ex.getMessage());
+
       throw new QueryException("Integer conversion failed: " + value);
 
     }
 
     }
 
   }
 
   }
Line 179: Line 212:
 
   @Override
 
   @Override
 
   public void close() {
 
   public void close() {
     // see description above
+
     // defined in QueryResource interface, will be called after query evaluation
 
   }
 
   }
 
}
 
}
</pre>
+
</syntaxhighlight>
  
 
The result will look as follows:
 
The result will look as follows:
  
<pre class="brush:xml">
+
<syntaxhighlight lang="xml">
 
<user>admin</user>
 
<user>admin</user>
<to-int>Error in line 6</to-int>
+
<error>Integer conversion failed: abc</error>
</pre>
+
</syntaxhighlight>
  
 
Please visit the XQuery 3.0 specification if you want to get more insight into
 
Please visit the XQuery 3.0 specification if you want to get more insight into
[http://www.w3.org/TR/xpath-functions-30/#properties-of-functions function properties].
+
[https://www.w3.org/TR/xpath-functions-31/#properties-of-functions function properties].
  
=Locking=
+
==Updates==
  
By default, a Java function will be executed in parallel with other code. However, if a Java function performs sensitive write operations, it is advisable to explicitly lock the code. This can be realized via locking annotations:
+
The {{Code|@Updating}} annotation can be applied to mark Java functions that perform write or update operations:
  
<pre class="brush:java">
+
<syntaxhighlight lang="java">
   @Lock(write = { "HEAVYIO" })
+
   @Updating
   public void write() {
+
   public void backup() {
 
     // ...
 
     // ...
 
   }
 
   }
 +
</syntaxhighlight>
  
   @Lock(read = { "HEAVYIO" })
+
An XQuery expression will be handled as an [[XQuery Update#Updating Expressions|updating expression]] if it calls an updating Java function. In contrast to XQuery update operations, the Java code will immediately be executed, but the result will be cached as if {{Function|Update|update:output}} was called.
 +
 
 +
The annotation is particularly helpful if combined with a lock annotation.
 +
 
 +
==Locking==
 +
 
 +
By default, a Java function will be executed in parallel with other code. If a Java function performs sensitive operations, it is advisable to explicitly lock the code.
 +
 
 +
===Java Locks===
 +
 
 +
Java provides a handful of mechanisms to control the execution of code. The concurrent execution of functions can be avoided with the {{Code|synchronized}} keyword. For more complex scenarios, the Lock, Semaphore and Atomic classes can be brought into play.
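A minimal sketch of these standard Java mechanisms is shown below. It does not use any BaseX classes; the class name <code>JavaLockSketch</code> and its methods are hypothetical and only illustrate the {{Code|synchronized}} keyword and a read/write lock from <code>java.util.concurrent.locks</code>.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

class JavaLockSketch {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private int counter;
  private int value;

  // synchronized: only one thread at a time runs these methods on the same instance
  synchronized void increment() { counter++; }
  synchronized int counter() { return counter; }

  // Read lock: several readers may hold it at the same time
  int read() {
    lock.readLock().lock();
    try { return value; } finally { lock.readLock().unlock(); }
  }

  // Write lock: exclusive, blocks readers and other writers
  void write(int newValue) {
    lock.writeLock().lock();
    try { value = newValue; } finally { lock.writeLock().unlock(); }
  }

  // Starts several threads that call increment() and returns the final counter.
  static int demo(int threads, int perThread) {
    JavaLockSketch sketch = new JavaLockSketch();
    Thread[] workers = new Thread[threads];
    for (int t = 0; t < threads; t++) {
      workers[t] = new Thread(() -> {
        for (int i = 0; i < perThread; i++) sketch.increment();
      });
      workers[t].start();
    }
    for (Thread worker : workers) {
      try {
        worker.join();
      } catch (InterruptedException ex) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException(ex);
      }
    }
    return sketch.counter();
  }
}
```

Without the {{Code|synchronized}} keyword, concurrent increments could be lost and the final counter could be smaller than the number of calls.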
 +
 
 +
===XQuery Locks===
 +
 
 +
If you want to synchronize the execution of your code with BaseX locks, you can take advantage of the {{Code|@Lock}} annotation:
 +
 
 +
<syntaxhighlight lang="java">
 +
   @Lock("HEAVYIO")
 
   public void read() {
 
   public void read() {
 
     // ...
 
     // ...
 
   }
 
   }
</pre>
 
  
If an XQuery expression is run which calls the Java {{Code|write()}} function, every other query that calls {{Code|write()}} or {{Code|read()}} needs to wait for the query to be finished. If a query calls the {{Code|read()}} function, only those queries are queued that call {{Code|write()}}, because this function is only annotated with a {{Code|read}} lock. More details on parallel query execution can be found in the article on [[Transaction Management]].
+
  @Updating
 +
  @Lock("HEAVYIO")
 +
  public void write() {
 +
    // ...
 +
  }
 +
</syntaxhighlight>
 +
 
 +
If an XQuery expression invokes {{Code|write()}}, any other query that calls {{Code|write()}} or {{Code|read()}} needs to wait for the query to be finished. Calls of {{Code|read()}} can run in parallel with each other, whereas queries will be queued if {{Code|write()}} is called.
 +
 
 +
More details on concurrent querying can be found in the article on [[Transaction Management]].
 +
 
 +
==Data Types==
 +
 
 +
===Conversion to Java===
 +
 
 +
Before Java code is executed, the arguments are converted to Java values, depending on the addressed function or constructor parameters. The original XQuery types and the accepted Java types are listed in the first and second column of the table below.
 +
 
 +
If a numeric value is supplied for which no exact matching is defined, it is cast to the appropriate type unless it exceeds its limits. The following three function calls are equivalent:
 +
 
 +
<syntaxhighlight lang="xquery">
 +
(: exact match :)
 +
Q{String}codePointAt('ABC', xs:int(1)),
 +
(: xs:byte and xs:integer casts :)
 +
Q{String}codePointAt('ABC', xs:byte(1)),
 +
Q{String}codePointAt('ABC', 1)
 +
</syntaxhighlight>
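The idea of a range-checked numeric cast can be illustrated in plain Java. The sketch below is an analogy only and does not show how BaseX performs the conversion: the names are hypothetical, and <code>Math.toIntExact</code> is used as a stand-in for a cast that fails if the value exceeds the limits of the target type.

```java
class NumericCastSketch {
  // Accepts a long (the Java counterpart of xs:integer) and narrows it to int.
  static int codePointAt(String string, long index) {
    // Math.toIntExact throws an ArithmeticException if the value does not fit into an int
    return string.codePointAt(Math.toIntExact(index));
  }

  public static void main(String[] args) {
    // Code point of 'B'
    System.out.println(codePointAt("ABC", 1L));
  }
}
```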
 +
 
 +
===Conversion to XQuery===
 +
 
 +
By default, Java values with the most common types (as shown in the second and third column of the table) are converted to XQuery values. All other values are returned as ''Java items'', which are function items with a wrapped Java value. The results of constructor calls are always returned as Java items.
 +
 
 +
The conversion of the wrapped Java value to XQuery is enforced by invoking the function item: Values in {{Code|Iterator}} and {{Code|Iterable}} instances (Lists, Sets and Collections) are converted to items, and maps are converted to XQuery maps:
 +
 
 +
<syntaxhighlight lang="xquery">
 +
declare namespace Scanner = 'java:java.util.Scanner';
 +
let $scanner := Scanner:new("A B C") => Scanner:useDelimiter(" ")
 +
return $scanner()
 +
</syntaxhighlight>
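The query above relies on the fact that <code>java.util.Scanner</code> implements <code>Iterator&lt;String&gt;</code>. The plain Java sketch below (hypothetical class and method names, no BaseX classes involved) shows the values such an iterator produces for the same input and delimiter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class ScannerSketch {
  // Collects all tokens of the input, separated by single spaces.
  static List<String> tokens(String input) {
    List<String> result = new ArrayList<>();
    // useDelimiter() returns the scanner itself, so the calls can be chained
    try (Scanner scanner = new Scanner(input).useDelimiter(" ")) {
      while (scanner.hasNext()) result.add(scanner.next());
    }
    return result;
  }

  public static void main(String[] args) {
    System.out.println(tokens("A B C"));
  }
}
```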
 +
 
 +
If no conversion is defined, a string is returned, resulting from the {{Code|toString()}} method of the object. This method is also called if the string representation of a Java item is requested:
  
=Data Types=
+
<syntaxhighlight lang="xquery">
 +
(: returns the string representations of a HashMap and an ArrayList instance :)
 +
'Map: ' || Q{java.util.HashMap}new(),
 +
string(Q{java:java.util.ArrayList}new())
 +
</syntaxhighlight>
  
The following table lists the mappings of XQuery and Java types:
+
The conversion can be further controlled with the {{Option|WRAPJAVA}} option. The following values exist:
  
 
{| class="wikitable"
 
{| class="wikitable"
 
|- valign="top"
 
|- valign="top"
! XQuery Type
+
! Value
! Java Type
+
! Description
 +
|- valign="top"
 +
| {{Code|some}}
 +
| The default: Java values of the most common types are converted, others are wrapped into Java items.
 
|- valign="top"
 
|- valign="top"
 +
| {{Code|none}}
 +
| All Java values are converted. If no conversion is defined, a string is returned, resulting from the {{Code|toString()}} method.
 +
|- valign="top"
 +
| {{Code|all}}
 +
| Java values are wrapped into Java items (excluding those inheriting the internal type {{Code|org.basex.query.value.Value}}).
 +
|- valign="top"
 +
| {{Code|instance}}
 +
| If the method of a class instance was called, the Java value is ignored and the instance is wrapped into a Java item. Otherwise, the Java value is returned.
 +
|- valign="top"
 +
| {{Code|void}}
 +
| Java values are ignored, and an empty sequence is returned instead.
 +
|}
 +
 +
In the following example, the result of the first function – a char array – is wrapped and passed on to a {{Code|CharBuffer}} function. Without the option, the single-value array would be converted to an {{Code|xs:unsignedShort}} item and the second function call would fail:
 +
 +
<syntaxhighlight lang="xquery">
 +
(: Without the pragma, the result of toChars would be converted to an xs:unsignedShort item, and the second function call would fail :)
 +
 +
(# db:wrapjava all #) {
 +
  Q{Character}toChars(xs:int(33))
 +
  => Q{java.nio.CharBuffer}wrap()
 +
}
 +
</syntaxhighlight>
 +
 +
The next example demonstrates a use case for the {{Code|instance}} option:
 +
 +
<syntaxhighlight lang="xquery">
 +
(: Thanks to the pragma, the function calls can be chained :)
 +
 +
declare namespace set = 'java:java.util.HashSet';
 +
let $set := (# db:wrapjava instance #) {
 +
  set:new()
 +
  => set:add('1')
 +
  => set:add('2')
 +
}
 +
return $set()
 +
</syntaxhighlight>
 +
 +
The {{Code|void}} option is helpful if side-effecting methods return values that do not contribute to the final result:
 +
 +
<syntaxhighlight lang="xquery">
 +
(: Without the pragma, 100 booleans would be returned by the FLWOR expression :)
 +
 +
declare namespace set = 'java:java.util.HashSet';
 +
let $set := set:new()
 +
return (
 +
  (# db:wrapjava void #) {
 +
    for $i in 1 to 100
 +
    return set:add($set, $i)
 +
  },
 +
  $set()
 +
)
 +
</syntaxhighlight>
 +
 +
The irrelevant results could also be swallowed with {{Function|Profiling|prof:void}}.
 +
 +
{| class="wikitable"
 +
|- valign="top"
 +
! XQuery input
 +
! Expected or returned Java type
 +
! XQuery output
 +
|- valign="top"
 +
| <code>item()*</code> (no conversion)
 +
| <code>org.basex.query.value.Value</code>
 +
| <code>item()*</code> (no conversion)
 +
|- valign="top"
 +
| <code>empty-sequence()</code>
 +
| <code>null</code>
 +
| <code>empty-sequence()</code>
 +
|- valign="top"
 +
| <code>xs:string</code>, <code>xs:untypedAtomic</code>
 +
| <code>String</code>
 
| <code>xs:string</code>
 
| <code>xs:string</code>
| <code>String</code>, <code>char</code>, <code>Character</code>
+
|- valign="top"
 +
| <code>xs:unsignedShort</code>
 +
| <code>char</code>, <code>Character</code>
 +
| <code>xs:unsignedShort</code>
 
|- valign="top"
 
|- valign="top"
 
| <code>xs:boolean</code>
 
| <code>xs:boolean</code>
 
| <code>boolean</code>, <code>Boolean</code>
 
| <code>boolean</code>, <code>Boolean</code>
 +
| <code>xs:boolean</code>
 
|- valign="top"
 
|- valign="top"
 
| <code>xs:byte</code>
 
| <code>xs:byte</code>
 
| <code>byte</code>, <code>Byte</code>
 
| <code>byte</code>, <code>Byte</code>
 +
| <code>xs:byte</code>
 
|- valign="top"
 
|- valign="top"
 
| <code>xs:short</code>
 
| <code>xs:short</code>
 
| <code>short</code>, <code>Short</code>
 
| <code>short</code>, <code>Short</code>
 +
| <code>xs:short</code>
 
|- valign="top"
 
|- valign="top"
 
| <code>xs:int</code>
 
| <code>xs:int</code>
 
| <code>int</code>, <code>Integer</code>
 
| <code>int</code>, <code>Integer</code>
 +
| <code>xs:int</code>
 
|- valign="top"
 
|- valign="top"
| <code>xs:long</code>
+
| <code>xs:integer</code>, <code>xs:long</code>
 
| <code>long</code>, <code>Long</code>
 
| <code>long</code>, <code>Long</code>
 +
| <code>xs:integer</code>
 +
|- valign="top"
 +
| <code>xs:unsignedLong</code>
 +
| <code>java.math.BigInteger</code>
 +
| <code>xs:unsignedLong</code> (lossy)
 +
|- valign="top"
 +
| <code>xs:decimal</code>
 +
| <code>java.math.BigDecimal</code>
 +
| <code>xs:decimal</code>
 
|- valign="top"
 
|- valign="top"
 
| <code>xs:float</code>
 
| <code>xs:float</code>
 
| <code>float</code>, <code>Float</code>
 
| <code>float</code>, <code>Float</code>
 +
| <code>xs:float</code>
 
|- valign="top"
 
|- valign="top"
 
| <code>xs:double</code>
 
| <code>xs:double</code>
 
| <code>double</code>, <code>Double</code>
 
| <code>double</code>, <code>Double</code>
|- valign="top"
+
| <code>xs:double</code>
| <code>xs:decimal</code>
 
| <code>java.math.BigDecimal</code>
 
|- valign="top"
 
| <code>xs:integer</code>
 
| <code>java.math.BigInteger</code>
 
 
|- valign="top"
 
|- valign="top"
 
| <code>xs:QName</code>
 
| <code>xs:QName</code>
 
| <code>javax.xml.namespace.QName</code>
 
| <code>javax.xml.namespace.QName</code>
 +
| <code>xs:QName</code>
 
|- valign="top"
 
|- valign="top"
 
| <code>xs:anyURI</code>
 
| <code>xs:anyURI</code>
 
| <code>java.net.URI</code>, <code>java.net.URL</code>
 
| <code>java.net.URI</code>, <code>java.net.URL</code>
 +
| <code>xs:anyURI</code>
 +
|- valign="top"
 +
| <code>xs:date</code>
 +
| <code>javax.xml.datatype.XMLGregorianCalendar</code>
 +
| <code>xs:date</code>
 +
|- valign="top"
 +
| <code>xs:duration</code>
 +
| <code>javax.xml.datatype.Duration</code>
 +
| <code>xs:duration</code>
 +
|- valign="top"
 +
| <code>node()</code>
 +
| <code>org.w3c.dom.Node</code>
 +
| <code>node()</code>
 +
|- valign="top"
 +
| <code>array(xs:boolean)</code>
 +
| <code>boolean[]</code>
 +
| <code>xs:boolean*</code>
 +
|- valign="top"
 +
| <code>array(xs:string)</code>
 +
| <code>String[]</code>
 +
| <code>xs:string*</code>
 +
|- valign="top"
 +
| <code>array(xs:unsignedShort)</code>
 +
| <code>char[]</code>
 +
| <code>xs:unsignedShort*</code>
 +
|- valign="top"
 +
| <code>array(xs:short)</code>
 +
| <code>short[]</code>
 +
| <code>xs:short*</code>
 +
|- valign="top"
 +
| <code>array(xs:int)</code>
 +
| <code>int[]</code>
 +
| <code>xs:int*</code>
 +
|- valign="top"
 +
| <code>array(xs:integer)</code>, <code>array(xs:long)</code>
 +
| <code>long[]</code>
 +
| <code>xs:integer*</code>
 +
|- valign="top"
 +
| <code>array(xs:float)</code>
 +
| <code>float[]</code>
 +
| <code>xs:float*</code>
 +
|- valign="top"
 +
| <code>array(xs:double)</code>
 +
| <code>double[]</code>
 +
| <code>xs:double*</code>
 +
|- valign="top"
 +
| <code>Object[]</code> (others)
 +
| <code>item()*</code>
 +
| <code>array(*)</code> (others)
 
|- valign="top"
 
|- valign="top"
| ''empty sequence''
+
| <code>map(*)</code>
| <code>null</code>
+
| <code>java.util.HashMap</code>
 +
| <code>Wrapped Java object</code>
 
|}
 
|}
 +
 +
==URI Rewriting==
 +
 +
Before a Java class or module is accessed, its namespace URI will be normalized:
 +
 +
# If the URI is a URL:
 +
## colons will be replaced with slashes,
 +
## in the URI authority, the order of all substrings separated by dots is reversed, and
 +
## dots in the authority and the path are replaced by slashes. If no path exists, a single slash is appended.
 +
# Otherwise, if the URI is a URN, colons will be replaced with slashes.
 +
# Characters other than letters, dots and slashes will be replaced with dashes.
 +
# If the resulting string ends with a slash, the {{Code|index}} string is appended.
 +
 +
If the resulting path has no file suffix, it may point to either an XQuery module or a Java archive:
 +
 +
* {{Code|<nowiki>http://basex.org/modules/hello/World</nowiki>}} → {{Code|org/basex/modules/hello/World}}
 +
* {{Code|<nowiki>http://www.example.com</nowiki>}} → {{Code|com/example/www/index}}
 +
* {{Code|a/little/example}} → {{Code|a/little/example}}
 +
* {{Code|a:b:c}} → {{Code|a/b/c}}
  
 
=Changelog=
 
=Changelog=
 +
 +
; Version 9.6
 +
 * Updated: Java Bindings revised (new mappings, Java function items, {{Option|WRAPJAVA}} option).
 +
 +
; Version 9.4
 +
* Added: Annotation for [[#Updates|updating functions]].
 +
* Updated: Single annotation for read and write locks.
  
 
; Version 8.4
 
; Version 8.4
 +
* Updated: Rewriting rules
  
* Updates: Rewriting rules
+
;Version 8.2
 +
* Added: [[#URI Rewriting|URI Rewriting]]: support for URNs
  
 
; Version 8.0
 
; Version 8.0
 
 
* Added: {{Code|QueryResource}} interface, called after a query has been fully evaluated.
 
* Added: {{Code|QueryResource}} interface, called after a query has been fully evaluated.
  
 
; Version 7.8
 
; Version 7.8
 
 
* Added: Java locking annotations
 
* Added: Java locking annotations
 
* Updated: {{Code|context}} variable has been split into {{Code|queryContext}} and {{Code|staticContext}}.
 
* Updated: {{Code|context}} variable has been split into {{Code|queryContext}} and {{Code|staticContext}}.
  
 
; Version 7.2.1
 
; Version 7.2.1
 
 
* Added: import of Java modules, context awareness
 
* Added: import of Java modules, context awareness
 +
* Added: [[#Packaging|Packaging]], [[#URI Rewriting|URI Rewriting]]

Latest revision as of 10:25, 18 August 2021

This article is part of the XQuery Portal. It demonstrates different ways to invoke Java code from XQuery, and it presents extensions to access the current query context from Java.

The Java Binding feature is an extensibility mechanism which enables developers to directly access Java variables and execute code from XQuery. Addressed Java code must either be contained in the Java classpath, or it must be located in the Repository.

Please bear in mind that the execution of Java code may cause side effects that conflict with the functional nature of XQuery, or may introduce new security risks to your project.

Updated with Version 9.6:

  • With the middle dot notation, three adjacent dots can be used to specify array types.
  • The path to the standard package java.lang. can now be omitted.
  • Java objects are now wrapped into function items.
  • Results of constructor calls are always returned as function item.
  • A new option WRAPJAVA was added to control how Java values are converted to XQuery.
  • The Mapping rules were refined and unified. The most important changes:
    • array(*) type added.
    • xs:integer values are converted to long values.
    • xs:unsignedShort values are converted to char values.
  • All error messages were revised and improved.

Identification

Classes

A Java class is identified by a namespace URI. The original URI is rewritten as follows:

  1. The URI Rewriting steps are applied to the URI.
  2. Slashes in the resulting URI are replaced with dots.
  3. The last path segment of the URI is capitalized and rewritten to CamelCase.

The normalization steps are skipped if the URI is prefixed with java:. The path to the standard package java.lang. can be omitted:

  • http://basex.org/modules/meta-data → org.basex.modules.MetaData
  • java:java.lang.String → java.lang.String
  • StringBuilder → java.lang.StringBuilder
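
Steps 2 and 3 of the rewriting can be sketched in Java as follows. This is a minimal illustration and not the BaseX implementation: the names ClassNameSketch and toClassName are hypothetical, and the input is assumed to have already passed the URI Rewriting steps.

```java
final class ClassNameSketch {
  /** Converts an already rewritten path (e.g. "org/basex/modules/meta-data") to a class name. */
  static String toClassName(String path) {
    int slash = path.lastIndexOf('/');
    // Step 2: slashes in the package part are replaced with dots
    StringBuilder name = new StringBuilder(path.substring(0, slash + 1).replace('/', '.'));
    // Step 3: the last segment is capitalized and rewritten to CamelCase
    for (String word : path.substring(slash + 1).split("-")) {
      if (!word.isEmpty()) {
        name.append(Character.toUpperCase(word.charAt(0))).append(word.substring(1));
      }
    }
    return name.toString();
  }
}
```

Under these assumptions, toClassName("org/basex/modules/meta-data") yields org.basex.modules.MetaData, which matches the first example above.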

Functions and Variables

Java constructors, functions and variables can be referenced and evaluated by the existing XQuery function syntax:

  • The namespace of the function name identifies the Java class.
  • The local part of the name, which is rewritten to camel case, identifies a variable or function of that class.
  • The middle dot character · (&#xB7;, a valid character in XQuery names, but not in Java) can be used to append exact Java parameter types to the function name. Class types must be referenced by their full path. Three adjacent dots can be used to address an array argument.
Addressed code | XQuery | Java
Variable | Q{Integer}MIN_VALUE() | Integer.MIN_VALUE
Function | Q{Object}hash-code($object) | object.hashCode()
Function with argument | Q{String}split·String·int($string, ';', xs:int(3)) | string.split(";", 3)
Constructor with array argument | Q{String}new·byte...(xs:hexBinary('414243')) | new String(new byte[] { 0x41, 0x42, 0x43 })
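
Appending exact parameter types to a function name is comparable to looking up a Java method by its signature via reflection. The sketch below is only an analogy and does not show how BaseX resolves functions internally; the class OverloadSketch and its helper methods are hypothetical.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Method;

final class OverloadSketch {
  /** Looks up String.split(String, int) by its exact parameter types and invokes it. */
  static String[] splitWithLimit(String input) {
    try {
      Method split = String.class.getMethod("split", String.class, int.class);
      return (String[]) split.invoke(input, ";", 3);
    } catch (ReflectiveOperationException ex) {
      throw new IllegalStateException(ex);
    }
  }

  /** Looks up the String(byte[]) constructor; byte[] corresponds to "byte..." in the XQuery name. */
  static String fromBytes(byte[] bytes) {
    try {
      Constructor<String> ctor = String.class.getConstructor(byte[].class);
      // The cast passes the array as a single argument instead of spreading it
      return ctor.newInstance((Object) bytes);
    } catch (ReflectiveOperationException ex) {
      throw new IllegalStateException(ex);
    }
  }
}
```

With the input "a;b;c;d", splitWithLimit returns the three strings "a", "b" and "c;d". The bytes 0x41, 0x42 and 0x43 are decoded to "ABC", assuming an ASCII-compatible default charset.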

As XQuery and Java have different type systems, XQuery arguments must be converted to equivalent Java values, and the result of a Java function is converted back to an XQuery value (see Data Types).

If the Java function you want to address is not detected, you may need to cast your values to the target type. For example, if a Java function expects a primitive int value, you will need to convert your XQuery integers to xs:int.

Namespace Declarations

In the following example, the Java Math class is referenced. When executed, the query returns the cosine of an angle by calling the static method cos(), and the value of π by addressing the static variable via PI():

declare namespace math = "java:java.lang.Math";
math:cos(xs:double(0)), math:PI()

With the Expanded QName notation of XQuery 3.0, the namespace can directly be embedded in the function call:

Q{java:java.lang.Math}cos(xs:double(0))

The constructor of a class can be invoked by calling the virtual function new(). Instance methods can then be called by passing on the resulting Java object as first argument. In the following example, 256 bytes are written to the file output.txt. First, a new FileWriter instance is created, and its write() function is called in the next step:

declare namespace fw = 'java:java.io.FileWriter';
let $file := fw:new('output.txt')
return (
  for $i in 0 to 255
  return fw:write($file, xs:int($i)),
  fw:close($file)
)

If the result of a Java call contains invalid XML characters, it will be rejected. The validity check can be disabled by setting CHECKSTRINGS to false. In the example below, a file with a single 00 byte is written, and this file is then accessed via Java functions:

declare namespace br = 'java:java.io.BufferedReader';
declare namespace fr = 'java:java.io.FileReader';

declare option db:checkstrings 'false';

(: write file :)
file:write-binary('00.bin', xs:hexBinary('00')),
(: read file :)
let $br := br:new(fr:new('00.bin'))
return (
  br:readLine($br), 
  br:close($br)
)

The option can also be specified via a pragma:

(# db:checkstrings #) {
  br:new(fr:new('00.bin')) ! (br:readLine(.), br:close(.))
}

Module Imports

Java classes can also be instantiated by importing them as modules: a new instance of the addressed class will be constructed, which can then be referenced in the query body.

In the (side-effecting) example below, a HashSet instance is created, values are added, and the size of the set is returned. As set:add() returns boolean values, prof:void is used to swallow the values:

import module namespace set = "java:java.util.HashSet";
prof:void(
  for $s in ("one", "two", "one")
  return set:add($s)
),
set:size()

The execution of imported classes is more efficient than the execution of instances that have been created via new(). In turn, no arguments can be supplied in the import statement, and the construction will only be successful if the class can be instantiated without arguments.
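
A class that is suitable for a module import could look like the following sketch. The class WordSet is hypothetical; what matters is that it can be instantiated without arguments. In a real project, it would typically be declared public and be available in the classpath or the repository.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical module class: it relies on the implicit no-argument constructor. */
final class WordSet {
  private final Set<String> words = new HashSet<>();

  /** Adds a word; returns false if the word was already present. */
  public boolean add(String word) {
    return words.add(word);
  }

  /** Returns the number of distinct words added so far. */
  public int size() {
    return words.size();
  }
}
```

Adding "one", "two" and "one" to a single instance results in a size of 2, as in the HashSet example above.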

Integration

Java classes can be coupled more closely to BaseX. If a class inherits the abstract QueryModule class, the two variables queryContext and staticContext become available, which provide access to the global and static context of a query.

The QueryResource interface can be implemented to enforce finalizing operations, such as the closing of opened connections or resources in a module. Its close() method will be called after the XQuery expression has been fully evaluated.

Annotations

The internal properties of functions can be assigned via annotations:

  • Java functions can only be executed by users with Admin permissions. You can annotate a function with @Requires(<Permission>) to also make it accessible to users with fewer privileges.
  • Java code is treated as non-deterministic, as its behavior cannot be predicted by the XQuery processor. You may annotate a function as @Deterministic if you know that it will have no side effects and will always yield the same result.
  • Java code is treated as context-independent. If a function accesses the query context, it should be annotated as @ContextDependent.
  • Java code is treated as focus-independent. If a function accesses the current context item, position or size, it should be annotated as @FocusDependent.

In the following code, information from the static query context is returned by the first function, and a query exception is raised by the second function:

import module namespace context = 'org.basex.examples.query.ContextModule';

element user {
  context:user()
},
try {
  element to-int { context:to-int('abc') }
} catch basex:error {
  element error { $err:description }
}

The imported Java class is shown below:

package org.basex.examples.query;

import org.basex.query.*;
import org.basex.query.value.item.*;
import org.basex.util.*;

/**
 * This example inherits the {@link QueryModule} class and
 * implements the QueryResource interface.
 */
public class ContextModule extends QueryModule implements QueryResource {
  /**
   * Returns the name of the logged-in user.
   * @return user string
   */
  @Requires(Permission.NONE)
  @Deterministic
  @ContextDependent
  public String user() {
    return queryContext.context.user.name;
  }

  /**
   * Converts the specified string to an integer.
   * @param value string to be converted
   * @return resulting integer
   * @throws QueryException query exception
   */
  @Requires(Permission.NONE)
  @Deterministic
  public int toInt(final String value) throws QueryException {
    try {
      return Integer.parseInt(value);
    } catch(NumberFormatException ex) {
      throw new QueryException("Integer conversion failed: " + value);
    }
  }

  @Override
  public void close() {
    // defined in QueryResource interface, will be called after query evaluation
  }
}

The result will look as follows:

<user>admin</user>
<error>Integer conversion failed: abc</error>

Please visit the XQuery 3.0 specification if you want to get more insight into function properties.

Updates

The @Updating annotation can be applied to mark Java functions that perform write or update operations:

  @Updating
  public void backup() {
    // ...
  }

An XQuery expression will be handled as an updating expression if it calls an updating Java function. In contrast to XQuery update operations, the Java code will immediately be executed, but the result will be cached as if update:output was called.

The annotation is particularly helpful if combined with a lock annotation.

Locking

By default, a Java function will be executed in parallel with other code. If a Java function performs sensitive operations, it is advisable to explicitly lock the code.

Java Locks

Java provides a handful of mechanisms to control the execution of code. The concurrent execution of functions can be avoided with the synchronized keyword. For more complex scenarios, the Lock, Semaphore and Atomic classes can be brought into play.
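
The following sketch contrasts three of these mechanisms using only standard JDK classes. The class LockSketch and its methods are hypothetical and merely illustrate the idea; each method guards its own counter.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantLock;

final class LockSketch {
  private long syncTotal;
  private long lockTotal;
  private final ReentrantLock lock = new ReentrantLock();
  private final AtomicLong atomicTotal = new AtomicLong();

  /** Only one thread at a time can run a synchronized method of this instance. */
  synchronized long addSynchronized(long n) {
    syncTotal += n;
    return syncTotal;
  }

  /** An explicit lock; it is released in a finally block so that it is freed on errors, too. */
  long addWithLock(long n) {
    lock.lock();
    try {
      lockTotal += n;
      return lockTotal;
    } finally {
      lock.unlock();
    }
  }

  /** Lock-free update of a single value. */
  long addAtomic(long n) {
    return atomicTotal.addAndGet(n);
  }
}
```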

XQuery Locks

If you want to synchronize the execution of your code with BaseX locks, you can take advantage of the @Lock annotation:

  @Lock("HEAVYIO")
  public void read() {
    // ...
  }

  @Updating
  @Lock("HEAVYIO")
  public void write() {
    // ...
  }

If an XQuery expression invokes write(), any other query that calls write() or read() has to wait until that query has finished. Queries that only call read() can run in parallel, whereas queries that call write() are queued.

More details on concurrent querying can be found in the article on Transaction Management.

Data Types

Conversion to Java

Before Java code is executed, the arguments are converted to Java values, depending on the parameters of the addressed function or constructor. The original XQuery types and the accepted Java types are listed in the first and second columns of the table below.

If a numeric value is supplied for which no exact match is defined, it is cast to the appropriate type unless it exceeds the limits of that type. The following function calls are equivalent:

(: exact match :)
Q{String}codePointAt('ABC', xs:int(1)),
(: xs:byte and xs:integer casts :)
Q{String}codePointAt('ABC', xs:byte(1)),
Q{String}codePointAt('ABC', 1)
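
The idea of a cast that fails when the value exceeds the limits of the target type can be compared to Math.toIntExact in Java. The sketch below is an analogy under that assumption and does not reproduce the conversion code of BaseX; the class NarrowingSketch is hypothetical.

```java
final class NarrowingSketch {
  /** Narrows a long to an int; throws ArithmeticException if the value is out of range. */
  static int toInt(long value) {
    return Math.toIntExact(value);
  }

  /** Returns true if the value can be represented as an int. */
  static boolean fitsInt(long value) {
    return value >= Integer.MIN_VALUE && value <= Integer.MAX_VALUE;
  }

  /** Java counterpart of the calls above: the code point at the given index of "ABC". */
  static int codePoint(long index) {
    return "ABC".codePointAt(toInt(index));
  }
}
```

Here, codePoint(1) returns 66 (the code point of B), whereas toInt(Integer.MAX_VALUE + 1L) raises an exception instead of silently overflowing.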

Conversion to XQuery

By default, Java values with the most common types (as shown in the second and third columns of the table) are converted to XQuery values. All other values are returned as Java items, which are function items with a wrapped Java value. The results of constructor calls are always returned as Java items.

The conversion of the wrapped Java value to XQuery is enforced by invoking the function item: Values in Iterator and Iterable instances (Lists, Sets and Collections) are converted to items, and maps are converted to XQuery maps:

declare namespace Scanner = 'java:java.util.Scanner';
let $scanner := Scanner:new("A B C") => Scanner:useDelimiter(" ")
return $scanner()
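
On the Java side, a Scanner is an Iterator of strings. The sketch below (with the hypothetical class ScannerSketch) collects the values that such an iterator provides for the same input; it only illustrates which values are available for conversion.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

final class ScannerSketch {
  /** Collects all tokens of the input, using a single space as delimiter. */
  static List<String> tokens(String input) {
    List<String> result = new ArrayList<>();
    try (Scanner scanner = new Scanner(input).useDelimiter(" ")) {
      while (scanner.hasNext()) {
        result.add(scanner.next());
      }
    }
    return result;
  }
}
```

For the input "A B C", the list contains the three strings A, B and C.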

If no conversion is defined, a string is returned, resulting from the toString() method of the object. This method is also called if the string representation of a Java item is requested:

(: returns the string representations of a HashMap and an ArrayList instance :)
'Map: ' || Q{java.util.HashMap}new(),
string(Q{java:java.util.ArrayList}new())
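
The strings in question are those produced by toString() in plain Java, as the following sketch with the hypothetical class ToStringSketch shows:

```java
import java.util.ArrayList;
import java.util.HashMap;

final class ToStringSketch {
  /** String concatenation implicitly calls toString() on the empty map. */
  static String mapText() {
    return "Map: " + new HashMap<String, String>();
  }

  /** Explicit toString() call on an empty list. */
  static String listText() {
    return new ArrayList<String>().toString();
  }
}
```

An empty HashMap is represented as {} and an empty ArrayList as [], so the two methods return "Map: {}" and "[]".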

The conversion can be further controlled with the WRAPJAVA option. The following values exist:

Value | Description
some | The default: Java values of the most common types are converted, others are wrapped into Java items.
none | All Java values are converted. If no conversion is defined, a string is returned, resulting from the toString() method.
all | Java values are wrapped into Java items (excluding those inheriting the internal type org.basex.query.value.Value).
instance | If the method of a class instance was called, the Java value is ignored and the instance is wrapped into a Java item. Otherwise, the Java value is returned.
void | Java values are ignored, and an empty sequence is returned instead.

In the following example, the result of the first function – a char array – is wrapped and passed on to a CharBuffer function. Without the option, the single-value array would be converted to an xs:unsignedShort item and the second function call would fail:

(: Without the pragma, the result of toChars would be converted to an xs:unsignedShort item, and the second function call would fail :)

(# db:wrapjava all #) {
  Q{Character}toChars(xs:int(33))
  => Q{java.nio.CharBuffer}wrap()
}
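
In plain Java, the same two calls pass the char array along unchanged, which is the behavior that the all option preserves. The class CharBufferSketch below is a hypothetical illustration:

```java
import java.nio.CharBuffer;

final class CharBufferSketch {
  /** Converts a code point to a char array and wraps the array into a CharBuffer. */
  static String wrapCodePoint(int codePoint) {
    char[] chars = Character.toChars(codePoint);
    CharBuffer buffer = CharBuffer.wrap(chars);
    // toString() returns the remaining characters of the buffer
    return buffer.toString();
  }
}
```

For the code point 33, the buffer contains the single character "!".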

The next example demonstrates a use case for the instance option:

(: Thanks to the pragma, the function calls can be chained :)

declare namespace set = 'java:java.util.HashSet';
let $set := (# db:wrapjava instance #) {
  set:new()
  => set:add('1')
  => set:add('2')
}
return $set()

The void option is helpful if side-effecting methods return values that do not contribute to the final result:

(: Without the pragma, 100 booleans would be returned by the FLWOR expression :)

declare namespace set = 'java:java.util.HashSet';
let $set := set:new()
return (
  (# db:wrapjava void #) {
    for $i in 1 to 100
    return set:add($set, $i)
  },
  $set()
)

The irrelevant results could also be swallowed with prof:void.

XQuery input | Expected or returned Java type | XQuery output
item()* (no conversion) | org.basex.query.value.Value | item()* (no conversion)
empty-sequence() | null | empty-sequence()
xs:string, xs:untypedAtomic | String | xs:string
xs:unsignedShort | char, Character | xs:unsignedShort
xs:boolean | boolean, Boolean | xs:boolean
xs:byte | byte, Byte | xs:byte
xs:short | short, Short | xs:short
xs:int | int, Integer | xs:int
xs:integer, xs:long | long, Long | xs:integer
xs:unsignedLong | java.math.BigInteger | xs:unsignedLong (lossy)
xs:decimal | java.math.BigDecimal | xs:decimal
xs:float | float, Float | xs:float
xs:double | double, Double | xs:double
xs:QName | javax.xml.namespace.QName | xs:QName
xs:anyURI | java.net.URI, java.net.URL | xs:anyURI
xs:date | javax.xml.datatype.XMLGregorianCalendar | xs:date
xs:duration | javax.xml.datatype.Duration | xs:duration
node() | org.w3c.dom.Node | node()
array(xs:boolean) | boolean[] | xs:boolean*
array(xs:string) | String[] | xs:string*
array(xs:unsignedShort) | char[] | xs:unsignedShort*
array(xs:short) | short[] | xs:short*
array(xs:int) | int[] | xs:int*
array(xs:integer), array(xs:long) | long[] | xs:integer*
array(xs:float) | float[] | xs:float*
array(xs:double) | double[] | xs:double*
Object[] (others) | item()* | array(*) (others)
map(*) | java.util.HashMap | Wrapped Java object

URI Rewriting

Before a Java class or module is accessed, its namespace URI will be normalized:

  1. If the URI is a URL:
    1. colons will be replaced with slashes,
    2. in the URI authority, the order of all substrings separated by dots is reversed, and
    3. dots in the authority and the path are replaced by slashes. If no path exists, a single slash is appended.
  2. Otherwise, if the URI is a URN, colons will be replaced with slashes.
  3. Characters other than letters, dots and slashes will be replaced with dashes.
  4. If the resulting string ends with a slash, the index string is appended.

If the resulting path has no file suffix, it may point to either an XQuery module or a Java archive:

  • http://basex.org/modules/hello/World → org/basex/modules/hello/World
  • http://www.example.com → com/example/www/index
  • a/little/example → a/little/example
  • a:b:c → a/b/c
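
The normalization steps can be sketched in Java as follows. This is a simplified illustration based on java.net.URI and not the actual BaseX code: the names UriRewriteSketch and rewrite are hypothetical, the URI scheme of a URL is assumed to be dropped (as the examples suggest), and "letters" in step 3 is taken literally.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class UriRewriteSketch {
  static String rewrite(String uri) {
    String path = uri;
    try {
      URI u = new URI(uri);
      if (u.getAuthority() != null) {
        // Step 1: URL. Reverse the dot-separated parts of the authority
        List<String> parts = new ArrayList<>(Arrays.asList(u.getAuthority().split("\\.")));
        Collections.reverse(parts);
        String p = u.getPath();
        // Dots in the path become slashes; a missing path is replaced by a single slash
        path = String.join("/", parts) + (p == null || p.isEmpty() ? "/" : p.replace('.', '/'));
        path = path.replace(':', '/');
      } else if (u.isOpaque()) {
        // Step 2: URN-like string. Colons become slashes
        path = uri.replace(':', '/');
      }
    } catch (URISyntaxException ex) {
      // Assumption: strings that cannot be parsed are only subject to steps 3 and 4
    }
    // Step 3: characters other than letters, dots and slashes become dashes
    path = path.replaceAll("[^\\p{L}./]", "-");
    // Step 4: a trailing slash is completed with "index"
    return path.endsWith("/") ? path + "index" : path;
  }
}
```

Applied to the four examples above, the sketch returns org/basex/modules/hello/World, com/example/www/index, a/little/example and a/b/c.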

Changelog

Version 9.6
  • Updated: Java Bindings revised (new mappings, Java function items, WRAPJAVA option).
Version 9.4
  • Added: Annotation for updating functions.
  • Updated: Single annotation for read and write locks.
Version 8.4
  • Updated: Rewriting rules
Version 8.2
  • Added: URI Rewriting: support for URNs
Version 8.0
  • Added: QueryResource interface, called after a query has been fully evaluated.
Version 7.8
  • Added: Java locking annotations
  • Updated: context variable has been split into queryContext and staticContext.
Version 7.2.1