Difference between revisions of "Server Protocol"

From BaseX Documentation

Revision as of 14:53, 17 January 2016

This page presents the classes and functions of the BaseX clients and the underlying protocol, which is used for communicating with the database server. A detailed example demonstrates what a concrete byte exchange can look like.

Workflow

  • All clients are based on the client/server architecture. Hence, a BaseX database server must be started first.
  • Each client provides a session class or script with methods to connect to and communicate with the database server. A socket connection will be established by the constructor, which expects a host, port, user name and password as arguments.
  • The execute() method is called to launch a database command. It returns the result or throws an exception with the received error message.
  • The query() method creates a query instance. Variables and the context item can be bound to that instance, and the result can either be requested via execute(), or in an iterative manner with the more() and next() functions. If an error occurs, an exception will be thrown.
  • The create(), add(), replace() and store() methods pass input streams on to the corresponding database commands.
  • To speed up execution, an output stream can be specified by some clients; this way, all results will be directed to that output stream.
  • Most clients are accompanied by some example files, which demonstrate how database commands can be executed or how queries can be evaluated.

Transfer Protocol

All clients use the following client/server protocol to communicate with the server. The description of the protocol is helpful if you want to implement your own client.

Conventions

  • \xx: a single byte, written as two hexadecimal digits.
  • {...}: UTF-8 strings or raw data, suffixed with a \00 byte. To avoid confusion with this end-of-string byte, all transferred \00 and \FF bytes are prefixed by an additional \FF byte.
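The escaping rule above can be sketched in Python as follows. This is a minimal illustration rather than code taken from any BaseX client; the function names encode_string and decode_string are hypothetical.

```python
from typing import Tuple


def encode_string(data: bytes) -> bytes:
    """Prefix every \\00 and \\FF byte with \\FF, then append the \\00 terminator."""
    out = bytearray()
    for b in data:
        if b in (0x00, 0xFF):
            out.append(0xFF)  # escape byte
        out.append(b)
    out.append(0x00)  # end-of-string marker
    return bytes(out)


def decode_string(buf: bytes, pos: int = 0) -> Tuple[bytes, int]:
    """Read one string starting at pos; return (payload, index after the terminator).

    Raises IndexError if the buffer ends before the terminator is found.
    """
    out = bytearray()
    while True:
        b = buf[pos]
        pos += 1
        if b == 0x00:
            return bytes(out), pos
        if b == 0xFF:  # the next byte is literal data
            b = buf[pos]
            pos += 1
        out.append(b)
```

Plain text such as a user name passes through unchanged apart from the terminator; only raw data containing \00 or \FF bytes grows in size.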

Authentication

Digest

Digest authentication has been used since Version 8.0:

  1. Client connects to server socket
  2. Server sends a realm and nonce, separated by a colon: {realm:nonce}
  3. Client sends the user name and a hash value. The hash is composed of the md5 hash of
    1. the md5 hash of the user name, realm, and password (all separated by a colon), and
    2. the nonce: {username} {md5(md5(username:realm:password) + nonce)}
  4. Server replies with \00 (success) or \01 (error)
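A minimal Python sketch of step 3, under two assumptions: the hashes are exchanged as lowercase hexadecimal strings, and "+" denotes plain string concatenation. The helper names md5_hex and digest_reply are hypothetical.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Lowercase hexadecimal md5 digest of a UTF-8 encoded string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_reply(user: str, password: str, greeting: str) -> bytes:
    """Build the client's reply to a 'realm:nonce' greeting from the server."""
    realm, nonce = greeting.split(":", 1)
    inner = md5_hex(f"{user}:{realm}:{password}")
    outer = md5_hex(inner + nonce)
    # Two \00-terminated strings: the user name, then the hash.
    return user.encode("utf-8") + b"\x00" + outer.encode("ascii") + b"\x00"
```

After sending these bytes, the client reads a single status byte: \00 means the login succeeded, \01 means it failed.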

CRAM-MD5

CRAM-MD5 was discarded because unsalted md5 hashes could easily be uncovered using rainbow tables. However, most client bindings still support the outdated handshake, as it differs only slightly from the new protocol:

  1. Client connects to server socket
  2. Server sends a nonce (timestamp): {nonce}
  3. Client sends the user name and a hash value. The hash is composed of the md5 hash of
    1. the md5 of the password and
    2. the nonce: {username} {md5(md5(password) + nonce)}
  4. Server replies with \00 (success) or \01 (error)

Clients can easily be implemented to support both digest and cram-md5 authentication: if the first server response contains no colon, cram-md5 should be chosen.
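The colon check and the legacy hash could look like this in Python. It is a sketch only: the names pick_method and cram_md5_hash are hypothetical, and the hashes are assumed to be lowercase hexadecimal strings.

```python
import hashlib


def pick_method(greeting: str) -> str:
    """Choose the handshake based on the first server response."""
    return "digest" if ":" in greeting else "cram-md5"


def cram_md5_hash(password: str, nonce: str) -> str:
    """Legacy hash: md5(md5(password) + nonce), as hexadecimal strings."""
    inner = hashlib.md5(password.encode("utf-8")).hexdigest()
    return hashlib.md5((inner + nonce).encode("utf-8")).hexdigest()
```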

Command Protocol

The following byte sequences are sent and received by the client (please note that a specific client may not support all of the presented commands):

Command  | Client Request            | Server Response              | Description
COMMAND  | {command}                 | {result} {info} \00          | Executes a database command.
QUERY    | \00 {query}               | {id} \00                     | Creates a new query instance and returns its id.
CREATE   | \08 {name} {input}        | {info} \00                   | Creates a new database with the specified input (may be empty).
ADD      | \09 {name} {path} {input} | {info} \00                   | Adds a new resource to the opened database.
REPLACE  | \0C {path} {input}        | {info} \00                   | Replaces a resource with the specified input.
STORE    | \0D {path} {input}        | {info} \00                   | Stores a binary resource in the opened database.
↯ error  |                           | {partial result} {error} \01 | Error feedback.
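As an illustration of the request column, the following Python sketch assembles the client-side byte sequences. The function names and the keyword argument data are hypothetical; only the command bytes and the string convention come from this page.

```python
CREATE, ADD, REPLACE, STORE = 0x08, 0x09, 0x0C, 0x0D


def wire_string(data: bytes) -> bytes:
    """Escape \\00 and \\FF with a leading \\FF and append the \\00 terminator."""
    out = bytearray()
    for b in data:
        if b in (0x00, 0xFF):
            out.append(0xFF)
        out.append(b)
    return bytes(out) + b"\x00"


def command_request(command: str) -> bytes:
    """COMMAND: the command string without a leading command byte."""
    return wire_string(command.encode("utf-8"))


def query_request(query: str) -> bytes:
    """QUERY: a \\00 byte followed by the query string."""
    return b"\x00" + wire_string(query.encode("utf-8"))


def input_request(code: int, *args: str, data: bytes) -> bytes:
    """CREATE, ADD, REPLACE, STORE: command byte, string arguments, then the input."""
    head = b"".join(wire_string(a.encode("utf-8")) for a in args)
    return bytes([code]) + head + wire_string(data)
```

For example, input_request(ADD, "db", "docs/a.xml", data=b"<a/>") yields the ADD byte followed by three terminated strings, in the order given in the table.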

Query Command Protocol

Queries are referenced via an id, which has been returned by the QUERY command (see above).

Query Command | Client Request                 | Server Response                   | Description
CLOSE         | \02 {id}                       | \00 \00                           | Closes and unregisters the query with the specified id.
BIND          | \03 {id} {name} {value} {type} | \00 \00                           | Binds a value to a variable. The type will be ignored if the string is empty.
RESULTS       | \04 {id}                       | \xx {item} ... \xx {item} \00     | Returns all resulting items as strings, prefixed by a single byte (\xx) that represents the Type ID. This command is called by the more() function of a client implementation.
EXECUTE       | \05 {id}                       | {result} \00                      | Executes the query and returns the result as a single string.
INFO          | \06 {id}                       | {result} \00                      | Returns a string with query compilation and profiling info.
OPTIONS       | \07 {id}                       | {result} \00                      | Returns a string with all query serialization parameters, which can e.g. be assigned to the SERIALIZER option.
CONTEXT       | \0E {id} {value} {type}        | \00 \00                           | Binds a value to the context. The type will be ignored if the string is empty.
UPDATING      | \1E {id}                       | {result} \00                      | Returns true if the query contains updating expressions; false otherwise.
FULL          | \1F {id}                       | XDM {item} ... XDM {item} \00     | Returns all resulting items as strings, prefixed by the XDM Meta Data. This command is e.g. used by the XQJ API.

As can be seen in the table, all results end with a single \00 byte, which indicates that the process was successful. If an error occurs, an additional byte \01 is sent, which is then followed by the error message string.
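One possible reading of this rule for single-string responses such as EXECUTE, INFO, OPTIONS and UPDATING is sketched below in Python. It assumes that the terminated result string is followed by one status byte, and that a \01 status is followed by a terminated error string. The names read_string, read_result and ServerError are hypothetical.

```python
import io


class ServerError(Exception):
    """Raised when the server reports an error (status byte \\01)."""


def read_string(stream) -> bytes:
    """Read one \\00-terminated string, undoing the \\FF escaping."""
    out = bytearray()
    while True:
        b = stream.read(1)
        if not b:
            raise EOFError("stream ended inside a string")
        if b == b"\x00":
            return bytes(out)
        if b == b"\xff":  # the next byte is literal data
            b = stream.read(1)
        out += b


def read_result(stream) -> str:
    """Read '{result}' plus the status byte; raise ServerError on \\01."""
    result = read_string(stream).decode("utf-8")
    if stream.read(1) == b"\x01":
        raise ServerError(read_string(stream).decode("utf-8"))
    return result


# Usage with an in-memory stream standing in for the socket:
print(read_result(io.BytesIO(b"true\x00\x00")))
```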

Binding Sequences

Sequences can also be bound to variables and the context:

  • empty-sequence() must be supplied as type if an empty sequence is to be bound.
  • Multiple items are supplied via the {value} argument and separated with \01 bytes.
  • Item types are specified by appending \02 and the type in its string representation to an item. If no item type is specified, the general type is used.

Some examples for the {value} argument:

  • the two integers 123 and 789 are encoded as 123, \01, 789 and \00 (xs:integer may be specified via the {type} argument).
  • the two items xs:integer(123) and xs:string('ABC') are encoded as 123, \02, xs:integer, \01, ABC, \02, xs:string and \00.
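The two examples can be reproduced with a short Python sketch. The function name encode_sequence and the (value, type) pair representation are hypothetical choices for illustration; escaping of \00 and \FF bytes inside values is left out for brevity.

```python
from typing import Iterable, Optional, Tuple


def encode_sequence(items: Iterable[Tuple[str, Optional[str]]]) -> bytes:
    """Encode (value, item type) pairs as a {value} argument.

    Items are separated by \\01; an item type, if given, is appended
    to its item after a \\02 byte. The result ends with \\00.
    """
    parts = []
    for value, item_type in items:
        part = value.encode("utf-8")
        if item_type:
            part += b"\x02" + item_type.encode("utf-8")
        parts.append(part)
    return b"\x01".join(parts) + b"\x00"
```

Calling encode_sequence([("123", None), ("789", None)]) produces the first example; passing ("123", "xs:integer") and ("ABC", "xs:string") produces the second.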

Example

In the following example, a client registers a new session and executes the INFO database command. Next, it creates a new query instance for the XQuery expression 1, 2+'3'. The query is then evaluated, and the server returns the result of the first subexpression 1 and an error for the second subexpression. Finally, the query instance and client session are closed.

  • Client connects to the database server socket
  • Server sends realm and timestamp "BaseX:1369578179679": ◄ 42 61 73 65 58 3A 31 33 36 39 35 37 38 31 37 39 36 37 39 00
  • Client sends user name "jack": 6A 61 63 6B 00 ►
  • Client sends hash: md5(md5("jack:BaseX:topsecret") + "1369578179679") = "ca664a31f8deda9b71ea3e79347f6666": 63 61 36 ... 00 ►
  • Server replies with success code: ◄ 00
  • Client sends the "INFO" command: 49 4E 46 4F 00 ►
  • Server responds with the result "General Information...": ◄ 47 65 6e 65 ... 00
  • Server additionally sends an (empty) info string: ◄ 00
  • Client creates a new query instance for the XQuery "1, 2+'3'": 00 31 2C 20 32 2B 27 33 27 00 ►
  • Server returns query id "1" and a success code: ◄ 31 00 00
  • Client requests the query results via the RESULTS protocol command and its query id: 04 31 00 ►
  • Server returns the first result ("1", type xs:integer): ◄ 52 31 00
  • Server sends a single \00 byte instead of a new result, which indicates that no more results can be expected: ◄ 00
  • Server sends the error code \01 and the error message ("Stopped at..."): ◄ 01 53 74 6f ... 00
  • Client closes the query instance: 02 31 00 ►
  • Server sends a response (which is equal to an empty info string) and success code: ◄ 00 00
  • Client closes the socket connection
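To follow the hex dumps in this exchange, a small Python helper can turn them back into readable text. This is only a reading aid; the function name render is hypothetical, and only the byte sequences that the example gives in full are included.

```python
def render(hex_bytes: str) -> str:
    """Show printable ASCII as text and every other byte in \\xx notation."""
    raw = bytes.fromhex(hex_bytes)
    return "".join(chr(b) if 0x20 <= b < 0x7F else f"\\{b:02X}" for b in raw)


exchange = [
    ("server", "42 61 73 65 58 3A 31 33 36 39 35 37 38 31 37 39 36 37 39 00"),
    ("client", "6A 61 63 6B 00"),
    ("client", "49 4E 46 4F 00"),
    ("client", "00 31 2C 20 32 2B 27 33 27 00"),
    ("server", "31 00 00"),
    ("server", "52 31 00"),
]

for sender, hex_bytes in exchange:
    print(f"{sender}: {render(hex_bytes)}")
```

In the last line, the leading byte 52 is the type ID that precedes the item "1".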

Constructors and Functions

Most language bindings provide the following constructors and functions:

Session

  • Create and return session with host, port, user name and password:
    Session(String host, int port, String name, String password)
  • Execute a command and return the result:
    String execute(String command)
  • Return a query instance for the specified query:
    Query query(String query)
  • Create a database from an input stream:
    void create(String name, InputStream input)
  • Add a document to the current database from an input stream:
    void add(String path, InputStream input)
  • Replace a document with the specified input stream:
    void replace(String path, InputStream input)
  • Store raw data at the specified path:
    void store(String path, InputStream input)
  • Return process information:
    String info()
  • Close the session:
    void close()
 

Query

  • Create query instance with session and query:
    Query(Session session, String query)
  • Bind an external variable:
    void bind(String name, String value, String type)
    The type can be an empty string.
  • Bind the context item:
    void context(String value, String type)
    The type can be an empty string.
  • Execute the query and return the result:
    String execute()
  • Iterator: check if a query returns more items:
    boolean more()
  • Iterator: return the next item:
    String next()
  • Return query information:
    String info()
  • Return serialization parameters:
    String options()
  • Return whether the query may perform updates:
    boolean updating()
  • Close the query:
    void close()
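The iterator functions more() and next() can be built on top of the RESULTS command. The Python sketch below parses an already received RESULTS response from a byte stream. It assumes that a status byte follows the end-of-items marker, as in the error case of the example above; the class name ResultIterator is hypothetical, and sending the request is left out.

```python
import io


class ResultIterator:
    """Caches the items of one RESULTS response and hands them out one by one."""

    def __init__(self, stream):
        self._stream = stream
        self._cache = None
        self._pos = 0

    def _read_string(self) -> bytes:
        out = bytearray()
        while True:
            b = self._stream.read(1)
            if b in (b"", b"\x00"):  # terminator or end of stream
                return bytes(out)
            if b == b"\xff":  # the next byte is literal data
                b = self._stream.read(1)
            out += b

    def more(self) -> bool:
        if self._cache is None:
            self._cache = []
            while True:
                type_id = self._stream.read(1)  # Type ID byte, or \00 at the end
                if type_id in (b"", b"\x00"):
                    break
                self._cache.append(self._read_string().decode("utf-8"))
            if self._stream.read(1) == b"\x01":  # error status
                raise IOError(self._read_string().decode("utf-8"))
        return self._pos < len(self._cache)

    def next(self) -> str:
        if not self.more():
            raise StopIteration
        item = self._cache[self._pos]
        self._pos += 1
        return item
```

A real client would wrap the socket's input stream instead of an in-memory buffer; io.BytesIO is convenient for trying the parser without a server.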

Changelog

Version 8.2
  • Removed: WATCH and UNWATCH commands
Version 8.0
  • Updated: cram-md5 replaced with digest authentication
  • Updated: BIND command: support more than one item
Version 7.2
  • Added: Query Commands CONTEXT, UPDATING and FULL
  • Added: Client function context(String value, String type)