Difference between revisions of "Archive Module"

From BaseX Documentation
m (Text replacement - "syntaxhighlight" to "pre")
 
(14 intermediate revisions by the same user not shown)
Line 1: Line 1:
This [[Module Library|XQuery Module]] contains functions to handle archives (including ePub, Open Office, JAR, and many other formats). New ZIP and GZIP archives can be created, existing archives can be updated, and the archive entries can be listed and extracted. The [[#archive:extract-binary|archive:extract-binary]] function includes an example for writing the contents of an archive to disk.
+
This [[Module Library|XQuery Module]] contains functions to handle archives (including ePub, Open Office, JAR, and many other formats). New ZIP and GZIP archives can be created, existing archives can be updated, and the archive entries can be listed and extracted. The {{Function||archive:extract-binary}} function includes an example for writing the contents of an archive to disk.
  
 
=Conventions=
 
=Conventions=
Line 10: Line 10:
  
 
{| width='100%'
 
{| width='100%'
|-
+
|- valign="top"
| width='120' | '''Signatures'''
+
| width='120' | '''Signature'''
|{{Func|archive:entries|$archive as xs:base64Binary|element(archive:entry)*}}<br />
+
|<pre>archive:entries(
|-
+
  $archive as xs:base64Binary
 +
) as element(archive:entry)*</pre>
 +
|- valign="top"
 
| '''Summary'''
 
| '''Summary'''
 
|Returns the entry descriptors of the specified {{Code|$archive}}. A descriptor contains the following attributes, provided that they are available in the archive format:
 
|Returns the entry descriptors of the specified {{Code|$archive}}. A descriptor contains the following attributes, provided that they are available in the archive format:
Line 20: Line 22:
 
* {{Code|compressed-size}}: compressed file size
 
* {{Code|compressed-size}}: compressed file size
 
An example:
 
An example:
<syntaxhighlight lang="xml">
+
<pre lang="xml">
 
<archive:entry size="1840" last-modified="2009-03-20T03:30:32" compressed-size="672">
 
<archive:entry size="1840" last-modified="2009-03-20T03:30:32" compressed-size="672">
 
   doc/index.html
 
   doc/index.html
 
</archive:entry>
 
</archive:entry>
</syntaxhighlight>
+
</pre>
|-
+
|- valign="top"
 
| '''Errors'''
 
| '''Errors'''
 
|{{Error|error|#Errors}} archive creation failed.
 
|{{Error|error|#Errors}} archive creation failed.
|-
+
|- valign="top"
 
|'''Examples'''
 
|'''Examples'''
 
|Sums up the file sizes of all entries of a JAR file:
 
|Sums up the file sizes of all entries of a JAR file:
<syntaxhighlight lang="xquery">
+
<pre lang='xquery'>
 
sum(archive:entries(file:read-binary('zip.zip'))/@size)
 
sum(archive:entries(file:read-binary('zip.zip'))/@size)
</syntaxhighlight>
+
</pre>
 
|}
 
|}
  
Line 39: Line 41:
  
 
{| width='100%'
 
{| width='100%'
|-
+
|- valign="top"
| width='120' | '''Signatures'''
+
| width='120' | '''Signature'''
|{{Func|archive:options|$archive as xs:base64Binary|map(*)}}<br />
+
|<pre>archive:options(
|-
+
  $archive as xs:base64Binary
 +
) as map(*)</pre>
 +
|- valign="top"
 
| '''Summary'''
 
| '''Summary'''
|Returns the options of the specified {{Code|$archive}} in the format specified by [[#archive:create|archive:create]].
+
|Returns the options of the specified {{Code|$archive}} in the format specified by {{Function||archive:create}}.
|-
+
|- valign="top"
 
| '''Errors'''
 
| '''Errors'''
 
|{{Error|format|#Errors}} The archive format is not supported.<br/>{{Error|error|#Errors}} archive creation failed.
 
|{{Error|format|#Errors}} The archive format is not supported.<br/>{{Error|error|#Errors}} archive creation failed.
|-
+
|- valign="top"
 
| '''Examples'''
 
| '''Examples'''
 
|A standard ZIP archive will return the following options:
 
|A standard ZIP archive will return the following options:
<syntaxhighlight lang="xquery">
+
<pre lang='xquery'>
 
map {
 
map {
 
   "format": "zip",
 
   "format": "zip",
 
   "algorithm": "deflate"
 
   "algorithm": "deflate"
 
}
 
}
</syntaxhighlight>
+
</pre>
 
|}
 
|}
  
Line 62: Line 66:
  
 
{| width='100%'
 
{| width='100%'
|-
+
|- valign="top"
| width='120' | '''Signatures'''
+
| width='120' | '''Signature'''
|{{Func|archive:extract-text|$archive as xs:base64Binary|xs:string*}}<br/>{{Func|archive:extract-text|$archive as xs:base64Binary, $entries as item()*|xs:string*}}<br/>{{Func|archive:extract-text|$archive as xs:base64Binary, $entries as item()*, $encoding as xs:string|xs:string*}}<br/>
+
|<pre>archive:extract-text(
|-
+
  $archive   as xs:base64Binary,
 +
  $entries   as item()*         := (),
 +
  $encoding as xs:string       := ()
 +
) as xs:string*</pre>
 +
|- valign="top"
 
| '''Summary'''
 
| '''Summary'''
|Extracts entries of the specified {{Code|$archive}} and returns them as texts.<br/>The returned entries can be limited via {{Code|$entries}}. The format of the argument is the same as for [[#archive:create|archive:create]] (attributes will be ignored).<br/>The encoding of the input files can be specified via {{Code|$encoding}}.
+
|Extracts entries of the specified {{Code|$archive}} and returns them as texts.<br/>The returned entries can be limited via {{Code|$entries}}. The format of the argument is the same as for {{Function||archive:create}} (attributes will be ignored).<br/>The encoding of the input files can be specified via {{Code|$encoding}}.
|-
+
|- valign="top"
 
| '''Errors'''
 
| '''Errors'''
|{{Error|encode|#Errors}} the specified encoding is invalid or not supported, or the string conversion failed. Invalid XML characters will be ignored if {{Option|CHECKSTRINGS}} is turned off.<br />{{Error|error|#Errors}} archive creation failed.
+
|{{Error|encode|#Errors}} the specified encoding is invalid or not supported, or the string conversion failed. Invalid XML characters will be ignored if {{Option|CHECKSTRINGS}} is turned off.<br/>{{Error|error|#Errors}} archive creation failed.
|-
+
|- valign="top"
 
| '''Examples'''
 
| '''Examples'''
 
|The following expression extracts all {{Code|.txt}} files from an archive:
 
|The following expression extracts all {{Code|.txt}} files from an archive:
<syntaxhighlight lang="xquery">
+
<pre lang='xquery'>
 
let $archive := file:read-binary("documents.zip")
 
let $archive := file:read-binary("documents.zip")
 
for $entry in archive:entries($archive)[ends-with(., '.txt')]
 
for $entry in archive:entries($archive)[ends-with(., '.txt')]
 
return archive:extract-text($archive, $entry)
 
return archive:extract-text($archive, $entry)
</syntaxhighlight>
+
</pre>
 
|}
 
|}
  
Line 84: Line 92:
  
 
{| width='100%'
 
{| width='100%'
|-
+
|- valign="top"
| width='120' | '''Signatures'''
+
| width='120' | '''Signature'''
|{{Func|archive:extract-binary|$archive as xs:base64Binary|xs:base64Binary*}}<br/>{{Func|archive:extract-binary|$archive as xs:base64Binary, $entries as item()*|xs:base64Binary*}}
+
|<pre>archive:extract-binary(
|-
+
  $archive as xs:base64Binary,
 +
  $entries as item()*         := ()
 +
) as xs:base64Binary*</pre>
 +
|- valign="top"
 
| '''Summary'''
 
| '''Summary'''
|Extracts entries of the specified {{Code|$archive}} and returns them as binaries.<br/>The returned entries can be limited via {{Code|$entries}}. The format of the argument is the same as for [[#archive:create|archive:create]] (attributes will be ignored).
+
|Extracts entries of the specified {{Code|$archive}} and returns them as binaries.<br/>The returned entries can be limited via {{Code|$entries}}. The format of the argument is the same as for {{Function||archive:create}} (attributes will be ignored).
|-
+
|- valign="top"
 
| '''Errors'''
 
| '''Errors'''
 
|{{Error|error|#Errors}} archive creation failed.
 
|{{Error|error|#Errors}} archive creation failed.
|-
+
|- valign="top"
 
| '''Examples'''
 
| '''Examples'''
 
|This example unzips all files of an archive to the current directory:
 
|This example unzips all files of an archive to the current directory:
<syntaxhighlight lang="xquery">
+
<pre lang='xquery'>
 
let $archive  := file:read-binary('archive.zip')
 
let $archive  := file:read-binary('archive.zip')
 
let $entries  := archive:entries($archive)
 
let $entries  := archive:entries($archive)
Line 103: Line 114:
 
   file:create-dir(replace($entry, "[^/]+$", "")),
 
   file:create-dir(replace($entry, "[^/]+$", "")),
 
   file:write-binary($entry, $content)
 
   file:write-binary($entry, $content)
})</syntaxhighlight>
+
})</pre>
 
|}
 
|}
  
Line 111: Line 122:
  
 
{| width='100%'
 
{| width='100%'
|-
+
|- valign="top"
| width='120' | '''Signatures'''  
+
| width='120' | '''Signature'''  
|{{Func|archive:create|$entries as item(), $contents as item()*|xs:base64Binary}}<br />{{Func|archive:create|$entries as item(), $contents as item()*, $options as map(*)?|xs:base64Binary}}<br />
+
|<pre>archive:create(
|-
+
  $entries   as item(),
 +
  $contents as item()*,
 +
  $options   as map(*)? := map { }
 +
) as xs:base64Binary</pre>
 +
|- valign="top"
 
| '''Summary'''
 
| '''Summary'''
 
|Creates a new archive from the specified entries and contents.<br/>The {{Code|$entries}} argument contains meta information required to create new entries. All items may either be of type {{Code|xs:string}}, representing the entry name, or {{Code|element(archive:entry)}}, containing the name as text node and additional, optional attributes:
 
|Creates a new archive from the specified entries and contents.<br/>The {{Code|$entries}} argument contains meta information required to create new entries. All items may either be of type {{Code|xs:string}}, representing the entry name, or {{Code|element(archive:entry)}}, containing the name as text node and additional, optional attributes:
Line 121: Line 136:
 
* {{Code|encoding}}: for textual entries (default: UTF-8)
 
* {{Code|encoding}}: for textual entries (default: UTF-8)
 
An example:
 
An example:
<syntaxhighlight lang="xml">
+
<pre lang="xml">
 
<archive:entry last-modified='2011-11-11T11:11:11'
 
<archive:entry last-modified='2011-11-11T11:11:11'
 
               compression-level='8'
 
               compression-level='8'
 
               encoding='US-ASCII'>hello.txt</archive:entry>
 
               encoding='US-ASCII'>hello.txt</archive:entry>
</syntaxhighlight>
+
</pre>
 
The actual {{Code|$contents}} must be {{Code|xs:string}} or {{Code|xs:base64Binary}} items.<br/>
 
The actual {{Code|$contents}} must be {{Code|xs:string}} or {{Code|xs:base64Binary}} items.<br/>
 
The {{Code|$options}} parameter contains archiving options:
 
The {{Code|$options}} parameter contains archiving options:
 
* {{Code|format}}: allowed values are {{Code|zip}} and {{Code|gzip}}. {{Code|zip}} is the default.
 
* {{Code|format}}: allowed values are {{Code|zip}} and {{Code|gzip}}. {{Code|zip}} is the default.
 
* {{Code|algorithm}}: allowed values are {{Code|deflate}} and {{Code|stored}} (for the {{Code|zip}} format). {{Code|deflate}} is the default.
 
* {{Code|algorithm}}: allowed values are {{Code|deflate}} and {{Code|stored}} (for the {{Code|zip}} format). {{Code|deflate}} is the default.
|-
+
|- valign="top"
 
| '''Errors'''
 
| '''Errors'''
|{{Error|number|#Errors}} the number of entries and contents differs.<br />{{Error|format|#Errors}} the specified option or its value is invalid or not supported.<br />{{Error|descriptor|#Errors}} entry descriptors contain invalid entry names, timestamps or compression levels.<br/>{{Error|encode|#Errors}} the specified encoding is invalid or not supported, or the string conversion failed. Invalid XML characters will be ignored if {{Option|CHECKSTRINGS}} is turned off.<br/>{{Error|single|#Errors}} the chosen archive format only allows single entries.<br />{{Error|error|#Errors}} archive creation failed.
+
|{{Error|number|#Errors}} the number of entries and contents differs.<br/>{{Error|format|#Errors}} the specified option or its value is invalid or not supported.<br/>{{Error|descriptor|#Errors}} entry descriptors contain invalid entry names, timestamps or compression levels.<br/>{{Error|encode|#Errors}} the specified encoding is invalid or not supported, or the string conversion failed. Invalid XML characters will be ignored if {{Option|CHECKSTRINGS}} is turned off.<br/>{{Error|single|#Errors}} the chosen archive format only allows single entries.<br/>{{Error|error|#Errors}} archive creation failed.
|-
+
|- valign="top"
 
| '''Examples'''
 
| '''Examples'''
 
|The following one-liner creates an archive {{Code|archive.zip}} with one file {{Code|file.txt}}:
 
|The following one-liner creates an archive {{Code|archive.zip}} with one file {{Code|file.txt}}:
<syntaxhighlight lang="xquery">
+
<pre lang='xquery'>
 
archive:create(<archive:entry>file.txt</archive:entry>, 'Hello World')
 
archive:create(<archive:entry>file.txt</archive:entry>, 'Hello World')
</syntaxhighlight>
+
</pre>
 
The following function creates an archive {{Code|mp3.zip}}, which contains all MP3 files of a local directory:
 
The following function creates an archive {{Code|mp3.zip}}, which contains all MP3 files of a local directory:
<syntaxhighlight lang="xquery">
+
<pre lang='xquery'>
 
let $path  := 'audio/'
 
let $path  := 'audio/'
 
let $files := file:list($path, true(), '*.mp3')
 
let $files := file:list($path, true(), '*.mp3')
Line 147: Line 162:
 
   return file:read-binary($path || $file)
 
   return file:read-binary($path || $file)
 
)
 
)
return file:write-binary('mp3.zip', $zip)</syntaxhighlight>
+
return file:write-binary('mp3.zip', $zip)</pre>
 
|}
 
|}
  
Line 153: Line 168:
  
 
{| width='100%'
 
{| width='100%'
|-
+
|- valign="top"
| width='120' | '''Signatures'''
+
| width='120' | '''Signature'''
|{{Func|archive:update|$archive as xs:base64Binary, $entries as item()*, $contents as item()*|xs:base64Binary}}
+
|<pre>archive:update(
|-
+
  $archive   as xs:base64Binary,
 +
  $entries   as item()*,
 +
  $contents as item()*
 +
) as xs:base64Binary</pre>
 +
|- valign="top"
 
| '''Summary'''
 
| '''Summary'''
|Creates an updated version of the specified {{Code|$archive}} with new or replaced entries.<br/>The format of {{Code|$entries}} and {{Code|$contents}} is the same as for [[#archive:create|archive:create]].
+
|Creates an updated version of the specified {{Code|$archive}} with new or replaced entries.<br/>The format of {{Code|$entries}} and {{Code|$contents}} is the same as for {{Function||archive:create}}.
|-
+
|- valign="top"
 
| '''Errors'''
 
| '''Errors'''
|{{Error|number|#Errors}} the number of entries and contents differs.<br />{{Error|descriptor|#Errors}} entry descriptors contain invalid entry names, timestamps, compression levels or encodings.<br/>{{Error|encode|#Errors}} the specified encoding is invalid or not supported, or the string conversion failed. Invalid XML characters will be ignored if {{Option|CHECKSTRINGS}} is turned off.<br />{{Error|modify|#Errors}} the entries of the given archive cannot be modified.<br/>{{Error|error|#Errors}} archive creation failed.
+
|{{Error|number|#Errors}} the number of entries and contents differs.<br/>{{Error|descriptor|#Errors}} entry descriptors contain invalid entry names, timestamps, compression levels or encodings.<br/>{{Error|encode|#Errors}} the specified encoding is invalid or not supported, or the string conversion failed. Invalid XML characters will be ignored if {{Option|CHECKSTRINGS}} is turned off.<br/>{{Error|modify|#Errors}} the entries of the given archive cannot be modified.<br/>{{Error|error|#Errors}} archive creation failed.
|-
+
|- valign="top"
 
| '''Examples'''
 
| '''Examples'''
 
|This example replaces texts in a Word document:
 
|This example replaces texts in a Word document:
<syntaxhighlight lang="xquery">
+
<pre lang='xquery'>
 
declare variable $input  := "HelloWorld.docx";
 
declare variable $input  := "HelloWorld.docx";
 
declare variable $output := "HelloUniverse.docx";
 
declare variable $output := "HelloUniverse.docx";
Line 177: Line 196:
 
let $updated := archive:update($archive, $doc, $entry)
 
let $updated := archive:update($archive, $doc, $entry)
 
return file:write-binary($output, $updated)
 
return file:write-binary($output, $updated)
</syntaxhighlight>
+
</pre>
 
|}
 
|}
  
Line 183: Line 202:
  
 
{| width='100%'
 
{| width='100%'
|-
+
|- valign="top"
| width='120' | '''Signatures'''
+
| width='120' | '''Signature'''
|{{Func|archive:delete|$archive as xs:base64Binary, $entries as item()*|xs:base64Binary}}
+
|<pre>archive:delete(
|-
+
  $archive as xs:base64Binary,
 +
  $entries as item()*
 +
) as xs:base64Binary</pre>
 +
|- valign="top"
 
| '''Summary'''
 
| '''Summary'''
|Deletes entries from an {{Code|$archive}}.<br/>The format of {{Code|$entries}} is the same as for [[#archive:create|archive:create]].
+
|Deletes entries from an {{Code|$archive}}.<br/>The format of {{Code|$entries}} is the same as for {{Function||archive:create}}.
|-
+
|- valign="top"
 
| '''Errors'''
 
| '''Errors'''
 
|{{Error|modify|#Errors}} the entries of the given archive cannot be modified.<br/>{{Error|error|#Errors}} archive creation failed.
 
|{{Error|modify|#Errors}} the entries of the given archive cannot be modified.<br/>{{Error|error|#Errors}} archive creation failed.
|-
+
|- valign="top"
 
| '''Examples'''
 
| '''Examples'''
 
|This example deletes all HTML files in an archive and creates a new file:
 
|This example deletes all HTML files in an archive and creates a new file:
<syntaxhighlight lang="xquery">
+
<pre lang='xquery'>
 
let $zip := file:read-binary('old.zip')
 
let $zip := file:read-binary('old.zip')
 
let $entries := archive:entries($zip)[matches(., '\.x?html?$', 'i')]
 
let $entries := archive:entries($zip)[matches(., '\.x?html?$', 'i')]
 
return file:write-binary('new.zip', archive:delete($zip, $entries))
 
return file:write-binary('new.zip', archive:delete($zip, $entries))
</syntaxhighlight>
+
</pre>
 
|}
 
|}
  
Line 207: Line 229:
  
 
{| width='100%'
 
{| width='100%'
|-
+
|- valign="top"
| width='120' | '''Signatures'''
+
| width='120' | '''Signature'''
|{{Func|archive:create-from|$path as xs:string|xs:base64Binary}}<br/>{{Func|archive:create-from|$path as xs:string, $options as map(*)?|xs:base64Binary}}<br/>{{Func|archive:create-from|$path as xs:string, $options as map(*)?, $entries as item()*|xs:base64Binary}}
+
|<pre>archive:create-from(
|-
+
  $path     as xs:string,
 +
  $options as map(*)?   := map { },
 +
  $entries as item()*   := ()
 +
) as xs:base64Binary</pre>
 +
|- valign="top"
 
| '''Summary'''
 
| '''Summary'''
|This convenience function creates an archive from all files in the specified directory {{Code|$path}}.<br/>The {{Code|$options}} parameter contains archiving options, and the files to be archived can be limited via {{Code|$entries}}. The format of the two last arguments is identical to [[#archive:create|archive:create]], but two additional options are available:
+
|This convenience function creates an archive from all files in the specified directory {{Code|$path}}.<br/>The {{Code|$options}} parameter contains archiving options, and the files to be archived can be limited via {{Code|$entries}}. The format of the two last arguments is identical to {{Function||archive:create}}, with two additional options:
 
* {{Code|recursive}}: parse all files recursively (default: {{Code|true}}; ignored if entries are specified via the last argument).
 
* {{Code|recursive}}: parse all files recursively (default: {{Code|true}}; ignored if entries are specified via the last argument).
 
* {{Code|root-dir}}: use name of supplied directory as archive root directory (default: {{Code|false}}).
 
* {{Code|root-dir}}: use name of supplied directory as archive root directory (default: {{Code|false}}).
|-
+
|- valign="top"
 
| '''Errors'''
 
| '''Errors'''
 
|{{Error|file:no-dir|File Module#Errors}} the specified path does not point to a directory.<br/>{{Error|file:is-dir|File Module#Errors}} one of the specified entries points to a directory.<br/>{{Error|file:not-found|File Module#Errors}} a specified entry does not exist.<br/>{{Error|error|#Errors}} archive creation failed.
 
|{{Error|file:no-dir|File Module#Errors}} the specified path does not point to a directory.<br/>{{Error|file:is-dir|File Module#Errors}} one of the specified entries points to a directory.<br/>{{Error|file:not-found|File Module#Errors}} a specified entry does not exist.<br/>{{Error|error|#Errors}} archive creation failed.
|-
+
|- valign="top"
 
| '''Examples'''
 
| '''Examples'''
 
|This example writes the files of a user’s home directory to <code>archive.zip</code>:
 
|This example writes the files of a user’s home directory to <code>archive.zip</code>:
<syntaxhighlight lang="xquery">
+
<pre lang='xquery'>
 
let $zip := archive:create-from('/home/user/')
 
let $zip := archive:create-from('/home/user/')
 
return file:write-binary('archive.zip', $zip)
 
return file:write-binary('archive.zip', $zip)
</syntaxhighlight>
+
</pre>
 
|}
 
|}
  
Line 230: Line 256:
  
 
{| width='100%'
 
{| width='100%'
|-
+
|- valign="top"
| width='120' | '''Signatures'''
+
| width='120' | '''Signature'''
|{{Func|archive:extract-to|$path as xs:string, $archive as xs:base64Binary|empty-sequence()}}<br/>{{Func|archive:extract-to|$path as xs:string, $archive as xs:base64Binary, $entries as item()*|empty-sequence()}}
+
|<pre>archive:extract-to(
|-
+
  $path     as xs:string,
 +
  $archive as xs:base64Binary,
 +
  $entries as item()*         := ()
 +
) as empty-sequence()</pre>
 +
|- valign="top"
 
| '''Summary'''
 
| '''Summary'''
|This convenience function writes files of an {{Code|$archive}} directly to the specified directory {{Code|$path}}.<br/>The archive entries to be written can be restricted via {{Code|$entries}}. The format of the argument is the same as for [[#archive:create|archive:create]] (attributes will be ignored).
+
|This convenience function writes files of an {{Code|$archive}} directly to the specified directory {{Code|$path}}.<br/>The archive entries to be written can be restricted via {{Code|$entries}}. The format of the argument is the same as for {{Function||archive:create}} (attributes will be ignored).
|-
+
|- valign="top"
 
| '''Errors'''
 
| '''Errors'''
 
|{{Error|error|#Errors}} archive creation failed.
 
|{{Error|error|#Errors}} archive creation failed.
|-
+
|- valign="top"
 
| '''Examples'''
 
| '''Examples'''
 
|The following expression unzips all files of an archive to the current directory:
 
|The following expression unzips all files of an archive to the current directory:
<syntaxhighlight lang="xquery">
+
<pre lang='xquery'>
 
archive:extract-to('.', file:read-binary('archive.zip'))
 
archive:extract-to('.', file:read-binary('archive.zip'))
</syntaxhighlight>
+
</pre>
 
|}
 
|}
  
 
==archive:write==
 
==archive:write==
 
{{Mark|Introduced with BaseX 9.6}}
 
  
 
{| width='100%'
 
{| width='100%'
|-
+
|- valign="top"
| width='120' | '''Signatures'''  
+
| width='120' | '''Signature'''  
|{{Func|archive:write|$path as xs:string, $entries as item(), $contents as item()*|xs:base64Binary}}<br />{{Func|archive:write|$path as xs:string, $entries as item(), $contents as item()*, $options as map(*)?|xs:base64Binary}}<br />
+
|<pre>archive:write(
|-
+
  $path     as xs:string,
 +
  $entries   as item(),
 +
  $contents as item()*,
 +
  $options   as map(*)? := map { }
 +
) as xs:base64Binary</pre>
 +
|- valign="top"
 
| '''Summary'''
 
| '''Summary'''
 
|This convenience function creates a new archive from the specified {{Code|$entries}} and {{Code|$contents}} and writes it to disk.<br/> See {{Function||archive:create}} for more details.
 
|This convenience function creates a new archive from the specified {{Code|$entries}} and {{Code|$contents}} and writes it to disk.<br/> See {{Function||archive:create}} for more details.
|-
+
|- valign="top"
 
| '''Errors'''
 
| '''Errors'''
|{{Error|number|#Errors}} the number of entries and contents differs.<br />{{Error|format|#Errors}} the specified option or its value is invalid or not supported.<br />{{Error|descriptor|#Errors}} entry descriptors contain invalid entry names, timestamps or compression levels.<br/>{{Error|encode|#Errors}} the specified encoding is invalid or not supported, or the string conversion failed. Invalid XML characters will be ignored if {{Option|CHECKSTRINGS}} is turned off.<br/>{{Error|single|#Errors}} the chosen archive format only allows single entries.<br />{{Error|error|#Errors}} archive creation failed.
+
|{{Error|number|#Errors}} the number of entries and contents differs.<br/>{{Error|format|#Errors}} the specified option or its value is invalid or not supported.<br/>{{Error|descriptor|#Errors}} entry descriptors contain invalid entry names, timestamps or compression levels.<br/>{{Error|encode|#Errors}} the specified encoding is invalid or not supported, or the string conversion failed. Invalid XML characters will be ignored if {{Option|CHECKSTRINGS}} is turned off.<br/>{{Error|single|#Errors}} the chosen archive format only allows single entries.<br/>{{Error|error|#Errors}} archive creation failed.
|-
+
|- valign="top"
 
| '''Examples'''
 
| '''Examples'''
|All mp3 files from a directory are zipped and written to a file:
+
|All mp3 files from a directory are zipped and written to a file, along with an info file:
<syntaxhighlight lang="xquery">
+
<pre lang='xquery'>
 
let $files := file:children('music')[ends-with(., 'mp3')]
 
let $files := file:children('music')[ends-with(., 'mp3')]
 
return archive:write(
 
return archive:write(
 
   'music.zip',
 
   'music.zip',
   $files ! file:name(.),
+
   ('info.txt', $files ! file:name(.)),
   $files ! file:read-binary(.)
+
   ('Archive with MP3 files', $files ! file:read-binary(.))
 
)
 
)
</syntaxhighlight>
+
</pre>
 
|}
 
|}
  
Line 279: Line 312:
 
! width="110"|Code
 
! width="110"|Code
 
|Description
 
|Description
|-
+
|- valign="top"
 
|{{Code|descriptor}}
 
|{{Code|descriptor}}
 
|Entry descriptors contain invalid entry names, timestamps or compression levels.
 
|Entry descriptors contain invalid entry names, timestamps or compression levels.
|-
+
|- valign="top"
 
|{{Code|encode}}
 
|{{Code|encode}}
 
|The specified encoding is invalid or not supported, or the string conversion failed. Invalid XML characters will be ignored if {{Option|CHECKSTRINGS}} is turned off.
 
|The specified encoding is invalid or not supported, or the string conversion failed. Invalid XML characters will be ignored if {{Option|CHECKSTRINGS}} is turned off.
|-
+
|- valign="top"
 
|{{Code|error}}
 
|{{Code|error}}
|Archive processing failed.
+
| processing failed.
|-
+
|- valign="top"
 
|{{Code|format}}
 
|{{Code|format}}
 
|The archive format or the specified option is invalid or not supported.
 
|The archive format or the specified option is invalid or not supported.
|-
+
|- valign="top"
 
|{{Code|modify}}
 
|{{Code|modify}}
 
|The entries of the given archive cannot be modified.
 
|The entries of the given archive cannot be modified.
|-
+
|- valign="top"
 
|{{Code|number}}
 
|{{Code|number}}
 
|The number of specified entries and contents differs.
 
|The number of specified entries and contents differs.
|-
+
|- valign="top"
 
|{{Code|single}}
 
|{{Code|single}}
 
|The chosen archive format only allows single entries.
 
|The chosen archive format only allows single entries.
Line 305: Line 338:
  
 
;Version 9.6
 
;Version 9.6
* Added: [[#archive:write|archive:write]]
+
* Added: {{Function||archive:write}}
  
 
;Version 9.0
 
;Version 9.0
* Updated: [[#archive:create-from|archive:create-from]]: options added
+
* Updated: {{Function||archive:create-from}}: options added
 
* Updated: error codes updated; errors now use the module namespace
 
* Updated: error codes updated; errors now use the module namespace
  
 
;Version 8.5
 
;Version 8.5
* Updated: [[#archive:options|archive:options]]: map returned instead of element
+
* Updated: {{Function||archive:options}}: map returned instead of element
  
 
;Version 8.3
 
;Version 8.3
* Added: [[#archive:create-from|archive:create-from]], [[#archive:extract-to|archive:extract-to]] (replaces <code>archive:write</code>)
+
* Added: {{Function||archive:create-from}}, {{Function||archive:extract-to}} (replaces <code>archive:write</code>)
  
 
The module was introduced with Version 7.3.
 
The module was introduced with Version 7.3.

Latest revision as of 18:38, 1 December 2023

This XQuery Module contains functions to handle archives (including ePub, Open Office, JAR, and many other formats). New ZIP and GZIP archives can be created, existing archives can be updated, and the archive entries can be listed and extracted. The archive:extract-binary function includes an example for writing the contents of an archive to disk.

Conventions

All functions and errors in this module are assigned to the http://basex.org/modules/archive namespace, which is statically bound to the archive prefix.
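
Because errors share the module namespace, they can be caught by name using the archive prefix. The following is a minimal sketch, assuming BaseX 9.0 or later (where errors use the module namespace) and assuming that a mismatch between entries and contents raises the number error as described in the error list:

```xquery
try {
  (: two entry names, but only one content item :)
  archive:create(('a.txt', 'b.txt'), 'only one content')
} catch archive:number {
  (: $err:code and $err:description are bound inside the catch clause :)
  'Caught ' || $err:code || ': ' || $err:description
}
```

The file names and the content string are illustrative only.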

Content Handling

archive:entries

Signature
archive:entries(
  $archive  as xs:base64Binary
) as element(archive:entry)*
Summary
Returns the entry descriptors of the specified $archive. A descriptor contains the following attributes, provided that they are available in the archive format:
  • size: original file size
  • last-modified: timestamp, formatted as xs:dateTime
  • compressed-size: compressed file size

An example:

<archive:entry size="1840" last-modified="2009-03-20T03:30:32" compressed-size="672">
  doc/index.html
</archive:entry>
Errors
error: archive creation failed.
Examples
Sums up the file sizes of all entries of a JAR file:
sum(archive:entries(file:read-binary('zip.zip'))/@size)
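
The descriptors can also be processed one by one. As a small sketch that reuses the archive name from the example above, the following query lists the entries, largest first. The size attribute may be missing for some archive formats, in which case the size part of the string stays empty:

```xquery
let $archive := file:read-binary('zip.zip')
for $entry in archive:entries($archive)
(: @size is optional; entries without it yield an empty sort key :)
order by xs:integer($entry/@size) descending
return string($entry) || ': ' || $entry/@size || ' bytes'
```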

archive:options

Signature
archive:options(
  $archive  as xs:base64Binary
) as map(*)
Summary
Returns the options of the specified $archive in the format specified by archive:create.
Errors
format: The archive format is not supported.
error: archive creation failed.
Examples
A standard ZIP archive will return the following options:
map {
  "format": "zip",
  "algorithm": "deflate"
}
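
Since the result is a map, individual options can be read with the lookup operator. A minimal sketch, assuming an archive file named zip.zip exists in the working directory:

```xquery
let $archive := file:read-binary('zip.zip')
let $options := archive:options($archive)
return if ($options?format = 'zip')
  then 'ZIP archive, algorithm: ' || $options?algorithm
  else 'Other format: ' || $options?format
```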

archive:extract-text

Signature
archive:extract-text(
  $archive   as xs:base64Binary,
  $entries   as item()*          := (),
  $encoding  as xs:string        := ()
) as xs:string*
Summary
Extracts entries of the specified $archive and returns them as texts.
The returned entries can be limited via $entries. The format of the argument is the same as for archive:create (attributes will be ignored).
The encoding of the input files can be specified via $encoding.
Errors
encode: the specified encoding is invalid or not supported, or the string conversion failed. Invalid XML characters will be ignored if CHECKSTRINGS is turned off.
error: archive creation failed.
Examples
The following expression extracts all .txt files from an archive:
let $archive := file:read-binary("documents.zip")
for $entry in archive:entries($archive)[ends-with(., '.txt')]
return archive:extract-text($archive, $entry)
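
An entry can also be addressed directly by name, and the encoding can be passed as the third argument. The sketch below is illustrative only: the entry name notes/readme.txt and the choice of ISO-8859-1 are assumptions, not part of the original example:

```xquery
let $archive := file:read-binary('documents.zip')
(: the entry name is given as a plain string; the third argument
   tells the function how the stored bytes are encoded :)
return archive:extract-text($archive, 'notes/readme.txt', 'ISO-8859-1')
```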

archive:extract-binary

Signature
archive:extract-binary(
  $archive  as xs:base64Binary,
  $entries  as item()*          := ()
) as xs:base64Binary*
Summary
Extracts entries of the specified $archive and returns them as binaries.
The returned entries can be limited via $entries. The format of the argument is the same as for archive:create (attributes will be ignored).
Errors
error: archive creation failed.
Examples
This example unzips all files of an archive to the current directory:
let $archive  := file:read-binary('archive.zip')
let $entries  := archive:entries($archive)
let $contents := archive:extract-binary($archive)
return for-each-pair($entries, $contents, function($entry, $content) {
  file:create-dir(replace($entry, "[^/]+$", "")),
  file:write-binary($entry, $content)
})
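
To extract a single entry instead of the whole archive, the entry name can be supplied as second argument. A minimal sketch, in which the entry name images/logo.png and the target file name are hypothetical:

```xquery
let $archive := file:read-binary('archive.zip')
let $image   := archive:extract-binary($archive, 'images/logo.png')
return if (exists($image))
  then file:write-binary('logo.png', $image)
  else ()  (: nothing is written if the entry was not found :)
```

The guard avoids passing an empty sequence to file:write-binary when the archive contains no matching entry.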

Updates

archive:create

Signature
archive:create(
  $entries   as item(),
  $contents  as item()*,
  $options   as map(*)?  := map { }
) as xs:base64Binary
Summary
Creates a new archive from the specified entries and contents.
The $entries argument contains meta information required to create new entries. All items may either be of type xs:string, representing the entry name, or element(archive:entry), containing the name as text node and additional, optional attributes:
  • last-modified: timestamp, specified as xs:dateTime (default: current time)
  • compression-level: 0-9, 0 = uncompressed (default: 8)
  • encoding: for textual entries (default: UTF-8)

An example:

<archive:entry last-modified='2011-11-11T11:11:11'
               compression-level='8'
               encoding='US-ASCII'>hello.txt</archive:entry>

The actual $contents must be xs:string or xs:base64Binary items.
The $options parameter contains archiving options:

  • format: allowed values are zip and gzip. zip is the default.
  • algorithm: allowed values are deflate and stored (for the zip format). deflate is the default.
Errors number: the number of entries and contents differs.
format: the specified option or its value is invalid or not supported.
descriptor: entry descriptors contain invalid entry names, timestamps or compression levels.
encode: the specified encoding is invalid or not supported, or the string conversion failed. Invalid XML characters will be ignored if CHECKSTRINGS is turned off.
single: the chosen archive format only allows single entries.
error: archive creation failed.
Examples The following one-liner creates an archive with one file file.txt (the archive is returned as xs:base64Binary and is not written to disk):
archive:create(<archive:entry>file.txt</archive:entry>, 'Hello World')

The following query creates an archive mp3.zip, which contains all MP3 files of a local directory:

let $path  := 'audio/'
let $files := file:list($path, true(), '*.mp3')
let $zip   := archive:create($files,
  for $file in $files
  return file:read-binary($path || $file)
)
return file:write-binary('mp3.zip', $zip)
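
The entry descriptors and the $options parameter can be combined. The following is a sketch under the assumption that two small textual entries are to be archived; the names, contents and attribute values are chosen for illustration only:

```xquery
(: Entry descriptors with optional attributes (illustrative values). :)
let $entries := (
  <archive:entry compression-level='9'>notes.txt</archive:entry>,
  <archive:entry last-modified='2011-11-11T11:11:11'>hello.txt</archive:entry>
)
(: One content item per entry; both options are set to their documented defaults. :)
let $archive := archive:create(
  $entries,
  ('some notes', 'Hello World'),
  map { 'format': 'zip', 'algorithm': 'deflate' }
)
(: List the descriptors of the newly created archive. :)
return archive:entries($archive)
```

The number of items in $entries and $contents must match; otherwise, the number error is raised.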

archive:update

Signature
archive:update(
  $archive   as xs:base64Binary,
  $entries   as item()*,
  $contents  as item()*
) as xs:base64Binary
Summary Creates an updated version of the specified $archive with new or replaced entries.
The format of $entries and $contents is the same as for archive:create.
Errors number: the number of entries and contents differs.
descriptor: entry descriptors contain invalid entry names, timestamps, compression levels or encodings.
encode: the specified encoding is invalid or not supported, or the string conversion failed. Invalid XML characters will be ignored if CHECKSTRINGS is turned off.
modify: the entries of the given archive cannot be modified.
error: archive creation failed.
Examples This example replaces texts in a Word document:
declare variable $input  := "HelloWorld.docx";
declare variable $output := "HelloUniverse.docx";
declare variable $doc    := "word/document.xml";
 
let $archive := file:read-binary($input)
let $entry   :=
  copy $c := fn:parse-xml(archive:extract-text($archive, $doc))
  modify replace value of node $c//*[text() = "HELLO WORLD!"] with "HELLO UNIVERSE!"
  return fn:serialize($c)
let $updated := archive:update($archive, $doc, $entry)
return file:write-binary($output, $updated)
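
A smaller sketch, assuming an in-memory archive with made-up entries, shows that a single call can both replace an existing entry and add a new one. The entries are sorted by name because the order of entries in the updated archive is not specified here:

```xquery
(: Original archive with two textual entries (illustrative data). :)
let $archive := archive:create(('a.txt', 'b.txt'), ('old a', 'old b'))
(: 'a.txt' is replaced, 'c.txt' is added, 'b.txt' is left untouched. :)
let $updated := archive:update($archive, ('a.txt', 'c.txt'), ('new a', 'new c'))
for $entry in archive:entries($updated)
order by string($entry)
return string($entry) || ': ' || archive:extract-text($updated, $entry)
```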

archive:delete

Signature
archive:delete(
  $archive  as xs:base64Binary,
  $entries  as item()*
) as xs:base64Binary
Summary Deletes entries from an $archive.
The format of $entries is the same as for archive:create.
Errors modify: the entries of the given archive cannot be modified.
error: archive creation failed.
Examples This example deletes all HTML files in an archive and creates a new file:
let $zip := file:read-binary('old.zip')
let $entries := archive:entries($zip)[matches(., '\.x?html?$', 'i')]
return file:write-binary('new.zip', archive:delete($zip, $entries))
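
The same filter can be tried without touching the file system. This sketch assumes an in-memory archive with three made-up entries and returns the names that remain after deletion:

```xquery
(: Archive with two HTML files and one stylesheet (illustrative data). :)
let $archive := archive:create(
  ('index.html', 'style.css', 'about.XHTML'),
  ('<html/>', 'body {}', '<html/>')
)
(: Case-insensitive match on .htm, .html, .xhtm and .xhtml suffixes. :)
let $html := archive:entries($archive)[matches(., '\.x?html?$', 'i')]
(: Names of the entries that are left in the new archive. :)
return archive:entries(archive:delete($archive, $html)) ! string(.)
```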

Convenience

archive:create-from

Signature
archive:create-from(
  $path     as xs:string,
  $options  as map(*)?    := map { },
  $entries  as item()*    := ()
) as xs:base64Binary
Summary This convenience function creates an archive from all files in the specified directory $path.
The $options parameter contains archiving options, and the files to be archived can be limited via $entries. The format of the last two arguments is the same as for archive:create, with two additional options:
  • recursive: process all files recursively (default: true; ignored if entries are specified via the last argument).
  • root-dir: use name of supplied directory as archive root directory (default: false).
Errors file:no-dir: the specified path does not point to a directory.
file:is-dir: one of the specified entries points to a directory.
file:not-found: a specified entry does not exist.
error: archive creation failed.
Examples This example writes the files of a user’s home directory to archive.zip:
let $zip := archive:create-from('/home/user/')
return file:write-binary('archive.zip', $zip)
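
The effect of the recursive option can be sketched as follows. The query assumes that a temporary directory may be created; the directory prefix, file names and contents are made up for illustration, and the directory is removed again at the end:

```xquery
(: Temporary directory; the prefix 'archive-demo' is an arbitrary choice. :)
let $dir := file:create-temp-dir('archive-demo', '')
return (
  (: Create one file at the top level and one in a subdirectory. :)
  file:write-text($dir || '/a.txt', 'A'),
  file:create-dir($dir || '/sub'),
  file:write-text($dir || '/sub/b.txt', 'B'),
  (: Number of entries without descending into subdirectories. :)
  count(archive:entries(archive:create-from($dir, map { 'recursive': false() }))),
  (: Number of entries with the default recursive traversal. :)
  count(archive:entries(archive:create-from($dir))),
  (: Clean up the temporary directory. :)
  file:delete($dir, true())
)
```

With recursive set to false, only the top-level file should be archived, whereas the default call should also include the file in the subdirectory.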

archive:extract-to

Signature
archive:extract-to(
  $path     as xs:string,
  $archive  as xs:base64Binary,
  $entries  as item()*          := ()
) as empty-sequence()
Summary This convenience function writes files of an $archive directly to the specified directory $path.
The archive entries to be written can be restricted via $entries. The format of the argument is the same as for archive:create (attributes will be ignored).
Errors error: archive creation failed.
Examples The following expression unzips all files of an archive to the current directory:
archive:extract-to('.', file:read-binary('archive.zip'))
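
Restricting the written files via $entries might look as follows. This is a sketch that assumes an in-memory archive with made-up entries and a temporary target directory, which is deleted again after the extracted file has been read:

```xquery
(: In-memory archive with two textual entries (illustrative data). :)
let $archive := archive:create(('a.txt', 'docs/b.txt'), ('A', 'B'))
(: Temporary target directory; the prefix is an arbitrary choice. :)
let $dir := file:create-temp-dir('extract-demo', '')
return (
  (: Write only 'docs/b.txt' to the target directory. :)
  archive:extract-to($dir, $archive, 'docs/b.txt'),
  (: Read the extracted file back to check its contents. :)
  file:read-text($dir || '/docs/b.txt'),
  (: Clean up the temporary directory. :)
  file:delete($dir, true())
)
```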

archive:write

Signature
archive:write(
  $path      as xs:string,
  $entries   as item()*,
  $contents  as item()*,
  $options   as map(*)?  := map { }
) as empty-sequence()
Summary This convenience function creates a new archive from the specified $entries and $contents and writes it to disk.
See archive:create for more details.
Errors number: the number of entries and contents differs.
format: the specified option or its value is invalid or not supported.
descriptor: entry descriptors contain invalid entry names, timestamps or compression levels.
encode: the specified encoding is invalid or not supported, or the string conversion failed. Invalid XML characters will be ignored if CHECKSTRINGS is turned off.
single: the chosen archive format only allows single entries.
error: archive creation failed.
Examples All MP3 files from a directory are zipped and written to a file, along with an info file:
let $files := file:children('music')[ends-with(., 'mp3')]
return archive:write(
  'music.zip',
  ('info.txt', $files ! file:name(.)),
  ('Archive with MP3 files', $files ! file:read-binary(.))
)
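
A variant that does not depend on an existing directory is sketched below. It assumes that a temporary file may be created; the file prefix, entry names and contents are made up for illustration:

```xquery
(: Temporary target file; prefix and suffix are arbitrary choices. :)
let $path := file:create-temp-file('write-demo', '.zip')
return (
  (: Create the archive and write it directly to the target file. :)
  archive:write($path, ('info.txt', 'data.txt'), ('Archive info', 'payload')),
  (: Read the file back and list the names of its entries. :)
  archive:entries(file:read-binary($path)) ! string(.),
  (: Remove the temporary file. :)
  file:delete($path)
)
```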

Errors

Code        Description
descriptor  Entry descriptors contain invalid entry names, timestamps or compression levels.
encode      The specified encoding is invalid or not supported, or the string conversion failed. Invalid XML characters will be ignored if CHECKSTRINGS is turned off.
error       Processing failed.
format      The archive format or the specified option is invalid or not supported.
modify      The entries of the given archive cannot be modified.
number      The number of specified entries and contents differs.
single      The chosen archive format only allows single entries.
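
Since the errors use the module namespace, they can be caught by name. The following sketch provokes the number error on purpose by supplying two entries but only one content item; the entry names are made up for illustration:

```xquery
try {
  (: Two entries, but only one content item: this is expected to fail. :)
  archive:create(('a.txt', 'b.txt'), 'only one content')
} catch archive:number {
  (: $err:code holds the QName of the error that was raised. :)
  'caught: ' || local-name-from-QName($err:code)
}
```

The other codes of the table can be caught in the same way, or all at once with a wildcard such as archive:*.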

Changelog

Version 9.6
Version 9.0
  • Updated: archive:create-from: options added
  • Updated: error codes updated; errors now use the module namespace
Version 8.5
Version 8.3

The module was introduced with Version 7.3.