Archive Functions
This module contains functions to handle archives (including ePub, Open Office, JAR, and many other formats). New ZIP and GZIP archives can be created, existing archives can be updated, and the archive entries can be listed and extracted.
Updated:
- Input archives can now be addressed via file paths and URIs.
- Archive entries can be deleted by specifying empty arrays as contents.
All functions and errors in this module are assigned to the http://basex.org/modules/archive namespace, which is statically bound to the archive prefix.
Signature | archive:entries(
$archive as (xs:string|xs:base64Binary|xs:hexBinary)
) as element(archive:entry)* |
---|
Summary | Returns the entry descriptors of the specified $archive (supplied as binary or URI/file path). A descriptor contains the following attributes, provided that they are available in the archive format:
- size: original file size
- last-modified: timestamp, formatted as xs:dateTime
- compressed-size: compressed file size
An example:
<archive:entry size="1840" last-modified="2024-03-20T03:30:32" compressed-size="672">
doc/index.html
</archive:entry>
|
---|
Errors | |
---|
Examples | sum(archive:entries(file:read-binary('zip.zip'))/@size)
Sums up the file sizes of all entries of a ZIP file. |
---|
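The descriptor attributes can also be used for filtering and sorting. The following is a minimal sketch, assuming a local file archive.zip (a hypothetical name) whose format provides the size and last-modified attributes; entries without these attributes are simply skipped by the filter.

```xquery
(: Sketch: names of all entries larger than 1 MB, newest first.
   'archive.zip' is a hypothetical file path. :)
for $entry in archive:entries('archive.zip')
where xs:integer($entry/@size) > 1000000
order by xs:dateTime($entry/@last-modified) descending
return string($entry)
```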
Signature | archive:options(
$archive as (xs:string|xs:base64Binary|xs:hexBinary)
) as map(*) |
---|
Summary | Returns the options of the specified $archive (supplied as binary or URI/file path) in the format specified by archive:create. |
---|
Errors |
error | Processing failed. |
format | The archive format or the specified option is invalid or not supported. |
|
---|
Examples | {
"format": "zip",
"algorithm": "deflate"
}
Returned for a standard ZIP archive. |
---|
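Since the result uses the option format of archive:create, a round trip can illustrate the relationship between the two functions. This is a sketch only: the entry name and content are made up, and the exact set of keys returned for a GZIP archive is not stated here.

```xquery
(: Sketch: create a GZIP archive with a single entry,
   then read its options back. 'data.txt' is a made-up entry name. :)
let $gzip := archive:create('data.txt', 'some text', map { 'format': 'gzip' })
return archive:options($gzip)
```

The returned map would be expected to report gzip as its format.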
Signature | archive:extract-text(
$archive as (xs:string|xs:base64Binary|xs:hexBinary),
$entries as xs:string* := (),
$encoding as xs:string? := ()
) as xs:string* |
---|
Summary | Extracts entries of the specified $archive (supplied as binary or URI/file path) and returns them as texts. The returned entries can be limited via $entries. The format of the argument is the same as for archive:create (attributes will be ignored). The encoding of the input files can be specified via $encoding. |
---|
Errors |
encode | The specified encoding is invalid or not supported, or the string conversion failed. Invalid XML characters will be ignored if CHECKSTRINGS is turned off. |
error | Processing failed. |
|
---|
Examples | let $archive := file:read-binary("documents.zip")
for $entry in archive:entries($archive)[ends-with(., '.txt')]
return archive:extract-text($archive, $entry)
Extracts all .txt files from an archive. |
---|
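The $entries and $encoding arguments can be combined. A minimal sketch, assuming a hypothetical archive legacy.zip that contains the two named text files in ISO-8859-1 encoding:

```xquery
(: Sketch: extract two specific entries and decode them as ISO-8859-1.
   The archive path and entry names are hypothetical. :)
archive:extract-text(
  'legacy.zip',
  ('notes/a.txt', 'notes/b.txt'),
  'ISO-8859-1'
)
```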
Signature | archive:extract-binary(
$archive as (xs:string|xs:base64Binary|xs:hexBinary),
$entries as xs:string* := ()
) as xs:base64Binary* |
---|
Summary | Extracts entries of the specified $archive (supplied as binary or URI/file path) and returns them as binaries. The returned entries can be limited via $entries. The format of the argument is the same as for archive:create (attributes will be ignored). |
---|
Errors | |
---|
Examples | let $archive := file:read-binary('archive.zip')
let $entries := archive:entries($archive)
let $contents := archive:extract-binary($archive)
return for-each-pair($entries, $contents, fn($entry, $content) {
file:create-dir(replace($entry, "[^/]+$", "")),
file:write-binary($entry, $content)
})
Unzips all files of an archive to the current directory. |
---|
Signature | archive:create(
$entries as item()*,
$contents as item()*,
$options as map(*)? := {}
) as xs:base64Binary |
---|
Summary | Creates a new archive from the specified entries and contents.
The $entries argument contains metadata. Its items may be of type xs:string, representing the name of the file, or element(archive:entry), containing the name as string value and additional, optional attributes:
- last-modified: timestamp, specified as xs:dateTime (default: current time)
- compression-level: 0–9, 0 = uncompressed (default: 8)
- encoding: for textual entries (default: UTF-8)
An entry may look as follows:
<archive:entry last-modified='2011-11-11T11:11:11' compression-level='8' encoding='US-ASCII'>
hello.txt
</archive:entry>
The $contents must have one of the following types:
- Items of type xs:string are treated as text.
- Items of type xs:base64Binary or xs:hexBinary are treated as binaries.
- In the case of updates (see below), an empty array indicates that an entry is to be deleted.
The following $options are available:
option | default | description |
---|
format | zip | Allowed values are zip and gzip. |
algorithm | deflate | Allowed values are deflate and stored (for the zip format). |
|
---|
Errors |
descriptor | Entry descriptors contain invalid entry names, timestamps or compression levels. |
encode | The specified encoding is invalid or not supported, or the string conversion failed. Invalid XML characters will be ignored if CHECKSTRINGS is turned off. |
error | Processing failed. |
format | The archive format or the specified option is invalid or not supported. |
number | The number of specified entries and contents differs. |
single | The chosen archive format only allows single entries. |
|
---|
Examples | archive:create(<archive:entry>file.txt</archive:entry>, 'Hello World')
Creates an archive with one file file.txt.
let $path := 'audio/'
let $files := file:list($path, true(), '*.mp3')
let $zip := archive:create($files, $files ! file:read-binary($path || .))
return file:write-binary('mp3.zip', $zip)
Creates an archive mp3.zip, which contains all MP3 files of a local directory. |
---|
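Entry descriptors, plain names and options can be mixed in a single call. The following sketch combines one archive:entry element carrying explicit metadata with one plain string name. All file names and contents are invented for illustration, and the options map merely restates the documented defaults.

```xquery
(: Sketch: one text entry with explicit metadata, one binary entry.
   Names and contents are made up; the options repeat the defaults. :)
let $entries := (
  <archive:entry last-modified='2024-01-01T00:00:00'
                 compression-level='9'
                 encoding='UTF-8'>notes/readme.txt</archive:entry>,
  'data/raw.bin'
)
let $contents := (
  'Plain text content',
  xs:base64Binary('AQIDBA==')
)
let $zip := archive:create(
  $entries,
  $contents,
  map { 'format': 'zip', 'algorithm': 'deflate' }
)
(: list the names of the entries in the new archive :)
return archive:entries($zip) ! string()
```

The number of items in $entries and $contents must match; otherwise, the number error is raised.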
Signature | archive:update(
$archive as (xs:string|xs:base64Binary|xs:hexBinary),
$entries as item()*,
$contents as item()*
) as xs:base64Binary |
---|
Summary | Creates an updated version of the specified $archive (supplied as binary or URI/file path) with new or replaced entries. The format of $entries and $contents is the same as for archive:create. |
---|
Errors |
descriptor | Entry descriptors contain invalid entry names, timestamps or compression levels. |
encode | The specified encoding is invalid or not supported, or the string conversion failed. Invalid XML characters will be ignored if CHECKSTRINGS is turned off. |
error | Processing failed. |
modify | The entries of the given archive cannot be modified. |
number | The number of specified entries and contents differs. |
|
---|
Examples | let $archive := archive:update('source.zip', 'delete-me.txt', [ ])
return file:write-binary('target.zip', $archive)
Removes a file from an archive.
declare variable $input := "HelloWorld.docx";
declare variable $output := "HelloUniverse.docx";
declare variable $doc := "word/document.xml";
let $xml := parse-xml(archive:extract-text($input, $doc))
let $updated := $xml update {
replace value of node .//*[text() = "HELLO WORLD!"] with "HELLO UNIVERSE!"
}
let $archive := archive:update($input, $doc, serialize($updated))
return file:write-binary($output, $archive)
Replaces texts in a Word document. |
---|
Signature | archive:delete(
$archive as (xs:string|xs:base64Binary|xs:hexBinary),
$entries as xs:string*
) as xs:base64Binary |
---|
Summary | Deletes entries from an $archive (supplied as binary or URI/file path). The format of $entries is the same as for archive:create. |
---|
Errors |
error | Processing failed. |
modify | The entries of the given archive cannot be modified. |
|
---|
Examples | let $zip := file:read-binary('old.zip')
let $entries := archive:entries($zip)[matches(., '\.x?html?$', 'i')]
return file:write-binary('new.zip', archive:delete($zip, $entries))
Deletes all HTML files in an archive and creates a new file. |
---|
Signature | archive:create-from(
$path as xs:string,
$options as map(*)? := {},
$entries as item()* := ()
) as xs:base64Binary |
---|
Summary | This convenience function creates an archive from all files in the specified directory $path. The $options parameter contains archiving options, and the files to be archived can be limited via $entries. The format of the last two arguments is identical to archive:create, with two additional options:
- recursive: parse all files recursively (default: true; ignored if entries are specified via the last argument).
- root-dir: use name of supplied directory as archive root directory (default: false).
|
---|
Errors | |
---|
Examples | let $zip := archive:create-from('/home/user/')
return file:write-binary('archive.zip', $zip)
Writes the files of a user’s home directory to archive.zip. |
---|
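The two additional options can be passed in the options map. A minimal sketch, assuming a hypothetical local directory docs/: only the files directly inside the directory are archived, and the directory name is used as the archive root.

```xquery
(: Sketch: archive the top-level files of a directory only,
   with the directory name as archive root.
   'docs/' and 'docs.zip' are hypothetical paths. :)
let $zip := archive:create-from(
  'docs/',
  map { 'recursive': false(), 'root-dir': true() }
)
return file:write-binary('docs.zip', $zip)
```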
Signature | archive:extract-to(
$path as xs:string,
$archive as (xs:string|xs:base64Binary|xs:hexBinary),
$entries as xs:string* := ()
) as empty-sequence() |
---|
Summary | This convenience function writes files of an $archive (supplied as binary or URI/file path) to the specified directory $path. The archive entries to be written can be restricted via $entries. The format of the argument is the same as for archive:create (attributes will be ignored). |
---|
Errors | |
---|
Examples | archive:extract-to('.', 'archive.zip')
Unzips all files of an archive to the current directory. |
---|
Signature | archive:write(
$path as xs:string,
$entries as item()*,
$contents as item()*,
$options as map(*)? := {}
) as empty-sequence() |
---|
Summary | This convenience function creates a new archive from the specified $entries and $contents and writes it to $path. See archive:create for more details. |
---|
Errors |
descriptor | Entry descriptors contain invalid entry names, timestamps or compression levels. |
encode | The specified encoding is invalid or not supported, or the string conversion failed. Invalid XML characters will be ignored if CHECKSTRINGS is turned off. |
error | Processing failed. |
format | The archive format or the specified option is invalid or not supported. |
number | The number of specified entries and contents differs. |
single | The chosen archive format only allows single entries. |
|
---|
Examples | let $files := file:children('music')[ends-with(., 'mp3')]
return archive:write(
'music.zip',
('info.txt', $files ! file:name(.)),
('Archive with MP3 files', $files ! file:read-binary(.))
)
All MP3 files from a directory are zipped and written to a file, along with an info file. |
---|
Added: New function for updating existing archives.
Signature | archive:refresh(
$path as xs:string,
$entries as item()*,
$contents as item()*
) as empty-sequence() |
---|
Summary | This convenience function updates a local ZIP archive located at $path with new or replaced entries:
- The format of $entries and $contents is the same as for archive:create.
- If the path points to a remote resource or a GZIP archive, it is rejected.
- Custom compression levels are ignored.
|
---|
Errors |
descriptor | Entry descriptors contain invalid entry names, timestamps or compression levels. |
encode | The specified encoding is invalid or not supported, or the string conversion failed. Invalid XML characters will be ignored if CHECKSTRINGS is turned off. |
error | Processing failed. |
number | The number of specified entries and contents differs. |
zip | The input is expected to be a local ZIP archive. |
|
---|
Examples | archive:refresh(
'archive.zip',
('readme.txt', 'changelog.txt'),
('Read this info carefully: ...', 'These are the latest changes: ...')
)
The two files readme.txt and changelog.txt are added to (or updated in) a local ZIP file.
archive:refresh('archive.zip', 'delete-me.txt', [])
The file delete-me.txt is removed from a local ZIP file. |
---|
Code | Description |
---|
descriptor | Entry descriptors contain invalid entry names, timestamps or compression levels. |
encode | The specified encoding is invalid or not supported, or the string conversion failed. Invalid XML characters will be ignored if CHECKSTRINGS is turned off. |
error | Processing failed. |
format | The archive format or the specified option is invalid or not supported. |
modify | The entries of the given archive cannot be modified. |
number | The number of specified entries and contents differs. |
single | The chosen archive format only allows single entries. |
zip | The input is expected to be a local ZIP archive. |
Version 11.0
- Added: archive:refresh: New function for updating existing archives.
- Updated: Input archives can be addressed via file paths and URIs.
- Updated: Archive entries can be deleted by specifying empty arrays as contents.
Version 9.6
Version 9.0
- Updated: archive:create-from: options added
- Updated: error codes updated; errors now use the module namespace
Version 8.5
Version 8.3
Version 7.3