Armarius

About

Armarius is a JavaScript library to read, write, and merge ZIP archives in web browsers.

This library mainly focuses on a low memory footprint, especially when reading archives with tens of thousands of entries, and the ability to merge archives without decompressing and recompressing all entries.

For deflate/inflate support, this library uses the Compression Streams API (if available) or fflate. In Node.js environments, the built-in zlib module is used.
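
The backend choice described above can be pictured with a small feature-detection sketch. The function name `pickCompressionBackend` and the returned labels are hypothetical, and the order of the checks is an assumption; this is not Armarius's actual detection code, only one way such a check could look.

```javascript
// Hypothetical helper, not part of Armarius. It only illustrates the
// preference order described above.
function pickCompressionBackend(env = globalThis) {
    // Node.js exposes process.versions.node; there the built-in zlib module is used.
    if (env.process?.versions?.node) {
        return 'node-zlib';
    }
    // Browsers with the Compression Streams API expose both constructors.
    if (typeof env.CompressionStream === 'function' &&
        typeof env.DecompressionStream === 'function') {
        return 'compression-streams';
    }
    // Otherwise fall back to a pure JavaScript implementation such as fflate.
    return 'fflate';
}

console.log(pickCompressionBackend());
```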

Installation

Armarius can be installed using npm:

npm install armarius

It can then be loaded as an ES Module:

import * as armarius from 'armarius';

IO operations and compression are not part of the library itself, and are packaged separately as armarius-io:

import * as io from 'armarius-io';

For use in web browsers, this library can be bundled using esbuild. Other bundlers like webpack should work as well, but are not officially supported.

Usage

Reading a ZIP archive

To read an archive, an IO context is required. The armarius-io library provides IO implementations for Blob, ArrayBuffer, and Node.js FileHandle objects. Other IO contexts can be implemented by extending the IO class.

let fileInput = document.getElementById('file-input');
let reader = new io.BlobIO(fileInput.files[0]);

A ReadArchive can then be created from an IO context.

let archive = new armarius.ReadArchive(reader, options);
await archive.init();

The ReadArchive constructor optionally accepts a ReadArchiveOptions object with the following properties:

Name	Type	Description
`centralDirectoryBufferSize`	number	Buffer size used when reading central directory contents. Larger buffer sizes may improve performance, but also increase RAM usage.
`createEntryIndex`	boolean	Whether an index of all central directory entries should be created the first time they are read. Massively increases performance when using `findEntry` multiple times.
`entryOptions`	EntryOptions	Options passed to each created Entry object.
`ignoreMultiDiskErrors`	boolean	Ignore information about multiple disks instead of throwing an error when encountering a multi-disk archive.
`allowTruncatedCentralDirectory`	boolean	Do not throw an error if the central directory does not contain the expected number of entries
`allowAdditionalCentralDirectoryEntries`	boolean	Continue reading central directory entries even after the expected number of entries was reached

EntryOptions can have the following properties:

Name	Type	Description
`dataProcessors`	Map<number, typeof DataProcessor>	Map of compressionMethod => DataProcessor. Can be used to implement custom compression methods.
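
As a rough illustration of how these options fit together, the sketch below builds a ReadArchiveOptions object. The helper name `buildReadArchiveOptions`, its `lenient` flag, and the buffer size are assumptions made for this example; only the property names come from the table above.

```javascript
// Hypothetical helper: the property names are taken from the table above,
// the values are illustrative and not library defaults.
function buildReadArchiveOptions({lenient = false} = {}) {
    return {
        // 1 MiB buffer: an arbitrary example value.
        centralDirectoryBufferSize: 1024 * 1024,
        // Build the entry index so repeated findEntry calls are fast.
        createEntryIndex: true,
        // Tolerate malformed archives only when explicitly requested.
        ignoreMultiDiskErrors: lenient,
        allowTruncatedCentralDirectory: lenient,
        allowAdditionalCentralDirectoryEntries: lenient
    };
}

console.log(buildReadArchiveOptions({lenient: true}));

// Intended use, following the constructor call shown earlier:
// let archive = new armarius.ReadArchive(reader, buildReadArchiveOptions());
```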

Reading all archive entries

let entries = await archive.getAllEntries();

Since this method will load all entries (not including their compressed data) into memory, it is not recommended when working with large archives.

Iterating over archive entries

let iterator = await archive.getEntryIterator();

let entry;
while (entry = await iterator.next()) {
    console.log(await entry.getFileNameString());
}
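
Because the iterator yields one entry at a time, it is easy to stop early. The sketch below collects at most `limit` file names; `collectFileNames` is a hypothetical helper that relies only on the `getEntryIterator`, `next`, and `getFileNameString` calls shown above.

```javascript
// Hypothetical helper: collects up to `limit` file names without loading
// every entry of the archive into memory at once.
async function collectFileNames(archive, limit = 100) {
    const names = [];
    const iterator = await archive.getEntryIterator();
    let entry;
    // next() resolves to a falsy value once all entries were returned.
    while (names.length < limit && (entry = await iterator.next())) {
        names.push(await entry.getFileNameString());
    }
    return names;
}

// Intended use:
// let firstTen = await collectFileNames(archive, 10);
```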

Finding specific entries

let entry = await archive.findEntry('some/file.txt');
console.log(await entry.getFileNameString());

In most cases, this method is faster than iterating through all archive entries multiple times, since an internal index is used to find files quickly.

Reading entry data

Reading a full entry

let entry = await archive.findEntry('example.txt');
let data = await entry.getData();

// Decode UTF-8
let decoder = new TextDecoder();
let text = decoder.decode(data);

console.log(text);
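
The lookup and decoding steps above can be wrapped in one small function. `readEntryAsText` is a hypothetical helper, and the sketch assumes that `findEntry` resolves to a falsy value when no entry matches, which the text above does not state.

```javascript
// Hypothetical helper combining findEntry, getData and TextDecoder.
// Assumption: findEntry resolves to a falsy value for unknown file names.
async function readEntryAsText(archive, fileName, encoding = 'utf-8') {
    const entry = await archive.findEntry(fileName);
    if (!entry) {
        return null;
    }
    const data = await entry.getData();
    return new TextDecoder(encoding).decode(data);
}

// Intended use:
// let text = await readEntryAsText(archive, 'example.txt');
```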

Reading entry data in chunks

let entry = await archive.findEntry('example.txt');
let entryReader = await entry.getDataReader();

let chunk;
while (chunk = await entryReader.read(1024 * 64)) {
    console.log(chunk);
}

Note that the length parameter passed to EntryDataReader.read is the length of the compressed data read from the file. Since this data is decompressed, the size of the returned chunk might differ.

Also note that an empty chunk returned from EntryDataReader.read does not necessarily indicate that all data has been read. After all data was read, null will be returned instead.

Both getDataReader and getData optionally accept an EntryDataReaderOptions object with the following properties:

Name	Type	Description
`ignoreInvalidChecksums`	boolean	Do not throw an error if the uncompressed data does not match the checksum
`ignoreInvalidUncompressedSize`	boolean	Do not throw an error if the uncompressed data does not match the expected size
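
The two notes on chunk handling can be made concrete with a loop that ends only on null and skips empty chunks. `readEntryInChunks` is a hypothetical helper, and the sketch assumes that chunks are Uint8Array instances. It joins everything in memory, so it is meant as an illustration rather than a pattern for very large entries.

```javascript
// Hypothetical helper: reads an entry through getDataReader() and joins the
// decompressed chunks. Assumption: chunks are Uint8Array instances.
async function readEntryInChunks(entry, chunkSize = 64 * 1024, options = {}) {
    const reader = await entry.getDataReader(options);
    const chunks = [];
    let total = 0;
    let chunk;
    // Only null marks the end of the data; an empty chunk means "keep reading".
    while ((chunk = await reader.read(chunkSize)) !== null) {
        if (chunk.byteLength === 0) {
            continue;
        }
        chunks.push(chunk);
        total += chunk.byteLength;
    }
    // Copy all chunks into one contiguous buffer.
    const result = new Uint8Array(total);
    let offset = 0;
    for (const part of chunks) {
        result.set(part, offset);
        offset += part.byteLength;
    }
    return result;
}

// Intended use:
// let data = await readEntryInChunks(entry, 64 * 1024, {ignoreInvalidChecksums: true});
```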

Writing archives

New archives can be created using a WriteArchive object. The WriteArchive constructor needs to be passed a function, Iterator, or AsyncIterator that generates new EntrySource objects when needed.

Additionally, a WriteArchiveOptions object can be passed:

Name	Type	Description
`forceZIP64`	boolean	Whether ZIP64 structures should always be created, even if not required by the archive content.

async function *generateNextEntrySource() {
    yield new armarius.DataStreamEntrySource(new io.ArrayBufferIO(new ArrayBuffer(0)), {fileName: 'file.txt'});
    yield new armarius.DataStreamEntrySource(new io.ArrayBufferIO(new ArrayBuffer(0)), {fileName: 'file2.txt'});
    return null;
}

let writeArchive = new armarius.WriteArchive(generateNextEntrySource(), options);

Generating entries

If the value passed to the constructor (referred to here as nextEntryFunction) is an Iterator or AsyncIterator, the WriteArchive will iterate over it to generate new entries.

If it is a function, it will be called whenever a new entry needs to be written to the archive. It should return a new instance of EntrySource, or null if no more entries should be added to the archive.

This simple example will generate an archive that contains 10 text files:

let encoder = new TextEncoder();

function *generateEntrySources() {
    for (let i = 0; i < 10; i++) {
        let fileName = `file-${i}`;
        let fileContent = encoder.encode(`Content of file ${i}`);

        let reader = new io.ArrayBufferIO(fileContent.buffer, fileContent.byteOffset, fileContent.byteLength);
        let entry = new armarius.DataStreamEntrySource(reader, {fileName: fileName});
        yield entry;
    }
}

let writeArchive = new armarius.WriteArchive(generateEntrySources());
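
The function form described above can be sketched in the same spirit. `makeNextEntryFunction` and its `createSource` callback are hypothetical names introduced for this example; the callback is where an actual EntrySource would be constructed.

```javascript
// Hypothetical helper: returns a function that yields one entry source per
// call and null once the list is exhausted.
// `createSource` is a caller-supplied callback that builds the EntrySource.
function makeNextEntryFunction(files, createSource) {
    let index = 0;
    return function nextEntry() {
        if (index >= files.length) {
            return null; // no more entries
        }
        const {name, bytes} = files[index++];
        return createSource(name, bytes);
    };
}

// Intended use with the classes shown above:
// let next = makeNextEntryFunction(files, (name, bytes) =>
//     new armarius.DataStreamEntrySource(
//         new io.ArrayBufferIO(bytes.buffer, bytes.byteOffset, bytes.byteLength),
//         {fileName: name}));
// let writeArchive = new armarius.WriteArchive(next);
```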

Any EntrySource accepts an EntrySourceOptions object with the following properties:

Name	Type	Description
`fileComment`	string	Entry file comment
`fileName`	string	Entry file name
`forceUTF8FileName`	boolean	Always encode the filename and file comment in UTF-8, even if it could be encoded in CP437
`compressionMethod`	number	Compression method that should be used for this entry. By default, this library only supports `0` (Store) and `8` (Deflate). More compression methods can be added using the `dataProcessors` option. When using an ArchiveEntryEntrySource, this option will be ignored and the compression method of the original entry is used.
`forceZIP64`	boolean	Whether ZIP64 structures should always be created, even if not required by the content.
`minMadeByVersion`	number	The minimum `madeByVersion` value to be used for this entry. If a higher version is required (e.g. because ZIP64 is used), it will be set automatically and this option will be ignored.
`minExtractionVersion`	number	The minimum `extractionVersion` value to be used for this entry. If a higher version is required (e.g. because ZIP64 is used), it will be set automatically and this option will be ignored.
`modTime`	Date	Last modified time of the entry
`acTime`	Date	Last access time of the entry. This option is ignored if `extendedTimeStampField` is `false`.
`crTime`	Date	File creation time of the entry. This option is ignored if `extendedTimeStampField` is `false`.
`unicodeFileNameField`	boolean	Whether a Unicode Path Extra Field should be added
`unicodeCommentField`	boolean	Whether a Unicode Comment Extra Field should be added
`extendedTimeStampField`	boolean	Whether an Extended Timestamp Extra Field should be added
`internalFileAttributes`	number	See https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT
`externalFileAttributes`	number	See https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT
`dataProcessors`	Map<number, typeof DataProcessor>	Map of compressionMethod => DataProcessor. Can be used to implement custom compression methods.
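
To show how several of these options interact, the sketch below builds an EntrySourceOptions object with timestamps. `buildEntrySourceOptions` is a hypothetical helper and the chosen values are examples; the property names and the dependency of `acTime` and `crTime` on `extendedTimeStampField` come from the table above.

```javascript
// Hypothetical helper: property names are taken from the table above,
// the values are illustrative.
function buildEntrySourceOptions(fileName, modTime = new Date()) {
    return {
        fileName,
        compressionMethod: 8,         // 8 = Deflate, 0 = Store
        forceUTF8FileName: true,      // always encode name and comment as UTF-8
        modTime,
        // acTime and crTime are ignored unless this field is enabled.
        extendedTimeStampField: true,
        acTime: modTime,
        crTime: modTime
    };
}

console.log(buildEntrySourceOptions('notes/readme.txt'));

// Intended use:
// new armarius.DataStreamEntrySource(reader, buildEntrySourceOptions('notes/readme.txt'));
```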

Reading output chunks

The generated archive can be read using the getNextChunk function.

let chunk;
while (chunk = await writeArchive.getNextChunk()) {
    console.log('New archive chunk:', chunk);
}
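
In a browser, the chunks are often collected into a Blob so the archive can be offered as a download. `archiveToBlob` is a hypothetical helper, and the sketch assumes that each chunk is a Uint8Array or another value the Blob constructor accepts. It keeps the whole archive in memory, which may not be suitable for very large outputs.

```javascript
// Hypothetical helper: drains getNextChunk() into a single Blob.
// Assumption: each chunk is a Uint8Array (or otherwise valid Blob part).
async function archiveToBlob(writeArchive) {
    const parts = [];
    let chunk;
    // getNextChunk() resolves to a falsy value once the archive is complete.
    while ((chunk = await writeArchive.getNextChunk())) {
        parts.push(chunk);
    }
    return new Blob(parts, {type: 'application/zip'});
}

// Intended use in a browser:
// let url = URL.createObjectURL(await archiveToBlob(writeArchive));
```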

Merging ZIP archives

Armarius supports merging ZIP archives without decompressing and recompressing individual entries.

let archives = [myReadArchive1, myReadArchive2];

let merger = new armarius.ArchiveMerger(archives, options);
let outputWriteArchive = merger.getOutputArchive();

let chunk;
while (chunk = await outputWriteArchive.getNextChunk()) {
    console.log('New archive chunk:', chunk);
}

The ArchiveMerger constructor accepts a list of ReadArchive or MergeSource objects and a MergeOptions object with the following properties:

Name	Type	Description
`entrySourceOptions`	EntrySourceOptions	Options passed to each created EntrySource object
`writeArchiveOptions`	WriteArchiveOptions	Options passed to the output WriteArchive
`nextPrependingEntryFunction`	Function	Function generating EntrySource objects that are added to the output archive before the contents of the input archives

MergeSource objects

A MergeSource object allows greater control over how a source archive is merged into the destination archive.

let mergeSource = new armarius.MergeSource(readArchive);
mergeSource
    .setBasePath('base/path/within/the/source/archive')
    .setDestinationPath('path/within/the/destination/archive')
    .setFilter((entry) => {
        if (entry.getFileNameString().endsWith('.rar')) {
            return false; //Filter entry
        } else {
            return true; //Allow entry
        }
    });
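
Filter callbacks like the one above can be generated from a list of extensions. `excludeExtensions` is a hypothetical helper; it assumes, as the example above does, that `getFileNameString` returns a string synchronously inside the filter.

```javascript
// Hypothetical helper: builds a filter callback that rejects entries whose
// names end with one of the given extensions (case-insensitive).
function excludeExtensions(extensions) {
    const lowered = extensions.map((ext) => ext.toLowerCase());
    return (entry) => {
        const name = entry.getFileNameString().toLowerCase();
        // Returning false filters the entry out, true keeps it.
        return !lowered.some((ext) => name.endsWith(ext));
    };
}

// Intended use:
// mergeSource.setFilter(excludeExtensions(['.rar', '.7z']));
```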

Node.js

While mainly intended for use in web browsers, this library can also be used in Node.js.

To read data from files, a NodeFileIO object can be used:

import * as fs from 'node:fs';

let file = await fs.promises.open('path/to/file.zip', 'r');
let stat = await file.stat();
let reader = new io.NodeFileIO(file, 0, stat.size);

Armarius will automatically recognize that it is running in a Node.js environment and use the appropriate compression implementation based on the Node.js built-in zlib module.
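
Output chunks can be written to disk with the same kind of FileHandle. The sketch below uses a hypothetical helper, `writeArchiveToFile`, and assumes that each chunk is a Uint8Array. It loads `node:fs/promises` through a dynamic import so the snippet works in both ES modules and CommonJS.

```javascript
// Hypothetical helper: streams getNextChunk() output into a file.
// Assumption: each chunk is a Uint8Array, which FileHandle.write accepts.
async function writeArchiveToFile(writeArchive, path) {
    const {open} = await import('node:fs/promises');
    const file = await open(path, 'w');
    let bytesWritten = 0;
    try {
        let chunk;
        while ((chunk = await writeArchive.getNextChunk())) {
            await file.write(chunk);
            bytesWritten += chunk.byteLength;
        }
    } finally {
        // Always release the file handle, even if writing fails.
        await file.close();
    }
    return bytesWritten;
}

// Intended use:
// let size = await writeArchiveToFile(writeArchive, 'path/to/output.zip');
```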

License

Armarius is open source software released under the MIT license; see the LICENSE file.

Contributing

You can contribute to this project by forking the repository, adding your changes to your fork, and creating a pull request.
