Generates a file index with raw offsets of a tarball, using the same format as the Emscripten file packager. This metadata can be used to mount the tar blob in Emscripten's WORKERFS virtual filesystem without extracting it.
For a longer intro see this blog post: Mounting tar archives as a filesystem in WebAssembly
npm install tar-vfs-index
npx tar-vfs-index archive.tar.gz
npx tar-vfs-index archive.tar.zst [output.json]
npx tar-vfs-index --append archive.tar.gz
If no input file is given, stdin is used:
curl -sSL https://cran.r-project.org/src/contrib/Archive/jose/jose_1.0.tar.gz | npx tar-vfs-index
Output is written to stdout, or to a file if two arguments are given:
{
"files": [
{ "filename": "mypackage/DESCRIPTION", "start": 512, "end": 548 },
{ "filename": "mypackage/R/code.R", "start": 1536, "end": 1563 }
],
"remote_package_size": 3072
}
import tarindex from 'tar-vfs-index';
import { createReadStream } from 'node:fs';
const result = await tarindex(createReadStream('archive.tar.gz'));
console.log(result.files);
// [
// { filename: 'mypackage/DESCRIPTION', start: 512, end: 548 },
// { filename: 'mypackage/R/code.R', start: 1536, end: 1563 },
// ]
console.log(result.remote_package_size); // total bytes consumed
The start and end values are byte offsets within the decompressed tar stream.
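To make the meaning of the offsets concrete, here is a minimal sketch of reading one file's bytes out of already-decompressed tar data. The helper `readEntry` is hypothetical and not part of tar-vfs-index, and the sketch assumes that `end` is exclusive, which is consistent with the sample index but is not stated explicitly here. The buffer is a synthetic stand-in for a real tarball.

```javascript
// Hypothetical helper (not part of tar-vfs-index): return the bytes of one
// indexed file from the *decompressed* tar data.
// Assumption: `end` is exclusive, so the file length is end - start.
function readEntry(tarBytes, entry) {
  const { filename, start, end } = entry;
  const valid =
    Number.isInteger(start) &&
    Number.isInteger(end) &&
    start >= 0 &&
    end >= start &&
    end <= tarBytes.length;
  if (!valid) {
    throw new RangeError(`Index entry for ${filename} does not fit the archive`);
  }
  // subarray returns a view over the same memory, so nothing is copied.
  return tarBytes.subarray(start, end);
}

// Synthetic stand-in for a decompressed tarball: 3072 zero bytes with some
// file content placed at offset 512, where the first file's data would
// begin after its 512-byte tar header.
const tarBytes = new Uint8Array(3072);
const content = new TextEncoder().encode('Package: mypackage');
tarBytes.set(content, 512);

const entry = {
  filename: 'mypackage/DESCRIPTION',
  start: 512,
  end: 512 + content.length,
};
console.log(new TextDecoder().decode(readEntry(tarBytes, entry)));
```

Applying the same offsets to the compressed .tar.gz bytes would return unrelated data, which is why the archive has to be decompressed before the index is used.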
Emscripten's WORKERFS filesystem lets you mount a vfs image inside a web worker, giving compiled C/C++ code read-only access to its files without copying. Mounting an image requires a metadata JSON object (normally produced by file_packager --separate-metadata) alongside a Blob of the raw archive data.
tar-vfs-index generates this metadata object for a tar archive. If the tar file is gzipped (.tar.gz), decompress it first, for example with the browser-native DecompressionStream, so that the blob holds the uncompressed tarball:
const [metaRes, dataRes] = await Promise.all([
fetch('archive.tar.gz.json'), // output of tar-vfs-index
fetch('archive.tar.gz'),
]);
const metadata = await metaRes.json();
// WORKERFS slices the blob using the offsets in metadata, which refer to
// positions in the decompressed tar stream, so decompress before mounting.
const blob = await new Response(
dataRes.body.pipeThrough(new DecompressionStream('gzip'))
).blob();
FS.mkdir('/pkg');
FS.mount(WORKERFS, { packages: [{ metadata, blob }] }, '/pkg');
The --append flag embeds the index directly into the archive as a .vfs-index.json entry, followed by a 16-byte lookup hint. This produces a self-contained .tar.gz that can be mounted by webR without a separate metadata file (as described in tar-metadata):
npx tar-vfs-index --append archive.tar.gz # modifies the file in-place