5 releases (breaking)
Uses new Rust 2024
| Version | Released |
|---|---|
| 0.10.0 | Dec 20, 2025 |
| 0.4.1 | Dec 3, 2025 |
| 0.3.1 | Dec 3, 2025 |
| 0.2.0 | Dec 1, 2025 |
| 0.1.0 | Dec 1, 2025 |
#572 in Math
165KB, 3.5K SLoC
JAVELIN
vector space embeddings visualisation
Lance inspector and TUI for Arrow/Lance datasets.
Javelin is a Rust-based command-line and TUI tool for inspecting datasets stored in the Lance format (and compatible Parquet exports). It focuses on fast, ergonomic exploration of embedding-like matrices, sparse COO data, and 1D vectors using an interactive terminal UI built on ratatui and crossterm. Planned next: graph visualisations. If you find Javelin useful, please star ⭐ the repo.
Download one of the pre-built packages in Releases.
Compile with cargo:
cargo install javelin-tui
Usage
Quickstart
# generate a toy dataset
javelin generate
# load the dataset in the tui
javelin --filepath ./javelin_test
# Select one of the supported file types
- Dense
- Sparse (COO, like adjacency matrices)
- 1D vectors (e.g. norms)
# Explore the different modes
[Head]
[Sample]
[Display]
[T] Transpose
[q] Exit
Basic CLI
# Show the first 20 rows
javelin --filepath /path/to/dataset.lance head --n 20
# Randomly sample 50 rows, preserving original indices
javelin --filepath /path/to/dataset.lance sample --n 50
# Open full dataset in TUI viewer
javelin --filepath /path/to/dataset.lance display
TUI launcher (default)
# Launcher for a directory of Lance datasets
javelin --filepath /path/to/dir
If you omit the subcommand:
- When filepath is a directory, the launcher scans it for .lance files.
- When filepath is a file, you can still use the launcher to choose a command.
Launcher key bindings
- Up / Down or k / j: Move selection between files.
- Left / Right or h / l: Cycle between commands (Head, Sample, Display, …).
- Enter: Run the selected command on the selected file.
- q / Esc: Exit the launcher.
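The bindings above can be sketched as a small state machine. This is only an illustration, not Javelin's actual code: the names `Command`, `Action`, `Launcher`, and `action_for` are hypothetical, keys are spelled as strings to avoid depending on crossterm, and it assumes that file selection stops at the ends of the list while command selection wraps around.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Command { Head, Sample, Display }

const COMMANDS: [Command; 3] = [Command::Head, Command::Sample, Command::Display];

#[derive(Debug, Clone, Copy, PartialEq)]
enum Action { Up, Down, Left, Right, Run, Quit }

/// Map a key name to a launcher action (hypothetical key spelling).
fn action_for(key: &str) -> Option<Action> {
    match key {
        "Up" | "k" => Some(Action::Up),
        "Down" | "j" => Some(Action::Down),
        "Left" | "h" => Some(Action::Left),
        "Right" | "l" => Some(Action::Right),
        "Enter" => Some(Action::Run),
        "q" | "Esc" => Some(Action::Quit),
        _ => None,
    }
}

struct Launcher {
    files: Vec<String>,
    file_idx: usize,
    cmd_idx: usize,
}

impl Launcher {
    fn new(files: Vec<String>) -> Self {
        Launcher { files, file_idx: 0, cmd_idx: 0 }
    }

    /// Apply one action. Returns the (file, command) pair to run on Enter,
    /// and None otherwise; the caller decides what to do on Quit.
    fn apply(&mut self, action: Action) -> Option<(String, Command)> {
        match action {
            Action::Up => {
                self.file_idx = self.file_idx.saturating_sub(1);
            }
            Action::Down => {
                if self.file_idx + 1 < self.files.len() {
                    self.file_idx += 1;
                }
            }
            Action::Left => {
                self.cmd_idx = (self.cmd_idx + COMMANDS.len() - 1) % COMMANDS.len();
            }
            Action::Right => {
                self.cmd_idx = (self.cmd_idx + 1) % COMMANDS.len();
            }
            Action::Run => {
                return self
                    .files
                    .get(self.file_idx)
                    .map(|f| (f.clone(), COMMANDS[self.cmd_idx]));
            }
            Action::Quit => {}
        }
        None
    }
}
```

Keeping the selection state separate from rendering makes the key handling easy to test without a terminal.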
Features
Single-entry TUI launcher
javelin --filepath /path/to/dir starts a launcher that:
- Scans a directory for .lance datasets.
- Lets you select a file with Up/Down keys.
- Lets you select a command (Head, Sample, Display, etc.) with Left/Right keys.
- Launches the corresponding interactive viewer for the chosen file.
If no subcommand is provided, the default is the TUI launcher.
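The directory scan can be approximated with the standard library alone. A minimal sketch, assuming datasets are recognised purely by a `.lance` extension; the function name `find_lance_datasets` is hypothetical and the real launcher may apply further checks.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Collect entries directly inside `dir` whose name ends in `.lance`, sorted by path.
/// Lance datasets are typically stored as directories, so files and directories
/// are both accepted here.
fn find_lance_datasets(dir: &Path) -> io::Result<Vec<PathBuf>> {
    let mut found = Vec::new();
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.extension().and_then(|e| e.to_str()) == Some("lance") {
            found.push(path);
        }
    }
    found.sort();
    Ok(found)
}
```

The scan is not recursive; nested experiment folders would need a walk over subdirectories.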
Dense matrix viewer
- Detects Lance “vector” layout and generic dense col_* layouts.
- Shows a scrollable table with:
  - A row index column.
  - Feature columns from col_*.
  - Per-row mean and standard deviation (for multi-column dense layouts).
- Supports:
  - Horizontal scrolling over features.
  - Vertical scrolling over rows.
  - A transposed view (features × samples) toggled via a key.
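The per-row mean and standard deviation columns amount to a calculation like the one below. This is a sketch only: the function name `row_stats` is hypothetical, and it assumes the population standard deviation (dividing by n), which the README does not specify.

```rust
/// Mean and standard deviation of one row of feature values.
/// Assumption: population standard deviation (divide by n, not n - 1).
fn row_stats(row: &[f64]) -> Option<(f64, f64)> {
    if row.is_empty() {
        return None;
    }
    let n = row.len() as f64;
    let mean = row.iter().sum::<f64>() / n;
    let variance = row.iter().map(|x| (x - mean).powi(2)).sum::<f64>() / n;
    Some((mean, variance.sqrt()))
}
```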
1D vector viewer
- Specialized UI for LanceLayout::Vector1D data (e.g. eigenvalues, norms).
- No avg/std columns; values are displayed with 12 decimal digits.
- Same navigation shortcuts as the dense viewer.
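Printing with 12 decimal digits maps directly onto Rust's format precision. A small sketch, with a hypothetical helper name `format_cell` and an assumed right-aligned fixed column width:

```rust
/// Format a value with 12 digits after the decimal point, right-aligned in `width` characters.
fn format_cell(value: f64, width: usize) -> String {
    format!("{:>width$.12}", value, width = width)
}
```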
Sparse COO viewer
- Expects COO data in row, col, value schema.
- Layout:
  - Top: matrix metadata and density.
  - Middle:
    - Triples table with vertical scrolling over (row, col, value) entries.
    - ASCII sparsity map (downsampled for large matrices) that highlights nonzeros.
  - Bottom: diagonals and connectivity summaries (e.g., most connected rows).
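One way to build a downsampled sparsity map is to bucket each nonzero into a coarse grid. The sketch below is an assumption about the approach rather than Javelin's implementation; the function name `sparsity_map` and the `#` / `.` glyphs are illustrative.

```rust
/// Render an ASCII sparsity map of at most `max_h` x `max_w` characters.
/// Each cell covers a block of the matrix and shows '#' if the block holds a nonzero.
fn sparsity_map(
    rows: usize,
    cols: usize,
    entries: &[(u32, u32, f64)],
    max_h: usize,
    max_w: usize,
) -> Vec<String> {
    if rows == 0 || cols == 0 || max_h == 0 || max_w == 0 {
        return Vec::new();
    }
    let h = rows.min(max_h);
    let w = cols.min(max_w);
    let mut grid = vec![vec![b'.'; w]; h];
    for &(r, c, v) in entries {
        let (r, c) = (r as usize, c as usize);
        // Skip explicit zeros and out-of-range coordinates.
        if v == 0.0 || r >= rows || c >= cols {
            continue;
        }
        // Integer scaling maps 0..rows onto 0..h and 0..cols onto 0..w.
        grid[r * h / rows][c * w / cols] = b'#';
    }
    grid.into_iter()
        .map(|line| String::from_utf8(line).expect("grid contains ASCII only"))
        .collect()
}
```

For a 4×4 identity matrix rendered into a 2×2 grid, the result is the two lines `#.` and `.#`.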
Sampling and indexing
- cmd_sample:
  - Randomly selects n distinct row indices.
  - Reads the minimal prefix needed to cover those indices.
  - Uses Arrow take to build a sampled RecordBatch.
  - Adds a row_idx column with the original dataset indices.
  - Opens the sampled batch in the TUI viewer.
- cmd_head:
  - Shows the first n rows in the interactive viewer.
- cmd_stats:
  - Reports dataset row count and schema.
  - Prints per-column structural information.
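The sampling steps can be illustrated without Arrow. In this sketch, plain `Vec<f64>` rows stand in for a RecordBatch, a tiny xorshift generator stands in for a real RNG crate, and all function names are hypothetical; it shows the idea (distinct indices, minimal prefix, gather with original indices), not the actual cmd_sample code.

```rust
/// Tiny xorshift64 generator, used here only to keep the sketch dependency-free.
struct XorShift64(u64);

impl XorShift64 {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Pick `n` distinct indices from `0..total` with a partial Fisher-Yates shuffle.
/// The result is sorted ascending; `n` is capped at `total`.
fn sample_indices(total: usize, n: usize, seed: u64) -> Vec<usize> {
    let n = n.min(total);
    let mut rng = XorShift64(seed.max(1)); // xorshift state must be nonzero
    let mut pool: Vec<usize> = (0..total).collect();
    for i in 0..n {
        // Modulo introduces a slight bias, acceptable for an illustration.
        let j = i + (rng.next_u64() as usize) % (total - i);
        pool.swap(i, j);
    }
    let mut picked = pool[..n].to_vec();
    picked.sort_unstable();
    picked
}

/// Number of leading rows that must be read to cover every sampled index.
fn prefix_len(indices: &[usize]) -> usize {
    indices.iter().max().map_or(0, |m| m + 1)
}

/// Gather sampled rows from the prefix, pairing each with its original index (row_idx).
fn take_rows(prefix: &[Vec<f64>], indices: &[usize]) -> Vec<(usize, Vec<f64>)> {
    indices.iter().map(|&i| (i, prefix[i].clone())).collect()
}
```

Materialising `0..total` is simple but uses memory proportional to the dataset size; for very large datasets a set-based rejection sampler would be a lighter alternative.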
Storage integration
- Uses a LanceStorage backend to:
  - Load dense matrices from Lance vector datasets.
  - Load sparse COO matrices from Lance triplet datasets.
  - Save dense matrices back as Lance vector datasets via save_dense("raw_input").
- Parquet support:
  - Reads Parquet into Arrow RecordBatches.
  - Detects:
    - Lance-like vector format (FixedSizeList<Float64>), or
    - Wide columnar float format with col_* columns.
  - Converts wide columnar Parquet to a dense matrix and saves it as Lance row-major “raw_input” using save_dense.
Build from source
Prerequisites
- Rust (stable toolchain and cargo).
- A terminal that supports ANSI escape codes.
Build from source
git clone https://gitlab.com/yera/javelin.git
cd javelin
cargo build --release
The binary will be available at:
target/release/javelin
Interactive viewers
Dense and 1D viewers
Key bindings:
- Up / Down or k / j: Scroll vertically over rows.
- Left / Right or h / l: Scroll horizontally over feature columns (dense) or vector columns (1D).
- H: Jump to the first visible column.
- E: Jump to the last visible column window.
- t: Toggle transpose (N×F ↔ F×N) in dense layouts.
- q / Esc: Exit the viewer.
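The scrolling behaviour can be modelled as offsets clamped to the table size. The `View` struct below is hypothetical and handles only the letter keys; it assumes that transposing swaps the row and column counts and resets both offsets, which the README does not state.

```rust
/// Scroll state for a table viewer (illustrative names, not Javelin's types).
#[derive(Debug)]
struct View {
    n_rows: usize,
    n_cols: usize,
    visible_cols: usize,
    row: usize, // index of the first visible row
    col: usize, // index of the first visible column
    transposed: bool,
}

impl View {
    fn new(n_rows: usize, n_cols: usize, visible_cols: usize) -> Self {
        View { n_rows, n_cols, visible_cols, row: 0, col: 0, transposed: false }
    }

    /// First column of the last full window of visible columns.
    fn last_window(&self) -> usize {
        self.n_cols.saturating_sub(self.visible_cols)
    }

    /// Handle one key press; returns false when the viewer should exit.
    fn handle(&mut self, key: char) -> bool {
        match key {
            'k' => self.row = self.row.saturating_sub(1),
            'j' => self.row = (self.row + 1).min(self.n_rows.saturating_sub(1)),
            'h' => self.col = self.col.saturating_sub(1),
            'l' => self.col = (self.col + 1).min(self.last_window()),
            'H' => self.col = 0,
            'E' => self.col = self.last_window(),
            't' => {
                // Assumed behaviour: swap dimensions and return to the top-left corner.
                std::mem::swap(&mut self.n_rows, &mut self.n_cols);
                self.transposed = !self.transposed;
                self.row = 0;
                self.col = 0;
            }
            'q' => return false,
            _ => {}
        }
        true
    }
}
```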
Behavior:
- Dense layouts show:
  - Synthetic “Row” index column.
  - col_* features.
  - Per-row avg and std computed over all numeric feature columns.
- 1D layouts show:
  - Row index.
  - One or more value columns with 12 decimal digits and no avg/std.
Sparse COO viewer
Key bindings are the same for scrolling:
- Up / Down or k / j: vertical scroll through triples.
- q / Esc: exit.
Panels:
- Metadata: matrix dimensions and density.
- Triples table: index, row, col, value with vertical scrolling.
- Sparsity map: ASCII grid marking nonzeros.
- Structure summary: main diagonal entries and most-connected rows.
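The metadata and structure panels rest on a few simple aggregates over the triples. A sketch under assumptions: density is taken as nnz / (rows × cols), and "most connected" is read as the rows with the most stored entries. The function names are hypothetical.

```rust
/// Fraction of matrix cells that hold a stored entry.
fn density(rows: usize, cols: usize, nnz: usize) -> f64 {
    if rows == 0 || cols == 0 {
        0.0
    } else {
        nnz as f64 / (rows as f64 * cols as f64)
    }
}

/// The `top` rows with the most stored entries, as (row, count),
/// ordered by count descending and then by row index.
fn most_connected_rows(
    rows: usize,
    triples: &[(u32, u32, f64)],
    top: usize,
) -> Vec<(usize, usize)> {
    let mut counts = vec![0usize; rows];
    for &(r, _, _) in triples {
        if let Some(c) = counts.get_mut(r as usize) {
            *c += 1;
        }
    }
    let mut ranked: Vec<(usize, usize)> = counts
        .into_iter()
        .enumerate()
        .filter(|&(_, c)| c > 0)
        .collect();
    ranked.sort_by(|a, b| b.1.cmp(&a.1).then(a.0.cmp(&b.0)));
    ranked.truncate(top);
    ranked
}

/// Entries on the main diagonal, as (index, value), in input order.
fn main_diagonal(triples: &[(u32, u32, f64)]) -> Vec<(u32, f64)> {
    triples
        .iter()
        .filter(|t| t.0 == t.1)
        .map(|t| (t.0, t.2))
        .collect()
}
```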
Data formats
Dense Lance (vector) format
- Stored as a single FixedSizeList<Float64> column (e.g. vector).
- Reconstructed as a dense matrix in column-major order for computation.
- Displayed in the TUI as:
  - A dense matrix table, or
  - A 1D vector viewer for LanceLayout::Vector1D.
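A FixedSizeList column keeps its values in one flat buffer, row after row, so the column-major reconstruction is an index remapping. The sketch below works on a plain `&[f64]` instead of Arrow arrays, and the function name `to_column_major` is hypothetical.

```rust
/// Convert a flat row-after-row buffer (n rows of `dim` values each)
/// into column-major order. Returns the row count and the new buffer.
fn to_column_major(flat: &[f64], dim: usize) -> Option<(usize, Vec<f64>)> {
    if dim == 0 || flat.len() % dim != 0 {
        return None;
    }
    let n = flat.len() / dim;
    let mut out = vec![0.0; flat.len()];
    for i in 0..n {
        for j in 0..dim {
            // Element (i, j): offset i * dim + j in row-major, j * n + i in column-major.
            out[j * n + i] = flat[i * dim + j];
        }
    }
    Some((n, out))
}
```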
Sparse COO format
- Schema:
row: UInt32
col: UInt32
value: Float64
- Matrix dimensions stored in schema metadata (rows, cols, nnz).
- Reconstructed internally as a CSR matrix when needed.
Parquet import
- Vector-like: FixedSizeList<Float64> column(s).
- Wide columnar: multiple Float64 col_* columns.
- Wide columnar data is:
  - Interpreted as a dense matrix.
  - Saved via save_dense("raw_input") into Lance format.
  - Then handled by the same TUI paths as native Lance data.
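Interpreting wide columnar data as a dense matrix can be sketched as follows. Named `Vec<f64>` columns stand in for Arrow arrays, the function name `wide_to_row_major` is hypothetical, and it assumes the suffix after `col_` is a numeric index that defines the column order.

```rust
/// Build a row-major dense matrix from columns named `col_<k>`.
/// Columns with other names are ignored. Returns (n_rows, n_cols, data),
/// or None if there are no feature columns or their lengths differ.
fn wide_to_row_major(columns: &[(&str, Vec<f64>)]) -> Option<(usize, usize, Vec<f64>)> {
    let mut features: Vec<(usize, &Vec<f64>)> = columns
        .iter()
        .filter_map(|(name, values)| {
            let k = name.strip_prefix("col_")?.parse::<usize>().ok()?;
            Some((k, values))
        })
        .collect();
    features.sort_by_key(|(k, _)| *k);

    let n_rows = features.first()?.1.len();
    if features.iter().any(|(_, v)| v.len() != n_rows) {
        return None;
    }
    let n_cols = features.len();
    let mut data = Vec::with_capacity(n_rows * n_cols);
    for i in 0..n_rows {
        for (_, values) in &features {
            data.push(values[i]);
        }
    }
    Some((n_rows, n_cols, data))
}
```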
Typical workflows
Inspect an embedding matrix
javelin --filepath embeddings.lance display
- Scroll across feature dimensions.
- Toggle transpose to view feature-centric slices.
- Inspect per-row avg and std.
Random sample with original indices
javelin --filepath embeddings.lance sample --n 100
- Examine the row_idx column to see which original rows were sampled.
Explore a directory of experiments
javelin --filepath ./experiments
- Pick a dataset with Up/Down.
- Choose Head, Sample, or Display with Left/Right.
- Press Enter to launch the viewer.
Inspect sparse matrices
javelin --filepath graph.lance display
- Use the COO viewer to inspect structure, sparsity pattern, and connectivity.
License
See the LICENSE file in this repository for licensing details. Respect all third-party library licenses when redistributing binaries or integrating Javelin into other systems.
Dependencies
~176MB
~3M SLoC