174 releases (12 stable)

Uses new Rust 2024

new 6.0.0 May 11, 2026
4.0.1 Apr 24, 2026
4.0.0 Mar 30, 2026
3.0.1 Mar 19, 2026
0.0.1-alpha0 Jul 28, 2022

#9 in Machine learning


214,455 downloads per month
Used in 124 crates (27 directly)

Apache-2.0

13MB
287K SLoC

Rust Implementation of Lance


The Open Lakehouse Format for Multimodal AI

Installation

Add the crate to your project using cargo:

cargo add lance

Examples

Create dataset

Suppose batches is an Arrow Vec<RecordBatch> and schema is Arrow SchemaRef:

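use arrow_array::RecordBatchIterator;
use lance::{dataset::WriteParams, Dataset};

let write_params = WriteParams::default();
// Wrap the batches in a RecordBatchReader that the writer can consume.
let reader = RecordBatchIterator::new(
    batches.into_iter().map(Ok),
    schema
);
// `uri` is the destination path or object-store URI of the dataset.
Dataset::write(reader, &uri, Some(write_params)).await.unwrap();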

Read

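use arrow_array::RecordBatch;
use futures::StreamExt; // provides `map` and `collect` on the scan stream

let dataset = Dataset::open(path).await.unwrap();
let scanner = dataset.scan();
// Scan the whole dataset and gather every batch into memory.
let batches: Vec<RecordBatch> = scanner
    .try_into_stream()
    .await
    .unwrap()
    .map(|b| b.unwrap())
    .collect::<Vec<RecordBatch>>()
    .await;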

Take

let values: Result<RecordBatch> = dataset.take(&[200, 199, 39, 40, 100], &projection).await;
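
To make the access pattern concrete, here is a minimal in-memory sketch of what a take-by-index operation does. It is a toy model, not Lance's implementation: `take_rows` is a hypothetical helper, and it assumes rows come back in the order the indices were requested.

```rust
use std::convert::TryFrom;

/// Toy model of `take` (hypothetical helper, not part of the lance crate):
/// gather rows by index from an in-memory column, in the requested order.
/// Returns `None` if any index is out of bounds.
fn take_rows<T: Clone>(column: &[T], indices: &[u64]) -> Option<Vec<T>> {
    indices
        .iter()
        .map(|&i| {
            // Convert the u64 row index to usize without silent truncation.
            let idx = usize::try_from(i).ok()?;
            column.get(idx).cloned()
        })
        .collect()
}
```

The indices need not be sorted or contiguous, which is the random-access case the real `take` call is meant for.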

Vector index

Assume "embeddings" is a FixedSizeListArray:

use ::lance::index::vector::VectorIndexParams;

// `params` must be mutable so its fields can be set below.
let mut params = VectorIndexParams::default();
params.num_partitions = 256;
params.num_sub_vectors = 16;

// Returns Err if list_size(embeddings) / num_sub_vectors does not meet the SIMD alignment requirement
dataset.create_index(&["embeddings"], IndexType::Vector, None, &params, true).await;
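
The constraint on `num_sub_vectors` can be illustrated with a small standalone sketch of how a vector is split into equal sub-vectors. This is not Lance's code: `split_sub_vectors` and its `alignment` parameter are hypothetical, and the actual alignment Lance requires is not stated here.

```rust
/// Split `vector` into `num_sub_vectors` equal, contiguous chunks.
/// `alignment` is a hypothetical stand-in for the required SIMD multiple:
/// each sub-vector's length must be divisible by it.
fn split_sub_vectors(
    vector: &[f32],
    num_sub_vectors: usize,
    alignment: usize,
) -> Result<Vec<&[f32]>, String> {
    if vector.is_empty() || num_sub_vectors == 0 || alignment == 0 {
        return Err("vector, num_sub_vectors and alignment must be non-empty/non-zero".to_string());
    }
    if vector.len() % num_sub_vectors != 0 {
        return Err(format!(
            "dimension {} is not divisible by num_sub_vectors {}",
            vector.len(),
            num_sub_vectors
        ));
    }
    let sub_dim = vector.len() / num_sub_vectors;
    if sub_dim % alignment != 0 {
        return Err(format!(
            "sub-vector length {} is not a multiple of alignment {}",
            sub_dim, alignment
        ));
    }
    // Every chunk has exactly `sub_dim` elements because the length divides evenly.
    Ok(vector.chunks_exact(sub_dim).collect())
}
```

With a 128-dimensional vector, 16 sub-vectors and an assumed alignment of 8, each sub-vector has 8 elements and the split succeeds. With 32 sub-vectors, each would have 4 elements and the call returns an error.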

What is Lance?

Lance is an open lakehouse format for multimodal AI. It comprises a file format, a table format, and a catalog spec that together let you build a complete lakehouse on top of object storage to power your AI workflows.

The key features of Lance include:

  • Expressive hybrid search: Combine vector similarity search, full-text search (BM25), and SQL analytics on the same dataset with accelerated secondary indices.

  • Lightning-fast random access: 100x faster than Parquet or Iceberg for random access without sacrificing scan performance.

  • Native multimodal data support: Store images, videos, audio, text, and embeddings in a single unified format with efficient blob encoding and lazy loading.

  • Data evolution: Efficiently add columns with backfilled values without full table rewrites, perfect for ML feature engineering.

  • Zero-copy versioning: ACID transactions, time travel, and automatic versioning without needing extra infrastructure.

  • Rich ecosystem integrations: Apache Arrow, Pandas, Polars, DuckDB, Apache Spark, Ray, Trino, Apache Flink, and open catalogs (Apache Polaris, Unity Catalog, Apache Gravitino).
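
As a rough illustration of the versioning bullet above, the sketch below models a table in which each commit records a new manifest that shares existing data by reference instead of copying it. It is a toy model under stated assumptions, not Lance's on-disk design: `VersionedTable`, `Fragment`, `append`, and `checkout` are hypothetical names, and versions are numbered from 1 in this sketch.

```rust
use std::sync::Arc;

/// One immutable chunk of rows (a stand-in for a data fragment).
type Fragment = Arc<Vec<i64>>;

/// Toy table: each commit adds a manifest listing the fragments visible at that version.
#[derive(Default)]
struct VersionedTable {
    manifests: Vec<Vec<Fragment>>,
}

impl VersionedTable {
    /// Append rows as a new fragment and return the new version number.
    /// Earlier fragments are shared via `Arc`, so no row data is copied.
    fn append(&mut self, rows: Vec<i64>) -> usize {
        let mut manifest = self.manifests.last().cloned().unwrap_or_default();
        manifest.push(Arc::new(rows));
        self.manifests.push(manifest);
        self.manifests.len()
    }

    /// Read all rows visible at `version`, or `None` if that version does not exist.
    fn checkout(&self, version: usize) -> Option<Vec<i64>> {
        let manifest = self.manifests.get(version.checked_sub(1)?)?;
        Some(manifest.iter().flat_map(|f| f.iter().copied()).collect())
    }
}
```

Checking out an older version simply reads an older manifest, which is the idea behind time travel without extra infrastructure.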

For more details, see the full Lance format specification.

Dependencies

~100–150MB
~2.5M SLoC