LakeSail

LakeSail · 2026-05-19T13:45:11.453Z

What if Spark was rewritten in Rust? Well, that's Sail. Besides all the advantages you get from a Rust-native engine, the best part is your Spark code still works as is.

Software Development

Spark rewritten in Rust. A new standard for unified compute, reimagined for modern data and AI infrastructure.

About us

LakeSail is a cloud-native platform redefining big data processing for the AI-driven future. Its innovative, unified open-source computation framework, Sail, is built entirely in Rust, runs ~4x faster than Spark, and reduces hardware costs by up to 94%, all while maintaining Spark compatibility. LakeSail's mission is to unify batch processing, stream processing, and compute-intensive AI workloads into a seamless framework engineered for unparalleled scalability and speed at a fraction of the cost.

Website: https://lakesail.com/
Industry: Software Development
Company size: 11-50 employees
Headquarters: San Francisco
Type: Privately Held
Founded: 2023

Locations

Primary

San Francisco, US

C/O Haitham Amin; 50 California Street, Suite 1500
San Francisco, CA 94111, US

Updates

LakeSail reposted this
Alexy Khrabrov
1w
It was a tremendous honor to represent LakeSail for the first time in my official capacity, as the head of community, at the Data for AI meetup today! Thanks to Adi Wabisabi, who organized and moderated, and is Lisa N. Cao 2.0 at Datastrato lol, we had a deep dive into the semantic layer for the Agentic AI Lakehouse of the future. Kudos to my fellow panelists Josh W., Mark Hoerth, and Andrew Madson, and to our fearless CEO Shehab Amin, who took this photo and was there along with Everett Roeth and Delly Tamer on our team and Kranti K. Parisa of LaserData, our blazing-fast, Rust-backed streaming partner!
6 Comments

LakeSail

1,106 followers
1w
Spark accelerators like Photon speed up portions of execution, but are still tied to the JVM. Sail, written in Rust, removes the JVM entirely. The result is drastically better performance with no JVM tuning required. The same familiar Spark API on a Rust-native runtime. That’s Sail.

1 Comment

LakeSail

1w
Why fully rebuild Spark in Rust instead of just accelerating it? Spark accelerators speed up parts of execution, but they still inherit Spark's JVM control plane, memory model, shuffle path, and Python serialization costs. You're still tuning a JVM. With Sail, we took a clean-slate approach: a fully Rust-native runtime, with no JVM, no heap tuning, and no GC pauses. Same Spark interface with an entirely new runtime underneath. Read our full breakdown of how Sail compares to Spark accelerators here → https://lnkd.in/gwyF8Fq2
LakeSail reposted this
Alexy Khrabrov
2w Edited
We wrote a book, with Codex, delving deep into LakeSail architecture. Sail is a modern data ecosystem with deep roots in Apache Arrow and Apache DataFusion, rebuilding the whole Apache Spark ecosystem in Rust. One of the key advantages of an AI-native stack is extensibility. We are convening the community to build Sail extensions. A proposal is on the table in lakehq/sail. To extend an engine as profoundly efficient as Sail, you need to operate at several levels: physical and logical plans, loading and linking, and performing at the top in both single-node and cluster mode. The book is written as an exploration of the codebase, of the overall Sail architecture, its use of Arrow and DataFusion, its implementation of the Spark Connect protocol, and everything else pertinent to the extensions. https://lnkd.in/gcH9puUX You’ll also learn Rust along the way, seeing it in action, doing heavy lifting.

5 Comments

LakeSail reposted this
Adi Wabisabi
3w Edited
I'm excited to announce the panel for our next Data for AI on Jun 3rd at Yes SF! We have four experts in the field who will be speaking on our panel on unifying enterprise-wide data for agents, each covering various dimensions of this challenge at scale.

Josh W., LiveRamp: Josh is a Principal Architect at LiveRamp, where he's at the intersection of AI, data, and MarTech. In case you missed his last Data for AI presentation, he's built a semantic middle layer that allows LiveRamp's agents to understand the context of data from over a dozen systems. They had to balance cost, security, auditability, and many other factors to make this initiative a success. Previously, he held senior roles at Highnote, Coinbase, Twilio, and Salesforce, and he holds a patent on database server access management.

Mark Hoerth, Datastrato: Mark leads product at Datastrato, working on Apache Gravitino and the next chapter of open table formats for AI. He joined from Dremio, where he held product and solution architecture roles spanning Apache Iceberg lakehouse deployments and AI semantic search, and led Dremio's efforts on security, Iceberg, and Apache Polaris. A Stanford alum based in the Bay Area, Mark is a longtime Silicon Valley startup builder.

Andrew Madson, Fivetran: Andrew leads Developer Relations at Fivetran, where he builds programs that help developers and data teams adopt modern data and AI tooling. He's the author of O'Reilly's Apache Polaris: The Definitive Guide, with two more books on the way: AI-Ready Data (Wiley) and Data Transformation (O'Reilly). Andrew previously built DevRel functions at Tobiko Data and Dremio, and he teaches data science and engineering as a graduate professor.

Alexy Khrabrov, LakeSail: Dr. Khrabrov is the Head of Community at LakeSail, building the Spark-compatible AI Lakehouse of tomorrow in Rust. He is also the founder of the Community Research Center for Reliable AI at Northeastern University, and the founder and organizer of AI By the Bay, Bay Area AI, and AI Agent SF, the longest-running, deepest technical OSS AI communities, conferences, and meetups in the San Francisco Bay Area. Previously, Alexy was the Director of Open-Source Science at IBM Research, AI Community Architect at Neo4j, Senior Software Engineer at Amazon, and a co-founder and engineer in several Bay Area startups.

Spaces are still available, but are running out! Sign up today! https://luma.com/8tvd2xla

Thank you to our sponsors, Datastrato, Fivetran & LakeSail, for making this event possible! See you there!

Snowflake Summit Side Event: How Production AI Agents Access Data Across the Entire Enterprise · Luma luma.com

3 Comments

LakeSail reposted this
Alexy Khrabrov
3w Edited
The Apache Iceberg meetup was a blast! Shehab Amin and I gave an overview of LakeSail and Sail's support for Iceberg via an Engine Contract. A really strong engineering audience that stayed focused through 2.5 hours of deep technical talks. Thank you, Scott Haines, for the masterful photo. We’ll be back!
1 Comment

LakeSail reposted this
Alexy Khrabrov
3w
Come to the Bay Area Apache Iceberg meetup tonight! Shehab Amin and I will talk about LakeSail, a Spark ecosystem rebuilt in Rust, and how our Iceberg support upholds an Engine Contract, building support for Iceberg into the engine. Thank you for hosting us, Zhenni Wu, Scott Haines, Lisa N. Cao, Nathan Yee! https://lnkd.in/gyTnEHyD
3 Comments

LakeSail

3w
What if Spark was rewritten in Rust? Well, that's Sail. Besides all the advantages you get from a Rust-native engine, the best part is your Spark code still works as is.

1 Comment

LakeSail reposted this
Alexy Khrabrov
3w
Apache Spark is an API for big data, and Sail from LakeSail is a new implementation for the Agentic AI era.

Refreshing Spark API for the Agentic Era chiefscientist.org


Employees at LakeSail

Peter Jackson

Mark Herring

Mike Price

Robin Vasan


