08 Jun 25
DuckLake, released under the MIT license, provides a lightweight one-stop solution for a data lake and catalog, comparable to Delta Lake paired with Unity Catalog or Iceberg paired with Lakekeeper or Polaris. It includes an open table format, but it is also a data lakehouse format: it contains a catalog that records the schema of the stored data. It needs two components: a storage layer (both blob storage and block-based storage work) and a catalog database (any SQL-compatible database works). DuckLake data files must be stored as Parquet. Like other data lakehouse technologies, DuckLake does not support constraints, keys, or indexes. Currently, a DuckLake can be exported into a DuckDB database and vanilla Parquet files. You can also use it for a “multiplayer DuckDB” setup, in which multiple DuckDB instances read and write the same dataset.
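To make the two components (catalog database and storage layer) concrete, here is a minimal sketch of attaching a DuckLake from Python. It assumes the `duckdb` Python package in a recent version that can install the `ducklake` extension. The helper `ducklake_attach_sql`, the alias `lake`, the file names `metadata.ducklake` and `data_files/`, and the table `events` are all illustrative choices, not part of DuckLake itself.

```python
def ducklake_attach_sql(catalog: str, data_path: str, alias: str = "lake") -> str:
    """Build an ATTACH statement for a DuckLake (hypothetical helper).

    catalog   -- where the catalog database lives, e.g. a local DuckDB file
                 or a connection string such as "postgres:dbname=... host=..."
    data_path -- where the Parquet data files are written (local dir or blob storage)
    """
    # Double any single quotes so the values stay valid SQL string literals.
    catalog = catalog.replace("'", "''")
    data_path = data_path.replace("'", "''")
    return f"ATTACH 'ducklake:{catalog}' AS {alias} (DATA_PATH '{data_path}');"


def demo() -> None:
    """Create a small DuckLake locally; needs the duckdb package and network access
    the first time, because the ducklake extension is downloaded on INSTALL."""
    import duckdb

    con = duckdb.connect()
    con.execute("INSTALL ducklake;")
    con.execute("LOAD ducklake;")
    # Catalog in a local DuckDB file, data files in a local directory (assumed names).
    con.execute(ducklake_attach_sql("metadata.ducklake", "data_files/"))
    con.execute("USE lake;")
    con.execute("CREATE TABLE IF NOT EXISTS events (id INTEGER, name VARCHAR);")
    con.execute("INSERT INTO events VALUES (1, 'first'), (2, 'second');")
    print(con.execute("SELECT * FROM events ORDER BY id").fetchall())
```

Calling `demo()` should leave a `metadata.ducklake` catalog file and Parquet files under `data_files/`. For the “multiplayer” setup, the same helper could be pointed at a shared catalog database instead, for example `ducklake_attach_sql("postgres:dbname=ducklake_catalog host=localhost", "s3://my-bucket/lake/")`, where the database name, host, and bucket are placeholders.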