Apache Iceberg C++ library
You do not need Java to use Apache Iceberg™.
It is an alternative to iceberg-cpp. The library is used by Tea, an open-source extension for Greenplum that reads Iceberg data from S3-compatible storage using HMS and Nessie catalogs.
Source: https://iceberg.apache.org/status/
| Data type | Iceberg version | Cxx | Java |
|---|---|---|---|
| boolean | 2 | + | + |
| int | 2 | + | + |
| float | 2 | + | + |
| double | 2 | + | + |
| decimal | 2 | + | + |
| date | 2 | + | + |
| time | 2 | + | + |
| timestamp | 2 | + | + |
| timestamptz | 2 | + | + |
| timestamp_ns | 3 | + | + |
| timestamptz_ns | 3 | + | + |
| string | 2 | + | + |
| uuid | 2 | + | + |
| fixed | 2 | + | + |
| binary | 2 | + | + |
| variant | 3 | - | + |
| list | 2 | + | + |
| map | 2 | - | + |
| struct | 2 | - | + |
| unknown | 3 | + | ? |
Datetime restrictions are defined in the Iceberg spec.
The underlying type for date is int32. For time and all timestamp types it is int64.
timestamp and timestamptz store microseconds since 1970-01-01 00:00:00.000000.
timestamp_ns and timestamptz_ns store nanoseconds since 1970-01-01 00:00:00.000000000.
The tz suffix means the value is adjusted to UTC.
| File format | Cxx | Java |
|---|---|---|
| Parquet | + | + |
| ORC | - | + |
| Puffin | + | + |
| Avro | - | + |
| File IO | Cxx | Java |
|---|---|---|
| Local Filesystem | + | + |
| Hadoop Filesystem | - | + |
| S3 Compatible | + | + |
| GCS Compatible | - | + |
| ADLS Compatible | - | + |
Not implemented
Not implemented
| Operation | Iceberg version | Cxx | Java |
|---|---|---|---|
| Plan with data file | 1,2 | + | + |
| Plan with position deletes | 2 | + | + |
| Plan with equality deletes | 2 | + | + |
| Plan with puffin statistics | 1,2 | - | + |
| Read data file | 1,2 | + | + |
| Read with position deletes | 2 | + | + |
| Read with equality deletes | 2 | + | + |
| Operation | Iceberg version | Cxx | Java |
|---|---|---|---|
| Append data | 1,2 | + | + |
| Write position deletes | 2 | - | + |
| Write equality deletes | 2 | - | + |
| Write deletion vectors | 3 | + | + |
| Table Operation | Nessie | Glue | HMS |
|---|---|---|---|
| listTable | - | - | - |
| createTable | - | - | - |
| dropTable | - | - | - |
| loadTable | +- | - | +- |
| updateTable | - | - | - |
| renameTable | - | - | - |
| tableExists | +- | - | +- |
| createView | - | - | - |
| dropView | - | - | - |
| listView | - | - | - |
| viewExists | - | - | - |
| replaceView | - | - | - |
| renameView | - | - | - |
| listNamespaces | - | - | - |
| createNamespace | - | - | - |
| dropNamespace | - | - | - |
| namespaceExists | - | - | - |
| updateNamespaceProperties | - | - | - |
| loadNamespaceMetadata | - | - | - |
- C++20 compliant compiler
- CMake 3.20 or higher
- OpenSSL
You have to download Apache Arrow dependencies first.
mkdir _deps && cd _deps
git clone --single-branch -b maint-15.0.2 https://github.com/apache/arrow.git
cd arrow && git apply ../../vendor/arrow/fix_c-ares_url.patch && cd ..
./arrow/cpp/thirdparty/download_dependencies.sh ./arrow-thirdparty
mkdir _build
cd _build
ln -s ../_deps/arrow-thirdparty arrow-thirdparty
cmake -GNinja ../
ninja
cd tests/
../iceberg/iceberg-cpp-test
../iceberg/common/fs/iceberg_common_fs_test
./iceberg_local_test