This project is entirely derived from LaurentMazare/ocaml-arrow.
It is a reimplementation that uses the OCaml Standard Library and is updated for Apache Arrow version 21 and C++17.
Some of the ocaml-arrow features around PPX have not yet been ported.
Add the Apache apt repository
apt install -y -V ca-certificates lsb-release wget
wget https://packages.apache.org/artifactory/arrow/$(lsb_release --id --short | tr 'A-Z' 'a-z')/apache-arrow-apt-source-latest-$(lsb_release --codename --short).deb
apt install -y -V ./apache-arrow-apt-source-latest-$(lsb_release --codename --short).deb

Install the Arrow development library for C++
- libarrow-dev - Apache Arrow C++ development libraries and headers
- libparquet-dev - Apache Parquet C++ development libraries and headers
apt update
apt install -y libarrow-dev libparquet-dev

Then create a switch and build the project.
opam switch create . 5.3.0 --deps-only --with-test
dune build

On Ubuntu Plucky, the Apache repository does not yet have a prebuilt binary, so Apache Arrow must be built from source.
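Before committing to a source build, it can help to see which apt source package the earlier `wget` command would request for a given release. The following is a minimal sketch, not part of this project: `arrow_apt_source_url` is a hypothetical helper name, and it only assembles the URL pattern already used above.

```shell
# Hypothetical helper: print the apt source package URL for a distribution
# ID and codename, mirroring the wget command used earlier in this README.
arrow_apt_source_url() {
  # Fall back to the running system when no arguments are given.
  id="${1:-$(lsb_release --id --short)}"
  codename="${2:-$(lsb_release --codename --short)}"
  # The repository path uses the lower-case distribution ID.
  id_lower="$(printf '%s' "$id" | tr 'A-Z' 'a-z')"
  printf 'https://packages.apache.org/artifactory/arrow/%s/apache-arrow-apt-source-latest-%s.deb\n' \
    "$id_lower" "$codename"
}
```

Running `wget --spider "$(arrow_apt_source_url)"` probes that URL without downloading it. A failed request suggests the release is not published yet. A successful one does not by itself guarantee that `libarrow-dev` is available for that release.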
Install build dependencies
sudo apt update
sudo apt install -y build-essential cmake git \
libboost-all-dev libssl-dev libcurl4-openssl-dev \
libbz2-dev zlib1g-dev liblz4-dev libzstd-dev \
libsnappy-dev libre2-dev libthrift-dev

Clone Arrow repository
git clone https://github.com/apache/arrow.git
cd arrow

Check out the latest stable release (optional, or use the main branch)
git checkout apache-arrow-21.0.0

Create build directory
cd cpp
mkdir build
cd build

Configure with CMake (basic options)
cmake .. \
-DCMAKE_BUILD_TYPE=Release \
-DCMAKE_INSTALL_PREFIX=/usr/local \
-DARROW_COMPUTE=ON \
-DARROW_CSV=ON \
-DARROW_DATASET=ON \
-DARROW_FILESYSTEM=ON \
-DARROW_JSON=ON \
-DARROW_PARQUET=ON \
-DARROW_WITH_SNAPPY=ON \
-DARROW_WITH_ZLIB=ON \
-DARROW_WITH_LZ4=ON \
-DARROW_WITH_ZSTD=ON

Build (use all CPU cores)
make -j$(nproc)

Install
sudo make install

Update library cache
sudo ldconfig

For extra tests, clone the parquet-testing repository into test/test-data; all of the test data will then be parsed using this library.
git clone https://github.com/apache/parquet-testing test/test-data
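Whichever installation route was used, it may be worth confirming that the installed Arrow matches the major version this project targets before running `dune build`. The sketch below assumes `pkg-config` is installed and that Arrow's `arrow.pc` file is on its search path. `arrow_major_is` is a hypothetical helper, not something this project provides.

```shell
# Hypothetical helper: succeed when a version string such as "21.0.0"
# has the expected major version.
arrow_major_is() {
  expected="$1"
  version="$2"
  # Strip everything from the first dot onward to get the major number.
  major="${version%%.*}"
  [ "$major" = "$expected" ]
}
```

For example, `arrow_major_is 21 "$(pkg-config --modversion arrow)" && echo "Arrow 21 found"` prints the message only when the reported version starts with 21.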