Bayesian segmentation of imaging-based spatial transcriptomics data
Baysor segments imaging-based spatial transcriptomics data using spatial position, local gene composition, and optional prior segmentation masks.
This `cpp` branch contains the first C++ port of Baysor.
The current goal of this branch is to preserve the core segmentation algorithm
of the current Baysor release line on master (v0.7.1), while improving the
implementation around it:
- native C++17 / CMake build
- substantial performance and memory optimizations
- `legacy` and `parquet` output styles
- Parquet / GeoParquet output support
- direct `experiment.xenium` input resolution
- documented Xenium workflow via `xeniumranger import-segmentation`
- the `run`, `preview`, and `segfree` subcommands in one native binary
Future C++ releases may diverge algorithmically, but this first release is intended as a faithful C++ implementation of the current Baysor algorithm with a more efficient runtime and broader modern I/O support.
The main CLI entrypoint is:

`./build/baysor run --help`

Example datasets and runnable commands:
User-facing documentation for this branch:
- Algorithmic continuity: follows the Baysor `v0.7.1` segmentation algorithmic line while reimplementing it in C++.
- Performance work: reduces memory pressure in clustering, segmentation, NCV computation, and Loom writing, and improves Parquet loading.
- Modern output support: keeps the familiar `legacy` bundle and adds a `parquet` bundle with Parquet / GeoParquet tables and a 10x-style HDF5 count matrix.
- Xenium workflow: accepts `experiment.xenium` directly and documents the recommended Xenium Explorer handoff through `xeniumranger import-segmentation`.
- Volumetric support: includes 3D handling and polygon output for datasets such as STARmap.
Install CMake, Ninja, a C++17 toolchain, and the libraries required by
`find_package()` in `CMakeLists.txt`. Version requirements are kept
intentionally broad for package-manager builds:
| Dependency | Version note |
|---|---|
| CMake | >= 3.20 |
| C++ compiler | C++17 compiler; GCC 9.4.0 and Visual Studio 2022 are known to work |
| Ninja | Recent Ninja; 1.10.0 is known to work |
| Eigen3 | >= 3.3 |
| OpenMP | C++ OpenMP target; GCC OpenMP 4.5 is known to work |
| spdlog | Not pinned; 1.5.0 is known to work |
| CGAL | Not pinned; 5.0.2 is known to work |
| Arrow / Parquet | Not pinned; 19.0.1 is known to work; Arrow must include compute, CSV, and Parquet support |
| HDF5 | Not pinned; 1.10.x is known to work |
| nlohmann_json | Not pinned; 3.7.3 is known to work |
| libtiff | Not pinned; 4.1.0 is known to work |
Several header-only dependencies are fetched automatically by CMake with pinned
tags: aarand v1.0.2, CppKmeans v3.1.1, subpar v0.3.1,
knncolle v2.3.0, CppIrlba v2.0.2, and umappp v2.0.1.
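
Before building, it can help to confirm that the installed CMake meets the `>= 3.20` floor from the table and that Ninja is on the `PATH`. The following is a minimal sketch, not part of this repository: `version_ge` and `report_tool` are hypothetical helper names, and the script assumes that the first line of `cmake --version` has the form `cmake version X.Y.Z` and that `ninja --version` prints a bare version number.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when dotted version $1 is >= dotted version $2.
# Fields are compared numerically from left to right; missing fields count as 0.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        na = split(a, x, "."); nb = split(b, y, ".")
        n = (na > nb) ? na : nb
        for (i = 1; i <= n; i++) {
            xi = (i <= na) ? x[i] + 0 : 0
            yi = (i <= nb) ? y[i] + 0 : 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0
    }'
}

# Hypothetical helper: prints one status line per tool.
# $1 = tool name, $2 = detected version (may be empty), $3 = minimum (may be empty)
report_tool() {
    if [ -z "$2" ]; then
        echo "$1: not found"
    elif [ -z "$3" ]; then
        echo "$1: $2"
    elif version_ge "$2" "$3"; then
        echo "$1: $2 (ok, >= $3)"
    else
        echo "$1: $2 (too old, need >= $3)"
    fi
}

cmake_version=""
if command -v cmake >/dev/null 2>&1; then
    # Assumes the first output line looks like "cmake version 3.22.1".
    cmake_version=$(cmake --version | awk 'NR == 1 { print $3 }')
fi

ninja_version=""
if command -v ninja >/dev/null 2>&1; then
    ninja_version=$(ninja --version)
fi

report_tool cmake "$cmake_version" 3.20
# No hard minimum is stated for Ninja, so only the detected version is reported.
report_tool ninja "$ninja_version" ""
```

The script only reports what it finds and never aborts, so it is safe to run on a machine where the tools are not installed yet. The library dependencies in the table are still resolved by `find_package()` at configure time.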
After dependencies are installed, use the same command on Linux, macOS, and Windows:

`cmake -P cmake/build_and_install.cmake`

This configures an end-user build: optimized, tests off, and installed to
`./install/bin`. Platform-specific prerequisite commands are in
`docs/installation.md`. Windows uses vcpkg when
`VCPKG_ROOT` is set; Linux and macOS use system packages by default.
Run the installed binary with:

`./install/bin/baysor --help`

Detailed installation instructions are in `docs/installation.md`.
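
This README refers to two binary locations, `./install/bin/baysor` and `./build/baysor`. As a convenience, a small wrapper can pick whichever one exists. The sketch below is illustrative only: `find_baysor` is a hypothetical helper, not something shipped with this branch, and it assumes a Unix-like shell (on Windows the executable would normally carry an `.exe` suffix).

```shell
#!/bin/sh
# Hypothetical helper: prints the first executable baysor binary under the
# repository root given as $1, preferring the installed copy over the build tree.
find_baysor() {
    for candidate in "$1/install/bin/baysor" "$1/build/baysor"; do
        if [ -x "$candidate" ] && [ ! -d "$candidate" ]; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}

if baysor_bin=$(find_baysor .); then
    "$baysor_bin" --help
else
    echo "baysor binary not found; build it first with: cmake -P cmake/build_and_install.cmake"
fi
```

Run it from the repository root. It only ever invokes `--help`, the one option shown in this README.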
If you find Baysor useful for your publication, please cite:
Petukhov V, Xu RJ, Soldatov RA, Cadinu P, Khodosevich K, Moffitt JR & Kharchenko PV.
Cell segmentation in imaging-based spatial transcriptomics.
Nat Biotechnol (2021). https://doi.org/10.1038/s41587-021-01044-w