Skip to content

BUILD: Drop bundled v3p/dcmtk and modernize vil_dicom for DCMTK 3.6+#1045

Merged
hjmjohnson merged 10 commits into
vxl:masterfrom
hjmjohnson:feature/dcmtk-modernize
Jun 2, 2026
Merged

BUILD: Drop bundled v3p/dcmtk and modernize vil_dicom for DCMTK 3.6+#1045
hjmjohnson merged 10 commits into
vxl:masterfrom
hjmjohnson:feature/dcmtk-modernize

Conversation

@hjmjohnson

Copy link
Copy Markdown
Contributor

Drop the bundled v3p/dcmtk (236 files, ~94k LOC) and modernize vil_dicom to read DICOM through an externally-installed DCMTK ≥ 3.6.5 via find_package(DCMTK CONFIG). Closes #550, #768, #769, #816 by retiring the 23-year-old vendored fork and porting the reader to the modern public API.

Motivation

The bundled v3p/dcmtk dates to 2002 and has not built against any DCMTK ≥ 3.6 for roughly a decade. The relevant issues (#550, #768, #769, #816) were filed by distro/packaging maintainers (Fedora, FreeBSD ports, vcpkg) reporting that vil_dicom cannot be built against system DCMTK because of structural API changes (e.g. DcmInputStreamFactory::ident() becoming pure-virtual, DiInputPixelTemplate's ctor growing from 6 to 9 parameters). No PR has addressed those reports in the seven years since the first one.

This PR makes the modernized reader build and run cleanly against DCMTK 3.7.0 and drops the vendored copy entirely.

What changed

Five focused commits:

Commit Subject
2c74c205 BUILD: Remove vendored v3p/dcmtk (236 files, ~94k LOC)
46dbe80c BUILD: Use find_package(DCMTK CONFIG) for external DCMTK
ab4be2fb BUG: Port vil_dicom to the DCMTK 3.6+ public API (fixes #550, #769)
c6014eaf BUG: Read via DcmFileFormat::loadFile() to dodge static-lib ABI skew
be7dee62 ENH: Add minimal direct vil_dicom smoke test

Highlights:

  • config/cmake/Modules/FindDCMTK.cmake is replaced by find_package(DCMTK CONFIG REQUIRED). core/vil links against the imported targets DCMTK::dcmimgle, DCMTK::dcmdata, DCMTK::ofstd.
  • All nine DCMTK headers in the reader move to the namespaced form (<dcmtk/dcmdata/dcfilefo.h>, etc.).
  • The 9-argument DiInputPixelTemplate ctor (DiDocument*, ..., DcmFileCache*, fragment&) is used; vil's reader constructs a DiDocument over the loaded DcmFileFormat and passes a stack DcmFileCache with a zero fragment index.
  • VXL_BUILD_DCMTK (which was only ever set by the deleted v3p subdir) is gone; contrib/mul/vimt's _BUILD_DCMTK guard now keys on VXL_USE_DCMTK.
Why the input stream is no longer subclassed (commit c6014ea)

The first port (ab4be2fb) used the historical vil_dicom_stream_input subclass of DcmInputStream. That compiled cleanly but reliably segfaulted on the very first virtual dispatch from DcmFileFormat::readUntilTag. The vtable pointer of the input-stream object read as the binary image base instead of the expected DcmInputStream vtable.

Root cause: DCMTK's static libraries are built with -fvisibility=hidden plus a private set of POSIX feature-test macros (_DARWIN_C_SOURCE, _XOPEN_SOURCE_EXTENDED, _BSD_SOURCE, _BSD_COMPAT, _OSF_SOURCE) that DCMTK does not propagate via INTERFACE_COMPILE_DEFINITIONS on its imported targets. A consumer that derives from DcmInputStream / DcmProducer therefore sees a slightly different class layout than the one frozen in libdcmdata.a, and the resulting ABI skew corrupts the vtable on the first virtual dispatch.

Trying to mirror DCMTK's internal flags from the consumer side (matching CMAKE_CXX_STANDARD, adding the POSIX macros, enabling -fvisibility=hidden -fvisibility-inlines-hidden on the affected sources) did not reliably fix the crash and is undocumented / platform-specific.

The robust fix is to keep every DCMTK-derived object inside libdcmdata.a: spool the vil_stream buffer to a mkstemp() temp file, hand the path to DcmFileFormat::loadFile(), and unlink() immediately. loadFile() uses DCMTK's own DcmInputFileStream entirely inside the library — no DcmInputStream-derived types remain on the consumer side, eliminating the ABI hazard.

The vil_dicom_stream_{factory,producer,input} classes and their header are deleted along with this change.

Testing

vxl ships six tiny DICOM fixtures in core/vil/tests/file_read_data/ (sizes 988–2028 bytes) plus matching baselines that have been there for years. test_file_format_read.cxx already contained a DICOM test stanza gated on HAS_DCMTK; it just had never been exercised because no vxl build was wired against DCMTK in modern memory. With this PR both that stanza and a new dedicated vil_test_dicom smoke test pass.

Build configuration used locally:

cmake -G Ninja $SRC \
  -DVXL_BUILD_CORE_IMAGING=ON \
  -DVXL_USE_DCMTK=ON \
  -DDCMTK_DIR=/path/to/dcmtk-install \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_TESTING=ON
ninja vil_test_all
ctest -R "vil_test_(file_format_read|dicom)" -V

Results against DCMTK 3.7.0 on macOS arm64 (Apple Clang, -O3, -std=c++17):

1/2 Test #197: vil_test_file_format_read ........   Passed    0.75 sec
2/2 Test #198: vil_test_dicom ...................   Passed    0.01 sec
100% tests passed, 0 tests failed out of 2

The 6 DICOM cases inside vil_test_file_format_read (16/8-bit greyscale uncompressed variants and 12-bit float uncompressed) all pass. The new vil_test_dicom asserts 7 invariants (fixture opens, make_input_image returns a resource, format is VIL_PIXEL_FORMAT_UINT_16, single plane, non-zero dimensions, view decodes).

Compatibility

The bundled v3p/dcmtk dates to 2002 and predates the modern DCMTK
namespaced-include layout and ABI. It is no longer useful as a
fallback for vil's DICOM reader, which is being modernized against
externally-installed DCMTK (>= 3.6.5) in subsequent commits.

* Drop v3p/dcmtk/ (236 files).
* Drop the VXL_FORCE_V3P_DCMTK CMake option and the
  add_subdirectory(dcmtk) call in v3p/CMakeLists.txt.
* Drop the v3p/dcmtk/osconfig.h reference from test_build_info.
* Rename contrib/mul/vimt's VXL_BUILD_DCMTK guard to VXL_USE_DCMTK.
  VXL_BUILD_DCMTK was only ever set by the deleted v3p/dcmtk subdir,
  so vimt_load.cxx's DICOM path now keys on the same user-facing
  option that core/vil uses.
Modern DCMTK (>= 3.6.0) ships DCMTKConfig.cmake and DCMTKTargets.cmake,
exporting DCMTK::* imported targets with INTERFACE_INCLUDE_DIRECTORIES
and INTERFACE_LINK_LIBRARIES already wired. The hand-rolled module-mode
FindDCMTK.cmake predates that and only worked against installs that
exposed the legacy DCMTK_INCLUDE_DIR / DCMTK_LIBRARIES variables.

* Delete config/cmake/Modules/FindDCMTK.cmake.
* Replace the include() of that module in core/vil/CMakeLists.txt with
  find_package(DCMTK CONFIG REQUIRED).
* Link vil against DCMTK::dcmimgle, DCMTK::dcmdata, DCMTK::ofstd
  instead of the flat DCMTK_LIBRARIES list; include dirs propagate via
  the imported targets.

Consumers point DCMTK_DIR at an installed DCMTKConfig.cmake (or place
DCMTK on CMAKE_PREFIX_PATH) as with any other modern CMake package.
, vxl#769, vxl#816)

vil_dicom has not built against DCMTK >= 3.6.0 for roughly a decade
because of structural API changes upstream. With the bundled v3p/dcmtk
now gone, vil_dicom must speak the modern DCMTK API. This commit
delivers a port that builds and runs cleanly against DCMTK 3.7.0.

API drift addressed:

* Namespaced headers. DCMTK 3.6+ ships its headers under
  dcmtk/<library>/<header>.h. All references in vil_dicom.cxx are
  updated to the namespaced form.

* DiInputPixelTemplate signature change (vxl#550). The constructor went
  from 6 to 9 parameters and now reads pixel data through a DiDocument
  rather than a raw DcmPixelData*. The reader constructs a DiDocument
  wrapping the already-loaded DcmFileFormat at the read site and
  threads it through read_pixels_into_buffer and convert_src_type.
  vil only loads a single 2D slice, so frame_count=1,
  frame_size=num_samples, with a stack DcmFileCache and a zero
  fragment index.

Input-stream wiring:

The historical approach -- subclass DcmInputStream / DcmProducer /
DcmInputStreamFactory inside vil to adapt a vil_stream -- compiles
cleanly against DCMTK 3.6+ once eos()/avail()/read()/skip()/putback()
adopt offile_off_t and DcmInputStreamFactory::ident() is implemented
(vxl#769). It segfaults at runtime, however: DCMTK's static libraries
are built with -fvisibility=hidden plus a private set of POSIX
feature-test macros (_DARWIN_C_SOURCE, _XOPEN_SOURCE_EXTENDED,
_BSD_SOURCE, _BSD_COMPAT, _OSF_SOURCE) that DCMTK does not propagate
via INTERFACE_COMPILE_DEFINITIONS on its imported targets. A consumer
that derives from DcmInputStream therefore sees a slightly different
class layout than the one frozen in libdcmdata.a, corrupting the
input-stream vtable on the first virtual dispatch from
DcmFileFormat::readUntilTag. Mirroring DCMTK's internal flags from
the consumer side is fragile and platform-specific.

The robust fix is to keep every DCMTK-derived object inside
libdcmdata.a: spool the vil_stream buffer to a mkstemp() temp file,
hand the path to DcmFileFormat::loadFile(), and unlink the temp file
immediately. loadFile() uses DCMTK's own DcmInputFileStream entirely
inside the library, so no DcmInputStream-derived types remain on the
consumer side and the ABI hazard is eliminated. The vil_dicom_stream
{factory,producer,input} classes and their header are dropped.

Build verified against DCMTK 3.7.0 on macOS arm64 Apple Clang
Release -O3 -std=c++17. core/vil's full test suite (84 cases in
test_file_format_read including 6 DICOM cases) passes; see the
follow-up commit that adds a focused vil_dicom smoke test.
test_dicom drives vil_dicom_file_format::make_input_image against the
988-byte 16-bit DICOM fixture that has been shipped in
core/vil/tests/file_read_data/ for years but never exercised, since
no vxl build was wired against DCMTK until now. It complements the
DICOM cases already in test_file_format_read by exercising the
reader directly (bypassing the full vil_load multi-format probe),
giving a focused regression signal if the DCMTK port drifts again.

The new test_dicom is registered under CTest as vil_test_dicom only
when VXL_USE_DCMTK is ON. When DCMTK support is off, the test source
compiles to a stub that prints a skip message, so the test driver
keeps a stable shape across configurations.
@hjmjohnson hjmjohnson force-pushed the feature/dcmtk-modernize branch from be7dee6 to aa60cf7 Compare May 11, 2026 19:09
@hjmjohnson hjmjohnson marked this pull request as ready for review May 11, 2026 19:10
@hjmjohnson hjmjohnson requested a review from TimCootes May 11, 2026 19:10
@hjmjohnson

Copy link
Copy Markdown
Contributor Author

@TimCootes This will need testing for your use cases. I just updated to the best of the 23-year-old internal tests.

@TimCootes

Copy link
Copy Markdown
Contributor

Thanks for preparing this, Hans, that's really helpful. I, or a colleague, will try it out soon.

@hjmjohnson

Copy link
Copy Markdown
Contributor Author

@TimCootes Did this update work for you?

@TimCootes

Copy link
Copy Markdown
Contributor

It works for some images but not for others. My colleague is currently trying to find out why.

@james-mcm-manchester

Copy link
Copy Markdown
Contributor

@hjmjohnson Thanks for all the updates, some compressed dicoms failed to properly load into memory or decode. I've created a PR on your fork with fixes for those. (hjmjohnson#1)

@hjmjohnson

Copy link
Copy Markdown
Contributor Author

Thanks @james-mcm-manchester — incorporated your encapsulated-DICOM fix in 70c6c8f (your authorship preserved; only your own added MustRead line was re-wrapped for clang-format).

Verified locally against a Homebrew-backed DCMTK build (DCMTK 3.7.0 from /opt/homebrew, which provides the dcmjpeg/dcmjpls components your fix links):

  • vil builds and links cleanly with DCMTK::dcmjpeg DCMTK::dcmjpls added.
  • vil_test_file_format_read — 84/84 passed
  • vil_test_dicom — 7/7 passed

Note: the in-tree suite has no compressed/encapsulated DICOM fixture, so this confirms the new codec path compiles, links, and doesn't regress the uncompressed reads — it doesn't exercise actual JPEG/JPEG-LS decode. If you can contribute a small encapsulated fixture, I'm happy to wire up a decode test.

The vil_dicom_stream header was removed in the DCMTK 3.6+ port; the
forward declaration in vil_dicom.h and the commented-out include in
test_include.cxx were left behind.
The error branch invoked dset.func(key, var) a second time solely to
print its OFCondition text, repeating the dataset lookup and double-
evaluating macro arguments with side effects. Capture the condition
once and reuse it.
A truncated or corrupt encapsulated pixel data fragment can leave
loadAllDataIntoMemory() returning non-EC_Normal while loadFile()
itself succeeded. The previous code dropped that condition and
unlinked the temp file regardless, so downstream decode would later
try to reopen the deleted path. Fold the second condition into the
same OFCondition that already gates the error branch below.
When dset.search(DCM_PixelData,...) returned non-EC_Normal the
constructor left 'pixels' at its initial null value and passed it to
read_pixels_into_buffer, which dereferences pixels->getVR() before any
null check, causing a SIGSEGV on datasets such as DICOMDIR, structured
reports, and RT-struct files.
Per-image registerCodecs()/cleanup() churns process-global DCMTK
state and can race with or tear down codec registrations held by
co-resident DCMTK consumers (ITK, GDCM-adjacent components). The
paired bool flags also leaked the registration on any early return
between register and cleanup.

Replace the conditional per-image register/cleanup with a single
function-local static initializer that registers dcmjpeg, dcmjpls,
and dcmrle exactly once per process. RLE Lossless (EXS_RLELossless)
now decodes correctly; previously usesEncapsulatedFormat() steered
RLE into the baseline-JPEG branch, registering a codec that cannot
decode RLE and silently producing a null image. Registration is
idempotent and thread-safe via C++11 magic statics, so no per-image
cleanup is needed.
@hjmjohnson

Copy link
Copy Markdown
Contributor Author

@james-mcm-manchester. I did an AI code review and fixed a few exposed concerns. Will you please verify that this new code base continues to work for you?

@james-mcm-manchester

Copy link
Copy Markdown
Contributor

@hjmjohnson Thanks very much for those, new codebase builds, tests and runs correctly.

@hjmjohnson hjmjohnson merged commit 30cfa52 into vxl:master Jun 2, 2026
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Does not build with new dcmtk: API changed

3 participants