BUILD: Drop bundled v3p/dcmtk and modernize vil_dicom for DCMTK 3.6+#1045
Conversation
The bundled v3p/dcmtk dates to 2002 and predates the modern DCMTK namespaced-include layout and ABI. It is no longer useful as a fallback for vil's DICOM reader, which is being modernized against externally-installed DCMTK (>= 3.6.5) in subsequent commits. * Drop v3p/dcmtk/ (236 files). * Drop the VXL_FORCE_V3P_DCMTK CMake option and the add_subdirectory(dcmtk) call in v3p/CMakeLists.txt. * Drop the v3p/dcmtk/osconfig.h reference from test_build_info. * Rename contrib/mul/vimt's VXL_BUILD_DCMTK guard to VXL_USE_DCMTK. VXL_BUILD_DCMTK was only ever set by the deleted v3p/dcmtk subdir, so vimt_load.cxx's DICOM path now keys on the same user-facing option that core/vil uses.
Modern DCMTK (>= 3.6.0) ships DCMTKConfig.cmake and DCMTKTargets.cmake, exporting DCMTK::* imported targets with INTERFACE_INCLUDE_DIRECTORIES and INTERFACE_LINK_LIBRARIES already wired. The hand-rolled module-mode FindDCMTK.cmake predates that and only worked against installs that exposed the legacy DCMTK_INCLUDE_DIR / DCMTK_LIBRARIES variables. * Delete config/cmake/Modules/FindDCMTK.cmake. * Replace the include() of that module in core/vil/CMakeLists.txt with find_package(DCMTK CONFIG REQUIRED). * Link vil against DCMTK::dcmimgle, DCMTK::dcmdata, DCMTK::ofstd instead of the flat DCMTK_LIBRARIES list; include dirs propagate via the imported targets. Consumers point DCMTK_DIR at an installed DCMTKConfig.cmake (or place DCMTK on CMAKE_PREFIX_PATH) as with any other modern CMake package.
, vxl#769, vxl#816) vil_dicom has not built against DCMTK >= 3.6.0 for roughly a decade because of structural API changes upstream. With the bundled v3p/dcmtk now gone, vil_dicom must speak the modern DCMTK API. This commit delivers a port that builds and runs cleanly against DCMTK 3.7.0. API drift addressed: * Namespaced headers. DCMTK 3.6+ ships its headers under dcmtk/<library>/<header>.h. All references in vil_dicom.cxx are updated to the namespaced form. * DiInputPixelTemplate signature change (vxl#550). The constructor went from 6 to 9 parameters and now reads pixel data through a DiDocument rather than a raw DcmPixelData*. The reader constructs a DiDocument wrapping the already-loaded DcmFileFormat at the read site and threads it through read_pixels_into_buffer and convert_src_type. vil only loads a single 2D slice, so frame_count=1, frame_size=num_samples, with a stack DcmFileCache and a zero fragment index. Input-stream wiring: The historical approach -- subclass DcmInputStream / DcmProducer / DcmInputStreamFactory inside vil to adapt a vil_stream -- compiles cleanly against DCMTK 3.6+ once eos()/avail()/read()/skip()/putback() adopt offile_off_t and DcmInputStreamFactory::ident() is implemented (vxl#769). It segfaults at runtime, however: DCMTK's static libraries are built with -fvisibility=hidden plus a private set of POSIX feature-test macros (_DARWIN_C_SOURCE, _XOPEN_SOURCE_EXTENDED, _BSD_SOURCE, _BSD_COMPAT, _OSF_SOURCE) that DCMTK does not propagate via INTERFACE_COMPILE_DEFINITIONS on its imported targets. A consumer that derives from DcmInputStream therefore sees a slightly different class layout than the one frozen in libdcmdata.a, corrupting the input-stream vtable on the first virtual dispatch from DcmFileFormat::readUntilTag. Mirroring DCMTK's internal flags from the consumer side is fragile and platform-specific. The robust fix is to keep every DCMTK-derived object inside libdcmdata.a: spool the vil_stream buffer to a mkstemp() temp file, hand the path to DcmFileFormat::loadFile(), and unlink the temp file immediately. loadFile() uses DCMTK's own DcmInputFileStream entirely inside the library, so no DcmInputStream-derived types remain on the consumer side and the ABI hazard is eliminated. The vil_dicom_stream {factory,producer,input} classes and their header are dropped. Build verified against DCMTK 3.7.0 on macOS arm64 Apple Clang Release -O3 -std=c++17. core/vil's full test suite (84 cases in test_file_format_read including 6 DICOM cases) passes; see the follow-up commit that adds a focused vil_dicom smoke test.
test_dicom drives vil_dicom_file_format::make_input_image against the 988-byte 16-bit DICOM fixture that has been shipped in core/vil/tests/file_read_data/ for years but never exercised, since no vxl build was wired against DCMTK until now. It complements the DICOM cases already in test_file_format_read by exercising the reader directly (bypassing the full vil_load multi-format probe), giving a focused regression signal if the DCMTK port drifts again. The new test_dicom is registered under CTest as vil_test_dicom only when VXL_USE_DCMTK is ON. When DCMTK support is off, the test source compiles to a stub that prints a skip message, so the test driver keeps a stable shape across configurations.
be7dee6 to
aa60cf7
Compare
|
@TimCootes This will need testing for your use cases. I just updated to the best of the 23-year-old internal tests. |
|
Thanks for preparing this, Hans, that's really helpful. I, or a colleague, will try it out soon. |
|
@TimCootes Did this update work for you? |
|
It works for some images but not for others. My colleague is currently trying to find out why. |
|
@hjmjohnson Thanks for all the updates, some compressed dicoms failed to properly load into memory or decode. I've created a PR on your fork with fixes for those. (hjmjohnson#1) |
|
Thanks @james-mcm-manchester — incorporated your encapsulated-DICOM fix in 70c6c8f (your authorship preserved; only your own added Verified locally against a Homebrew-backed DCMTK build (DCMTK 3.7.0 from
Note: the in-tree suite has no compressed/encapsulated DICOM fixture, so this confirms the new codec path compiles, links, and doesn't regress the uncompressed reads — it doesn't exercise actual JPEG/JPEG-LS decode. If you can contribute a small encapsulated fixture, I'm happy to wire up a decode test. |
The vil_dicom_stream header was removed in the DCMTK 3.6+ port; the forward declaration in vil_dicom.h and the commented-out include in test_include.cxx were left behind.
The error branch invoked dset.func(key, var) a second time solely to print its OFCondition text, repeating the dataset lookup and double- evaluating macro arguments with side effects. Capture the condition once and reuse it.
A truncated or corrupt encapsulated pixel data fragment can leave loadAllDataIntoMemory() returning non-EC_Normal while loadFile() itself succeeded. The previous code dropped that condition and unlinked the temp file regardless, so downstream decode would later try to reopen the deleted path. Fold the second condition into the same OFCondition that already gates the error branch below.
When dset.search(DCM_PixelData,...) returned non-EC_Normal the constructor left 'pixels' at its initial null value and passed it to read_pixels_into_buffer, which dereferences pixels->getVR() before any null check, causing a SIGSEGV on datasets such as DICOMDIR, structured reports, and RT-struct files.
Per-image registerCodecs()/cleanup() churns process-global DCMTK state and can race with or tear down codec registrations held by co-resident DCMTK consumers (ITK, GDCM-adjacent components). The paired bool flags also leaked the registration on any early return between register and cleanup. Replace the conditional per-image register/cleanup with a single function-local static initializer that registers dcmjpeg, dcmjpls, and dcmrle exactly once per process. RLE Lossless (EXS_RLELossless) now decodes correctly; previously usesEncapsulatedFormat() steered RLE into the baseline-JPEG branch, registering a codec that cannot decode RLE and silently producing a null image. Registration is idempotent and thread-safe via C++11 magic statics, so no per-image cleanup is needed.
|
@james-mcm-manchester. I did an AI code review and fixed a few exposed concerns. Will you please verify that this new code base continues to work for you? |
|
@hjmjohnson Thanks very much for those, new codebase builds, tests and runs correctly. |
Drop the bundled
v3p/dcmtk(236 files, ~94k LOC) and modernizevil_dicomto read DICOM through an externally-installed DCMTK ≥ 3.6.5 viafind_package(DCMTK CONFIG). Closes #550, #768, #769, #816 by retiring the 23-year-old vendored fork and porting the reader to the modern public API.Motivation
The bundled
v3p/dcmtkdates to 2002 and has not built against any DCMTK ≥ 3.6 for roughly a decade. The relevant issues (#550, #768, #769, #816) were filed by distro/packaging maintainers (Fedora, FreeBSD ports, vcpkg) reporting thatvil_dicomcannot be built against system DCMTK because of structural API changes (e.g.DcmInputStreamFactory::ident()becoming pure-virtual,DiInputPixelTemplate's ctor growing from 6 to 9 parameters). No PR has addressed those reports in the seven years since the first one.This PR makes the modernized reader build and run cleanly against DCMTK 3.7.0 and drops the vendored copy entirely.
What changed
Five focused commits:
2c74c205BUILD:Remove vendoredv3p/dcmtk(236 files, ~94k LOC)46dbe80cBUILD:Usefind_package(DCMTK CONFIG)for external DCMTKab4be2fbBUG:Portvil_dicomto the DCMTK 3.6+ public API (fixes #550, #769)c6014eafBUG:Read viaDcmFileFormat::loadFile()to dodge static-lib ABI skewbe7dee62ENH:Add minimal directvil_dicomsmoke testHighlights:
config/cmake/Modules/FindDCMTK.cmakeis replaced byfind_package(DCMTK CONFIG REQUIRED).core/villinks against the imported targetsDCMTK::dcmimgle,DCMTK::dcmdata,DCMTK::ofstd.<dcmtk/dcmdata/dcfilefo.h>, etc.).DiInputPixelTemplatector (DiDocument*,..., DcmFileCache*, fragment&) is used; vil's reader constructs aDiDocumentover the loadedDcmFileFormatand passes a stackDcmFileCachewith a zero fragment index.VXL_BUILD_DCMTK(which was only ever set by the deleted v3p subdir) is gone;contrib/mul/vimt's_BUILD_DCMTKguard now keys onVXL_USE_DCMTK.Why the input stream is no longer subclassed (commit c6014ea)
The first port (
ab4be2fb) used the historicalvil_dicom_stream_inputsubclass ofDcmInputStream. That compiled cleanly but reliably segfaulted on the very first virtual dispatch fromDcmFileFormat::readUntilTag. The vtable pointer of the input-stream object read as the binary image base instead of the expectedDcmInputStreamvtable.Root cause: DCMTK's static libraries are built with
-fvisibility=hiddenplus a private set of POSIX feature-test macros (_DARWIN_C_SOURCE,_XOPEN_SOURCE_EXTENDED,_BSD_SOURCE,_BSD_COMPAT,_OSF_SOURCE) that DCMTK does not propagate viaINTERFACE_COMPILE_DEFINITIONSon its imported targets. A consumer that derives fromDcmInputStream/DcmProducertherefore sees a slightly different class layout than the one frozen inlibdcmdata.a, and the resulting ABI skew corrupts the vtable on the first virtual dispatch.Trying to mirror DCMTK's internal flags from the consumer side (matching
CMAKE_CXX_STANDARD, adding the POSIX macros, enabling-fvisibility=hidden -fvisibility-inlines-hiddenon the affected sources) did not reliably fix the crash and is undocumented / platform-specific.The robust fix is to keep every DCMTK-derived object inside
libdcmdata.a: spool thevil_streambuffer to amkstemp()temp file, hand the path toDcmFileFormat::loadFile(), andunlink()immediately.loadFile()uses DCMTK's ownDcmInputFileStreamentirely inside the library — no DcmInputStream-derived types remain on the consumer side, eliminating the ABI hazard.The
vil_dicom_stream_{factory,producer,input}classes and their header are deleted along with this change.Testing
vxl ships six tiny DICOM fixtures in
core/vil/tests/file_read_data/(sizes 988–2028 bytes) plus matching baselines that have been there for years.test_file_format_read.cxxalready contained a DICOM test stanza gated onHAS_DCMTK; it just had never been exercised because no vxl build was wired against DCMTK in modern memory. With this PR both that stanza and a new dedicatedvil_test_dicomsmoke test pass.Build configuration used locally:
Results against DCMTK 3.7.0 on macOS arm64 (Apple Clang,
-O3,-std=c++17):The 6 DICOM cases inside
vil_test_file_format_read(16/8-bit greyscale uncompressed variants and 12-bit float uncompressed) all pass. The newvil_test_dicomasserts 7 invariants (fixture opens,make_input_imagereturns a resource, format isVIL_PIXEL_FORMAT_UINT_16, single plane, non-zero dimensions, view decodes).Compatibility
VXL_USE_DCMTKis OFF by default, matching the existing surface area. Builds that did not enable it are unaffected.VXL_USE_DCMTKpreviously expected the legacyDCMTK_INCLUDE_DIR/DCMTK_LIBRARIESmodule-mode variables. They now pointDCMTK_DIRat an installedDCMTKConfig.cmake(or put DCMTK onCMAKE_PREFIX_PATH), matching every modern CMake package.VXL_FORCE_V3P_DCMTKto use the bundled copy will fail to configure with a clearfind_packageerror. Those builds should install a real DCMTK ≥ 3.6.5 or disableVXL_USE_DCMTK.