Releases: bsc-pm/dlb

Version 3.6.0

17 Sep 10:30

Added

  • Initial support for GPU TALP metrics (includes NVIDIA and AMD support)
  • Added new, more robust support for MPI Fortran 2008 bindings
  • New binary dlb_mpi to check affinity in MPI environments
  • Flags --gpu-affinity and --uuid for dlb and dlb_mpi to show
    GPU visibility
  • New instrumentation events for OMPT callbacks
  • New API DLB_DROM_SetProcessMaskStr to set masks using a human-readable input
  • New DLB Python bindings
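
To give a feel for what a human-readable mask might look like, here is a minimal Python sketch that expands a CPU list such as "0-3,8" into individual CPU ids. These notes do not specify the syntax that DLB_DROM_SetProcessMaskStr accepts, so the comma-and-range format and the helper name parse_cpu_list are assumptions made for illustration only; this is not DLB code.

```python
def parse_cpu_list(text):
    """Expand a CPU list such as "0-3,8" into a sorted list of CPU ids.

    Assumed format (hypothetical, not taken from DLB): comma-separated
    entries, each either a single id ("8") or an inclusive range ("0-3").
    """
    cpus = set()
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate empty entries such as a trailing comma
        if "-" in part:
            first, last = (int(x) for x in part.split("-", 1))
            if first > last:
                raise ValueError(f"invalid range: {part!r}")
            cpus.update(range(first, last + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


print(parse_cpu_list("0-3,8"))  # prints [0, 1, 2, 3, 8]
```

A non-contiguous mask such as "0-3,8" is where a string form is most convenient, since it avoids building a bitmask by hand.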

Changed

  • --talp-output-file now creates missing directories if able
  • DLB_TALP_CollectPOPMetrics can now be called from non-MPI apps

Fixed

  • Fixed TALP Global region not being started in applications without MPI or OpenMP
  • Replace PAPI reset calls with regular reads to improve performance
  • Add several PAPI init checks
  • Fixed several LeWI features in async mode

Version 3.5.3

08 Sep 15:14

Added

  • Added documentation for DLB_DROM_FLAGS_NONE argument
  • Other minor documentation corrections

Changed

  • Remove MPI Fortran 2008 bindings check at configure time; bindings remain
    disabled pending new interception method

Fixed

  • Fixed compilation error with GCC 15
  • Fixed errors in the OpenMP thread manager during initialization
  • Fixed some OpenMP thread manager logic for SMT systems
  • Fixed bug in DLB_MonitoringRegionReset; region was not being removed from
    an internal list of open regions
  • Stop instrumentation after MPI_Finalize to avoid unwanted interactions with
    external libraries

Version 3.5.2

02 May 17:01

Added

  • Add dlb_mpi binary to display CPU affinity and MPI rank
  • Add more verbose messages for OMPT events
  • Add implicit-task-end OMPT event to add consistency to Cray OpenMP

Fixed

  • Fixed several errors with CPU topology parsing
  • Removed deprecated options and struct members in examples

Version 3.5.1

05 Mar 17:43

Added

  • Add --disable-sphinx-doc, --disable-doxygen, and --disable-pandoc
    to configure script to disable the automatic detection of each tool

Fixed

  • Fixed several bugs with CPU affinity masks detection and parsing
  • POP metrics conditional printing improved for MPI/OpenMP detection
  • Fix some compilation errors with clang-19, nvc, and nvfortran
  • Improved documentation in user guide
  • Several other minor fixes

TALP-Pages 3.5.1

  • Bugfix: Enable the generation of OpenMP-only scaling tables
  • Bugfix: Disable warning for chained assignments in newer pandas versions

Version 3.5.0

03 Dec 14:47

Added

  • Asynchronous support for classic LeWI
  • Several SMT enhancements for LeWI policies
  • The LeWI classic/mask policy can now be overridden with --lewi-affinity
  • TALP POP metrics now include experimental OpenMP hybrid metrics
  • TALP global region is now exposed in the API
  • TALP-Pages, a new tool for Continuous Performance Monitoring in static HTML pages
  • Add flag --talp-region-select to filter active regions
  • SLURM integration via dlb_taskset
  • CMake config for other projects to link with DLB
  • Several examples and documentation reworked
  • DLB version information can be accessed through the API

Changed

  • --talp-summary has been simplified: pop-metrics now also includes raw
    metrics when an output file is used, and process metrics now include
    node identifiers
  • TALP now only stores monitoring regions in shared memory if
    --talp-external-profiler is set
  • TALP output structure has been reworked
  • TALP main region is now called "Global"

Fixed

  • LeWI mask now correctly supports threads blocked in MPI calls while pinned to
    multiple CPUs
  • Add sanity checks for hardware counters in TALP
  • Print JSON and CSV files in the proper locale

Deprecated

  • --talp-summary values for pop-raw and node are deprecated
  • TALP output format XML is now deprecated
  • --talp-regions-per-proc flag is deprecated in favor of the new
    experimental --shm-size-multiplier flag
  • Several fields in dlb_monitor_t are now deprecated
  • Several fields in dlb_pop_metrics_t are now deprecated
  • DLB_MonitoringRegionGetMPIRegion deprecated in favor of
    DLB_MonitoringRegionGetGlobal
  • DLB_Stats_GetCpuStateIdle functionality no longer provided
  • DLB_Stats_GetCpuStateOwned functionality no longer provided
  • DLB_Stats_GetCpuStateGuested functionality no longer provided

Version 3.4.1

16 Aug 14:18

Fixed

  • Fix an error in the shared memory alignment that was causing
    segmentation faults when compiling with -march=native
  • Avoid registering role shifting callbacks for other non-related
    OpenMP thread managers
  • Update examples with supported options
  • Fix some parameters in the Fortran'08 interface
  • Be more resilient if PAPI fails to initialize
  • Enhance compatibility in other systems
  • Quote string names in CSV files
  • Several other minor fixes

Version 3.4

22 Dec 17:23

Added

  • PAPI support for TALP metrics
  • libdlb_mpic.so and libdlb_mpic_*.so are C MPI only libraries
    that may be built using --enable-c-mpi-library at configure time
  • Functions to reset, stop, start and report monitoring regions now
    accept the special argument DLB_MPI_REGION for the implicit region
  • Function DLB_TALP_QueryPOPNodeMetrics for third-party applications
    to query pop metrics. Requires --talp-external-profiler.
  • Named barriers and several API functions to manage them
  • Added --lewi-barrier and --lewi-barrier-select to fine-tune
    which barriers activate LeWI.
  • Added --lewi-color to select specific key only for LeWI

Changed

  • libdlb_mpif.so and libdlb_mpif_*.so are no longer built by default,
    only if --enable-fortran-mpi-library is set at configure time
  • Flag --quiet now only suppresses INFO and VERBOSE, added new flag
    --silent to keep the old functionality to suppress all messages
  • Refactor DLB_TALP_CollectNodeMetrics to
    DLB_TALP_CollectPOPNodeMetrics and add communication efficiency
  • TALP now appends to CSV files if they already exist
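
As a rough illustration of append-if-exists behaviour for CSV output, the Python sketch below writes a header only when the file is new or empty and appends rows otherwise. It is a generic pattern, not TALP's implementation: the helper name append_rows and the column names are hypothetical, and whether TALP repeats or skips the header on append is not stated in these notes.

```python
import csv
import os


def append_rows(path, header, rows):
    """Append rows to a CSV file, writing the header only for a new or empty file.

    Strings are quoted and numbers are left bare (csv.QUOTE_NONNUMERIC).
    """
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="") as f:
        writer = csv.writer(f, quoting=csv.QUOTE_NONNUMERIC)
        if is_new:
            writer.writerow(header)
        writer.writerows(rows)
```

Calling append_rows twice on the same path, for example from two consecutive runs, leaves a single header line followed by the rows from both calls.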

Fixed

  • Fixed wrong generated code for MPI_Initialized and MPI_Finalized

Deprecated

  • --lewi-ompt no longer accepts "mpi" or "aggressive" as values.
    Automatic LeWI via synchronization calls is now done with
    --lewi-mpi-calls for MPI and --lewi-barrier or
    --lewi-barrier-select for DLB Barriers.

Version 3.3.1

18 May 08:29

Fixed

  • Fixed wrong generated code for MPI_Initialized and MPI_Finalized

Version 3.3

16 May 09:26

Added

  • Free agent and Role-shift OMPT thread managers to support LeWI with both
    implementations
  • Flag --ompt-thread-manager to select which OpenMP implementation to use
  • MPI Fortran 2008 bindings
  • TALP flag --talp-output-file to generate output files in different formats
  • New TALP collective functions to gather and compute metrics:
    DLB_TALP_CollectPOPMetrics and DLB_TALP_CollectNodeMetrics

Changed

  • libdlb_mpi.so and libdlb_mpi_*.so now contain both C and Fortran MPI symbols

Fixed

  • Fixed DROM pre-initialization if child had empty cpuset affinity
  • Fixed --lewi-max-parallelism
  • Fixed several TALP bugs
  • Fixed some finalization errors during MPI finalize
  • Fixed cpuset parsing when provided a non-contiguous mask

Version 3.2

20 Apr 16:40

Added

  • Flag --verbose to enable all verbose modes
  • Flag --talp-summary=pop-raw to print raw POP metrics
  • Flag --lewi-respect-cpuset to allow LeWI to use CPUs not yet registered

Changed

  • DROM can now steal all CPUs from one process
  • DROM can now inherit a subset of CPUs from another process
  • DLB_DROM_SetProcessMask to oneself no longer requires a DLB_pollDROM
  • DLB_Lend in OpenMP applications now invokes the OpenMP runtime to change
    the number of threads

Fixed

  • Fixed TALP regions enabled or registered only on some processes
  • Fixed minor option parsing