
Tags: tpatki/ovis


OVIS-4.3.8

OVIS-4.3.8

* Numerous bug fixes
* Multi-threaded low-level Zap transport event handlers
* Command line option support in configuration files
* Summary set, transport, producer, and thread statistics
* Kokkos Appmon store
* Darshan store
* Non-blocking event logging
* Netlink notifier stream sampler

v4.3.8

fix build for boost-dependence when not in /usr

OVIS-4.3.7

This is OVIS-4.3.7 Release

New Features:
* Improved LDMSD Streams Performance
* Improved ib_verbs backward compatibility
* Per-device procnet sampler
* Per-device ibmad sampler
* AMD GPU sampler
* Per-mount Lustre samplers
* Various reliability and resiliency improvements

Fixes:
* LDMSD Streams Memory Leak fixes
* Resolved confusing uGNI error messages on exit
* Fixed store rename issues in CSV store

OVIS-4.3.6

OVIS-4.3.6 Release tag

Features:

* prdcr_stat command to report ldmsd producer statistics
* set_stat command to report active ldmsd set counts and memory usage
* Support for multi-step Slurm jobs in the PAPI sampler
  - The app_id in the metric set is now the step id.
* Partial support for multi-step Slurm jobs in the Slurm sampler
  - The app_id in the metric set is now the step id.
* TimescaleDB storage plugin

Bug Fixes:

* Fix spinning IO thread bug in the socket transport
* Fix build failure for older OFA (ib_verbs) libraries
* Fix build failure for missing openssl when auth enabled
* Fix use-after-free bug in RBD cleanup
* Fix RBD leak in the set delete path
* Fix potential deadlock in Zap RDMA

OVIS-4.3.5

This is the OVIS-4.3.5 G/A Release

This release includes the following features and fixes:

* Compatibility with OVIS-4.3.3 and OVIS-4.3.4
* Support for the Maestro load balancer
* Allow root user to access ldmsd configuration objects
  regardless of euid/egid of the process
* Zap socket performance improvements
* Zap fabric performance and resiliency improvements
* Zap RDMA support for OmniPath
* Zap uGNI resiliency improvements
* Fix LDMS Streams Service data loss on process exit
* Metric set permission handling improvements
* Fixes for memory leaks and uninitialized data found by
  static analysis tools
* Numerous build and packaging improvements

OVIS-4.3.4

This is the OVIS-4.3.4 G/A Release

Significant testing has been done on the socket, RDMA, and uGNI transports,
with socket and uGNI scaling to three levels of aggregation and
30,000 sets in the aggregate.

The RDMA transport has been tested to a few thousand sets.

The fabric transport should be considered Alpha and is suitable
for development, but not deployment at this time.

This release includes the following new features:

* LDMS Transport performance statistics (ldmsd_controller xprt_stats command)
* Zap Thread utilization tracking (ldmsd_controller thread_stats command)
* uGNI resiliency improvements to aid with resource error handling
* Packaging updates and GitHub automation to help with tarball generation and release tagging
* A reference counting service has been implemented that supports 'named references'. In debug mode (when REF_TRACK is defined), each reference is tracked by function name and line number when it is taken and when it is released, and an individual reference count is kept for each name. This makes it easier to debug reference tracking during development.
* The new ref_t reference counting mechanism has been added to struct ldms_set and struct ldms_rbuf_desc in support of a robust set-delete capability
* An "end-to-end" protocol has been added for deleting metric sets. When an ldmsd deletes a set, each peer that has a memory handle on the set is notified. The set resources are not freed until all peers acknowledge that they have received the delete notification.
* A service (zap_zerr2errno) has been added to consistently map Zap errors to Unix errno
* Updates to the lustre2_client sampler to support newer versions of Lustre
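
The 'named references' idea from the list above can be illustrated with a small sketch. This is not the ref_t API from the OVIS sources (which are written in C); the class `NamedRefCounter`, its methods, and the `track` flag (standing in for a build with REF_TRACK defined) are hypothetical names chosen for illustration only.

```python
import inspect


class NamedRefCounter:
    """Toy named reference counter (hypothetical; not the OVIS ref_t API)."""

    def __init__(self, on_free, track=False):
        self._counts = {}        # one reference count per name
        self._on_free = on_free  # called once when the total drops to zero
        self._track = track      # stands in for a REF_TRACK debug build
        self.history = []        # (op, name, function, line) tuples
        self.freed = False

    def _record(self, op, name):
        if not self._track:
            return
        # Two frames up: the code that called get() or put().
        frame = inspect.currentframe().f_back.f_back
        self.history.append((op, name, frame.f_code.co_name, frame.f_lineno))

    def count(self, name):
        return self._counts.get(name, 0)

    def total(self):
        return sum(self._counts.values())

    def get(self, name):
        if self.freed:
            raise RuntimeError("get on an object that was already freed")
        self._counts[name] = self.count(name) + 1
        self._record("get", name)

    def put(self, name):
        if self.count(name) <= 0:
            raise RuntimeError(f"put of '{name}' without a matching get")
        self._counts[name] -= 1
        self._record("put", name)
        if self.total() == 0:
            self.freed = True
            self._on_free()
```

Keeping a count per name means an unbalanced release shows up against the specific name that was mishandled, rather than as an anonymous off-by-one in a single counter.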

OVIS-4.3.4-beta.1

This is the OVIS-4.3.4 release tag

OVIS-4.3.4-alpha.1

This release includes the following updates and fixes:

* Packaging updates and GitHub automation to help with tarball generation and release tagging
* Fixes for issues found by static analysis tools
* Fixed a memory leak in the JSON parser that, on the socket transport, could leak as much as 1 MB per message
* A service (zap_zerr2errno) has been added to consistently map Zap errors to Unix errno
* A reference counting service has been implemented that supports 'named references'. In debug mode (when REF_TRACK is defined), each reference is tracked by function name and line number when it is taken and when it is released, and an individual reference count is kept for each name. This makes it easier to debug reference tracking during development.
* The new ref_t reference counting mechanism has been added to struct ldms_set and struct ldms_rbuf_desc in support of a robust set-delete capability
* An "end-to-end" protocol has been added for deleting metric sets. When an ldmsd deletes a set, each peer that has a memory handle on the set is notified. The set resources are not freed until all peers acknowledge that they have received the delete notification.
* LDMS transport 'telemetry' data has been added that tracks statistics on the primary transport operations DIR, LOOKUP, UPDATE, SEND, and RECV. The intent is to determine when an ldmsd becomes overloaded or underutilized.
* Zap uGNI Transport fixes
  * Ensure socket is closed in uGNI transport
  * Destroy the Cdm in the uGNI transport
  * Refactor Zap uGNI disconnect path
  * Aggressively flush incomplete RdmaPost descriptors
  * Add more detailed error handling in Zap uGNI
  * Added a thread to subscribe to and report errors on the uGNI transport.
  * Make certain that GNI_EpUnbind does not fail. This ensures that NTT resources held by the endpoint are released.
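
The "end-to-end" set delete protocol described in the list above can be sketched as a small state tracker: every peer holding a handle is notified, and the set's resources are released only after the last acknowledgement arrives. This is a minimal illustration under stated assumptions, not the ldmsd implementation; `SetDeleteTracker`, the `notify` and `free` callbacks, and the peer names are all hypothetical.

```python
class SetDeleteTracker:
    """Toy model of an acknowledged set delete (hypothetical names)."""

    def __init__(self, set_name, peers, notify, free):
        self.set_name = set_name
        self.pending = set(peers)  # peers that still hold a handle on the set
        self._notify = notify      # callback: notify(peer, set_name)
        self._free = free          # callback: free(set_name)
        self.deleting = False
        self.freed = False

    def delete(self):
        """Start the delete: notify every peer, free at once if there are none."""
        self.deleting = True
        for peer in sorted(self.pending):
            self._notify(peer, self.set_name)
        self._maybe_free()

    def ack(self, peer):
        """Record that a peer acknowledged the delete notification."""
        if not self.deleting:
            raise RuntimeError("ack received before delete was requested")
        self.pending.discard(peer)
        self._maybe_free()

    def _maybe_free(self):
        # Resources are released exactly once, after the last acknowledgement.
        if self.deleting and not self.pending and not self.freed:
            self.freed = True
            self._free(self.set_name)
```

In this sketch a duplicate or late acknowledgement is harmless, because `discard` ignores unknown peers and the `freed` flag guards against releasing the set twice.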

OVIS-4.3.4-alpha.0

base tag of 4.3.4 for git-enabled versioning

OVIS-4.3.3

Fix compilation warnings for `-O3 -Wall -Werror`