Update to work with Pydantic upgrade of mt-metadata #389
Open
kujaku11 wants to merge 57 commits into patches from pydantic
Conversation
local tests show: 30 failed, 46 passed, 41 warnings, 1 error in 360.48s (0:06:00)
Introduces a minimal conftest.py with fixtures for creating and cleaning up synthetic MTH5 files, and configures pytest to filter noisy warnings. Adds a pytest-based test that writes a TF object to a zrr file, reads it back, and asserts array equality, ensuring xdist safety and proper cleanup.
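A minimal sketch of that conftest pattern (the `make_synthetic_mth5` helper and the warning category are placeholders, not the actual Aurora/MTH5 API):

```python
# conftest.py -- sketch of a per-test synthetic MTH5 fixture plus warning filtering.
import pytest


def make_synthetic_mth5(h5_path):
    """Placeholder: write a synthetic MTH5 file to h5_path."""
    h5_path.touch()


@pytest.fixture
def synthetic_mth5(tmp_path):
    """Create a synthetic MTH5 file for one test and remove it afterwards."""
    h5_path = tmp_path / "synthetic.h5"
    make_synthetic_mth5(h5_path)
    yield h5_path
    if h5_path.exists():
        h5_path.unlink()


def pytest_configure(config):
    # Filter the noisiest warning categories during the run (example category only).
    config.addinivalue_line("filterwarnings", "ignore::DeprecationWarning")
```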
Added new pytest-based synthetic tests for Aurora and MTH5 processing, including feature weighting, Fourier coefficients, decimation, STFT agreement, and frequency band definition. Enhanced conftest.py with fixtures for synthetic test paths, file cleanup, and monkeypatching to sanitize provenance comments, improving test isolation and reliability.
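A sketch of the provenance-sanitizing monkeypatch; the patched module and attribute are stand-ins, only the autouse-fixture pattern reflects the change:

```python
import platform
import types

import pytest

# Stand-in for the module whose provenance comment embeds host-specific text.
provenance = types.SimpleNamespace(
    build_comment=lambda: f"created on {platform.node()}"
)


@pytest.fixture(autouse=True)
def sanitize_provenance_comments(monkeypatch):
    """Pin the provenance comment so files written by different workers compare equal."""
    monkeypatch.setattr(provenance, "build_comment", lambda: "synthetic test data")


def test_comment_is_stable():
    assert provenance.build_comment() == "synthetic test data"
```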
Updated tests to use the same num_samples_window value when manually specifying band_edges, ensuring alignment with FFT harmonics and consistent transfer function results. Also removed unnecessary skip/xfail markers from pytest-based tests.
Introduces worker-safe pytest fixtures for synthetic MTH5 test files, replacing direct calls to file creation functions in tests. Updates processing helpers and all synthetic tests to accept or use these fixtures, improving test isolation, parallelism, and reliability. Also adds support for passing custom MTH5 file paths to processing helpers.
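A minimal sketch of a worker-safe fixture: each pytest-xdist worker gets its own directory of synthetic files, so parallel tests never share a writable HDF5 handle. The commented creation helper is hypothetical; `worker_id` is the fixture pytest-xdist provides.

```python
import pytest


@pytest.fixture(scope="session")
def synthetic_test_paths(tmp_path_factory, worker_id):
    """Per-worker directory for synthetic MTH5 files."""
    worker_dir = tmp_path_factory.mktemp(f"synthetic_{worker_id}")
    # build_synthetic_mth5_files(worker_dir)  # hypothetical file-creation step
    return worker_dir
```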
Removed unittest-based synthetic test modules and replaced them with pytest equivalents for metadata and multi-run tests. This improves test maintainability and integration with modern Python testing workflows.
Removed the unittest-based transfer function kernel test and added comprehensive pytest suites for ApodizationWindow and WindowingScheme classes. The new tests cover window generation, properties, taper families, sliding window operations, FFT, edge cases, and integration workflows, improving coverage and compatibility with pytest-xdist.
Introduces comprehensive tests for nan_to_mean, handle_nan, and time_axis_match functions in aurora.time_series.xarray_helpers. Covers edge cases, multiple channels, time axis mismatches, and data integrity, optimized for pytest-xdist parallel execution.
Deleted test_apodization_window.py, test_windowing_scheme.py, and test_xarray_helpers.py from the tests/time_series directory. These files contained unit tests for apodization windows, windowing schemes, and xarray helpers, respectively.
Introduces a comprehensive pytest test suite for the aurora.transfer_function.cross_power module. Tests cover channel name generation, transfer function computation, mathematical properties, edge cases, data integrity, numerical stability, return value characteristics, and consistency across calls. Optimized for parallel execution with pytest-xdist.
Added a comprehensive pytest suite for aurora.transfer_function.regression.helper_functions covering rme_beta, simple_solve_tf, and direct_solve_tf, including edge cases, mathematical properties, and data integrity. Removed the unittest-based cross_power test file to focus on regression testing for helper functions.
Introduces a comprehensive pytest suite for the RegressionEstimator base class, covering initialization, OLS estimation, QR decomposition, underdetermined systems, input type handling, xarray conversion, data validation, numerical stability, edge cases, data integrity, deterministic behavior, mathematical properties, and return value checks. These tests ensure correctness, robustness, and compatibility with various data types and scenarios.
Deleted test_base.py and test_helper_functions.py from tests/transfer_function/regression. These files contained unit tests for regression estimators and helper functions, possibly as part of a test suite cleanup or migration.
Introduces a new, fully refactored Parkfield test suite in tests/parkfield/test_parkfield_pytest.py, organized into multiple test classes with 25+ focused tests covering calibration, single-station and remote-reference processing, data integrity, and numerical validation. Adds extensive reusable fixtures to tests/conftest.py for efficient resource management and pytest-xdist compatibility. Includes a detailed REFACTORING_SUMMARY.md documenting the migration from three monolithic test files to a single, maintainable, parallelizable suite with improved coverage.
Updated attribute names from station/ch1/ch2 to station_1/channel_1/channel_2 in feature_weights.py and related test code for consistency. Improved logging for feature type and validation. Adjusted test deserialization logic to handle nested dicts and removed xfail marker from feature weighting test.
Updated docs/examples/dataset_definition.ipynb to use Windows-style paths, added 'channel_nomenclature.keyword', replaced nulls with empty strings, and changed 'units' from 'counts' to 'digital counts'. Also updated import paths, output examples, and warning messages for better Windows compatibility and current metadata conventions. Dropped Python 3.9 from test matrix in .github/workflows/tests.yaml.
Ensure CONFIG_PATH directory exists before saving JSON configs in make_processing_configs.py. Update test_decimation_methods_agree and test_stft_methods_agree to accept synthetic_test_paths argument for improved test setup.
Removed unnecessary close_open_files calls from test fixtures and helpers. Updated windowing scheme fixture to use actual data length. Improved exception handling in single-station processing test. Skipped EMTFXML export test due to known bug and clarified skip reason. Updated kernel dataset structure tests to check DataFrame contents instead of attributes. Refined numerical validation tests to check only impedance elements and verify transfer function shape using DataArray dimensions. Minor docstring and comment improvements for clarity.
Refactored comparison_plots.py to always show plots and close figures after saving, with improved logging. Commented out warning filters in conftest.py to allow all warnings during tests. Added a note in test_parkfield_pytest.py to implement impedance comparison tests.
Set matplotlib to non-interactive 'Agg' backend in test configuration to prevent blocking during tests. Refactor Parkfield MTH5 test fixtures to create and cache a master file once per session, then copy it to worker-specific directories for parallel test execution, reducing redundant downloads and avoiding file handle conflicts.
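A sketch of the backend switch plus copy-per-worker pattern; the file creation step is a placeholder, only the fixture structure reflects the change:

```python
import shutil

import matplotlib
import pytest

matplotlib.use("Agg")  # headless backend: plotting calls never block the test run


@pytest.fixture(scope="session")
def parkfield_h5_master(tmp_path_factory):
    """Build (or download) the Parkfield MTH5 file once per test session."""
    master = tmp_path_factory.mktemp("parkfield_master") / "parkfield.h5"
    master.touch()  # stand-in for the real download/creation step
    return master


@pytest.fixture
def parkfield_h5_path(parkfield_h5_master, tmp_path):
    """Hand each test its own copy so xdist workers never share a file handle."""
    copy = tmp_path / parkfield_h5_master.name
    shutil.copy(parkfield_h5_master, copy)
    return copy
```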
Introduced class-scoped pytest fixtures to process transfer functions once per test class and reuse them across multiple tests. This reduces redundant processing and significantly speeds up test execution, especially for expensive operations. Updated all relevant tests to use the new fixtures instead of reprocessing data.
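A sketch of the class-scoped caching pattern: the expensive call runs once per test class and its result is shared by every test method. `process_transfer_function` is a placeholder for the real processing entry point.

```python
import pytest


def process_transfer_function():
    """Placeholder for the real, expensive processing call."""
    return {"tf": "processed"}


@pytest.fixture(scope="class")
def processed_tf():
    """Run the expensive processing once per test class and share the result."""
    return process_transfer_function()


class TestSingleStationResults:
    def test_result_exists(self, processed_tf):
        assert processed_tf is not None

    def test_result_reused(self, processed_tf):
        # Same object as above: the fixture ran only once for this class.
        assert processed_tf["tf"] == "processed"
```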
Changed the scope of the 'parkfield_kernel_dataset_ss' and 'parkfield_kernel_dataset_rr' pytest fixtures to 'class' to optimize test setup and teardown for these resources.
Deleted test_calibrate_parkfield.py, test_process_parkfield_run.py, and test_process_parkfield_run_rr.py as part of test suite cleanup. Updated test_parkfield_pytest.py to set fixture scope to 'class' for config_ss and config_rr to improve test performance.
Refactored tests in test_processing_pytest.py and test_multi_run_pytest.py to use class-scoped pytest fixtures, caching expensive processing calls and reducing redundant computation. This significantly decreases CI runtime by sharing processed results across related tests, while maintaining test coverage and compatibility with parallel execution. Added documentation and comments to clarify which tests cannot be optimized due to inherent requirements.
Introduces methods to numerically compare transfer functions, sigma_e, and sigma_s between ZFile objects, including interpolation for mismatched periods. Updates the Parkfield test to assert transfer function similarity using the new comparison utility. There is still an issue with differing filters and channel metadata that causes the transfer function to be incorrect.
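The period-interpolation idea, sketched with plain numpy (these are not the actual ZFile method names):

```python
import numpy as np


def interpolate_tf_to_periods(periods_src, tf_src, periods_target):
    """Interpolate a complex TF column onto a target period grid."""
    real = np.interp(periods_target, periods_src, tf_src.real)
    imag = np.interp(periods_target, periods_src, tf_src.imag)
    return real + 1j * imag


periods_a = np.array([1.0, 2.0, 4.0, 8.0])
tf_a = np.array([1 + 1j, 2 + 0.5j, 3 - 1j, 4 + 2j])
periods_b = np.array([1.5, 3.0, 6.0])
print(interpolate_tf_to_periods(periods_a, tf_a, periods_b))
```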
If min_num_stft_windows is set to None, the value becomes 0, and the logic then allows 0 windows per decimation level, which raises an error. Set the default to 1, which eliminates the zero-window case.
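A sketch of the guard (the function is illustrative, not the actual Aurora config object):

```python
from typing import Optional


def resolve_min_num_stft_windows(value: Optional[int]) -> int:
    """Coerce None (or anything falsy) to the new default of 1."""
    return value if value else 1


assert resolve_min_num_stft_windows(None) == 1
assert resolve_min_num_stft_windows(0) == 1
assert resolve_min_num_stft_windows(4) == 4
```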
Update the parkfield_h5_master fixture to cache the Parkfield MTH5 file in a persistent directory (~/.cache/aurora/parkfield) instead of a temporary directory. This avoids repeated downloads across test sessions and improves test efficiency. The parkfield_h5_path fixture is updated to reflect this persistent caching approach.
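A sketch of the persistent-cache version of that fixture; the download step is a placeholder, only the cache location and skip-if-present logic reflect the change:

```python
from pathlib import Path

import pytest

CACHE_DIR = Path.home() / ".cache" / "aurora" / "parkfield"


@pytest.fixture(scope="session")
def parkfield_h5_master():
    """Reuse the cached Parkfield MTH5 across sessions; build it only if missing."""
    CACHE_DIR.mkdir(parents=True, exist_ok=True)
    h5_path = CACHE_DIR / "parkfield.h5"
    if not h5_path.exists():
        h5_path.touch()  # stand-in for the real download step
    return h5_path
```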
Introduces a vectorized implementation of the pass_band function in mt_metadata/timeseries/filters/filter_base.py using numpy stride tricks for significant performance improvement. Adds detailed profiling, analysis, and optimization documentation, including benchmarking scripts, performance summaries, and an automated application script under tests/parkfield/. Also includes profiling data and supporting files to validate and communicate the optimization impact.
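A generic illustration of the stride-tricks idea behind the vectorized pass_band, building all overlapping windows of a response curve at once instead of looping in Python; this is not the mt_metadata implementation itself, and the flatness threshold is arbitrary.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

amplitude = np.abs(np.fft.rfft(np.random.default_rng(0).normal(size=1024)))
window = 5

# One (n_windows, window) view over the curve, no copies, then a vectorized flatness test.
windows = sliding_window_view(amplitude, window)
flat = np.ptp(windows, axis=1) < 0.1 * windows.mean(axis=1)
print(f"{flat.sum()} of {len(flat)} windows look flat enough to be a pass band")
```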
Updated the TransferFunctionKernel to set survey metadata only if not already present and to use the Survey object from the dataset. Also changed the way runs_processed is determined, now using unique runs from the dataset dataframe. Minor formatting and comment improvements were made in process_mth5.py.
Introduces tests/synthetic/test_fourier_coefficients_discrete.py with comprehensive discrete tests for the Fourier Coefficients workflow, including file validation, FC creation, storage, and processing for various synthetic MTH5 test files. Also updates test_fourier_coefficients_pytest.py to comment out test1 from the test file paths.
Expanded the synthetic Fourier Coefficient test to include detailed subtests for file validation, RunSummary, KernelDataset, config creation, FC addition, readback, and processing. Added error logging and explicit KeyError in transfer_function_kernel.py for missing channel components. Updated triage utility to also triage processed_date for more robust TF comparison.
Deleted test_fourier_coefficients_discrete.py and replaced its coverage by refactoring test_fourier_coefficients_pytest.py. The new test uses pytest parameterization to run the Fourier Coefficient workflow for each synthetic MTH5 file, improving maintainability and enabling parallel execution. All validation and workflow steps are now consolidated in a single, parameterized test.
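A sketch of the parameterized workflow test, reusing the synthetic_test_paths fixture sketched earlier; the file names and workflow steps are examples only.

```python
import pytest

SYNTHETIC_FILES = ["test1.h5", "test2.h5", "test12rr.h5"]  # example names only


@pytest.mark.parametrize("mth5_file", SYNTHETIC_FILES)
def test_fourier_coefficient_workflow(mth5_file, synthetic_test_paths):
    """One test case per synthetic file; pytest-xdist distributes them across workers."""
    h5_path = synthetic_test_paths / mth5_file
    # 1. validate the file, 2. add Fourier coefficients, 3. read them back,
    # 4. process the transfer function -- each step would assert on h5_path.
    assert h5_path.suffix == ".h5"
```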
Introduces a new pytest-based test suite for the MATLAB Z-file reader, including fixtures, parameterized tests, and integration tests. Also adds a sample .zrr test file for use in testing. Minor docstring formatting fix in triage.py.
Updated the test workflow to run pytest with automatic parallelization using pytest-xdist. Added pytest-xdist, pytest-subtests, and pytest-benchmark to the test dependencies in pyproject.toml to support parallel testing, subtests, and benchmarking.
mt-metadata has been updated to use Pydantic under the hood. This PR updates Aurora to operate with the updated versions of mt-metadata and mth5.