Releases: LLNL/merlin
Version 2.0.0b2
[2.0.0b2]
Fixed
- Bug where the worker launch using tcsh shell wasn't working
Version 2.0.0b1
[2.0.0b1]
Added
- New `merlin database` command to interact with new database functionality
  - When running locally, SQLite will be used as the database. Otherwise your current results backend will be used
  - `merlin database info`: prints some basic information about the database
  - `merlin database get`: allows you to retrieve and print entries in the database
  - `merlin database delete`: allows you to delete entries in the database
- Added `db_scripts/` folder containing several new files all pertaining to database interaction
  - `data_models`: a module that houses dataclasses that define the format of the data that's stored in Merlin's database
  - `db_commands`: an interface for user commands of `merlin database` to be processed
  - `merlin_db`: houses the `MerlinDatabase` class, used as the main point of contact for interactions with the database
  - `entities/`: a folder containing modules that define a structured interface for interacting with persisted data
  - `entity_managers/`: a folder containing classes responsible for managing high-level database operations across all entities
- Added `backends/` folder containing a new OOP way to interact with results backend databases
  - `results_backend`: houses an abstract class `ResultsBackend` that defines what every supported backend must implement in Merlin
  - `redis/`: a folder containing the `RedisBackend` class that defines specific interactions with the Redis database
  - `sqlite/`: a folder containing the `SQLiteBackend` class that defines specific interactions with the SQLite database
  - `backend_factory`: houses a factory class `MerlinBackendFactory` that initializes an appropriate `ResultsBackend` instance
- Added `monitors/` folder containing a refactored, OOP approach to handling the `merlin monitor` command
  - `celery_monitor`: houses the `CeleryMonitor` class, a concrete subclass of `TaskServerMonitor` for monitoring Celery task servers
  - `monitor_factory`: houses a factory class `MonitorFactory` that initializes an appropriate `TaskServerMonitor` instance
  - `monitor`: houses the `Monitor` class, used as the top-level point of interaction for the monitor command
  - `task_server_monitor`: houses the `TaskServerMonitor` ABC class, which serves as a common interface for monitoring task servers
- New worker-related classes:
  - `MerlinWorker`: base class for defining task server workers
  - `CeleryWorker`: implementation of `MerlinWorker` specifically for Celery workers
  - `WorkerFactory`: to help determine which task server worker to use
  - `MerlinWorkerHandler`: base class for managing launching, stopping, and querying multiple workers
  - `CeleryWorkerHandler`: implementation of `MerlinWorkerHandler` specifically for managing Celery workers
  - `WorkerHandlerFactory`: to help determine which task server handler to use
- A new celery task called `mark_run_as_complete` that is automatically added to the task queue associated with the final step in a workflow
- Ability to filter database queries for the `get all-*` and `delete all-*` commands
- New `MerlinBaseFactory` class to help enable future plugins for backends, monitors, status renderers, etc.
- Ability to turn off the auto-restart functionality of the monitor with `--no-restart`
- Tests for the monitor files
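
The abstract-backend-plus-factory arrangement described in the `backends/` entry above can be pictured with a minimal sketch. This is not Merlin's code: the method names (`save`, `retrieve`, `delete`), the single-table schema, and the `create` classmethod are all assumptions chosen for illustration, and only a SQLite implementation is shown because it needs nothing beyond the standard library.

```python
import sqlite3
from abc import ABC, abstractmethod
from typing import Dict, Optional, Type


class ResultsBackend(ABC):
    """Abstract interface every backend must implement (method names are hypothetical)."""

    @abstractmethod
    def save(self, key: str, value: str) -> None: ...

    @abstractmethod
    def retrieve(self, key: str) -> Optional[str]: ...

    @abstractmethod
    def delete(self, key: str) -> None: ...


class SQLiteBackend(ResultsBackend):
    """Stores entries in one key/value table (hypothetical schema)."""

    def __init__(self, path: str = ":memory:") -> None:
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS entries (key TEXT PRIMARY KEY, value TEXT)"
        )

    def save(self, key: str, value: str) -> None:
        with self._conn:  # commits on success, rolls back on error
            self._conn.execute(
                "INSERT OR REPLACE INTO entries (key, value) VALUES (?, ?)", (key, value)
            )

    def retrieve(self, key: str) -> Optional[str]:
        row = self._conn.execute(
            "SELECT value FROM entries WHERE key = ?", (key,)
        ).fetchone()
        return row[0] if row else None

    def delete(self, key: str) -> None:
        with self._conn:
            self._conn.execute("DELETE FROM entries WHERE key = ?", (key,))


class BackendFactory:
    """Maps a backend name to a class and instantiates it."""

    _registry: Dict[str, Type[ResultsBackend]] = {"sqlite": SQLiteBackend}

    @classmethod
    def create(cls, name: str) -> ResultsBackend:
        try:
            return cls._registry[name]()
        except KeyError:
            raise ValueError(f"Unsupported backend: {name!r}") from None


backend = BackendFactory.create("sqlite")
backend.save("run-1", "complete")
```

Callers depend only on the abstract interface, so adding another backend in this sketch would mean writing one more subclass and registering it under a new name.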
Changed
- Maestro version requirement is now at minimum 1.1.10 for status renderer changes
- The `BackendFactory`, `MonitorFactory`, and `StatusRendererFactory` classes all now inherit from `MerlinBaseFactory`
- Launching workers is now handled through worker classes rather than functions in the `celeryadapter.py` file
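
One way to picture several factories sharing a common base is a small registry class that subclasses populate. The sketch below is an assumption about the general pattern only; `BaseFactory`, `register`, `list_available`, and `create` are hypothetical names, and the `CeleryMonitor` here is an empty stand-in rather than Merlin's class.

```python
from typing import Any, Callable, Dict, List


class BaseFactory:
    """Generic name -> constructor registry (illustrative, not Merlin's implementation)."""

    def __init__(self) -> None:
        self._registry: Dict[str, Callable[..., Any]] = {}

    def register(self, name: str, constructor: Callable[..., Any]) -> None:
        # Names are stored lowercased so lookups are case-insensitive.
        self._registry[name.lower()] = constructor

    def list_available(self) -> List[str]:
        return sorted(self._registry)

    def create(self, name: str, *args: Any, **kwargs: Any) -> Any:
        key = name.lower()
        if key not in self._registry:
            raise ValueError(
                f"Unknown component {name!r}; available: {self.list_available()}"
            )
        return self._registry[key](*args, **kwargs)


class CeleryMonitor:
    """Stand-in for a concrete monitor class."""

    def __init__(self, interval: int = 60) -> None:
        self.interval = interval


class MonitorFactory(BaseFactory):
    """A concrete factory only needs to register what it can build."""

    def __init__(self) -> None:
        super().__init__()
        self.register("celery", CeleryMonitor)


monitor = MonitorFactory().create("Celery", interval=30)
```

Because lookup and error handling live in the base class, a plugin in this sketch would only call `register` with its own name and constructor.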
Version 1.13.0
[1.13.0]
Added
- API documentation for Merlin's core codebase
- Added support for Python 3.12 and 3.13
- Added additional tests for the `merlin run` and `merlin purge` commands
- Aliased types to represent different types of pytest fixtures
- New test condition `StepFinishedFilesCount` to help search for `MERLIN_FINISHED` files in output workspaces
- Added "Unit-tests" GitHub action to run the unit test suite
- Added `CeleryTaskManager` context manager to the test suite to ensure tasks are safely purged from queues if tests fail
- Added `command-tests`, `workflow-tests`, and `integration-tests` to the Makefile
- Added tests and docs for the new `merlin config` options
- Python 3.8 now requires `orderly-set==5.3.0` to avoid a bug with the deepdiff library
- New step 'Reinstall pip to avoid vendored package corruption' to CI workflow jobs that use pip
- New GitHub actions to reduce common code in CI
- COPYRIGHT file for ownership details
- New check for copyright headers in the Makefile
- A page in the docs explaining the `feature_demo` example
- Unit tests for the `spec/` folder
- Batch block now supports placeholder entries
- Automatic task retry for Celery's `BackendStoreError`
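
The idea behind a test condition that searches for `MERLIN_FINISHED` files can be sketched with a recursive file count. The function name `count_finished_files` and the workspace layout used in the demo (one directory per step, one subdirectory per sample) are assumptions for illustration, not the `StepFinishedFilesCount` implementation.

```python
import tempfile
from pathlib import Path


def count_finished_files(workspace: Path, step_name: str) -> int:
    """Count MERLIN_FINISHED marker files anywhere below a step's directory."""
    step_dir = workspace / step_name
    return sum(1 for _ in step_dir.rglob("MERLIN_FINISHED"))


# Build a throwaway workspace: three finished samples and one unfinished one.
with tempfile.TemporaryDirectory() as tmp:
    workspace = Path(tmp)
    for sample in ("00", "01", "02"):
        sample_dir = workspace / "hello" / sample
        sample_dir.mkdir(parents=True)
        (sample_dir / "MERLIN_FINISHED").touch()
    (workspace / "hello" / "03").mkdir()

    finished = count_finished_files(workspace, "hello")
```

A test could then compare `finished` against the number of samples it expects the step to have completed.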
Changed
- The `merlin config` command:
  - Now defaults to the LaunchIT setup
  - The configuration file is no longer required to be named `app.yaml`
  - New subcommands:
    - `create`: Creates a new configuration file
    - `update-broker`: Updates the `broker` section of the configuration file
    - `update-backend`: Updates the `results_backend` section of the configuration file
    - `use`: Points your active configuration to a new configuration file
- The `merlin server` command no longer modifies the `~/.merlin/app.yaml` file by default. Instead, it modifies the `./merlin_server/app.yaml` file.
- Dropped support for Python 3.7
- Celery settings have been updated to try to improve resiliency
- Ported all distributed tests of the integration test suite to pytest
  - There is now a `commands/` directory and a `workflows/` directory under the integration suite to house these tests
  - Removed the "Distributed-tests" GitHub action as these tests will now be run under "Integration-tests"
- Removed `e2e-distributed*` definitions from the Makefile
- CI to use new actions
- Copyright headers in all files
  - These now point to the LICENSE and COPYRIGHT files
    - LICENSE: Legal permissions (e.g., MIT terms)
    - COPYRIGHT: Ownership, institutional metadata
- Make commands that change version/copyright year have been modified
- Refactored the `main.py` module so that it's broken into smaller, more-manageable pieces
Fixed
- Running Merlin locally no longer requires an `app.yaml` configuration file
- Removed dead lgtm link
- Potential security vulnerabilities related to logging
- Bug where the `--task-status` and `--return-code` filters of `merlin detailed-status` only accepted filters in all caps
- Bug where absolute path was required in the broker password field
- Bug where potential studies list was not alphabetically sorted when running `merlin status <yaml>`
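
Two of the fixes above, case-insensitive filters and an alphabetically sorted study list, come down to normalizing input before comparing it. The sketch below shows one plausible approach; the status names in `VALID_TASK_STATUSES` and both function names are assumptions for illustration and are not taken from Merlin's source.

```python
from typing import Iterable, List

# Hypothetical set of accepted values; the real list may differ.
VALID_TASK_STATUSES = ("INITIALIZED", "RUNNING", "FINISHED", "FAILED", "CANCELLED")


def normalize_filters(raw_filters: Iterable[str], valid: Iterable[str]) -> List[str]:
    """Upper-case each filter, reject unknown values, and drop duplicates."""
    valid_set = set(valid)
    normalized: List[str] = []
    for item in raw_filters:
        candidate = item.strip().upper()
        if candidate not in valid_set:
            raise ValueError(f"Invalid filter {item!r}; expected one of {sorted(valid_set)}")
        if candidate not in normalized:
            normalized.append(candidate)
    return normalized


def sort_studies(study_names: Iterable[str]) -> List[str]:
    """Sort study names alphabetically without regard to letter case."""
    return sorted(study_names, key=str.lower)


filters = normalize_filters(["failed", "Running", "FAILED"], VALID_TASK_STATUSES)
studies = sort_studies(["study_b", "Study_a", "study_c"])
```

Normalizing once at the boundary keeps the rest of the filtering logic free of case handling.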
Version 1.13.0b2
[1.13.0b2]
Added
- Ability to turn off the auto-restart functionality of the monitor with `--no-restart`
- Tests for the monitor files
Changed
- Refactored the `main.py` module so that it's broken into smaller, more-manageable pieces
Version 1.13.0b1
[1.13.0b1]
Added
- API documentation for Merlin's core codebase
- New `merlin database` command to interact with new database functionality
  - When running locally, SQLite will be used as the database. Otherwise your current results backend will be used
  - `merlin database info`: prints some basic information about the database
  - `merlin database get`: allows you to retrieve and print entries in the database
  - `merlin database delete`: allows you to delete entries in the database
- Added `db_scripts/` folder containing several new files all pertaining to database interaction
  - `data_models`: a module that houses dataclasses that define the format of the data that's stored in Merlin's database
  - `db_commands`: an interface for user commands of `merlin database` to be processed
  - `merlin_db`: houses the `MerlinDatabase` class, used as the main point of contact for interactions with the database
  - `entities/`: a folder containing modules that define a structured interface for interacting with persisted data
  - `entity_managers/`: a folder containing classes responsible for managing high-level database operations across all entities
- Added `backends/` folder containing a new OOP way to interact with results backend databases
  - `results_backend`: houses an abstract class `ResultsBackend` that defines what every supported backend must implement in Merlin
  - `redis/`: a folder containing the `RedisBackend` class that defines specific interactions with the Redis database
  - `sqlite/`: a folder containing the `SQLiteBackend` class that defines specific interactions with the SQLite database
  - `backend_factory`: houses a factory class `MerlinBackendFactory` that initializes an appropriate `ResultsBackend` instance
- Added `monitors/` folder containing a refactored, OOP approach to handling the `merlin monitor` command
  - `celery_monitor`: houses the `CeleryMonitor` class, a concrete subclass of `TaskServerMonitor` for monitoring Celery task servers
  - `monitor_factory`: houses a factory class `MonitorFactory` that initializes an appropriate `TaskServerMonitor` instance
  - `monitor`: houses the `Monitor` class, used as the top-level point of interaction for the monitor command
  - `task_server_monitor`: houses the `TaskServerMonitor` ABC class, which serves as a common interface for monitoring task servers
- A new celery task called `mark_run_as_complete` that is automatically added to the task queue associated with the final step in a workflow
- Added support for Python 3.12 and 3.13
- Added additional tests for the `merlin run` and `merlin purge` commands
- Aliased types to represent different types of pytest fixtures
- New test condition `StepFinishedFilesCount` to help search for `MERLIN_FINISHED` files in output workspaces
- Added "Unit-tests" GitHub action to run the unit test suite
- Added `CeleryTaskManager` context manager to the test suite to ensure tasks are safely purged from queues if tests fail
- Added `command-tests`, `workflow-tests`, and `integration-tests` to the Makefile
- Added tests and docs for the new `merlin config` options
- Python 3.8 now requires `orderly-set==5.3.0` to avoid a bug with the deepdiff library
- New step 'Reinstall pip to avoid vendored package corruption' to CI workflow jobs that use pip
- New GitHub actions to reduce common code in CI
- COPYRIGHT file for ownership details
- New check for copyright headers in the Makefile
Changed
- Updated the `merlin monitor` command
  - it will now attempt to restart workflows automatically if a workflow is hanging
  - it now uses an object-oriented approach in the backend
- Celery's default settings have been updated to add:
  - `interval_max: 300` -> tasks will retry for up to 5 minutes instead of the previous 1 minute
  - new `broker_transport_options`:
    - `socket_timeout: 300` -> increases the socket timeout to 5 minutes instead of the default 2 minutes
    - `retry_policy: {timeout: 600}` -> sets the maximum amount of time that Celery will keep trying to connect to the broker to 10 minutes
  - `broker_connection_timeout: 60` -> establishing a connection to the broker will now time out after a full minute instead of the previous 4 seconds
  - new generic backend settings:
    - `result_backend_always_retry: True` -> backend will now auto-retry in the event of recoverable exceptions
    - `result_backend_max_retries: 20` -> maximum number of retries in the event of recoverable exceptions
  - new Redis specific settings:
    - `redis_retry_on_timeout: True` -> retries read/write operations on TimeoutError to the Redis server
    - `redis_socket_connect_timeout: 300` -> 5 minute socket timeout for connections to Redis
    - `redis_socket_timeout: 300` -> 5 minute socket timeout for read/write operations to Redis
    - `redis_socket_keepalive: True` -> socket TCP keepalive to keep connections healthy to the Redis server
- The `merlin config` command:
  - Now defaults to the LaunchIT setup
  - The configuration file is no longer required to be named `app.yaml`
  - New subcommands:
    - `create`: Creates a new configuration file
    - `update-broker`: Updates the `broker` section of the configuration file
    - `update-backend`: Updates the `results_backend` section of the configuration file
    - `use`: Points your active configuration to a new configuration file
- The `merlin server` command no longer modifies the `~/.merlin/app.yaml` file by default. Instead, it modifies the `./merlin_server/app.yaml` file.
- Dropped support for Python 3.7
- Ported all distributed tests of the integration test suite to pytest
  - There is now a `commands/` directory and a `workflows/` directory under the integration suite to house these tests
  - Removed the "Distributed-tests" GitHub action as these tests will now be run under "Integration-tests"
- Removed `e2e-distributed*` definitions from the Makefile
- Modified GitHub CI to use shared testing servers hosted by LaunchIT rather than the jackalope server
- CI to use new actions
- Copyright headers in all files
  - These now point to the LICENSE and COPYRIGHT files
    - LICENSE: Legal permissions (e.g., MIT terms)
    - COPYRIGHT: Ownership, institutional metadata
- Make commands that change version/copyright year have been modified
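
The Celery values listed in this Changed section can be collected into a single mapping to make the timeouts easier to compare. The sketch below mirrors the nesting shown in the bullets; where each key ultimately sits in Merlin's Celery configuration is not stated in these notes, so the layout (in particular the top-level placement of `interval_max`) and the names `CELERY_RESILIENCY_SETTINGS` and `timeouts_in_minutes` are assumptions.

```python
from typing import Any, Dict

# Values copied from the release notes; the nesting is an assumption.
CELERY_RESILIENCY_SETTINGS: Dict[str, Any] = {
    "interval_max": 300,
    "broker_transport_options": {
        "socket_timeout": 300,
        "retry_policy": {"timeout": 600},
    },
    "broker_connection_timeout": 60,
    "result_backend_always_retry": True,
    "result_backend_max_retries": 20,
    "redis_retry_on_timeout": True,
    "redis_socket_connect_timeout": 300,
    "redis_socket_timeout": 300,
    "redis_socket_keepalive": True,
}


def timeouts_in_minutes(settings: Dict[str, Any]) -> Dict[str, float]:
    """Convert the second-based broker values called out in the notes to minutes."""
    transport = settings["broker_transport_options"]
    return {
        "interval_max": settings["interval_max"] / 60,
        "socket_timeout": transport["socket_timeout"] / 60,
        "retry_policy.timeout": transport["retry_policy"]["timeout"] / 60,
        "broker_connection_timeout": settings["broker_connection_timeout"] / 60,
    }


minutes = timeouts_in_minutes(CELERY_RESILIENCY_SETTINGS)
```

The converted values line up with the durations quoted in the bullets: 5 minutes, 5 minutes, 10 minutes, and 1 minute respectively.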
Fixed
- Running Merlin locally no longer requires an `app.yaml` configuration file
- Removed dead lgtm link
- Potential security vulnerabilities related to logging
Deprecated
- The `--steps` argument of the `merlin monitor` command is now deprecated and will be removed in Version 1.14.0.
Version 1.12.2
[1.12.2]
Added
- Conflict handler option to the `dict_deep_merge` function in `utils.py`
- Ability to add module-specific pytest fixtures
- Added fixtures specifically for testing status functionality
- Added tests for reading and writing status files, and status conflict handling
- Added tests for the `dict_deep_merge` function
- Pytest-mock as a dependency for the test suite (necessary for using mocks and fixtures in the same test)
- New github action test to make sure target branch has been merged into the source first, so we know histories are ok
- Check in the status commands to make sure we're not pulling statuses from nested workspaces
- Added `setuptools` as a requirement for python 3.12 to recognize the `pkg_resources` library
- Patch to celery results backend to stop ChordErrors being raised and breaking workflows when a single task fails
- New step return code `$(MERLIN_RAISE_ERROR)` to force an error to be raised by a task (mainly for testing)
  - Added description of this to docs
- New test to ensure a single failed task won't break a workflow
- Several new unit tests for the following subdirectories:
  - `merlin/common/`
  - `merlin/config/`
  - `merlin/examples/`
  - `merlin/server/`
- Context managers for the `conftest.py` file to ensure safe spin up and shutdown of fixtures
  - `RedisServerManager`: context to help with starting/stopping a redis server for tests
  - `CeleryWorkersManager`: context to help with starting/stopping workers for tests
- Ability to copy and print the `Config` object from `merlin/config/__init__.py`
- Equality method to the `ContainerFormatConfig` and `ContainerConfig` objects from `merlin/server/server_util.py`
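
A deep merge with a conflict handler, as mentioned for `dict_deep_merge` above, could look roughly like the following. This is a sketch of the general technique, not the function in `utils.py`: the name `deep_merge`, the handler signature `(old, new, key_path)`, and the rule that `dict_a` wins when no handler is given are all assumptions.

```python
from typing import Any, Callable, Dict, Optional

# Hypothetical handler signature: (existing value, incoming value, dotted key path).
ConflictHandler = Callable[[Any, Any, str], Any]


def deep_merge(
    dict_a: Dict[str, Any],
    dict_b: Dict[str, Any],
    conflict_handler: Optional[ConflictHandler] = None,
    path: str = "",
) -> Dict[str, Any]:
    """Merge dict_b into dict_a in place, recursing into nested dicts."""
    for key, new_value in dict_b.items():
        key_path = f"{path}.{key}" if path else str(key)
        if key not in dict_a:
            dict_a[key] = new_value
        elif isinstance(dict_a[key], dict) and isinstance(new_value, dict):
            deep_merge(dict_a[key], new_value, conflict_handler, key_path)
        elif dict_a[key] != new_value and conflict_handler is not None:
            dict_a[key] = conflict_handler(dict_a[key], new_value, key_path)
        # Without a handler, conflicting values keep the entry from dict_a.
    return dict_a


def resolve_status_conflict(old: Any, new: Any, key_path: str) -> Any:
    """Example policy: keep the larger restart count, otherwise take the new value."""
    if key_path.endswith("restarts"):
        return max(old, new)
    return new


current = {"step_1": {"status": "RUNNING", "restarts": 2}}
incoming = {"step_1": {"status": "FINISHED", "restarts": 1, "elapsed_time": "0d:00h:01m:05s"}}
merged = deep_merge(current, incoming, resolve_status_conflict)
```

Passing the dotted key path to the handler lets a single policy function treat different fields differently.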
Changed
- `merlin info` is cleaner and gives python package info
- merlin version now prints with every banner message
- Applying filters for `merlin detailed-status` will now log debug statements instead of warnings
- Modified the unit tests for the `merlin status` command to use pytest rather than unittest
- Added fixtures for `merlin status` tests that copy the workspace to a temporary directory so you can see exactly what's run in a test
- Batch block and workers now allow for variables to be used in node settings
- Task id is now the path to the directory
- Split the `start_server` and `config_server` functions of `merlin/server/server_commands.py` into multiple functions to make testing easier
- Split the `create_server_config` function of `merlin/server/server_config.py` into two functions to make testing easier
- Combined `set_snapshot_seconds` and `set_snapshot_changes` methods of `RedisConfig` into one method `set_snapshot`
Fixed
- Bugfix for output of `merlin example openfoam_wf_singularity`
- A bug with the CHANGELOG detection test when the target branch isn't in the ci runner history
- Link to Merlin banner in readme
- Issue with escape sequences in ascii art (caught by python 3.12)
- Bug where Flux wasn't identifying total number of nodes on an allocation
- Not supporting Flux versions below 0.17.0
Version 1.12.2b1
[1.12.2b1]
Added
- Conflict handler option to the `dict_deep_merge` function in `utils.py`
- Ability to add module-specific pytest fixtures
- Added fixtures specifically for testing status functionality
- Added tests for reading and writing status files, and status conflict handling
- Added tests for the `dict_deep_merge` function
- Pytest-mock as a dependency for the test suite (necessary for using mocks and fixtures in the same test)
- New github action test to make sure target branch has been merged into the source first, so we know histories are ok
- Check in the status commands to make sure we're not pulling statuses from nested workspaces
- Added `setuptools` as a requirement for python 3.12 to recognize the `pkg_resources` library
- Patch to celery results backend to stop ChordErrors being raised and breaking workflows when a single task fails
- New step return code `$(MERLIN_RAISE_ERROR)` to force an error to be raised by a task (mainly for testing)
  - Added description of this to docs
- New test to ensure a single failed task won't break a workflow
Changed
- `merlin info` is cleaner and gives python package info
- merlin version now prints with every banner message
- Applying filters for `merlin detailed-status` will now log debug statements instead of warnings
- Modified the unit tests for the `merlin status` command to use pytest rather than unittest
- Added fixtures for `merlin status` tests that copy the workspace to a temporary directory so you can see exactly what's run in a test
- Batch block and workers now allow for variables to be used in node settings
- Task id is now the path to the directory
Fixed
- Bugfix for output of `merlin example openfoam_wf_singularity`
- A bug with the CHANGELOG detection test when the target branch isn't in the ci runner history
- Link to Merlin banner in readme
- Issue with escape sequences in ascii art (caught by python 3.12)
- Bug where Flux wasn't identifying total number of nodes on an allocation
- Not supporting Flux versions below 0.17.0
Version 1.12.1
[1.12.1]
Added
- New Priority.RETRY value for the Celery task priorities. This will be the new highest priority.
- Support for the status command to handle multiple workers on the same step
- Documentation on how to run cross-node workflows with a containerized server (`merlin server`)
Changed
- Modified some tests in `test_status.py` and `test_detailed_status.py` to accommodate bugfixes for the status commands
Fixed
- Bugfixes for the status commands:
  - Fixed "DRY RUN" naming convention so that it outputs in the progress bar properly
  - Fixed issue where a step that was run with one sample would delete the status file upon condensing
  - Fixed issue where multiple workers processing the same step would break the status file and cause the workflow to crash
  - Added a catch for the JSONDecodeError that would potentially crash a run
  - Added a FileLock to the status write in `_update_status_file()` of `MerlinStepRecord` to avoid potential race conditions (potentially related to JSONDecodeError above)
  - Added in `export MANPAGER="less -r"` call behind the scenes for `detailed-status` to fix ASCII error
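
The JSONDecodeError catch mentioned above guards against reading a status file while another worker is part-way through writing it. A minimal standard-library sketch of that guard follows; the function name `read_status_file`, the fallback of returning an empty dict, and the file contents are assumptions, and the file lock described in the same list is not reproduced here.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict


def read_status_file(path: Path) -> Dict[str, Any]:
    """Load a JSON status file, treating a partially written file as empty."""
    try:
        with path.open("r", encoding="utf-8") as handle:
            return json.load(handle)
    except json.JSONDecodeError:
        # A truncated file usually means a concurrent write was in progress.
        return {}


with tempfile.TemporaryDirectory() as tmp:
    complete = Path(tmp) / "complete.json"
    complete.write_text(json.dumps({"step_1": {"status": "FINISHED"}}), encoding="utf-8")

    truncated = Path(tmp) / "truncated.json"
    truncated.write_text('{"step_1": {"status": "FINI', encoding="utf-8")

    good = read_status_file(complete)
    bad = read_status_file(truncated)
```

Catching the error only hides the symptom, which is presumably why a lock around the write was added as well.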
Version 1.12.0
[1.12.0]
Added
- A new command `merlin queue-info` that will print the status of your celery queues
  - By default this will only pull information from active queues
  - There are options to look for specific queues (`--specific-queues`), queues defined in certain spec files (`--spec`; this is the same functionality as the `merlin status` command prior to this update), and queues attached to certain steps (`--steps`)
  - Queue info can be dumped to outfiles with `--dump`
- A new command `merlin detailed-status` that displays task-by-task status information about your study
  - This has options to filter by return code, task queues, task statuses, and workers
  - You can set a limit on the number of tasks to display
  - There are 3 options to modify the output display
- Docs for all of the monitoring commands
- New file `merlin/study/status.py` dedicated to work relating to the status command
  - Contains the Status and DetailedStatus classes
- New file `merlin/study/status_renderers.py` dedicated to formatting the output for the detailed-status command
- New file `merlin/common/dumper.py` containing a Dumper object to help dump output to outfiles
- Study name and parameter info now stored in the DAG and MerlinStep objects
- Added functions to `merlin/display.py` that help display status information:
  - `display_task_by_task_status` handles the display for the `merlin detailed-status` command
  - `display_status_summary` handles the display for the `merlin status` command
  - `display_progress_bar` generates and displays a progress bar
- Added new methods to the MerlinSpec class:
  - get_worker_step_map()
  - get_queue_step_relationship()
  - get_tasks_per_step()
  - get_step_param_map()
- Added methods to the MerlinStepRecord class to mark status changes for tasks as they run (follows Maestro's StepRecord format mostly)
- Added methods to the Step class:
  - establish_params()
  - name_no_params()
- Added a property parameter_labels to the MerlinStudy class
- Added two new utility functions:
  - dict_deep_merge() that deep merges two dicts into one
  - ws_time_to_dt() that converts a workspace timestring (YYYYMMDD-HHMMSS) to a datetime object
- A new celery task `condense_status_files` to be called when sets of samples finish
- Added a celery config setting `worker_cancel_long_running_tasks_on_connection_loss` since this functionality is about to change in the next version of celery
- Tests for the Status and DetailedStatus classes
  - this required adding a decent amount of test files to help with the tests; these can be found under the tests/unit/study/status_test_files directory
- Pytest fixtures in the `conftest.py` file of the integration test suite
  - NOTE: an export command `export LC_ALL='C'` had to be added to fix a bug in the WEAVE CI. This can be removed when we resolve this issue for the `merlin server` command
- Tests for the `celeryadapter.py` module
- New CeleryTestWorkersManager context to help with starting/stopping workers for tests
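
The workspace timestring conversion described for `ws_time_to_dt()` in the list above maps directly onto `datetime.strptime`. The sketch below reuses that function name for readability but is only an illustration of the described behaviour; the helper `split_workspace_name` and the assumption that a workspace directory is named `<study name>_<timestring>` are not taken from these notes.

```python
from datetime import datetime
from typing import Tuple

# strptime pattern matching the YYYYMMDD-HHMMSS layout named in the notes.
WORKSPACE_TIME_FORMAT = "%Y%m%d-%H%M%S"


def ws_time_to_dt(ws_time: str) -> datetime:
    """Convert a workspace timestring (YYYYMMDD-HHMMSS) to a datetime object."""
    return datetime.strptime(ws_time, WORKSPACE_TIME_FORMAT)


def split_workspace_name(workspace: str) -> Tuple[str, datetime]:
    """Split '<study name>_<timestring>' on the last underscore (assumed layout)."""
    study_name, timestring = workspace.rsplit("_", 1)
    return study_name, ws_time_to_dt(timestring)


name, started = split_workspace_name("hello_samples_20240315-142530")
```

Splitting on the last underscore keeps study names that themselves contain underscores intact.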
Changed
- Reformatted the entire `merlin status` command
  - Now accepts both spec files and workspace directories as arguments
  - Removed the --steps flag
  - Replaced the --csv flag with the --dump flag
  - New functionality:
    - Shows step_by_step progress bar for tasks
    - Displays a summary of task statuses below the progress bar
- Split the `add_chains_to_chord` function in `merlin/common/tasks.py` into two functions:
  - `get_1d_chain` which converts a 2D list of chains into a 1D list
  - `launch_chain` which launches the 1D chain
- Pulled the needs_merlin_expansion() method out of the Step class and made it a function instead
- Removed `tabulate_info` function; replaced with tabulate from the tabulate library
- Moved `verify_filepath` and `verify_dirpath` from `merlin/main.py` to `merlin/utils.py`
- The entire documentation has been ported to MkDocs and re-organized
  - Dark Mode
  - New "Getting Started" example for a simple setup tutorial
  - More detail on configuration instructions
  - There's now a full page on installation instructions
  - More detail on explaining the spec file
  - More detail with the CLI page
  - New "Running Studies" page to explain different ways to run studies, restart them, and accomplish command line substitution
  - New "Interpreting Output" page to help users understand how the output workspace is generated in more detail
  - New "Examples" page has been added
  - Updated "FAQ" page to include more links to helpful locations throughout the documentation
  - Set up a place to store API docs
  - New "Contact" page with info on reaching Merlin devs
- The Merlin tutorial defaults to using Singularity rather than Docker for the OpenFoam example. Minor tutorial fixes have also been applied.
Fixed
- The `merlin status` command so that it's consistent in its output whether using redis or rabbitmq as the broker
- The `merlin monitor` command will now keep an allocation up if the queues are empty and workers are still processing tasks
- Add the restart keyword to the specification docs
- Cyclical imports and config imports that could easily cause ci issues
Version 1.11.1
[1.11.1]
Fixed
- Typo in `batch.py` that caused lsf launches to fail (`ALL_SGPUS` changed to `ALL_GPUS`)