Tags: IBM/ado
Tags
refactor(core): convert no-priors operator to a RandomWalk sampler (#877)

* refactor: migrate no-priors sampler
* chore: delete legacy example, not needed because now the new sampling is integrated in random walk
* docs: add samplers
* chore: remove LLM attribution since it has just copy-pasted files
* refactor(needs fix): update trim to use no priors sampler from random walk
  Operation Creation: ❌ FAILED with recursion error
  Command: uv run ado create operation -f examples/trim/example_yamls/op_pressure.yaml --use-latest space
  Exit code: 133
  - The operation started and displayed the discovery space details correctly
  - Ray cluster initialized successfully
  Failure Details
  Immediate Failure Symptom: RecursionError: maximum recursion depth exceeded
  Precise Location: orchestrator/modules/operators/_orchestrate_core.py:43 in log_space_details()
  Call Stack:
    File "orchestrator/modules/operators/_orchestrate_core.py", line 117, in _run_operation_harness
      operation_output: OperationOutput | None = run_closure()
    File "orchestrator/modules/operators/_general_orchestration.py", line 32, in _run_general_operation_core
      return operation_function(
    File "orchestrator/modules/operators/collections.py", line 153, in wrapper
      return orchestrate_general_operation(
    File "orchestrator/modules/operators/_general_orchestration.py", line 101, in orchestrate_general_operation
      log_space_details(discovery_space)
    File "orchestrator/modules/operators/_orchestrate_core.py", line 43, in log_space_details
      console.print(discovery_space)
  Root Cause: The recursion occurs in the Rich library's rendering chain when attempting to print the discovery_space object. The stack trace shows infinite recursion through: rich/console.py → rich/panel.py → rich/padding.py → rich/pretty.py. Specifically in pretty.py:489, where repr_str = "".join(str(line) for line in lines) creates a circular reference.
  Additional Observations:
  - The operation created multiple nested sub-operations (visible in the deeply nested error message showing operation identifiers like operation-trim-1.7.1.dev72+gb804c1e11.d20260420-1e3afcbc, operation-trim-1.7.1.dev72+gb804c1e11.d20260420-9a0c5225, etc.)
  - Each sub-operation encountered the same recursion error when trying to log space details
  - The error cascaded through multiple operation levels before the final SIGTRAP signal
  Conclusion: The previous recursion failure still reproduces exactly. The issue is not intermittent: it consistently occurs at the same location (log_space_details()) when the TRIM operator attempts to print the discovery space object using Rich's console rendering.
* chore: remove legacy docs about no priors characterization
* refactor: remove legacy operator from tests
* build: add scipy
* build: add scipy pt 2
* chore: remove unused import
* docs: rephrase a sentence
* refactor: migrate no_priors modules from orchestrator to trim plugin
  - Move no_priors_parameters.py, no_priors_sampler.py, no_priors_utils.py from orchestrator/core/discoveryspace/ to plugins/operators/trim/src/trim/samplers/
  - Update all imports in trim plugin to reference new location
  - Update module name in operator.py from orchestrator.core.discoveryspace.no_priors_sampler to trim.samplers.no_priors_sampler
  - Delete old files from orchestrator/core/discoveryspace/
  - Delete corresponding test file from tests/core/discoveryspace/
  This change encapsulates no_priors functionality within the trim plugin where it belongs.
* fix: update trim plugin imports after no_priors migration
  Update imports in trim operator source files to reference new location:
  - operator.py: update module name and import
  - trim_pydantic.py: update NoPriorsParameters import
  - trim_sampler.py: update no_priors_utils imports
  - utils/order.py: update get_sampling_indices_multi_dimensional import
* test: update trim plugin test imports after no_priors migration
  Update test imports to reference new module location:
  - test_high_dimensional_sampling.py: update concatenated_latin_hypercube_sampling import
  - test_sampling.py: update get_index_list_van_der_corput import
* docs: update documentation and remove old no_priors files
  - Remove old no_priors files from orchestrator/core/discoveryspace/
  - Documentation in random-walk.md already references correct new location
  - Fix markdown line length issues
* docs: update docs
* fix(test): complete import refactoring
* build: remove scipy deps as it is no longer needed
* chore: remove legacy content
* fix: missing check bug
* feat: enable synchronous entity iterator
* docs(website): random_walk
  Reorder sections and header levels.
* fix: operationInfo is None
  Added ternary check: operationInfo.actuatorConfigurationIdentifiers if operationInfo else []
  Mirrors the guard already present in the no-priors block (lines 122-126)
---------
Signed-off-by: Daniele Lotito <99284466+danielelotito@users.noreply.github.com>
Co-authored-by: michaelj <michaelj@ie.ibm.com>
Co-authored-by: Michael Johnston <66301584+michael-johnston@users.noreply.github.com>

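The `operationInfo is None` fix described in the last bullet of this commit can be sketched as follows. This is only an illustration: the `OperationInfo` dataclass and the `actuator_config_ids` helper are hypothetical stand-ins, not code from the ado repository. Only the ternary expression itself is taken from the commit message.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class OperationInfo:
    # Hypothetical stand-in for the real model; only the attribute name
    # comes from the commit message.
    actuatorConfigurationIdentifiers: list[str] = field(default_factory=list)


def actuator_config_ids(operationInfo: OperationInfo | None) -> list[str]:
    # Hypothetical helper: fall back to an empty list instead of raising
    # AttributeError when no operation info is available.
    return operationInfo.actuatorConfigurationIdentifiers if operationInfo else []
```

With this guard, a caller that has no operation info gets an empty list and can iterate over it without a separate `None` check.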
feat(no-priors): create standalone no-priors characterization operator (#686)

* refactor(operators): extract no-priors characterization from TRIM into separate plugin
  - Create new no-priors-characterization operator plugin per issue [#683](#683)
  - Move sampling logic to new plugin
  - Update TRIM to depend on ado-no-priors-characterization
  - Add examples and documentation
  - Fix small issue about edge case in high dim sampling
  Related to [#683](#683)
* docs: Update examples/no-priors-characterization/README.md
  Co-authored-by: Alessandro Pomponio <10339005+AlessandroPomponio@users.noreply.github.com>
  Signed-off-by: Daniele Lotito <99284466+danielelotito@users.noreply.github.com>
* build(description): remove reference to sampling
* refactor: improve on some minor style/docs/nesting choices
  Details:
  - Removed reorder_df_by_importance function (moved to orchestrator/utilities/pandas.py)
  - Optimized set operations for better performance
  - Fixed docstrings to Google format without internal variable references
  - Used unique() instead of set() for NaN-safe handling
  - Removed unnecessary comment
  - Inverted logic in order_df_for_get_index_list_nn_high_dimensional to reduce nesting
* feat: Add sort_rows_by_column_names
* docs: apply suggestions
* docs: apply suggestions
* style: format operator.py
* style: format utils/__init__.py
* style: format high_dimensional_sampling.py
* fix(no-priors-characterization): correct samples parameter interpretation
  The samples parameter now correctly specifies NEW entities to sample, not total desired samples. Previously it subtracted existing measurements, causing crashes when existing measurements exceeded the requested count. Now always samples exactly the requested number of NEW entities, regardless of existing measurements in the space.
* refactor: update import to use sort_rows_by_column_names from core utilities
* style: address #686 (comment)
* docs: address #686 (comment)
* docs(prettier): Update examples/no-priors-characterization/README.md
  Co-authored-by: Alessandro Pomponio <10339005+AlessandroPomponio@users.noreply.github.com>
  Signed-off-by: Daniele Lotito <99284466+danielelotito@users.noreply.github.com>
* docs: address #686 (comment) , all algos are fast
* refactor: update structure according to https://github.com/IBM/ado/pull/686/changes/BASE..6c85e73df3a0511640ac254be9a953977abceeee#r2918667637
* style(logs): address comment "the error level is the same one that's used for exceptions: either this is an exception worthy occurrence or this should be only a warning"
* docs: address #686 (comment)
* docs(website): make docs appear
* docs: remove text about legacy behavior
* build: add missing pyproject.toml sections for custom experiments
  - Add ado-core dependency
  - Add build-system configuration with setuptools
  - Add tool.setuptools_scm with root path
  - Addresses review comment to follow plugin-development.mdc guidelines
* docs: address #686 (comment)
* docs: replace markdown preprocessor with symbolic links for no-priors-characterization
* docs: add symlinks for no-priors-characterization YAML files
* feat: add generatorid parameter to SpacePoint.to_entity()
  - Add optional generatorid parameter to SpacePoint.to_entity() method
  - Defaults to 'unk' if not specified for backward compatibility
  - Pass generatorid to Entity constructor to properly track entity origin
  - Add docstring explaining the parameter and return value
  This enables callers to specify the generator that created an entity, fixing the issue where entities were always created with 'unk' as the generatorid.
* fix: use generatorid parameter in space_df_connector
  Update get_list_of_entities_from_df_and_space() to pass 'no_priors_characterization' as generatorid when creating entities. This ensures entities are properly tagged with their generator instead of defaulting to 'unk'.
* docs: update README to reflect correct generatorid
  Replace 'unk' with 'no_priors_characterization' in example output to reflect the actual generatorid now being set correctly.
* Merge branch 'main' into HEAD
* build(deps): update dependencies (#710)
  Signed-off-by: DRL NextGen <220003231+DRL-NextGen@users.noreply.github.com>
* build(deps): update dependencies (#728)
  Signed-off-by: DRL NextGen <220003231+DRL-NextGen@users.noreply.github.com>
* docs: fix table layout
* fix: update imports
---------
Signed-off-by: Daniele Lotito <99284466+danielelotito@users.noreply.github.com>
Signed-off-by: DRL NextGen <220003231+DRL-NextGen@users.noreply.github.com>
Co-authored-by: Alessandro Pomponio <10339005+AlessandroPomponio@users.noreply.github.com>
Co-authored-by: DRL-NextGen <220003231+DRL-NextGen@users.noreply.github.com>

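The `generatorid` change to `SpacePoint.to_entity()` described in this commit can be illustrated with a small sketch. The `Entity` and `SpacePoint` classes below are simplified, hypothetical stand-ins (the real ado classes have more fields and a different shape). Only the parameter name, the `'unk'` default, and the `'no_priors_characterization'` value come from the commit message.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Entity:
    # Hypothetical, simplified entity: constitutive values plus its origin.
    values: dict
    generatorid: str


@dataclass
class SpacePoint:
    # Hypothetical, simplified point in a discovery space.
    values: dict

    def to_entity(self, generatorid: str = "unk") -> Entity:
        """Create an Entity, recording which generator produced it.

        Existing callers that pass nothing keep getting 'unk'.
        """
        return Entity(values=dict(self.values), generatorid=generatorid)


point = SpacePoint({"pressure": 1.5})
legacy = point.to_entity()  # old call sites are unchanged
tagged = point.to_entity(generatorid="no_priors_characterization")
```

Making the parameter optional with the old value as its default is what keeps the change backward compatible: only call sites that know their generator need to be touched.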
feat(autoconf): Introduce a new recommender for per_device_train_batch_size (#500)

* refactor: improve code structure
* docs(v3 models): add and document new models trained with autogluon v1.5
  Signed-off-by: Daniele Lotito <daniele.lotito@ibm.com>
* feat: save model card in the model folder
* refactor: improve var names
* feat: update main pydantic model for new Autogluon model
  Signed-off-by: SRIKUMAR VENUGOPAL <srikumarv@ie.ibm.com>
* chore: update the pydantic model selector
  Signed-off-by: SRIKUMAR VENUGOPAL <srikumarv@ie.ibm.com>
* docs(autoconf): Updating the README to address update models
  Signed-off-by: SRIKUMAR VENUGOPAL <srikumarv@ie.ibm.com>
* docs(autoconf): minor updates to changelog
  Signed-off-by: SRIKUMAR VENUGOPAL <srikumarv@ie.ibm.com>
* test(autoconf): update to recommender test
  Signed-off-by: SRIKUMAR VENUGOPAL <srikumarv@ie.ibm.com>
* feat(autoconf): adding avoid oom recommender experiment
  Signed-off-by: SRIKUMAR VENUGOPAL <srikumarv@ie.ibm.com>
* test(autoconf): Unit test cases for the new recommender
  Signed-off-by: SRIKUMAR VENUGOPAL <srikumarv@ie.ibm.com>
* feat(autoconf): add optional model version to avoid oom recommender
  Signed-off-by: SRIKUMAR VENUGOPAL <srikumarv@ie.ibm.com>
* test: Updating integration test to account for new experiment
  Signed-off-by: SRIKUMAR VENUGOPAL <srikumarv@ie.ibm.com>
---------
Signed-off-by: Daniele Lotito <daniele.lotito@ibm.com>
Signed-off-by: SRIKUMAR VENUGOPAL <srikumarv@ie.ibm.com>
Co-authored-by: Daniele Lotito <daniele.lotito@ibm.com>
Co-authored-by: Michael Johnston <66301584+michael-johnston@users.noreply.github.com>
Co-authored-by: Daniele Lotito <99284466+danielelotito@users.noreply.github.com>

refactor: use rich instead of IPython's pretty (#474)

* refactor: use rich instead of IPython's pretty
* refactor: rename rich_output to rich
* fix(core): ensure we use the rich representation if available
* build: remove dependency on jupyter
* refactor: prefer get_rich_repr to Pretty
* feat: improve parameterized experiment rendering
* feat(utilities): add function to render rich renderable to string
* refactor: use new function to render to string
* refactor: use render_to_string in tests
* refactor(tests): rename tests
* fix(core): add back sample store identifier to discovery space resource
* docs: update rendered describe outputs
* fix(tests): update assertion
* fix(schema): prevent first-line description truncation
  This was due to rich using the width of the terminal to render the string. Since the line already had "Description: " in it, however, the first line of the description could've rendered outside of the space, resulting in truncation
* feat: pretty print ints and floats as well
* refactor: avoid table expansion
* refactor: use heavy table box
* refactor: use Text.assemble instead of chaining text
* fix: chain text where needed

fix(core): update PropertyValue schema for structured decoding (#350)

* feat(core): cli plugins
* fix(core): update PropertyValue schema for structured decoding
  Structured decoding methods use the JSON schema of the pydantic model. However, the JSON type "binary" is not supported by standard decoding methods. This change causes the JSON schema for PropertyValue to output an annotated string type instead of binary.
* Apply suggestions from code review
  Co-authored-by: Alessandro Pomponio <10339005+AlessandroPomponio@users.noreply.github.com>
  Signed-off-by: Michael Johnston <66301584+michael-johnston@users.noreply.github.com>
* Revert "Apply suggestions from code review"
  This reverts commit 22a6817.
* refactor: rename to CustomBytes
---------
Signed-off-by: Michael Johnston <66301584+michael-johnston@users.noreply.github.com>
Co-authored-by: Alessandro Pomponio <10339005+AlessandroPomponio@users.noreply.github.com>
Co-authored-by: Alessandro Pomponio <alessandro.pomponio1@ibm.com>

fix(run_experiment): print request series with use_markup=False (#319)

fix(core): Set use_markup=False
console_print interprets [] as markup tags, so with use_markup=True, passing string representations of lists causes the list content to be dropped.

build(autoconf): pin the required autogluon version (#304)

* build(autoconf): fixing the autogluon version for autoconf
  Signed-off-by: SRIKUMAR VENUGOPAL <srikumarv@ie.ibm.com>
* build(autoconf): fixing the autogluon version for autoconf
  Signed-off-by: SRIKUMAR VENUGOPAL <srikumarv@ie.ibm.com>
---------
Signed-off-by: SRIKUMAR VENUGOPAL <srikumarv@ie.ibm.com>

feat(autoconf): introduce autoconf custom experiments (#255)

* Squashed 'plugins/custom_experiments/autoconf/' content from commit b6fcc73
  git-subtree-dir: plugins/custom_experiments/autoconf
  git-subtree-split: b6fcc73b0b2b5b75192001b4f17e331a6df67fde
* chore: remove pycache files
  Signed-off-by: SRIKUMAR VENUGOPAL <srikumarv@ie.ibm.com>
* fix: breaking ci and remove unsupported models
  Signed-off-by: SRIKUMAR VENUGOPAL <srikumarv@ie.ibm.com>
* fix: resolve ci issues
  Signed-off-by: SRIKUMAR VENUGOPAL <srikumarv@ie.ibm.com>
* fix: updating recommender test
  Signed-off-by: SRIKUMAR VENUGOPAL <srikumarv@ie.ibm.com>
* docs: remove redundant examples from autoconf readme
  Signed-off-by: SRIKUMAR VENUGOPAL <srikumarv@ie.ibm.com>
* docs: update to examples
  Signed-off-by: SRIKUMAR VENUGOPAL <srikumarv@ie.ibm.com>
* Update plugins/custom_experiments/autoconf/autoconf/AutoGluonModels/changelog.md
  Co-authored-by: Alessandro Pomponio <10339005+AlessandroPomponio@users.noreply.github.com>
  Signed-off-by: Srikumar Venugopal <srikumar003@users.noreply.github.com>
* Update plugins/custom_experiments/autoconf/autoconf/AutoGluonModels/README.md
  Co-authored-by: Alessandro Pomponio <10339005+AlessandroPomponio@users.noreply.github.com>
  Signed-off-by: Srikumar Venugopal <srikumar003@users.noreply.github.com>
* Update plugins/custom_experiments/autoconf/autoconf/AutoGluonModels/README.md
  Co-authored-by: Alessandro Pomponio <10339005+AlessandroPomponio@users.noreply.github.com>
  Signed-off-by: Srikumar Venugopal <srikumar003@users.noreply.github.com>
* Update plugins/custom_experiments/autoconf/autoconf/AutoGluonModels/README.md
  Co-authored-by: Alessandro Pomponio <10339005+AlessandroPomponio@users.noreply.github.com>
  Signed-off-by: Srikumar Venugopal <srikumar003@users.noreply.github.com>
* Update plugins/custom_experiments/autoconf/autoconf/min_gpu_recommender.py
  Co-authored-by: Alessandro Pomponio <10339005+AlessandroPomponio@users.noreply.github.com>
  Signed-off-by: Srikumar Venugopal <srikumar003@users.noreply.github.com>
* Update plugins/custom_experiments/autoconf/README.md
  Co-authored-by: Alessandro Pomponio <10339005+AlessandroPomponio@users.noreply.github.com>
  Signed-off-by: Srikumar Venugopal <srikumar003@users.noreply.github.com>
* refactor: improve name of variable in get_model_prediction_and_metadata
* refactor: clarify variable naming in recommender.py
* docs: improve readability
  Signed-off-by: Daniele Lotito <daniele.lotito@ibm.com>
* docs: improve readability
  Signed-off-by: Daniele Lotito <daniele.lotito@ibm.com>
* refactor: apply suggestion
  Signed-off-by: Daniele Lotito <daniele.lotito@ibm.com>
* refactor: improve variable name
* refactor(autoconf): tidy up the JobConfig pydantic model and remove the unused Config class
  Signed-off-by: Vassilis Vassiliadis <vassilis.vassiliadis@ibm.com>
* refactor(autoconf): tidy up the recommend_min_gpu() method
  The new code iterates the candidate number_gpus starting from the minimum value and stops the first time it predicts that the job would complete successfully.
  Signed-off-by: Vassilis Vassiliadis <vassilis.vassiliadis@ibm.com>
* refactor(autoconf): remove dead code and use log.debug() instead of print
  Signed-off-by: Vassilis Vassiliadis <vassilis.vassiliadis@ibm.com>
* fix: add torch to dependencies
  Signed-off-by: SRIKUMAR VENUGOPAL <srikumarv@ie.ibm.com>
* test: add autoconf test to tox
  Signed-off-by: SRIKUMAR VENUGOPAL <srikumarv@ie.ibm.com>
* fix: update plugins/custom_experiments/autoconf/autoconf/utils/config_mapper.py
  Co-authored-by: Alessandro Pomponio <10339005+AlessandroPomponio@users.noreply.github.com>
  Signed-off-by: Srikumar Venugopal <srikumar003@users.noreply.github.com>
* docs: update README to address comments
  Signed-off-by: SRIKUMAR VENUGOPAL <srikumarv@ie.ibm.com>
* fix: address comments about exception handling
  Signed-off-by: SRIKUMAR VENUGOPAL <srikumarv@ie.ibm.com>
* fix: breaking style check on pydantic models
  Signed-off-by: SRIKUMAR VENUGOPAL <srikumarv@ie.ibm.com>
* build(autoconf): change the name of the package to ado-autoconf
  Signed-off-by: Vassilis Vassiliadis <vassilis.vassiliadis@ibm.com>
* docs: fix training options
  The current setting uses `medium_quality` + `optimize_for_deployment` (even if in the script the optimization happens with predictor.clone_for_deployment) You can use good quali
  Signed-off-by: Daniele Lotito <daniele.lotito@ibm.com>
* build(autoconf): specify the packages to include in ado-autoconf
  Signed-off-by: Vassilis Vassiliadis <vassilis.vassiliadis@ibm.com>
* fix: update plugins/custom_experiments/autoconf/README.md
  Co-authored-by: Alessandro Pomponio <10339005+AlessandroPomponio@users.noreply.github.com>
  Signed-off-by: Srikumar Venugopal <srikumar003@users.noreply.github.com>
* docs: update plugins/custom_experiments/autoconf/README.md
  Co-authored-by: Alessandro Pomponio <10339005+AlessandroPomponio@users.noreply.github.com>
  Signed-off-by: Srikumar Venugopal <srikumar003@users.noreply.github.com>
* docs: update plugins/custom_experiments/autoconf/README.md
  Co-authored-by: Alessandro Pomponio <10339005+AlessandroPomponio@users.noreply.github.com>
  Signed-off-by: Srikumar Venugopal <srikumar003@users.noreply.github.com>
* docs: fix folder naming structure
  Signed-off-by: Daniele Lotito <daniele.lotito@ibm.com>
* fix: update plugins/custom_experiments/autoconf/autoconf/min_gpu_recommender.py
  Co-authored-by: Alessandro Pomponio <10339005+AlessandroPomponio@users.noreply.github.com>
  Signed-off-by: Srikumar Venugopal <srikumar003@users.noreply.github.com>
* test: install autoconf before test
  Signed-off-by: SRIKUMAR VENUGOPAL <srikumarv@ie.ibm.com>
* docs(autoconf): Updating the README to address review
  Signed-off-by: SRIKUMAR VENUGOPAL <srikumarv@ie.ibm.com>
* test(autoconf): Update tox.ini
  Co-authored-by: Alessandro Pomponio <10339005+AlessandroPomponio@users.noreply.github.com>
  Signed-off-by: Srikumar Venugopal <srikumar003@users.noreply.github.com>
* docs(autoconf): update to paths
* build(autoconf): update experiment name
  Signed-off-by: SRIKUMAR VENUGOPAL <srikumarv@ie.ibm.com>
* refactor: removed unused method. It was there because predictors from both sklearn and autogluon have this method and at the beginning I was planning to have it inherit from sklearn estimator
  Signed-off-by: Daniele Lotito <daniele.lotito@ibm.com>
* refactor: improve readability with list comprehension and displace it in the only script that uses it
  Signed-off-by: Daniele Lotito <daniele.lotito@ibm.com>
* refactor: use the logger
  Signed-off-by: Daniele Lotito <daniele.lotito@ibm.com>
---------
Signed-off-by: SRIKUMAR VENUGOPAL <srikumarv@ie.ibm.com>
Signed-off-by: Srikumar Venugopal <srikumar003@users.noreply.github.com>
Signed-off-by: Daniele Lotito <daniele.lotito@ibm.com>
Signed-off-by: Vassilis Vassiliadis <vassilis.vassiliadis@ibm.com>
Co-authored-by: Alessandro Pomponio <10339005+AlessandroPomponio@users.noreply.github.com>
Co-authored-by: Daniele-Lotito <daniele.lotito@ibm.com>
Co-authored-by: Vassilis Vassiliadis <vassilis.vassiliadis@ibm.com>
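
The search strategy described for `recommend_min_gpu()` in this commit (iterate candidate `number_gpus` from the minimum and stop at the first value predicted to succeed) can be sketched as below. The signature, the `predicts_success` callback, and the `None` return for "no feasible candidate" are assumptions made for illustration. The real method lives in the autoconf plugin and is driven by its trained predictor.

```python
from __future__ import annotations

from typing import Callable, Iterable


def recommend_min_gpu(
    candidates: Iterable[int],
    predicts_success: Callable[[int], bool],
) -> int | None:
    """Return the smallest candidate GPU count predicted to succeed.

    `predicts_success` is a hypothetical stand-in for the model call that
    predicts whether the job would complete with a given number of GPUs.
    """
    # Visit candidates in ascending order so the first hit is the minimum.
    for number_gpus in sorted(candidates):
        if predicts_success(number_gpus):
            return number_gpus
    # No candidate is predicted to work.
    return None


# Toy predictor: pretend the job needs at least 4 GPUs.
recommended = recommend_min_gpu([8, 1, 4, 2], lambda n: n >= 4)
```

Stopping at the first success avoids calling the predictor on larger configurations once a feasible one is found.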