Added new initial assumptions with boostings #1359

dmitryglhf · 2025-01-23T14:50:18Z

This is a 🔨 code refactoring.

Summary

New Initial Assumptions: Updated initial assumptions by adding boosting-based solutions (CatBoost, XGBoost, LightGBM).

Comparison table between old and new assumptions (validated on automlbenchmark small dataset 1h8c):

| Metric (mean) | main (Random Forest first) | new (GBM joined with Linear model) |
|---|---|---|
| auc | 0.869263 | 0.879746 |
| acc | 0.84667 | 0.852339 |
| balacc | 0.805336 | 0.822745 |
| logloss | 0.449189 | 0.377827 |
| training_duration, s | 242.554 | 251.445 |
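To make the size of the differences easier to read, the following minimal sketch computes the signed relative change of each mean metric from the table above. The dictionary names and the `relative_change` helper are illustrative and not part of FEDOT.

```python
# Mean metrics copied from the comparison table above.
main = {"auc": 0.869263, "acc": 0.84667, "balacc": 0.805336,
        "logloss": 0.449189, "training_duration_s": 242.554}
new = {"auc": 0.879746, "acc": 0.852339, "balacc": 0.822745,
       "logloss": 0.377827, "training_duration_s": 251.445}


def relative_change(old: float, new_value: float) -> float:
    """Signed relative change of new_value with respect to old."""
    return (new_value - old) / old


# Hypothetical helper output: metric name -> relative change (new vs. main).
changes = {name: relative_change(main[name], new[name]) for name in main}

for name, change in changes.items():
    # For logloss and training duration, a negative change is an improvement.
    print(f"{name:>20}: {change:+.2%}")
```

Under these numbers, logloss drops by roughly 16% while mean training time grows by under 4%.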

Context

Closes #1341

pep8speaks · 2025-01-23T14:50:26Z

Hello @dmitryglhf! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2025-01-23 15:41:19 UTC

dmitryglhf · 2025-01-23T14:51:10Z

/fix-pep8

github-actions · 2025-01-23T14:51:13Z

All PEP8 errors have been fixed, thanks ❤️

Comment last updated at Tue, 18 Feb 2025 16:48:09

dmitryglhf · 2025-01-23T14:57:30Z

Identified the three most useful initial assumptions.

| Metric (mean) | main (Random Forest first) | gbm_linear | xgb_lgbm_linear | rf_in_fork |
|---|---|---|---|---|
| auc | 0.869263 | 0.879746 | 0.877597 | 0.878225 |
| acc | 0.84667 | 0.852339 | 0.848727 | 0.85005 |
| balacc | 0.805336 | 0.822745 | 0.818386 | 0.819006 |
| logloss | 0.449189 | 0.377827 | 0.392734 | 0.396282 |
| training_duration, s | 242.554 | 251.445 | 213.057 | 246.447 |
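As a quick way to read this table, the sketch below picks the leading variant per metric. The `higher_is_better` mapping is an assumption about metric direction (higher is better for auc, acc and balacc; lower is better for logloss and training duration), and the helper names are illustrative.

```python
# Mean metrics copied from the table above, keyed by variant.
results = {
    "main": {"auc": 0.869263, "acc": 0.84667, "balacc": 0.805336,
             "logloss": 0.449189, "training_duration": 242.554},
    "gbm_linear": {"auc": 0.879746, "acc": 0.852339, "balacc": 0.822745,
                   "logloss": 0.377827, "training_duration": 251.445},
    "xgb_lgbm_linear": {"auc": 0.877597, "acc": 0.848727, "balacc": 0.818386,
                        "logloss": 0.392734, "training_duration": 213.057},
    "rf_in_fork": {"auc": 0.878225, "acc": 0.85005, "balacc": 0.819006,
                   "logloss": 0.396282, "training_duration": 246.447},
}

# Assumed direction of improvement: True means higher is better.
higher_is_better = {"auc": True, "acc": True, "balacc": True,
                    "logloss": False, "training_duration": False}


def best_variant(metric: str) -> str:
    """Return the variant with the best mean value for the given metric."""
    pick = max if higher_is_better[metric] else min
    return pick(results, key=lambda variant: results[variant][metric])


leaders = {metric: best_variant(metric) for metric in higher_is_better}
print(leaders)
```

On these means, gbm_linear leads on all four quality metrics, while xgb_lgbm_linear has the shortest mean training time.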

Pipelines and full comparison table

GBM_Linear
XGB_LGBM_Linear
RF_in_Fork

Full comparison table: full_comparison.xlsx

Full tables for each pipeline:
gbm_catboost_new_params.csv
main_rf.csv
rf_in_fork.csv


codecov · 2025-01-23T15:06:34Z

Codecov Report

Attention: Patch coverage is 0% with 1 line in your changes missing coverage. Please review.

Project coverage is 80.58%. Comparing base (4df6afe) to head (80d8f48).
Report is 13 commits behind head on master.

Files with missing lines	Patch %	Lines
...mplementations/models/boostings_implementations.py	0.00%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #1359      +/-   ##
==========================================
+ Coverage   80.15%   80.58%   +0.43%     
==========================================
  Files         146      146              
  Lines       10515    10515              
==========================================
+ Hits         8428     8474      +46     
+ Misses       2087     2041      -46


dmitryglhf · 2025-01-23T15:27:25Z

@nicl-nno Of the three initial assumptions, should we keep all of them, or only the one that improves the metrics the most?

nicl-nno · 2025-01-23T20:52:11Z

Of the three initial assumptions, should we keep all of them, or only the one that improves the metrics the most?

It depends on whether the leader changes when the group of datasets changes. If roughly the same variant dominates everywhere, we can keep just that one.
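One way to make this criterion concrete is sketched below: given the winning variant per dataset group, keep a single assumption only if it wins in a large enough share of groups. The function name, the `min_share` threshold and the example inputs are all hypothetical. The inputs are placeholders, not measured results from this PR.

```python
from collections import Counter
from typing import Dict, Optional


def dominant_variant(winners_by_group: Dict[str, str],
                     min_share: float = 1.0) -> Optional[str]:
    """Return the variant that wins in at least min_share of groups, else None."""
    if not winners_by_group:
        return None
    counts = Counter(winners_by_group.values())
    variant, wins = counts.most_common(1)[0]
    return variant if wins / len(winners_by_group) >= min_share else None


# Placeholder winners per dataset group (NOT measured results).
example_stable = {"group_1": "variant_a", "group_2": "variant_a", "group_3": "variant_a"}
example_mixed = {"group_1": "variant_a", "group_2": "variant_a", "group_3": "variant_b"}

print(dominant_variant(example_stable))                # one variant wins everywhere
print(dominant_variant(example_mixed))                 # no variant wins everywhere
print(dominant_variant(example_mixed, min_share=0.6))  # relaxed threshold
```

With the default threshold a variant must lead in every group. Lowering `min_share` corresponds to the "roughly the same variant dominates" reading.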


dmitryglhf · 2025-02-01T13:28:09Z

It looks like FEDOT mostly chooses and modifies the gbm_linear assumption.

Details

Binary classification:

binary_class_datasets = [
    'blood-transf.arff.csv', 'christine.arff.csv', 'jasmine.arff.csv',
    'phoneme.arff.csv', 'sylvine.arff.csv',
]

Multiclass classification:

multi_class_datasets = [
    'car.arff.csv', 'cnae-9.arff.csv', 'dilbert.arff.csv', 'fabert.arff.csv',
    'mfeat-factors.arff.csv', 'segment.arff.csv', 'vehicle.arff.csv'
]

Regression:

regression_datasets = [
    'analcatdata_negotiation.arff.csv', 'bodyfat.arff.csv', 'cleveland.arff.csv', 
    'cloud.arff.csv', 'kin8nm.arff.csv', 'liver-disorders.arff.csv'
]

Logs:
class_compare_datasets.log
regr_compare_datasets.log

nicl-nno · 2025-02-01T14:50:23Z

It looks like FEDOT mostly chooses and modifies the gbm_linear assumption.

Is that good or bad?

dmitryglhf · 2025-02-01T15:38:11Z

It looks like FEDOT mostly chooses and modifies the gbm_linear assumption.

Is that good or bad?

This is an interim message so that the results don't get lost; I wanted to ask a question about it.
It compares which initial assumption dominates. In classification tasks, the first pipeline, gbm_linear, is predominantly chosen as the basis or as the final solution.
For regression, the same pipeline seems to be taken as the basis, but it gets changed considerably. Maybe it is worth keeping the current rfr assumption?

Regression:

regression_datasets = [
    'analcatdata_negotiation.arff.csv', 'bodyfat.arff.csv', 'cleveland.arff.csv', 
    'cloud.arff.csv', 'kin8nm.arff.csv', 'liver-disorders.arff.csv'
]

Logs: regr_compare_datasets.log


Added new initil assumptions with boostings

d55865f

github-actions bot and others added 2 commits January 23, 2025 14:52

Automated autopep8 fixes

5f24a70

Update default catboost params

a89c92e

Merge branch 'new-initial-assumptions' of https://github.com/aimclub/…

5f066a3

…FEDOT into new-initial-assumptions

dmitryglhf added 2 commits January 23, 2025 18:13

Added early_stopping_rounds parameter for CatBoost

5cfae85

od_wait instead of early_stopping_rounds

a554e8b

Selected only gbm_linear asuumption

326d485

dmitryglhf added 2 commits February 1, 2025 13:12

Merge branch 'master' of https://github.com/aimclub/FEDOT into new-in…

3daf188

…itial-assumptions

updated target assign for multi output

80ce88d

dmitryglhf and others added 2 commits February 18, 2025 14:50

updated regression assumptions

14f9ac3

Merge branch 'master' of https://github.com/aimclub/FEDOT into new-in…

80d8f48

…itial-assumptions

dmitryglhf requested a review from nicl-nno February 18, 2025 15:40

nicl-nno approved these changes Feb 18, 2025


dmitryglhf merged commit 3b61d42 into master Feb 18, 2025
10 checks passed

dmitryglhf deleted the new-initial-assumptions branch February 18, 2025 16:02
