Added code in data.R to download Laribi2024full.csv from https://zenodo.org/records/12954673
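A minimal sketch of such a download step, not the actual contents of data.R. The helper names, the `data/` destination, and the `/files/<name>?download=1` URL pattern for Zenodo records are assumptions made for illustration.

```r
## Hypothetical helper: build a direct-download URL for one file of a
## Zenodo record. The /files/<name>?download=1 pattern is an assumption.
zenodo_file_url <- function(record_id, file_name){
  sprintf(
    "https://zenodo.org/records/%s/files/%s?download=1",
    record_id, file_name)
}

## Hypothetical helper: download only when the file is not already present,
## so that re-running the script does not fetch the data again.
download_if_missing <- function(url, dest_file){
  if(!file.exists(dest_file)){
    dir.create(dirname(dest_file), showWarnings=FALSE, recursive=TRUE)
    download.file(url, dest_file, mode="wb")
  }
  invisible(dest_file)
}

laribi_url <- zenodo_file_url("12954673", "Laribi2024full.csv")
## The destination path below is an assumption. Uncomment to download.
## download_if_missing(laribi_url, file.path("data", "Laribi2024full.csv"))
```

Skipping the download when the file exists keeps repeated runs fast and avoids depending on the network after the first run.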
New Laribi2024-figure-data.R runs timings, saving Laribi2024-figure-data.rds. Laribi2024-figure.R reads that and makes
data.R creates standard data set CSV files under data/.
data_meta.R creates data_meta.csv
several_Tasks_data.R reads data set CSV files and writes several_Tasks_data.csv
several_Tasks.R reads that and makes figures:
We see that neither Wasikowski nor RSS is always the best, but both are generally much better than random. Interestingly:
- In AZtrees there are a few very large groups, so the best RSS is very large for more than 3 folds.
- The evaluation metrics are generally consistent, but in respiratory that is not always the case:
  - for 7 folds, RSS is slightly better in both evaluation metrics (consistent).
  - for 8 folds, Wasikowski is slightly better in both evaluation metrics (consistent).
  - for 9 folds, Wasikowski has better mean.sd but the RSS algorithm has a better RSS metric (inconsistent).
  - for 2 and 6 folds, one evaluation metric says the algorithms are the same, while the other says one is better (inconsistent).
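The sketch below shows how two such metrics can disagree. The definitions are assumptions, not taken from several_Tasks.R: RSS is taken to be the sum of squared differences between per-fold class counts and the ideal count (class total divided by the number of folds), and mean.sd is taken to be the mean over classes of the standard deviation across folds of each class's share. The function name and the toy data are made up for illustration.

```r
## Assumed metric definitions (see the paragraph above), on toy data.
fold_metrics <- function(label, fold, n_folds){
  ## Rows are folds, columns are classes. Empty folds still get a row.
  counts <- unclass(table(
    fold=factor(fold, levels=seq_len(n_folds)),
    label=factor(label)))
  class_total <- colSums(counts)
  ## Ideal count: each fold gets an equal part of every class.
  ideal <- matrix(
    class_total / n_folds,
    nrow=n_folds, ncol=length(class_total), byrow=TRUE)
  ## Share of each class that lands in each fold.
  share <- sweep(counts, 2, class_total, "/")
  c(RSS=sum((counts - ideal)^2),
    mean.sd=mean(apply(share, 2, sd)))
}

## Toy labels: 20 samples of class a and 4 samples of class b.
label <- rep(c("a", "b"), c(20, 4))
## Assignment X: class a split 13/7, class b split 2/2.
fold_X <- c(rep(1:2, c(13, 7)), rep(1:2, c(2, 2)))
## Assignment Y: class a split 10/10, class b split 4/0.
fold_Y <- c(rep(1:2, c(10, 10)), rep(1, 4))
metrics_X <- fold_metrics(label, fold_X, 2)
metrics_Y <- fold_metrics(label, fold_Y, 2)
print(rbind(X=metrics_X, Y=metrics_Y))
```

Under these assumed definitions, Y has the lower RSS while X has the lower mean.sd: RSS works on raw counts, so the large class dominates, whereas mean.sd normalizes by class size, so the small class counts equally.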
opt.R is a proof-of-concept for RSS minimization.
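For illustration only, here is one simple way to approach RSS minimization when whole groups must stay in one fold. It is a greedy heuristic under assumed definitions, not the contents of opt.R. It assumes RSS is the sum of squared differences between per-fold class counts and the ideal count, and the function name and toy counts are hypothetical.

```r
## Hypothetical greedy sketch, not the contents of opt.R.
## group_counts: matrix with one row per group and one column per class.
greedy_rss_folds <- function(group_counts, n_folds){
  ## Ideal per-fold count of each class (assumed RSS definition).
  ideal <- colSums(group_counts) / n_folds
  fold_counts <- matrix(0, n_folds, ncol(group_counts))
  fold_of_group <- integer(nrow(group_counts))
  rss <- function(m) sum(sweep(m, 2, ideal, "-")^2)
  ## Place the largest groups first, since they are hardest to balance.
  for(g in order(rowSums(group_counts), decreasing=TRUE)){
    ## RSS that would result from adding group g to each fold.
    rss_if_added <- sapply(seq_len(n_folds), function(k){
      candidate <- fold_counts
      candidate[k, ] <- candidate[k, ] + group_counts[g, ]
      rss(candidate)
    })
    best <- which.min(rss_if_added)
    fold_counts[best, ] <- fold_counts[best, ] + group_counts[g, ]
    fold_of_group[g] <- best
  }
  list(fold=fold_of_group, RSS=rss(fold_counts))
}

## Toy data: 4 groups, 2 classes.
toy_counts <- rbind(c(5, 1), c(1, 4), c(2, 2), c(2, 1))
greedy_result <- greedy_rss_folds(toy_counts, 2)
print(greedy_result)
```

A greedy pass like this is fast but is not guaranteed to find the minimum RSS. An exact search over all group-to-fold assignments would be needed for that.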
stratified_atime_data.R makes stratified_atime.RData
stratified_atime.R makes
- comment: data respiratory, R code.
- sklearn code
- kaggle: notebook, zip, csv, python script.