tdhock/stratified-group-cv

3 June 2026

Added code in data.R to download Laribi2024full.csv from https://zenodo.org/records/12954673

New Laribi2024-figure-data.R runs timings and saves them to Laribi2024-figure-data.rds. Laribi2024-figure.R reads that file and makes:

Laribi2024-figure-rows.pdf

Laribi2024-figure-refs.pdf

11 May 2026

data.R creates standard data set CSV files under data/.

data_meta.R creates data_meta.csv

several_Tasks_data.R reads data set CSV files and writes several_Tasks_data.csv

several_Tasks.R reads that and makes figures:

several_Tasks_wrap.png

several_Tasks_sd.png

several_Tasks.png

several_Tasks_respiratory.png

Neither Wasikowski nor RSS is always the best, but both are generally much better than random. Some interesting observations:

  • in AZtrees there are a few very large groups, so the best RSS is very large for more than 3 folds.
  • the two evaluation metrics (mean.sd and RSS) generally agree, but in respiratory that is not always the case:
    • for 7 folds, RSS is slightly better in both evaluation metrics. (consistent)
    • for 8 folds, Wasikowski is slightly better in both evaluation metrics. (consistent)
    • for 9 folds, Wasikowski has better mean.sd but RSS has better RSS. (inconsistent)
    • for 2 and 6 folds, one evaluation metric says the algorithms are the same, while the other says one is better. (inconsistent)
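
The AZtrees observation above can be illustrated with a small sketch. This is not the code from this repository (which is in R). It assumes that RSS means the sum, over folds and classes, of the squared difference between each fold's class count and the ideal count (class total divided by number of folds). The function names `rss` and `best_rss` and the toy group sizes are hypothetical, made up for illustration.

```python
from itertools import product

def rss(groups, assignment, n_folds):
    """Assumed metric: squared deviation of per-fold class counts from the ideal."""
    n_classes = len(groups[0])
    counts = [[0] * n_classes for _ in range(n_folds)]
    # Every row of a group goes to the same fold (groups are never split).
    for class_counts, fold in zip(groups, assignment):
        for c, n in enumerate(class_counts):
            counts[fold][c] += n
    # Ideal count per fold for each class: class total spread evenly over folds.
    ideal = [sum(g[c] for g in groups) / n_folds for c in range(n_classes)]
    return sum((counts[f][c] - ideal[c]) ** 2
               for f in range(n_folds) for c in range(n_classes))

def best_rss(groups, n_folds):
    """Exhaustive search over all group-to-fold assignments (tiny inputs only)."""
    return min(rss(groups, a, n_folds)
               for a in product(range(n_folds), repeat=len(groups)))

# Hypothetical data: each tuple is (class 0 rows, class 1 rows) in one group.
balanced = [(10, 10)] * 6
one_large = [(50, 50), (2, 2), (2, 2), (2, 2), (2, 2), (2, 2)]

for k in (2, 3):
    print(k, best_rss(balanced, k), best_rss(one_large, k))
```

Under this assumed definition, the equal-sized groups can be split perfectly, whereas the data set with one dominant group has a large RSS even for the best possible assignment, because the fold that receives the large group is always far above the ideal count.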

previous

opt.R is a proof-of-concept for RSS minimization.

stratified_atime_data.R makes stratified_atime.RData

stratified_atime.R makes

stratified_atime_kaggle.png

stratified_atime_sim.png
