Fix Kruskal-Wallis test reporting failure with degenerate cases (one observation per group) #512

Copilot · 2025-08-31T05:16:02Z

When a factor has many levels with only one observation per level, calling report() on a Kruskal-Wallis test failed with a data frame construction error. The cause is the confidence interval calculation for the effect size, which breaks down when all bootstrapped values are identical.

Problem

The original issue occurred when:

A factor is created with many levels (e.g., as.factor(1:n))
Each level has only one observation
kruskal.test() runs quickly and without error
report() on the result fails with: "arguments imply differing number of rows: 1, 0"

This happened because effectsize::rank_epsilon_squared() uses bootstrap methods to compute confidence intervals, but when there's one observation per group, all bootstrapped effect sizes equal 1, making CI calculation impossible.
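The degenerate value can be checked by hand. The following is a minimal base-R sketch, assuming the common definition of rank epsilon squared as H / (n - 1), where H is the Kruskal-Wallis statistic. The variable names are illustrative and this is not the code effectsize runs internally.

```r
# One observation per group: n groups, n observations.
n <- 10
set.seed(123)
b <- rnorm(n)
a <- as.factor(1:n)

# H is the Kruskal-Wallis chi-squared statistic.
H <- unname(kruskal.test(b, a)$statistic)

# Assumed definition: epsilon squared (rank) = H / (n - 1).
eps2 <- H / (n - 1)

print(c(H = H, eps2 = eps2))
```

With singleton groups and no ties in b, the rank sums are just the ranks 1 to n, so H reduces to n - 1 and the ratio is 1 whatever the values of b are.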

Solution

Added error handling in .report_effectsize_kruskal() to:

Catch CI calculation failures due to degenerate cases
Fall back to ci = NULL when the bootstrap CI fails
Report the effect size without confidence intervals in these edge cases
Maintain backward compatibility for normal cases
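The fallback pattern described above can be sketched in base R with tryCatch(). This is an illustration only, not the actual body of .report_effectsize_kruskal(): the helper with_ci_fallback() and the stand-in toy_effectsize() are hypothetical names introduced here so the example runs without extra packages.

```r
# Hypothetical helper (not part of report): call compute(ci = ci) and,
# if that call errors, retry once with ci = NULL.
with_ci_fallback <- function(compute, ci = 0.95) {
  tryCatch(
    compute(ci = ci),
    error = function(e) {
      # Degenerate case: report the point estimate without an interval.
      compute(ci = NULL)
    }
  )
}

# Hypothetical stand-in for an effect size function whose CI step
# fails on degenerate data but whose point estimate is still valid.
toy_effectsize <- function(ci = 0.95) {
  if (!is.null(ci)) {
    stop("arguments imply differing number of rows: 1, 0")
  }
  data.frame(rank_epsilon_squared = 1)
}

result <- with_ci_fallback(toy_effectsize, ci = 0.95)
print(result)
```

In the package the wrapped call would presumably be effectsize::rank_epsilon_squared(), which accepts ci = NULL to skip the interval. Catching the error only around the CI step leaves the normal path untouched, which is what keeps non-degenerate cases backward compatible.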

Before and After

Before (fails):

```r
library("report")
n <- 10
set.seed(123)
df <- data.frame(a = as.factor(1:n), b = rnorm(n))
test <- kruskal.test(df$b, df$a)
report(test)  # Error: arguments imply differing number of rows: 1, 0
```

After (works):

```r
library("report")
n <- 10
set.seed(123)
df <- data.frame(a = as.factor(1:n), b = rnorm(n))
test <- kruskal.test(df$b, df$a)
result <- report(test, data = df)
print(result)
#> [1] "All values of t are equal to  1 \n Cannot calculate confidence intervals"
#> Effect sizes were labelled following Field's (2013) recommendations.
#> 
#> The Kruskal-Wallis rank sum test testing the difference in ranks between df$b
#> and df$a suggests that the effect is statistically not significant, and large
#> (Kruskal-Wallis chi2 = 9.00, p = 0.437; Epsilon squared (rank) = 1.00)

# Performance test with larger n
n <- 100
set.seed(456)
df_large <- data.frame(a = as.factor(1:n), b = rnorm(n))
test_large <- kruskal.test(df_large$b, df_large$a)

# This now completes quickly instead of failing
system.time({
    result_large <- report(test_large, data = df_large, verbose = FALSE)
})
#> [1] "All values of t are equal to  1 \n Cannot calculate confidence intervals"
#>    user  system elapsed 
#>   0.161   0.002   0.164
```

Key Improvements

✅ Performance: Degenerate cases now complete in under a second (about 0.16 s elapsed for n = 100 in the example above) instead of failing
✅ Reliability: No more crashes with data frame construction errors
✅ Graceful degradation: Effect size still reported even when CI can't be calculated
✅ Backward compatibility: Normal cases continue to work with CI as before

Note: For best results, users should pass the data explicitly, as in report(test, data = df), or use the formula interface, kruskal.test(y ~ group, data = df), as suggested by existing warning messages in the report package.
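The difference between the two interfaces is visible in the htest object itself. The base-R sketch below shows that both calls give the same statistic but record their inputs differently in data.name. How report uses that field internally is an assumption here, not something this PR confirms.

```r
set.seed(123)
df <- data.frame(a = as.factor(1:10), b = rnorm(10))

# Default interface: inputs are recorded as the deparsed expressions.
test_default <- kruskal.test(df$b, df$a)

# Formula interface: inputs are recorded as column names.
test_formula <- kruskal.test(b ~ a, data = df)

print(test_default$data.name)
#> [1] "df$b and df$a"
print(test_formula$data.name)
#> [1] "b by a"
```

Column names such as b and a can be looked up in a data frame, whereas expressions such as df$b have to be re-evaluated, which is one plausible reason the package asks for the data or the formula interface.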

Fixes #454.


Co-authored-by: rempsyc <13123390+rempsyc@users.noreply.github.com>

Initial plan

fc1ff25

Copilot AI assigned Copilot and rempsyc Aug 31, 2025

Copilot started work on behalf of rempsyc August 31, 2025 05:16

Fix Kruskal test reporting failure with degenerate cases

fe1a3fc

Co-authored-by: rempsyc <13123390+rempsyc@users.noreply.github.com>

Copilot AI changed the title ~~[WIP] Fail to compute report for Kruskal test in some cases~~ Fix Kruskal-Wallis test reporting failure with degenerate cases (one observation per group) Aug 31, 2025

Copilot AI requested a review from rempsyc August 31, 2025 05:40

Copilot finished work on behalf of rempsyc August 31, 2025 05:40

rempsyc added the copilot-setup-full label Sep 22, 2025
