dotnet-test: add Rule #0 (Confirm the Test Target) to ruby/powershell extensions #702

Merged
Evangelink merged 3 commits into main from dev/amauryleve/improve-ruby-powershell-test-extensions
Jun 1, 2026

@Evangelink Evangelink commented May 28, 2026

Summary

When the test-generation prompt does not name a specific file ("test the repository", "one core module", "comprehensive suite"), agents frequently target the wrong code — typically the largest upstream module that already has rich existing tests — instead of the newly-added module the user actually wants tested.

Recent benchmark runs showed this is a dominant failure mode for Ruby/PowerShell: the agent burns 50+ turns writing tests for files the verifier never measures, while the real target (a small untracked lib/string_utils.rb or tools/StringUtils.psm1) sits one git status away.

Change

Adds Rule #0: Confirm the Test Target to both ruby.md and powershell.md extensions, ahead of the existing "Rule #1: Investigate the Repo First". The rule:

  • Tells the agent to use git history (git status -s, git ls-files --others --exclude-standard, git log --diff-filter=A --name-only -5) to find the actual target rather than guessing from repo size.
  • Documents a Test Placement Contract — RSpec scopes to spec/, Pester scopes to whatever directory the harness passes to Invoke-Pester -Path; tests placed elsewhere are invisible.
  • Adds a First-Test Sanity Loop: write one test, run --dry-run / -PassThru to confirm discovery > 0, fix LoadError / Import-Module issues before expanding. Catches placement mistakes on turn 1.
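
The git-based discovery in the first bullet can be sketched as a small read-only shell helper. This is an illustration only, not the text added to ruby.md or powershell.md: the function names are hypothetical, and `--pretty=format:` is added here (it is not in the PR's command) so that only file names are printed.

```shell
#!/bin/sh
# Sketch of read-only target discovery. Function names are hypothetical.

# Files git does not track yet and does not ignore: the likeliest new module.
list_untracked() {
  git ls-files --others --exclude-standard
}

# Files added in the last five commits, one per line.
# Prints nothing when the repository has no commits yet.
list_recently_added() {
  git log --diff-filter=A --name-only --pretty=format: -5 2>/dev/null |
    sed '/^$/d' | sort -u
}

# Print the short status followed by both candidate lists.
confirm_test_target() {
  echo "== status =="
  git status -s
  echo "== untracked =="
  list_untracked
  echo "== recently added =="
  list_recently_added
}
```

Run `confirm_test_target` from the repository root. A small untracked file such as lib/string_utils.rb would appear under "untracked" even when the rest of the repository is much larger.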

This complements an msbench-side benchmark fix that aligns the affected prompts with their verifier scope; the skill change provides defense-in-depth for real-world vague prompts.

Validation

  • dotnet run --project eng/skill-validator/src -- check --plugin ./plugins/dotnet-test: ✅ All checks passed (22 skills, 11 agents, 1 plugin).
  • Pure prepend: existing Rule #1 and all subsequent sections are unchanged.
  • +89 lines, 0 deletions across two files.

… extensions

When the prompt does not name a specific file ("test the repository", "one
core module", "comprehensive suite"), agents frequently target the wrong
code — typically the largest upstream module that already has rich existing
tests — instead of the newly-added module the user actually wants tested.
Recent benchmark runs (top5-{ruby,powershell}-*-{simple,complex}) showed
this is a dominant failure mode for Ruby/PowerShell: the agent burns 50+
turns writing tests for files the verifier never measures, while the real
target (a small untracked `lib/string_utils.rb` or `tools/StringUtils.psm1`)
sits one `git status` away.

Adds a Rule #0 to both ruby.md and powershell.md, ahead of the existing
"Rule #1: Investigate the Repo First". The rule:

- Tells the agent to use git history (`git status -s`,
  `git ls-files --others --exclude-standard`,
  `git log --diff-filter=A --name-only -5`) to find the actual target
  rather than guessing from repo size.
- Documents a Test Placement Contract — RSpec scopes to `spec/`,
  Pester scopes to whatever directory the harness passes to
  `Invoke-Pester -Path`; tests placed elsewhere are invisible.
- Adds a First-Test Sanity Loop: write one test, run --dry-run /
  -PassThru to confirm discovery > 0, fix LoadError / Import-Module
  issues before expanding. Catches placement mistakes on turn 1.
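
For the RSpec side, the First-Test Sanity Loop's "discovery > 0" check could look roughly like the sketch below. It assumes the default RSpec formatter, whose summary line has the form "N examples, M failures". The helper name `examples_discovered` and the spec path in the usage comment are hypothetical. The Pester equivalent is not shown.

```shell
#!/bin/sh
# Sketch: read RSpec output on stdin and print how many examples were found.
# Assumes a summary line such as "1 example, 0 failures".
# Prints 0 when no summary line is present (for example after a LoadError).
examples_discovered() {
  count=$(sed -n 's/^\([0-9][0-9]*\) examples\{0,1\}, .*/\1/p' | tail -n 1)
  echo "${count:-0}"
}

# Hypothetical usage, assuming rspec is installed and one spec file exists:
#   bundle exec rspec --dry-run spec/string_utils_spec.rb | examples_discovered
```

A result of 0 after writing the first test suggests the file is outside the discovery scope or fails to load, which is cheaper to fix before more tests are added.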

Both files validate cleanly under skill-validator. The existing Rule #1
and all subsequent sections are unchanged — this is a pure prepend.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings May 28, 2026 16:46

Copilot AI left a comment

Pull request overview

Adds a new “Rule #0: Confirm the Test Target” to the Ruby and PowerShell code-testing extensions to reduce a common failure mode where agents write tests for the wrong (often large, already-tested) parts of a repo when prompts are vague.

Changes:

  • Prepend a Ruby Rule #0 describing git-based target discovery, test placement expectations (RSpec/Minitest), and a first-test discovery sanity check.
  • Prepend a PowerShell Rule #0 describing git-based target discovery, Pester test placement expectations, and a first-test discovery sanity check.
Summary per file:

| File | Description |
| --- | --- |
| plugins/dotnet-test/skills/code-testing-extensions/extensions/ruby.md | Adds Rule #0 to guide correct Ruby test target selection and early spec discovery validation. |
| plugins/dotnet-test/skills/code-testing-extensions/extensions/powershell.md | Adds Rule #0 to guide correct PowerShell/Pester test target selection and early test discovery validation. |

Copilot's findings

  • Files reviewed: 2/2 changed files
  • Comments generated: 1

Comment thread plugins/dotnet-test/skills/code-testing-extensions/extensions/powershell.md Outdated
@github-actions

Skill Coverage Report

Plugin Skill Covered Coverage

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings May 28, 2026 17:10

Copilot AI left a comment

Copilot's findings

  • Files reviewed: 2/2 changed files
  • Comments generated: 3

Comment thread plugins/dotnet-test/skills/code-testing-extensions/extensions/ruby.md Outdated
- Rephrase Rule #0's discovery preamble in ruby.md and powershell.md to
  call out the commands as the read-only exception to Rule #1, removing
  the apparent contradiction between 'before planning' and Rule #1's
  'before writing any test or running any command'.
- Broaden the spec_helper require grep to match leading whitespace and
  `require_relative`, avoiding false negatives that send the agent to
  the wrong target.
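
The exact pattern used in ruby.md is not shown in this conversation. As an assumption about what a broadened grep might look like, the sketch below matches both `require` and `require_relative`, with or without leading whitespace. The helper name `specs_requiring` is hypothetical.

```shell
#!/bin/sh
# Sketch: list spec files that load a given library file.
# $1 = library name without extension (e.g. string_utils)
# $2 = directory to search (defaults to spec)
# Matches `require` and `require_relative`, indented or not,
# with single or double quotes and optional parentheses.
specs_requiring() {
  name=$1
  dir=${2:-spec}
  grep -rlE "^[[:space:]]*require(_relative)?[[:space:](]+['\"][^'\"]*${name}['\"]" "$dir"
}
```

An empty result for the new module, alongside non-empty results for older modules, is a hint that the new module is the one still lacking tests.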

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@Evangelink

/evaluate

@Evangelink Evangelink enabled auto-merge (squash) June 1, 2026 07:44
github-actions Bot added a commit that referenced this pull request Jun 1, 2026
@github-actions

github-actions Bot commented Jun 1, 2026

Skill Validation Results

Model: claude-opus-4.6 | Judge: claude-opus-4.6

🔍 Full Results - additional metrics and failure investigation steps

▶ Sessions Visualisation -- interactive replay of all evaluation sessions

@Evangelink Evangelink merged commit 2b7fcdd into main Jun 1, 2026
37 checks passed
@Evangelink Evangelink deleted the dev/amauryleve/improve-ruby-powershell-test-extensions branch June 1, 2026 08:22