[Infra] Add timeout info to dashboard #119

Merged: JanKrivanek merged 2 commits into main from dev/jankrivanek/timeout-dashboard on Feb 25, 2026
@JanKrivanek

Sample

image

Copilot AI review requested due to automatic review settings February 25, 2026 11:58

Copilot AI left a comment

Pull request overview

This pull request adds timeout tracking and visualization to the skill-validator system. When a scenario exceeds its configured timeout, that fact is now captured, propagated through the evaluation pipeline, and displayed in both the console output and the dashboard UI.

Changes:

  • Added TimedOut boolean property to RunMetrics and ScenarioComparison models
  • Enhanced AgentRunner to catch TimeoutException, record it as an event, and set the timeout flag
  • Updated console reporter and markdown reporter to show timeout indicators with ⏰ emoji
  • Modified dashboard to display timed-out scenarios with red diamond markers and a summary card
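The capture-and-propagate flow listed above can be sketched roughly as follows. This is an illustration only, written in JavaScript rather than the C# of the actual change; `TimeoutError`, `collectRunMetrics`, `compareScenario`, and the rule that a comparison is flagged when either run timed out are all assumptions, not code from this PR.

```javascript
// Hypothetical sketch: names echo the PR description, not the real C# code.
class TimeoutError extends Error {}

// Run one scenario and record whether it hit its timeout.
// `runScenario` is an assumed callback that throws TimeoutError on expiry.
function collectRunMetrics(runScenario) {
  const metrics = { events: [], timedOut: false };
  try {
    runScenario(metrics);
  } catch (err) {
    // Only timeouts are converted into a flag; other failures still surface.
    if (!(err instanceof TimeoutError)) throw err;
    metrics.events.push({ type: "timeout", message: err.message });
    metrics.timedOut = true;
  }
  return metrics;
}

// Assumed propagation rule: flag the comparison if either run timed out.
function compareScenario(baseline, withSkill) {
  return {
    baselineTimedOut: baseline.timedOut,
    withSkillTimedOut: withSkill.timedOut,
    timedOut: baseline.timedOut || withSkill.timedOut,
  };
}
```

Keeping a per-run flag alongside the combined one would let a reporter mark only the column that actually timed out.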

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| eng/skill-validator/src/Models/Models.cs | Added TimedOut property to RunMetrics and ScenarioComparison |
| eng/skill-validator/src/Services/AgentRunner.cs | Added TimeoutException handler to capture timeout events and set TimedOut flag |
| eng/skill-validator/src/Commands/ValidateCommand.cs | Added timeout propagation logic and warning output for timed-out scenarios |
| eng/skill-validator/src/Services/Reporter.cs | Added timeout indicators in console and markdown output with ⏰ emoji |
| eng/dashboard/generate-benchmark-data.ps1 | Added timedOut flag propagation to benchmark data entries |
| eng/dashboard/dashboard.js | Added timeout visualization with red diamond markers, summary card, and tooltips |
| eng/dashboard/dashboard.html | Added --timeout CSS variable for consistent timeout color (#f85149) |
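As a rough illustration of the dashboard side, the sketch below shows how entries carrying a `timedOut` flag might be turned into marker styles, a summary-card count, and tooltip text. The function names, the entry shape (`scenario`, `score`, `timedOut`), and the `"diamond"`/`"circle"` labels are assumptions made for this example; only the color value `#f85149` comes from the summary above.

```javascript
// Assumption: each benchmark entry looks like { scenario, score, timedOut }.
const TIMEOUT_COLOR = "#f85149"; // value given for the --timeout CSS variable

// Pick a marker for one data point; the shape names are illustrative labels.
function markerFor(entry, defaultColor) {
  return entry.timedOut
    ? { shape: "diamond", color: TIMEOUT_COLOR }
    : { shape: "circle", color: defaultColor };
}

// Data for a summary card: how many entries timed out, out of how many.
function timeoutSummary(entries) {
  const timedOut = entries.filter((e) => e.timedOut === true).length;
  return { timedOut, total: entries.length };
}

// Tooltip text that mentions the timeout only when it happened.
function tooltipFor(entry) {
  const base = `${entry.scenario}: ${entry.score}/5`;
  return entry.timedOut ? `${base} (timed out)` : base;
}
```

Treating a missing `timedOut` field as "not timed out" keeps older benchmark data, produced before the flag existed, rendering as before.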

Comment thread eng/skill-validator/src/Services/AgentRunner.cs
Comment thread eng/skill-validator/src/Commands/ValidateCommand.cs
# Conflicts:
#	eng/skill-validator/src/Services/Reporter.cs
@JanKrivanek changed the title from "Add timeout info to dashboard" to "[Infra] Add timeout info to dashboard" on Feb 25, 2026
@github-actions

Skill Validation Results

| Skill | Scenario | Baseline | With Skill | Δ | Skills Loaded | Verdict |
| --- | --- | --- | --- | --- | --- | --- |
| csharp-scripts | Test a C# language feature with a script | 3.0/5 | 5.0/5 | +2.0 | ✅ csharp-scripts; tools: skill, create, edit | |
| dotnet-pinvoke | Generate LibraryImport declaration from C header (.NET 8+) | 4.0/5 | 5.0/5 | +1.0 | ✅ dotnet-pinvoke; tools: skill | |
| dotnet-pinvoke | Generate LibraryImport declaration from C header (.NET Framework) | 5.0/5 | 5.0/5 | 0.0 | ✅ dotnet-pinvoke; tools: skill | |
| eval-performance | Analyze MSBuild evaluation performance issues | 4.0/5 | 5.0/5 | +1.0 | ✅ eval-performance; tools: skill | |
| check-bin-obj-clash | Diagnose bin/obj output path clashes | 3.5/5 | 5.0/5 | +1.5 | ✅ check-bin-obj-clash; binlog-generation; tools: skill, edit | |
| build-parallelism | Analyze build parallelism bottlenecks | 1.0/5 ⏰ timeout | 2.5/5 ⏰ timeout | +1.5 | ✅ build-parallelism; binlog-generation; binlog-failure-analysis; tools: skill, task, glob | |
| binlog-failure-analysis | Diagnose build failures from binlog only (no source files) | 2.5/5 ⏰ timeout | 3.0/5 ⏰ timeout | +0.5 | ✅ binlog-failure-analysis; tools: skill, load_binlog, edit, view | |
| msbuild-modernization | Modernize legacy project to SDK-style | 5.0/5 | 5.0/5 | 0.0 | ✅ msbuild-modernization; tools: skill | |
| msbuild-antipatterns | Review MSBuild files for anti-patterns and style issues | 5.0/5 | 4.5/5 | -0.5 | ✅ msbuild-antipatterns; tools: skill, glob, edit, bash | |
| incremental-build | Analyze incremental build issues | 3.0/5 | 4.5/5 | +1.5 | ✅ incremental-build; tools: skill, edit, bash | |
| build-perf-diagnostics | Analyze analyzer performance impact on builds | 5.0/5 | 4.5/5 ⏰ timeout | -0.5 | ✅ binlog-generation; build-perf-diagnostics; binlog-failure-analysis; build-perf-baseline; tools: skill | |
| binlog-generation | Build project with /bl flag | 1.5/5 | 5.0/5 | +3.5 | ✅ binlog-generation; tools: skill, read_bash | |
| binlog-generation | Build with /bl in PowerShell | 3.0/5 | 5.0/5 | +2.0 | ✅ binlog-generation; tools: skill | |
| binlog-generation | Build multiple configurations with unique binlogs | 2.5/5 | 5.0/5 | +2.5 | ✅ binlog-generation; tools: skill | |
| including-generated-files | Diagnose generated file inclusion failure | 3.0/5 | 5.0/5 | +2.0 | ✅ including-generated-files; tools: skill | |

⏰ timeout: the run hit the scenario timeout limit; scoring may be affected because model execution was aborted before it could produce its full output.
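For readers who want to reproduce the cell format used in the results table, here is a minimal sketch of how a score cell with a timeout indicator and a signed delta might be rendered. `formatScore` and `formatDelta` are hypothetical helpers written for this example, not the reporter's actual code, and the 5-point scale is taken from the table.

```javascript
// Hypothetical helper: render "2.5/5" and append the indicator on timeout.
function formatScore(score, timedOut) {
  const cell = `${score.toFixed(1)}/5`;
  return timedOut ? `${cell} ⏰ timeout` : cell;
}

// Hypothetical helper: signed difference with one decimal, "+" for gains only.
function formatDelta(baseline, withSkill) {
  const delta = withSkill - baseline;
  const sign = delta > 0 ? "+" : "";
  return `${sign}${delta.toFixed(1)}`;
}

// Example row, using the build-parallelism scores from the table.
const row = [
  formatScore(1.0, true),
  formatScore(2.5, true),
  formatDelta(1.0, 2.5),
].join(" | ");
console.log(row);
```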

Model: claude-opus-4.6 | Judge: claude-opus-4.6

Full results

@JanKrivanek merged commit e2da803 into main on Feb 25, 2026
9 checks passed
@JanKrivanek deleted the dev/jankrivanek/timeout-dashboard branch on February 25, 2026 12:45