[None][test] remove outdated model in perf test by ruodil · Pull Request #14992 · NVIDIA/TensorRT-LLM

ruodil · 2026-06-05T05:31:54Z

Summary by CodeRabbit

Tests
- Updated performance testing configurations for large language model evaluation with new model variants and optimized GPU resource allocation strategies.

Description

Test Coverage

PR Checklist

Please review the following before submitting your PR:

- PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.
- PR follows TRT-LLM CODING GUIDELINES to the best of your knowledge.
- Test cases are provided for new code paths (see test instructions).
- If PR introduces API changes, an appropriate PR label is added: either api-compatible or api-breaking. For api-breaking, include BREAKING in the PR title.
- Any new dependencies have been scanned for license and vulnerabilities.
- CODEOWNERS updated if ownership changes.
- Documentation updated as needed.
- Update tava architecture diagram if there is a significant design change in PR.
- The reviewers assigned automatically/manually are appropriate for the PR.
- Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

To see a list of available CI bot commands, please comment /bot help.

Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com>

coderabbitai · 2026-06-05T05:33:01Z

Walkthrough

This PR updates the LLM performance test configuration by swapping the common model entry tested across all GPU counts from a Qwen 3 BF16 variant to Llama 3.1 FP8, and updates the parallelization strategy for Minimax M2.5 FP8 on 8-GPU systems from tensor parallelism to expert parallelism.

Changes

LLM Performance Test Configuration

| Layer / File(s) | Summary |
|---|---|
| Common model test swap in all-GPU conditions `tests/integration/test_lists/qa/llm_perf_core.yml` | Condition 1 removes `qwen3_0.6b` BF16 `test_perf` entry and adds `llama_v3.1_8b_instruct_fp8` FP8 `test_perf` entry with 128/128 input/output lengths. |
| Minimax FP8 parallelization strategy update `tests/integration/test_lists/qa/llm_perf_core.yml` | For 8-GPU system configuration, `minimax_m2.5_fp8` `test_perf` entries switch from tensor parallelism (`tp:8`) to expert parallelism (`ep:8`) across min-latency and max-throughput variants. |
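The `tp:8` to `ep:8` switch described above can be illustrated with a minimal Python sketch. This is not the tooling used in the PR: the function name `switch_to_expert_parallel` and the entry strings are hypothetical, and the sketch assumes test entries are plain strings whose parameters are separated by `-`. The actual layout of `llm_perf_core.yml` is not shown in this conversation.

```python
import re


def switch_to_expert_parallel(entries, model="minimax_m2.5_fp8", degree=8):
    """Return a copy of `entries` with `tp:<degree>` replaced by `ep:<degree>`.

    Only entries that mention `model` are rewritten. The entry format is an
    assumption made for illustration, not the real test list syntax.
    """
    # \b keeps "tp:8" from matching inside a longer token such as "tp:80".
    pattern = re.compile(rf"\btp:{degree}\b")
    updated = []
    for entry in entries:
        if model in entry:
            entry = pattern.sub(f"ep:{degree}", entry)
        updated.append(entry)
    return updated


# Hypothetical entries, written only to exercise the function.
example_entries = [
    "minimax_m2.5_fp8-maxbs:1-tp:8-gpus:8",
    "llama_v3.1_8b_instruct_fp8-input_output_len:128,128-tp:8",
]
rewritten = switch_to_expert_parallel(example_entries)
```

In this sketch only the first entry is rewritten, because the second does not mention `minimax_m2.5_fp8`. Matching on the model name first keeps the change scoped to the entries the walkthrough describes.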

Possibly related PRs

NVIDIA/TensorRT-LLM#14613: Both PRs modify tests/integration/test_lists/qa/llm_perf_core.yml to switch minimax_m2.5_fp8 8-GPU perf test entries from tp: 8 to ep: 8 (including the same maxbs variants), aligning on the same EP=8 change.
NVIDIA/TensorRT-LLM#14749: Both PRs modify the LLM performance test matrix in tests/integration/test_lists/qa/llm_perf_core.yml by changing/removing specific model test_perf entries and GPU-tier configuration details.
NVIDIA/TensorRT-LLM#14952: Both PRs modify the same tests/integration/test_lists/qa/llm_perf_core.yml performance test matrix, including swapping the "all GPUs" common model entries (e.g., qwen3_0.6b vs llama_v3.1_8b_instruct_fp8) and adjusting related minimax_m2.5_fp8 per-condition parameters, so the changes overlap directly.

Suggested reviewers

yufeiwu-nv
StanleySun639
leslie-fang25

🎯 2 (Simple) | ⏱️ ~8 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Description check | ⚠️ Warning | The PR description is entirely a template with no actual content provided. All required sections (Description, Test Coverage) are empty placeholders. | Fill in the Description section explaining why the outdated model was removed and its impact. Add Test Coverage section listing relevant tests that validate this change. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately summarizes the main change: removing an outdated model from a performance test configuration file. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

ruodil · 2026-06-05T05:34:14Z

/bot skip --comment "skip test as just modifying test cases"

tensorrt-cicd · 2026-06-05T05:41:03Z

PR_Github #52264 [ skip ] triggered by Bot. Commit: 2f289a3

tensorrt-cicd · 2026-06-05T05:46:38Z

PR_Github #52264 [ skip ] completed with state SUCCESS. Commit: 2f289a3
Skipping testing for commit 2f289a3

Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com> Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com> Signed-off-by: NVFB <186336021+NVFB@users.noreply.github.com>

remove outdated model in perf test

2f289a3

Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com>

ruodil requested a review from a team as a code owner June 5, 2026 05:31

github-actions Bot assigned ruodil Jun 5, 2026

ruodil requested a review from yufeiwu-nv June 5, 2026 05:34

ruodil enabled auto-merge (squash) June 5, 2026 05:34

yufeiwu-nv approved these changes Jun 5, 2026

ruodil merged commit f0ca418 into NVIDIA:main Jun 5, 2026
13 of 15 checks passed

2ez4bz pushed a commit to 2ez4bz/TensorRT-LLM that referenced this pull request Jun 8, 2026

[None][test] remove outdated model in perf test (NVIDIA#14992)

a7c8b24

Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com> Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>

ruodil merged 1 commit into NVIDIA:main from ruodil:user/ruodil/perf

3 participants
