[None][test] remove outdated model in perf test #14992

Merged: ruodil merged 1 commit into NVIDIA:main from ruodil:user/ruodil/perf on Jun 5, 2026

ruodil commented Jun 5, 2026

Collaborator

Summary by CodeRabbit

  • Tests
    • Updated performance testing configurations for large language model evaluation with new model variants and optimized GPU resource allocation strategies.

Description

Test Coverage

PR Checklist

Please review the following before submitting your PR:

  • PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.

  • PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.

  • Test cases are provided for new code paths (see test instructions)

  • If PR introduces API changes, an appropriate PR label is added - either api-compatible or api-breaking. For api-breaking, include BREAKING in the PR title.

  • Any new dependencies have been scanned for license and vulnerabilities

  • CODEOWNERS updated if ownership changes

  • Documentation updated as needed

  • Update tava architecture diagram if there is a significant design change in PR.

  • The reviewers assigned automatically/manually are appropriate for the PR.

  • Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

To see a list of available CI bot commands, please comment /bot help.

Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com>
ruodil requested a review from a team as a code owner on June 5, 2026 05:31
coderabbitai Bot commented Jun 5, 2026

Contributor

📝 Walkthrough

This PR updates the LLM performance test configuration by swapping the common model entry tested across all GPU counts from a Qwen 3 BF16 variant to Llama 3.1 FP8, and updates the parallelization strategy for Minimax M2.5 FP8 on 8-GPU systems from tensor parallelism to expert parallelism.

Changes

LLM Performance Test Configuration

| Layer / File(s) | Summary |
| --- | --- |
| Common model test swap in all-GPU conditions (tests/integration/test_lists/qa/llm_perf_core.yml) | Condition 1 removes qwen3_0.6b BF16 test_perf entry and adds llama_v3.1_8b_instruct_fp8 FP8 test_perf entry with 128/128 input/output lengths. |
| Minimax FP8 parallelization strategy update (tests/integration/test_lists/qa/llm_perf_core.yml) | For 8-GPU system configuration, minimax_m2.5_fp8 test_perf entries switch from tensor parallelism (tp:8) to expert parallelism (ep:8) across min-latency and max-throughput variants. |
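The tp:8 to ep:8 switch described in the summary above can be illustrated with a small rewrite over test entry strings. This is only a sketch under stated assumptions: the entry strings, the `maxbs` values, and the helper name `switch_tp_to_ep` are hypothetical and are not copied from llm_perf_core.yml; only the model names and the `tp:8`/`ep:8` tokens come from the walkthrough.

```python
import re


def switch_tp_to_ep(entries, model, degree=8):
    """Return a new list where `tp:<degree>` becomes `ep:<degree>`
    in every entry that mentions `model`. Other entries are untouched.

    Hypothetical helper for illustration; not part of TensorRT-LLM.
    """
    # \b keeps `tp:8` from matching inside a longer value such as `tp:80`.
    pattern = re.compile(rf"\btp:{degree}\b")
    return [
        pattern.sub(f"ep:{degree}", entry) if model in entry else entry
        for entry in entries
    ]


# Hypothetical entry strings; the real file's format may differ.
entries = [
    "minimax_m2.5_fp8-maxbs:1-tp:8",
    "minimax_m2.5_fp8-maxbs:256-tp:8",
    "llama_v3.1_8b_instruct_fp8-input_output_len:128,128",
]

updated = switch_tp_to_ep(entries, "minimax_m2.5_fp8")
for before, after in zip(entries, updated):
    print(f"{before} -> {after}")
```

The function returns a new list instead of mutating its input, so the original entries remain available for comparison.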

Possibly related PRs

  • NVIDIA/TensorRT-LLM#14613: Both PRs modify tests/integration/test_lists/qa/llm_perf_core.yml to switch minimax_m2.5_fp8 8-GPU perf test entries from tp: 8 to ep: 8 (including the same maxbs variants), aligning on the same EP=8 change.
  • NVIDIA/TensorRT-LLM#14749: Both PRs modify the LLM performance test matrix in tests/integration/test_lists/qa/llm_perf_core.yml by changing/removing specific model test_perf entries and GPU-tier configuration details.
  • NVIDIA/TensorRT-LLM#14952: Both PRs modify the same tests/integration/test_lists/qa/llm_perf_core.yml performance test matrix, including swapping the "all GPUs" common model entries (e.g., qwen3_0.6b vs llama_v3.1_8b_instruct_fp8) and adjusting related minimax_m2.5_fp8 per-condition parameters, so the changes overlap directly.

Suggested reviewers

  • yufeiwu-nv
  • StanleySun639
  • leslie-fang25

🎯 2 (Simple) | ⏱️ ~8 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Description check | ⚠️ Warning | The PR description is entirely a template with no actual content provided. All required sections (Description, Test Coverage) are empty placeholders. | Fill in the Description section explaining why the outdated model was removed and its impact. Add Test Coverage section listing relevant tests that validate this change. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately summarizes the main change: removing an outdated model from a performance test configuration file. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Comment @coderabbitai help to get the list of available commands and usage tips.

ruodil commented Jun 5, 2026

Collaborator Author

/bot skip --comment "skip test as just modifying test cases"

ruodil requested a review from yufeiwu-nv on June 5, 2026 05:34
ruodil enabled auto-merge (squash) on June 5, 2026 05:34
@tensorrt-cicd

Collaborator

PR_Github #52264 [ skip ] triggered by Bot. Commit: 2f289a3 Link to invocation

@tensorrt-cicd

Collaborator

PR_Github #52264 [ skip ] completed with state SUCCESS. Commit: 2f289a3
Skipping testing for commit 2f289a3

Link to invocation

ruodil merged commit f0ca418 into NVIDIA:main on Jun 5, 2026
13 of 15 checks passed
fbxai pushed a commit to fbxai/TensorRT-LLM that referenced this pull request Jun 5, 2026
Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com>
Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>
Signed-off-by: NVFB <186336021+NVFB@users.noreply.github.com>
2ez4bz pushed a commit to 2ez4bz/TensorRT-LLM that referenced this pull request Jun 8, 2026
Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com>
Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>
3 participants