Enable opt-6.7b benchmark on inf2 #2400
Codecov Report
@@           Coverage Diff           @@
##           master    #2400   +/-   ##
=======================================
  Coverage   71.89%   71.89%
=======================================
  Files          78       78
  Lines        3654     3654
  Branches       58       58
=======================================
  Hits         2627     2627
  Misses       1023     1023
  Partials        4        4
Why do we have different mar files for each batch size?
For inferentia2, we'll need to trace the model separately to support different batch sizes. Here, the model is being traced at model load time using
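To illustrate the point about tracing per batch size, here is a minimal sketch in plain Python. It does not call any Neuron or TorchServe API: `trace_for_batch_size` is a hypothetical stand-in for the real tracing step, and the only assumption it encodes is that a traced model accepts just the batch size it was traced for. The batch sizes and the `scripted_mode_opt_6.7b_neuronx_batch_N` labels are taken from the benchmark report in this PR.

```python
from typing import Callable, Dict, List

# Batch sizes that appear in the benchmark report of this PR.
BATCH_SIZES = [1, 2, 4, 8]


def trace_for_batch_size(batch_size: int) -> Callable[[List[str]], List[str]]:
    """Hypothetical stand-in for the real tracing step (not a Neuron API).

    Returns a callable that, like a model traced for a fixed input shape,
    rejects any batch whose size differs from the one it was built for.
    """

    def traced_model(prompts: List[str]) -> List[str]:
        if len(prompts) != batch_size:
            raise ValueError(
                f"model traced for batch size {batch_size}, got {len(prompts)}"
            )
        # Placeholder output; a real model would generate text here.
        return [f"generated text for: {p}" for p in prompts]

    return traced_model


def benchmark_label(batch_size: int) -> str:
    """Build the per-batch-size label used in the benchmark report."""
    return f"scripted_mode_opt_6.7b_neuronx_batch_{batch_size}"


# One traced model per batch size, which is why each batch size needs
# its own artifact in this sketch.
models: Dict[int, Callable[[List[str]], List[str]]] = {
    bs: trace_for_batch_size(bs) for bs in BATCH_SIZES
}

if __name__ == "__main__":
    for bs in BATCH_SIZES:
        outputs = models[bs]([f"prompt {i}" for i in range(bs)])
        print(benchmark_label(bs), len(outputs))
```

Under that assumption, sending a batch of 4 to the model traced for batch size 2 fails, so serving several batch sizes means keeping one traced model, and here one archive, per size.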
unblocking
Description
Enable benchmarking for the opt-6.7b model on inferentia2, based on the inf2 example: #2399
Model archives:
Type of change
Feature/Issue validation/testing
inf2-opt-benchmark-test
Benchmark results
TorchServe Benchmark on neuronx
Date: 2023-06-22 08:44:16
TorchServe Version: inf2-opt-benchmark-test
scripted_mode_opt_6.7b_neuronx_batch_1
scripted_mode_opt_6.7b_neuronx_batch_2
scripted_mode_opt_6.7b_neuronx_batch_4
scripted_mode_opt_6.7b_neuronx_batch_8