feat: add reasoning support #63
Conversation
Note: Other AI code review bot(s) detected. CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.
Walkthrough: Reasoning configuration support was introduced across multiple providers and models. New structs and fields were added to represent reasoning parameters, with validation and provider-specific mapping logic. Provider implementations now validate and process reasoning configs, adjust request payloads accordingly, and enhance logging. Streaming and response models were updated to accommodate reasoning fields.
Sequence Diagram(s)
sequenceDiagram
participant Client
participant Provider
participant Model
Client->>Provider: Send ChatCompletionRequest (with optional reasoning)
Provider->>Provider: Validate reasoning config
alt Invalid reasoning
Provider-->>Client: Return BAD_REQUEST
else Valid or absent reasoning
Provider->>Model: Map reasoning config to provider-specific format
Model-->>Provider: Return response
Provider-->>Client: Return ChatCompletionResponse (with/without reasoning)
end
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~40 minutes
Actionable comments posted: 5
🧹 Nitpick comments (3)
src/providers/vertexai/models.rs (3)
336-339: Consider reducing debug logging verbosity. While the debug logging is helpful during development, the extensive logging (5 debug statements for reasoning processing) might be excessive in production. Consider consolidating or reducing the verbosity.
-    tracing::debug!(
-        "🔄 Converting ChatCompletionRequest to GeminiChatRequest, reasoning: {:?}",
-        req.reasoning
-    );
     // ... other code ...
     let thinking_config = req
         .reasoning
         .as_ref()
         .and_then(|r| {
-            tracing::debug!("📝 Processing reasoning config for thinkingConfig: {:?}", r);
             r.to_gemini_thinking_budget()
         })
         .map(|budget| {
-            tracing::debug!("🎛️ Creating ThinkingConfig with budget: {} tokens", budget);
             ThinkingConfig {
                 thinking_budget: Some(budget),
             }
         });
-    tracing::debug!("🔧 Final thinking_config: {:?}", thinking_config);
+    if thinking_config.is_some() {
+        tracing::debug!("Reasoning enabled with thinking_config: {:?}", thinking_config);
+    }

Also applies to: 414-429, 473-477
555-556: TODO: Implement reasoning extraction from Vertex AI responses. The TODO comment indicates that reasoning extraction from Vertex AI responses is not yet implemented.
Would you like me to help implement the logic to extract thinking content from Vertex AI responses or create an issue to track this task?
623-624: TODO: Implement reasoning extraction from streaming responses. The TODO comment indicates that reasoning extraction from Vertex AI streaming responses is not yet implemented.
Would you like me to help implement the logic to extract thinking content from streaming responses or create an issue to track this task?
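For orientation only, one possible shape of that extraction is sketched below. This is purely illustrative: the `thought` marker on parts and the local struct names are assumptions, not the crate's actual Vertex AI types.

```rust
// Hypothetical, simplified response types for illustration only.
#[derive(Debug)]
struct GeminiPart {
    text: Option<String>,
    thought: Option<bool>, // assumed marker distinguishing thinking content from the answer
}

#[derive(Debug)]
struct GeminiContent {
    parts: Vec<GeminiPart>,
}

/// Split a candidate's parts into (reasoning, answer) strings.
fn split_reasoning(content: &GeminiContent) -> (Option<String>, String) {
    let mut reasoning = String::new();
    let mut answer = String::new();
    for part in &content.parts {
        if let Some(text) = &part.text {
            if part.thought.unwrap_or(false) {
                reasoning.push_str(text);
            } else {
                answer.push_str(text);
            }
        }
    }
    let reasoning = if reasoning.is_empty() { None } else { Some(reasoning) };
    (reasoning, answer)
}

fn main() {
    let content = GeminiContent {
        parts: vec![
            GeminiPart { text: Some("Thinking it through...".into()), thought: Some(true) },
            GeminiPart { text: Some("Final answer.".into()), thought: None },
        ],
    };
    println!("{:?}", split_reasoning(&content));
}
```

The same splitting helper could be reused for streaming chunks, accumulating the two buffers per delta instead of per response.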
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (12)
- src/models/chat.rs (3 hunks)
- src/models/streaming.rs (1 hunks)
- src/pipelines/otel.rs (1 hunks)
- src/providers/anthropic/models.rs (3 hunks)
- src/providers/anthropic/provider.rs (3 hunks)
- src/providers/azure/provider.rs (4 hunks)
- src/providers/bedrock/models.rs (2 hunks)
- src/providers/bedrock/test.rs (5 hunks)
- src/providers/openai/provider.rs (2 hunks)
- src/providers/vertexai/models.rs (8 hunks)
- src/providers/vertexai/provider.rs (2 hunks)
- src/providers/vertexai/tests.rs (14 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
src/providers/openai/provider.rs (1)
src/providers/azure/provider.rs (1)
from (lines 25-35)
🔇 Additional comments (27)
src/models/streaming.rs (1)
24-25: LGTM! Reasoning field properly added to streaming delta. The reasoning field addition is well-implemented with appropriate serialization behavior and consistent with other optional fields in the struct. This enables reasoning content to be captured during streaming chat completions.
src/providers/bedrock/test.rs (5)
126-126: LGTM! Test updated for new reasoning field. The test correctly includes the new reasoning field set to None, maintaining compatibility with the extended ChatCompletionRequest structure.
241-241: LGTM! Test updated for new reasoning field. Consistent with other test updates, the reasoning field is properly set to None.
367-367: LGTM! Test updated for new reasoning field. The AI21 chat completion test is updated consistently with the new field structure.
442-442: LGTM! Test updated for new reasoning field. The ARN test maintains consistency with the new ChatCompletionRequest structure.
494-494: LGTM! Test updated for new reasoning field. The inference profile test is properly updated with the new reasoning field.
src/pipelines/otel.rs (1)
154-154: LGTM! Reasoning data properly captured in streaming traces. The implementation correctly propagates reasoning content from streaming chunks to the accumulated completion, ensuring reasoning data is included in OpenTelemetry traces. This aligns with the pattern used for other optional fields in the same method.
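As a rough illustration of that accumulation pattern, a minimal sketch follows; the struct names are simplified stand-ins for the actual streaming types, not the crate's definitions.

```rust
// Simplified stand-ins for the streaming delta types.
#[derive(Default)]
struct ChoiceDelta {
    content: Option<String>,
    reasoning: Option<String>,
}

struct ChunkChoice {
    delta: ChoiceDelta,
}

struct ChatCompletionChunk {
    choices: Vec<ChunkChoice>,
}

/// Fold streamed deltas into full content/reasoning strings for tracing.
fn accumulate(chunks: &[ChatCompletionChunk]) -> (String, Option<String>) {
    let mut content = String::new();
    let mut reasoning = String::new();
    for chunk in chunks {
        for choice in &chunk.choices {
            if let Some(c) = &choice.delta.content {
                content.push_str(c);
            }
            if let Some(r) = &choice.delta.reasoning {
                reasoning.push_str(r);
            }
        }
    }
    let reasoning = if reasoning.is_empty() { None } else { Some(reasoning) };
    (content, reasoning)
}

fn main() {
    let chunks = vec![ChatCompletionChunk {
        choices: vec![ChunkChoice {
            delta: ChoiceDelta { content: Some("Hi".into()), reasoning: Some("thinking...".into()) },
        }],
    }];
    println!("{:?}", accumulate(&chunks));
}
```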
src/providers/bedrock/models.rs (2)
127-127: LGTM! Reasoning field properly initialized for Titan responses. The reasoning field is correctly set to None in the Titan response conversion, maintaining compatibility with the extended ChatCompletionChoice structure while indicating that reasoning extraction is not yet implemented for Titan models.
316-316: LGTM! Reasoning field properly initialized for AI21 responses. Consistent with the Titan implementation, the reasoning field is appropriately set to None for AI21 response conversions, maintaining structural compatibility.
src/providers/vertexai/tests.rs (1)
285-285: LGTM! Test requests correctly updated with the new reasoning field. All test cases have been consistently updated to include reasoning: None in their ChatCompletionRequest instances, properly reflecting the new optional field added to the request structure.
Also applies to: 533-533, 674-674, 801-801, 854-854, 892-892, 975-975, 1084-1084, 1176-1176, 1226-1226, 1445-1445, 1504-1504, 1550-1550, 1684-1684
src/providers/vertexai/provider.rs (2)
140-169: LGTM! Comprehensive reasoning validation and logging. The implementation properly validates the reasoning configuration, returns appropriate error codes on validation failure, and provides detailed logging at different levels (error, info, debug) for debugging and monitoring purposes.
207-219: LGTM! Enhanced request debugging capabilities. The added logging provides excellent visibility into the request being sent, including pretty-printed JSON and specific generation/thinking config details. This will be invaluable for debugging reasoning-related issues.
src/providers/anthropic/provider.rs (1)
69-74: LGTM! Proper handling of reasoning exclusion flag. The code correctly extracts the exclude flag from the reasoning config and passes it to the response conversion method, allowing for proper filtering of reasoning content in the response.
Also applies to: 98-100
src/providers/azure/provider.rs (2)
16-36: LGTM! Well-structured Azure-specific request handling. The AzureChatCompletionRequest wrapper correctly transforms the generic reasoning config into Azure's reasoning_effort field format, properly clearing the original field to avoid duplication.
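A minimal sketch of that wrapper pattern is below; the field and type names are assumptions based on this review, not the exact source.

```rust
use serde::Serialize;

// Assumed shapes, reduced to the fields relevant to reasoning.
#[derive(Clone, Serialize)]
struct ReasoningConfig {
    effort: Option<String>,
    max_tokens: Option<u32>,
}

#[derive(Clone, Serialize)]
struct ChatCompletionRequest {
    model: String,
    #[serde(skip_serializing_if = "Option::is_none")]
    reasoning: Option<ReasoningConfig>,
}

#[derive(Serialize)]
struct AzureChatCompletionRequest {
    #[serde(flatten)]
    base: ChatCompletionRequest,
    #[serde(skip_serializing_if = "Option::is_none")]
    reasoning_effort: Option<String>,
}

impl From<ChatCompletionRequest> for AzureChatCompletionRequest {
    fn from(mut req: ChatCompletionRequest) -> Self {
        // Lift the generic config into Azure's flat field and clear the original
        // so the serialized payload does not carry both representations.
        let reasoning_effort = req.reasoning.take().and_then(|r| r.effort);
        AzureChatCompletionRequest { base: req, reasoning_effort }
    }
}

fn main() {
    let req = ChatCompletionRequest {
        model: "o3-mini".into(), // placeholder model name
        reasoning: Some(ReasoningConfig { effort: Some("high".into()), max_tokens: None }),
    };
    let azure: AzureChatCompletionRequest = req.into();
    println!("{}", serde_json::to_string_pretty(&azure).unwrap());
}
```

The OpenAI wrapper discussed below follows the same shape, so a shared helper for the effort mapping could avoid drift between the two providers.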
117-118: LGTM! Proper request transformation for Azure API. The code correctly converts the generic request to Azure-specific format before sending to the API.
Also applies to: 123-123
src/providers/openai/provider.rs (2)
15-35: LGTM! Consistent implementation with Azure provider. The OpenAIChatCompletionRequest wrapper follows the same pattern as the Azure provider, correctly transforming the reasoning config into OpenAI's reasoning_effort field format.
83-84: LGTM! Proper request transformation. The code correctly converts the request to OpenAI-specific format before sending to the API.
Also applies to: 89-89
src/providers/anthropic/models.rs (3)
93-115: LGTM! Clean implementation of reasoning prompt injection. The code correctly extracts the system message and conditionally appends the reasoning prompt, handling both cases where a system message exists or not.
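For illustration, the injection described above could look roughly like the sketch below; the message/config shapes are assumed and the prompt text is a placeholder, not the PR's actual wording.

```rust
// Assumed, simplified message and config types.
#[derive(Clone, Debug)]
struct Message {
    role: String,
    content: String,
}

struct ReasoningConfig {
    effort: Option<String>,
}

impl ReasoningConfig {
    // Placeholder prompt; the real mapping varies by effort level.
    fn to_thinking_prompt(&self) -> Option<String> {
        self.effort
            .as_deref()
            .map(|e| format!("Think through this step-by-step ({e} effort) before answering."))
    }
}

/// Append the reasoning prompt to an existing system message, or create one.
fn inject_reasoning_prompt(messages: &mut Vec<Message>, reasoning: Option<&ReasoningConfig>) {
    let Some(prompt) = reasoning.and_then(|r| r.to_thinking_prompt()) else {
        return;
    };
    if let Some(system) = messages.iter_mut().find(|m| m.role == "system") {
        system.content.push_str("\n\n");
        system.content.push_str(&prompt);
    } else {
        messages.insert(0, Message { role: "system".into(), content: prompt });
    }
}

fn main() {
    let mut messages = vec![Message { role: "user".into(), content: "2+2?".into() }];
    inject_reasoning_prompt(&mut messages, Some(&ReasoningConfig { effort: Some("high".into()) }));
    println!("{messages:?}");
}
```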
209-240: LGTM! Well-structured conversion method. The into_chat_completion method correctly handles the optional exclusion of reasoning content and properly constructs the ChatCompletion response.
270-274: LGTM! Clean backward compatibility implementation. The From trait implementation correctly delegates to the new into_chat_completion method with reasoning included by default.
src/models/chat.rs (5)
15-24: LGTM! Well-structured reasoning configuration. The ReasoningConfig struct is properly designed with optional fields and clear documentation.
26-43: LGTM! Robust validation logic. The validation correctly handles the prioritization of max_tokens over effort and provides clear error messages.
45-57: LGTM! Correct OpenAI effort conversion. The method properly prioritizes max_tokens and filters empty effort values.
64-82: LGTM! Well-designed prompt generation. The method provides appropriate prompts for different effort levels and correctly prioritizes max_tokens.
123-125: LGTM! Proper integration of reasoning fields. The reasoning fields are correctly added as optional with appropriate serialization controls.
Also applies to: 155-157
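To make the notes above concrete, here is a rough sketch of how such a config could validate and map to an OpenAI-style effort string. It approximates the behavior described in this review; the exact error messages and token thresholds are assumptions, not the file's actual code.

```rust
#[derive(Debug, Default, Clone)]
pub struct ReasoningConfig {
    /// "low" | "medium" | "high"; ignored when max_tokens is set.
    pub effort: Option<String>,
    /// Explicit thinking-token budget; takes priority over effort.
    pub max_tokens: Option<u32>,
    /// Whether reasoning content should be stripped from the response.
    pub exclude: Option<bool>,
}

impl ReasoningConfig {
    pub fn validate(&self) -> Result<(), String> {
        if self.effort.is_some() && self.max_tokens.is_some() {
            // Both set: warn and prioritize max_tokens (per the reviewed behavior).
            eprintln!("warning: both effort and max_tokens set; max_tokens takes priority");
        }
        if let Some(effort) = self.effort.as_deref() {
            if !effort.is_empty() && !["low", "medium", "high"].contains(&effort) {
                return Err(format!("invalid reasoning effort: {effort}"));
            }
        }
        if self.max_tokens == Some(0) {
            return Err("reasoning max_tokens must be greater than zero".into());
        }
        Ok(())
    }

    /// Map to OpenAI's reasoning_effort, preferring an explicit token budget.
    pub fn to_openai_effort(&self) -> Option<String> {
        if let Some(tokens) = self.max_tokens {
            // Bucket thresholds here are illustrative only.
            return Some(if tokens <= 1024 {
                "low".into()
            } else if tokens <= 8192 {
                "medium".into()
            } else {
                "high".into()
            });
        }
        self.effort.clone().filter(|e| !e.is_empty())
    }
}

fn main() {
    let cfg = ReasoningConfig { effort: Some("high".into()), max_tokens: Some(2048), exclude: None };
    cfg.validate().unwrap();
    println!("{:?}", cfg.to_openai_effort()); // Some("medium") under these assumed thresholds
}
```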
src/providers/vertexai/models.rs (2)
75-79: LGTM! Properly structured ThinkingConfig. The struct correctly uses camelCase renaming for Vertex AI API compatibility.
97-98: LGTM! Correct integration of thinking_config field. The field is properly added with appropriate serde attributes.
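A small sketch of the camelCase serialization described here; the surrounding GenerationConfig shape and budget value are assumed for illustration.

```rust
use serde::Serialize;

#[derive(Debug, Serialize)]
#[serde(rename_all = "camelCase")]
struct ThinkingConfig {
    #[serde(skip_serializing_if = "Option::is_none")]
    thinking_budget: Option<u32>, // serialized as "thinkingBudget"
}

#[derive(Debug, Serialize)]
#[serde(rename_all = "camelCase")]
struct GenerationConfig {
    #[serde(skip_serializing_if = "Option::is_none")]
    max_output_tokens: Option<u32>,
    #[serde(skip_serializing_if = "Option::is_none")]
    thinking_config: Option<ThinkingConfig>, // serialized as "thinkingConfig"
}

fn main() {
    let config = GenerationConfig {
        max_output_tokens: Some(1024),
        thinking_config: Some(ThinkingConfig { thinking_budget: Some(2048) }),
    };
    // Prints {"maxOutputTokens":1024,"thinkingConfig":{"thinkingBudget":2048}}
    println!("{}", serde_json::to_string(&config).unwrap());
}
```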
Caution
Changes requested ❌
Reviewed everything up to b55860a in 3 minutes and 12 seconds.
- Reviewed 797 lines of code in 12 files
- Skipped 0 files when reviewing.
- Skipped posting 8 draft comments. View those below.
- Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. src/models/chat.rs:27
- Draft comment: In the validate() method of ReasoningConfig, when both 'effort' and 'max_tokens' are provided, the code simply logs a warning and prioritizes max_tokens. It may help to document this behavior clearly, so users understand that 'effort' is ignored when 'max_tokens' is specified.
- Reason this comment was not posted: Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50%. The behavior is already well documented in multiple places throughout the code. The warning message is clear, the field comments explain it, and every relevant method has a comment about prioritizing max_tokens. Adding more documentation would be redundant. This seems like a case where the reviewer didn't thoroughly read the existing documentation. I could be wrong about what constitutes sufficient documentation - maybe users would benefit from even more explicit documentation in a different location. The existing documentation appears in all relevant places where a developer would look - the field definition, the validation method, and each conversion method. Additional documentation would be redundant. The comment should be deleted as the behavior is already well documented throughout the code in multiple relevant locations.
2. src/models/chat.rs:155
- Draft comment: A new optional 'reasoning' field has been added to ChatCompletionChoice. Ensure that all downstream consumers gracefully handle a None value for this field.
- Reason this comment was not posted: Confidence changes required: 33% <= threshold 50%.
3. src/models/streaming.rs:24
- Draft comment: The addition of a 'reasoning' field to ChoiceDelta is consistent with the chat model changes. Consider adding tests to verify that streaming responses correctly propagate reasoning information.
- Reason this comment was not posted: Confidence changes required: 33% <= threshold 50%.
4. src/providers/bedrock/models.rs:127
- Draft comment: The conversion for TitanChatCompletionResponse sets the reasoning field as None. Consider whether reasoning information should be extracted and propagated consistently, similar to other providers.
- Reason this comment was not posted: Confidence changes required: 33% <= threshold 50%.
5. src/providers/vertexai/models.rs:416
- Draft comment: In the GeminiChatRequest conversion, the reasoning config is converted to a ThinkingConfig, which is good. Note the TODO comments for extracting thinking content from responses—this should be addressed in future work for full reasoning support.
- Reason this comment was not posted: Confidence changes required: 33% <= threshold 50%.
6. src/providers/vertexai/provider.rs:116
- Draft comment: The provider's location validation (validate_location) correctly filters out unacceptable characters. Ensure that the error message is clear to users so they can correct the configuration if needed.
- Reason this comment was not posted: Confidence changes required: 33% <= threshold 50%.
7. src/providers/openai/provider.rs:25
- Draft comment: The conversion for OpenAI requests removes the 'reasoning' field and maps it to 'reasoning_effort'. Ensure that this choice is documented so users understand the differences in reasoning support across providers.
- Reason this comment was not posted: Confidence changes required: 33% <= threshold 50%.
8. src/providers/vertexai/tests.rs:310
- Draft comment: The test suite covers many conversion and schema cases. Consider adding tests that explicitly verify reasoning config behavior to ensure consistency across providers.
- Reason this comment was not posted: Confidence changes required: 33% <= threshold 50%.
Workflow ID: wflow_foZeXnXhj9eKk1uH
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
src/providers/anthropic/models.rs (outdated diff context)
    }
    }
    // Fallback to heuristic if no markers found
    if text.contains("Let me think")
The extraction logic for reasoning in extract_reasoning_from_content() uses a heuristic that checks for certain phrases. Consider making these string checks case-insensitive to capture more variations.
Suggested change:
-    if text.contains("Let me think")
+    if text.to_lowercase().contains("let me think")
Important
Looks good to me! 👍
Reviewed 14e0f98 in 1 minute and 30 seconds.
- Reviewed 283 lines of code in 7 files
- Skipped 0 files when reviewing.
- Skipped posting 9 draft comments. View those below.
- Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. src/providers/vertexai/tests.rs:34
- Draft comment: Consider refactoring cassette setup logic (lines 34-43) into a helper module to DRY up repeated code.
- Reason this comment was not posted: Comment was not on a location in the diff, so it can't be submitted as a review comment.
2. src/providers/vertexai/tests.rs:85
- Draft comment: Using an unsafe block to set the environment variable (line 85) might lead to race conditions in tests; consider dependency injection or a safer alternative for test configuration.
- Reason this comment was not posted: Comment was not on a location in the diff, so it can't be submitted as a review comment.
3. src/providers/vertexai/tests.rs:215
- Draft comment: In the run_test_with_quota_retry helper, consider logging detailed error information for non-TOO_MANY_REQUESTS errors for easier debugging.
- Reason this comment was not posted: Comment was not on a location in the diff, so it can't be submitted as a review comment.
4. src/providers/vertexai/tests.rs:737
- Draft comment: Tests like 'test_auth_config_precedence' and 'test_auth_config_credentials_only' only construct the provider without assertions; adding explicit checks for expected authentication behavior would improve test rigor.
- Reason this comment was not posted: Comment was not on a location in the diff, so it can't be submitted as a review comment.
5. src/providers/vertexai/tests.rs:860
- Draft comment: In 'test_generation_config_limits', consider asserting that max_output_tokens is set as expected for completeness.
- Reason this comment was not posted: Comment was not on a location in the diff, so it can't be submitted as a review comment.
6. src/providers/vertexai/tests.rs:1187
- Draft comment: In 'test_gemini_request_with_array_content', confirm that concatenating array content with a space (resulting in 'Part 1 Part 2') is the intended behavior.
- Reason this comment was not posted: Comment was not on a location in the diff, so it can't be submitted as a review comment.
7. src/providers/vertexai/tests.rs:1290
- Draft comment: Schema conversion tests are very thorough. Ensure that unsupported types (e.g. 'null') consistently fallback to STRING as expected.
- Reason this comment was not posted: Comment was not on a location in the diff, so it can't be submitted as a review comment.
8. src/providers/vertexai/tests.rs:140
- Draft comment: Provider config tests use location strings (e.g. 'us-central'). Verify that these meet the expected format as validated in 'validate_location' to avoid misconfiguration.
- Reason this comment was not posted: Comment was not on a location in the diff, so it can't be submitted as a review comment.
9. src/providers/vertexai/tests.rs:1608
- Draft comment: Overall, tests are comprehensive and well-documented. Minor improvements in error logging and test code reuse can further enhance maintainability.
- Reason this comment was not posted: Comment was not on a location in the diff, so it can't be submitted as a review comment.
Workflow ID: wflow_NQFRR6Kl85hhjfZb
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
Important
Looks good to me! 👍
Reviewed 5746528 in 1 minute and 21 seconds.
- Reviewed 207 lines of code in 4 files
- Skipped 0 files when reviewing.
- Skipped posting 4 draft comments. View those below.
- Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. src/providers/anthropic/models.rs:209
- Draft comment: Removed the 'exclude_reasoning' parameter from into_chat_completion. Verify that reasoning prompts (if provided in the request) are correctly incorporated (likely via the system message) and that no fallback behavior is lost.
- Reason this comment was not posted: Comment did not seem useful. Confidence is useful = 40% <= threshold 50%. The comment is asking the author to verify that a specific behavior is maintained after a parameter removal. This is against the rules as it asks for verification of intended behavior. However, it does point out a specific change and its potential impact, which could be useful for the author to consider. The comment could be rephrased to suggest checking the behavior without directly asking for verification.
2. src/providers/anthropic/provider.rs:66
- Draft comment: Extraction of the 'exclude_reasoning' flag has been removed. Ensure that the reasoning configuration is now fully handled in the request conversion, and that responses include reasoning info as intended.
- Reason this comment was not posted: Comment did not seem useful. Confidence is useful = 0% <= threshold 50%. The comment is asking the author to ensure that the reasoning configuration is handled correctly after a change. This is a violation of the rule against asking the author to ensure behavior is intended or tested. The comment does not provide a specific suggestion or ask for a specific test to be written, which would have been acceptable.
3. src/providers/bedrock/test.rs:123
- Draft comment: Test payloads are updated to include 'reasoning: None'. This is good for default behavior, but consider adding tests for non-null reasoning configurations to verify the new reasoning support.
- Reason this comment was not posted: Comment looked like it was already resolved.
4. src/providers/vertexai/tests.rs:646
- Draft comment: Explicitly setting 'reasoning: None' in ChatCompletionRequest reflects the new optional parameter. It would be beneficial to add tests with a non-null reasoning config to ensure reasoning prompts are processed correctly.
- Reason this comment was not posted: Comment was not on a location in the diff, so it can't be submitted as a review comment.
Workflow ID: wflow_LXOUrf7m0a74ldQG
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
Important
Looks good to me! 👍
Reviewed 624ca5f in 56 seconds.
- Reviewed 63 lines of code in 2 files
- Skipped 0 files when reviewing.
- Skipped posting 2 draft comments. View those below.
- Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. src/providers/anthropic/models.rs:209
- Draft comment: Good refactor: replacing the dedicated conversion method with a From trait implementation simplifies the conversion of AnthropicChatCompletionResponse to ChatCompletion and ensures idiomatic usage. Check that the conversion for content (via .into()) handles all cases appropriately.
- Reason this comment was not posted: Comment did not seem useful. Confidence is useful = 50% <= threshold 50%. The comment starts with a purely informative statement about the refactor, which is not allowed. However, it ends with a suggestion to check that the conversion for content handles all cases appropriately, which is a valid suggestion. The comment could be improved by focusing only on the suggestion part.
2. src/providers/anthropic/provider.rs:92
- Draft comment: Using the From trait via .into() instead of an explicit into_chat_completion() call improves clarity and consistency in conversion. This aligns with the refactored models module.
- Reason this comment was not posted: Comment looked like it was already resolved.
Workflow ID: wflow_8nkcC458Dyjk502C
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
Important
Looks good to me! 👍
Reviewed 6620ab0 in 51 seconds.
- Reviewed 39 lines of code in 3 files
- Skipped 0 files when reviewing.
- Skipped posting 3 draft comments. View those below.
- Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. src/providers/anthropic/provider.rs:45
- Draft comment: Good update replacing eprintln! with tracing::error! for invalid reasoning config. It improves consistency with structured logging.
- Reason this comment was not posted: Comment did not seem useful. Confidence is useful = 0% <= threshold 50%. This comment is purely informative, praising the update without providing any actionable feedback or suggestions. It doesn't align with the rules as it doesn't offer a specific code suggestion or highlight a potential issue.
2. src/providers/azure/provider.rs:84
- Draft comment: Nice improvement: switching from eprintln! to tracing::error! for logging errors on invalid reasoning config.
- Reason this comment was not posted: Comment did not seem useful. Confidence is useful = 0% <= threshold 50%. This comment is purely informative, as it only praises the change without providing any actionable feedback or suggestions. It doesn't ask for confirmation or suggest improvements.
3. src/providers/openai/provider.rs:77
- Draft comment: Replace eprintln! with tracing::error! here as well. This ensures uniform and structured error logging across providers.
- Reason this comment was not posted: Comment looked like it was already resolved.
Workflow ID: wflow_nNyTbPJflceqbhW6
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
Actionable comments posted: 1
♻️ Duplicate comments (4)
src/providers/bedrock/test.rs (4)
241-241: Same as Line 126: cover non-None reasoning for Titan/Nova via Bedrock. Currently reasoning: None doesn't validate request transformation or expected errors. Please add at least one test using Some(ReasoningConfig { ... }) for this path too.
367-367: Same as Line 126: cover non-None reasoning for AI21 via Bedrock. Add a test with reasoning: Some(...) to assert explicit behavior (ignored or rejected with a clear error).
442-442: Same as Line 126: ensure ARN-based Bedrock requests behave correctly with reasoning: Some(...). For full-ARN model identifiers, add a test that sets reasoning: Some(...) to confirm there's no unintended model transformation and behavior is explicit.
494-494: Same as Line 126: cover non-None reasoning for inference profile identifiers. Add a reasoning: Some(...) case here to verify that inference profile IDs remain untransformed and the provider behavior is explicit.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- src/providers/anthropic/models.rs (3 hunks)
- src/providers/anthropic/provider.rs (3 hunks)
- src/providers/bedrock/test.rs (5 hunks)
- src/providers/vertexai/tests.rs (14 hunks)
✅ Files skipped from review due to trivial changes (1)
- src/providers/vertexai/tests.rs
🚧 Files skipped from review as they are similar to previous changes (2)
- src/providers/anthropic/models.rs
- src/providers/anthropic/provider.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
nirga left a comment:
@galzilber make sure to go over the bots comments
Caution
Changes requested ❌
Reviewed 1e2a553 in 1 minute and 55 seconds.
- Reviewed 369 lines of code in 1 file
- Skipped 0 files when reviewing.
- Skipped posting 2 draft comments. View those below.
- Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. src/providers/bedrock/test.rs:657
- Draft comment: When both effort and max_tokens are set, the comment notes a log warning. Consider capturing and asserting the warning output for completeness.
- Reason this comment was not posted: Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 30% vs. threshold = 50%. The comment points out a gap in test coverage - we're not verifying the warning behavior that's documented in the code comment. However, testing log output would require additional test infrastructure and mocking that may be complex. The current test already verifies the core functionality (that validation succeeds). The warning is a secondary concern. The comment assumes that testing log output is worth the additional complexity. But logging is often considered an implementation detail that doesn't need exhaustive testing. While logging may be an implementation detail, the warning seems to be part of the documented behavior based on the code comment. Not testing it leaves a gap between documented and verified behavior. The comment identifies a real gap between documented and tested behavior, but the value of adding log testing infrastructure may not justify the effort.
2. src/providers/bedrock/test.rs:797
- Draft comment: The appended reasoning prompt differs from the one used when no system message exists. Confirm this behavior is intended and document it if so.
- Reason this comment was not posted: Comment did not seem useful. Confidence is useful = 0% <= threshold 50%. The comment is asking the PR author to confirm their intention and to document the behavior, which violates the rules. It does not provide a specific code suggestion or ask for a specific test to be written. Therefore, this comment should be removed.
Workflow ID: wflow_diBRHo4s8YOcLabg
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
    assert!(
        result.is_ok(),
        "AI21 chat completion with reasoning failed: {:?}",
        result.err()
Consider adding tests for non-empty 'exclude' values in ReasoningConfig to cover all parameter scenarios.
    }

    #[test]
    fn test_reasoning_config_to_thinking_prompt() {
Instead of only checking that to_thinking_prompt() returns Some(_), consider asserting the actual output to validate the prompt content.
    let provider_config = get_test_provider_config("us-west-2", "anthropic_chat_completion");
    let provider = BedrockProvider::new(&provider_config);

    let payload = ChatCompletionRequest {
Multiple tests create ChatCompletionRequest with many repetitive None fields. Refactor using a helper to DRY up test payload creation.
Actionable comments posted: 0
🧹 Nitpick comments (8)
src/providers/bedrock/test.rs (8)
241-242: Add Titan reasoning coverage (positive or negative). Titan chat completion still uses reasoning: None. If Titan supports reasoning, add a positive test similar to the Anthropic/AI21 cases. If it doesn't, add a negative test asserting a clear error/ignore behavior. I can draft either version if you confirm intended Titan support.
507-555: Good Anthropic reasoning test; align region/model to avoid fragility. Test intent is solid. To reduce confusion with the static replay fixture (which uses an Anthropic dummy at us-east-2 and a different model), align the region for consistency:
-    let provider_config = get_test_provider_config("us-west-2", "anthropic_chat_completion");
+    let provider_config = get_test_provider_config("us-east-2", "anthropic_chat_completion");
Optional: either switch the payload model to the dummy's model or add a matching dummy for this model to make request/response alignment explicit.
558-605: Max-tokens Anthropic test: same region/model alignment nit. Same suggestion as above; align region/model with the replay event or add a corresponding dummy to keep fixtures consistent.
608-655: AI21 reasoning test added — consider asserting transformation. This exercises the happy path. If feasible, add an assertion at the transformation layer (e.g., the provider's AI21 request body builder) to ensure reasoning fields are included. If no hook exists, consider adding a small unit test for the mapper, similar to the Anthropic transformation test.
658-671: Test name/comment mentions logging but no log assertion. The test comment says "Should not error but should log a warning," yet no log is asserted. Either:
- Add a log assertion using a capture (e.g., tracing-subscriber with a test writer), or
- Update the comment to avoid implying a log assertion.
I can wire up log capture if desired.
709-740: Strengthen prompt assertions and priority semantics. Currently only is_some() is checked. Recommend asserting expected contents and that max_tokens overrides effort (a sketch follows after this list):
- For high/medium/low, assert distinct substrings (e.g., “Think through this step-by-step…” for high; “Consider this problem thoughtfully” for medium).
- For the max_tokens case, assert the prompt contains a token-budget hint and does not rely on the effort-specific phrasing.
This will better lock the contract.
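A possible shape for those stronger assertions is sketched here; it assumes the ReasoningConfig and to_thinking_prompt() discussed in this review, and the module path and prompt wording are guesses that may need adjusting to the actual code.

```rust
use crate::models::chat::ReasoningConfig; // assumed path to the PR's config type

#[test]
fn thinking_prompt_reflects_effort_and_token_budget() {
    let high = ReasoningConfig {
        effort: Some("high".to_string()),
        max_tokens: None,
        exclude: None,
    };
    let prompt = high.to_thinking_prompt().expect("high effort should yield a prompt");
    assert!(prompt.contains("step-by-step"), "unexpected high-effort prompt: {prompt}");

    let budgeted = ReasoningConfig {
        effort: Some("high".to_string()),
        max_tokens: Some(512),
        exclude: None,
    };
    let prompt = budgeted.to_thinking_prompt().expect("a token budget should yield a prompt");
    // max_tokens should take priority over effort-specific phrasing and mention the budget.
    assert!(prompt.contains("512"), "expected the token budget in the prompt: {prompt}");
}
```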
743-794: Nice: direct Anthropic transformation assert. Good coverage verifying reasoning prompt injection into the system message. Consider also checking that the message does not duplicate if applied twice.
796-863: Good: preserves existing system + appends reasoning. Solid test ensuring concatenation behavior. You may also want to assert ordering (original system message precedes reasoning prompt).
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- src/providers/bedrock/test.rs (6 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (4)
src/providers/bedrock/test.rs (4)
126-127: Baseline OK; reasoning covered elsewhere. Keeping reasoning: None here preserves the baseline path. The new reasoning tests under arn_tests exercise the feature. No change needed in this test.
367-368: LGTM: keep baseline AI21 path without reasoning. The dedicated AI21 reasoning test below provides coverage; this baseline can remain None.
442-443: LGTM: ARN transformation test stays focused. Leaving reasoning: None is appropriate here to keep the test focused on identifier handling.
494-496: LGTM: inference-profile identifier test stays focused. Same rationale—keeping reasoning: None isolates the transformation concern under test.
Important
Add reasoning support to chat completions with validation, configuration, and logging enhancements across multiple providers.
- Add ReasoningConfig to chat.rs for optional reasoning configurations (effort, max tokens, exclusion).
- Extend ChatCompletionRequest in chat.rs to include a reasoning field.
- Map reasoning config to provider-specific formats in anthropic/models.rs, azure/provider.rs, openai/provider.rs, and vertexai/models.rs.
- Add validate() in ReasoningConfig to ensure valid configurations.
- Validate reasoning configs in anthropic/provider.rs, azure/provider.rs, openai/provider.rs, and vertexai/provider.rs.
- Enhance logging in anthropic/provider.rs, azure/provider.rs, openai/provider.rs, and vertexai/provider.rs.
- Update tests in bedrock/test.rs and vertexai/tests.rs.

This description was created by Ellipsis for 1e2a553. You can customize this summary. It will automatically update as commits are pushed.
Summary by CodeRabbit