
Conversation


@galzilber commented Aug 7, 2025

Important

Add reasoning support to chat completions with validation, configuration, and logging enhancements across multiple providers.

  • New Features:
    • Add ReasoningConfig to chat.rs for optional reasoning configurations (effort, max tokens, exclusion).
    • Update ChatCompletionRequest in chat.rs to include reasoning field.
    • Enhance provider integrations (OpenAI, Azure, Anthropic, VertexAI) to process reasoning parameters.
    • Modify request formats and prompts in anthropic/models.rs, azure/provider.rs, openai/provider.rs, and vertexai/models.rs.
  • Validation and Error Handling:
    • Implement validate() in ReasoningConfig to ensure valid configurations.
    • Log errors for invalid configurations in anthropic/provider.rs, azure/provider.rs, openai/provider.rs, and vertexai/provider.rs.
  • Logging:
    • Add detailed logging for reasoning configurations in anthropic/provider.rs, azure/provider.rs, openai/provider.rs, and vertexai/provider.rs.
  • Tests:
    • Add tests for reasoning configurations in bedrock/test.rs and vertexai/tests.rs.
    • Validate reasoning prompt transformations and error handling.
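
A minimal sketch of what such a ReasoningConfig might look like (field names, accepted effort levels, and error messages here are assumptions based on this summary, not the PR's exact code):

```rust
// Hypothetical sketch of the ReasoningConfig described above. Field names,
// accepted effort levels, and error messages are assumptions, not the PR code.
#[derive(Debug, Clone, Default)]
pub struct ReasoningConfig {
    pub effort: Option<String>,  // e.g. "low" | "medium" | "high"
    pub max_tokens: Option<u32>, // token budget; takes priority over `effort`
    pub exclude: Option<bool>,   // strip reasoning content from the response
}

impl ReasoningConfig {
    /// Reject configurations that no provider could honor.
    pub fn validate(&self) -> Result<(), String> {
        if let Some(effort) = self.effort.as_deref() {
            if !effort.is_empty() && !["low", "medium", "high"].contains(&effort) {
                return Err(format!("invalid reasoning effort: {effort}"));
            }
        }
        if self.max_tokens == Some(0) {
            return Err("reasoning max_tokens must be positive".to_string());
        }
        Ok(())
    }
}
```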

This description was created by Ellipsis for 1e2a553.


Summary by CodeRabbit

  • New Features

    • Added support for optional reasoning configurations (effort level, max tokens, exclusion) in chat completion requests.
    • Enhanced provider integrations (OpenAI, Azure, Anthropic, VertexAI) to validate and process reasoning parameters, including updated request formats and prompt modifications.
    • Chat responses may now include reasoning details when available.
  • Bug Fixes

    • Improved validation and error handling for invalid reasoning configurations, returning clear error responses.
  • Chores

    • Added detailed logging and tracing for reasoning configurations and request payloads to improve transparency and debugging.
    • Expanded test coverage to validate reasoning configuration handling across multiple providers.


coderabbitai bot commented Aug 7, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Reasoning configuration support was introduced across multiple providers and models. New structs and fields were added to represent reasoning parameters, with validation and provider-specific mapping logic. Provider implementations now validate and process reasoning configs, adjust request payloads accordingly, and enhance logging. Streaming and response models were updated to accommodate reasoning fields.

Changes

Cohort / File(s) Change Summary
ReasoningConfig Model & Integration
src/models/chat.rs
Added ReasoningConfig struct with validation and mapping methods; extended ChatCompletionRequest with optional reasoning field.
Streaming Model Update
src/models/streaming.rs
Added optional reasoning field to ChoiceDelta struct.
Anthropic Provider & Models
src/providers/anthropic/models.rs, src/providers/anthropic/provider.rs
Appended reasoning prompt to system message if provided; validated and logged reasoning config in provider logic.
Azure Provider
src/providers/azure/provider.rs
Introduced AzureChatCompletionRequest with reasoning_effort; converted and validated reasoning config; adjusted payload serialization and logging.
OpenAI Provider
src/providers/openai/provider.rs
Added OpenAIChatCompletionRequest with reasoning_effort; converted and validated reasoning config; updated payload structure and logging.
VertexAI Provider & Models
src/providers/vertexai/models.rs, src/providers/vertexai/provider.rs
Introduced ThinkingConfig and integrated into GenerationConfig; extracted and validated thinking budget from reasoning config; enhanced logging and updated streaming chunk model.
Test Updates
src/providers/bedrock/test.rs, src/providers/vertexai/tests.rs
Added reasoning: None field initialization to ChatCompletionRequest in multiple test cases for consistency; added new tests validating reasoning config behavior and transformations.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Provider
    participant Model

    Client->>Provider: Send ChatCompletionRequest (with optional reasoning)
    Provider->>Provider: Validate reasoning config
    alt Invalid reasoning
        Provider-->>Client: Return BAD_REQUEST
    else Valid or absent reasoning
        Provider->>Model: Map reasoning config to provider-specific format
        Model-->>Provider: Return response
        Provider-->>Client: Return ChatCompletionResponse (with/without reasoning)
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~40 minutes

Suggested reviewers

  • nirga

Poem

A clever rabbit paused to think,
"Should my reasoning be high or low or pink?"
With configs and prompts now in the mix,
Providers adapt with brand new tricks.
So hop along, review this code—
For thoughtful chats, we’ve paved the road!
🐇✨




coderabbitai bot left a comment


Actionable comments posted: 5

🧹 Nitpick comments (3)
src/providers/vertexai/models.rs (3)

336-339: Consider reducing debug logging verbosity.

While the debug logging is helpful during development, the extensive logging (5 debug statements for reasoning processing) might be excessive in production. Consider consolidating or reducing the verbosity.

-        tracing::debug!(
-            "🔄 Converting ChatCompletionRequest to GeminiChatRequest, reasoning: {:?}",
-            req.reasoning
-        );
         // ... other code ...
         let thinking_config = req
             .reasoning
             .as_ref()
             .and_then(|r| {
-                tracing::debug!("📝 Processing reasoning config for thinkingConfig: {:?}", r);
                 r.to_gemini_thinking_budget()
             })
             .map(|budget| {
-                tracing::debug!("🎛️ Creating ThinkingConfig with budget: {} tokens", budget);
                 ThinkingConfig {
                     thinking_budget: Some(budget),
                 }
             });

-        tracing::debug!("🔧 Final thinking_config: {:?}", thinking_config);
+        if thinking_config.is_some() {
+            tracing::debug!("Reasoning enabled with thinking_config: {:?}", thinking_config);
+        }

Also applies to: 414-429, 473-477


555-556: TODO: Implement reasoning extraction from Vertex AI responses.

The TODO comment indicates that reasoning extraction from Vertex AI responses is not yet implemented.

Would you like me to help implement the logic to extract thinking content from Vertex AI responses or create an issue to track this task?


623-624: TODO: Implement reasoning extraction from streaming responses.

The TODO comment indicates that reasoning extraction from Vertex AI streaming responses is not yet implemented.

Would you like me to help implement the logic to extract thinking content from streaming responses or create an issue to track this task?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7af3a85 and b55860a.

📒 Files selected for processing (12)
  • src/models/chat.rs (3 hunks)
  • src/models/streaming.rs (1 hunks)
  • src/pipelines/otel.rs (1 hunks)
  • src/providers/anthropic/models.rs (3 hunks)
  • src/providers/anthropic/provider.rs (3 hunks)
  • src/providers/azure/provider.rs (4 hunks)
  • src/providers/bedrock/models.rs (2 hunks)
  • src/providers/bedrock/test.rs (5 hunks)
  • src/providers/openai/provider.rs (2 hunks)
  • src/providers/vertexai/models.rs (8 hunks)
  • src/providers/vertexai/provider.rs (2 hunks)
  • src/providers/vertexai/tests.rs (14 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
src/providers/openai/provider.rs (1)
src/providers/azure/provider.rs (1)
  • from (25-35)
🔇 Additional comments (27)
src/models/streaming.rs (1)

24-25: LGTM! Reasoning field properly added to streaming delta.

The reasoning field addition is well-implemented with appropriate serialization behavior and consistent with other optional fields in the struct. This enables reasoning content to be captured during streaming chat completions.

src/providers/bedrock/test.rs (5)

126-126: LGTM! Test updated for new reasoning field.

The test correctly includes the new reasoning field set to None, maintaining compatibility with the extended ChatCompletionRequest structure.


241-241: LGTM! Test updated for new reasoning field.

Consistent with other test updates, the reasoning field is properly set to None.


367-367: LGTM! Test updated for new reasoning field.

The AI21 chat completion test is updated consistently with the new field structure.


442-442: LGTM! Test updated for new reasoning field.

The ARN test maintains consistency with the new ChatCompletionRequest structure.


494-494: LGTM! Test updated for new reasoning field.

The inference profile test is properly updated with the new reasoning field.

src/pipelines/otel.rs (1)

154-154: LGTM! Reasoning data properly captured in streaming traces.

The implementation correctly propagates reasoning content from streaming chunks to the accumulated completion, ensuring reasoning data is included in OpenTelemetry traces. This aligns with the pattern used for other optional fields in the same method.

src/providers/bedrock/models.rs (2)

127-127: LGTM! Reasoning field properly initialized for Titan responses.

The reasoning field is correctly set to None in the Titan response conversion, maintaining compatibility with the extended ChatCompletionChoice structure while indicating that reasoning extraction is not yet implemented for Titan models.


316-316: LGTM! Reasoning field properly initialized for AI21 responses.

Consistent with the Titan implementation, the reasoning field is appropriately set to None for AI21 response conversions, maintaining structural compatibility.

src/providers/vertexai/tests.rs (1)

285-285: LGTM! Test requests correctly updated with the new reasoning field.

All test cases have been consistently updated to include reasoning: None in their ChatCompletionRequest instances, properly reflecting the new optional field added to the request structure.

Also applies to: 533-533, 674-674, 801-801, 854-854, 892-892, 975-975, 1084-1084, 1176-1176, 1226-1226, 1445-1445, 1504-1504, 1550-1550, 1684-1684

src/providers/vertexai/provider.rs (2)

140-169: LGTM! Comprehensive reasoning validation and logging.

The implementation properly validates the reasoning configuration, returns appropriate error codes on validation failure, and provides detailed logging at different levels (error, info, debug) for debugging and monitoring purposes.


207-219: LGTM! Enhanced request debugging capabilities.

The added logging provides excellent visibility into the request being sent, including pretty-printed JSON and specific generation/thinking config details. This will be invaluable for debugging reasoning-related issues.

src/providers/anthropic/provider.rs (1)

69-74: LGTM! Proper handling of reasoning exclusion flag.

The code correctly extracts the exclude flag from the reasoning config and passes it to the response conversion method, allowing for proper filtering of reasoning content in the response.

Also applies to: 98-100

src/providers/azure/provider.rs (2)

16-36: LGTM! Well-structured Azure-specific request handling.

The AzureChatCompletionRequest wrapper correctly transforms the generic reasoning config into Azure's reasoning_effort field format, properly clearing the original field to avoid duplication.


117-118: LGTM! Proper request transformation for Azure API.

The code correctly converts the generic request to Azure-specific format before sending to the API.

Also applies to: 123-123

src/providers/openai/provider.rs (2)

15-35: LGTM! Consistent implementation with Azure provider.

The OpenAIChatCompletionRequest wrapper follows the same pattern as the Azure provider, correctly transforming the reasoning config into OpenAI's reasoning_effort field format.


83-84: LGTM! Proper request transformation.

The code correctly converts the request to OpenAI-specific format before sending to the API.

Also applies to: 89-89

src/providers/anthropic/models.rs (3)

93-115: LGTM! Clean implementation of reasoning prompt injection.

The code correctly extracts the system message and conditionally appends the reasoning prompt, handling both cases where a system message exists or not.
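
That branching can be sketched as follows (a hedged illustration; the helper name and the "\n\n" separator are assumptions, not the actual code in anthropic/models.rs):

```rust
// Illustrative helper: append a reasoning prompt to an optional system
// message, covering both the existing-system and no-system cases.
fn with_reasoning_prompt(system: Option<String>, prompt: Option<&str>) -> Option<String> {
    match (system, prompt) {
        (Some(s), Some(p)) => Some(format!("{s}\n\n{p}")), // append to existing message
        (None, Some(p)) => Some(p.to_string()),            // prompt becomes the system message
        (system, None) => system,                          // nothing to add
    }
}
```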


209-240: LGTM! Well-structured conversion method.

The into_chat_completion method correctly handles the optional exclusion of reasoning content and properly constructs the ChatCompletion response.


270-274: LGTM! Clean backward compatibility implementation.

The From trait implementation correctly delegates to the new into_chat_completion method with reasoning included by default.

src/models/chat.rs (5)

15-24: LGTM! Well-structured reasoning configuration.

The ReasoningConfig struct is properly designed with optional fields and clear documentation.


26-43: LGTM! Robust validation logic.

The validation correctly handles the prioritization of max_tokens over effort and provides clear error messages.


45-57: LGTM! Correct OpenAI effort conversion.

The method properly prioritizes max_tokens and filters empty effort values.
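
A hedged sketch of that priority order (the bucket thresholds below are invented for illustration; only the max_tokens-over-effort priority and the empty-string filtering reflect the review comment):

```rust
// Sketch: fold a generic reasoning config into a single OpenAI-style
// effort string. max_tokens wins; empty effort strings are dropped.
fn to_openai_effort(effort: Option<&str>, max_tokens: Option<u32>) -> Option<String> {
    if let Some(tokens) = max_tokens {
        // Translate the token budget into a coarse effort level.
        // These thresholds are assumptions, not the PR's values.
        let level = if tokens <= 1024 {
            "low"
        } else if tokens <= 8192 {
            "medium"
        } else {
            "high"
        };
        return Some(level.to_string());
    }
    // Fall back to the explicit effort, filtering out empty strings.
    effort.filter(|e| !e.is_empty()).map(str::to_string)
}
```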


64-82: LGTM! Well-designed prompt generation.

The method provides appropriate prompts for different effort levels and correctly prioritizes max_tokens.


123-125: LGTM! Proper integration of reasoning fields.

The reasoning fields are correctly added as optional with appropriate serialization controls.

Also applies to: 155-157

src/providers/vertexai/models.rs (2)

75-79: LGTM! Properly structured ThinkingConfig.

The struct correctly uses camelCase renaming for Vertex AI API compatibility.
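
The wire shape can be illustrated without dependencies as below. The PR itself uses serde renaming; this hand-rolls the JSON purely to show the camelCase keys the Vertex AI API expects (the field name follows the "thinking budget" described in the walkthrough):

```rust
// Dependency-free illustration of the camelCase wire format.
struct ThinkingConfig {
    thinking_budget: Option<u32>,
}

impl ThinkingConfig {
    fn to_json(&self) -> String {
        match self.thinking_budget {
            // Vertex AI expects camelCase keys, hence "thinkingBudget".
            Some(budget) => format!("{{\"thinkingBudget\":{budget}}}"),
            None => "{}".to_string(),
        }
    }
}
```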


97-98: LGTM! Correct integration of thinking_config field.

The field is properly added with appropriate serde attributes.


ellipsis-dev bot left a comment


Caution

Changes requested ❌

Reviewed everything up to b55860a in 3 minutes and 12 seconds.
  • Reviewed 797 lines of code in 12 files
  • Skipped 0 files when reviewing.
  • Skipped posting 8 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. src/models/chat.rs:27
  • Draft comment:
    In the validate() method of ReasoningConfig, when both 'effort' and 'max_tokens' are provided, the code simply logs a warning and prioritizes max_tokens. It may help to document this behavior clearly, so users understand that 'effort' is ignored when 'max_tokens' is specified.
  • Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50%. The behavior is already well documented in multiple places throughout the code. The warning message is clear, the field comments explain it, and every relevant method has a comment about prioritizing max_tokens. Adding more documentation would be redundant. This seems like a case where the reviewer didn't thoroughly read the existing documentation. I could be wrong about what constitutes sufficient documentation - maybe users would benefit from even more explicit documentation in a different location. The existing documentation appears in all relevant places where a developer would look - the field definition, the validation method, and each conversion method. Additional documentation would be redundant. The comment should be deleted as the behavior is already well documented throughout the code in multiple relevant locations.
2. src/models/chat.rs:155
  • Draft comment:
    A new optional 'reasoning' field has been added to ChatCompletionChoice. Ensure that all downstream consumers gracefully handle a None value for this field.
  • Reason this comment was not posted:
    Confidence changes required: 33% <= threshold 50% None
3. src/models/streaming.rs:24
  • Draft comment:
    The addition of a 'reasoning' field to ChoiceDelta is consistent with the chat model changes. Consider adding tests to verify that streaming responses correctly propagate reasoning information.
  • Reason this comment was not posted:
    Confidence changes required: 33% <= threshold 50% None
4. src/providers/bedrock/models.rs:127
  • Draft comment:
    The conversion for TitanChatCompletionResponse sets the reasoning field as None. Consider whether reasoning information should be extracted and propagated consistently, similar to other providers.
  • Reason this comment was not posted:
    Confidence changes required: 33% <= threshold 50% None
5. src/providers/vertexai/models.rs:416
  • Draft comment:
    In the GeminiChatRequest conversion, the reasoning config is converted to a ThinkingConfig, which is good. Note the TODO comments for extracting thinking content from responses—this should be addressed in future work for full reasoning support.
  • Reason this comment was not posted:
    Confidence changes required: 33% <= threshold 50% None
6. src/providers/vertexai/provider.rs:116
  • Draft comment:
    The provider’s location validation (validate_location) correctly filters out unacceptable characters. Ensure that the error message is clear to users so they can correct the configuration if needed.
  • Reason this comment was not posted:
    Confidence changes required: 33% <= threshold 50% None
7. src/providers/openai/provider.rs:25
  • Draft comment:
    The conversion for OpenAI requests removes the 'reasoning' field and maps it to 'reasoning_effort'. Ensure that this choice is documented so users understand the differences in reasoning support across providers.
  • Reason this comment was not posted:
    Confidence changes required: 33% <= threshold 50% None
8. src/providers/vertexai/tests.rs:310
  • Draft comment:
    The test suite covers many conversion and schema cases. Consider adding tests that explicitly verify reasoning config behavior to ensure consistency across providers.
  • Reason this comment was not posted:
    Confidence changes required: 33% <= threshold 50% None

Workflow ID: wflow_foZeXnXhj9eKk1uH

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

Quoted context from the heuristic in extract_reasoning_from_content():

    }
    }
    // Fallback to heuristic if no markers found
    if text.contains("Let me think")

The extraction logic for reasoning in extract_reasoning_from_content() uses a heuristic that checks for certain phrases. Consider making these string checks case-insensitive to capture more variations.

Suggested change:
-    if text.contains("Let me think")
+    if text.to_lowercase().contains("let me think")
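
As a self-contained check, the suggested lowercase comparison behaves like this (the function name is invented for this sketch):

```rust
// Illustration of the reviewer's suggestion: lowercase the text before the
// substring check so the heuristic matches regardless of capitalization.
fn mentions_thinking(text: &str) -> bool {
    text.to_lowercase().contains("let me think")
}
```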


ellipsis-dev bot left a comment


Important

Looks good to me! 👍

Reviewed 14e0f98 in 1 minute and 30 seconds.
  • Reviewed 283 lines of code in 7 files
  • Skipped 0 files when reviewing.
  • Skipped posting 9 draft comments. View those below.
1. src/providers/vertexai/tests.rs:34
  • Draft comment:
    Consider refactoring cassette setup logic (lines 34-43) into a helper module to DRY up repeated code.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
2. src/providers/vertexai/tests.rs:85
  • Draft comment:
    Using an unsafe block to set the environment variable (line 85) might lead to race conditions in tests; consider dependency injection or a safer alternative for test configuration.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
3. src/providers/vertexai/tests.rs:215
  • Draft comment:
    In the run_test_with_quota_retry helper, consider logging detailed error information for non-TOO_MANY_REQUESTS errors for easier debugging.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
4. src/providers/vertexai/tests.rs:737
  • Draft comment:
    Tests like 'test_auth_config_precedence' and 'test_auth_config_credentials_only' only construct the provider without assertions; adding explicit checks for expected authentication behavior would improve test rigor.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
5. src/providers/vertexai/tests.rs:860
  • Draft comment:
    In 'test_generation_config_limits', consider asserting that max_output_tokens is set as expected for completeness.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
6. src/providers/vertexai/tests.rs:1187
  • Draft comment:
    In 'test_gemini_request_with_array_content', confirm that concatenating array content with a space (resulting in 'Part 1 Part 2') is the intended behavior.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
7. src/providers/vertexai/tests.rs:1290
  • Draft comment:
    Schema conversion tests are very thorough. Ensure that unsupported types (e.g. 'null') consistently fallback to STRING as expected.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
8. src/providers/vertexai/tests.rs:140
  • Draft comment:
    Provider config tests use location strings (e.g. 'us-central'). Verify that these meet the expected format as validated in 'validate_location' to avoid misconfiguration.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
9. src/providers/vertexai/tests.rs:1608
  • Draft comment:
    Overall, tests are comprehensive and well-documented. Minor improvements in error logging and test code reuse can further enhance maintainability.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.

Workflow ID: wflow_NQFRR6Kl85hhjfZb



ellipsis-dev bot left a comment


Important

Looks good to me! 👍

Reviewed 5746528 in 1 minute and 21 seconds.
  • Reviewed 207 lines of code in 4 files
  • Skipped 0 files when reviewing.
  • Skipped posting 4 draft comments. View those below.
1. src/providers/anthropic/models.rs:209
  • Draft comment:
    Removed the 'exclude_reasoning' parameter from into_chat_completion. Verify that reasoning prompts (if provided in the request) are correctly incorporated (likely via the system message) and that no fallback behavior is lost.
  • Reason this comment was not posted:
Comment did not seem useful. Confidence is useful = 40% <= threshold 50%. The comment is asking the author to verify that a specific behavior is maintained after a parameter removal. This is against the rules as it asks for verification of intended behavior. However, it does point out a specific change and its potential impact, which could be useful for the author to consider. The comment could be rephrased to suggest checking the behavior without directly asking for verification.
2. src/providers/anthropic/provider.rs:66
  • Draft comment:
    Extraction of the 'exclude_reasoning' flag has been removed. Ensure that the reasoning configuration is now fully handled in the request conversion, and that responses include reasoning info as intended.
  • Reason this comment was not posted:
Comment did not seem useful. Confidence is useful = 0% <= threshold 50%. The comment is asking the author to ensure that the reasoning configuration is handled correctly after a change. This is a violation of the rule against asking the author to ensure behavior is intended or tested. The comment does not provide a specific suggestion or ask for a specific test to be written, which would have been acceptable.
3. src/providers/bedrock/test.rs:123
  • Draft comment:
    Test payloads are updated to include 'reasoning: None'. This is good for default behavior, but consider adding tests for non-null reasoning configurations to verify the new reasoning support.
  • Reason this comment was not posted:
    Comment looked like it was already resolved.
4. src/providers/vertexai/tests.rs:646
  • Draft comment:
    Explicitly setting 'reasoning: None' in ChatCompletionRequest reflects the new optional parameter. It would be beneficial to add tests with a non-null reasoning config to ensure reasoning prompts are processed correctly.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.

Workflow ID: wflow_LXOUrf7m0a74ldQG



ellipsis-dev bot left a comment


Important

Looks good to me! 👍

Reviewed 624ca5f in 56 seconds.
  • Reviewed 63 lines of code in 2 files
  • Skipped 0 files when reviewing.
  • Skipped posting 2 draft comments. View those below.
1. src/providers/anthropic/models.rs:209
  • Draft comment:
    Good refactor: replacing the dedicated conversion method with a From trait implementation simplifies the conversion of AnthropicChatCompletionResponse to ChatCompletion and ensures idiomatic usage. Check that the conversion for content (via .into()) handles all cases appropriately.
  • Reason this comment was not posted:
Comment did not seem useful. Confidence is useful = 50% <= threshold 50%. The comment starts with a purely informative statement about the refactor, which is not allowed. However, it ends with a suggestion to check that the conversion for content handles all cases appropriately, which is a valid suggestion. The comment could be improved by focusing only on the suggestion part.
2. src/providers/anthropic/provider.rs:92
  • Draft comment:
    Using the From trait via .into() instead of an explicit into_chat_completion() call improves clarity and consistency in conversion. This aligns with the refactored models module.
  • Reason this comment was not posted:
    Comment looked like it was already resolved.

Workflow ID: wflow_8nkcC458Dyjk502C



ellipsis-dev bot left a comment


Important

Looks good to me! 👍

Reviewed 6620ab0 in 51 seconds.
  • Reviewed 39 lines of code in 3 files
  • Skipped 0 files when reviewing.
  • Skipped posting 3 draft comments. View those below.
1. src/providers/anthropic/provider.rs:45
  • Draft comment:
    Good update replacing eprintln! with tracing::error! for invalid reasoning config. It improves consistency with structured logging.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 50% This comment is purely informative, praising the update without providing any actionable feedback or suggestions. It doesn't align with the rules as it doesn't offer a specific code suggestion or highlight a potential issue.
2. src/providers/azure/provider.rs:84
  • Draft comment:
    Nice improvement: switching from eprintln! to tracing::error! for logging errors on invalid reasoning config.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 50% This comment is purely informative, as it only praises the change without providing any actionable feedback or suggestions. It doesn't ask for confirmation or suggest improvements.
3. src/providers/openai/provider.rs:77
  • Draft comment:
    Replace eprintln! with tracing::error! here as well. This ensures uniform and structured error logging across providers.
  • Reason this comment was not posted:
    Comment looked like it was already resolved.

Workflow ID: wflow_nNyTbPJflceqbhW6


Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (4)
src/providers/bedrock/test.rs (4)

241-241: Same as Line 126: cover non-None reasoning for Titan/Nova via Bedrock.

Currently reasoning: None doesn’t validate request transformation or expected errors. Please add at least one test using Some(ReasoningConfig { ... }) for this path too.


367-367: Same as Line 126: cover non-None reasoning for AI21 via Bedrock.

Add a test with reasoning: Some(...) to assert explicit behavior (ignored or rejected with a clear error).


442-442: Same as Line 126: ensure ARN-based Bedrock requests behave correctly with reasoning: Some(...).

For full-ARN model identifiers, add a test that sets reasoning: Some(...) to confirm there’s no unintended model transformation and behavior is explicit.


494-494: Same as Line 126: cover non-None reasoning for inference profile identifiers.

Add a reasoning: Some(...) case here to verify that inference profile IDs remain untransformed and the provider behavior is explicit.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 14e0f98 and 5746528.

📒 Files selected for processing (4)
  • src/providers/anthropic/models.rs (3 hunks)
  • src/providers/anthropic/provider.rs (3 hunks)
  • src/providers/bedrock/test.rs (5 hunks)
  • src/providers/vertexai/tests.rs (14 hunks)
✅ Files skipped from review due to trivial changes (1)
  • src/providers/vertexai/tests.rs
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/providers/anthropic/models.rs
  • src/providers/anthropic/provider.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build

Member

@nirga nirga left a comment


@galzilber make sure to go over the bots comments

Contributor

@ellipsis-dev ellipsis-dev bot left a comment


Caution

Changes requested ❌

Reviewed 1e2a553 in 1 minute and 55 seconds. Click for details.
  • Reviewed 369 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 2 draft comments. View those below.
1. src/providers/bedrock/test.rs:657
  • Draft comment:
    When both effort and max_tokens are set, the comment notes a log warning. Consider capturing and asserting the warning output for completeness.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 30% vs. threshold = 50% The comment points out a gap in test coverage - we're not verifying the warning behavior that's documented in the code comment. However, testing log output would require additional test infrastructure and mocking that may be complex. The current test already verifies the core functionality (that validation succeeds). The warning is a secondary concern. The comment assumes that testing log output is worth the additional complexity. But logging is often considered an implementation detail that doesn't need exhaustive testing. While logging may be an implementation detail, the warning seems to be part of the documented behavior based on the code comment. Not testing it leaves a gap between documented and verified behavior. The comment identifies a real gap between documented and tested behavior, but the value of adding log testing infrastructure may not justify the effort.
2. src/providers/bedrock/test.rs:797
  • Draft comment:
    The appended reasoning prompt differs from the one used when no system message exists. Confirm this behavior is intended and document it if so.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 50% The comment is asking the PR author to confirm their intention and to document the behavior, which violates the rules. It does not provide a specific code suggestion or ask for a specific test to be written. Therefore, this comment should be removed.

Workflow ID: wflow_diBRHo4s8YOcLabg


assert!(
result.is_ok(),
"AI21 chat completion with reasoning failed: {:?}",
result.err()

Consider adding tests for non-empty 'exclude' values in ReasoningConfig to cover all parameter scenarios.

}

#[test]
fn test_reasoning_config_to_thinking_prompt() {

Instead of only checking that to_thinking_prompt() returns Some(_), consider asserting the actual output to validate the prompt content.

let provider_config = get_test_provider_config("us-west-2", "anthropic_chat_completion");
let provider = BedrockProvider::new(&provider_config);

let payload = ChatCompletionRequest {

Multiple tests create ChatCompletionRequest with many repetitive None fields. Refactor using a helper to DRY up test payload creation.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (8)
src/providers/bedrock/test.rs (8)

241-242: Add Titan reasoning coverage (positive or negative).

Titan chat completion still uses reasoning: None. If Titan supports reasoning, add a positive test similar to the Anthropic/AI21 cases. If it doesn’t, add a negative test asserting a clear error/ignore behavior. I can draft either version if you confirm intended Titan support.


507-555: Good Anthropic reasoning test; align region/model to avoid fragility.

Test intent is solid. To reduce confusion with the static replay fixture (which uses an Anthropic dummy at us-east-2 and a different model), align the region for consistency:

-        let provider_config = get_test_provider_config("us-west-2", "anthropic_chat_completion");
+        let provider_config = get_test_provider_config("us-east-2", "anthropic_chat_completion");

Optional: either switch the payload model to the dummy’s model or add a matching dummy for this model to make request/response alignment explicit.


558-605: Max-tokens Anthropic test: same region/model alignment nit.

Same suggestion as above; align region/model with the replay event or add a corresponding dummy to keep fixtures consistent.


608-655: AI21 reasoning test added — consider asserting transformation.

This exercises the happy path. If feasible, add an assertion at the transformation layer (e.g., provider’s AI21 request body builder) to ensure reasoning fields are included. If no hook exists, consider adding a small unit for the mapper similar to the Anthropic transformation test.


658-671: Test name/comment mentions logging but no log assertion.

The test comment says “Should not error but should log a warning,” yet no log is asserted. Either:

  • Add a log assertion using a capture (e.g., tracing-subscriber with a test writer), or
  • Update the comment to avoid implying a log assertion.

I can wire up log capture if desired.


709-740: Strengthen prompt assertions and priority semantics.

Currently only is_some() is checked. Recommend asserting expected contents and that max_tokens overrides effort:

  • For high/medium/low, assert distinct substrings (e.g., “Think through this step-by-step…” for high; “Consider this problem thoughtfully” for medium).
  • For the max_tokens case, assert the prompt contains a token-budget hint and does not rely on the effort-specific phrasing.

This will better lock the contract.


743-794: Nice: direct Anthropic transformation assert.

Good coverage verifying reasoning prompt injection into the system message. Consider also checking that the message does not duplicate if applied twice.


796-863: Good: preserves existing system + appends reasoning.

Solid test ensuring concatenation behavior. You may also want to assert ordering (original system message precedes reasoning prompt).

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6620ab0 and 1e2a553.

📒 Files selected for processing (1)
  • src/providers/bedrock/test.rs (6 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build
🔇 Additional comments (4)
src/providers/bedrock/test.rs (4)

126-127: Baseline OK; reasoning covered elsewhere.

Keeping reasoning: None here preserves the baseline path. The new reasoning tests under arn_tests exercise the feature. No change needed in this test.


367-368: LGTM: keep baseline AI21 path without reasoning.

The dedicated AI21 reasoning test below provides coverage; this baseline can remain None.


442-443: LGTM: ARN transformation test stays focused.

Leaving reasoning: None is appropriate here to keep the test focused on identifier handling.


494-496: LGTM: inference-profile identifier test stays focused.

Same rationale—keeping reasoning: None isolates the transformation concern under test.

@galzilber galzilber merged commit 4b0f484 into main Aug 10, 2025
3 checks passed
3 participants