-
-
Notifications
You must be signed in to change notification settings - Fork 35.8k
Fix REST sensor charset handling to respect Content-Type header #148223
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR updates the REST sensor (and binary_sensor) to honor the charset parameter in the Content-Type header when decoding HTTP responses, falling back to a configured encoding only if no charset is provided.
- Enhance the aiohttp mock in tests to expose a
charsetproperty and default to auto-detection whenencoding=None. - Update
rest/data.pyto callresponse.text()without forcing an encoding if the server provided a charset. - Add test cases covering auto-detection, fallback, and override scenarios.
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| tests/test_util/aiohttp.py | Add charset property and adjust text()/json() to match aiohttp’s auto-detect behavior. |
| tests/components/rest/test_sensor.py | New tests for charset auto-detection, fallback to configured encoding, and override behavior. |
| homeassistant/components/rest/data.py | Conditional call to response.text() to respect header charset or use configured encoding. |
| assert hass.states.get("sensor.mysensor").state == "tack själv" | ||
|
|
||
|
|
||
| async def test_setup_auto_encoding_from_content_type( |
Copilot
AI
Jul 5, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] These tests share a common setup and follow similar patterns; consider parameterizing them with pytest.mark.parametrize to reduce duplication and improve maintainability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't make sense here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not put the content, headers, and the sensor "enconding" config in test params? the rest of the test is the same for the 3 tests.
|
thanks |
|
Any time 🍻 |
|
Looks like I need to fix the mocking for cast |
|
Thanks a lot @bdraco for looking into this right away. I wonder whether it wouldn't be better to make the There are a few buggy servers out there though for which it would be valuable to be able to override the header. With the solution in this PR, users have no way to work around such issues. |
|
The goal right now is to avoid the breaking change and restore it to how it worked before. We would need to add a new field to override the content-type. But since nobody asked for it before it seems like a case of YAGNI |
Why is that? The encoding field is marked as optional. The problem is that in practice it actually isn't optional as it defaults to UTF-8 and that value of now enforced (which this PR is fixing). IMHO the semantics of an optional field suggest that an unset value mean's that the user doesn't care ("figure it out for me" --> use the content type header). If the value is set, however, it would go against the explicitly expressed intent of the user to use the specified encoding. |
I understand your perspective, but changing the behavior of the optional default would be a breaking change. My goal with this PR is just to address the conversion issue, not to introduce new features or broader changes to the integration. If you'd like to explore that direction further, I’d suggest opening a separate PR for discussion. |
| assert hass.states.get("sensor.mysensor").state == "tack själv" | ||
|
|
||
|
|
||
| async def test_setup_auto_encoding_from_content_type( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not put the content, headers, and the sensor "enconding" config in test params? the rest of the test is the same for the 3 tests.
Proposed change
This PR fixes the REST sensor and binary_sensor platforms to properly respect the charset parameter from the Content-Type header when decoding response text.
Background
When the REST integration was migrated from httpx to aiohttp in #146306, the behavior for character encoding changed:
encodingconfiguration parameter existed but was ignored due to a bug, causing httpx to always auto-detect charset from Content-Type headersThe Issue
Currently, the REST sensor/binary_sensor always uses the configured encoding (default UTF-8) and ignores what the server specifies in the Content-Type header. This causes decoding errors when servers return content in other encodings (ISO-8859-1, Windows-1252, GB2312, etc.) with the proper Content-Type header.
The Solution
This PR makes the REST sensor/binary_sensor respect the Content-Type header charset while maintaining backward compatibility:
This brings the behavior in line with the REST switch component (which uses httpx and respects Content-Type headers) and restores the original (accidentally correct) behavior.
Type of change
Additional information
Checklist
ruff format homeassistant tests)If user exposed functionality or configuration variables are added/changed:
If the code communicates with devices, web services, or third-party tools:
Updated and included derived files by running:
python3 -m script.hassfest.requirements_all.txt.Updated by running
python3 -m script.gen_requirements_all.To help with the load of incoming pull requests: