-
Notifications
You must be signed in to change notification settings - Fork 2.6k
Issue 1356 - Fix SSE unicode line separator handling #1359
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Florimond (httpx-sse) replied in an hour... Please create a PR on that side. |
httpx-sse PR: florimondmanca/httpx-sse#37 |
Use compliant_aiter_sse in streamable_http.py to handle Unicode line separator characters correctly, preventing the same issue that affected the SSE client.
Previously, the SSE parser could incorrectly handle CRLF line endings when \r appeared at the end of one chunk and \n at the beginning of the next chunk, potentially treating them as two separate line breaks instead of a single CRLF sequence. This fix implements proper CRLF handling by: - Tracking when a chunk ends with \r using a skip_leading_lf flag - Skipping a leading \n in the next chunk if the previous ended with \r - Ensuring Unicode line/paragraph separators (U+2028/U+2029) are treated as regular content, not line breaks, per the SSE specification Added comprehensive test coverage for the edge case of split CRLF sequences across chunk boundaries.
The Unicode line separator issue (U+2028 and U+2029 characters being incorrectly treated as newlines) has been fixed in httpx-sse 0.4.2. See: florimondmanca/httpx-sse#39 Revert the compliant_aiter_sse workaround and use the standard event_source.aiter_sse() method again. Upgrade httpx-sse to >=0.4.2 to get the fix. Keep the high-level issue test to ensure the problem doesn't regress. Github-Issue:#1356
4021da2
to
45109d2
Compare
httpx-sse 0.4.2 is now released, so this works, but I think we should leave this for a little while to avoid creating a hard dependency on a package that was released just yesterday. Will leave this in draft. Our dependencies already say |
Motivation and Context
httpx-sse
has incorrect handling of newlines in its parsing, particularly around Unicode newline characters that are incorrectly treated as newlines. See florimondmanca/httpx-sse#34 for details.Our bug, #1356, explains that this causes incorrect output and also just hangs the stream processing entirely (due to latent exception handling issues, which also need fixing).
I'm fixing the SSE parsing on our side, but will also look into fixing at the source. Not sure how long that will take to get pushed through.
How Has This Been Tested?
Two unit tests added.
Breaking Changes
None
Types of changes
Checklist