Fix broken WebSocket defragmentation#383
Conversation
src/AsyncWebSocket.cpp
Outdated
| #define SATE_FRAME_START 0 | ||
| #define SATE_FRAME_MASK 1 | ||
| #define SATE_FRAME_DATA 2 |
There was a problem hiding this comment.
Using _pstate as suggested by @willmmiles here to keep track of the reassembling for the mask
| uint8_t *data = (uint8_t *)pbuf; | ||
|
|
||
| while (plen > 0) { | ||
| async_ws_log_v("WS[%" PRIu32 "]: _onData: plen=%" PRIu32 ", _pstate=%" PRIu8 ", _status=%" PRIu8, _clientId, plen, _pstate, static_cast<uint8_t>(_status)); |
There was a problem hiding this comment.
kept all logs async_ws_log_v to still be able to easily debug
b16f022 to
ad9be24
Compare
There was a problem hiding this comment.
Pull request overview
This PR fixes a critical bug in WebSocket frame defragmentation that was introduced in PR #353 when addressing Safari's fragmented pong data. The fix refactors the state machine for handling WebSocket frames, particularly improving mask byte accumulation across packet boundaries and restoring proper frame reassembly.
Changes:
- Introduces explicit state constants for the WebSocket parsing state machine
- Refactors mask byte reading to properly handle Safari's behavior of splitting the 4-byte mask across multiple TCP packets
- Fixes frame defragmentation logic that was broken since PR #353
- Adds comprehensive verbose logging to aid debugging
- Updates the WebSocket example to demonstrate both complete and fragmented frame handling
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| src/AsyncWebSocket.cpp | Core fix: adds state constants, refactors mask reading logic to use _pinfo.masked as a counter (1-5), improves frame defragmentation handling, adds verbose logging, and enhances error handling |
| examples/WebSocket/WebSocket.ino | Updates example to demonstrate fragmented frame handling with detailed logging and increases polling interval from 100ms to 500ms |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
src/AsyncWebSocket.cpp
Outdated
| // _pinfo.masked is 1 if we need to start reading mask bytes | ||
| // _pinfo.masked is 2, 3, or 4 if we have partially read the mask | ||
| // _pinfo.masked is 5 if the mask is complete | ||
| while (_pinfo.masked && _pstate <= SATE_FRAME_MASK && _pinfo.masked < 5) { |
There was a problem hiding this comment.
reassemble de mask using _pinfo.masked to track byte count but resets _pinfo.masked to 1 at the end for backward compat'
| _pinfo.index = 0; | ||
| } else { | ||
| // Wait for more data | ||
| // async_ws_log_w("WS[%" PRIu32 "]: waiting for more mask data: read=%zu/4", _clientId, mask_offset); | ||
| _pinfo.index = mask_offset; // Save progress | ||
| _pstate = 1; | ||
| return; | ||
| _pinfo.len = 0; |
There was a problem hiding this comment.
To go into the disconnect use case below
| } | ||
|
|
||
| const size_t datalen = std::min((size_t)(_pinfo.len - _pinfo.index), plen); | ||
| const auto datalast = datalen ? data[datalen] : 0; |
There was a problem hiding this comment.
I really hate this thing... users have to add \0 and overflow the data array (see WS examples). That's not normal. I don't know if there is a way to refactor that in a backward compatible way one day.
There was a problem hiding this comment.
This hack is outright wrong. Writing off the end of the pbuf pointer is just not safe in a preemptive system. I've seen crashes because the onEvent handler wrote off the end of the buffer, corrupting the heap, and then got pre-empted by another task that tried to do heap allocation before AsyncWebSocket could "undo" the damage. When I fixed the client, I still saw a crash because of the attempt to restore the byte; same setup - preempted during the onEvent callback - but the (now fixed) client didn't overwrite the value; so the heap manager wrote something new to that location, which then had one byte "restored" to an old value when AsyncWebSocket got the CPU back.
Ultimately I had to remove this from the WLED fork and we fixed all our clients to respect the buffer boundaries. Users don't have to write off the end of the array. It is true that when data arrives as a Pascal-style string (pointer+length), it cannot be used as a C-style string without a copy. But that's only necessary if you want to interact with C functions; modern tools like std::string_view can be used directly. (And if you must use a C API, std::string has a convenient pointer+length constructor for making a quick copy.)
That said: given the current example code, if we consider "onEvent handlers are permitted to write off the end of the buffer by one byte" as part of our backwards compatibilty API contract, then we really have no safe choice but to copy the data into a bigger buffer before calling the callback if the frame ends at the packet boundary.
There was a problem hiding this comment.
That said: given the current example code, if we consider "onEvent handlers are permitted to write off the end of the buffer by one byte" as part of our backwards compatibilty API contract, then we really have no safe choice but to copy the data into a bigger buffer before calling the callback if the frame ends at the packet boundary.
What I thought too which is very ugly.
Let’s merge this PR I will open an issue for that. What we can do for now is to create a copy buffer but only for opcode TEXT with a \0 at the end.
| _status = WS_DISCONNECTING; | ||
| if (_client) { | ||
| _client->ackLater(); | ||
| } | ||
| _queueControl(WS_DISCONNECT, data, datalen); |
There was a problem hiding this comment.
if we receive an invalid frame, nothing were done so the connection would be left in an unknown state, including frame defragmentation... So let's force a disconnect to restore the connection.
it could also be a rogue client...
ad9be24 to
5d3e467
Compare
5d3e467 to
406147f
Compare
willmmiles
left a comment
There was a problem hiding this comment.
Sorry it's taken me so long to get through this; I don't have any Apple devices, so I hacked websocat to let me generate arbitrary frame boundaries.
The code seems to be working well for me for the test cases as defined. I've noted some other issues with the WS header framing code but they are out-of-scope for this PR. me-no-dev#953 on the old repo has a general fix for header fragmentation; this is what I pulled in the WLED fork. To improve here, we could adopt a similar approach of building up the header in a temporary buffer; or expand the state machine to cover the complete header assembly. That can be taken on later.
| // ); | ||
|
|
||
| data += 2; | ||
| plen -= 2; |
There was a problem hiding this comment.
Availability of these bytes is not checked - in theory plen could be 1 here. This is problematic throughout the header analysis. Not a new problem for this PR but I'd like to note it nonetheless.
There was a problem hiding this comment.
Yes I saw that, but we have to look at the WS spec also because these are the first 2 bytes received for the first frame: there shouldn't be any fragmentation there ?
The only case where fragmentation is happening within the assembly of the frame header is for the case of Safari which is sending the mask bytes in 2 shots.
| if (_pinfo.len == 126 && plen >= 2) { | ||
| _pinfo.len = fdata[3] | (uint16_t)(fdata[2]) << 8; | ||
| data += 2; | ||
| plen -= 2; |
There was a problem hiding this comment.
Availability of these bytes is checked, but if it spans the next TCP packet, our deserializer gets confused and we end up out of step. (Again out of scope here but worth noting)
| _status = WS_DISCONNECTING; | ||
| if (_client) { | ||
| _client->ackLater(); | ||
| } | ||
| _queueControl(WS_DISCONNECT, data, datalen); |
|
Thank you for your review @willmmiles ! |
Fix #382
WebSocket defragmentation was broken after a fix for Safari in PR #353
This PR refactors several parts:
How to test this PR:
With
WebSocket.inoexample, compile in verbose mode.Complete frame
Fragmentation test:
Safari fragmented packet test
Hit multiple time CMD+R on Safari => look for
waiting for more mask data: read=0/4