-
Notifications
You must be signed in to change notification settings - Fork 68
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TCP read out-of-sync after exception #67
Comments
As a side note: I think this code in err = handle_req_fc(nmbs);
if (err != NMBS_ERROR_NONE && !nmbs_error_is_exception(err)) {
...
return err;
} |
Not all bytes may have been read from the TCP socket read buffer when the request processing is aborted by an exception. - remember the frame length from the message header - derease remaining length on every successful read - after sending the exception flush any remaining bytes from the socket Fixes debevv#67
The proposed PR fixes the problem for me, but I have client code disabled. |
Another side note: the following code in Linux TCP example platform code is incorrect: if (ret == 1) {
ssize_t w = write(fd, buf + total, count); it should of course be: ssize_t w = write(fd, buf + total, count - total); |
Not all bytes may have been read from the TCP socket read buffer when the request processing is aborted by an exception. - remember the frame length from the message header - derease remaining length on every successful read - after sending the exception flush any remaining bytes from the socket Fixes debevv#67
It seems like this only would be an issue when the client polling rate is higher than the server's internal polling rate. I think in most applications this isn't the case which is probably why I have not seen this problem. I also don't use POSIX sockets. Flushing the remainder of the frame seems like the logical thing to do here though. |
Maybe my wording was imprecise, but this has nothing to do with the Socket API, just TCP. I have no idea what the sender transmit or recipient processing speed has to do with this. TCP presents a bi-directional, unstructured stream of octets to the application, no matter what the actual API is. The structure on-top of that stream (AKA protocol) is the responsibility of the application. That is the reason why the Modbus TCP PDU has a length field (RTU must have some other method). But that length field is not available to the low-level platform read function, so frame handling needs to be implemented by the nanoMODBUS code. Sorry for sounding like a "Network 101" text book 😃 |
I was just saying that subsequent calls to I imagine in my application nanoMODBUS doesn't control your connections so I was confused as to why:
How you respond to errors returned by Not trying to bash your issue, I was just offering up some speculation as to why this this problem has gone unnoticed. |
That's the point: it doesn't do that. After the first PDU parsing is aborted too early, the framing is lost and all subsequent PDUs can't be parsed correctly anymore. So the client gets incorrect responses to correct PDUs.
Once again: the processing speed doesn't have anything to do with this. On the TCP stream there are no "frames", just octets. It is up to the application to recognize the frames and read each of them completely.
There is not enough information for the caller to handle the situation, i.e. the only thing he can do is to close the connection. But that means in the end that the client always looses the connection after it sends a PDU whose processing is aborted early due to some error condition. Of course the simple solution is to say: any client that sends a PDU that the server rejects with an exception is broken. Not sure how far you can get with this in real life... |
I still don't think you understand what I'm trying to say. You seem to be taking it way too personal. So sure, whatever you say. |
I think I finally figured out the implicit assumption behind this statement. You are stating that the contract for the
|
Yes, the reason why the library behaves this way is basically RTU. As @pseudotronics explained, the expectation behind this design is that the user will repeatedly call However, in TCP we do have a length field, and the server would be much more reliable if we didn't throw that info away and use it to try to read the whole message all at once. Besides, it feels a bit stupid to put ourselves in an out-of-sync situation by leaving data in the socket when responding whit an exception. It works anyway, but I think it's the time to make everything more solid. I will have a look at it in the next days because I want to introduce the change more holistically. Thanks to both of you for the constructive discussion. |
Thanks for the explanation. This clarifies
For my initial integration I followed the SOP for socket servers to drop the client connection when the application protocol layer reports an error. I have to admit that AFAIR Modbus is the first request/response protocol I encountered with this odd behavior, i.e. that on certain error conditions the server doesn't send a response, leaving the client to play a guessing game. For the product I'm working on I will not have any control over the client, so I have to make sure that my server behaves as expected by any client out there. |
Not all bytes may have been read from the TCP socket read buffer when the request processing is aborted by an exception. - remember the frame length from the message header - frame length must be at least 2 octets - decrease remaining length on every successful read - after sending the exception flush any remaining bytes from the socket Fixes debevv#67
I'm trying to integrate the code into my tree as Modbus TCP server. I started with the minimal configuration, ie. I flagged out all except one function implementation. While trying to test the illegal function exception I noticed that TCP socket read gets out-of-sync.
That is caused by the fact that only the frame header is read when
handle_req_fc()
sends the exception response. But the last bytes of the frame are still in the socket read buffer, hence the next call tonmbs_server_poll()
will then get stuck, because there is an invalid and incomplete message header in the buffer. When the client sends the next frame, thenNMBS_ERROR_INVALID_TCP_MBAP
Browsing through the code there are some other places where frame processing is aborted early. Those code paths will have the same problem.
I don't know if there is a specification how a Modbus TCP client is supposed to act in case of an exception. MUST he close the socket immediately and use a new one for the next request?
I'm not sure how this should be corrected, because
recv_msg_header()
doesn't store the frame length. How about this approachnbms
(as far as I understand that would only be valid for TCP)recv()
in theNBMS_ERROR_NONE
code pathsend_exception_msg()
should do aplatform.read()
(similar tonmbs_server_poll()
after the exception was successfully sent.The text was updated successfully, but these errors were encountered: