Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RSDK-9316 Prefer ErrDisconnected over closed pipe in client code #4571

Merged
merged 3 commits into from
Nov 21, 2024

Conversation

benjirewis
Copy link
Member

@benjirewis benjirewis commented Nov 20, 2024

Sensitivity to changes from viamrobotics/goutils#394.

@viambot viambot added the safe to test This pull request is marked safe to test from a trusted zone label Nov 20, 2024
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Nov 20, 2024
@@ -80,12 +80,12 @@ require (
go.uber.org/zap v1.27.0
go.viam.com/api v0.1.357
go.viam.com/test v1.2.3
go.viam.com/utils v0.1.112
go.viam.com/utils v0.1.116
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This quadruple bump could be trouble? Certainly comes with other direct and indirect transient bumps.

@@ -99,6 +99,11 @@ func (s *Server) ServeHTTP(w http.ResponseWriter, r *http.Request) {
h.ServeHTTP(w, r)
}

// Stats is unsupported.
func (s *Server) Stats() any {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same fix from #4566 @dgottlieb.

@@ -130,13 +128,6 @@ func skipConnectionCheck(method string) bool {
return exemptFromConnectionCheck[method]
}

func isClosedPipeError(err error) bool {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spoke with @dgottlieb a bit offline. We were relying on io.ErrClosedPipe to tell us if a client was disconnected in any of our gRPC calls within robot/client/client.go. We now explicitly rely on rpc.ErrDisconnected, which should be returned any time a gRPC call tries to open a new stream on a closed channel (we send a gRPC request over a closed connection.)

@benjirewis benjirewis marked this pull request as ready for review November 20, 2024 22:00
@@ -161,7 +152,7 @@ func (rc *RobotClient) handleUnaryDisconnect(
err := invoker(ctx, method, req, reply, cc, opts...)
// we might lose connection before our background check detects it - in this case we
// should still surface a helpful error message.
if isClosedPipeError(err) {
Copy link
Member

@dgottlieb dgottlieb Nov 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These "if closed pipe" conditionals were a little abstract before. Can we tighten up the documentation here? I'd love if we could list out exactly which errors bubble up here (obvious ErrDisconnected is one). And how this conditional changes the behavior in a desirable way.

It's hard to claim "surface a helpful error message" is an improvement over some unstated error message.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spoke a bit with @dgottlieb offline. I did some sleuthing into what errors can bubble up to this point; I do not think we're catching them all. Left a TODO with a reference to https://viam.atlassian.net/browse/RSDK-9333 and a NOTE explaining my findings/thoughts.

For now, I think we should capture both rpc.ErrDisconnected and io.ErrClosedPipe as possible disconnect manifestations, so we are not less sensitive than before. Per RSDK-9333, we should likely capture more than that and transform the errors into something more informative than an Unavailable gRPC error.

The test that was failing in a previous CI run was indeed "injecting" an io.ErrClosedPipe and expecting us to transform it into a gRPC error with code Unavailable. Left that test as-is.

@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Nov 21, 2024
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Nov 21, 2024
@benjirewis benjirewis merged commit baf6c50 into viamrobotics:main Nov 21, 2024
16 checks passed
@benjirewis benjirewis deleted the handle-err-disconnected branch November 21, 2024 18:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
safe to test This pull request is marked safe to test from a trusted zone
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants