Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

client: propagate connection error causes to RPC statuses #4311

Merged
merged 12 commits into from
Apr 13, 2021
Prev Previous commit
Next Next commit
use WithBlock
  • Loading branch information
apolcyn committed Apr 13, 2021
commit bb3a1eea5fcfca216243c53ddec3549bb3fa53ec
4 changes: 3 additions & 1 deletion test/end2end_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -1341,7 +1341,9 @@ func (s) TestDetailedConnectionCloseErrorPropagatesToRpcError(t *testing.T) {
return status.Error(codes.Internal, "arbitrary status")
},
}
if err := ss.Start(nil); err != nil {
// Use WithBlock() to guarantee that the RPC will be able to pick a healthy connection to
// go out on before we call Stop on the server.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default "wait for ready" behavior of RPCs should be fine for this, too. Are you sure this is necessary (and why)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added this in because NI was still looking for a way to make sure that RPCs failed strictly after the client conn had found a healthy TCP connection for them to go out on, because I was seeing some flakes where the RPC failed with:

end2end_test.go:1372: &{0xc0000ba360}.Recv() = _, rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:37309: connect: connection refused", want _, rpc error containing substring: |connection reset by peer| OR |error reading from server: EOF|

i.e., in some test flakes, the call to ss.S.Stop() caused server shutdown apparently before the RPC picked a connection.

I dealt with this instead by just adding back the rpcStartedOnServer channel.

BTW, AFAICS RPCs aren't actually using the WaitForReady call option, when using the stub server, right? I do see

if err := waitForReady(cc); err != nil {
in the stub server, but it looks like that basically accomplishes the same thing as WithBlock, rather than the WaitForReady call option.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, AFAICS RPCs aren't actually using the WaitForReady call option, when using the stub server, right?

Oops.. "wait for ready" is not what I meant - the default is not WaitForReady. But by default, RPCs should wait for the channel to go from connecting->ready (or transient failure, whis is not expected in tests).

I'm not sure why that waitForReady call is in the stub server; we should probably remove it and do it in the tests that need it for whatever reason.

But since you are using ss.Start which does that waitForReady thing, I'm not sure why you'd be getting that RPC error with "connection refused". That seems like something we should look into.

if err := ss.Start(nil, grpc.WithBlock()); err != nil {
t.Fatalf("Error starting endpoint server: %v", err)
}
defer ss.Stop()
Expand Down