-
Notifications
You must be signed in to change notification settings - Fork 139
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Avoid a spin loop when inactive is fired before active #470
Conversation
Motivation: The NIOSSLHandler can enter a spin loop in `doUnbufferActions` if writing data into BoringSSL fails with SSL_ERROR_WANT_READ or SSL_ERROR_WANT_WRITE. One way these errors can happen is if a write into BoringSSL happens prior to the handshake completing. The NIOSSLHandler is explicit about starting the handshake: it's done in channel active and in handler added if the channel is already active. However, the handshaking step is currently done without any state checking, so if the state is 'closed' (as it would be after channel inactive) then the handshake will still start. This can happen if channel inactive happens before channel active. To reach the write loop there must be a buffered write and flush prior to the handshake step starting and the state isn't 'idle' or 'handshaking'. This can happen if a write and flush hapens while 'NIOSSLHandler' is in 'channelActive' (it forwards the 'channelActive' event _before_ starting the handshake) and 'channelInactive' came first. Modifications: - Early exit from 'doHandshakeStep' if the state isn't applicable for starting a handshake. - Don't buffer writes when the state is 'closed' as they'll never succeed and can be failed immediately. - If the write isn't succesful in 'doUnbufferActions' then either write to the network or try reading. - Only allow a limited number of spins through 'doUnbufferActions' Result: - Resolves apple#467
Sources/NIOSSL/NIOSSLHandler.swift
Outdated
if let promise = bufferedWrite.promise { promises.append(promise) } | ||
_ = self.bufferedActions.removeFirst() | ||
} else if didWrite { | ||
// The write into BoringSSL unsuccessful. This happens when BoringSSL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// The write into BoringSSL unsuccessful. This happens when BoringSSL | |
// The write into BoringSSL was unsuccessful. This happens when BoringSSL |
Sources/NIOSSL/NIOSSLHandler.swift
Outdated
break writeLoop | ||
} else { | ||
// No write was successful in this loop, so assume the error | ||
// was 'wants read'. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to assume here? It seems likely that we might want to actually take an action on this, should we be returning these values instead?
Relatedly, I'm not sure calling doDecodeData
in the loop here is a good idea. I don't think we should have any pending buffered data, we usually clear it on readComplete
.
Is the expectation here that we're actually in the handshake? If that's true, I wonder if we should rethink the state management here and check for that state, rather than indirectly try to handle want read and want write.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Relatedly, I'm not sure calling
doDecodeData
in the loop here is a good idea. I don't think we should have any pending buffered data, we usually clear it onreadComplete
.Is the expectation here that we're actually in the handshake? If that's true, I wonder if we should rethink the state management here and check for that state, rather than indirectly try to handle want read and want write.
Being in the implicit handshake is one way we can get to this point. I'm not sure what other situations would get us into this state. With that said though, I think the additional state checking prevents us from getting into the implicit handshake.
I didn't feel entirely comfortable about these changes (especially the else
branch) but leaving the potential spin loop also didn't sit right.
Given this code – aside from the known path into the spin loop – has worked fine for years, perhaps we should back out these changes and just keep the iteration limit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the iteration limit can work, combined with checking our state and delaying any flushing until after we think the handshake is done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the iteration limit can work
To clarify, do you mean just the iteration limit as-is? Or without the current else
branch? Or something else?
combined with checking our state and delaying any flushing until after we think the handshake is done.
Okay I think we have our bases covered then, we only call doUnbufferActions(context:)
from three different places:
channelRead(context:data:)
if the state isactive
oroutputClosed
completeHandshake(context:)
flush(context:)
if state is one of.active
,.unwrapping
,.closing
,.unwrapped
,.inputClosed
,.outputClosed
,.closed
So we should only be unbuffering actions and flushing if we're in an appropriate state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or without the current else branch?
Without the current else
branch.
Thank you!! |
Motivation:
The NIOSSLHandler can enter a spin loop in
doUnbufferActions
if writing data into BoringSSL fails with SSL_ERROR_WANT_READ or SSL_ERROR_WANT_WRITE.One way these errors can happen is if a write into BoringSSL happens prior to the handshake completing. The NIOSSLHandler is explicit about starting the handshake: it's done in channel active and in handler added if the channel is already active.
However, the handshaking step is currently done without any state checking, so if the state is 'closed' (as it would be after channel inactive) then the handshake will still start. This can happen if channel inactive happens before channel active.
To reach the write loop there must be a buffered write and flush prior to the handshake step starting and the state isn't 'idle' or 'handshaking'. This can happen if a write and flush hapens while 'NIOSSLHandler' is in 'channelActive' (it forwards the 'channelActive' event before starting the handshake) and 'channelInactive' came first.
Modifications:
Result:
writeAndFlush
gets run fromchannelActive
#467