SSO occasional Connection reset error #469

Open
4 tasks done
ognjenVlad opened this issue Jul 8, 2024 · 10 comments
Labels
impact/documentation A PR with changes which should be addressed in the documentation scope/backend Related to backend changes status/triage/completed Automatic triage completed type/bug Something isn't working

Comments

@ognjenVlad

ognjenVlad commented Jul 8, 2024

Issue submitter TODO list

  • I've looked up my issue in FAQ
  • I've searched for already existing issues here
  • I've tried running main-labeled docker image and the issue still persists there
  • I'm running a supported version of the application which is listed here

Describe the bug (actual behavior)

When using SSO, the web UI sometimes returns a 500 error. It is hard to pin down exactly when, but it seems to happen when coming back to the web UI after some time, although it sometimes also happens at random when simply trying to log in. It always occurs on the redirect to the redirect-uri login/oauth2/code/{client}.

{"code":5000,"message":"Connection reset","timestamp":1720450398132,"requestId":"e25340f5-656","fieldsErrors":null,"stackTrace":"org.springframework.web.reactive.function.client.WebClientRequestException: Connection reset\n\tat org.springframework.web.reactive.function.client.ExchangeFunctions$DefaultExchangeFunction.lambda$wrapException$9(ExchangeFunctions.java:136)\n\tSuppressed: The stacktrace has been enhanced by Reactor, refer to additional information below: \nError has been observed at the following site(s):\n\t*__checkpoint ⇢ Request to POST https://***/token [DefaultWebClient]\n\t*__checkpoint ⇢ OAuth2LoginAuthenticationWebFilter [DefaultWebFilterChain]\n\t*__checkpoint ⇢ OAuth2AuthorizationRequestRedirectWebFilter [DefaultWebFilterChain]\n\t*__checkpoint ⇢ ReactorContextWebFilter [DefaultWebFilterChain]\n\t*__checkpoint ⇢ HttpHeaderWriterWebFilter [DefaultWebFilterChain]\n\t*__checkpoint ⇢ ServerWebExchangeReactorContextWebFilter [DefaultWebFilterChain]\n\t*__checkpoint ⇢ org.springframework.security.web.server.WebFilterChainProxy [DefaultWebFilterChain]\n\t*__checkpoint ⇢ org.springframework.web.filter.reactive.ServerHttpObservationFilter [DefaultWebFilterChain]\n\t*__checkpoint ⇢ HTTP GET \"/login/oauth2/code/test?code=***\" [ExceptionHandlingWebHandler]\nOriginal Stack Trace:\n\t\tat org.springframework.web.reactive.function.client.ExchangeFunctions$DefaultExchangeFunction.lambda$wrapException$9(ExchangeFunctions.java:136)\n\t\tat reactor.core.publisher.MonoErrorSupplied.subscribe(MonoErrorSupplied.java:55)\n\t\tat reactor.core.publisher.Mono.subscribe(Mono.java:4495)\n\t\tat reactor.core.publisher.FluxOnErrorResume$ResumeSubscriber.onError(FluxOnErrorResume.java:103)\n\t\tat reactor.core.publisher.FluxPeek$PeekSubscriber.onError(FluxPeek.java:222)\n\t\tat reactor.core.publisher.FluxPeek$PeekSubscriber.onError(FluxPeek.java:222)\n\t\tat reactor.core.publisher.FluxPeek$PeekSubscriber.onError(FluxPeek.java:222)\n\t\tat 
reactor.core.publisher.MonoNext$NextSubscriber.onError(MonoNext.java:93)\n\t\tat reactor.core.publisher.MonoFlatMapMany$FlatMapManyMain.onError(MonoFlatMapMany.java:204)\n\t\tat reactor.core.publisher.SerializedSubscriber.onError(SerializedSubscriber.java:124)\n\t\tat reactor.core.publisher.FluxRetryWhen$RetryWhenMainSubscriber.whenError(FluxRetryWhen.java:225)\n\t\tat reactor.core.publisher.FluxRetryWhen$RetryWhenOtherSubscriber.onError(FluxRetryWhen.java:274)\n\t\tat reactor.core.publisher.FluxContextWrite$ContextWriteSubscriber.onError(FluxContextWrite.java:121)\n\t\tat reactor.core.publisher.FluxConcatMapNoPrefetch$FluxConcatMapNoPrefetchSubscriber.maybeOnError(FluxConcatMapNoPrefetch.java:326)\n\t\tat reactor.core.publisher.FluxConcatMapNoPrefetch$FluxConcatMapNoPrefetchSubscriber.onNext(FluxConcatMapNoPrefetch.java:211)\n\t\tat reactor.core.publisher.FluxContextWrite$ContextWriteSubscriber.onNext(FluxContextWrite.java:107)\n\t\tat reactor.core.publisher.SinkManyEmitterProcessor.drain(SinkManyEmitterProcessor.java:471)\n\t\tat reactor.core.publisher.SinkManyEmitterProcessor$EmitterInner.drainParent(SinkManyEmitterProcessor.java:615)\n\t\tat reactor.core.publisher.FluxPublish$PubSubInner.request(FluxPublish.java:873)\n\t\tat reactor.core.publisher.FluxContextWrite$ContextWriteSubscriber.request(FluxContextWrite.java:136)\n\t\tat reactor.core.publisher.FluxConcatMapNoPrefetch$FluxConcatMapNoPrefetchSubscriber.request(FluxConcatMapNoPrefetch.java:336)\n\t\tat reactor.core.publisher.FluxContextWrite$ContextWriteSubscriber.request(FluxContextWrite.java:136)\n\t\tat reactor.core.publisher.Operators$DeferredSubscription.request(Operators.java:1717)\n\t\tat reactor.core.publisher.FluxRetryWhen$RetryWhenMainSubscriber.onError(FluxRetryWhen.java:192)\n\t\tat reactor.core.publisher.MonoCreate$DefaultMonoSink.error(MonoCreate.java:201)\n\t\tat reactor.netty.http.client.HttpClientConnect$HttpObserver.onUncaughtException(HttpClientConnect.java:403)\n\t\tat 
reactor.netty.ReactorNetty$CompositeConnectionObserver.onUncaughtException(ReactorNetty.java:703)\n\t\tat reactor.netty.resources.DefaultPooledConnectionProvider$DisposableAcquire.onUncaughtException(DefaultPooledConnectionProvider.java:223)\n\t\tat reactor.netty.resources.DefaultPooledConnectionProvider$PooledConnection.onUncaughtException(DefaultPooledConnectionProvider.java:476)\n\t\tat reactor.netty.channel.FluxReceive.drainReceiver(FluxReceive.java:247)\n\t\tat reactor.netty.channel.FluxReceive.onInboundError(FluxReceive.java:468)\n\t\tat reactor.netty.channel.ChannelOperations.onInboundError(ChannelOperations.java:515)\n\t\tat reactor.netty.channel.ChannelOperationsHandler.exceptionCaught(ChannelOperationsHandler.java:145)\n\t\tat io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:346)\n\t\tat io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:325)\n\t\tat io.netty.channel.AbstractChannelHandlerContext.fireExceptionCaught(AbstractChannelHandlerContext.java:317)\n\t\tat io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireExceptionCaught(CombinedChannelDuplexHandler.java:424)\n\t\tat io.netty.channel.ChannelHandlerAdapter.exceptionCaught(ChannelHandlerAdapter.java:92)\n\t\tat io.netty.channel.CombinedChannelDuplexHandler$1.fireExceptionCaught(CombinedChannelDuplexHandler.java:145)\n\t\tat io.netty.channel.ChannelInboundHandlerAdapter.exceptionCaught(ChannelInboundHandlerAdapter.java:143)\n\t\tat io.netty.channel.CombinedChannelDuplexHandler.exceptionCaught(CombinedChannelDuplexHandler.java:231)\n\t\tat io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:346)\n\t\tat io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:325)\n\t\tat 
io.netty.channel.AbstractChannelHandlerContext.fireExceptionCaught(AbstractChannelHandlerContext.java:317)\n\t\tat io.netty.handler.ssl.SslHandler.exceptionCaught(SslHandler.java:1204)\n\t\tat io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:346)\n\t\tat io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:325)\n\t\tat io.netty.channel.AbstractChannelHandlerContext.fireExceptionCaught(AbstractChannelHandlerContext.java:317)\n\t\tat io.netty.channel.DefaultChannelPipeline$HeadContext.exceptionCaught(DefaultChannelPipeline.java:1377)\n\t\tat io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:346)\n\t\tat io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:325)\n\t\tat io.netty.channel.DefaultChannelPipeline.fireExceptionCaught(DefaultChannelPipeline.java:907)\n\t\tat io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.handleReadException(AbstractNioByteChannel.java:125)\n\t\tat io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:177)\n\t\tat io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788)\n\t\tat io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724)\n\t\tat io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650)\n\t\tat io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)\n\t\tat io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)\n\t\tat io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)\n\t\tat io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)\n\t\tat java.base/java.lang.Thread.run(Thread.java:840)\nCaused by: java.net.SocketException: Connection reset\n\tat 
java.base/sun.nio.ch.SocketChannelImpl.throwConnectionReset(SocketChannelImpl.java:394)\n\tat java.base/sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:426)\n\tat io.netty.buffer.PooledByteBuf.setBytes(PooledByteBuf.java:255)\n\tat io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:1132)\n\tat io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:357)\n\tat io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:151)\n\tat io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788)\n\tat io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724)\n\tat io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650)\n\tat io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)\n\tat io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)\n\tat io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)\n\tat io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)\n\tat java.base/java.lang.Thread.run(Thread.java:840)\n"}

Expected behavior

Successful login every time we use SSO

Your installation details

  1. 4de0d53

kafka: clusters: - name: test bootstrapServers: "***" logging: level: "debug" io.kafbat.ui: DEBUG org.springframework.http.codec.json.Jackson2JsonEncoder: DEBUG org.springframework.http.codec.json.Jackson2JsonDecoder: DEBUG reactor.netty.http.server.AccessLog: DEBUG org.springframework.security: DEBUG auth: type: OAUTH2 oauth2: client: test: clientId: "***" clientSecret: "***" scope: ["openid", "email", "groups"] client-name: github authorization-grant-type: authorization_code authorization-uri: https://***/auth redirect-uri: https://***/login/oauth2/code/test provider: "***" user-name-attribute: email token-uri: https://***/token issuer-uri: https://***/ custom-params: type: oauth roles-field: groups rbac: roles:

Steps to reproduce

Since it happens at random, it is hard to provide steps to reproduce. It happens occasionally when using SSO with a custom OIDC provider and with GitHub. I also tried older versions, and it happened with every one of them.

Screenshots

No response

Logs

No response

Additional context

No response

@ognjenVlad ognjenVlad added status/triage Issues pending maintainers triage type/bug Something isn't working labels Jul 8, 2024
@kapybro kapybro bot added status/triage/manual Manual triage in progress status/triage/completed Automatic triage completed and removed status/triage Issues pending maintainers triage labels Jul 8, 2024

github-actions bot commented Jul 8, 2024

Hi ognjenVlad! 👋

Welcome, and thank you for opening your first issue in the repo!

Please wait for triaging by our maintainers.

As development is carried out in our spare time, you can support us by sponsoring our activities or even funding the development of specific issues.
Sponsorship link

If you plan to raise a PR for this issue, please take a look at our contributing guide.

@Haarolean
Member

Which OIDC/OAuth provider do you use?

@ognjenVlad
Author

ognjenVlad commented Jul 15, 2024

Dex (https://github.com/dexidp/dex), which doesn't log any errors; authentication is successful. The 500 Connection reset error just occurs on the kafbat side.

@Haarolean
Member

Could you please try adding this env var
SERVER_REACTIVE_SESSION_TIMEOUT: "86400"
and observing if there are any changes?

@ognjenVlad
Author

ognjenVlad commented Jul 17, 2024

It is still happening, but it seems harder to reproduce now, if that makes sense. Thanks

@Haarolean
Member

We'd need a minimal reproducible example to be able to reproduce and fix this (if there's anything to fix).
Feel free to use our keycloak setup example if needed.

@Haarolean Haarolean added status/feedback-requested and removed status/triage/manual Manual triage in progress labels Aug 1, 2024

kapybro bot commented Aug 1, 2024

Further user feedback is requested. Please reply within 7 days or we might close the issue.


kapybro bot commented Aug 8, 2024

No feedback received within 7 days. Auto closing.

@sashaozz

This looks like your authorization server might be behind a load balancer that silently drops idle connections after a timeout. An example is AWS NLB, which has an idle timeout of 350 seconds.
This is a known problem (e.g. see reactor/reactor-netty#1774) with clients that pool TCP connections, such as netty, which is used in kafka-ui.

In this case you can try to fix it by instructing netty to evict pooled connections before the load balancer's timeout expires, e.g. by setting the env var below (both values are in milliseconds):
JAVA_OPTS: -Dreactor.netty.pool.maxIdleTime=30000 -Dreactor.netty.pool.maxLifeTime=60000
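
To illustrate why a client-side idle limit shorter than the load balancer's timeout avoids the error, here is a minimal Python sketch of the reasoning. It is a toy model, not reactor-netty's actual pool logic: the names `PooledConnection` and `reuse_outcome` are hypothetical. It assumes the load balancer drops connections idle for 350 seconds or more, and that no idle limit is configured unless one is passed in.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: idle timeout of the load balancer, taken from the AWS NLB
# figure quoted above.
LB_IDLE_TIMEOUT_S = 350


@dataclass
class PooledConnection:
    idle_for_s: float  # seconds since the connection was last used


def reuse_outcome(conn: PooledConnection,
                  max_idle_time_s: Optional[float],
                  lb_idle_timeout_s: float = LB_IDLE_TIMEOUT_S) -> str:
    """Classify what happens when the client next needs this connection."""
    # Client-side eviction: the pool discards connections that have been
    # idle for at least max_idle_time_s and opens a fresh one instead.
    if max_idle_time_s is not None and conn.idle_for_s >= max_idle_time_s:
        return "evicted, new connection opened"
    # No eviction, so the pool hands out the old socket. If the load
    # balancer has already dropped it, the next read fails.
    if conn.idle_for_s >= lb_idle_timeout_s:
        return "connection reset"
    return "reused"


stale = PooledConnection(idle_for_s=600)
print(reuse_outcome(stale, max_idle_time_s=None))  # connection reset
print(reuse_outcome(stale, max_idle_time_s=30))    # evicted, new connection opened
```

In this model, the 30-second limit corresponds to `maxIdleTime=30000` above. Any value comfortably below the load balancer's timeout would have the same effect.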

@Haarolean Haarolean reopened this Sep 11, 2024
@Haarolean Haarolean added the scope/backend Related to backend changes label Sep 11, 2024
@ognjenVlad
Author

This actually fixed it! @sashaozz Thank you very much

@Haarolean Haarolean reopened this Oct 10, 2024
@Haarolean Haarolean added the impact/documentation A PR with changes which should be addressed in the documentation label Oct 10, 2024