I was experimenting with the OldestAutoDowning strategy and stumbled upon a problem that occurs when multiple nodes become unreachable at the "same" time, i.e. within the same "stable-after" window.
In that case OldestAutoDowning downs only one node. The events for the other nodes are never processed, and the cluster stays stuck in the "Leader can currently not perform its duties" state forever.
2018-04-11 15:23:04 [INFO ] [a.c.Cluster(akka://somecluster)] - Cluster Node [akka.tcp://[email protected]:2552] - Leader can currently not perform its duties,
reachability status: [
akka.tcp://[email protected]:2552 -> akka.tcp://[email protected]:60202: Unreachable [Unreachable] (9),
akka.tcp://[email protected]:2552 -> akka.tcp://[email protected]:60265: Unreachable [Unreachable] (8)
], member status: [
akka.tcp://[email protected]:2552 Up seen=true,
akka.tcp://[email protected]:60202 Up seen=false,
akka.tcp://[email protected]:60265 Up seen=false
]
It seems to me that in this case CustomAutoDownBase#downPendingUnreachableMembers never gets called, because it is called only from OldestAutoDownBase#onMemberRemoved, and for some reason that callback never fires.
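To make that suspicion concrete, here is a toy model of the flow as I understand it. It is only a sketch under assumptions: `State`, `UnreachableTimeout`, `MemberRemoved`, `step`, and `run` are hypothetical names, none of this is the library's actual code, and the rule "the first timeout downs a member, later ones are parked as pending" is my guess at the behavior, not something I have confirmed in the source.

```scala
// Toy model only: it mimics the behavior described in this issue, not the library's code.
object PendingDowningSketch {
  // Hypothetical, simplified state: members already downed and members parked as pending.
  final case class State(downed: Set[String] = Set.empty, pending: Set[String] = Set.empty)

  sealed trait Event
  // A member stayed unreachable for the whole stable-after window.
  final case class UnreachableTimeout(member: String) extends Event
  // The cluster reported that a member was removed.
  final case class MemberRemoved(member: String) extends Event

  def step(state: State, event: Event): State = event match {
    // Assumption: the first timeout downs the member directly.
    case UnreachableTimeout(m) if state.downed.isEmpty =>
      state.copy(downed = state.downed + m)
    // Assumption: any later timeout is only parked as pending.
    case UnreachableTimeout(m) =>
      state.copy(pending = state.pending + m)
    // Pending members are flushed only here, mirroring the single call site
    // (onMemberRemoved) mentioned above.
    case MemberRemoved(_) =>
      State(downed = state.downed ++ state.pending, pending = Set.empty)
  }

  // Fold a sequence of events over the empty initial state.
  def run(events: Seq[Event]): State = events.foldLeft(State())(step)
}
```

In this model, `run(Seq(UnreachableTimeout("node-a"), UnreachableTimeout("node-b")))` leaves `node-b` in `pending`, and it stays there until a `MemberRemoved` event arrives. If the removal never happens, nothing else flushes the pending set, which would match the stuck state in the log.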
How to reproduce:
1. Start a node with the "master" role.
2. Start two other nodes that do not have the "master" role.
3. Kill the two nodes.

=> One node gets downed, the other does not, and the cluster stays in the "Leader can currently not perform its duties" state forever.
Is this a bug or a feature?
Thx!
kkolman changed the title from "OldestAutoDowning" to "OldestAutoDowning behavior with multiple nodes going unreachable at the same time" on Apr 11, 2018
@kkolman
I hope my implementation can help improve your Scala skills.
My motivation for creating this split brain resolver was to learn how Akka Cluster works.
I am not an active developer now, but thanks to the other contributors who kindly updated the Scala and Akka versions, I think this project is still a good playground.