IGNITE-22315 Make raft-client starting only once and only with raft-client and replica together #3956
# Conflicts: # modules/replicator/src/main/java/org/apache/ignite/internal/replicator/ReplicaManager.java # modules/table/src/main/java/org/apache/ignite/internal/table/distributed/TableManager.java
if (localMemberAssignment == null) {
    // (0) in case if node not in the assignments
    return nullCompletedFuture();
}
Please add an empty line here.
Done
// TODO: will be replaced with replica usage in https://issues.apache.org/jira/browse/IGNITE-22218
RaftGroupService raftGroupService;
try {
    raftGroupService = internalTable.tableRaftService().partitionRaftGroupService(partitionId);
Why do we need this try-catch now? Could you add a comment explaining why a simple `continue` is enough? Shouldn't we log a warning?
The test's logic is to go through every partition and try to get the raft client from TableRaftService, if present. The javadoc of TableRaftService#partitionRaftGroupService says: "IgniteInternalException if partition can't be found." So, for now, the only safe way to check that there is no raft client is to catch the exception. The good news is that this code is already removed in the next related PR, so I guess we can leave it for now?
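The check described in this reply can be sketched as follows. This is a minimal illustration, not the code from the PR: `FakeTableRaftService` and `PartitionNotFoundException` are hypothetical stand-ins for `TableRaftService` and `IgniteInternalException`, and the raft client is modelled as a plain string.

```java
import java.util.Map;
import java.util.Optional;

public class RaftClientProbe {
    /** Hypothetical stand-in for IgniteInternalException (unchecked). */
    static class PartitionNotFoundException extends RuntimeException {
        PartitionNotFoundException(String message) {
            super(message);
        }
    }

    /** Hypothetical stand-in for TableRaftService: throws if the partition has no raft client. */
    static class FakeTableRaftService {
        private final Map<Integer, String> clients;

        FakeTableRaftService(Map<Integer, String> clients) {
            this.clients = clients;
        }

        String partitionRaftGroupService(int partitionId) {
            String client = clients.get(partitionId);

            if (client == null) {
                throw new PartitionNotFoundException("No such partition " + partitionId);
            }

            return client;
        }
    }

    /** Turns the "throws if absent" contract into an Optional, so callers can simply skip absent clients. */
    static Optional<String> findRaftClient(FakeTableRaftService service, int partitionId) {
        try {
            return Optional.of(service.partitionRaftGroupService(partitionId));
        } catch (PartitionNotFoundException e) {
            // The exception is the only signal that no raft client exists for this partition.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        FakeTableRaftService service = new FakeTableRaftService(Map.of(0, "client-0"));

        System.out.println(findRaftClient(service, 0).isPresent());
        System.out.println(findRaftClient(service, 1).isPresent());
    }
}
```

Catching only the specific exception type keeps unrelated failures visible, which is the concern raised in the question above.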
...egrationTest/java/org/apache/ignite/internal/runner/app/ItIgniteInMemoryNodeRestartTest.java
} catch (IgniteInternalException e) {
    return false;
}
return raftClient != null;
Please add empty lines here and there :) according to the recommendation "The source code should be broken into minimal semantic units and all such units must be separated by one empty line. To simplify the recognition of semantic units, every line of the source code should be separated by one empty line" from the https://cwiki.apache.org/confluence/display/IGNITE/Coding+Guidelines#CodingGuidelines-SemanticUnits
Rewrote it a little
@@ -713,7 +713,7 @@ public TableViewInternal startTable(String tableName, SchemaDescriptor schemaDes
         storageIndexTracker,
         completedFuture(listener)
     );
-} catch (NodeStoppingException e) {
+} catch (Throwable e) {
Looks like it can now catch not only node stopping, but any other exception as well, an NPE for example.
The try-catch is removed; it looks excessive there.
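The concern in this thread, that a broad `catch (Throwable e)` also swallows programming errors such as an NPE, can be illustrated with a small sketch. All names here are hypothetical: `StoppingException` stands in for `NodeStoppingException`, and `start` is an invented method, not anything from the PR.

```java
public class BroadCatch {
    /** Hypothetical stand-in for NodeStoppingException (checked). */
    static class StoppingException extends Exception {
    }

    /** Invented operation: fails with the checked exception when stopping, and with an NPE on a null name. */
    static String start(String tableName, boolean stopping) throws StoppingException {
        if (stopping) {
            throw new StoppingException();
        }

        return "table-" + tableName.trim();
    }

    /** Broad catch: a null table name (a bug) is silently reported as "ignored". */
    static String startBroad(String tableName) {
        try {
            return start(tableName, false);
        } catch (Throwable e) {
            return "ignored";
        }
    }

    /** Narrow catch: only the expected checked exception is handled, the NPE propagates. */
    static String startNarrow(String tableName) {
        try {
            return start(tableName, false);
        } catch (StoppingException e) {
            return "ignored";
        }
    }

    public static void main(String[] args) {
        System.out.println(startBroad(null));

        try {
            startNarrow(null);
        } catch (NullPointerException e) {
            System.out.println("NPE surfaced");
        }
    }
}
```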
# Conflicts: # modules/table/src/main/java/org/apache/ignite/internal/table/distributed/TableManager.java
@@ -713,9 +711,6 @@ public TableViewInternal startTable(String tableName, SchemaDescriptor schemaDes
         storageIndexTracker,
         completedFuture(listener)
     );
-} catch (NodeStoppingException e) {
Why?
Do you mean that it will throw an unchecked exception instead? If so, which one? Or is there no longer any network communication on replica creation, so that it's not possible to get NodeStoppingException?
The deprecated startReplica overload is used there. It doesn't declare NodeStoppingException explicitly, so there are no calls that throw the checked exception. The proper startReplica overload does throw it, but it isn't used in the test; this will be fixed in IGNITE-22373 | Delete startReplica overloads from ReplicaManager.
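The Java rule behind this answer is that a `catch` clause for a checked exception does not compile when nothing in the `try` body can throw it. A minimal sketch under assumed names follows: `StoppingException` stands in for `NodeStoppingException`, and both `startReplica` overloads are invented for illustration and do not mirror the real `ReplicaManager` signatures.

```java
public class CheckedOverloads {
    /** Hypothetical stand-in for NodeStoppingException (checked). */
    static class StoppingException extends Exception {
    }

    /** Invented "deprecated" overload: declares no checked exception. */
    @Deprecated
    static String startReplica(int partitionId) {
        return "replica-" + partitionId;
    }

    /** Invented "proper" overload: declares the checked exception. */
    static String startReplica(int partitionId, boolean nodeStopping) throws StoppingException {
        if (nodeStopping) {
            throw new StoppingException();
        }

        return "replica-" + partitionId;
    }

    /**
     * Calls the overload without a throws clause. Wrapping this call in
     * try { ... } catch (StoppingException e) { ... } would be rejected by the compiler,
     * because the checked exception can never be thrown from the try body.
     */
    static String viaDeprecated(int partitionId) {
        return startReplica(partitionId);
    }

    /** Calls the overload with a throws clause, so the catch is both allowed and required. */
    static String viaProper(int partitionId, boolean nodeStopping) {
        try {
            return startReplica(partitionId, nodeStopping);
        } catch (StoppingException e) {
            return "stopped";
        }
    }

    public static void main(String[] args) {
        System.out.println(viaDeprecated(1));
        System.out.println(viaProper(1, true));
    }
}
```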
...rc/integrationTest/java/org/apache/ignite/internal/rebalance/ItRebalanceDistributedTest.java
…gAssignmentEvent` branch
…balance` (don't know actually why)
…mentEvent` branch
…tract the common method
…updatePartitionClients`
JIRA Ticket: IGNITE-22315 | Make raft-client starting only once and only with raft-client and replica together
The goal
The goal of the ticket is to stop starting excessive raft-clients.
The reason
If and only if the local node is in the assignments should we start a whole replication group, which consists of a raft-node, a raft-client and the replica itself. Each entity exists as a single copy per group, and none of them should be started outside of the group. Before this ticket, every partition started a raft-client.
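The rule above (start the group only when the local node is assigned, and only once per partition) can be sketched as below. This is an illustration under assumptions, not the PR's implementation: `ReplicationGroupStarter`, `startIfAssigned` and the string that represents a started group are all hypothetical.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class ReplicationGroupStarter {
    private final String localNodeName;

    /** One start future per partition, so repeated requests reuse the same group. */
    private final Map<Integer, CompletableFuture<String>> startedGroups = new ConcurrentHashMap<>();

    /** Counts how many times a group was actually started (for demonstration only). */
    final AtomicInteger startCount = new AtomicInteger();

    ReplicationGroupStarter(String localNodeName) {
        this.localNodeName = localNodeName;
    }

    CompletableFuture<String> startIfAssigned(int partitionId, Set<String> assignments) {
        if (!assignments.contains(localNodeName)) {
            // The local node is not in the assignments: nothing from the group is started.
            return CompletableFuture.completedFuture(null);
        }

        // computeIfAbsent guarantees a single start per partition.
        return startedGroups.computeIfAbsent(partitionId, id -> {
            startCount.incrementAndGet();

            return CompletableFuture.completedFuture("group-" + id);
        });
    }

    public static void main(String[] args) {
        ReplicationGroupStarter starter = new ReplicationGroupStarter("node-a");

        System.out.println(starter.startIfAssigned(0, Set.of("node-a", "node-b")).join());
        System.out.println(starter.startIfAssigned(1, Set.of("node-b")).join());
        System.out.println(starter.startCount.get());
    }
}
```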
The solution
ReplicaManager#startReplica chains the start of all raft entities together with the replica start into a common future.
TableManager#startPartitionAndStartClient is refactored to simplify the code.
The previous PR, abandoned because of flaky tests, is closed.
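Chaining several asynchronous starts into a common future can be sketched with `CompletableFuture#thenCompose`. This is only an illustration of the idea: the three `start*` methods and their string results are hypothetical and do not reflect the actual `ReplicaManager#startReplica` signature.

```java
import java.util.concurrent.CompletableFuture;

public class ChainedReplicaStart {
    /** Hypothetical async start of the raft node for a partition. */
    static CompletableFuture<String> startRaftNode(int partitionId) {
        return CompletableFuture.supplyAsync(() -> "raft-node-" + partitionId);
    }

    /** Hypothetical async start of the raft client; runs only after the raft node is up. */
    static CompletableFuture<String> startRaftClient(String raftNode) {
        return CompletableFuture.supplyAsync(() -> raftNode + " -> raft-client");
    }

    /** Hypothetical async start of the replica; runs only after the raft client is up. */
    static CompletableFuture<String> startReplica(String raftClient) {
        return CompletableFuture.supplyAsync(() -> raftClient + " -> replica");
    }

    /** A single future that completes when the whole group is started, or fails if any step fails. */
    static CompletableFuture<String> startReplicationGroup(int partitionId) {
        return startRaftNode(partitionId)
                .thenCompose(ChainedReplicaStart::startRaftClient)
                .thenCompose(ChainedReplicaStart::startReplica);
    }

    public static void main(String[] args) {
        System.out.println(startReplicationGroup(0).join());
    }
}
```

With a single chained future, callers wait on one result, and a failure in any step completes the whole chain exceptionally instead of leaving a partially started group unnoticed.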