Proposal for new interfaces assign strategy #1

noblepaul · 2020-07-02T00:42:38Z

No description provided.

sigram · 2020-07-02T10:16:13Z

solr/solrj/src/java/org/apache/solr/common/assign/ReplicaAssignStrategy.java

+    interface AddOperation extends Operation {
+        String targetNode();
+        String collection();
+        String slice();


Let's use "shard" everywhere, there's already too much confusion between slice and shard.

sigram · 2020-07-02T10:17:51Z

solr/solrj/src/java/org/apache/solr/common/assign/ReplicaAssignStrategy.java

+    interface MoveOperation extends AddOperation {
+
+        String fromNode();
+        String replicaName();


What is the replicaName? There's also a lot of confusion around this currently. Is it the core name or the coreNode name? Let's clarify it and use it consistently.

coreNodeName

Is coreNodeName global per cluster, or just per collection? Should we use the core name instead, which is unique?

coreNodeName is the global name. That's unique for a given collection.

sigram · 2020-07-02T10:18:34Z

solr/solrj/src/java/org/apache/solr/common/assign/ReplicaAssignStrategy.java

+     */
+    interface MoveOperation extends AddOperation {
+
+        String fromNode();


We should have an optional toNode if the requestor knows exactly where it wants to put the replica.

That must go into the hints.

sigram · 2020-07-02T10:22:22Z

solr/solrj/src/java/org/apache/solr/common/assign/ReplicaAssignStrategy.java

+        /**Generic method invocation endpoint for v2 APIs
+         *
+         */
+        String endPoint();


What's the end point here? /c/collections? Do we have any others? Maybe it should be a default method.

Yeah, mostly. But there are other sub paths that support collection admin ops as well.
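
A minimal sketch of the default-method idea from this thread, for illustration only. The type names are hypothetical stand-ins, not the PR's interfaces. The default path is the one quoted in the question above, and the sub path is an assumed example of "other sub paths".

```java
// Hypothetical stand-in for the Operation interface under review.
interface EndpointOperation {
    // Assumed default path, copied from the review question; most operations would inherit it.
    default String endPoint() {
        return "/c/collections";
    }
}

// An operation that is happy with the default endpoint.
class DefaultPathOperation implements EndpointOperation {
}

// An operation served from a sub path overrides the default (the path shape is illustrative).
class SubPathOperation implements EndpointOperation {
    private final String collection;

    SubPathOperation(String collection) {
        this.collection = collection;
    }

    @Override
    public String endPoint() {
        return "/c/" + collection + "/shards";
    }
}
```

With a default method, only the operations that target a sub path need to say anything about the endpoint.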

sigram · 2020-07-02T10:23:48Z

solr/solrj/src/java/org/apache/solr/common/assign/ReplicaAssignStrategy.java

+    /**
+     * This operation adds a replica to a given collection/shard
+     */
+    interface AddOperation extends Operation {


Why interface here? I would expect a concrete class that implements interface methods.

Make the impl opaque

sigram · 2020-07-02T10:25:42Z

solr/solrj/src/java/org/apache/solr/common/assign/ReplicaAssignStrategy.java

+        /** The payload. This will be serialized to JSON and will be posted to Solr
+         *
+         */
+        MapWriter payload();


Maybe this should be an abstract class, or at least provide a default implementation of this method to write out the common payload elements.

Operation extends MapWriter would make sense too.
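
A rough sketch of how the two suggestions above could combine, assuming the operation itself acts as the payload writer and a default method writes the common elements. `SimpleMapWriter` is a simplified, hypothetical stand-in so the example compiles on its own; it is not Solr's `MapWriter`, whose real signature differs. All other names are illustrative too.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BiConsumer;

// Simplified, hypothetical stand-in for a map-writing interface (NOT Solr's MapWriter).
interface SimpleMapWriter {
    void writeMap(BiConsumer<String, Object> out);

    // Convenience helper: collect the written entries in insertion order.
    default Map<String, Object> toMap() {
        Map<String, Object> entries = new LinkedHashMap<>();
        writeMap(entries::put);
        return entries;
    }
}

// The operation is its own payload; common elements are written once, here.
interface PayloadOperation extends SimpleMapWriter {
    String collection();

    String shard();

    @Override
    default void writeMap(BiConsumer<String, Object> out) {
        out.accept("collection", collection());
        out.accept("shard", shard());
        writeExtra(out);
    }

    // Hook for operation-specific entries; writes nothing by default.
    default void writeExtra(BiConsumer<String, Object> out) {
    }
}

// Example operation that adds one specific entry on top of the common ones.
class AddReplicaPayload implements PayloadOperation {
    private final String collection;
    private final String shard;
    private final String targetNode;

    AddReplicaPayload(String collection, String shard, String targetNode) {
        this.collection = collection;
        this.shard = shard;
        this.targetNode = targetNode;
    }

    @Override
    public String collection() {
        return collection;
    }

    @Override
    public String shard() {
        return shard;
    }

    @Override
    public void writeExtra(BiConsumer<String, Object> out) {
        out.accept("node", targetNode);
    }
}
```

The trade-off is the usual one: a default method keeps `Operation` an interface, while an abstract class would also allow shared state.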

sigram · 2020-07-02T10:29:17Z

solr/solrj/src/java/org/apache/solr/common/assign/ReplicaAssignStrategy.java

+/**
+ * The implementation class can be stored in clusterprops.json as follows
+ * {
+ *     assign-strategy : {


It should be possible to load and use multiple named strategies. I can imagine a common default strategy that uses Policy framework for some collections, and an override in collection config to use another strategy for other collections where Policy doesn't work well.

I would strongly recommend against that. A violation for one strategy may not be the same for another.

sigram · 2020-07-02T10:31:13Z

solr/solrj/src/java/org/apache/solr/common/assign/ReplicaAssignStrategy.java

+    /**
+     * get a list of operations that can be performed for the intents.
+     *
+     * There should be a one-to-one mapping between intent and operation


Then we need a NO_OP too. We also need to somehow report that the Intent could not be satisfied, maybe an INVALID op?

Yes, I thought of that. Forgot to add it.
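
To make the one-to-one contract concrete, here is a small sketch in which every intent yields exactly one operation, including NO_OP when nothing needs doing and INVALID when the intent cannot be satisfied. Every type and rule here is hypothetical (the "pick the first live node" placement in particular is a placeholder), not something defined in the PR.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical operation kinds, including the two proposed in this thread.
enum OpKind { ADD, NO_OP, INVALID }

// Hypothetical intent: bring a shard up to a desired replica count.
final class ReplicaIntent {
    final String shard;
    final int desiredReplicas;

    ReplicaIntent(String shard, int desiredReplicas) {
        this.shard = shard;
        this.desiredReplicas = desiredReplicas;
    }
}

final class PlannedOp {
    final OpKind kind;
    final String shard;
    final String node; // null unless kind == ADD

    PlannedOp(OpKind kind, String shard, String node) {
        this.kind = kind;
        this.shard = shard;
        this.node = node;
    }
}

final class OneToOnePlanner {
    // Returns exactly one operation per intent, in the same order as the input.
    static List<PlannedOp> plan(List<ReplicaIntent> intents,
                                Map<String, Integer> currentReplicas,
                                List<String> liveNodes) {
        List<PlannedOp> ops = new ArrayList<>();
        for (ReplicaIntent intent : intents) {
            int current = currentReplicas.getOrDefault(intent.shard, 0);
            if (current >= intent.desiredReplicas) {
                // Nothing to do: the shard already has enough replicas.
                ops.add(new PlannedOp(OpKind.NO_OP, intent.shard, null));
            } else if (liveNodes.isEmpty()) {
                // The intent cannot be satisfied: there is nowhere to place a replica.
                ops.add(new PlannedOp(OpKind.INVALID, intent.shard, null));
            } else {
                // Placeholder placement rule: take the first live node.
                ops.add(new PlannedOp(OpKind.ADD, intent.shard, liveNodes.get(0)));
            }
        }
        return ops;
    }
}
```

Because the output list lines up index by index with the input, the caller can pair each intent with its outcome without any extra bookkeeping.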

sigram · 2020-07-02T10:32:35Z

solr/solrj/src/java/org/apache/solr/common/assign/ReplicaAssignStrategy.java

+        /**
+         * Moves a replica
+         */
+        MOVE(CollectionParams.CollectionAction.MOVEREPLICA);


DELETEREPLICA too, for scenarios when you dynamically scale down.

sigram · 2020-07-02T10:36:22Z

solr/solrj/src/java/org/apache/solr/common/assign/ReplicaAssignStrategy.java

+    void init(SolrCloudManager cloudManager);
+
+    /**
+     * get a list of operations that can be performed for the intents.


"can" or "will"? This is an important difference in semantics, because if you know you will perform these ops then you should make reservations for the positions. "Can" is much weaker, it doesn't lock out the slots for other concurrent computations.

We do not know if the invoker "will" execute these operations.

In other words, the assumption is that this method uses the current cluster state that was provided to it and does not itself make any reservations.

(Just clarifying ... I think that's ok, we could provide it a cluster view that already includes other operations in progress).
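
A sketch of the "cluster view that already includes other operations in progress" idea, under the assumption that the view can be reduced to replica counts per node. The class, its methods, and the least-loaded rule are all hypothetical; the point is only that the strategy stays stateless and makes no reservations, while the caller decides what the view contains.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; not part of the PR.
final class ClusterViewOverlay {
    // Builds the view a strategy would see: current counts plus placements
    // that were planned elsewhere but not yet executed. The input map is not modified.
    static Map<String, Integer> withPending(Map<String, Integer> replicasPerNode,
                                            List<String> pendingTargetNodes) {
        Map<String, Integer> view = new HashMap<>(replicasPerNode);
        for (String node : pendingTargetNodes) {
            view.merge(node, 1, Integer::sum);
        }
        return view;
    }

    // Stateless choice: the least-loaded node in the given view (ties broken by name).
    // It reserves nothing; calling it twice on the same view gives the same answer.
    static String leastLoaded(Map<String, Integer> view) {
        String best = null;
        for (Map.Entry<String, Integer> e : view.entrySet()) {
            if (best == null
                    || e.getValue() < view.get(best)
                    || (e.getValue().equals(view.get(best)) && e.getKey().compareTo(best) < 0)) {
                best = e.getKey();
            }
        }
        return best;
    }
}
```

With this split, "can" semantics stay in the strategy, and any "will" semantics (tracking what is in flight) live with the invoker that builds the view.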

murblanc · 2020-07-02T10:39:25Z

My main concern with the proposal is that the plugin code now depends on SolrCloudManager and Suggester.Hint. From these classes/interfaces it then sees ClusterStateProvider, ClusterState (which you wanted to remove, Noble), etc.

It's going to be hard or painful to change code (Autoscaling, or other code related to cluster state, etc.) when contributions do not live in the same repository, or might not be accessible to us at all, and so can't be refactored at the same time.

Rough example of what I mean when I suggest to separate what the new assignment plugins can touch vs existing code.

murblanc · 2020-07-02T11:39:03Z

Just committed to this branch the beginning of what I'm thinking of...

sigram · 2020-07-02T12:50:50Z

solr/core/src/java/org/apache/solr/cloud/api/collections/IlanProposalAssigner.java

+
+    int getCountReplicas();
+    long getTotalSizeBytes();
+    long getFreeSizeBytes();


We need much more information than that here: sysprops, JVM and OS metrics, etc.

sigram · 2020-07-02T12:52:37Z

solr/core/src/java/org/apache/solr/cloud/api/collections/IlanProposalAssigner.java

+
+  interface AssignerCollectionState {
+    String getCollectionName();
+    Iterator<String> getSliceNames();


Please, let's not do this again ... :) Let's just use "shard" everywhere, it's already bad as it is.

I'm fine with that. Shall we aim to get rid of Slice in general in Solr code?

That would be great but it's another issue. I already filed Jira for removing ReplicaInfo, we have Replica that serves exactly the same role.

Not sure which is better. "Shard" is an abused term. Solr code is littered with both.

Whichever we use, let's just use one consistently. Shard is abused? It always refers to a partition of the collection. Adding Slice was a mistake IMHO, but that was done early on when the terminology was still in flux. Nowadays, in my experience, people commonly use "shard", not "slice".

sigram · 2020-07-02T12:54:14Z

solr/core/src/java/org/apache/solr/cloud/api/collections/IlanProposalAssigner.java

+    long getFreeSizeBytes();
+  }
+
+  interface AssignerReplicaInfo {


Similarly, we need much more info here, too - things like replica state and replica metrics - stuff like physical size, number of docs, recent use, etc etc.

Yes @sigram, we actually need to make a comprehensive list of the data needed to compute operations.

sigram · 2020-07-02T12:57:00Z

solr/core/src/java/org/apache/solr/cloud/api/collections/IlanProposalAssigner.java

+    String getCollectionName();
+    Iterator<String> getSliceNames();
+
+    Iterator<AssignerReplicaInfo> getSliceReplicasInfo(String sliceName);


I understand the desire to hide as much as possible how these are constructed, but we will frequently need things like size(), so maybe better to use Collection here and in similar places.

My motivation to provide Iterator was for cases when it's likely we don't need the whole set/collection (lazy building). Maybe there aren't such cases.
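
One possible middle ground, sketched under the assumption that the number of replicas is cheap to know while the per-replica info is expensive to build: a read-only `Collection` that answers `size()` immediately and builds elements only during iteration. The class name and the loader function are hypothetical.

```java
import java.util.AbstractCollection;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

// Hypothetical read-only collection whose elements are built on demand.
final class LazyReplicaInfos<T> extends AbstractCollection<T> {
    private final List<String> replicaNames;
    private final Function<String, T> loader;

    LazyReplicaInfos(List<String> replicaNames, Function<String, T> loader) {
        this.replicaNames = replicaNames;
        this.loader = loader;
    }

    // Known up front; does not trigger any loading.
    @Override
    public int size() {
        return replicaNames.size();
    }

    // Each element is produced by the loader only when the iterator reaches it.
    @Override
    public Iterator<T> iterator() {
        Iterator<String> names = replicaNames.iterator();
        return new Iterator<T>() {
            @Override
            public boolean hasNext() {
                return names.hasNext();
            }

            @Override
            public T next() {
                return loader.apply(names.next());
            }
        };
    }
}
```

Note that nothing is cached in this sketch, so iterating twice loads twice; whether that matters depends on how the infos are actually built.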

sigram · 2020-07-02T12:58:00Z

solr/core/src/java/org/apache/solr/cloud/api/collections/IlanProposalAssigner.java

+  interface AssignementRequest {
+  }
+
+  interface AddReplicaAssignmentRequest {


These probably can be concrete classes.

The requests should not, no reason to expose an implementation to the plugin.
Maybe the decisions could be concrete classes that the plugin would instantiate though.

Interfaces are in general better.

sigram · 2020-07-02T12:58:57Z

solr/core/src/java/org/apache/solr/cloud/api/collections/IlanProposalAssigner.java

+    Replica.Type getType();
+  }
+
+  interface AssignmentDecision {


Why not pass in the original AssignmentRequest here? This will be needed for logging, tracking and debugging anyway.

I was not assuming a one-to-one correspondence between assignment requests and decisions. The calling code (Solr calling the plugin) knows what it just passed in the call to computeAssignements() when it gets the list of AssignmentDecision back.

Even if it's not always a 1:1 correspondence, in cases when it is, it would be good to know the reason for this decision.

Is it possible that a request may not have a corresponding decision?

Or the other way around: a request having multiple decisions? For example, create collection could result in multiple replica creations. Just making this up as I write, but it doesn't sound impossible.

I don't see a problem in either case - we should be able to tie any decisions on the output to a particular intent on the input. In 1 -> N case each Decision would simply refer to the same Intent. In 1 -> null we have a NO_OP Decision. Is N -> M even realistic? I don't think so at the moment.
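
The 1 -> N and 1 -> null cases described above can be sketched as follows, assuming each decision simply carries a reference to the request it answers. The types are hypothetical and only loosely modelled on the AssignmentRequest and AssignmentDecision names in the diff; a null node stands in for a NO_OP decision.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical request type; identity (not equality) ties decisions back to it.
final class TracedRequest {
    final String description;

    TracedRequest(String description) {
        this.description = description;
    }
}

// Hypothetical decision type that keeps a reference to its originating request.
final class TracedDecision {
    final TracedRequest request; // the intent this decision answers
    final String nodeName;       // null marks a NO_OP decision

    TracedDecision(TracedRequest request, String nodeName) {
        this.request = request;
        this.nodeName = nodeName;
    }

    boolean isNoOp() {
        return nodeName == null;
    }
}

final class DecisionIndex {
    // Groups decisions by the request instance that produced them, for logging or debugging.
    static Map<TracedRequest, List<TracedDecision>> byRequest(List<TracedDecision> decisions) {
        Map<TracedRequest, List<TracedDecision>> index = new IdentityHashMap<>();
        for (TracedDecision d : decisions) {
            index.computeIfAbsent(d.request, r -> new ArrayList<>()).add(d);
        }
        return index;
    }
}
```

In the 1 -> N case several decisions share one request reference, and in the 1 -> null case the request maps to a single NO_OP decision, so every request passed in stays traceable on the output.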

sigram · 2020-07-02T12:59:29Z

solr/core/src/java/org/apache/solr/cloud/api/collections/IlanProposalAssigner.java

+  }
+
+  interface AddReplicaAssignmentDecision extends AssignmentDecision {
+    String getNodeName();


The other properties could come from the AssignmentRequest referenced here.

noblepaul · 2020-07-03T01:00:42Z

My main concern with the proposal is that the plugin code now depends on SolrCloudManager and Suggester.Hint.

Yes, @murblanc. That Hint enum will be copied over to the new API. I just reused it for the sake of simplicity.

I'm +1 for making a lighter interface instead of SolrCloudManager.

noblepaul · 2020-07-16T10:10:11Z

@murblanc

I like your proposal. Why don't we make this formal and create a ticket?

murblanc · 2020-07-16T12:11:20Z

@noblepaul give me a few more days please. I'd like to have a better global view of current Autoscaling to make sure this "stop gap" option (that's how I see my proposal) does go in the right longer term direction.

noblepaul added 6 commits July 2, 2020 10:41

scratch

6a94ac2

javadocs

d050d2f

javadocs

82d73f7

AL header

82bb8d7

javadocs

9fa184f

javadocs

1540d19

Create IlanProposalAssigner.java

8547290

Rough example of what I mean when I suggest to separate what the new assignment plugins can touch vs existing code.
