duplicate key error (E11000) #218
Supposition (based on real observation):
Is it really possible? I've seen another problem: on the first persist failure (due to timeout) it is "normal" that the circuit breaker opens. But on the second one, we have an issue on a particular "resource", not on all database accesses, and the circuit breaker will probably open and impact all database operations. Could we avoid that? Is it even possible with the akka persistence plugin API?
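For context on that question: as far as I know, Akka keeps one circuit breaker per journal plugin, not per persistent actor, which is why a timeout on one entity's write can open the breaker for every database operation. The breaker can at least be tuned through the `journal-plugin-fallback` section. A minimal sketch (the values are illustrative, not recommendations):

```scala
import com.typesafe.config.ConfigFactory

// Journal circuit breaker settings shared by all persistence operations
// of the plugin; tune these if slow writes trip the breaker too eagerly.
val breakerConfig = ConfigFactory.parseString(
  """
  akka.persistence.journal-plugin-fallback.circuit-breaker {
    max-failures = 10     # failures before the breaker opens
    call-timeout = 10s    # how long a single journal call may take
    reset-timeout = 30s   # how long the breaker stays open before half-open
  }
  """)
```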
Hum ok, in fact it's a well-known subject:
Thus it may depend on write/read concern, and on read preference, for a MongoDB client to be able to see its own writes...
Which write concern are you using? I've never seen this behavior, but I don't have a super high rate of messages or especially recovery. I do use journaled write concern.
Acknowledged. Read preference is to read from the primary by default; is it overridden by the plugin?
I don't think the plugin does anything to actively touch read preference. Acknowledged write concern does have some consistency tradeoffs for performance, enough to scare me off from using that mode in production for whatever that's worth.
Ok. Am I right? The question is then: how to implement "defer reading the highest sequence number until all outstanding writes have completed"?
I think the best path is to retry. Since only the mongo server knows when all outstanding writes have completed, and the plugin does not control how/when the library (…)
I don't think we can retry persist: in the doc of (…). A solution that I see could be to use write-concern=majority for all writes and read-concern=linearizable when reading the highest sequence number. This would cover the cases where the write was acknowledged; for the race condition where it times out, we could just erase the previous event on the second persist (after the duplicate-key error). This solution is certainly not ideal because it would mask some other errors. We may ask @patriknw how it is managed in the cassandra plugin? I see that there is a data structure (…)
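For illustration, here is a sketch of that proposal using the mongo scala driver directly. This is not plugin code; the collection name `akka_persistence_journal` and the field names `pid` and `to` are assumptions for the example:

```scala
import org.mongodb.scala._
import org.mongodb.scala.model.{Filters, Sorts}
import scala.concurrent.Future

val client = MongoClient("mongodb://localhost:27017")
val journal: MongoCollection[Document] =
  client.getDatabase("events")
    .getCollection("akka_persistence_journal")
    .withWriteConcern(WriteConcern.MAJORITY)   // writes wait for a majority of the replica set
    .withReadConcern(ReadConcern.LINEARIZABLE) // reads reflect every acknowledged write

// Read the highest stored sequence number for one persistence id
// (hypothetical document schema with "pid" and "to" fields).
def highestSeqNr(persistenceId: String): Future[Option[Document]] =
  journal.find(Filters.equal("pid", persistenceId))
    .sort(Sorts.descending("to"))
    .first()
    .headOption()
```

One caveat: MongoDB only documents linearizable guarantees for queries that uniquely identify a single document, so the sort-based read above is a best-effort approximation of the idea rather than a strict guarantee.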
Ah yes of course - good point. You could use exponential backoff if you had a request/reply command protocol on the outside, but not inside the PA. The advantage of this is that the (…)
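A minimal sketch of that idea, assuming a request/reply protocol where the target actor replies on success; the helper name and its parameters are mine, not part of the plugin:

```scala
import akka.actor.{ActorRef, ActorSystem}
import akka.pattern.{after, ask}
import akka.util.Timeout
import scala.concurrent.Future
import scala.concurrent.duration._

// Retry a command from outside the persistent actor, doubling the delay
// between attempts (exponential backoff). The command must be idempotent
// or deduplicated, since the first persist may in fact have succeeded
// before the timeout fired.
def sendWithBackoff(target: ActorRef, cmd: Any, retriesLeft: Int, delay: FiniteDuration)
                   (implicit system: ActorSystem, timeout: Timeout): Future[Any] = {
  import system.dispatcher
  target.ask(cmd).recoverWith {
    case _ if retriesLeft > 0 =>
      // wait, then retry with a doubled delay
      after(delay, system.scheduler)(sendWithBackoff(target, cmd, retriesLeft - 1, delay * 2))
  }
}
```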
Is your replica set flapping / doing master re-elections frequently? If not, is there a way to figure out why the database is so slow as to cause the circuit breaker to time out? Perhaps the logs, if slow logging is enabled. I know if the mongo working set is bigger than available RAM it bogs down pretty seriously.
We don't have re-elections frequently; I was just trying to find a solution where overwriting could be acceptable (and thus in all cases - even with master changes - if the first write was acknowledged it is not acceptable to overwrite). I'm still working on this, but it seems that a part of our persist failures is due to other causes (probably a problem with akka-cluster-sharding that ends up with 2 PAs with the same persistence id at the same time...). I'm still investigating, but this part probably has no link with the plugin.
Ouch - ok, so a network partition split-brain. That would definitely cause constraint failures even by itself. Are you using a cluster downing library to take down isolated nodes? There's the lightbend subscription that provides one, and I think there are also a few OSS ones on github.
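For reference, a sketch of what enabling a downing provider looks like. This uses the split brain resolver that ships with Akka 2.6.6 and later (it was a Lightbend-commercial add-on when this thread was written); the strategy choice is illustrative:

```scala
import com.typesafe.config.ConfigFactory

// Downing-provider configuration so isolated nodes get taken down
// instead of keeping duplicate persistent actors alive on both sides
// of a partition.
val downingConfig = ConfigFactory.parseString(
  """
  akka.cluster.downing-provider-class = "akka.cluster.sbr.SplitBrainResolverProvider"
  akka.cluster.split-brain-resolver.active-strategy = keep-majority
  """)
```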
We have our own implementation of a split brain resolver (implemented before an OSS one was available).
It's right that the (…)
Ok. Our main issue was a bad configuration of serializers that made journal events not be replayed (only snapshots) for internal akka cluster sharding data and led to complex situations... But the issue described initially here remains present and I will continue to work on it.
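For readers hitting the same thing, a minimal sketch of explicit serializer bindings, the kind of configuration whose absence can leave journal events unreadable on replay. `com.example.OrderEvent` and `OrderEventSerializer` are hypothetical names for illustration:

```scala
import com.typesafe.config.ConfigFactory

// Explicit serializer bindings: each persisted event type is mapped to a
// named serializer so replay can deserialize what was written.
val serializationConfig = ConfigFactory.parseString(
  """
  akka.actor {
    serializers {
      order-json = "com.example.OrderEventSerializer"
    }
    serialization-bindings {
      "com.example.OrderEvent" = order-json
    }
  }
  """)
```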
[Edited] The problem seems not to be linked to the scala driver.
I have "duplicate key error (E11000)" errors :
(built with openjdk8, running on openjdk11)
version 2.2.2
("com.github.scullxbones" %% "akka-persistence-mongo-scala" % "2.2.2")
I have not taken the time to investigate for now.
I guess the stack trace will not help, but just in case: