-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CASSANDRA-19744 Accord migration and interop correctness #3395
base: cep-15-accord
Are you sure you want to change the base?
Conversation
…ally split and reroute if necessary Mostly complete batch + hint support checkpoint here checkpoint again accord branch more checkpoint More checkpoint Checkpoint checkpoint checkpoint fixes Update .gitmodules, make CASMultiDCTest loopable
a4fcbd8
to
17b6f85
Compare
17b6f85
to
a4468ad
Compare
…failure, various doc and bug fixes:
7245fc8
to
96a7efd
Compare
@@ -122,7 +122,7 @@ public long preAcceptTimeout() | |||
@Override | |||
public Txn emptyTxn(Kind kind, Seekables<?, ?> seekables) | |||
{ | |||
return new Txn.InMemory(kind, seekables, TxnRead.EMPTY, TxnQuery.EMPTY, null); | |||
return new Txn.InMemory(kind, seekables, TxnRead.EMPTY, TxnQuery.UNSAFE_EMPTY, null); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could we rename this method to emptySystemTxn
or emptyUnsafeTxn
or something so people know that it doesn't offer all the normal guardrails
@@ -89,7 +89,7 @@ private CommitLogPosition addToCommitLog(Mutation mutation) | |||
if (update.metadata().params.memtable.factory().writesShouldSkipCommitLog()) | |||
ids.add(update.metadata().id); | |||
} | |||
mutation = mutation.without(ids); | |||
mutation = mutation.filter(ids::contains); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't seem right. If I'm reading this right then when we encounter mixed writesShouldSkipCommitLog
mutations we'll only write commit log entries for tables which aren't supposed to be written to the commit log?
* like Accord that can't safely read data that is written non-transactionally. | ||
* | ||
*/ | ||
default boolean allowPotentialTransactionConflicts() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a hard time parsing this comment
TODO: propose something easier to understand
@@ -140,7 +140,7 @@ public boolean isWorthMergingForRangeQuery(ReplicaCollection merged, ReplicaColl | |||
} | |||
Thread.sleep(1500); | |||
|
|||
Statement stmt = new SimpleStatement(String.format("SELECT DISTINCT token(id, id2), id, id2 FROM %s", table)); | |||
Statement stmt = new SimpleStatement(String.format("SELECT DISTINCT token(id, id2), id, id2 FROM %s WHERE token(id, id2) >= -9223372036854775808", table)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what's up with the token bit you added here?
} | ||
|
||
private int addMutations(int version, List<ByteBuffer> serializedMutations) throws IOException | ||
private int addMutations(List<Mutation> unsplitMutations, int version, List<ByteBuffer> serializedMutations) throws IOException |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this (and addMutation
) can be made static if you pass writtenAt
in
Mutation mutation = hint.mutation; | ||
// Also may need to apply locally because it's possible this is from the batchlog | ||
// and we never applied it locally | ||
// TODO (review): Additional error handling necessary? Hints are lossy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed, they're best effort
this.accordTxnStartNanos = accordTxnStartNanos; | ||
this.accordTxnResult = accordTxnResult; | ||
if (accordTxnResult != null) | ||
accordTxnResult.right.addListener(this, ImmediateExecutor.INSTANCE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't we register as a result/throwable consumer so we are notified if the txn fails unexpectedly?
@@ -112,6 +112,10 @@ public ConsistencyLevel commitCLForStrategy(ConsistencyLevel consistencyLevel) | |||
return consistencyLevel; | |||
} | |||
|
|||
// TODO (required): This won't work for migration directly from none to full because there is no safe system to read from |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What a migration from none
to full
effectively ran accord in mixed_reads
mode during migration?
@@ -294,8 +329,20 @@ public void onFailure(InetAddressAndPort from, RequestFailure failure) | |||
if (blockFor() + n > candidateReplicaCount()) | |||
signal(); | |||
|
|||
if (hintOnFailure != null && StorageProxy.shouldHint(replicaPlan.lookup(from))) | |||
StorageProxy.submitHint(hintOnFailure.get(), replicaPlan.lookup(from), null); | |||
// If the failure was RETRY_ON_DIFFERENT_TRANSACTION_SYSTEM then we only want to hint once |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wouldn't we want to retry with accord instead of writing a hint?
throw new RetryOnDifferentSystemException(); | ||
} | ||
|
||
public static boolean tokenShouldBeWrittenThroughAccord(@Nonnull ClusterMetadata cm, @Nonnull TableMetadata tm, @Nonnull Token token) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should be passing in the TableId
and using it to retrieve the table metadata from the provided ClusterMetadata
. Passing in table metadata directly like this makes it possible to read table params from a different epoch than the cm passed in, which I think is possible in invocation in BlockingReadRepair
CASSANDRA-19744
There are several different fixes/changes here.
cep-15-accord
specific breakage of handlingCoordinatorBehind
responsesCoordinatorBehind
errors no longer fail operations, the coordinator will retryTODO CoordinatorBehind retry could retry operations that are not idempotent, but may have already succeededTODO Check that barriers of keys check the key was actually barrieredTODO The batchlog failed routing test doesn't actually experience failed routing it's also a timeout, but it should be possible to trigger a failed routing by doing a transaction on the key and then allowing the out of sync coordinator to catch up and be forced to execute the transaction in the new epoch.