Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

issue in 'writeMessages' which doesn't update high sequence value in certain cases #98

Closed
rajeshkalaria80 opened this issue Apr 9, 2021 · 5 comments
Milestone

Comments

@rajeshkalaria80
Copy link

rajeshkalaria80 commented Apr 9, 2021

Versions used

akka-persistence-dynamodb: 1.1.1
Akka version: 2.6.10

Expected Behavior

When using 'persistAll' or 'persistAllAsync' with more than one events, If lowestSequenceNr is 100, 200, 300 etc, then it should update the high sequence record. It shouldn't allow overwriting of events too (akka/akka-persistence-dynamodb#64)

Actual Behavior

https://github.com/akka/akka-persistence-dynamodb#higher-throughput-structure

When writing an item we typically do not touch the high sequence number storage, only when writing an item with sort key 0 is this done.
The dynamodb writeMessages logic seems to have a bug:

If AtomicWrite payload contains only one event
    If (sequenceNr % 100 == 0)
        seq = sequenceNr / 100
        SH-$persistenceId-${seq%SequenceShards}
        the high seqValue written is = seq * 100
Else (more than one event)
    val low = atomicWrite.lowestSequenceNr
    val high = atomicWrite.highestSequenceNr
    if (low / 100 != high / 100) then only it will write/touch the high sequence number
        seq = highestSequenceNr / 100
        SH-$persistenceId-${seq%SequenceShards}
        the high seqValue written is = seq * 100

The issue

  • Because of this above code block, if actor starts/recovers with last sequence number 99, 199, 299 etc, and then it writes more than one events via 'persistAll' or 'persitsAllAsync' (such that lowestSequenceNr would be 100, 200 or 300 etc and highest sequence number would be greater than lowestSequenceNr say 101, 201 or 301 etc), it doesn't/won't update the high sequence record
  • and then next time when it starts, it doesn't recover all events which were persisted and because of that its lastSequenceNr value is less than what it should have been
  • and then when actor tries to persists any event, it overwrites events and ends up with 'invalid event replayed' warning message and so the event stream is corrupted and actor doesn't behave as expected.
@coreyoconnor
Copy link
Member

Thanks for the clear report! Definitely concerning.

@rajeshkalaria80 rajeshkalaria80 changed the title issue in ' writeMessages' which doesn't update high sequence value in certain cases issue in 'writeMessages' which doesn't update high sequence value in certain cases Apr 27, 2021
@spangaer
Copy link
Contributor

spangaer commented Oct 13, 2021

Ok.

This branch introduces a test that fails on given problem (after some modifications to make things run in VSCode)
https://github.com/spangaer/akka-persistence-dynamodb/tree/v1.1.x_issue-98_test

This branch does the minimal fix, which prevents the problem from occurring. So addresses the root cause.
https://github.com/spangaer/akka-persistence-dynamodb/tree/v1.1.x_issue-98_minimal-fix

This branch makes a fix that can recover event sources that suffer from this problem but have not been corrupted by overwrites. (overwrites we can't fix). It does not contain the first fix because that allows showing that indeed fixes "the problem" too, so successfully recovers from the error, even though that still occurs (needs to be explicitly enabled by a config flag BTW)
https://github.com/spangaer/akka-persistence-dynamodb/tree/v1.1.x_issue-98_tail-chasing-fix

This branch contains all of the above. So adds both root cause and recovery fix.
https://github.com/spangaer/akka-persistence-dynamodb/tree/v1.1.x_issue-98

@johanandren johanandren added this to the 1.1.2 milestone Jun 7, 2022
@johanandren
Copy link
Member

I think this was fixed #100 and now part of the 1.1.2 release

@spangaer
Copy link
Contributor

spangaer commented Jun 7, 2022

It seems to be, but shouldn't that version have been released as 1.2.0?

@johanandren
Copy link
Member

My bad, I looked at the changes and thought a patch release was good and minor didn't need a bump, but I noticed the 1.2.0 milestone after the release was already done.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants