[Bug Report]: BatchProduceAsync seems to produce items out of order #594

Open
1 task done
AlexeyRaga opened this issue Sep 10, 2024 · 2 comments
Labels
bug Something isn't working

Comments

AlexeyRaga commented Sep 10, 2024

Prerequisites

  • I have searched issues to ensure it has not already been reported

Description

It looks like BatchProduceAsync can produce messages out of order.
This happens intermittently, and the resulting order is not deterministic.
Most of the time the messages are produced in the right order, but not always.

Steps to reproduce

I was using this code to reproduce:

private sealed record Batch(Guid Id, List<ISpecificRecord> Messages);

private static Batch CreateBatch()
{
    var id = Guid.NewGuid();
    var messages = new List<ISpecificRecord>
    {
        new UserJoined { id = id, order = 1 },
        new UserDeactivated { id = id, order = 2 },
        new UserActivated { id = id, order = 3 }
    };

    return new(id, messages);
}

[Fact]
public async Task Should_preserve_order_of_messages()
{
    var batchesToSend = Enumerable.Range(0, 10).Select(_ => CreateBatch()).ToList();

    foreach (var batch in batchesToSend)
    {
        var toProduce =
            batch.Messages
                .Select(x => new BatchProduceItem(
                    fixture.TopicName,
                    batch.Id.ToString(),
                    x,
                    new MessageHeaders()))
                .ToList();

        await fixture.Producer.BatchProduceAsync(toProduce);
    }

    // assertion is omitted for brevity 
}

I then look at the topic itself and see that some batches come out of order.

Expected behavior

I expect the batch producer to guarantee message order at all times.

Actual behavior

Most often the messages arrive in the right order (1, 2, 3), but sometimes they are written in the wrong order.

KafkaFlow version

v3.0.10

@AlexeyRaga AlexeyRaga added the bug Something isn't working label Sep 10, 2024

AlexeyRaga commented Sep 10, 2024

Could the problem be in how middlewares are handled?

I see that BatchProduceAsync calls Produce synchronously, once per message.

But internally Produce calls _middlewareExecutor.Execute, which returns a Task. This Task is never awaited.
This could mean that multiple Tasks run concurrently, each calling the "real" producer deep inside, and can finish out of order by chance.

If I am right, then this execution model is unsound.


massada commented Sep 21, 2024

That is exactly why in-order producing is not guaranteed. To guarantee it, the code would need to wait until InternalProduce is called (this method pushes the message to the driver buffer).

It should be very easy to reproduce this issue by adding a producing middleware that delays each message by a descending amount and observing that the messages are produced in inverse order.

For example, 3 messages with descending delays of 1s, 500ms, 0s.
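
The experiment described above can be approximated without a broker. The following is a minimal sketch, not KafkaFlow code: ProduceThroughMiddleware, FireAndForget, Sequential and the in-memory buffer are hypothetical stand-ins for a delaying middleware, the un-awaited execution suspected in this thread, an awaited alternative, and the driver buffer respectively.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class OrderingDemo
{
    // Hypothetical stand-in for a producer middleware followed by the push
    // into the driver buffer: wait for `delay`, then append the message.
    private static async Task ProduceThroughMiddleware(int order, TimeSpan delay, List<int> buffer)
    {
        await Task.Delay(delay);
        lock (buffer) { buffer.Add(order); }
    }

    // Starts every pipeline without waiting for the previous one to reach
    // the buffer, which is the behaviour suspected in this issue.
    public static async Task<List<int>> FireAndForget((int Order, TimeSpan Delay)[] items)
    {
        var buffer = new List<int>();
        var pending = items
            .Select(i => ProduceThroughMiddleware(i.Order, i.Delay, buffer))
            .ToList();
        // Awaited only so the demo can inspect the final buffer contents.
        await Task.WhenAll(pending);
        return buffer;
    }

    // Waits for each message to reach the buffer before starting the next.
    public static async Task<List<int>> Sequential((int Order, TimeSpan Delay)[] items)
    {
        var buffer = new List<int>();
        foreach (var i in items)
        {
            await ProduceThroughMiddleware(i.Order, i.Delay, buffer);
        }
        return buffer;
    }

    public static async Task Main()
    {
        // Descending delays of 1s, 500ms, 0s, as suggested above.
        var items = new (int Order, TimeSpan Delay)[]
        {
            (1, TimeSpan.FromMilliseconds(1000)),
            (2, TimeSpan.FromMilliseconds(500)),
            (3, TimeSpan.Zero),
        };

        Console.WriteLine(string.Join(", ", await FireAndForget(items)));
        Console.WriteLine(string.Join(", ", await Sequential(items)));
    }
}
```

With these delays the first line should read 3, 2, 1 and the second 1, 2, 3. The sequential variant keeps the order at the cost of throughput, since each message waits for the previous one to pass through the middleware chain.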
