A testing 🧰 library for Kafka-based applications

rcardin/kafkaesque

🐛 Kafkaesque

Kafkaesque is a test library that aims to make testing Kafka applications less painful. The project is still at an early stage: we are defining the API that we will implement in the near future.

Any help is welcome :)

The library supports testing the following use cases:

Use Case 1: The Application Produces Some Messages on a Topic

The first use case tests the messages produced by an application by reading them from the topic. The code that produces the messages is external to Kafkaesque. Kafkaesque lets you assert properties of the messages the application generated.

Kafkaesque
  .at("broker:port")
  .<Key, Value>consume()
  .fromTopic("topic-name")
  .withDeserializers(keyDeserializer, valueDeserializer)
  .waitingAtMost(10, SECONDS)
  .waitingEmptyPolls(2, 50L, MILLISECONDS)
  .expectingConsumed()
  .havingRecordsSize(3) // <-- from here we use a ConsumedResult
  .havingHeaders(headers -> {
    // Assertions on headers
  })
  .havingKeys(keys -> {
    // Assertions on keys
  })
  .havingPayloads(payloads -> {
    // Assertions on payloads
  })
  .havingConsumerRecords(records -> {
    // Assertions on the full list of ConsumerRecord<Key, Value>
  })
  .assertingThatPayloads(contains("42")) // Uses Hamcrest.Matchers on collections :)
  .andCloseConsumer();
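
The `waitingEmptyPolls(2, 50L, MILLISECONDS)` step suggests that consumption stops once a given number of consecutive polls return no records. The following is a minimal plain-Java sketch of that idea, not Kafkaesque's actual implementation: the `EmptyPollSketch` class, the `Supplier`-based poll source, and the exact stop condition are assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical helper, not part of Kafkaesque.
final class EmptyPollSketch {

  // Collects records from `poll` until `maxEmptyPolls` consecutive polls
  // return an empty batch, pausing `pauseMillis` after each empty poll.
  static <T> List<T> drain(Supplier<List<T>> poll, int maxEmptyPolls, long pauseMillis) {
    List<T> collected = new ArrayList<>();
    int emptyPolls = 0;
    while (emptyPolls < maxEmptyPolls) {
      List<T> batch = poll.get();
      if (!batch.isEmpty()) {
        collected.addAll(batch);
        emptyPolls = 0; // a non-empty poll resets the counter
        continue;
      }
      emptyPolls++;
      try {
        Thread.sleep(pauseMillis); // back off before polling again
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        break; // stop early if the waiting thread is interrupted
      }
    }
    return collected;
  }

  public static void main(String[] args) {
    // Simulated topic: two batches separated by a single empty poll.
    Deque<List<String>> batches = new ArrayDeque<>();
    batches.add(List.of("a", "b"));
    batches.add(List.<String>of());
    batches.add(List.of("c"));
    Supplier<List<String>> poll =
        () -> batches.isEmpty() ? List.<String>of() : batches.poll();
    System.out.println(drain(poll, 2, 50L)); // prints [a, b, c]
  }
}
```

With a threshold of two, the single empty poll between the batches does not end the read. Only two empty polls in a row do.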

Use Case 2: The Application Consumes Some Messages from a Topic

The second use case tests an application that reads messages from a topic. Kafkaesque is responsible for producing those messages, which triggers the execution of the application. It is also possible to assert conditions on the system state after the messages have been consumed.

Kafkaesque
  .at("broker:port")
  .<Key, Value>produce()
  .toTopic("topic-name")
  .withSerializers(keySerializer, valueSerializer)
  .messages( /* Some list of messages, optionally with headers */)
  .waitingAtMostForEachAck(100, MILLISECONDS) // Waiting time for each ack from the broker
  .waitingForTheConsumerAtMost(10, SECONDS) // Waiting time for the consumer to read one / all the messages
  .andAfterAll()
  .asserting(messages -> {
    // Assertions on the consumer process after the sending of all the messages
  });

An equivalent method pipeline lets you run assertions after each message is consumed:

Kafkaesque
  .at("broker:port")
  .<Key, Value>produce()
  .toTopic("topic-name")
  .withSerializers(keySerializer, valueSerializer)
  .messages( /* Some list of messages, optionally with headers */)
  .waitingAtMostForEachAck(100, MILLISECONDS) // Waiting time for each ack from the broker
  .waitingForTheConsumerAtMost(10, SECONDS) // Waiting time for the consumer to read one / all the messages
  .andAfterEach()
  .asserting(message -> {
    // Assertions on the consumer process after the sending of each message
  });

Use Case 3: Synchronize on Produced or Consumed Messages and Test Them Outside Kafkaesque

The kafka-streams-test-utils testing library offers developers some useful and powerful abstractions. In particular, TestInputTopic and TestOutputTopic let developers manage asynchronous communication with a broker as if it were fully synchronous. In this case, the library does not start any broker, not even an embedded one.

Kafkaesque offers developers the same abstractions, trying to achieve the same synchronous behavior, through the yolo.Kfksq class.

var kfksq = Kfksq.at("broker:port");
var inputTopic = kfksq.createInputTopic("inputTopic", keySerializer, valueSerializer);
inputTopic.pipeInput("key", "value");

var outputTopic = kfksq.createOutputTopic("outputTopic", keyDeserializer, valueDeserializer);
var records = outputTopic.readRecordsToList();

Modules

Core module

The Kafkaesque library contains many submodules. The kafkaesque-core module contains the interfaces and the agnostic concrete classes that offer the fluent API shown above. Add the following dependency to your pom.xml file to use the module:

<dependency>
  <groupId>in.rcard</groupId>
  <artifactId>kafkaesque-core</artifactId>
  <version>0.2.0</version>
  <scope>test</scope>
</dependency>

In detail, the kafkaesque-core module uses the Awaitility Java library to deal with the asynchronous nature of each of the above use cases.
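
To illustrate the kind of problem Awaitility addresses, here is a minimal plain-Java sketch of waiting for an asynchronous condition with a timeout. It shows neither how Awaitility works internally nor how Kafkaesque uses it. The `AwaitSketch` class and the `awaitUntil` method are hypothetical names introduced only for this example.

```java
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of Kafkaesque or Awaitility.
final class AwaitSketch {

  // Polls `condition` every `pollInterval` until it holds or `timeout` elapses.
  // Returns true if the condition was met in time, false otherwise.
  static boolean awaitUntil(BooleanSupplier condition, Duration timeout, Duration pollInterval) {
    long deadline = System.nanoTime() + timeout.toNanos();
    while (true) {
      if (condition.getAsBoolean()) {
        return true;
      }
      if (System.nanoTime() - deadline >= 0) {
        return false; // timed out
      }
      try {
        Thread.sleep(pollInterval.toMillis());
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return false; // treat interruption as a failed wait
      }
    }
  }

  public static void main(String[] args) {
    AtomicInteger received = new AtomicInteger();
    // Simulates an application consuming three messages on another thread.
    CompletableFuture.runAsync(() -> {
      for (int i = 0; i < 3; i++) {
        received.incrementAndGet();
      }
    });
    boolean allConsumed =
        awaitUntil(() -> received.get() == 3, Duration.ofSeconds(2), Duration.ofMillis(10));
    System.out.println(allConsumed);
  }
}
```

A test written this way fails after a bounded time instead of hanging, which is the behavior that methods such as `waitingAtMost` express in the fluent API.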

Configuration

Kafkaesque also supports configuring its internal producers and consumers through an external configuration file. Kafkaesque can read multiple file formats: HOCON, JSON, and Java properties.

Consumer settings must be prefixed with kafkaesque.consumer. The available settings are:

  • group-id
  • auto-offset-reset
  • enable-auto-commit
  • auto-commit-interval
  • client-id
  • fetch-max-wait
  • fetch-min-size
  • isolation-level
  • max-poll-records

Producer settings must be prefixed with kafkaesque.producer instead. The available settings are:

  • client-id
  • retries
  • acks
  • batch-size
  • buffer-memory
  • compression-type

You can pass the path to the file using the withConfiguration method, which is available for both consumers and producers. Here is an example:

Kafkaesque
  .at("broker:port")
  .<Key, Value>produce()
  .toTopic("topic-name")
  .withSerializers(keySerializer, valueSerializer)
  .withConfiguration("path-to-the-file.conf")
  .messages( /* Some list of messages */)
  .waitingAtMostForEachAck(100, MILLISECONDS) // Waiting time for each ack from the broker
  .waitingForTheConsumerAtMost(10, SECONDS) // Waiting time for the consumer to read one / all the messages
  .andAfterAll()
  .asserting(messages -> {
    // Assertions on the consumer process after the sending of all the messages
  });

The path is relative to the /src/test/resources folder.

An example configuration file could look like the following. The file contains the settings for both producers and consumers:

kafkaesque {
  consumer {
    group-id: "kfksq-test-consumer"
    client-id: "kfksq-client-id"
    auto-commit-interval: 5000
    auto-offset-reset: "earliest"
    enable-auto-commit: false
    fetch-max-wait: 500
    fetch-min-size: 1
    heartbeat-interval: 3000
    isolation-level: "read_uncommitted"
    max-poll-records: 500
  }

  producer {
    acks: "all"
    batch-size: 16384
    buffer-memory: 33554432
    client-id: "kfksq-client-id"
    compression-type: "none"
    retries: 2147483647
  }
}
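
Since the Java properties format is also supported, the same settings can presumably be written as flat, dotted keys. The sketch below assumes that each nested HOCON path maps to a dotted key such as `kafkaesque.consumer.group-id`. That mapping, the `PropertiesConfigSketch` class, and its `load` method are assumptions for illustration, not taken from the Kafkaesque sources. It only uses `java.util.Properties` to show what such a file would contain.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical example, not part of Kafkaesque.
final class PropertiesConfigSketch {

  // Assumed flattening of part of the HOCON example into dotted keys.
  static final String EXAMPLE = String.join("\n",
      "kafkaesque.consumer.group-id=kfksq-test-consumer",
      "kafkaesque.consumer.auto-offset-reset=earliest",
      "kafkaesque.consumer.enable-auto-commit=false",
      "kafkaesque.producer.acks=all",
      "kafkaesque.producer.retries=2147483647");

  // Parses properties-formatted text into a Properties object.
  static Properties load(String text) {
    Properties props = new Properties();
    try {
      props.load(new StringReader(text));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return props;
  }

  public static void main(String[] args) {
    Properties props = load(EXAMPLE);
    System.out.println(props.getProperty("kafkaesque.consumer.group-id")); // prints kfksq-test-consumer
    System.out.println(props.getProperty("kafkaesque.producer.acks"));     // prints all
  }
}
```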