- fix: avoid partition revocation on paused consumers (#391) by @lynnagara
- ref(batching): add compute_batch_size to BatchStep (#390) by @MeredithAnya
- ref(rust/run_task): Remove unnecessary boxing, make FnMut, change return type (#388) by @untitaker
- fix: Allow sending committable from Unfold (#371) by @mj0nez
- all-repos: update actions/upload-artifact to v4 (#381) by @joshuarli
- Remove non-existent @getsentry/processing from CODEOWNERS (#386) by @onkar
- chore: Fix release builds (#385) by @untitaker
- Add a basic metric for tracking the capacity in VecDeque buffer (#383) by @ayirr7
- feat: enhance metrics defs (#378) by @mj0nez
- feat: Add From<BrokerMessage<_>> impl for InvalidMessage (#377) by @evanpurkhiser
- feat: Add Noop processing strategy (#376) by @evanpurkhiser
- Update RunTask to receive a Message (#375) by @evanpurkhiser
- hotfix, fix master ci (66f1efc3) by @untitaker
- fix: add guard to Produce.poll to ensure next_step is called regardless of produce queue (#370) by @mj0nez
- ref: Add pre-commit hook for rustfmt (#364) by @untitaker
- update accumulator sig to return Result instead of TResult (#359) by @john-z-yang
- ref: Use coarsetime consistently (#366) by @untitaker
- ref(rust): Backpressure metrics for threadpools (#367) by @untitaker
- ref(reduce): Refactor for timeout=0 (#363) by @untitaker
- ref(rust): Remove strategy.close (#361) by @untitaker
- ref(rust): Add join-time metric for threadpools (#362) by @untitaker
- rust: add more rust logging (#351) by @dbanda
- fixes #353: return message.payload (#354) by @mwarkentin
- feat(header): Implement find method on headers (#350) by @nikhars
- feat: make default auto.offset.reset earliest (#349) by @lynnagara
- fix: Enable stats collection (#348) by @phacops
- ref(reduce): Allow to reduce to non-cloneable values (#346) by @untitaker
- build(deps): bump black from 22.3.0 to 24.3.0 (#343) by @dependabot
- feat: confluent-kafka-python 2.3.0 (#344) by @lynnagara
- meta: Update codeowners (#345) by @lynnagara
- ref: Metric definition (#341) by @lynnagara
- feat: Ingest a metric for rdkafka queue size (#342) by @phacops
- feat: Increase backpressure threshold to 5 seconds (#340) by @lynnagara
- ref(metrics): Add pause/resume counters [INC-626] (#338) by @untitaker
- perf: inline now calling `coarsetime::Instant` (#336) by @anonrig
- Add flexible Reduce strategy type (#333) by @cmanallen
- fix(release): Stop releasing to crates.io while it doesn't work (#335) by @untitaker
- feat(logs): Add partition info upon assignment and revocation (#334) by @ayirr7
- ref: Restore owners on rust-arroyo (#330) by @untitaker
- fix: Rename Rust arroyo metric since it has a different unit (#329) by @untitaker
- fix(rust): Add feature to Reduce to flush empty batches (#332) by @untitaker
- ref: Add release workflow for Rust (#327) by @untitaker
- ref(ci): Add Rust CI (#324) by @untitaker
- Revert "ref: Move rust-arroyo from snuba to arroyo" (#325) by @untitaker
- ref: Move rust-arroyo from snuba to arroyo (#323) by @untitaker
- move rust-arroyo to subdirectory (#323) by @untitaker
- fix(multiprocessing): Implement better error messages for block overflow (#322) by @untitaker
- rust: add rust concurrency metric (#5341) by @untitaker
- ref(rust): Don't panic in RunTaskInThreads::poll (#5387) by @untitaker
- deps(rust): Change rdkafka dep to upstream master (#5386) by @untitaker
- ref(metrics): Refactor how global tags work, and introduce min_partition tag (#5346) by @untitaker
- Add Metrics impl based on `merni` (#5351) by @untitaker
- fix: Remove any panics in threads (#5353) by @untitaker
- ref(devserver): Use Rust consumers almost everywhere, and fix commitlog implementation (#5311) by @untitaker
- Avoid calling `Topic::new` for every received Message (#5331) by @untitaker
- Reuse Tokio Handle in DlqPolicy (#5329) by @untitaker
- ref(rust): Log actual error if strategy panics (#5317) by @untitaker
- fix(rust): add testcase for empty batches (#5299) by @untitaker
- Revert "ref(rust): Do not do extra work when merging if not needed (#5294)" (#323) by @untitaker
Plus 146 more
- fix: Make `arroyo.consumer.latency` a timing metric (#316) by @lynnagara
- fix(dlq): Ensure consumer crashes if DLQ limit is reached (#314) by @lynnagara
- fix(multiprocessing): Reset pool if tasks are not completed (#315) by @lynnagara
- fix(filter): Handle MessageRejected correctly (INC-584) (#313) by @untitaker
- feat: Reusable multiprocessing pool (#311) by @lynnagara
- ref: Combine DlqLimitState methods (#312) by @loewenheim
- ref: Add more metrics for slow rebalancing (#310) by @untitaker
- fix(dlq): Actually respect the DLQ limits (#309) by @lynnagara
- feat: Enable automatic latency recording for consumers (#308) by @lynnagara
- fix: Default input block size should be larger (#306) by @untitaker
- fix: Attempt to fix InvalidStateError (#305) by @lynnagara
- ref: Rename backpressure metric (#303) by @lynnagara
- ref: Add extra logging for debugging commits (#302) by @lynnagara
- fix(dlq): RunTaskWithMultiprocessing supports forwarding downstream invalid message (#301) by @lynnagara
- fix: Cleanup backpressure state between assignments (#299) by @lynnagara
- fix: Temporarily bring back support for legacy commit log format (#298) by @lynnagara
- fix: Revert change when consumer is paused (#297) by @lynnagara
- perf: Avoid unnecessarily clearing the rdkafka buffer on backpressure (#296) by @lynnagara
- feat: Add optional received_p99 timestamp to commit log (#295) by @lynnagara
- doc: Describe max_batch_size/max_batch_time in our strategies (#294) by @untitaker
- ref: Timestamp is not optional on commit log (#293) by @lynnagara
- feat(metrics): Add a metric for number of invalid messages encountered (#292) by @lynnagara
- fix(dlq): Gracefully handle case of no valid messages (#291) by @nikhars
- ref: Use float in commit codec instead of complex datetime format (#290) by @lynnagara
- ref: A better commit log format (#289) by @lynnagara
- Revert "Add coverage report to CI (#286)" (#288) by @lynnagara
- feat: Remove pointless (and wrong) config check (#287) by @lynnagara
- Add coverage report to CI (#286) by @dbanda
- Make producer thread a daemon (#282) by @dbanda
- fix(dlq): Make InvalidMessage pickleable (#284) by @nikhars
- fix(produce): Apply backpressure instead of crashing (#281) by @lynnagara
- feat: Automatically resize blocks if they get too small (#270) by @untitaker
- fix: Ensure carried over message is in buffer (#283) by @lynnagara
- docs: Actually fix the getting started (#280) by @lynnagara
- feat: Add global shutdown handler for strategy factory (#278) by @untitaker
- fix(run_task_in_threads): Commit offsets even when consumer is idle (#276) by @untitaker
- add socket.timeout.ms to supported kafka configurations (#275) by @hubertdeng123
- release: 2.14.2 (5f8ee08a) by @getsentry-bot
- No documented changes.
- fix(reduce): Add missing call to next_step.terminate() (#272) by @untitaker
- metrics: Add metrics about dropped messages in FilterStep (#265) by @ayirr7
- fix: Logo dimensions on mobile (#268) by @untitaker
- doc: Update dlq doc (#267) by @lynnagara
- docs: Update goals (#266) by @lynnagara
- doc: Add new logo (#264) by @untitaker
- ref: Remove deprecated strategies (#263) by @lynnagara
- feat: Add liveness healthcheck (#262) by @untitaker
- feat: Bump confluent-kafka-python (#258) by @lynnagara
- fix: Commit offsets even when topic is empty (#259) by @untitaker
- feat: A better default retry policy (#256) by @lynnagara
- feat: Avoid spinning on MessageRejected (#253) by @lynnagara
- feat: Basic support for librdkafka stats (#252) by @untitaker
- add entries to list (#250) by @ayirr7
- ref(multiprocessing): Rename some metrics and add documentation (#241) by @untitaker
- feat(metrics): Allow reconfiguring the metrics backend (#249) by @lynnagara
- ref: Global metrics registry (#245) by @untitaker
- fix: Fix run task in threads (again) (#244) by @lynnagara
- fix(run-task-in-threads): Fix shutdown (#243) by @lynnagara
- fix: Mermaid diagrams in dark mode (#242) by @untitaker
- Revert "feat: add more item size stats to multiprocessing (and metrics refactor) (#235)" (#240) by @untitaker
- fix(multiprocessing): More performant backpressure [INC-378] (#238) by @untitaker
- fix(multiprocessing): Honor join() timeout even if processing pool is overloaded [INC-378] (#237) by @untitaker
- docs: Minor readability details (#236) by @kamilogorek
- feat: add more item size stats to multiprocessing (and metrics refactor) (#235) by @untitaker
- fix: Distinguish between backpressure and output overflow (#234) by @untitaker
- perf(dlq): Improve performance when there are many invalid messages (#233) by @lynnagara
- chore(processor): More logs for shutdown sequences (#230) by @nikhars
- fix: Remove flaky test (#232) by @untitaker
- fix: Do not log error when consumer is already crashing (#228) by @untitaker
- fix(run_task_with_multiprocessing): Do not attempt to skip over invalid messages (#231) by @untitaker
- chore(processor): Add more logging (#229) by @nikhars
- doc: Document how backpressure currently works (#227) by @untitaker
- ref: Remove extraneous codecs from arroyo (#219) by @untitaker
- docs: Organise processing strategies and add to TOC (#226) by @lynnagara
- ref: Remove legacy DLQ implementation (#225) by @lynnagara
- docs: Remove WIP note in DLQ docs (#224) by @lynnagara
- docs: 2023 (#223) by @lynnagara
- feat: Count how many times consumer spins (#222) by @untitaker
- fix: Add metrics around RunTaskWithMultiprocessing + docs [SNS-2204] (#220) by @untitaker
- ref: Clean up documentation (#221) by @untitaker
- fix: Fix RunTaskInThreads to handle InvalidMessage during close, add tests (#218) by @lynnagara
- fix(dlq): Fix crash during join when InvalidMessage in flight (#217) by @lynnagara
- feat: Measure time to join strategy [SNS-2154] (#214) by @untitaker
- feat: Add error instrumentation and timing to arroyo callbacks (#215) by @untitaker
- feat: Add consumer dlq time (#213) by @untitaker
- fix: Do not enter invalid state when strategy.submit raises InvalidMessage (#212) by @untitaker
- feat(dlq): A DLQ implementation that keeps a copy of raw messages in a buffer (#202) by @lynnagara
- fix: Fix installation of sphinx-autodoc-typehints (#211) by @untitaker
- Revert "fix: Prevent repeated calls to consumer.pause() if already paused (#209)" (#210) by @lynnagara
- feat: Codecs can encode and be used outside of strategies (#208) by @lynnagara
- fix: Prevent repeated calls to consumer.pause() if already paused (#209) by @lynnagara
- feat: Make DLQ buffer size configurable (#207) by @lynnagara
- ref: Remove the FileMessageStorage backend (#206) by @lynnagara
- feat: DLQ perf test (#205) by @lynnagara
- ref: Split run task, update strategies docs page (#204) by @lynnagara
- fix: RunTaskWithMultiprocessing support for filtered message (#203) by @lynnagara
- feat: Add unfold strategy (#197) by @lynnagara
- feat(dlq): Make dlq interface generic (#200) by @lynnagara
- feat(dlq): Introduce DLQ docs and interfaces (#195) by @lynnagara
- docs: Document how to use metrics (#198) by @lynnagara
- docs: Update get started section (#196) by @lynnagara
- meta: Update readme to remove some duplicate text (#199) by @lynnagara
- ci: Make tests go faster (#194) by @lynnagara
- docs: Update readme (#192) by @lynnagara
- small grammar improvements (#191) by @barkbarkimashark
- fix(run_task): Fix bug where join() would not actually wait for tasks (#190) by @untitaker
- fix(filter-strategy): Options to commit filtered messages, take 2 (#185) by @untitaker
- fix: Fix multiprocessing join (#189) by @lynnagara
- build: Run CI on Python 3.11 (#188) by @lynnagara
- feat: Configurable strategy join timeout (#187) by @lynnagara
- feat: Log exception if processing strategy exists on assignment (#186) by @lynnagara
- fix(decoder): Fix importing optional dependencies (#184) by @lynnagara
- No documented changes.
- fix(produce): Fix closing next step (#183) by @lynnagara
- fix(reduce): Ensure next step is properly closed (#182) by @lynnagara
- ref: Don't alias ProcessingStrategy as ProcessingStep (#181) by @lynnagara
- ref: upgrade isort to work around poetry breakage (#179) by @asottile-sentry
- fix: Documentation typos (#178) by @markstory
- build: Split avro, json, msgpack into separate modules (#176) by @lynnagara
- feat: Remove CollectStep and ParallelCollectStep (#173) by @lynnagara
- feat(reduce): Record `arroyo.strategies.reduce.batch_time` (#175) by @lynnagara
- docs: Add decoders section (#174) by @lynnagara
- build: Make json, msgpack and avro optional dependencies (#172) by @lynnagara
- feat(serializers): Add more decoders (#171) by @lynnagara
- feat(reduce): Support greater flexibility in initial value (#170) by @lynnagara
- auto publish to internal pypi on release (#169) by @asottile-sentry
- feat: Remove support for cooperative-sticky partition assignment strategy (#168) by @lynnagara
- ref(decoder): Remove dead code (#167) by @lynnagara
- ref(commit_policy): Improve performance of commit policy (#162) by @untitaker
- feat(decoders): Json decoder can be used without schema (#166) by @lynnagara
- feat: Add Reduce strategy (#157) by @lynnagara
- ref: Remove `Position` and stop passing timestamps to the commit function (#165) by @lynnagara
- feat: Add a schema validation strategy (#154) by @lynnagara
- fix: Ensure Collect/ParallelCollect properly calls next_step methods (#163) by @lynnagara
- feat: Remove the BatchProcessingStrategy and AbstractBatchWorker (#164) by @lynnagara
- test: Fix streamprocessor tests (#160) by @lynnagara
- docs: Call out the run task strategy (#161) by @lynnagara
- fix: RunTaskWithMultiprocessing handles MessageRejected from subsequent steps (#158) by @lynnagara
- ref: Improve typing of CommitOffsets strategy (#159) by @lynnagara
- fix(commit_policy): Calculate elapsed timerange correctly (#155) by @untitaker
- ref: Remove dead code (#153) by @lynnagara
- feat(Collect) Attempt to redesign the Collector step (#127) by @fpacifici
- feat: Collect / ParallelCollect supports a next step (#149) by @lynnagara
- feat: Add RunTaskWithMultiprocessing strategy (#152) by @lynnagara
- feat: Add replace() method to message (#151) by @lynnagara
- feat: Death to the consumer strategy factory (#150) by @lynnagara
- feat: Add RunTask strategy (#147) by @lynnagara
- feat: Split the Message interface to better support batching steps (#134) by @lynnagara
- feat: Remove BatchProcessingStrategyFactory from Arroyo (#138) by @lynnagara
- docs: Add docstring for MessageRejected (#146) by @lynnagara
- feat: Increase log level during consumer shutdown (#144) by @lynnagara
- feat: Remove ProduceAndCommit (#145) by @lynnagara
- feat: Introduce separate commit strategy (#140) by @lynnagara
- feat: Mark ConsumerStrategyFactory/KafkaConsumerStrategyFactory deprecated (#139) by @lynnagara
- docs: Update DLQ docs (#136) by @lynnagara
- fix: Compute offset deltas for commit policy [SNS-1863] (#135) by @untitaker
- fix: Make Position pickleable (#129) by @lynnagara
- feat: Avoid storing entire messages in RunTaskInThreads (#133) by @lynnagara
- feat: Avoid storing entire messages in the produce step (#132) by @lynnagara
- test: Split the collect and transform tests into separate files (#130) by @lynnagara
- test: Simplify the commit codec test (#131) by @lynnagara
- Update actions/upload-artifact to v3.1.1 (#128) by @mattgauntseo-sentry
- feat: Add stream processor metrics (#124) by @lynnagara
- docs: Remove comment about asyncio in architecture (#121) by @lynnagara
- docs: Remove the big pointless black square (#120) by @lynnagara
- docs: Remove invalid release number (#122) by @lynnagara
- feat: Make commit policy a required argument (#116) by @lynnagara
- docs(autodocs) Adds autodoc support and import docstrings in the sphinx doc (#118) by @fpacifici
- Add architecture page (#117) by @fpacifici
- feat: Add RunTaskInThreads strategy (#111) by @lynnagara
- feat: Add produce step to library (#109) by @lynnagara
- ref: Rename variable (#115) by @lynnagara
- ref: Move strategies out of streaming module (#114) by @lynnagara
- feat: Add Message.position_to_commit (#113) by @lynnagara
- feat: Simplify example (#112) by @lynnagara
- feat: Add deprecation comment on batching strategies (#110) by @lynnagara
- build: Run CI on Python 3.10 (#108) by @lynnagara
- build: Include requirements.txt in source distribution (#107) by @lynnagara
- feat: Introduce commit policies (#106) by @lynnagara
- Add parallel collect timeout (#104) by @fpacifici
- move 'what is arroyo for' docs to public docs (#103) by @volokluev
- Link the official documentation to the readme (#102) by @fpacifici
- fix(docs) Fix makefile (#101) by @fpacifici
- Fix branch name in docs generation (#100) by @fpacifici
- feat(docs) Add a `getting started` page to the docs (#99) by @fpacifici
- fix: Use correct log level for fatal crashes [sns-1622] (#97) by @untitaker
- adjust confluent-kafka version to >= (#96) by @asottile-sentry
- upgrade confluent-kafka to 1.9.0 (#95) by @asottile-sentry
- Configure CodeQL (#94) by @mdtro
- deps: Update confluent-kafka-python to 1.8.2 (#92) by @lynnagara
- doc: Remove reference to synchronized consumer in readme (#93) by @lynnagara
- Synchronized consumer deprecated (#81)
- create deprecated in favor of create_with_partitions (#91)
- fix(dlq): InvalidMessages Exception repr (#88) by @rahul-kumar-saini
- feat: Add docker-compose orchestrated example script (#85) by @cmanallen
- fix(dlq): Join method did not handle InvalidMessages (#83) by @rahul-kumar-saini
- always use create_with_partitions (#82) by @MeredithAnya
- feat(processing): Pass through partitions using create_with_partitions (#75) by @MeredithAnya
- ref: Upgrade Kafka and Zookeeper in CI (#80) by @lynnagara
- ref: don't depend on mypy, upgrade and fix mypy (#79) by @asottile-sentry
- docs: Fix docstring (#77) by @lynnagara
- feat: Prevent passing invalid next_offset (#73) by @lynnagara
- fix: Fix offset committed in example (#76) by @lynnagara
- refactor(dlq): Renamed policy "closure" to policy "creator" (#71) by @rahul-kumar-saini
- fix(dlq): Policy closure added + Producer closed (#68) by @rahul-kumar-saini
- fix(dlq): Updated DLQ logic for ParallelTransformStep (#61) by @rahul-kumar-saini
- fix(tests): Make tests pass on M1 Macs (#67) by @mcannizz
- test: Fix flaky test (#66) by @lynnagara
- feat: Add flag to restore confluent kafka auto.offset.reset behavior (#54) by @mitsuhiko
- feat(dlq): Added produce policy (#57) by @rahul-kumar-saini
- feat(dlq): Revamped Invalid Message(s) Model (#64) by @rahul-kumar-saini
- feat: Avoid unnecessarily recreating processing strategy (#62) by @lynnagara
- feat: Run CI on multiple Python versions (#63) by @lynnagara
- feat: Support incremental assignments in stream processor (#58) by @lynnagara
- feat(dlq): InvalidMessage exception refactored to handle multiple invalid messages (#50) by @rahul-kumar-saini
- test: Fix flaky test (#59) by @lynnagara
- feat(consumer): Wrap consumer strategy with DLQ if it exists (#56) by @rahul-kumar-saini
- feat: Bump confluent-kafka-python to 1.7.0 (#55) by @lynnagara
- feat(consumer): Support incremental cooperative rebalancing (#53) by @lynnagara
- feat: Bump confluent kafka to 1.6.1 (#51) by @lynnagara
- ci: Upgrade black version to 22.3.0 (#52) by @lynnagara
- Removed Generic payload from DLQ Policy (#49) by @rahul-kumar-saini
- export InvalidMessage from DLQ (#48) by @rahul-kumar-saini
- Dead Letter Queue (#47) by @rahul-kumar-saini
- Added an example for Arroyo usage. (#44) by @rahul-kumar-saini
- ref(metrics): Add metrics for time spent polling and closing batch (#46) by @nikhars
- chore(parallel_collect): Allow importing ParallelCollectStep (#43) by @nikhars
- perf(collect): Add ParallelCollect step (#41) by @nikhars
- Increase log level (#39) by @fpacifici
- feat(perf) Add latency metrics to the messages coming from the commit log (#38) by @fpacifici
- Added a metric for the number of processes in the multiprocessing pool. Its key is `transform.processes`.
- Handle missing `orig_message_ts` header. Since all events in the pipeline produced using an older version of arroyo may not have the header yet, temporarily support a None value for `orig_message_ts`.
- Replaces Offset in consumer and processing strategy with Position, which contains both offset and timestamp information. `stage_offsets` is now `stage_positions` and `commit_offsets` is now `commit_positions`, and now includes the timestamp.
- Add orig_message_ts field to Commit and commit_codec. This field is included in the Kafka payload as a header.
- Add optional initializer function to parallel transform step. Supports passing a custom function to be run on multiprocessing pool initialization.
- First release 🎉
This project follows semver, with three additions:
- Semver says that major version `0` can include breaking changes at any time. Still, it is common practice to assume that only `0.x` releases (minor versions) can contain breaking changes while `0.x.y` releases (patch versions) are used for backwards-compatible changes (bugfixes and features). This project also follows that practice.
- All undocumented APIs are considered internal. They are not part of this contract.
- Certain features may be explicitly called out as "experimental" or "unstable" in the documentation. They come with their own versioning policy described in the documentation.
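
As a rough illustration of the first rule, the check below sketches how an upgrade between two plain `major.minor.patch` version strings could be classified. The function names are hypothetical and not part of this project; the sketch assumes versions without pre-release or build suffixes and covers documented, non-experimental APIs only.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    # Assumes a plain "major.minor.patch" string with no suffixes.
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def may_contain_breaking_changes(current: str, target: str) -> bool:
    """Return True if moving from `current` to `target` may break documented APIs."""
    cur_major, cur_minor, _ = parse_version(current)
    new_major, new_minor, _ = parse_version(target)

    # A change of major version may always be breaking.
    if cur_major != new_major:
        return True

    # Within major version 0, a minor bump (0.x) may be breaking,
    # while a patch bump (0.x.y) is treated as backwards-compatible.
    if cur_major == 0:
        return cur_minor != new_minor

    # Within a stable major version, minor and patch bumps are compatible.
    return False
```

For example, under these assumptions `may_contain_breaking_changes("0.1.0", "0.1.5")` is `False`, while `may_contain_breaking_changes("0.1.0", "0.2.0")` is `True`.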