Releases: thanos-io/thanos

v0.37.0

25 Nov 12:03
v0.37.0
889d527

v0.37.0 is now out!

We have some really interesting features this time around, with several improvements across components, a new replication protocol for Receivers, and even fixes for Prometheus v3! Do take a look at some of the breaking changes below!

Thank you to all contributors who have contributed to this release. It wouldn't be possible without you! 💜

Please try it out and let us know if you find any issues! 🚀

Changelog

Fixed

  • #7511 Query Frontend: fix doubled gzip compression for response body.
  • #7592 Ruler: Only increment thanos_rule_evaluation_with_warnings_total metric for non PromQL warnings.
  • #7614 *: fix debug log formatting.
  • #7492 Compactor: update filtered blocks list before second downsample pass.
  • #7658 Store: Fix panic caused by a too-small buffer in the pool.
  • #7643 Receive: fix thanos_receive_write_{timeseries,samples} stats
  • #7644 fix(ui): add null check to find overlapping blocks logic
  • #7814 Store: label_values: if matchers contain __name__=="something", do not add != "" to fetch fewer postings.
  • #7679 Query: respect store.limit.* flags when evaluating queries
  • #7821 Query/Receive: Fix goroutine leak introduced in #7796.
  • #7843 Query Frontend: fix slow query logging for non-query endpoints.
  • #7852 Query Frontend: pass "stats" parameter forward to queriers and fix Prometheus stats merging.
  • #7832 Query Frontend: Fix cache keys for dynamic split intervals.
  • #7885 Store: Return chunks to the pool after completing a Series call.
  • #7893 Sidecar: Fix retrieval of external labels for Prometheus v3.0.0.
  • #7903 Query: Fix panic on regex store matchers.
  • #7915 Store: Close block series client at the end to not reuse chunk buffer

Added

  • #7763 Ruler: use native histograms for client latency metrics.
  • #7609 API: Add limit param to metadata APIs (series, label names, label values).
  • #7429 Reloader: introduce TolerateEnvVarExpansionErrors to allow suppressing errors when expanding environment variables in the configuration file. When set, the reloader does not treat the operation as failed when it encounters an unset environment variable. Note that all unset environment variables are left as is, whereas all set environment variables are expanded as usual.
  • #7560 Query: Added the possibility of filtering rules by rule_name, rule_group or file to the HTTP API.
  • #7652 Store: Implement metadata API limit in stores.
  • #7659 Receive: Add support for replication using Cap'n Proto. This protocol has a lower CPU and memory footprint, which leads to a reduction in resource usage in Receivers. Before enabling it, make sure that all receivers are updated to a version which supports this replication method.
  • #7853 UI: Add support for selecting graph time range with mouse drag.
  • #7855 Compact/Query: Add support for comma-separated replica labels.
  • #7654 *: Add '--grpc-server-tls-min-version' flag to let users specify the minimum TLS version; otherwise it defaults to TLS 1.3.
  • #7854 Query Frontend: Add --query-frontend.force-query-stats flag to force collection of query statistics from upstream queriers.
  • #7860 Store: Support hedged requests
  • #7924 *: Upgrade promql-engine to v0.0.0-20241106100125-097e6e9f425a and objstore to v0.0.0-20241111205755-d1dd89d41f97
  • #7835 Ruler: Add ability to do concurrent rule evaluations
  • #7722 Query: Add partition labels flag to partition leaf querier in distributed mode
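
The TolerateEnvVarExpansionErrors behaviour described in #7429 can be illustrated with a small sketch. This is not the reloader's code: the `expandEnv` helper, its `lookup` parameter, and the `$(NAME)` placeholder syntax are assumptions made for this example. It only shows the rule stated above: set variables are expanded, unset ones are left as is, and an unset variable is an error only when tolerance is off.

```go
package main

import (
	"fmt"
	"os"
	"regexp"
)

// envRef matches placeholders of the form $(NAME).
// The placeholder syntax is an assumption made for this sketch.
var envRef = regexp.MustCompile(`\$\(([A-Za-z_][A-Za-z0-9_]*)\)`)

// expandEnv is a hypothetical helper, not the reloader's actual implementation.
// Variables that lookup reports as set are substituted. Unset variables are
// left untouched; they produce an error only when tolerate is false.
func expandEnv(config string, lookup func(string) (string, bool), tolerate bool) (string, error) {
	var missing []string
	out := envRef.ReplaceAllStringFunc(config, func(ref string) string {
		name := envRef.FindStringSubmatch(ref)[1]
		value, ok := lookup(name)
		if !ok {
			missing = append(missing, name)
			return ref // leave the placeholder as is
		}
		return value
	})
	if len(missing) > 0 && !tolerate {
		return "", fmt.Errorf("unset environment variables: %v", missing)
	}
	return out, nil
}

func main() {
	os.Setenv("CLUSTER", "eu-1")
	os.Unsetenv("REPLICA")

	// With tolerance on, the unset REPLICA placeholder survives unchanged.
	out, err := expandEnv("cluster: $(CLUSTER), replica: $(REPLICA)", os.LookupEnv, true)
	fmt.Println(out, err)

	// With tolerance off, the same input is reported as a failure.
	_, err = expandEnv("cluster: $(CLUSTER), replica: $(REPLICA)", os.LookupEnv, false)
	fmt.Println(err)
}
```

Passing the lookup function in, rather than calling os.LookupEnv directly, keeps the helper easy to exercise without touching the process environment.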

Changed

  • #7494 Ruler: remove trailing period from SRV records returned by discovery dnssrvnoa lookups
  • #7567 Query: Use thanos resolver for endpoint groups.
  • #7741 Deps: Bump Objstore to v0.0.0-20240913074259-63feed0da069
  • #7813 Receive: enable initial TSDB compaction time randomization
  • #7820 Sidecar: Use prometheus metrics for min timestamp
  • #7886 Discovery: Preserve results from other resolve calls
  • #7669 Receive: Change quorum calculation for rf=2

Removed

  • #7704 *: breaking ⚠️ remove Store gRPC Info function. It has been deprecated for 3 years; it's time to remove it.
  • #7793 Receive: Disable dedup proxy in multi-tsdb
  • #7678 Query: Skip formatting strings if debug logging is disabled

Full Commit History: v0.35.1...v0.37.0-rc.0

v0.37.0-rc.0

19 Nov 11:50
v0.37.0-rc.0
0256823
Pre-release

The first release candidate of v0.37.0 is out!

We have some really interesting features this time around, with several improvements across components, a new replication protocol for Receivers, and even fixes for Prometheus v3! Do take a look at some of the breaking changes below!

Thank you to all contributors who have contributed to this release. It wouldn't be possible without you! 💜

Please try it out and let us know if you find any issues! 🚀

Changelog

Fixed

  • #7511 Query Frontend: fix doubled gzip compression for response body.
  • #7592 Ruler: Only increment thanos_rule_evaluation_with_warnings_total metric for non PromQL warnings.
  • #7614 *: fix debug log formatting.
  • #7492 Compactor: update filtered blocks list before second downsample pass.
  • #7658 Store: Fix panic caused by a too-small buffer in the pool.
  • #7643 Receive: fix thanos_receive_write_{timeseries,samples} stats
  • #7644 fix(ui): add null check to find overlapping blocks logic
  • #7814 Store: label_values: if matchers contain __name__=="something", do not add != "" to fetch fewer postings.
  • #7679 Query: respect store.limit.* flags when evaluating queries
  • #7821 Query/Receive: Fix goroutine leak introduced in #7796.
  • #7843 Query Frontend: fix slow query logging for non-query endpoints.
  • #7852 Query Frontend: pass "stats" parameter forward to queriers and fix Prometheus stats merging.
  • #7832 Query Frontend: Fix cache keys for dynamic split intervals.
  • #7885 Store: Return chunks to the pool after completing a Series call.
  • #7893 Sidecar: Fix retrieval of external labels for Prometheus v3.0.0.
  • #7903 Query: Fix panic on regex store matchers.
  • #7915 Store: Close block series client at the end to not reuse chunk buffer

Added

  • #7763 Ruler: use native histograms for client latency metrics.
  • #7609 API: Add limit param to metadata APIs (series, label names, label values).
  • #7429 Reloader: introduce TolerateEnvVarExpansionErrors to allow suppressing errors when expanding environment variables in the configuration file. When set, the reloader does not treat the operation as failed when it encounters an unset environment variable. Note that all unset environment variables are left as is, whereas all set environment variables are expanded as usual.
  • #7560 Query: Added the possibility of filtering rules by rule_name, rule_group or file to the HTTP API.
  • #7652 Store: Implement metadata API limit in stores.
  • #7659 Receive: Add support for replication using Cap'n Proto. This protocol has a lower CPU and memory footprint, which leads to a reduction in resource usage in Receivers. Before enabling it, make sure that all receivers are updated to a version which supports this replication method.
  • #7853 UI: Add support for selecting graph time range with mouse drag.
  • #7855 Compact/Query: Add support for comma-separated replica labels.
  • #7654 *: Add '--grpc-server-tls-min-version' flag to let users specify the minimum TLS version; otherwise it defaults to TLS 1.3.
  • #7854 Query Frontend: Add --query-frontend.force-query-stats flag to force collection of query statistics from upstream queriers.
  • #7860 Store: Support hedged requests
  • #7924 *: Upgrade promql-engine to v0.0.0-20241106100125-097e6e9f425a and objstore to v0.0.0-20241111205755-d1dd89d41f97
  • #7835 Ruler: Add ability to do concurrent rule evaluations
  • #7722 Query: Add partition labels flag to partition leaf querier in distributed mode

Changed

  • #7494 Ruler: remove trailing period from SRV records returned by discovery dnssrvnoa lookups
  • #7567 Query: Use thanos resolver for endpoint groups.
  • #7741 Deps: Bump Objstore to v0.0.0-20240913074259-63feed0da069
  • #7813 Receive: enable initial TSDB compaction time randomization
  • #7820 Sidecar: Use prometheus metrics for min timestamp
  • #7886 Discovery: Preserve results from other resolve calls
  • #7669 Receive: Change quorum calculation for rf=2

Removed

  • #7704 *: breaking ⚠️ remove Store gRPC Info function. It has been deprecated for 3 years; it's time to remove it.
  • #7793 Receive: Disable dedup proxy in multi-tsdb
  • #7678 Query: Skip formatting strings if debug logging is disabled

Full Commit History: v0.35.1...v0.37.0-rc.0

v0.36.1

13 Aug 11:54
v0.36.1
99a5742

This patch release brings a few fixes! Please try it out and let us know if you face issues! 🚀

Changelog

Fixed

  • #7634 Rule: fix Query and Alertmanager TLS configurations with CA only.
  • #7618 Proxy: fix Query goroutine leak when store.response-timeout is set.

v0.36.0

31 Jul 15:41
v0.36.0
cfff551

v0.36.0 is out now!

Thank you to all contributors who have contributed to this release. It wouldn't be possible without you!

Please try it out and let us know if you find any issues! 🚀

Changelog

Fixed

  • #7326 Query: fixing exemplars proxy when querying stores with multiple tenants.
  • #7403 Sidecar: fix startup sequence
  • #7484 Proxy: fix panic in lazy response set

Added

  • #7317 Tracing: allow specifying resource attributes for the OTLP configuration.
  • #7367 Store Gateway: log request ID in request logs.
  • #7361 Query: breaking ⚠️ pass query stats from remote execution from server to client. We changed the protobuf of the QueryAPI. If you use query.mode=distributed, update your clients (upper-level Queriers) first, before updating leaf Queriers (servers).
  • #7363 Query-frontend: set value of remote_user field in Slow Query Logs from HTTP header
  • #7335 Dependency: Update minio-go to v7.0.70 which includes support for EKS Pod Identity.
  • #7477 *: Bump objstore to 20240622095743-1afe5d4bc3cd

Changed

  • #7334 Compactor: do not vertically compact downsampled blocks. Such cases are now marked with no-compact-mark.json. Fixes the panic "panic: unexpected seriesToChunkEncoder lack of iterations".
  • #7393 *: breaking ⚠️ Using native histograms for grpc middleware metrics. Metrics grpc_client_handling_seconds and grpc_server_handling_seconds will now be native histograms. If you have enabled native histogram scraping, you will need to update your PromQL expressions to use the new metric names.

Full Changelog: v0.35.1...v0.36.0

v0.36.0-rc.1

18 Jul 10:11
v0.36.0-rc.1
8511649
Pre-release

The second release candidate of v0.36.0 is out!

We include a server gRPC histogram fix in this release.

Thank you to all contributors who have contributed to this release. It wouldn't be possible without you!

Please try it out and let us know if you find any issues! 🚀

Changelog

Fixed

  • #7326 Query: fixing exemplars proxy when querying stores with multiple tenants.
  • #7403 Sidecar: fix startup sequence
  • #7484 Proxy: fix panic in lazy response set

Added

  • #7317 Tracing: allow specifying resource attributes for the OTLP configuration.
  • #7367 Store Gateway: log request ID in request logs.
  • #7361 Query: breaking ⚠️ pass query stats from remote execution from server to client. We changed the protobuf of the QueryAPI. If you use query.mode=distributed, update your clients (upper-level Queriers) first, before updating leaf Queriers (servers).
  • #7363 Query-frontend: set value of remote_user field in Slow Query Logs from HTTP header
  • #7335 Dependency: Update minio-go to v7.0.70 which includes support for EKS Pod Identity.
  • #7477 *: Bump objstore to 20240622095743-1afe5d4bc3cd

Changed

  • #7334 Compactor: do not vertically compact downsampled blocks. Such cases are now marked with no-compact-mark.json. Fixes the panic "panic: unexpected seriesToChunkEncoder lack of iterations".
  • #7393 *: breaking ⚠️ Using native histograms for grpc middleware metrics. Metrics grpc_client_handling_seconds and grpc_server_handling_seconds will now be native histograms. If you have enabled native histogram scraping, you will need to update your PromQL expressions to use the new metric names.

Full Changelog: v0.35.1...v0.36.0-rc.0

0.36.0-rc.0

26 Jun 17:34
v0.36.0-rc.0
c930d2e
Pre-release

The first release candidate of v0.36.0 is out!

We have mostly dependency bumps and bugfixes but some minor breaking changes, please see the changelog below for details.

Thank you to all contributors who have contributed to this release. It wouldn't be possible without you!

Please try it out and let us know if you find any issues! 🚀

Changelog

Fixed

  • #7326 Query: fixing exemplars proxy when querying stores with multiple tenants.
  • #7403 Sidecar: fix startup sequence
  • #7484 Proxy: fix panic in lazy response set

Added

  • #7317 Tracing: allow specifying resource attributes for the OTLP configuration.
  • #7367 Store Gateway: log request ID in request logs.
  • #7361 Query: breaking ⚠️ pass query stats from remote execution from server to client. We changed the protobuf of the QueryAPI. If you use query.mode=distributed, update your clients (upper-level Queriers) first, before updating leaf Queriers (servers).
  • #7363 Query-frontend: set value of remote_user field in Slow Query Logs from HTTP header
  • #7335 Dependency: Update minio-go to v7.0.70 which includes support for EKS Pod Identity.
  • #7477 *: Bump objstore to 20240622095743-1afe5d4bc3cd

Changed

  • #7334 Compactor: do not vertically compact downsampled blocks. Such cases are now marked with no-compact-mark.json. Fixes the panic "panic: unexpected seriesToChunkEncoder lack of iterations".
  • #7393 *: breaking ⚠️ Using native histograms for grpc middleware metrics. Metrics grpc_client_handling_seconds and grpc_server_handling_seconds will now be native histograms. If you have enabled native histogram scraping, you will need to update your PromQL expressions to use the new metric names.

Full Changelog: v0.35.1...v0.36.0-rc.0

v0.35.1

28 May 14:13
v0.35.1
086a698

This patch release brings a few fixes to all components and addresses a security concern! Please try it out and let us know if you face issues! 🚀

Changelog

Fixed

  • #7323 Sidecar: wait for prometheus on startup
  • #6948 Receive: fix goroutine leak during series requests to the Thanos Store API.
  • #7382 *: Ensure objstore flag values are masked & disable debug/pprof/cmdline
  • #7392 Query: fix broken min, max for pre 0.34.1 sidecars
  • #7373 Receive: Fix stats for remote write
  • #7318 Compactor: Recover from panic to log block ID

Full Changelog: v0.35.0...v0.35.1

v0.35.0

02 May 11:02
v0.35.0
d9a0efa

v0.35.0 is out now!
We have several amazing features this time, including distributed query execution, receive tenant-label based request splitting, better query analysis, and loads of bugfixes and optimizations!

Thank you to all contributors who have contributed to this release. It wouldn't be possible without you!

Please try it out and let us know if you find any issues! 🚀

Changelog

Fixed

  • #7083 Store Gateway: Fix lazy expanded postings with 0 length failing to be cached.
  • #7080 Receive: fix race condition in handler Close() when stopped early
  • #7132 Documentation: fix broken helm installation instruction
  • #7134 Store, Compact: Revert the recursive block listing mechanism introduced in #6474 and use the same strategy as in 0.31. Introduce a --block-discovery-strategy flag to control the listing strategy so that a recursive lister can still be used if the tradeoff of slower but cheaper discovery is preferred.
  • #7122 Store Gateway: Fix lazy expanded postings estimate base cardinality using posting group with remove keys.
  • #7166 Receive/MultiTSDB: Do not delete non-uploaded blocks
  • #7179 Query: Fix merging of query analysis
  • #7224 Query-frontend: Add Redis username to the client configuration.
  • #7220 Store Gateway: Fix lazy expanded postings caching partial expanded postings and bug of estimating remove postings with non existent value. Added PromQLSmith based fuzz test to improve correctness.
  • #7225 Compact: Don't halt due to overlapping sources when vertical compaction is enabled
  • #7244 Query: Fix Internal Server Error unknown targetHealth: "unknown" when trying to open the targets page.
  • #7248 Receive: Fix RemoteWriteAsync being executed sequentially, which caused high latency in the ingestion path.
  • #7271 Query: fixing dedup iterator when working on mixed sample types.
  • #7289 Query Frontend: show warnings from downstream queries.
  • #7308 Store: Batch TSDB Infos for blocks.

Added

  • #7155 Receive: Add tenant globbing support to hashring config
  • #7231 Tracing: added missing sampler types
  • #7194 Downsample: retry objstore related errors
  • #7105 Rule: add flag --query.enable-x-functions to allow usage of extended promql functions (xrate, xincrease, xdelta) in loaded rules
  • #6867 Query UI: Tenant input box added to the Query UI, in order to be able to specify which tenant the query should use.
  • #7186 Query UI: Only show tenant input box when query tenant enforcement is enabled
  • #7175 Query: Add --query.mode=distributed which enables the new distributed mode of the Thanos query engine.
  • #7199 Reloader: Add support for watching and decompressing Prometheus configuration directories
  • #7200 Query: Add --selector.relabel-config and --selector.relabel-config-file flags, which allow scoping the Querier to a subset of matched TSDBs.
  • #7233 UI: Showing Block Size Stats
  • #7256 Receive: Split remote-write HTTP requests via tenant labels of series
  • #7269 Query UI: Show peak/total samples in query analysis
  • #7280 *: Adding User-Agent to request logs
  • #7219 Receive: add --remote-write.client-tls-secure and --remote-write.client-tls-skip-verify flags to stop relying on grpc server config to determine grpc client secure/skipVerify.
  • #7297 *: mark as not queryable if status is not ready
  • #7302 Considering the X-Forwarded-For header for the remote address in the logs.
  • #7304 Store: Use loser trees for merging results
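
To make the idea of tenant globbing (#7155) concrete, here is a minimal sketch of picking a hashring by matching a tenant ID against glob patterns. The `hashring` struct, its field names, and `selectHashring` are hypothetical and do not reproduce the Thanos hashring configuration schema; the matching uses Go's standard `path.Match`, which may differ from the matcher Thanos uses.

```go
package main

import (
	"fmt"
	"path"
)

// hashring is a hypothetical, simplified stand-in for a hashring config entry.
type hashring struct {
	Name    string
	Tenants []string // glob patterns; an empty list matches every tenant
}

// selectHashring returns the name of the first hashring whose patterns match
// the tenant. Malformed patterns are treated as non-matching.
func selectHashring(rings []hashring, tenant string) (string, bool) {
	for _, r := range rings {
		if len(r.Tenants) == 0 {
			return r.Name, true
		}
		for _, pattern := range r.Tenants {
			if ok, err := path.Match(pattern, tenant); err == nil && ok {
				return r.Name, true
			}
		}
	}
	return "", false
}

func main() {
	rings := []hashring{
		{Name: "team-ring", Tenants: []string{"team-*"}},
		{Name: "default"},
	}
	for _, tenant := range []string{"team-a", "other"} {
		name, _ := selectHashring(rings, tenant)
		fmt.Printf("tenant=%s hashring=%s\n", tenant, name)
	}
}
```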

Changed

  • #7123 Rule: Change default Alertmanager API version to v2.
  • #7192 Rule: Do not turn off ruler even if resolving fails
  • #7223 Automatically detect memory limits and configure GOMEMLIMIT to match.
  • #7283 Compact: breaking ⚠️ Replace group with resolution in compact downsample metrics to avoid cardinality explosion with large numbers of groups.
  • #7305 Query|Receiver: Do not log full request on ProxyStore by default.
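
As a rough illustration of what automatic memory limit detection (#7223) involves, the sketch below reads a container memory limit and applies a soft limit to the Go runtime. It is not necessarily how Thanos implements it: the cgroup v2 path, the 90% headroom figure, and the helper names `parseCgroupLimit` and `softLimit` are all assumptions for this example. `debug.SetMemoryLimit` is the standard library equivalent of setting GOMEMLIMIT.

```go
package main

import (
	"fmt"
	"os"
	"runtime/debug"
	"strconv"
	"strings"
)

// parseCgroupLimit parses the contents of a cgroup v2 memory.max file.
// The literal value "max" means no limit is configured.
func parseCgroupLimit(raw string) (int64, bool) {
	s := strings.TrimSpace(raw)
	if s == "" || s == "max" {
		return 0, false
	}
	n, err := strconv.ParseInt(s, 10, 64)
	if err != nil || n <= 0 {
		return 0, false
	}
	return n, true
}

// softLimit returns roughly percent% of limit. Dividing before multiplying
// avoids overflow at the cost of rounding down slightly.
func softLimit(limit, percent int64) int64 {
	return limit / 100 * percent
}

func main() {
	// Assumed location of the limit on a cgroup v2 host.
	raw, err := os.ReadFile("/sys/fs/cgroup/memory.max")
	if err != nil {
		fmt.Println("no cgroup v2 memory limit found; leaving the runtime limit unchanged")
		return
	}
	limit, ok := parseCgroupLimit(string(raw))
	if !ok {
		fmt.Println("memory is unlimited; leaving the runtime limit unchanged")
		return
	}
	// Leave 10% headroom for non-heap memory (an arbitrary choice here).
	target := softLimit(limit, 90)
	debug.SetMemoryLimit(target)
	fmt.Printf("runtime memory limit set to %d bytes\n", target)
}
```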

Full Commit History: v0.34.1...v0.35.0-rc.0

v0.35.0-rc.0

29 Apr 14:12
v0.35.0-rc.0
bcad1e1
Pre-release

The first release candidate of v0.35.0 is out!
We have several amazing features this time, including distributed query execution, receive tenant-label based request splitting, better query analysis, and loads of bugfixes and optimizations!

Thank you to all contributors who have contributed to this release. It wouldn't be possible without you!

Please try it out and let us know if you find any issues! 🚀

Changelog

Fixed

  • #7083 Store Gateway: Fix lazy expanded postings with 0 length failing to be cached.
  • #7080 Receive: fix race condition in handler Close() when stopped early
  • #7132 Documentation: fix broken helm installation instruction
  • #7134 Store, Compact: Revert the recursive block listing mechanism introduced in #6474 and use the same strategy as in 0.31. Introduce a --block-discovery-strategy flag to control the listing strategy so that a recursive lister can still be used if the tradeoff of slower but cheaper discovery is preferred.
  • #7122 Store Gateway: Fix lazy expanded postings estimate base cardinality using posting group with remove keys.
  • #7166 Receive/MultiTSDB: Do not delete non-uploaded blocks
  • #7179 Query: Fix merging of query analysis
  • #7224 Query-frontend: Add Redis username to the client configuration.
  • #7220 Store Gateway: Fix lazy expanded postings caching partial expanded postings and bug of estimating remove postings with non existent value. Added PromQLSmith based fuzz test to improve correctness.
  • #7225 Compact: Don't halt due to overlapping sources when vertical compaction is enabled
  • #7244 Query: Fix Internal Server Error unknown targetHealth: "unknown" when trying to open the targets page.
  • #7248 Receive: Fix RemoteWriteAsync being executed sequentially, which caused high latency in the ingestion path.
  • #7271 Query: fixing dedup iterator when working on mixed sample types.
  • #7289 Query Frontend: show warnings from downstream queries.
  • #7308 Store: Batch TSDB Infos for blocks.

Added

  • #7155 Receive: Add tenant globbing support to hashring config
  • #7231 Tracing: added missing sampler types
  • #7194 Downsample: retry objstore related errors
  • #7105 Rule: add flag --query.enable-x-functions to allow usage of extended promql functions (xrate, xincrease, xdelta) in loaded rules
  • #6867 Query UI: Tenant input box added to the Query UI, in order to be able to specify which tenant the query should use.
  • #7186 Query UI: Only show tenant input box when query tenant enforcement is enabled
  • #7175 Query: Add --query.mode=distributed which enables the new distributed mode of the Thanos query engine.
  • #7199 Reloader: Add support for watching and decompressing Prometheus configuration directories
  • #7200 Query: Add --selector.relabel-config and --selector.relabel-config-file flags, which allow scoping the Querier to a subset of matched TSDBs.
  • #7233 UI: Showing Block Size Stats
  • #7256 Receive: Split remote-write HTTP requests via tenant labels of series
  • #7269 Query UI: Show peak/total samples in query analysis
  • #7280 *: Adding User-Agent to request logs
  • #7219 Receive: add --remote-write.client-tls-secure and --remote-write.client-tls-skip-verify flags to stop relying on grpc server config to determine grpc client secure/skipVerify.
  • #7297 *: mark as not queryable if status is not ready
  • #7302 Considering the X-Forwarded-For header for the remote address in the logs.
  • #7304 Store: Use loser trees for merging results

Changed

  • #7123 Rule: Change default Alertmanager API version to v2.
  • #7192 Rule: Do not turn off ruler even if resolving fails
  • #7223 Automatically detect memory limits and configure GOMEMLIMIT to match.
  • #7283 Compact: breaking ⚠️ Replace group with resolution in compact downsample metrics to avoid cardinality explosion with large numbers of groups.
  • #7305 Query|Receiver: Do not log full request on ProxyStore by default.

Full Commit History: v0.34.1...v0.35.0-rc.0

v0.34.1

20 Feb 12:00
v0.34.1
4cf1559

This patch release includes a fix for CVE-2023-44478, thanks @hanyuting8!

Changelog

Fixed

  • #7078 *: Bump gRPC to 1.57.2

Full Changelog: v0.34.0...v0.34.1