- A flag -print-secrets to disable redacting config secrets.
- A prometheus remote-write sink compatible with Cortex and more. Thanks, philipnrmn!
- Some veneur-prometheus arguments to rename and add additional tags. Thanks, christopherb-stripe!
- Migrate Prometheus to new config format; part of multi-sink routing update. Thanks, truong-stripe!
- Authentication support for Cortex remote-write sink. Thanks, oscil8!
- Option to flush sinks on shutdown. Thanks, csolidum!
- `trace.StartTrace` and `trace.StartChildSpan` now scale better across multiple goroutines. Thanks bpowers
- Support for mTLS in `veneur-proxy`. Thanks arnavdugar
- Support for extended clientside aggregation. Thanks, praboud-stripe!
- Use `T.TempDir` to create temporary directory in tests (#944).
- When the request to send data from Cloudwatch & SFX sink fails, log the count of metrics that are dropped.
- A fix for forwarding metrics with gRPC using the kubernetes discoverer. Thanks, androohan!
- Regenerate testing certs/CA that have expired and have broken tests. Thanks, randallm
- The config field `trace_lightstep_access_token` is redacted if printed. Thanks arnavdugar!
- The ability to emit dogstatsd metrics from veneur-emit over plain TCP in addition to the current plain UDP. Thanks, shrivu-stripe!
- A config option and health checking for beginning support of emitting/receiving metrics via gRPC. Thanks, eriwo-stripe!
- A gRPC server that listens for SSF spans and dogstatsd metrics on grpc_listening_addresses. Thanks eriwo-stripe and shrivu-stripe!
- The ability to emit metrics from veneur-emit via the gRPC protocol as well as the option to specify a proxy for those metrics. Thanks eriwo-stripe and shrivu-stripe!
- The "veneur.listen.received_per_protocol_total" metric to be published by global Veneur instances. This is a counter to track the metrics being directly listened for (rather than imported) by protocol. Thanks, eriwo-stripe!
- Log the SignalFX endpoint base on creation of the sink. Useful for grepping across multiple host logs. Thanks rma-stripe!
- Upgrade to Goji 2.0.2 and use the modern Goji mux API. Thanks hans-stripe!
- Increase the timeout of TestCountActiveHandlers from 3 seconds to 10 seconds. Helps some tests pass in certain environments. Thanks sushain97!
- Migrated from dep to Go modules. Clients must now use the updated import path `github.com/stripe/veneur/v14`. Thanks, andybons!
- The SignalFx sink now supports dynamically fetching per-tag Access tokens from SignalFX. Thanks, szabado!
- The Kafka sink now includes metrics for skipped and dropped spans, as well as a debug log line for flushes. Thanks, franklinhu
- A flush "watchdog" controlled by the config setting `flush_watchdog_missed_flushes`. If veneur has not started this many flushes, the watchdog panics and terminates veneur (so it can be restarted by process supervision). Thanks, antifuchs!
- Splunk sink: Trace-related IDs are now represented in hexadecimal for cross-tool compatibility and a small byte savings. Thanks, myndzi
- Splunk sink: Indicator spans are now tagged with `"partial":true` when they would otherwise have been sampled, distinguishing between partial and full traces. Thanks, myndzi
- New configuration options `veneur_metrics_scopes` and `veneur_metrics_additional_tags`, which allow configuring veneur such that it aggregates its own metrics globally (rather than reporting a set of internal metrics per instance/container/etc). Thanks, antifuchs!
- New SSF `sample` field: `scope`. This field lets clients tell Veneur what to do with the sample - it corresponds exactly to the `veneurglobalonly` and `veneurlocalonly` tags that metrics can hold. Thanks, antifuchs!
- veneur-prometheus now allows you to specify mTLS configuration for the polling HTTP client. Thanks, choo-stripe!
- The `http_quit` config option enables the `/quitquitquit` endpoint, which can be used to trigger a graceful shutdown using an HTTP POST request. Thanks, aditya!
- New config option `count_unique_timeseries` which is used to emit metric `veneur.flush.unique_timeseries_total`, the HyperLogLog cardinality estimate of the unique timeseries in a flush interval. Thanks, randallm!
- veneur-emit now allows you to mark SSF spans as having errored. Thanks, randallm!
- The Splunk span sink supports tag exclusion, to better manage Splunk indexing volume. If a span contains a tag that's in the excluded set, the Splunk sink will skip sending that span to Splunk. Thanks, aditya!
- veneur-prometheus now supports Prometheus Untyped metrics. Thanks, kklipsch-stripe!
- veneur-prometheus now accepts a socket parameter for proxied requests. Thanks, kklipsch-stripe!
- The Datadog sink can now filter metric names by prefix with `datadog_metric_name_prefix_drops`. Thanks, kaplanelad!
- The Datadog sink can now filter tags by metric names prefix with `datadog_exclude_tags_prefix_by_prefix_metric`. Thanks, kaplanelad!
- When specifying the SignalFx key with `signalfx_vary_key_by`, if both the host and the metric provide a value, the metric-provided value will take precedence over the host-provided value. This allows more granular forms of metric organization and attribution. Thanks, aditya!
- Support for listening to abstract statsd metrics on Unix Domain Socket (Datagram type). Thanks, androohan!
- Implementation of a Prometheus sink through Statsd Exporter. Thanks, yanske!
- New Relic sink supporting Metrics, Events, Service Checks (as events) and Trace Spans. Thanks, jthurman42!
- Updated the vendored version of DataDog/datadog-go which adds support for sending metrics to Unix Domain socket. Thanks, prudhvi!
- Splunk sink: Downgraded Splunk HEC errors to be logged at warning level, rather than error level. Added a note to clarify that Splunk cluster restarts can cause temporary errors, which are not necessarily problematic. Thanks, aditya!
- Updated the vendored version of github.com/gogo/protobuf which fixes Gopkg.toml conflicts for users of veneur. Thanks, dtbartle!
- Updated server.go to use the aws sdk (https://docs.aws.amazon.com/sdk-for-go/api/aws/session/) when the creds are not set in the config.yaml. Thanks, linuxdynasty!
- Changed the certificates that veneur tests with to include SANs and no longer rely on Common Names, in order to comply with Go's upcoming crackdown on CN certificate constraints. Thanks, antifuchs!
- Disabled the dogstatsd client telemetry on the internal statsd client used by Veneur. Thanks, prudhvi!
- Migrated from the deprecated Sentry package, raven-go, to sentry-go. Thanks, yanske!
- veneur-prometheus now reports incremental counters instead of cumulative counters. This may cause dramatic differences in the statistics reported by veneur-prometheus. Thanks, kklipsch-stripe!
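
To see why the switch from cumulative to incremental counters can change reported statistics so much, here is a minimal sketch of the conversion. It is an illustration only: the function name and the counter-reset handling (treating a drop in value as a restart and reporting the new value as the increment) are assumptions, not a description of how veneur-prometheus does it.

```go
package main

import "fmt"

// counterDeltas converts successive cumulative counter readings into
// per-scrape increments. The first reading has no predecessor, so it
// produces no increment. A reading lower than the previous one is treated
// as a counter reset (assumption) and reported as-is.
func counterDeltas(readings []float64) []float64 {
	deltas := make([]float64, 0, len(readings))
	for i := 1; i < len(readings); i++ {
		prev, cur := readings[i-1], readings[i]
		if cur < prev {
			deltas = append(deltas, cur)
			continue
		}
		deltas = append(deltas, cur-prev)
	}
	return deltas
}

func main() {
	// Cumulative scrapes: 100, 130, 130, 175, then a restart at 20.
	fmt.Println(counterDeltas([]float64{100, 130, 130, 175, 20})) // prints [30 0 45 20]
}
```

A dashboard that previously charted the cumulative value (100, 130, 175, ...) would instead see the much smaller per-interval increments after this change.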
- veneur-emit no longer panics when an empty command is passed. Thanks, shrivu-stripe!
- Fixed a bug that caused some veneur-emit builds to not flush metrics to udp. Thanks, shrivu-stripe!
- Veneur listening on UDS for statsd metrics will respect the `read_buffer_size_bytes` config. Thanks, prudhvi!
- The splunk HEC span sink didn't correctly spawn the number of submission workers configured with `splunk_hec_submission_workers`, only spawning one. Now it spawns the number configured. Thanks, antifuchs!
- The signalfx sink now correctly constructs ingestion endpoint URLs when given URLs that end in slashes. Thanks, antifuchs!
- Veneur now sets a deadline for its flushes: No flush may take longer than the configured server flush interval. Thanks, antifuchs!
- The signalfx sink no longer deadlocks the flush process if it receives more than one error per submission. Thanks, antifuchs!
- Fixed the README to link to the correct HLL implementation. Thanks, gphat!
- Fixed the BucketRegionError while using the S3 Plugin. Thanks, linuxdynasty!
- A dependency on `github.com/Sirupsen/logrus` from the trace client package `github.com/stripe/veneur/trace`. Thanks, antifuchs and samczsun!
- The OpenTracing implementation's `Tracer.Inject` in the `trace` package now sets HTTP headers in a way that tools like Envoy can propagate on traces. Thanks, antifuchs!
- SSF packets are now validated to ensure they contain either a valid span or at least one metric. The metric `veneur.worker.ssf.empty_total` tracks the number of empty SSF packets encountered, which indicates a client error. Thanks, tummychow and aditya!
- SSF indicator spans can now report an additional "objective" metric, tagged with their service and name. Thanks, tummychow!
- Support for listening to statsd metrics on Unix Domain Socket(Datagram type). Thanks, prudhvi!
- The metric `veneur.sink.spans_dropped_total` now includes packets that were skipped due to UDP write errors. Thanks, aditya!
- The `debug` blackhole sink features improved logging output, with more data and better formatting. Thanks, aditya!
- Container images are now built with Go 1.12. Thanks, aditya!
- The signalfx client no longer reports a timeout when submission to the datapoint API endpoint encounters an error. Thanks, antifuchs!
- SSF packets without a name are no longer considered valid for `protocol.ValidTrace`. Thanks, tummychow!
- The splunk sink no longer hangs or complains when a HEC endpoint should close the connection. Thanks, antifuchs!
- Go 1.10 is no longer supported.
- Datadog's distribution type for DogStatsD is now supported and treated as a plain histogram for compatibility. Thanks, gphat!
- Add support for `tags_exclude` to the DataDog metrics sink. Thanks, mhamrah!
- The `github.com/stripe/veneur/trace` package has brand new and much more extensive documentation! Thanks, antifuchs!
- New configuration setting `signalfx_flush_max_per_body` that allows limiting the payload of HTTP POST bodies containing data points destined for SignalFx. Thanks, antifuchs!
- The new X-Ray sink provides support for AWS X-Ray as a tracing backend. Thanks, gphat and aditya!
- A new package `github.com/stripe/veneur/trace/testbackend` contains two trace client backends that can be used to test the trace data emitted by applications. Thanks, antifuchs!
- Updated the vendored version of x/net, which picks up a package rename that can lead to issues when integrating veneur into other codebases. Thanks, nicktrav!
- Updated the vendored versions of x/sys, protobuf, and gRPC. Thanks nicktrav!
- The Splunk span sink no longer reports an internal error for timeouts encountered in event submissions; instead, it reports a failure metric with a cause tag set to `submission_timeout`. Thanks, antifuchs!
- The Splunk span sink now honors `Connection: keep-alive` from the HEC endpoint and keeps around as many idle HTTP connections in reserve as it has HEC submission workers. Thanks, antifuchs!
- The metric `veneur.forward.post_metrics_total` was being emitted both as a gauge and a counter. The errant gauge was removed. Thanks, gphat!
- The Splunk span sink can be configured with a sample rate for non-indicator spans with the `splunk_span_sample_rate` setting. Thanks, aditya!
- The splunk span sink now has configuration parameters `splunk_hec_max_connection_lifetime` and `splunk_hec_connection_lifetime_jitter` to regulate how long HTTP connections can be kept alive for. Thanks, antifuchs!
- The SignalFx sink can now filter metric names by prefix with `signalfx_metric_name_prefix_drops` and tag literals (case-insensitive) with `signalfx_metric_tag_literal_drops`. Thanks gphat!
- Histograms and timers now support global scope. Histograms and timers tagged with "veneurglobalonly" will now emit all metrics from the global veneur. The default behavior is to emit aggregates like max, min locally and percentiles globally. Thanks, clin!
- The `ssf.spans.root.received_total` global counter tracks the number of traces (root spans) processed system-wide. Thanks, aditya!
- The README's Metrics section has been updated, as it referred to some missing metrics. Thanks, gphat!
- Various references to Datadog were removed from the README; Veneur is vendor agnostic. Thanks, gphat!
- The metrics `veneur.flush.total_duration_ns` and `veneur.flush.worker_duration_ns` were removed; please use the per-sink `veneur.sink.metric_flush_total_duration_ns` to monitor flush durations.
- The metrics `veneur.gc.GCCPUFraction`, `veneur.gc.alloc_heap_bytes_total` and `veneur.gc.mallocs_objects_total` were removed, also from veneur-proxy. Thanks, gphat!
- The metric `veneur.flush.other_samples_duration_ns` was removed. Thanks, gphat!
- Metrics can be forwarded over gRPC using veneur-proxy (and Consul). Thanks, noahgoldman and Quantcast!
- Added tracer.InjectHeader convenience function for... convenience! Thanks, mikeh!
- Veneur has a new sink that can be configured to send spans as events into a Splunk HEC endpoint. Thanks, antifuchs and aditya!
- Go 1.11 is now supported and used for all public Docker images. Thanks, aditya!
- The `veneur/trace` package now supports setting the indicator bit on a span manually. Thanks, aditya!
- The `-validate-config` and `validate-config-strict` flags will make veneur exit appropriately after checking the specified (`-f`) config file. Thanks, sdboyer!
- `veneur-emit` will now exit with an error if no data would have been sent. Thanks, sdboyer!
- The trace client can now correctly parse trace headers emitted by Envoy. Thanks, aditya!
- Go 1.9 is no longer supported.
- `veneur-emit` now takes a new option `-span_tags` for tags that should be applied only to spans. This allows span-specific tags that are not applied to other emitted values. Thanks gphat!
- `veneur-emit`'s `-tag` flag now applies the supplied tags to any value emitted, be it a span, metric, service check or event. Use other, mode-specific flags (e.g. span_tags) to add tags only to those modes. Thanks gphat!
- Isolated a potential resource starvation issue. Added new configuration options for `veneur-proxy` to configure its http.Transport:
  - `idle_connection_timeout` for controlling how long connections may idle before timing out, corresponds to `IdleConnTimeout`
  - `max_idle_conns` for controlling the maximum number of idle connections in total
  - `max_idle_conns_per_host` for controlling the maximum number of idle connections per host. Note that this now defaults to `100` for safety!
- Added configuration options and improved defaults for the following tracing client parameters:
  - `tracing_client_capacity` for controlling the depth of a buffer that holds tracing spans when they can't be emitted, defaults to `1024`, up from `64`
  - `tracing_client_flush_interval` for controlling how often the tracing client's backing buffer will be emptied (as an alternative to when it is full), defaults to `500ms` from `3s`
  - `tracing_client_metrics_interval` for controlling how often the tracing client will send metrics about its own operations, defaults to `1s` and is unchanged
- `veneur-prometheus` no longer crashes when the metrics host is unreachable. Thanks, arjenvanderende!
- `veneur-proxy` now only logs forward counts at Debug level, drastically reducing log volume.
- Metrics can be imported over gRPC if the `grpc_address` parameter is set. Thanks, noahgoldman and Quantcast!
- When timing commands, `veneur-emit` now passes stdin, stdout and stderr through to the child process unmodified. Thanks antifuchs and sdboyer!
- Metrics can be forwarded over gRPC (currently only to a single Veneur) using `forward_use_grpc`. Thanks, noahgoldman and Quantcast!
- Two new options, `debug_ingested_spans` and `debug_flushed_metrics` make veneur log (at level DEBUG) information about the metrics and spans it processes. Thanks, antifuchs!
- `veneur-emit` now takes a new option `-set` with a string argument, which allows counting how many unique values were reported in veneur's flush interval. Thanks, antifuchs!
- Fix a possible crash-before-panic when unable to open UDP socket. Thanks, gphat
- The `StartSpan` method on `tracer.Tracer` will default to the provided `operationName` if provided. This function is provided for compatibility with OpenTracing, but the package-level `trace.StartSpanFromContext` function is recommended for new users.
- When creating timer metrics from indicator spans, veneur no longer prefixes `indicator_span_timer_name` with the string `veneur.`. Thanks, antifuchs!
- `veneur-prometheus` now exports Histograms properly, with a statsd tag for each bucket
- Environment config for `veneur-proxy` now uses `VENEUR_PROXY_` as a prefix. Previously used `VENEUR_` which was a bug!
- Metric sampler parse function now looks for `veneurlocalonly` and `veneurglobalonly` by prefix instead of direct equality for times where value can't/shouldn't be excluded even if it's blank. Thanks joeybloggs
- `veneur-prometheus` now exports a tag for each quartile rather than a separate metric
- The tag `span_name` has been removed from the timer metric generated for indicator spans. Thanks, aditya!
- Added a timeout for sink ingestion to all sinks, which prevents a single slow sink from blocking ingestion on other span sinks indefinitely. Thanks, aditya!
- Added `trace.SetDefaultClient` to handle overriding the default trace client, and closing the existing one. Thanks, franklinhu
- Veneur's key performance indicator metrics for metrics processing are reported through the statsd client. This way, KPI metrics are affected only by the metrics pipeline, not the tracing pipeline as well.
- SignalFX sink can now handle and convert ssf service checks (represented as a gauge). Thanks, stealthcode!
- Converted the grpsink to use unary instead of stream RPCs. Thanks, sdboyer!
- Bumped version of SignalFx go client to prevent accidental removal of `-` from tag keys. Thanks, gphat!
- Switched to a faster consistent hash implementation. Thanks, aditya and noahgoldman!
- Reduce mutex contention around the default RNG in the math/rand standard library package. Thanks, aditya!
- Improve performance of `ssf.RandomlySample` by a factor of one million. Thanks, aditya!
- Added additional metrics for internal memory usage. Thanks, aditya!
  - `gc.alloc_heap_bytes_total`
  - `gc.mallocs_objects_total`
  - `gc.GCCPUFraction`
- The `ignored-labels` and `ignored-metrics` flags for veneur-prometheus will filter no metrics or labels if no filter is specified. Thanks, arjenvanderende!
- Fixed problem where all Datadog service checks were set to `OK` instead of the supplied value. Thanks, gphat!
- Added a timeout to the Kafka sink, which prevents the Kafka client from blocking other span sinks. Thanks, aditya!
- Official support for building Veneur's binaries with Go 1.8 has been dropped. Supported versions of Go for building Veneur are 1.9, 1.10, or tip.
- Veneur's trace client library can still be used in applications that are built with Go 1.8, but it is no longer tested against Go 1.8.
- The `veneur.ssf.received_total` metric has been removed, as it is mostly redundant with `veneur.ssf.spans.received_total`, and was not reported consistently between packet and framed formats.
- The `veneur.ssf.spans.received_total` metric now tracks all SSF data received, in either packet or framed format, whether or not a valid span was extracted.
- Receiving SSF in UDP packets now happens on `num_readers` goroutines. Thanks, antifuchs
- Updated SignalFx library dependency so that compression is enabled by default, saving significant time on large metric bodies. Thanks, gphat
- Decreased logging output of veneur-proxy. Thanks, gphat!
- Better warnings when invalid flag combinations are passed to `veneur-emit`. Thanks, sdboyer!
- Revamped how sinks handle DogStatsD's events and service checks. Thanks, gphat
  - `veneur.worker.events_flushed_total` and `veneur.worker.checks_flushed_total` have been replaced by `veneur.worker.other_samples_flushed_total`
  - `veneur.flush.event_worker_duration_ns` has been replaced by `veneur.flush.other_samples_duration_ns`
- The new `tags_exclude` parameter can be used to strip tags from all metrics Veneur processes, either for all supported sinks or a subset of sinks. Thanks, aditya!
- The SSF client now defaults to opening 8 connections in parallel to avoid blocking client code. Thanks, antifuchs!
- New config settings `num_span_workers` and `span_channel_capacity` that allow you to customize the parallelism of span ingestion. Thanks, antifuchs!
- New span sink utilization metrics - Thanks, antifuchs:
  - `veneur.sink.span_ingest_total_duration_ns` gives the total time per `sink` spent ingesting spans
  - `veneur.worker.span_chan.total_elements` over `veneur.worker.span_chan.total_capacity` gives the utilization of the sink ingestion channel.
- Introduce a generic gRPC streaming backend for trace spans. Thanks, sdboyer!
- New config keys `signalfx_vary_key_by` and `signalfx_per_tag_api_keys` which allow sending signalfx data points with an API key specific to these data points' dimensions. Thanks, antifuchs!
- veneur-proxy now reports runtime metrics (with the prefix `veneur-proxy.`) at a configurable interval controlled by `runtime_metrics_interval`. It defaults to 10s. Thanks gphat!
- Allow specifying trace start/end times on `veneur-emit`. Thanks, sdboyer!
- Default `span_channel_capacity` to a non-zero value so we don't drop most spans in a minimal configuration. Thanks, gphat!
- Added tests for parsing floating point timers and histograms, just in case! Thanks gphat!
- New `ignored-labels` and `ignored-metrics` flags added to veneur-prometheus to selectively restrict exports to Veneur. Thanks, yolken!
- These deprecated configuration keys are no longer supported and will cause an error on startup if used in a veneur config file:
  - `api_hostname` - replaced in 1.5.0 by `datadog_api_hostname`
  - `key` - replaced in 1.5.0 by `datadog_api_key`
  - `trace_address` - replaced in 1.7.0 by `ssf_listen_addresses`
  - `trace_api_address` - replaced in 1.5.0 by `datadog_trace_api_address`
  - `ssf_address` - replaced in 1.7.0 by `ssf_listen_addresses`
  - `tcp_address` and `udp_address` - replaced in 1.7.0 by `statsd_listen_addresses`
- These metrics have changed names:
  - Datadog, MetricExtraction, and SignalFx sinks now emit `veneur.sink.metric_flush_total_duration_ns` for metric flush duration and tag it with `sink`
  - Datadog, Kafka, MetricExtraction, and SignalFx sinks now emit `sink.metrics_flushed_total` for metric flush counts and tag it with `sink`
  - Datadog and LightStep sinks now emit `veneur.sink.span_flush_total_duration_ns` for span flush duration and tag it with `sink`
  - Datadog, Kafka, MetricExtraction, and LightStep sinks now emit `sink.spans_flushed_total` for span flush counts and tag it with `sink`
- Veneur's internal metrics are no longer tagged with `veneurlocalonly`. This means that percentile metrics (such as timers) will now be aggregated globally.
- LightStep sink was hardcoded to use plaintext, now adjusts based on URL scheme (http versus https). Thanks gphat!
- The Datadog sink will no longer panic if `flush_max_per_body` is not configured; a default is used instead. Thanks silverlyra!
- The statsd source will no longer reject all packets if `metric_max_length` is not configured; a default is used instead. Thanks silverlyra!
- `veneur-emit` now infers parent and trace IDs from the environment (using the variables `VENEUR_EMIT_TRACE_ID` and `VENEUR_EMIT_PARENT_SPAN_ID`) and sets these environment variables from its `-trace_id` and `parent_span_id` when timing commands, allowing for convenient construction of trace trees if traced programs call `veneur-emit` themselves. Thanks, antifuchs
- The Kafka sink for spans can now sample spans (at a rate determined by `kafka_span_sample_rate_percent`) based off of traceIDs (by default) or a tag's values (configurable via `kafka_span_sample_tag`) to consistently sample spans related to each other. Thanks, rhwlo!
- Improvements in SSF metrics reporting - thanks, antifuchs!
  - Function `ssf.RandomlySample` that takes an array of samples and a sample rate and randomly drops those samples, adjusting the kept samples' rates
  - New variable `ssf.NamePrefix` that can be used to prepend a common name prefix to SSF sample names.
  - Package `trace/metrics`, containing functions that allow reporting metrics through a trace client.
  - New type `ssf.Samples` holding a batch of samples which can be submitted conveniently through `trace/metrics`.
  - Method `trace.(*Trace).Add`, which allows adding metrics to a trace span.
- `veneur-proxy` has a new configuration option `forward_timeout` which allows specifying how long forwarding a batch to global veneur servers may take in total. Thanks, antifuchs!
- Add native support for running Veneur within Kubernetes. Thanks, aditya!
- Updated Datadog span sink to latest version in Datadog tracing agent. Thanks, gphat!
- The semantics around `veneur-emit` command timing have changed: `-shellCommand` argument has been renamed to `-command`, and `-command` is now gone. The only way to time a command is to provide the command and its arguments as separate arguments, the method of passing in a shell-escaped string is no longer supported.
- A SignalFx sink has been added for flushing metrics to SignalFx. Thanks, gphat!
- A Kafka sink has been added for publishing spans or metrics. Thanks, parabuzzle and gphat!
- Buffered trace clients in `github.com/stripe/veneur/trace` now have a new option to automatically flush them in a periodic interval. Thanks, antifuchs!
- Gauges can now be marked as `veneurglobalonly` to be globally "last write wins". Thanks gphat!
- `veneur-emit` now takes `-span_service`, `-trace_id`, `-parent_span_id`, and `-indicator` arguments. These (combined with `-ssf`) allow submitting spans for tracing when recording timing data for commands. In addition, `-timing` with `-ssf` now works, too.
- The `github.com/stripe/veneur/ssf` package now has a few helper functions to create samples that can be attached to spans: `Count`, `Gauge`, `Histogram`, `Timing`. In addition, veneur now has a span sink that extracts these samples and treats them as metrics. Thanks, antifuchs
- Spans sent to Lightstep now have an `indicator` tag set, indicating whether the span is an indicator span or not. Thanks, aditya!
- `veneur-emit -command` now streams output from the invoked program's stdout/stderr to its own stdout/stderr. Thanks, antifuchs!
- Veneur now supports the tag `veneursinkonly:<sink_name>` on metrics, which routes the metric to only the sink specified. See the docs here. Thanks, antifuchs!
- Veneur now emits a timer metric giving the duration (in nanoseconds) of every "indicator" span that it receives, if you configure the setting `indicator_span_timer_name`. Thanks, antifuchs!
- All sinks have been moved to their own packages for smaller code and better interfaces. Thanks gphat!
- Removed noisy Sentry events that duplicated Datadog error reporting. Thanks, aditya!
- Veneur now reuses HTTP connections for forwarding and Datadog flushes. Furthermore each phase of the HTTP request is traced and can be seen using the trace sink of your choice. This should drastically improve performance and reliability for installs with large numbers of instances. Thanks gphat!
- Veneur now tracks statsd metrics for SSF spans concerning its own operation. This means that the `veneur.ssf.spans.received_total` counter and the `veneur.ssf.packet_size` histogram again reflect trace spans routed internally. Thanks, antifuchs!
- New 'blackhole' sink for testing and benchmark purposes. Thanks gphat!
- Veneur no longer requires the use of Datadog as a target for flushes. Veneur can now use one or more of any of its supported sinks as a backend. This realizes our desire for Veneur to be fully vendor agnostic. Thanks gphat!
- The package `github.com/stripe/veneur/trace` now depends on fewer other packages across veneur, making it easier to pull in `trace` as a dependency. Thanks antifuchs!
- A Veneur server with tracing enabled now submits traces and spans concerning its own operation to itself internally without sending them over UDP. See the "Upgrade Notes" section below for metrics affected by this change. Thanks antifuchs!
- veneur-prometheus and veneur-proxy executables are now included in the docker images. Thanks jac
- All Veneur executables are now in $PATH in the docker images. Thanks jac
- When using Lightstep as a tracing sink, spans can be load-balanced more evenly across collectors by configuring the `trace_lightstep_num_clients` option to multiplex across multiple clients. Thanks aditya!
- `sync_with_interval` is a new configuration option! If enabled, when starting, Veneur will now delay its first metric to be aligned with an `interval` boundary on the local clock. This will effectively "synchronize" Veneur instances across your deployment assuming reasonable clock behavior. The result is metric timestamps in your TSDB that mostly line up, improving bucketing behavior. Thanks gphat!
- Cleaned up some linter warnings. Thanks gphat!
- Tests no longer depend on implicit presence of a Datadog metric or span sink. Thanks gphat!
- Refactor internal HTTP helper into its own package to fix up possible circular deps. Thanks gphat!
- Fix a panic when using `veneur-emit` to emit metrics via `-ssf` when no tags are specified. Thanks myndzi
- Remove spurious warnings about unset configuration settings. Thanks antifuchs
- Removed the InfluxDB plugin as it was experimental and wasn't working. We can make a sink for it in the future if desired. Thanks gphat!
- The `set` data structure serialization format for communication with a global Veneur server has changed in an incompatible way. If your infrastructure relies on a global Veneur installation, they will drop `set` data from non-matching versions until the entire fleet and the global Veneur are all at the same version.
- The metrics for SSF packets (and spans) received have changed names: They used to be `veneur.packet.received_total` and `veneur.packet.spans.received_total`, respectively, and they are now named `veneur.ssf.received_total` and `veneur.ssf.spans.received_total`.
- New configuration option `statsd_listen_addresses`, a list of URIs indicating on which ports (and protocols) Veneur should listen on for statsd metrics. This deprecates both the `udp_address` and `tcp_address` settings. Thanks antifuchs!
- New package `github.com/stripe/veneur/protocol`, containing a wire protocol for sending/reading SSF over a streaming connection. Thanks antifuchs!
- `github.com/veneur/trace` now contains customizable `Client`s that support streaming connections.
- veneur-prometheus now has a `-p` option for specifying a prefix for all metrics. Thanks gphat!
- New metrics `ssf.spans.tags_per_span` and `ssf.packet_size` track the distribution of tags per span and total packet sizes, respectively.
- Our super spiffy logo, designed by mercedes, is now included at the top of the README!
- Refactor internals to use a new intermediary metric struct to facilitate new plugins. Thanks gphat!
- The BUILD_DATE and VERSION variables can be set at link-time and are now exposed by the `/builddate` and `/version` endpoints.
- A new HyperLogLog implementation means `set`s are faster and allocate less memory. Thanks, martinpinto and seiflotfy!
- Introduced a new `metricSink` which provides a common interface for metric backends. In an upcoming release all plugins will be converted to this interface. Thanks gphat!
- `veneur-emit` no longer supports the `-config` argument, as it's no longer possible to reliably detect which statsd host/port to connect to. The `-hostport` option now takes a URL of the same form `statsd_listen_addresses` takes to explicitly tell it what address it should send to.
- `SSFSpanCollection` got removed, as it is superseded by the wire protocol. If you need to send multiple spans in bulk, we recommend setting up a buffered `trace.Client`!
- Veneur-emit can now time any shell command and emit its duration as a Timing metric. Thanks redsn0w422!
- Config options can now be provided via environment variables using envconfig for Veneur and veneur-proxy. Thanks gphat!
- SSF now includes a boolean `indicator` field for signaling that this span is useful as a Service Level Indicator for its service.
- A type `SSFSpanCollection` has been added but is not yet used.
- The `veneur-prometheus` command can be used to scrape prometheus endpoints and emit those metrics to Veneur. Thanks gphat and jvns
As a result of the efficiency improvements in this release, we've seen ~50% reduction in memory usage by way of measuring the allocated heap.
Secondly, the shift in not buffering spans on their way to LightStep should be noted. This changes behavior in Veneur which has traditionally done everything in 10s increments.
- If possible, initialization errors when starting Veneur will now be reported to Sentry. Thanks chimeracoder!
- Check return value of LightStep flush. Thanks chimeracoder!
- No longer using a fork of Logrus that fixed a race condition. Thanks chimeracoder!
- Updated to latest version of LightStep's tracing library, which drastically improves the success of spans on hosts with high span rates. Thanks gphat!
- No longer buffers spans for LightStep; they are dispatched directly to the LightStep client. Thanks gphat!
- Reuse an existing buffer when parsing incoming spans, reducing allocations. Thanks gphat!
- Use gogo protobuf for code generation of SSF's protobuf, resulting in faster span ingestion that uses less memory. Thanks gphat!
- veneur-proxy no longer balks at using static hosts for tracing and metrics. Thanks gphat!
- SSF's `operation` field has been deprecated in favor of the field `name`.
- SSF spans with a tag `name` will have that name placed into the SSF span `name` field until 2.0 is released.
- Correctly parse `Set` metrics if sent via SSF. Thanks redsn0w422!
- Return correct array of tags after parsing an SSF metric. Thanks redsn0w422!
- Don't panic if a packet doesn't have tags. Thanks redsn0w422!
- Fix a typo in the link to veneur-emit in the readme. Thanks vasi!
- Adds events and service checks to `veneur-emit`. Thanks redsn0w422!
- Switch to dep for managing the `vendor` directory. Thanks chimeracoder!
- Remove support for `govendor`. Thanks chimeracoder!
- Emit GC metrics at flush time. This feature is best used with Go 1.9, as previous versions cause GC pauses when collecting this information from Go. Thanks gphat!
- Allow configuration of LightStep's reconnect period via `trace_lightstep_reconnect_period` and the maximum number of spans it will buffer via `trace_lightstep_maximum_spans`. Thanks gphat!
- Added harmonic mean as an optional aggregate type. Use `hmean` as an option to the `aggregates` config option to enable. Thanks non!
- Added tests for `parseMetricSSF`. Thanks redsn0w422!
- Refactored `veneur-emit` flag usage to make testing easier. Thanks redsn0w422!
- Minor text fixes in the README. Thanks an-stripe!
- Restructured SSF parsing logic and added more tests. Thanks redsn0w422!
- Tag `packet.spans.received_total` with `service` from the span. Thanks chimeracoder!
- Better document how to configure Veneur as a DogStatsD replacement. Thanks gphat with assist from stangles!
- Fixed an error in graceful shutdown of the TCP listener. Thanks evanj!
- Don't hang if we call `log.Fatal` and we aren't hooked up to a Sentry. Thanks evanj!
- Fix flusher_test being called more than once resulting in flappy failure. Thanks evanj!
- Improve flusher test to not start Veneur, fixing flapping test. Thanks evanj!
- `veneur-emit` can now emit metrics using the SSF protocol. Thanks redsn0w422!
- Documentation for SSF. Thanks gphat!
- It is no longer required to emit a `sum` to get an `avg` when configuring what aggregations to emit for a histogram. Thanks cgilling!
- Tags added in the `tags` config key are now applied to trace spans. Thanks chimeracoder!
- Additional documentation for `veneur-proxy`. Thanks gphat!
- Revamped configuration file organization and comments. Thanks gphat!
- Changed some config keys to have more specific names to facilitate future refactoring. Thanks gphat!
- Adjust the flush loop to listen for server shutdown to improve test consistency. Thanks evanj!
- Veneur can now, experimentally, ingest metrics using the SSF protocol. Thanks redsn0w422!
- Re-resolve the LightStep trace flusher on each flush, accommodating Consul-based DNS use and preventing stale sinks. Thanks chimeracoder!
- The following configuration keys are deprecated and will be removed in version 2.0 of Veneur:
  - `datadog_api_key` replaces `key`
  - `datadog_api_hostname` replaces `api_hostname`
  - `datadog_trace_api_address` replaces `trace_api_address`
  - `ssf_address` replaces `trace_address`
- Require Go 1.8+ and stop building against 1.7. Thanks chimeracoder!
- Decrease logging level for proxy's "forwarded" messages. Thanks gphat!
- Failed discovery refreshes now log the service name. Thanks gphat!
- Proxy no longer requires a trace service name, since it's not wired up. Thanks gphat!
- No longer allow clients to pass in `nan`, `+inf` or `-inf` as a value for a metric, as this caused errors on flush. Thanks gphat!
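
The kind of check described above can be sketched as follows: parse the value, then reject anything that is not a finite number. The names `parseValue` and `errInvalidValue` are hypothetical, and this is only an approximation of the idea, not Veneur's parser.

```go
package main

import (
	"errors"
	"fmt"
	"math"
	"strconv"
)

// errInvalidValue is a hypothetical error for non-finite metric values.
var errInvalidValue = errors.New("metric value must be a finite number")

// parseValue parses a metric value and rejects NaN and infinities.
// strconv.ParseFloat accepts "nan", "+inf" and "-inf" (case-insensitively),
// so the explicit check afterwards is what actually rejects them.
func parseValue(raw string) (float64, error) {
	v, err := strconv.ParseFloat(raw, 64)
	if err != nil {
		return 0, err
	}
	if math.IsNaN(v) || math.IsInf(v, 0) {
		return 0, errInvalidValue
	}
	return v, nil
}

func main() {
	for _, raw := range []string{"1.5", "nan", "+inf", "-inf"} {
		v, err := parseValue(raw)
		fmt.Printf("%q: value=%v err=%v\n", raw, v, err)
	}
}
```
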
- Added `veneur-proxy` to provide HA features with consistent hashing. See the Proxy section of the README.
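
As background on the consistent hashing mentioned above, here is a minimal, standard-library-only hash ring. It is a generic sketch of the technique, not `veneur-proxy`'s implementation; the type `ring`, the FNV hash, the replica count, and the host names are all assumptions for illustration.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
	"strconv"
)

// ring is a minimal consistent-hash ring with virtual nodes.
type ring struct {
	points []uint32          // sorted hash points on the ring
	owners map[uint32]string // hash point -> host that owns it
}

func hashKey(s string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(s))
	return h.Sum32()
}

// newRing places `replicas` virtual nodes per host on the ring.
func newRing(hosts []string, replicas int) *ring {
	r := &ring{owners: make(map[uint32]string)}
	for _, host := range hosts {
		for i := 0; i < replicas; i++ {
			p := hashKey(host + "#" + strconv.Itoa(i))
			if _, taken := r.owners[p]; taken {
				continue // hash collision: the earlier host keeps the point
			}
			r.owners[p] = host
			r.points = append(r.points, p)
		}
	}
	sort.Slice(r.points, func(i, j int) bool { return r.points[i] < r.points[j] })
	return r
}

// get returns the host owning the first point at or after the key's hash,
// wrapping around to the start of the ring. It returns "" for an empty ring.
func (r *ring) get(key string) string {
	if len(r.points) == 0 {
		return ""
	}
	h := hashKey(key)
	i := sort.Search(len(r.points), func(i int) bool { return r.points[i] >= h })
	if i == len(r.points) {
		i = 0
	}
	return r.owners[r.points[i]]
}

func main() {
	// Hypothetical host names.
	r := newRing([]string{"veneur-1:8127", "veneur-2:8127", "veneur-3:8127"}, 64)
	for _, key := range []string{"api.requests", "db.latency"} {
		fmt.Printf("%s -> %s\n", key, r.get(key))
	}
}
```

The useful property is that removing a host only remaps the keys that host owned; every other key keeps its destination, so aggregation for most metrics is undisturbed when the set of hosts changes.
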
- Fix flusher_test to properly shutdown HTTP after handling. Thanks evanj!
- Verify that `trace_max_length_bytes` is properly set. Thanks evanj!
- Fix some race conditions in testing.
- Document performance cost of TLS with RSA and ECDH keys. Thanks evanj!
- Reduce logging of tracing information to `debug` level to decrease unnecessary logging.
- Reduce common TCP error logs to `info` level. Thanks evanj!
- Deal with server shutdown without inspecting error strings. Thanks evanj!
- Decrease the number of things we send to Sentry as "errors".
- Detect and emit a metric `veneur.packet.error_total` tagged `reason:toolong` for metrics that exceed the metric max length.
- Emit a metric `veneur.packet.error_total` tagged `reason:zerolength` for metrics that have no contents.
- Correct unnecessary allocation / goroutine in TCP connections that was leaking memory. Thanks evanj!
- Close idle TCP connections after 10 minutes. Thanks evanj!
- Fixed a lot of go lint errors.
- Add a metric `veneur.sentry.errors_total` for number of errors we send to Sentry.
- New plugin `flush_file` for writing metrics to a flat file.
- New `/healthcheck/tracing` endpoint that returns 200 if this Veneur instance is accepting traces.
- Refactor tests to use a more shareable test fixture. Thanks evanj!
- Refactor `Server`'s constructor to not start any goroutines and add a `Start()` that takes care of that, making for easier tests.
- Hostname and device name tags are now omitted from JSON generated for transmission to Datadog at flush time. Thanks evanj!
- Fix panic when an error is generated and Sentry is not configured. Thanks evanj!
- Fix typos in README
- Add `omit_empty_hostname` option. If true and `hostname` tag is set to empty, Veneur will not add a host tag to its own metrics. Thanks evanj!
- Support "all interfaces" addresses (`:1234`) for listening configuration. Thanks evanj!
- Add support for receiving statsd packets over authenticated TLS connections. Thanks evanj!
- [EXPERIMENTAL] Add InfluxDB support.
- [EXPERIMENTAL] Add support for ingesting traces and sending to Datadog's APM agent.