-
Notifications
You must be signed in to change notification settings - Fork 133
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Commas in tag values are not escaped #83
Comments
FYI we implemented our custom sanitizers for tags which follows the best practices from DataDog documentation and I believe that we can create PR after maintainers of this repo confirm that sanitizers should be in this package. |
Hi there, Feel free to open a PR, sanitizing the tags makes sense to me. I am a bit worried about the potential performance hit, make sure to mesure it if you create one ;). Arthur |
I referenced this from an OpenTelemetry metrics SDK issue in which we are looking to address the same problem. I started poking around hoping that DataDog had already set a convention for how to encode tag values with commas in them. @arbll the problem is that your agent will have to understand the convention that is used, so this issue can't be solved in the client alone. How does DataDog intend to support tag values with commas in them? |
Hi @jmacd,
I think in this library it would make sense to sanitize by either removing commas or replacing them by underscores. |
@arbll The document linked above, which I believe you are quoting from, describes naming tags, i.e., the valid set of characters permissible in a tag name. Restricting tag names is fine. It does not mention a restriction on possible tag values. Anyway, in as much as I am currently responsible for the OpenTelemetry metrics specification, I will not approve any such sanitization of tag values and if this is something Datadog won't spec out, we'll probably end up creating a new specification or not supporting dogstatsd out of the box. One way to spec this out would be to follow the W3C specification for trace context, which specifies how to encode distributed context tags in URL encoding. I.e., commas would be encoded using %2C. |
Here is the specification I refer to: https://w3c.github.io/correlation-context/#value-format |
Please correct if I'm wrong — from what I see, DataDog Agent doesn't have "tag name" and "tag value" distinction for metrics, those are parsed [0] directly into Note that according to statsd/dogstatsd spec, "tag names" are not unique, so as per "valid list of tags" example above, tag [0] https://github.com/DataDog/datadog-agent/blob/master/pkg/dogstatsd/parser.go#L85 |
Tags are a slice of strings in the documentation, but datadog-go documentation does not mention valid values for tags.
datadog-go
removes newlines from tags, but does not remove/escape other invalid characters like comma (,
). Comma in tag value will be interpreted as multiple tags since tags are comma-separated in the UDP datagrams.The page at https://docs.datadoghq.com/developers/guide/what-best-practices-are-recommended-for-naming-metrics-and-tags/ lists recommeded practice for naming tags, perhaps datadog-go should provide a function to sanitize the tag values according to those rules?
The text was updated successfully, but these errors were encountered: