Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Loosen IP address scrubbing in span.domain tag #3572

Open
gggritso opened this issue May 9, 2024 · 8 comments
Open

Loosen IP address scrubbing in span.domain tag #3572

gggritso opened this issue May 9, 2024 · 8 comments
Labels
enhancement New feature or request

Comments

@gggritso
Copy link
Member

gggritso commented May 9, 2024

Problem Statement

IP addresses like 8.8.8.8 get fully scrubbed out to *.*.*.*. When users look at the requests module, all requests to IP addresses get mashed together.

Solution Brainstorm

  • no scrubbing at all (IPv4 addresses have a known cardinality. It's only 4 billion! That's fine, right? I'm kidding)
  • partial scrubbing (e.g., 8.8.8.8 becomes *.*.8.8 or *.*.*.8 or 8.8.*.*)
  • customizable IP scrubbing allowlist (let people add IPs that don't get scrubbed)
  • allow self-hosted users to turn IP scrubbing off (people who have their own infra might want to turn this off, and bear the weight of high cardinality)

Product Area

Ingestion and Filtering

@gggritso gggritso changed the title Loosen IP address scrubbing in Requests module Loosen IP address scrubbing in span.domain tag May 9, 2024
@gggritso gggritso transferred this issue from getsentry/sentry May 9, 2024
@gggritso gggritso added the enhancement New feature or request label May 9, 2024
@jjbayer
Copy link
Member

jjbayer commented May 10, 2024

customizable IP scrubbing allowlist (let people add IPs that don't get scrubbed)

I wonder if we could get to an "80%" solution by hardcoding a list of"special" IPs. E.g. reserved IPs, DNS servers, .... Unless this does not cover the majority of the cases.

@aldy505
Copy link

aldy505 commented May 11, 2024

I'd like to see private IPs being not redacted on SaaS and self-hosted. For public IPs, I think it's better if it's redacted on SaaS and not redacted on self-hosted. Since on self-hosted, it's a different environment, it's only possible that the outgoing request is made only for those defined on the tracePropagationTargets config (from the sdk side).

@aldy505
Copy link

aldy505 commented Jul 6, 2024

Hey @jjbayer @gggritso seeing this PR I think I can implement this one.. partially. Right now we only allow 127.0.0.1 as seen on here:

Ipv4Addr::LOCALHOST => "127.0.0.1",

But considering what @jjbayer said, it's a good idea to whitelist "special IPs", with what I had in mind are:

  • 192.168.0.0–192.168.255.255 (65,536 addresses)
  • 10.0.0.0–10.255.255.255 (16,777,216 addresses)
  • 100.64.0.0–100.127.255.255 (4,194,304 addresses)
  • 172.16.0.0–172.31.255.255 (1,048,576 addresses)
  • 192.0.0.0–192.0.0.255 (256 addresses)
  • `198.18.0.0–198.19.255.255 (131,072 addresses)

Private IPv6 address has higher address count, so I believe that's out of the question. For popular IPs like DNS servers and such, I don't think there's any good use case for it to be included in a span. Although I might be wrong here.

@Dav1dde
Copy link
Member

Dav1dde commented Jul 8, 2024

I think we should just use Ipv4Addr::is_private, if we decide to implement it.

@jjbayer
Copy link
Member

jjbayer commented Jul 8, 2024

Private addresses can still be high cardinality, so I wouldn't unconditionally allowlist them all. IMO the simplest thing we can start with is add a global option that allows self-hosted users to disable IP scrubbing. @gggritso what do you think?

@aldy505
Copy link

aldy505 commented Jul 8, 2024

Private addresses can still be high cardinality, so I wouldn't unconditionally allowlist them all.

I'm wondering about how "high" is the high cardinality for you?

IMO the simplest thing we can start with is add a global option that allows self-hosted users to disable IP scrubbing. @gggritso what do you think?

Yes, this should also work, since self-hosted has known range. But the default should be set to false (disabled) though.

@jjbayer
Copy link
Member

jjbayer commented Jul 8, 2024

I'm wondering about how "high" is the high cardinality for you?

Can't give you an exact number, but the purpose of the scrubbing is to merge spans into "meaningful" groups. So accepting every single private IP as a separate group would be too fine grained IMO, but we could do partial scrubbing for private IPs like the issue description proposes (e.g. 192.168.*.*).

@gggritso
Copy link
Member Author

gggritso commented Jul 8, 2024

We haven't heard a lot of complaints about this issue lately, so I don't have a strong opinion. I think starting with allowing self-hosted users to turn off scrubbing completely feels like a good solution even though it'll let people degrade their own experience with Sentry. Either that, or an IP allowlist for self-hosted folks seems like it'd solve a lot of these problems!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
Status: No status
Development

No branches or pull requests

4 participants