
feat(sentryapps) Add RPC caching to installation for organization call #80488

Merged: 6 commits into master from chore-installed-for-org-caching, Nov 19, 2024

Conversation

markstory (Member) commented:

get_installed_for_organization is our highest-volume RPC call, as it is part of post_process_group. Its results also rarely change, as sentry-app installation rates are low compared to ingest volumes.

Caching the response of this RPC call should help reduce tail latencies for post_process_group and free up control silo webworkers.

This is our highest-frequency RPC call right now. The results of this
RPC call don't often change (as orgs infrequently set up new sentry apps).

These changes add a new RPC call with caching wrappers that removes
a significant number of RPC calls from error ingestion tasks.
The sentry_apps method requires a new-style caching adapter.
Move the option read into the wrapper method and update call sites to
call the wrapper method. This will simplify cleanup.
@markstory markstory requested review from a team as code owners November 8, 2024 21:45
@github-actions github-actions bot added the Scope: Backend Automatically applied to PRs that change backend components label Nov 8, 2024
Comment on lines +82 to +85
# Route an option-controlled fraction of traffic through the new cached wrapper.
if in_random_rollout("app_service.installations_for_org.cached"):
    return get_installations_for_organization(organization_id)
else:
    return self.get_installed_for_organization(organization_id=organization_id)
markstory (Member, Author) commented:

Once these changes are fully deployed, I plan on using this option to shift load onto caching and hope to observe changes in RPC throughput and post_process task duration.
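For context on how this kind of gate behaves, here is a minimal sketch of an option-backed random rollout check. This is an illustrative stand-in only; the OPTIONS dict and the implementation details are assumptions, not Sentry's actual in_random_rollout.

```python
import random

# Hypothetical options store; Sentry reads rollout rates from its options system.
OPTIONS: dict[str, float] = {"app_service.installations_for_org.cached": 0.25}


def in_random_rollout(option_name: str) -> bool:
    """Return True for roughly the configured fraction of calls.

    The option value is a rollout rate between 0.0 (fully off) and
    1.0 (fully on), so raising the option shifts more traffic onto
    the cached code path.
    """
    rate = OPTIONS.get(option_name, 0.0)
    return random.random() < rate
```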


codecov bot commented Nov 8, 2024

Codecov Report

Attention: Patch coverage is 93.93939% with 4 lines in your changes missing coverage. Please review.

✅ All tests successful. No failed tests found.

Files with missing lines                          Patch %   Lines
src/sentry/hybridcloud/rpc/caching/service.py     93.18%    1 Missing and 2 partials ⚠️
src/sentry/sentry_apps/services/app/service.py    92.30%    1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           master   #80488       +/-   ##
===========================================
+ Coverage   57.53%   78.46%   +20.93%     
===========================================
  Files        7199     7221       +22     
  Lines      319053   319642      +589     
  Branches    43912    43971       +59     
===========================================
+ Hits       183561   250815    +67254     
+ Misses     130738    62441    -68297     
- Partials     4754     6386     +1632     

When the cache read returns no data, the wrapped function will be
invoked, and its result is then stored in the cache.

Ideal for 'get many X for organization' style methods
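For context, the quoted docstring describes a read-through cache with a single key per id. A minimal sketch of that pattern, assuming a Django-style cache backend; the decorator name and key scheme here are illustrative, not the PR's actual adapter:

```python
import functools
from typing import Callable, TypeVar

from django.core.cache import cache  # illustrative backend choice

R = TypeVar("R")


def back_with_list_cache(
    key_prefix: str, timeout: int = 300
) -> Callable[[Callable[[int], list[R]]], Callable[[int], list[R]]]:
    """Cache 'get many X for organization' style methods under one key per id."""

    def decorator(func: Callable[[int], list[R]]) -> Callable[[int], list[R]]:
        @functools.wraps(func)
        def wrapper(object_id: int) -> list[R]:
            key = f"{key_prefix}:{object_id}"
            cached = cache.get(key)
            if cached is not None:
                return cached
            # Cache miss: invoke the wrapped function (the RPC call)
            # and store the whole result list for next time.
            result = func(object_id)
            cache.set(key, result, timeout)
            return result

        return wrapper

    return decorator
```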
Contributor commented:

Q: How is this class different from SiloCacheManyBackedCallable? It seems like they have the same purpose.

Could we adapt the result processing in SiloCacheManyBackedCallable to process a list of objects? Would that lead to a lot of downstream refactoring? I guess I'm trying to understand the +/- of each approach. :thonk

markstory (Member, Author) commented:

> Q: How is this class different from SiloCacheManyBackedCallable? It seems like they have the same purpose.

SiloCacheManyBackedCallable has a similar return value, but different parameters and behavior. The CacheMany decorator takes a list of ids and attempts a multi-get on the cache. If the results from cache have 'holes', it emits a single RPC request to fetch the balance. Each id maps to a different cache key so that we can share cache results with the get_by_id style cache.

The new caching decorator takes a single id as a parameter and returns a list of results. There is only a single cache key, and the result is always a list. The two approaches could probably be combined, but I was concerned that a shared code path would just move the complexity into if/else logic.
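To make the contrast concrete, here is a rough sketch of the CacheMany-style flow described above: per-id keys, one multi-get, and a single RPC for the holes. The function and parameter names are hypothetical, not the PR's code:

```python
from typing import Callable

from django.core.cache import cache  # illustrative backend choice


def cached_get_many(
    ids: list[int],
    fetch_many: Callable[[list[int]], list[dict]],
    key_prefix: str,
    timeout: int = 300,
) -> list[dict]:
    """Multi-get from cache, then fetch only the missing ids with one RPC."""
    # One cache key per id, so entries can be shared with get_by_id caching.
    keys = {object_id: f"{key_prefix}:{object_id}" for object_id in ids}
    cached = cache.get_many(list(keys.values()))
    results = {
        object_id: cached[key] for object_id, key in keys.items() if key in cached
    }
    # The 'holes': ids the cache could not satisfy.
    missing = [object_id for object_id in ids if object_id not in results]
    if missing:
        # Single RPC for the balance; assumes results come back in
        # the same order as `missing`.
        for object_id, record in zip(missing, fetch_many(missing)):
            cache.set(keys[object_id], record, timeout)
            results[object_id] = record
    return [results[object_id] for object_id in ids]
```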

Contributor commented:

I don't think we'd have to add too much complexity to CacheMany? If we input a list of one org_id (list[int]), then CacheMany would return a list[list[SentryAppInstallation]]; the added complexity is that we'd need to serialize a list of lists instead of calling record.json. Am I misunderstanding how CacheMany works, and/or where would the if/else logic be added?

Or is it kind of janky because they have separate "definitions", i.e. CacheMany is list[X] -> list[Y] whereas the new decorator is more X -> list[Y]?

markstory (Member, Author) commented:

> Or is it kind of janky because they have separate "definitions", i.e. CacheMany is list[X] -> list[Y] whereas the new decorator is more X -> list[Y]?

I think this could contribute to jank/complexity. The internals of resolve_from would become a big if/else block, which led me to keep them as separate classes.
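For illustration, a combined class would need resolve_from to branch on its input shape, roughly like this (hypothetical sketch; _resolve_many and _resolve_single are made-up helper names, not code from the PR):

```python
class CombinedCacheCallable:
    """Hypothetical merge of SiloCacheManyBackedCallable and the new
    list-caching decorator, showing the branching being discussed."""

    def resolve_from(self, ids_or_id: int | list[int]) -> list:
        if isinstance(ids_or_id, list):
            # CacheMany behaviour: per-id cache keys, multi-get,
            # then a single RPC to fill any holes.
            return self._resolve_many(ids_or_id)
        # New decorator behaviour: one cache key holding the whole
        # list of results for a single id.
        return self._resolve_single(ids_or_id)

    def _resolve_many(self, ids: list[int]) -> list: ...

    def _resolve_single(self, object_id: int) -> list: ...
```

Keeping the two classes separate avoids this branching at the cost of some duplicated cache plumbing.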

Contributor commented:

I kind of see, but the logic between resolve_from and get_many is not that different if we're only working with one id (??). I do agree that trying to refactor one function to work with another definition is not the best, though.

markstory (Member, Author) commented:

> I kind of see, but the logic between resolve_from and get_many is not that different if we're only working with one id

Yeah, the logic there would be similar. I felt that duplicating the code would result in logic that is easier to read and change than sharing it through inheritance, though.

An outdated review comment on src/sentry/hybridcloud/rpc/caching/service.py was resolved.
@markstory markstory merged commit 15cb8de into master Nov 19, 2024
50 checks passed
@markstory markstory deleted the chore-installed-for-org-caching branch November 19, 2024 15:15
andrewshie-sentry pushed a commit that referenced this pull request Nov 19, 2024
markstory added a commit that referenced this pull request Nov 21, 2024
With all traffic using the cached method call, we can remove the deprecated RPC call.

Refs #80488

sentry-io bot commented Nov 22, 2024

Suspect Issues

This pull request was deployed and Sentry observed the following issues:

  • ‼️ OperationalError: OperationalError('server closed the connection unexpectedly\n\tThis probably means the server ter... sentry.sentry_apps.tasks.sentry_apps.process_re...

