Fine-grained sampling rate per checkpoint #156

ramboz · 2024-05-15T22:02:56Z

Is your feature request related to a problem? Please describe.

When running experiments, we want to converge on a winner quickly. Over the last year we've seen that the default 1/100 sampling rate is not enough to reach statistical significance within 2 weeks for most customers. Since the start of the year we've raised the sampling rate to 1/10 on pages that run an experiment, but this triggered a few issues:

1. Lack of proper API: the RUM library does not offer an easy API to control this, so we had to modify the weight after its initialization, which leads to issue 2.
2. Breaking session integrity: since we dynamically change the weight only on pages that run experiments, we end up doing this sometime in the eager phase, after some of the checkpoints have already been passed (top, load, possibly others). Some sessions are therefore missing those 2 events, which breaks the integrity of the event lifecycle for that session.
3. Side effect on other checkpoints: once the experimentation checkpoint changes the weight, it's a global change that also impacts all events happening afterwards. We suddenly end up sampling block views, resource loads, and media views at 1/10 as well, which increases cost and noise without adding any value for our use case.
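To make issues 2 and 3 concrete, here is a minimal sketch of the failure mode. It uses a simplified model, not the actual RUM library: one random draw per page view, a mutable global weight, and a checkpoint that is recorded when `draw * weight < 1`. `createSampler`, `state`, and the `viewblock`/`viewmedia` checkpoint names are assumptions made for illustration.

```javascript
// Simplified model (NOT the real RUM library): a single random draw per
// page view and a global, mutable weight shared by every checkpoint.
function createSampler(draw, weight = 100) {
  const state = { weight, sent: [] };
  const sampleRUM = (checkpoint) => {
    // A checkpoint is recorded only if the page view is selected
    // under the weight that is current at the time of the call.
    if (draw * state.weight < 1) state.sent.push(checkpoint);
  };
  return { state, sampleRUM };
}

// A draw of 0.05 is selected at 1/10 but not at 1/100.
const { state, sampleRUM } = createSampler(0.05);

sampleRUM('top');        // dropped: weight is still 100
sampleRUM('load');       // dropped: weight is still 100
state.weight = 10;       // experiment code changes the global weight mid-page
sampleRUM('experiment'); // recorded
sampleRUM('viewblock');  // recorded as a side effect
sampleRUM('viewmedia');  // recorded as a side effect

console.log(state.sent); // [ 'experiment', 'viewblock', 'viewmedia' ]
```

In this model the session has no `top` or `load` event, and every checkpoint after the weight change is sampled at 1/10 even though only `experiment` needed it.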

Describe the solution you'd like

Ideally, the sampleRUM object would expose an API to change the sampling rate for a specific checkpoint, so that only that checkpoint (or a few, like experiment and convert) is impacted, and not the others.

Something along these lines:

sampleRUM('experimentation', { source: 'experiment-a', target: 'variant-1' }, 10);

and/or

sampleRUM.sampleAt('experiment', 10);
sampleRUM.sampleAt('convert', 10);
sampleRUM('experiment', { source: 'experiment-a', target: 'variant-1' });

The second approach doesn't tightly couple the 2 events: experiments can raise the sampling rate for conversions so that both align on the pages that need it, without impacting pages that only track conversions and run no experiments.
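As a rough illustration of the second approach, the sketch below implements a per-checkpoint override on top of the same simplified selection model (one random draw per page view, recorded when `random * weight < 1`). This is a hypothetical implementation of the proposed `sampleAt` API, not code from the library; `createSampleRUM`, the `sent` array, and the `viewblock` checkpoint name are assumptions.

```javascript
// Hypothetical sketch of the proposed API; not the actual RUM library.
function createSampleRUM({ defaultWeight = 100, random = Math.random() } = {}) {
  const overrides = new Map(); // checkpoint name -> custom weight
  const sent = [];

  function sampleRUM(checkpoint, data = {}) {
    // Use the per-checkpoint weight if one was registered, else the default.
    const weight = overrides.get(checkpoint) ?? defaultWeight;
    if (random * weight < 1) sent.push({ checkpoint, weight, ...data });
  }

  // Register a custom weight for a single checkpoint only.
  sampleRUM.sampleAt = (checkpoint, weight) => {
    overrides.set(checkpoint, weight);
  };
  sampleRUM.sent = sent;
  return sampleRUM;
}

// Fixed draw for a reproducible example: selected at 1/10, not at 1/100.
const sampleRUM = createSampleRUM({ random: 0.05 });
sampleRUM.sampleAt('experiment', 10);
sampleRUM.sampleAt('convert', 10);

sampleRUM('top');
sampleRUM('experiment', { source: 'experiment-a', target: 'variant-1' });
sampleRUM('viewblock');
sampleRUM('convert');

console.log(sampleRUM.sent.map((e) => e.checkpoint)); // [ 'experiment', 'convert' ]
```

Because the same draw is reused for every checkpoint, page views selected at 1/100 are a subset of those selected at 1/10, so `experiment` and `convert` stay aligned while other checkpoints keep the default rate.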

Describe alternatives you've considered

1. Setting a global object dynamically in the head.html, like window.RUM_SAMPLING_RATE = 10, before aem.js/lib-franklin.js is loaded, so we can adjust the default sampling before the first events are triggered.
   - This leaks JS logic into the head and decouples it from the "plugins" that actually require it, which makes instrumentation harder and leaks plugin details into the global code:
     ```
     <script>
     window.RUM_SAMPLING_RATE = document.head.querySelector('meta[name="experiment"]') ? 10 : 100;
     </script>
     ```
   - We still don't address the side effects on other checkpoints and only increase the cost and noise even further.
2. Resetting the sampling rate after the experiment checkpoint is fired.
   - There is still no guarantee that other checkpoints haven't been fired with the custom sampling rate, so we don't fully address the cost/noise issues.
   - Since experiment is tightly coupled with convert, we would also need to wait for the first conversion to happen, otherwise we can't compute the winner.


ramboz · 2024-05-29T23:14:57Z

After discussing with @trieloff, we decided to go with the first alternative for now.

…ific use cases

This introduces a new global variable, `window.RUM_SAMPLING_RATE`, that can be set before loading the library to increase the sampling rate for specific use cases that require more data collected for short-term reporting. For instance:

- when running an experiment in a 2-week time frame and wanting to achieve statistical significance even with low traffic
- when running a short-lived marketing campaign and wanting to collect enough data over a single weekend

## Usage

For instance:

```js
window.SAMPLE_PAGEVIEWS_AT_RATE = 'high';
```

Or in an HTML context:

```html
<head>
  <meta name="experiment" content="Foo"/>
  <meta name="experiment-variants" content="/bar,/baz"/>
  <script>window.SAMPLE_PAGEVIEWS_AT_RATE = document.head.querySelector('meta[name="experiment"]') ? 'high' : null</script>
  <script src="/scripts/aem.js" type="module"></script>
  <script src="/scripts/scripts.js" type="module"></script>
  <link rel="stylesheet" href="/styles/styles.css">
</head>
```

## Related Issues

Fix #156
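For readers wondering how such a flag could be consumed, here is a minimal sketch. It is not the library's implementation: `resolveWeight` is a hypothetical helper, and the mapping of `'high'` to a weight of 10 is an assumption based on the "up to 1/10" wording and the default 1/100 rate mentioned in this issue.

```javascript
// Hypothetical helper, not part of the RUM library.
// Assumption: 'high' maps to a weight of 10 (1/10); anything else keeps
// the default weight of 100 (1/100).
function resolveWeight(globalObj, defaultWeight = 100) {
  return globalObj.SAMPLE_PAGEVIEWS_AT_RATE === 'high' ? 10 : defaultWeight;
}

// The global object is passed in explicitly so the sketch runs outside a
// browser; in a page it would be `window`.
console.log(resolveWeight({ SAMPLE_PAGEVIEWS_AT_RATE: 'high' })); // 10
console.log(resolveWeight({ SAMPLE_PAGEVIEWS_AT_RATE: null }));   // 100
console.log(resolveWeight({}));                                   // 100
```

Reading the flag once, before the first checkpoint fires, is what avoids the mid-session weight change described in the original report.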

# [2.3.0](v2.2.0...v2.3.0) (2024-08-19)

### Features

* allow increasing the sampling rate up to 1/10 for specific use cases ([f96e713](f96e713))
* **minirum:** allow increasing the sampling rate up to 1/10 for specific use cases ([05f6da7](05f6da7)), closes [#156](#156)

adobe-bot · 2024-08-19T17:21:36Z

🎉 This issue has been resolved in version 2.3.0 🎉

The release is available on:

Your semantic-release bot 📦🚀

ramboz added the enhancement New feature or request label May 15, 2024

ramboz mentioned this issue May 29, 2024

feat(minirum): allow increasing the sampling rate up to 1/10 for specific use cases #159

Merged

trieloff closed this as completed Jun 5, 2024

adobe-bot added the released label Aug 19, 2024

kptdobe mentioned this issue Aug 22, 2024

Corrupted rum-standalone #194

Closed
