[connector/datadog] Update README for accuracy (#35121)

**Description:** The current description of the Datadog connector implies that it is only useful in the presence of sampling. However, its use is actually required to see trace-emitting services and their statistics in Datadog APM. This PR rewords the README to reflect that more clearly. I also fixed some indentation issues in the provided example.

**Link to tracking Issue:** No tracking issue on GitHub. Internal Jira issue: OTEL-1776

---------

Co-authored-by: Pablo Baeyens
open-telemetry · Sep 11, 2024 · a1a77a5
1 parent 5cd3cd0
commit a1a77a5
Showing 1 changed file with 14 additions and 53 deletions.
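
As a reading aid, the example that results from this change can be sketched as a single configuration file. It is assembled from the added and context lines of the diff below and is not the README's verbatim text. The `otlp` receiver settings and the empty `batch` entry are assumptions, filled in so that every pipeline reference resolves. The `api.key` line is taken from the removed "After" block, on the assumption that it is unchanged in the lines the diff elides.

```yaml
receivers:
  # Assumption: the README elides the receiver definition;
  # a default OTLP receiver is used here for illustration.
  otlp:
    protocols:
      grpc:
      http:

processors:
  # Assumption: batch is referenced by the pipelines, but its
  # definition is not shown in the diff.
  batch:
  probabilistic_sampler:
    sampling_percentage: 20

connectors:
  # The connector is declared once and then referenced from pipelines.
  datadog/connector:

exporters:
  datadog:
    api:
      # Taken from the removed "After" block; assumed unchanged.
      key: ${env:DD_API_KEY}

service:
  pipelines:
    # Unsampled traces: the connector sees every span.
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [datadog/connector]

    # Traces forwarded by the connector, sampled before export.
    traces/2:
      receivers: [datadog/connector]
      processors: [batch, probabilistic_sampler]
      exporters: [datadog]

    # APM stats computed by the connector from the full trace stream.
    metrics:
      receivers: [datadog/connector]
      processors: [batch]
      exporters: [datadog]
```

Note that `datadog/connector` appears as an exporter in `traces` and as a receiver in both `traces/2` and `metrics`. This is how the connector sits upstream of the sampler while feeding both downstream pipelines.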
diff --git a/connector/datadogconnector/README.md b/connector/datadogconnector/README.md
@@ -25,29 +25,22 @@
 
 ## Description
 
-The Datadog Connector is a connector component that computes Datadog APM Stats pre-sampling in the event that your traces pipeline is sampled using components such as the tailsamplingprocessor or probabilisticsamplerprocessor.
+The Datadog Connector is a connector component that derives APM statistics, in the form of metrics, from service traces, for display in the Datadog APM product. This component is *required* for trace-emitting services and their statistics to appear in Datadog APM.
 
-The connector is most applicable when using the sampling components such as the [tailsamplingprocessor](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/tailsamplingprocessor#tail-sampling-processor), or the [probabilisticsamplerprocessor](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/probabilisticsamplerprocessor) in one of your pipelines. The sampled pipeline should be duplicated and the `datadog` connector should be added to the the pipeline that is not being sampled to ensure that Datadog APM Stats are accurate in the backend.
+The Datadog connector can also forward the traces passed into it into another trace pipeline. Notably, if you plan to sample your traces with the [tailsamplingprocessor](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/tailsamplingprocessor#tail-sampling-processor) or the [probabilisticsamplerprocessor](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/probabilisticsamplerprocessor), you should place the Datadog connector upstream to ensure that the metrics are computed before sampling, ensuring their accuracy. An example is given below.
 
 ## Usage
 
-To use the Datadog Connector, add the connector to one set of the duplicated pipelines while sampling the other. The Datadog Connector will compute APM Stats on all spans that it sees. Here is an example on how to add it to a pipeline using the [probabilisticsampler]:
-
-<table>
-<tr>
-<td> Before </td> <td> After </td>
-</tr>
-<tr>
-<td valign="top">
-
 ```yaml
 # ...
 processors:
   # ...
   probabilistic_sampler:
     sampling_percentage: 20
-  # add the "datadog" processor definition
-  datadog:
+
+connectors:
+  # add the "datadog" connector definition and further configurations
+  datadog/connector:
 
 exporters:
   datadog:
@@ -58,53 +51,21 @@ service:
   pipelines:
     traces:
       receivers: [otlp]
-      # prepend it to the sampler in your pipeline:
-      processors: [batch, datadog, probabilistic_sampler]
+      processors: [batch]
+      exporters: [datadog/connector]
+
+    traces/2: # this pipeline uses sampling
+      receivers: [datadog/connector]
+      processors: [batch, probabilistic_sampler]
       exporters: [datadog]
 
     metrics:
-      receivers: [otlp]
+      receivers: [datadog/connector]
       processors: [batch]
       exporters: [datadog]
 ```
 
-</td><td valign="top">
-
-```yaml
-# ...
-processors:
-  probabilistic_sampler:
-    sampling_percentage: 20
-
-connectors:
-    # add the "datadog" connector definition and further configurations
-    datadog/connector:
-
-exporters:
-  datadog:
-    api:
-      key: ${env:DD_API_KEY}
-
-service:
-  pipelines:
-   traces:
-     receivers: [otlp]
-     processors: [batch]
-     exporters: [datadog/connector]
-
-   traces/2: # this pipeline uses sampling
-     receivers: [datadog/connector]
-     processors: [batch, probabilistic_sampler]
-     exporters: [datadog]
-
-  metrics:
-    receivers: [datadog/connector]
-    processors: [batch]
-    exporters: [datadog]
-```
-</tr></table>
-
-Here we have two traces pipelines that ingest the same data but one is being sampled. The one that is sampled has its data sent to the datadog backend for you to see the sampled subset of the total traces sent across. The other non-sampled pipeline of traces sends its data to the metrics pipeline to be used in the APM stats. This unsampled pipeline gives the full picture of how much data the application emits in traces.
+In this example configuration, incoming traces are received through OTLP, and processed by the Datadog connector in the `traces` pipeline. The traces are then forwarded to the `traces/2` pipeline, where a sample of them is exported to Datadog. In parallel, the APM stats computed from the full stream of traces are sent to the `metrics` pipeline, where they are exported to Datadog as well.
 
 ## Configurations