[Bug]: Errors during telemetry data flush - broken pipe #2915

Open
kn0x1c opened this issue Oct 28, 2024 · 3 comments
Labels
🐛 bug Something isn't working

Comments

@kn0x1c

kn0x1c commented Oct 28, 2024

Bug report

Hello,

I am using an integration that your tracer does not currently support, but it worked fine with previous versions.

My integration:
AWS Lambda with Vapor
PHP 8.2
Laravel 10.x

My issue is that, starting with tracer version 1.4.0, I get the following errors:

NOTICE: PHP message: [ddtrace] [error] Failed signaling lifecycle end: Os { code: 32, kind: BrokenPipe, message: "Broken pipe" }
NOTICE: PHP message: [ddtrace] [error] Failed flushing service data: Os { code: 32, kind: BrokenPipe, message: "Broken pipe" }
NOTICE: PHP message: [ddtrace] [error] Failed flushing telemetry buffer: Os { code: 32, kind: BrokenPipe, message: "Broken pipe" }
NOTICE: PHP message: [ddtrace] [error] Failed sending traces to the sidecar: Os { code: 32, kind: BrokenPipe, message: "Broken pipe" }
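
For context, `code: 32` is `EPIPE` on Linux: the writer tried to send on a pipe or socket whose other end had already closed. Below is a minimal sketch of that condition in Python, using a local socket pair. It only illustrates what the OS error means; it does not reproduce the tracer, and the idea that the receiving side here resembles the sidecar is an assumption, not something confirmed in this thread. The function name `write_to_closed_peer` is made up for the example.

```python
import errno
import socket


def write_to_closed_peer() -> OSError:
    """Write to a socket whose peer has already closed and return the error."""
    # socketpair() gives two connected stream sockets in the same process.
    writer, reader = socket.socketpair()
    # Assumption for illustration: the receiving process has gone away.
    reader.close()
    try:
        writer.sendall(b"telemetry payload")
    except OSError as exc:
        # The OS rejects the write because nobody can read it anymore.
        return exc
    finally:
        writer.close()
    raise RuntimeError("expected the write to fail")


if __name__ == "__main__":
    err = write_to_closed_peer()
    print(type(err).__name__, err.errno == errno.EPIPE)
```

The same error would be expected on every later write as well, which matches the pattern of several different flush steps failing with an identical message.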

The odd part is that it still sends data: metrics, traces, etc. all appear in the UI.

I spent a couple of hours debugging it and tried the following:
Upgrading / downgrading the AWS Lambda layer to v60-alpine and v65-alpine
Upgrading PHP from 8.2 to 8.3
Upgrading / downgrading the tracer to v1.0.0, v1.3.0, v1.3.2, v1.4.0, v1.4.2

So far I have been able to narrow it down: the issue is present in every tracer version from v1.4.0 onward (v1.4.x), regardless of whether PHP is 8.2 or 8.3 and regardless of which AWS Lambda layer (agent) version I am using.

I reviewed the changes and saw that you switched to the sidecar trace sender (I assume also for trace metrics) in tracer v1.4.0, but only for PHP 8.3. However, I also get the issue on 8.2, so I am not sure that this is the problem, although it would explain a lot.

The application also seems to be MUCH slower, by 2-3x.

Latest working configuration:
PHP: 8.2/8.3
AWS Lambda Layer: v65-alpine

Question
Is there something I can do, for example enable something in the agent (AWS Lambda layer) such as socket-based communication, so that I can get the new tracer running in Lambda?
I know you are not supporting it yet, but it worked flawlessly in previous versions, and we love using Datadog. We even rolled the AWS Lambda integration we built out to production with no issues!

Tech Stuff
Dockerfile (not working using tracer 1.4.0)

FROM laravelphp/vapor:php82

# Add DataDog APM tracer
RUN apk add tar gzip libgcc \
&&  curl -LO --retry 3 https://github.com/DataDog/dd-trace-php/releases/download/1.4.0/datadog-setup.php \
&&  ln -s /sbin/ldconfig /usr/local/bin/ldconfig \
&&  php datadog-setup.php --php-bin=all \
&&  rm -f datadog-setup.php

# Load datadog agent layer
COPY --from=public.ecr.aws/datadog/lambda-extension:65-alpine /opt/. /opt/

COPY . /var/task

PHP version

8.2

Tracer or profiler version

1.4.0

Installed extensions

No response

Output of phpinfo()

No response

Upgrading from

Tracer 1.3.2

@kn0x1c kn0x1c added the 🐛 bug Something isn't working label Oct 28, 2024
@kn0x1c
Author

kn0x1c commented Oct 28, 2024

Update:
I saw that you already identified such an issue with Lambda:
#2904

However, I tried the following combinations:
PHP: 8.2 / 8.3
Agent Lambda Layer: 65-alpine
Tracer: 1.4.2 (this version should include the disabling check from the above PR)

=> It still throws the same errors, even though it runs in an AWS Lambda function with the environment variable AWS_LAMBDA_FUNCTION_NAME set

[ddtrace] [error] Failed flushing telemetry buffer: Os { code: 32, kind: BrokenPipe, message: "Broken pipe" }
[ddtrace] [error] Failed flushing service data: Os { code: 32, kind: BrokenPipe, message: "Broken pipe" }
[ddtrace] [error] Failed signaling lifecycle end: Os { code: 32, kind: BrokenPipe, message: "Broken pipe" }

However, the trace error seems to be gone. Would it be possible to extend the conditional sending via the sidecar to telemetry, service, and lifecycle data as well?

@kn0x1c
Author

kn0x1c commented Oct 31, 2024

Update: We rolled out to production, and our performance consistently dropped by a factor of 3-5x.
Image

The tracer is now pinned to 1.3.2.

@TophrC-dd

Hey @kn0x1c -- My name is Topher, and I am a senior escalations engineer specializing in serverless here at Datadog. I am sorry you're seeing this performance issue. Would it be possible for you to provide a bare-bones Vapor project for us to work with so we can investigate further?

Best,
~Topher
