You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Describe the bug
In our setup a main PHP CLI process triggered by cron is creating sub PHP processes to do work parallelly. We have observed that after some time, there are multiple child processes that are stuck. Their unit of work is completed, but they are not terminating. Upon further investigating these processes, we have seen that they are waiting for a lock to be released. After reproducing the issue locally we have seen in the traces that the PHP CLI process hangs on the PeriodicTaskExecutor class destructor's thread_.join() call.
strace outpout of the stuck process
strace: Process 488 attached
futex(0x7f66c8bff910, FUTEX_WAIT_BITSET|FUTEX_CLOCK_REALTIME, 492, NULL, FUTEX_BITSET_MATCH_ANY
gbd backtrace
(gdb) backtrace
#0 __futex_abstimed_wait_common64 (private=128, cancel=true, abstime=0x0, op=265, expected=492, futex_word=0x7f66c8bff910) at ./nptl/futex-internal.c:57
#1 __futex_abstimed_wait_common (cancel=true, private=128, abstime=0x0, clockid=0, expected=492, futex_word=0x7f66c8bff910) at ./nptl/futex-internal.c:87
#2 __GI___futex_abstimed_wait_cancelable64 (futex_word=futex_word@entry=0x7f66c8bff910, expected=492, clockid=clockid@entry=0, abstime=abstime@entry=0x0, private=private@entry=128) at ./nptl/futex-internal.c:139
#3 0x00007f66d0636624 in __pthread_clockjoin_ex (threadid=140079431415360, thread_return=0x0, clockid=0, abstime=0x0, block=<optimized out>) at ./nptl/pthread_join_common.c:105
#4 0x00007f66c93d9a63 in __gthread_join (__value_ptr=0x0, __threadid=<optimized out>) at /home/build/prerequisities/gcc-12.2.0/x86_64-linux-gnu/libstdc++-v3/include/x86_64-linux-gnu/bits/gthr-default.h:669
#5 std::thread::join (this=0x5653fcdf7688) at ../../../.././libstdc++-v3/src/c++11/thread.cc:112
#6 0x00007f66c9328f1a in elasticapm::php::PeriodicTaskExecutor::~PeriodicTaskExecutor (this=0x5653fcdf7640, __in_chrg=<optimized out>) at /source/agent/native/libcommon/code/./PeriodicTaskExecutor.h:35
#7 elasticapm::php::PeriodicTaskExecutor::~PeriodicTaskExecutor (this=0x5653fcdf7640, __in_chrg=<optimized out>) at /source/agent/native/libcommon/code/./PeriodicTaskExecutor.h:37
#8 std::default_delete<elasticapm::php::PeriodicTaskExecutor>::operator() (__ptr=0x5653fcdf7640, this=<optimized out>) at /opt/gcc-12.2.0/include/c++/12.2.0/bits/unique_ptr.h:95
#9 std::unique_ptr<elasticapm::php::PeriodicTaskExecutor, std::default_delete<elasticapm::php::PeriodicTaskExecutor> >::~unique_ptr (this=0x5653fce212c8, __in_chrg=<optimized out>) at /opt/gcc-12.2.0/include/c++/12.2.0/bits/unique_ptr.h:396
#10 elasticapm::php::AgentGlobals::~AgentGlobals (this=0x5653fce21290, __in_chrg=<optimized out>) at /source/agent/native/libcommon/code/./AgentGlobals.h:12
#11 zm_globals_dtor_elastic_apm (elastic_apm_globals=0x7f66c947dd80 <elastic_apm_globals>) at /source/agent/native/ext/elastic_apm.cpp:299
#12 0x00005653fafc4758 in module_destructor (module=0x7f66c947d8a0 <elastic_apm_module_entry>) at ./Zend/zend_API.c:3075
#13 0x00005653fafd1015 in _zend_hash_del_el_ex (prev=<optimized out>, p=0x5653fca2df00, idx=59, ht=0x5653fb2458a0 <module_registry>) at ./Zend/zend_hash.c:1425
#14 _zend_hash_del_el (p=0x5653fca2df00, idx=59, ht=0x5653fb2458a0 <module_registry>) at ./Zend/zend_hash.c:1452
#15 zend_hash_graceful_reverse_destroy (ht=0x5653fb2458a0 <module_registry>) at ./Zend/zend_hash.c:1977
#16 0x00005653fafc2e96 in zend_destroy_modules () at ./Zend/zend_API.c:2384
#17 0x00005653fafbded7 in zend_shutdown () at ./Zend/zend.c:1099
#18 0x00005653faf575ba in php_module_shutdown () at ./main/main.c:2453
#19 php_module_shutdown () at ./main/main.c:2430
#20 0x00005653fadecbf2 in main (argc=5, argv=0x5653fc9e96e0) at ./sapi/cli/php_cli.c:1346
To Reproduce
Running the following setup reproduces the situation consistently for me:
Files:
./Dockerfile
FROM ubuntu:22.04
ARG DEBIAN_FRONTEND=noninteractive
ARG PHP_VERSION=8.2
RUN apt -y update
RUN apt-get -y install software-properties-common
RUN apt-get -y install tar gdb strace
RUN add-apt-repository -c main -c main/debug ppa:ondrej/php
RUN apt -y update
RUN apt-get -y install php${PHP_VERSION} php${PHP_VERSION}-cli-dbgsym
RUN update-alternatives --set php /usr/bin/php${PHP_VERSION}
RUN apt-get -y install php${PHP_VERSION}-dom php${PHP_VERSION}-mysql php${PHP_VERSION}-soap
ADD --chmod=0755 https://github.com/elastic/apm-agent-php/releases/download/v1.13.0/apm-agent-php_1.13.0_amd64.deb /root
ADD --chmod=0755 https://github.com/elastic/apm-agent-php/releases/download/v1.13.0/apm-agent-php-debugsymbols-linux-x86-64.tar /root
RUN dpkg -i /root/apm-agent-php_1.13.0_amd64.deb
RUN tar xvf /root/apm-agent-php-debugsymbols-linux-x86-64.tar -C /usr/lib/debug
RUN touch /var/log/apm_process.log
COPY ./src /root/src
WORKDIR /root/src
After the process finished, check with top command the list of processes and there should be at least one, but possibly multiple like this /usr/bin/php -f run.php {{delay_time}} {{main_process_id}}-{{counter_of_child_processes}} stuck forever.
with strace -p {{process_id}} we should see that the process is stuck on a futex() call like above.
Notes:
The issue is possibly a race condition, so it might take a few runs to reproduce, but for me the above setup consistently ended in multiple stuck processes.
I have originally started with our production setup and dependencies to reproduce the issue, but managed to strip down to this relative minimalistic example. The 3 extensions (dom, mysql, soap) seems to be needed to reproduce the issue. If any of them is taken out of the example, the problem did not occur for me.
Expected behavior
PHP process should not stuck upon terminating when APM extension is enabled.
The text was updated successfully, but these errors were encountered:
Describe the bug
In our setup a main PHP CLI process triggered by cron is creating sub PHP processes to do work parallelly. We have observed that after some time, there are multiple child processes that are stuck. Their unit of work is completed, but they are not terminating. Upon further investigating these processes, we have seen that they are waiting for a lock to be released. After reproducing the issue locally we have seen in the traces that the PHP CLI process hangs on the PeriodicTaskExecutor class destructor's thread_.join() call.
strace outpout of the stuck process
gbd backtrace
To Reproduce
Running the following setup reproduces the situation consistently for me:
Files:
Steps:
docker compose up
docker compose exec apm_process /bin/bash
php test.php 2>>/var/log/apm_process.log 1>&2
top
command the list of processes and there should be at least one, but possibly multiple like this/usr/bin/php -f run.php {{delay_time}} {{main_process_id}}-{{counter_of_child_processes}}
stuck forever.strace -p {{process_id}}
we should see that the process is stuck on afutex()
call like above.Notes:
Expected behavior
PHP process should not stuck upon terminating when APM extension is enabled.
The text was updated successfully, but these errors were encountered: