Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance regression in consuming queues #321

Open
acrobat opened this issue Jul 19, 2017 · 10 comments
Open

Performance regression in consuming queues #321

acrobat opened this issue Jul 19, 2017 · 10 comments

Comments

@acrobat
Copy link
Member

acrobat commented Jul 19, 2017

I'm upgrading a php app which used a older release of bernard to 1.0.0-alpha5. This causes a huge regression is performance (because of the 5 second wait time in popMessage). This is on an empty queue

image

We use the PhpAmqpDriver. PR #263 introduced the change

I will try to debug some more and come up with a fix (Will update the issue later with some more info)


Running consumer on empty queue:
Before: 0.089s
After: 5.100s

@henrikbjorn
Copy link
Contributor

@acrobat The wait time is correct, as you are saying "block until i receive a message or duration time have passed". You a doing a consume in the same request as the http request?

@acrobat
Copy link
Member Author

acrobat commented Jul 19, 2017

No I run the consumer in a separate proces and console command. But why is that 5 second wait time check implemented? I would think it works this way:

  • Get message from queue
    -- There was a message in queue, pass the body to the consumer
    -- No message in queue (wait 10 ms) and pass null message

But the 5 sec wait doesn't seem necessary to me

@henrikbjorn
Copy link
Contributor

In a consumer, you want them to run forever in a detached process. So instead of going through a hot code path, every 10ms, if there is no message. Then we wait for one to be available so we can consume it as fast as possible.

Most of queueing api provides already have this e.g. SQS uses WaitTimeSeconds.

The 10ms usleep is a hack to not have PHP consume a gazillion amount of CPU when doing the for loop. (Open activity monitor and have the usleep removed, and readd to see the difference).

@henrikbjorn
Copy link
Contributor

A simpler example would be this.

  1. open Activity Monitor
  2. Run this in your terminal: php -r 'while (true) {};'
  3. See the CPU usage spike to 99%
  4. Try this in your terminal: php -r 'while (true) { usleep(10000); };'
  5. See the CPU usage stay at 1%

@acrobat
Copy link
Member Author

acrobat commented Sep 6, 2017

The reasons behind the changes sound logic to me. But in our application we use a single worker to watch, let's say, 6 queues (and route their messsages to 6 different services based on the queue they are in). If the first 5 queues are empty it takes 25 seconds to get to the 6th queue which has messages.

So if I understand this change correctly it is better to create 6 separate workers watching only a single queue?

@henrikbjorn
Copy link
Contributor

It might be. Someone did a RoundRobin consumer thingy, that might help you ? In that case the wait time can be lowered locally.

@acrobat
Copy link
Member Author

acrobat commented Oct 24, 2017

I've upgraded our bernard lib today and indeed the roundrobin queue is a problem when processing a amount of queues. Maybe it's a good idea to allow passing a default "wait time" to the queue factory so you can decide in your setup what a potentional wait time should be.

I've currently fixed it by extending the driver and override the popMessage method. But it could be fixed this way

If we change this line https://github.com/bernardphp/bernard/blob/master/src/Driver/PhpAmqpDriver.php#L104 (same for all drivers)

while (microtime(true) < $runtime) { to while (microtime(true) <= $runtime) {

This will allow users to set it to 0 and get previous behavior (atleast for amqp users)

What do you think?

@aembler
Copy link

aembler commented Mar 19, 2018

I'm seeing this behavior too. What we'd like to have is a multiple queues within our application, powered by the Tactician/Bernard bridge. So we'd have many commands like "RescanFileCommand", "EmptyTrashCommand", "CopyPageCommand", etc... and a single queue script run through the CLI:

concrete/bin/concrete5 queue:process

Which when looks through all our queues (using our configuration system). This creates a RoundRobinQueue, processing all the items as necessary. This does work, but right now it takes 30-40 seconds to copy 10 pages, presumably because it's iterating through all the drivers (most of which are empty), while the page waits for the operations to complete. But if you run

concrete/bin/concrete5 queue:process copy_page

It's nearly instantaneous. I don't mind forking code or overriding certain functionality, but we'd like to keep most of the drivers the same, and not create our own version of the drivers (since the drivers for the different queues are some of the most attractive features of Bernard).

@acrobat
Copy link
Member Author

acrobat commented Mar 22, 2018

I think this is something we can pick up when working towards the 1.0 stable release.

@henrikbjorn
Copy link
Contributor

Just be cautious of the previously mentioned cpu issue by having the wait time being low.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants