Scheduling #7

rgaudin · 2024-05-29T15:43:43Z

Scheduling is entirely done in the backend.

The backend knows about the Active workers (those that showed themselves within some interval –1h?) and thus the testable countries.

The scheduling code is run periodically:

1. If there's no idle worker ➡️ quit (await next period)
2. Get the list of countries for idle workers
3. Create test entries for all possible countries in the DB
4. Notify the worker that there are entries
5. Worker either receives the notification or polls the API
6. Worker retrieves details about one test and informs the API that it is processing it
7. Worker runs the test and submits the data to the API
8. Worker is back on its loop
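The backend half of these steps could be sketched roughly as follows. This is only an illustration under assumptions: `ScheduledTest`, `Status` and `schedule_once` are hypothetical names, a worker is treated as idle when it has no pending or in-progress entry, and one entry is created per (idle worker, country) pair. Notification is left out.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):  # hypothetical lifecycle of a test entry
    PENDING = "pending"
    PROCESSING = "processing"
    DONE = "done"
    EXPIRED = "expired"


@dataclass
class ScheduledTest:  # hypothetical DB row
    worker_id: str
    country: str
    status: Status = Status.PENDING


def schedule_once(worker_countries, tests):
    """Run one scheduling pass and return the entries it created.

    `worker_countries` maps an active worker's id to its countries;
    `tests` stands in for the DB table and is extended in place.
    """
    busy = {
        t.worker_id
        for t in tests
        if t.status in (Status.PENDING, Status.PROCESSING)
    }
    idle = [w for w in worker_countries if w not in busy]
    if not idle:
        return []  # no idle worker: quit and await the next period
    created = [ScheduledTest(w, c) for w in idle for c in worker_countries[w]]
    tests.extend(created)
    return created
```

Calling `schedule_once` again before the worker has finished creates nothing, since the worker is no longer idle.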

Schedule clean-up is also run periodically and consists of expiring Test requests for which data never arrived.
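A possible shape for that clean-up pass, as a sketch only: `RequestedTest`, the status strings and the one-day `max_age` are assumptions, not decided values.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RequestedTest:  # hypothetical DB row
    country: str
    requested_on: datetime
    status: str = "pending"  # "pending" | "processing" | "done" | "expired"


def expire_stale(tests, now, max_age=timedelta(days=1)):
    """Mark unfinished tests older than `max_age` as expired.

    Returns the number of entries expired in this pass.
    """
    expired = 0
    for t in tests:
        unfinished = t.status in ("pending", "processing")
        if unfinished and now - t.requested_on > max_age:
            t.status = "expired"
            expired += 1
    return expired
```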

elfkuzco · 2024-05-30T11:44:59Z

Given we are sticking to the model of having one worker run at a time due to costs, what do you think about moving the scheduling logic to the worker manager? It would query the backend server at intervals and, when there's a response with a country to check, start a worker and manage its clean-up too. The backend server is essentially going to be giving it entries to check.

rgaudin · 2024-05-30T11:48:00Z

I don't understand the difference between the two options. Can you elaborate?

elfkuzco · 2024-05-30T11:54:47Z

> Scheduling is entirely done in the backend.
> The backend knows about the Active workers (those that showed themselves within some interval –1h?) and thus the testable countries.

I interpreted this to mean that it is the backend web server that would do the scheduling and thus know about the active workers.
What I am proposing is this:

At intervals, the worker manager asks the backend web server for work. The web server gives it a country to make a run against.
The manager starts the actual worker with the configuration (possibly from the web server too), collects the results and posts them back to the web server.
It then repeats the process.
I think with this, we don't have to worry about idle workers since the manager is starting each one and tearing it down immediately.
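Roughly, one iteration of that manager loop might look like the sketch below. The three callables are hypothetical placeholders for the HTTP request, the worker start/teardown and the result upload; none of them exist yet.

```python
from typing import Callable, Optional

Job = dict  # hypothetical payload, e.g. {"country": "fr"}


def manager_cycle(
    request_work: Callable[[], Optional[Job]],
    run_worker: Callable[[Job], dict],
    post_results: Callable[[Job, dict], None],
) -> bool:
    """Run one manager iteration; return True if a job was processed."""
    job = request_work()  # ask the web server for work
    if job is None:
        return False  # nothing to do until the next interval
    results = run_worker(job)  # start the worker, wait, tear it down
    post_results(job, results)  # send the results back to the web server
    return True
```

The manager would call `manager_cycle` at each interval and sleep in between.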

-- EDIT --
I was under the impression we were starting one worker at a time. Still, we can have the web server return a list of jobs and the manager would start all the workers for those jobs.

rgaudin · 2024-05-30T12:11:35Z

What you describe is similar to what is in the ticket, for the part on the worker… but you're not explaining how the server is supposed to “give it a country to make a run against”. That's precisely the role of this ticket.

See worker behavior in #8 which I believe matches your expectations.

Now for the server to be able to request a run (mostly pick a country), it needs to know which countries can be tested.

That's where there can be different approaches:

prepopulate requests as described in the ticket
reply to requests with one country from the worker's pool

Both can work, but with the second one we lose control over the schedule and cannot say “we want 2 tests per day at most”, for instance, which might be something we want in order not to load the mirrors unnecessarily.
In the same way, workers never stop downloading in this mode, which might be fine for a single-worker scenario, but with multiple workers it's not.
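To illustrate the kind of control the first approach keeps, here is a minimal sketch of a per-country daily cap applied when prepopulating requests. The function name and the way today's requests are passed in are assumptions; the cap of 2 is just the example figure quoted above.

```python
from collections import Counter


def countries_under_cap(candidates, scheduled_today, max_per_day=2):
    """Keep only the countries that have not yet reached the daily cap.

    `scheduled_today` lists the country of every test already created
    today, one item per test.
    """
    counts = Counter(scheduled_today)
    return [c for c in candidates if counts[c] < max_per_day]
```

For example, `countries_under_cap(["fr", "ml", "us"], ["fr", "fr", "ml"])` returns `["ml", "us"]`: `fr` has already had its two tests for the day.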

WDYT?

elfkuzco · 2024-05-30T12:21:16Z

Okay, I think I understand it now.

rgaudin added the enhancement New feature or request label May 29, 2024

rgaudin assigned elfkuzco May 29, 2024

rgaudin added this to the MVP milestone May 29, 2024

elfkuzco mentioned this issue Jun 20, 2024

set up scheduler to create tasks for idle workers #20

Merged

elfkuzco linked a pull request Jun 20, 2024 that will close this issue


elfkuzco closed this as completed in #20 Jun 21, 2024
