Fix transient timeout failures #457

Open
danielsf opened this issue Feb 17, 2022 · 1 comment

@danielsf

Over the last two days, I've seen the transient timeout failures resurface in ophys_etl_pipelines. See, for instance:

https://app.circleci.com/pipelines/github/AllenInstitute/ophys_etl_pipelines/2417/workflows/f065ff74-74f7-45dd-ba5e-2cb4d807c14b/jobs/7870

I think that the failures always arise from code that invokes some variation on multiprocessing.Pool.

I cannot say for certain why CircleCI has a problem with multiprocessing.Pool. I propose that we solve the problem by replacing Pool with by-hand invocations of multiprocessing (i.e., directly starting Process instances and calling join on them as necessary). I do not believe that code using multiprocessing in that way has ever provoked this failure.

Modules that use multiprocessing.Pool include:

- sine_dewarp
- segmentation.modules.calculate_edges
- dff
- qc.video.correlation_graph

These are all of the modules that were giving me trouble when I was transitioning our CircleCI builds over to Docker containers that we built.

I am happy to discuss this further with anyone who thinks I am mistaken.
