When the reconstruction volume is large, the RAM required becomes disproportionate to the single CPU core requested:
```
Preparing 1 job with 1 CPU and 41 GB of memory per CPU.
```
Since HPC nodes don't usually offer more than 15 GB of RAM per CPU, this leaves CPU horsepower on the table: both PyTorch (compute) and zarr-python's numcodecs (I/O) can be multi-threaded. The threading needs some tuning, though, as it has previously caused problems under multi-processing.
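One way to reclaim those cores is to request enough CPUs to cover the job's memory footprint and point the threaded libraries at them. A minimal sketch (the helper name and the 15 GB-per-CPU default are assumptions; `OMP_NUM_THREADS`, `MKL_NUM_THREADS`, and `BLOSC_NTHREADS` are the standard environment knobs for PyTorch's CPU kernels and the Blosc compressor behind numcodecs, and `torch.set_num_threads` / `numcodecs.blosc.set_nthreads` offer the same control at runtime):

```python
import math
import os


def plan_job(mem_gb: float, mem_per_cpu_gb: float = 15.0) -> int:
    """Request enough CPUs that per-CPU memory stays within the node limit.

    Hypothetical helper: a 41 GB job on 15 GB-per-CPU nodes maps to 3 CPUs.
    """
    n_cpus = math.ceil(mem_gb / mem_per_cpu_gb)
    # Tell the threaded libraries how many cores they may use.
    os.environ["OMP_NUM_THREADS"] = str(n_cpus)   # PyTorch CPU kernels (OpenMP)
    os.environ["MKL_NUM_THREADS"] = str(n_cpus)   # MKL-backed FFTs
    os.environ["BLOSC_NTHREADS"] = str(n_cpus)    # Blosc codec used by numcodecs
    return n_cpus


print(plan_job(41.0))  # the 41 GB job above maps to 3 CPUs
```

Setting the environment variables before the libraries are imported is the safest route; runtime calls work too, but over-subscribing threads inside a multi-process job is where the earlier problems came from.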
My understanding of the current reconstruction pipeline is that, for large arrays, FFTs are the bottleneck, so I expect additional CPUs to give marginal improvements at best. Unless you have reason to believe torch can easily multiprocess an FFT(?).
I think this bottleneck might move to I/O when we move the FFTs to the GPU, though. I'll keep an eye out.
> Unless you have reason to believe torch can easily multiprocess an FFT(?).
I would guess that torch can do multi-threaded FFTs on the CPU just as it can on the GPU. A previous (possibly the current stable) version of recOrder/waveorder benefited from multiple CPU threads for reconstructions.
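Even setting aside whether torch threads a single CPU FFT internally, the slice-wise FFTs in a volumetric reconstruction parallelize well across a thread pool. A stand-in sketch with NumPy (`np.fft` releases the GIL, so slices genuinely run in parallel; the function name and batch layout here are illustrative, not from the pipeline):

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def batched_fft(volume: np.ndarray, n_threads: int = 4) -> np.ndarray:
    """2-D FFT of each z-slice of a (z, y, x) volume, slices in parallel."""
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        slices = list(pool.map(np.fft.fft2, volume))
    return np.stack(slices)


vol = np.random.rand(8, 64, 64)
spec = batched_fft(vol)
assert spec.shape == (8, 64, 64)
```

The result is identical to `np.fft.fftn(vol, axes=(1, 2))`; the thread pool just spreads the slices over the requested cores, which is the kind of win the extra CPUs would buy.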