Previous benchmarks have been done with the "node-to-node" metric to answer the question "can we replace a CPU node with a GPU node?".
As we gear toward operations, this metric is no longer enough. It should be backed by more scientifically relevant metrics (gridpoint, SYPD, SDPD, which seems to be the GMAO's preferred metric, etc.).
We should also start measuring ourselves against the SCU17/18 Milan nodes and their 128 cores.
Electricity consumption and price are also previous metrics we should carry forward.
Another angle is the scaling and operational usefulness of each piece of hardware, so that the narrative to the scientists is clear.
This process should involve the GMAO but remain led by us, to make sure we can deliver.
Overall, pragmatism is key: we are not here to give roofline projections and peak FLOPS, we are here to deliver day-to-day usage.
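To make the throughput and energy metrics above concrete, here is a minimal sketch of how SDPD, SYPD and energy per simulated day could be derived from a benchmark run. The function names, the 365-day year and the use of an average per-node power draw are assumptions for illustration, not an agreed methodology.

```python
SECONDS_PER_DAY = 86_400.0
DAYS_PER_YEAR = 365.0  # assumption: no-leap calendar year


def sdpd(simulated_days: float, wall_seconds: float) -> float:
    """Simulated Days Per (wall-clock) Day."""
    return simulated_days * SECONDS_PER_DAY / wall_seconds


def sypd(simulated_days: float, wall_seconds: float) -> float:
    """Simulated Years Per (wall-clock) Day."""
    return sdpd(simulated_days, wall_seconds) / DAYS_PER_YEAR


def kwh_per_simulated_day(
    node_count: int,
    avg_node_power_watts: float,  # hypothetical: measured average draw per node
    wall_seconds: float,
    simulated_days: float,
) -> float:
    """Energy cost of one simulated day, in kWh (1 kWh = 3.6e6 J)."""
    joules = node_count * avg_node_power_watts * wall_seconds
    return joules / 3.6e6 / simulated_days


if __name__ == "__main__":
    # Illustrative numbers only, not measurements.
    print(sdpd(simulated_days=1, wall_seconds=3600))
    print(sypd(simulated_days=1, wall_seconds=3600))
    print(kwh_per_simulated_day(4, 500.0, 3600, 1))
```

Reporting these per node and per run configuration would let CPU (for example the Milan nodes) and GPU results be compared on the same footing.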
Document the metrics to be used, their impact and logic
Create a versioned document of the methodology to be applied for each metric
FlorianDeconinck changed the title from "Define operational and HPC metrics" to "[GEOS] Define operational and HPC metrics" on May 22, 2024
As part of this work, we should also project the requirements for running bigger simulations, now and for every year going forward.
Per Tsengdar:
Can we estimate how many GPUs and CPU-GPU configuration that we need to support this project in C1440-L181 resolution in FY26? Do we have access to what we need?
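As a starting point for that estimate, the sketch below sizes the problem and derives a memory-bound lower limit on GPU count. It assumes the usual cubed-sphere reading of the resolution name (C1440 = 1440 x 1440 columns on each of 6 faces, L181 = 181 vertical levels). `bytes_per_gridpoint` and `usable_bytes_per_gpu` are hypothetical inputs that would have to be measured on a real run; the bound says nothing about throughput.

```python
import math


def cubed_sphere_gridpoints(c: int, levels: int) -> int:
    """Total gridpoints for a cubed-sphere grid with c x c columns per face."""
    return 6 * c * c * levels


def min_gpus_for_memory(
    gridpoints: int,
    bytes_per_gridpoint: float,   # hypothetical: measured model state per gridpoint
    usable_bytes_per_gpu: float,  # hypothetical: device memory left for model state
) -> int:
    """Lower bound on GPU count from memory footprint alone."""
    return math.ceil(gridpoints * bytes_per_gridpoint / usable_bytes_per_gpu)


if __name__ == "__main__":
    points = cubed_sphere_gridpoints(1440, 181)
    print(points)  # 2251929600
```

Combining this bound with a measured SDPD per GPU at a lower resolution would give a first projection that can be revisited each year.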