Improve speed of loading small NetCDF files #6229

Open: wants to merge 3 commits into main

bouweandela (Member) commented Nov 13, 2024

🚀 Pull Request

Description

This reduces the time required to load small NetCDF files, especially those with many variables.

Using the files and timing script from #6223, this pull request speeds up loading the file with a single variable by 30% and the file with many variables by 200%.
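
For readers without access to the script from #6223, a load-time measurement of this kind could look roughly like the sketch below. It is not the script from that issue: `time_load`, its `repeats` parameter, and the placeholder file name are hypothetical, and the loader is passed in as a callable so the harness itself needs only the standard library.

```python
import statistics
import time
from typing import Callable


def time_load(loader: Callable[[str], object], path: str, repeats: int = 5) -> float:
    """Return the median wall-clock time in seconds of calling loader(path)."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        loader(path)  # the loaded object is discarded; only the call is timed
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


# Hypothetical usage, assuming Iris is installed and "small_file.nc" exists:
#   import iris
#   print(f"{time_load(iris.load, 'small_file.nc') * 1000:.1f} ms")
```

Taking the median rather than the mean keeps a single slow first call (for example, a cold file cache) from dominating the result.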


Consult Iris pull request check list


Add any of the below labels to trigger actions on this PR:

  • benchmark_this: request that this pull request be benchmarked to check whether it introduces performance shifts

codecov bot commented Nov 13, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 89.83%. Comparing base (2443396) to head (d4b0ee7).
Report is 2 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #6229   +/-   ##
=======================================
  Coverage   89.83%   89.83%           
=======================================
  Files          88       88           
  Lines       23315    23317    +2     
  Branches     4338     4339    +1     
=======================================
+ Hits        20945    20947    +2     
  Misses       1644     1644           
  Partials      726      726           

bouweandela added the benchmark_this label Nov 13, 2024

⏱️ Performance Benchmark Report: 0e462c7

Performance shifts
| Change   | Before [2443396f]    | After [0e462c7d]    |   Ratio | Benchmark (Parameter)                                          |
|----------|----------------------|---------------------|---------|----------------------------------------------------------------|
| -        | 27.5±0.9ms           | 18.1±0.2ms          |    0.66 | load.LoadAndRealise.time_load((1280, 960, 5), False, 'NetCDF') |
| -        | 22.7±0.5ms           | 13.1±0.08ms         |    0.58 | load.LoadAndRealise.time_load((1280, 960, 5), True, 'NetCDF')  |
| -        | 22.8±0.3ms           | 13.2±0.1ms          |    0.58 | load.LoadAndRealise.time_load((2, 2, 1000), False, 'NetCDF')   |
| -        | 22.8±0.2ms           | 13.2±0.06ms         |    0.58 | load.LoadAndRealise.time_load((2, 2, 1000), True, 'NetCDF')    |
| -        | 21.0±0.1ms           | 11.6±0.1ms          |    0.55 | load.LoadAndRealise.time_load((50, 50, 2), False, 'NetCDF')    |
| -        | 21.4±0.3ms           | 11.7±0.1ms          |    0.55 | load.LoadAndRealise.time_load((50, 50, 2), True, 'NetCDF')     |
| -        | 365±8ms              | 83.2±0.6ms          |    0.23 | load.ManyVars.time_many_var_load                               |
| -        | 25.4±0.5ms           | 14.9±0.2ms          |    0.59 | load.TimeConstraint.time_time_constraint(20, 'NetCDF')         |
| -        | 24.2±0.2ms           | 14.5±0.2ms          |    0.6  | load.TimeConstraint.time_time_constraint(3, 'NetCDF')          |
| -        | 19.7±0.4ms           | 13.1±0.2ms          |    0.66 | load.ugrid.BasicLoading.time_load_file(1)                      |
| -        | 14.8±0.3ms           | 8.65±0.1ms          |    0.58 | load.ugrid.BasicLoading.time_load_mesh(1)                      |
| -        | 25.2±0.3ms           | 19.7±0.6ms          |    0.78 | load.ugrid.BasicLoading.time_load_mesh(200000)                 |
| -        | 19.6±0.3ms           | 13.7±0.4ms          |    0.7  | load.ugrid.BasicLoadingTime.time_load_file(1)                  |
| -        | 24.2±0.5ms           | 18.2±0.5ms          |    0.75 | load.ugrid.BasicLoadingTime.time_load_file(200000)             |
| -        | 14.7±0.2ms           | 8.84±0.4ms          |    0.6  | load.ugrid.BasicLoadingTime.time_load_mesh(1)                  |
| -        | 19.5±0.4ms           | 13.5±0.4ms          |    0.69 | load.ugrid.BasicLoadingTime.time_load_mesh(200000)             |
| -        | 20.4±0.3ms           | 14.3±0.2ms          |    0.7  | load.ugrid.Callback.time_load_file_callback(1)                 |
| -        | 20.5±0.4ms           | 14.7±0.3ms          |    0.72 | load.ugrid.CallbackTime.time_load_file_callback(1)             |
| -        | 25.9±0.4ms           | 19.6±0.6ms          |    0.76 | load.ugrid.CallbackTime.time_load_file_callback(200000)        |
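
The Ratio column in the table above appears to be After divided by Before (for example, 18.1 / 27.5 ≈ 0.66), so smaller values mean faster loads. A minimal sketch for turning one row into a speed-up factor and a percentage of time saved follows; the function name `summarise` and the returned keys are illustrative only and are not part of the benchmark tooling.

```python
def summarise(before_ms: float, after_ms: float) -> dict:
    """Summarise one benchmark row given its Before and After timings."""
    ratio = after_ms / before_ms  # matches the table's Ratio column
    return {
        "ratio": round(ratio, 2),
        "speedup_factor": round(before_ms / after_ms, 2),  # how many times faster
        "time_saved_percent": round((1 - ratio) * 100, 1),
    }


# Timings taken from the load.ManyVars.time_many_var_load row
many_vars = summarise(365, 83.2)
print(many_vars)
```

Note that a speed-up factor and a percentage reduction in run time are different measures, so it helps to state which one a quoted percentage refers to.
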
Full benchmark results

Benchmarks that have stayed the same:

| Change   | Before [2443396f]    | After [0e462c7d]    |   Ratio | Benchmark (Parameter)                                                                       |
|----------|----------------------|---------------------|---------|---------------------------------------------------------------------------------------------|
|          | 60.3±0.8ms           | 59.9±0.7ms          |    0.99 | aggregate_collapse.Aggregation.time_aggregated_by_COUNT(False)                              |
|          | 61.2±0.6ms           | 61.1±0.7ms          |    1    | aggregate_collapse.Aggregation.time_aggregated_by_COUNT(True)                               |
|          | 217±2ms              | 214±4ms             |    0.99 | aggregate_collapse.Aggregation.time_aggregated_by_FAST_PERCENTILE(False)                    |
|          | 216±2ms              | 216±1ms             |    1    | aggregate_collapse.Aggregation.time_aggregated_by_FAST_PERCENTILE(True)                     |
|          | 39.2±0.5ms           | 38.9±0.3ms          |    0.99 | aggregate_collapse.Aggregation.time_aggregated_by_GMEAN(False)                              |
|          | 40.2±0.3ms           | 39.5±0.4ms          |    0.98 | aggregate_collapse.Aggregation.time_aggregated_by_GMEAN(True)                               |
|          | 39.2±0.4ms           | 38.9±0.5ms          |    0.99 | aggregate_collapse.Aggregation.time_aggregated_by_HMEAN(False)                              |
|          | 40.2±0.5ms           | 39.8±0.6ms          |    0.99 | aggregate_collapse.Aggregation.time_aggregated_by_HMEAN(True)                               |
|          | 52.1±0.6ms           | 52.2±0.8ms          |    1    | aggregate_collapse.Aggregation.time_aggregated_by_MAX(False)                                |
|          | 52.6±0.6ms           | 52.7±2ms            |    1    | aggregate_collapse.Aggregation.time_aggregated_by_MAX(True)                                 |
|          | 137±1ms              | 137±1ms             |    1    | aggregate_collapse.Aggregation.time_aggregated_by_MAX_RUN(False)                            |
|          | 138±1ms              | 138±1ms             |    1    | aggregate_collapse.Aggregation.time_aggregated_by_MAX_RUN(True)                             |
|          | 57.3±0.8ms           | 57.1±0.7ms          |    1    | aggregate_collapse.Aggregation.time_aggregated_by_MEAN(False)                               |
|          | 57.9±0.7ms           | 57.9±0.7ms          |    1    | aggregate_collapse.Aggregation.time_aggregated_by_MEAN(True)                                |
|          | 38.9±0.6ms           | 38.8±0.4ms          |    1    | aggregate_collapse.Aggregation.time_aggregated_by_MEDIAN(False)                             |
|          | 39.9±0.5ms           | 39.6±0.5ms          |    0.99 | aggregate_collapse.Aggregation.time_aggregated_by_MEDIAN(True)                              |
|          | 51.4±1ms             | 51.6±1ms            |    1    | aggregate_collapse.Aggregation.time_aggregated_by_MIN(False)                                |
|          | 52.2±1ms             | 52.4±1ms            |    1    | aggregate_collapse.Aggregation.time_aggregated_by_MIN(True)                                 |
|          | 1.10±0.01s           | 1.10±0.01s          |    1    | aggregate_collapse.Aggregation.time_aggregated_by_PEAK(False)                               |
|          | 1.09±0.01s           | 1.09±0.01s          |    1    | aggregate_collapse.Aggregation.time_aggregated_by_PEAK(True)                                |
|          | 497±20ms             | 496±20ms            |    1    | aggregate_collapse.Aggregation.time_aggregated_by_PERCENTILE(False)                         |
|          | 498±20ms             | 506±20ms            |    1.02 | aggregate_collapse.Aggregation.time_aggregated_by_PERCENTILE(True)                          |
|          | 37.3±0.5ms           | 36.9±0.5ms          |    0.99 | aggregate_collapse.Aggregation.time_aggregated_by_PROPORTION(False)                         |
|          | 38.1±0.3ms           | 38.0±0.4ms          |    1    | aggregate_collapse.Aggregation.time_aggregated_by_PROPORTION(True)                          |
|          | 68.6±0.8ms           | 68.4±0.8ms          |    1    | aggregate_collapse.Aggregation.time_aggregated_by_RMS(False)                                |
|          | 70.4±0.9ms           | 69.9±1ms            |    0.99 | aggregate_collapse.Aggregation.time_aggregated_by_RMS(True)                                 |
|          | 71.2±0.4ms           | 71.5±1ms            |    1    | aggregate_collapse.Aggregation.time_aggregated_by_STD_DEV(False)                            |
|          | 71.4±1ms             | 71.7±0.9ms          |    1    | aggregate_collapse.Aggregation.time_aggregated_by_STD_DEV(True)                             |
|          | 67.3±0.9ms           | 66.7±0.9ms          |    0.99 | aggregate_collapse.Aggregation.time_aggregated_by_VARIANCE(False)                           |
|          | 67.4±0.7ms           | 67.9±0.4ms          |    1.01 | aggregate_collapse.Aggregation.time_aggregated_by_VARIANCE(True)                            |
|          | 25.9±0.4ms           | 25.7±0.3ms          |    0.99 | aggregate_collapse.Aggregation.time_collapsed_by_COUNT(False)                               |
|          | 30.3±0.4ms           | 29.9±0.3ms          |    0.99 | aggregate_collapse.Aggregation.time_collapsed_by_COUNT(True)                                |
|          | 149±1ms              | 150±1ms             |    1    | aggregate_collapse.Aggregation.time_collapsed_by_FAST_PERCENTILE(False)                     |
|          | 164±2ms              | 166±2ms             |    1.01 | aggregate_collapse.Aggregation.time_collapsed_by_FAST_PERCENTILE(True)                      |
|          | 24.0±0.3ms           | 24.0±0.2ms          |    1    | aggregate_collapse.Aggregation.time_collapsed_by_GMEAN(False)                               |
|          | 28.3±0.3ms           | 28.0±0.3ms          |    0.99 | aggregate_collapse.Aggregation.time_collapsed_by_GMEAN(True)                                |
|          | 23.7±0.3ms           | 23.7±0.2ms          |    1    | aggregate_collapse.Aggregation.time_collapsed_by_HMEAN(False)                               |
|          | 28.1±0.4ms           | 28.1±0.3ms          |    1    | aggregate_collapse.Aggregation.time_collapsed_by_HMEAN(True)                                |
|          | 24.8±0.3ms           | 24.6±0.2ms          |    0.99 | aggregate_collapse.Aggregation.time_collapsed_by_MAX(False)                                 |
|          | 28.8±0.5ms           | 28.9±0.7ms          |    1    | aggregate_collapse.Aggregation.time_collapsed_by_MAX(True)                                  |
|          | 38.5±0.7ms           | 38.5±0.5ms          |    1    | aggregate_collapse.Aggregation.time_collapsed_by_MAX_RUN(False)                             |
|          | 42.3±0.7ms           | 42.5±1ms            |    1    | aggregate_collapse.Aggregation.time_collapsed_by_MAX_RUN(True)                              |
|          | 25.3±0.4ms           | 25.2±0.4ms          |    1    | aggregate_collapse.Aggregation.time_collapsed_by_MEAN(False)                                |
|          | 29.5±0.4ms           | 29.2±0.4ms          |    0.99 | aggregate_collapse.Aggregation.time_collapsed_by_MEAN(True)                                 |
|          | 25.1±0.4ms           | 25.2±0.5ms          |    1    | aggregate_collapse.Aggregation.time_collapsed_by_MEDIAN(False)                              |
|          | 29.4±0.5ms           | 29.4±0.5ms          |    1    | aggregate_collapse.Aggregation.time_collapsed_by_MEDIAN(True)                               |
|          | 24.8±0.6ms           | 24.8±0.6ms          |    1    | aggregate_collapse.Aggregation.time_collapsed_by_MIN(False)                                 |
|          | 28.8±0.4ms           | 28.6±0.3ms          |    0.99 | aggregate_collapse.Aggregation.time_collapsed_by_MIN(True)                                  |
|          | 550±5ms              | 545±4ms             |    0.99 | aggregate_collapse.Aggregation.time_collapsed_by_PEAK(False)                                |
|          | 550±5ms              | 554±5ms             |    1.01 | aggregate_collapse.Aggregation.time_collapsed_by_PEAK(True)                                 |
|          | 165±2ms              | 165±1ms             |    1    | aggregate_collapse.Aggregation.time_collapsed_by_PERCENTILE(False)                          |
|          | 184±3ms              | 186±2ms             |    1.01 | aggregate_collapse.Aggregation.time_collapsed_by_PERCENTILE(True)                           |
|          | 23.6±0.3ms           | 23.6±0.2ms          |    1    | aggregate_collapse.Aggregation.time_collapsed_by_PROPORTION(False)                          |
|          | 28.2±0.5ms           | 27.7±0.3ms          |    0.98 | aggregate_collapse.Aggregation.time_collapsed_by_PROPORTION(True)                           |
|          | 27.4±0.5ms           | 27.5±0.6ms          |    1    | aggregate_collapse.Aggregation.time_collapsed_by_RMS(False)                                 |
|          | 31.5±0.7ms           | 31.1±0.4ms          |    0.99 | aggregate_collapse.Aggregation.time_collapsed_by_RMS(True)                                  |
|          | 26.8±0.4ms           | 26.5±0.2ms          |    0.99 | aggregate_collapse.Aggregation.time_collapsed_by_STD_DEV(False)                             |
|          | 31.1±0.4ms           | 30.5±0.4ms          |    0.98 | aggregate_collapse.Aggregation.time_collapsed_by_STD_DEV(True)                              |
|          | 26.4±0.6ms           | 26.2±0.6ms          |    0.99 | aggregate_collapse.Aggregation.time_collapsed_by_VARIANCE(False)                            |
|          | 30.4±0.6ms           | 30.2±0.3ms          |    0.99 | aggregate_collapse.Aggregation.time_collapsed_by_VARIANCE(True)                             |
|          | 94.5±0.8ms           | 94.4±1ms            |    1    | aggregate_collapse.WeightedAggregation.time_w_aggregated_by_MEAN(False)                     |
|          | 95.4±0.6ms           | 95.3±1ms            |    1    | aggregate_collapse.WeightedAggregation.time_w_aggregated_by_MEAN(True)                      |
|          | 108±1ms              | 108±1ms             |    1    | aggregate_collapse.WeightedAggregation.time_w_aggregated_by_RMS(False)                      |
|          | 108±1ms              | 108±0.8ms           |    1    | aggregate_collapse.WeightedAggregation.time_w_aggregated_by_RMS(True)                       |
|          | 64.6±1ms             | 64.2±0.7ms          |    0.99 | aggregate_collapse.WeightedAggregation.time_w_aggregated_by_SUM(False)                      |
|          | 64.7±0.5ms           | 65.3±1ms            |    1.01 | aggregate_collapse.WeightedAggregation.time_w_aggregated_by_SUM(True)                       |
|          | 31.7±0.7ms           | 31.0±0.7ms          |    0.98 | aggregate_collapse.WeightedAggregation.time_w_collapsed_by_MEAN(False)                      |
|          | 35.8±0.7ms           | 35.9±0.6ms          |    1    | aggregate_collapse.WeightedAggregation.time_w_collapsed_by_MEAN(True)                       |
|          | 34.0±1ms             | 33.6±1ms            |    0.99 | aggregate_collapse.WeightedAggregation.time_w_collapsed_by_RMS(False)                       |
|          | 37.1±1ms             | 36.7±0.8ms          |    0.99 | aggregate_collapse.WeightedAggregation.time_w_collapsed_by_RMS(True)                        |
|          | 27.0±0.7ms           | 26.7±0.3ms          |    0.99 | aggregate_collapse.WeightedAggregation.time_w_collapsed_by_SUM(False)                       |
|          | 31.4±0.5ms           | 30.9±0.3ms          |    0.98 | aggregate_collapse.WeightedAggregation.time_w_collapsed_by_SUM(True)                        |
|          | 337±3ms              | 337±4ms             |    1    | aggregate_collapse.WeightedAggregation.time_w_collapsed_by_WPERCENTILE(False)               |
|          | 352±3ms              | 357±5ms             |    1.01 | aggregate_collapse.WeightedAggregation.time_w_collapsed_by_WPERCENTILE(True)                |
|          | 1.12±0.02ms          | 1.10±0.01ms         |    0.99 | cube.CubeCreation.time_create(False, 'construct')                                           |
|          | 398±6μs              | 397±3μs             |    1    | cube.CubeCreation.time_create(False, 'instantiate')                                         |
|          | 940±10μs             | 948±8μs             |    1.01 | cube.CubeCreation.time_create(True, 'construct')                                            |
|          | 584±6μs              | 565±6μs             |    0.97 | cube.CubeCreation.time_create(True, 'instantiate')                                          |
|          | 235±3ms              | 238±2ms             |    1.01 | cube.CubeEquality.time_equality(False, False, 'all_equal')                                  |
|          | 130±2ms              | 128±2ms             |    0.98 | cube.CubeEquality.time_equality(False, False, 'coord_inequality')                           |
|          | 263±2ms              | 264±1ms             |    1.01 | cube.CubeEquality.time_equality(False, False, 'data_inequality')                            |
|          | 16.4±0.2μs           | 16.6±0.1μs          |    1.01 | cube.CubeEquality.time_equality(False, False, 'metadata_inequality')                        |
|          | 342±5ms              | 340±4ms             |    0.99 | cube.CubeEquality.time_equality(False, True, 'all_equal')                                   |
|          | 232±2ms              | 232±2ms             |    1    | cube.CubeEquality.time_equality(False, True, 'coord_inequality')                            |
|          | 369±2ms              | 365±4ms             |    0.99 | cube.CubeEquality.time_equality(False, True, 'data_inequality')                             |
|          | 17.0±0.3μs           | 17.0±0.2μs          |    1    | cube.CubeEquality.time_equality(False, True, 'metadata_inequality')                         |
|          | 240±3ms              | 238±2ms             |    0.99 | cube.CubeEquality.time_equality(True, False, 'all_equal')                                   |
|          | 127±2ms              | 128±2ms             |    1.01 | cube.CubeEquality.time_equality(True, False, 'coord_inequality')                            |
|          | 262±1ms              | 263±2ms             |    1.01 | cube.CubeEquality.time_equality(True, False, 'data_inequality')                             |
|          | 53.0±0.4μs           | 52.3±0.4μs          |    0.99 | cube.CubeEquality.time_equality(True, False, 'metadata_inequality')                         |
|          | 341±4ms              | 340±6ms             |    1    | cube.CubeEquality.time_equality(True, True, 'all_equal')                                    |
|          | 233±4ms              | 232±2ms             |    0.99 | cube.CubeEquality.time_equality(True, True, 'coord_inequality')                             |
|          | 367±3ms              | 366±2ms             |    1    | cube.CubeEquality.time_equality(True, True, 'data_inequality')                              |
|          | 54.4±0.6μs           | 54.8±0.6μs          |    1.01 | cube.CubeEquality.time_equality(True, True, 'metadata_inequality')                          |
|          | 795±10μs             | 792±4μs             |    1    | import_iris.Iris.time__concatenate                                                          |
|          | 182±2μs              | 180±2μs             |    0.99 | import_iris.Iris.time__constraints                                                          |
|          | 109±1μs              | 110±1μs             |    1.01 | import_iris.Iris.time__data_manager                                                         |
|          | 93.2±0.5μs           | 94.7±0.8μs          |    1.02 | import_iris.Iris.time__deprecation                                                          |
|          | 138±1μs              | 139±0.9μs           |    1.01 | import_iris.Iris.time__lazy_data                                                            |
|          | 919±20μs             | 918±7μs             |    1    | import_iris.Iris.time__merge                                                                |
|          | 76.5±1μs             | 76.5±0.7μs          |    1    | import_iris.Iris.time__representation                                                       |
|          | 605±9μs              | 595±4μs             |    0.98 | import_iris.Iris.time_analysis                                                              |
|          | 140±0.9μs            | 143±1μs             |    1.02 | import_iris.Iris.time_analysis__area_weighted                                               |
|          | 110±0.5μs            | 110±1μs             |    1    | import_iris.Iris.time_analysis__grid_angles                                                 |
|          | 244±3μs              | 241±3μs             |    0.98 | import_iris.Iris.time_analysis__interpolation                                               |
|          | 191±2μs              | 191±2μs             |    1    | import_iris.Iris.time_analysis__regrid                                                      |
|          | 112±0.7μs            | 113±1μs             |    1.01 | import_iris.Iris.time_analysis__scipy_interpolate                                           |
|          | 141±2μs              | 142±0.8μs           |    1    | import_iris.Iris.time_analysis_calculus                                                     |
|          | 335±3μs              | 336±5μs             |    1    | import_iris.Iris.time_analysis_cartography                                                  |
|          | 94.6±1μs             | 94.8±0.5μs          |    1    | import_iris.Iris.time_analysis_geomerty                                                     |
|          | 219±2μs              | 220±3μs             |    1    | import_iris.Iris.time_analysis_maths                                                        |
|          | 97.7±1μs             | 99.5±0.8μs          |    1.02 | import_iris.Iris.time_analysis_stats                                                        |
|          | 175±2μs              | 176±2μs             |    1.01 | import_iris.Iris.time_analysis_trajectory                                                   |
|          | 306±4μs              | 306±6μs             |    1    | import_iris.Iris.time_aux_factory                                                           |
|          | 85.0±1μs             | 84.8±0.6μs          |    1    | import_iris.Iris.time_common                                                                |
|          | 163±2μs              | 164±2μs             |    1.01 | import_iris.Iris.time_common_lenient                                                        |
|          | 1.34±0.01ms          | 1.36±0.02ms         |    1.01 | import_iris.Iris.time_common_metadata                                                       |
|          | 138±0.6μs            | 139±0.9μs           |    1    | import_iris.Iris.time_common_mixin                                                          |
|          | 1.19±0.01ms          | 1.20±0.01ms         |    1.01 | import_iris.Iris.time_common_resolve                                                        |
|          | 199±2μs              | 200±2μs             |    1.01 | import_iris.Iris.time_config                                                                |
|          | 124±0.7μs            | 123±0.9μs           |    1    | import_iris.Iris.time_coord_categorisation                                                  |
|          | 366±4μs              | 364±3μs             |    1    | import_iris.Iris.time_coord_systems                                                         |
|          | 754±7μs              | 754±5μs             |    1    | import_iris.Iris.time_coords                                                                |
|          | 658±7μs              | 660±5μs             |    1    | import_iris.Iris.time_cube                                                                  |
|          | 224±3μs              | 226±4μs             |    1.01 | import_iris.Iris.time_exceptions                                                            |
|          | 77.2±0.6μs           | 77.5±0.8μs          |    1    | import_iris.Iris.time_experimental                                                          |
|          | 186±1μs              | 184±1μs             |    0.99 | import_iris.Iris.time_fileformats                                                           |
|          | 251±1μs              | 253±3μs             |    1.01 | import_iris.Iris.time_fileformats__ff                                                       |
|          | 2.74±0.02ms          | 2.75±0.02ms         |    1    | import_iris.Iris.time_fileformats__ff_cross_references                                      |
|          | 79.2±0.9μs           | 79.7±1μs            |    1.01 | import_iris.Iris.time_fileformats__pp_lbproc_pairs                                          |
|          | 115±0.6μs            | 115±1μs             |    1    | import_iris.Iris.time_fileformats_abf                                                       |
|          | 412±3μs              | 407±4μs             |    0.99 | import_iris.Iris.time_fileformats_cf                                                        |
|          | 5.33±0.07ms          | 5.37±0.08ms         |    1.01 | import_iris.Iris.time_fileformats_dot                                                       |
|          | 75.0±2μs             | 75.4±0.5μs          |    1.01 | import_iris.Iris.time_fileformats_name                                                      |
|          | 259±4μs              | 258±3μs             |    0.99 | import_iris.Iris.time_fileformats_name_loaders                                              |
|          | 120±1μs              | 119±1μs             |    0.99 | import_iris.Iris.time_fileformats_netcdf                                                    |
|          | 124±0.8μs            | 125±2μs             |    1.01 | import_iris.Iris.time_fileformats_nimrod                                                    |
|          | 212±4μs              | 215±3μs             |    1.02 | import_iris.Iris.time_fileformats_nimrod_load_rules                                         |
|          | 791±7μs              | 794±4μs             |    1    | import_iris.Iris.time_fileformats_pp                                                        |
|          | 185±1μs              | 184±2μs             |    0.99 | import_iris.Iris.time_fileformats_pp_load_rules                                             |
|          | 134±3μs              | 137±1μs             |    1.02 | import_iris.Iris.time_fileformats_pp_save_rules                                             |
|          | 551±2μs              | 548±5μs             |    0.99 | import_iris.Iris.time_fileformats_rules                                                     |
|          | 220±2μs              | 217±2μs             |    0.99 | import_iris.Iris.time_fileformats_structured_array_identification                           |
|          | 83.9±0.9μs           | 83.3±0.4μs          |    0.99 | import_iris.Iris.time_fileformats_um                                                        |
|          | 164±1μs              | 160±0.7μs           |    0.98 | import_iris.Iris.time_fileformats_um__fast_load                                             |
|          | 139±2μs              | 139±1μs             |    1    | import_iris.Iris.time_fileformats_um__fast_load_structured_fields                           |
|          | 76.1±0.7μs           | 75.8±0.7μs          |    1    | import_iris.Iris.time_fileformats_um__ff_replacement                                        |
|          | 82.3±0.4μs           | 83.2±0.5μs          |    1.01 | import_iris.Iris.time_fileformats_um__optimal_array_structuring                             |
|          | 994±10μs             | 1.02±0.01ms         |    1.02 | import_iris.Iris.time_fileformats_um_cf_map                                                 |
|          | 139±1μs              | 137±0.7μs           |    0.99 | import_iris.Iris.time_io                                                                    |
|          | 175±3μs              | 175±2μs             |    1    | import_iris.Iris.time_io_format_picker                                                      |
|          | 288±2μs              | 289±3μs             |    1    | import_iris.Iris.time_iris                                                                  |
|          | 128±0.8μs            | 127±1μs             |    1    | import_iris.Iris.time_iterate                                                               |
|          | 8.28±0.03ms          | 8.34±0.1ms          |    1.01 | import_iris.Iris.time_palette                                                               |
|          | 1.91±0.01ms          | 1.90±0.01ms         |    1    | import_iris.Iris.time_plot                                                                  |
|          | 106±0.9μs            | 105±2μs             |    0.99 | import_iris.Iris.time_quickplot                                                             |
|          | 2.20±0.04ms          | 2.27±0.03ms         |    1.03 | import_iris.Iris.time_std_names                                                             |
|          | 1.78±0.01ms          | 1.78±0.02ms         |    1    | import_iris.Iris.time_symbols                                                               |
|          | 16.6±0.8ms           | 17.1±0.9ms          |    1.03 | import_iris.Iris.time_tests                                                                 |
|          | 257±4μs              | 253±2μs             |    0.98 | import_iris.Iris.time_third_party_cartopy                                                   |
|          | 4.75±0.02ms          | 4.74±0.03ms         |    1    | import_iris.Iris.time_third_party_cf_units                                                  |
|          | 119±1μs              | 119±0.8μs           |    1    | import_iris.Iris.time_third_party_cftime                                                    |
|          | 2.83±0.01ms          | 2.83±0.02ms         |    1    | import_iris.Iris.time_third_party_matplotlib                                                |
|          | 1.55±0ms             | 1.54±0ms            |    0.99 | import_iris.Iris.time_third_party_numpy                                                     |
|          | 175±2μs              | 171±2μs             |    0.98 | import_iris.Iris.time_third_party_scipy                                                     |
|          | 101±1μs              | 100±1μs             |    0.99 | import_iris.Iris.time_time                                                                  |
|          | 334±2μs              | 328±4μs             |    0.98 | import_iris.Iris.time_util                                                                  |
|          | 71.6±0.8μs           | 72.4±1μs            |    1.01 | iterate.IZip.time_izip                                                                      |
|          | 9.78±0.1ms           | 9.77±0.1ms          |    1    | load.LoadAndRealise.time_load((1280, 960, 5), False, 'FF')                                  |
|          | 9.90±0.05ms          | 9.85±0.1ms          |    0.99 | load.LoadAndRealise.time_load((1280, 960, 5), False, 'PP')                                  |
|          | 9.85±0.1ms           | 9.76±0.1ms          |    0.99 | load.LoadAndRealise.time_load((1280, 960, 5), True, 'FF')                                   |
|          | 9.95±0.1ms           | 9.87±0.09ms         |    0.99 | load.LoadAndRealise.time_load((1280, 960, 5), True, 'PP')                                   |
|          | 1.56±0.01s           | 1.51±0.01s          |    0.97 | load.LoadAndRealise.time_load((2, 2, 1000), False, 'FF')                                    |
|          | 1.57±0.01s           | 1.54±0.01s          |    0.98 | load.LoadAndRealise.time_load((2, 2, 1000), False, 'PP')                                    |
|          | 1.56±0.01s           | 1.50±0.01s          |    0.97 | load.LoadAndRealise.time_load((2, 2, 1000), True, 'FF')                                     |
|          | 1.57±0.01s           | 1.53±0.01s          |    0.97 | load.LoadAndRealise.time_load((2, 2, 1000), True, 'PP')                                     |
|          | 5.07±0.03ms          | 5.01±0.02ms         |    0.99 | load.LoadAndRealise.time_load((50, 50, 2), False, 'FF')                                     |
|          | 5.07±0.06ms          | 4.99±0.04ms         |    0.99 | load.LoadAndRealise.time_load((50, 50, 2), False, 'PP')                                     |
|          | 5.06±0.06ms          | 5.02±0.07ms         |    0.99 | load.LoadAndRealise.time_load((50, 50, 2), True, 'FF')                                      |
|          | 5.09±0.05ms          | 4.99±0.04ms         |    0.98 | load.LoadAndRealise.time_load((50, 50, 2), True, 'PP')                                      |
|          | 27.2±3ms             | 24.9±3ms            |    0.92 | load.LoadAndRealise.time_realise((1280, 960, 5), False, 'FF')                               |
|          | 19.8±0.5ms           | 19.9±0.4ms          |    1.01 | load.LoadAndRealise.time_realise((1280, 960, 5), False, 'NetCDF')                           |
|          | 13.6±1ms             | 13.4±1ms            |    0.98 | load.LoadAndRealise.time_realise((1280, 960, 5), False, 'PP')                               |
|          | 26.4±0.9ms           | 26.7±2ms            |    1.01 | load.LoadAndRealise.time_realise((1280, 960, 5), True, 'FF')                                |
|          | 79.5±0.6ms           | 79.8±0.6ms          |    1    | load.LoadAndRealise.time_realise((1280, 960, 5), True, 'NetCDF')                            |
|          | 25.9±1ms             | 25.9±2ms            |    1    | load.LoadAndRealise.time_realise((1280, 960, 5), True, 'PP')                                |
|          | 491±4ms              | 487±4ms             |    0.99 | load.LoadAndRealise.time_realise((2, 2, 1000), False, 'FF')                                 |
|          | 3.08±0.1ms           | 2.86±0.1ms          |    0.93 | load.LoadAndRealise.time_realise((2, 2, 1000), False, 'NetCDF')                             |
|          | 493±3ms              | 489±4ms             |    0.99 | load.LoadAndRealise.time_realise((2, 2, 1000), False, 'PP')                                 |
|          | 505±4ms              | 501±6ms             |    0.99 | load.LoadAndRealise.time_realise((2, 2, 1000), True, 'FF')                                  |
|          | 3.08±0.1ms           | 3.01±0.1ms          |    0.98 | load.LoadAndRealise.time_realise((2, 2, 1000), True, 'NetCDF')                              |
|          | 500±3ms              | 500±5ms             |    1    | load.LoadAndRealise.time_realise((2, 2, 1000), True, 'PP')                                  |
|          | 1.71±0.08ms          | 1.67±0.06ms         |    0.97 | load.LoadAndRealise.time_realise((50, 50, 2), False, 'FF')                                  |
|          | 2.95±0.09ms          | 2.84±0.07ms         |    0.96 | load.LoadAndRealise.time_realise((50, 50, 2), False, 'NetCDF')                              |
|          | 1.77±0.07ms          | 1.69±0.08ms         |    0.95 | load.LoadAndRealise.time_realise((50, 50, 2), False, 'PP')                                  |
|          | 1.75±0.06ms          | 1.77±0.08ms         |    1.01 | load.LoadAndRealise.time_realise((50, 50, 2), True, 'FF')                                   |
|          | 3.00±0.1ms           | 2.87±0.09ms         |    0.96 | load.LoadAndRealise.time_realise((50, 50, 2), True, 'NetCDF')                               |
|          | 1.73±0.08ms          | 1.71±0.06ms         |    0.99 | load.LoadAndRealise.time_realise((50, 50, 2), True, 'PP')                                   |
|          | 9.99±0.1ms           | 9.82±0.2ms          |    0.98 | load.STASHConstraint.time_stash_constraint((1280, 960, 5), 'FF')                            |
|          | 10.2±0.09ms          | 9.87±0.1ms          |    0.97 | load.STASHConstraint.time_stash_constraint((1280, 960, 5), 'PP')                            |
|          | 1.56±0.01s           | 1.52±0.01s          |    0.97 | load.STASHConstraint.time_stash_constraint((2, 2, 1000), 'FF')                              |
|          | 1.59±0.01s           | 1.55±0.01s          |    0.97 | load.STASHConstraint.time_stash_constraint((2, 2, 1000), 'PP')                              |
|          | 5.08±0.03ms          | 5.07±0.04ms         |    1    | load.STASHConstraint.time_stash_constraint((2, 2, 2), 'FF')                                 |
|          | 5.06±0.07ms          | 5.08±0.04ms         |    1    | load.STASHConstraint.time_stash_constraint((2, 2, 2), 'PP')                                 |
|          | 9.01±0.09ms          | 8.93±0.1ms          |    0.99 | load.StructuredFF.time_structured_load((1280, 960, 5), False)                               |
|          | 5.85±0.04ms          | 5.91±0.08ms         |    1.01 | load.StructuredFF.time_structured_load((1280, 960, 5), True)                                |
|          | 1.53±0.01s           | 1.52±0.01s          |    0.99 | load.StructuredFF.time_structured_load((2, 2, 1000), False)                                 |
|          | 534±3ms              | 538±7ms             |    1.01 | load.StructuredFF.time_structured_load((2, 2, 1000), True)                                  |
|          | 4.28±0.05ms          | 4.26±0.06ms         |    1    | load.StructuredFF.time_structured_load((2, 2, 2), False)                                    |
|          | 4.13±0.03ms          | 4.16±0.04ms         |    1.01 | load.StructuredFF.time_structured_load((2, 2, 2), True)                                     |
|          | 164±1ms              | 164±2ms             |    1    | load.TimeConstraint.time_time_constraint(20, 'FF')                                          |
|          | 167±1ms              | 166±3ms             |    0.99 | load.TimeConstraint.time_time_constraint(20, 'PP')                                          |
|          | 33.0±0.3ms           | 31.9±0.6ms          |    0.97 | load.TimeConstraint.time_time_constraint(3, 'FF')                                           |
|          | 33.2±0.4ms           | 32.4±0.2ms          |    0.98 | load.TimeConstraint.time_time_constraint(3, 'PP')                                           |
|          | 55.3±0.6ms           | 49.6±0.7ms          |    0.9  | load.ugrid.BasicLoading.time_load_file(200000)                                              |
|          | 65.7±0.8ms           | 59.6±1ms            |    0.91 | load.ugrid.Callback.time_load_file_callback(200000)                                         |
|          | 2.91±0.09ms          | 2.88±0.2ms          |    0.99 | load.ugrid.DataRealisation.time_realise_data(10000)                                         |
|          | 4.84±0.1ms           | 4.87±0.3ms          |    1    | load.ugrid.DataRealisation.time_realise_data(200000)                                        |
|          | 37.3±1ms             | 36.9±1ms            |    0.99 | load.ugrid.DataRealisationTime.time_realise_data(10000)                                     |
|          | 802±6ms              | 802±7ms             |    1    | load.ugrid.DataRealisationTime.time_realise_data(200000)                                    |
|          | 428±4ms              | 430±4ms             |    1    | merge_concat.Concatenate.time_concatenate(False)                                            |
|          | 425±4ms              | 424±3ms             |    1    | merge_concat.Concatenate.time_concatenate(True)                                             |
|          | 109±0.3M             | 109±0.2M            |    1    | merge_concat.Concatenate.tracemalloc_concatenate(False)                                     |
|          | 109±0.3M             | 109±0.2M            |    1    | merge_concat.Concatenate.tracemalloc_concatenate(True)                                      |
|          | 58.1±0.5ms           | 58.2±1ms            |    1    | merge_concat.Merge.time_merge                                                               |
|          | 1.62±0.4M            | 1.61±0.4M           |    1    | merge_concat.Merge.tracemalloc_merge                                                        |
|          | 500±10ns             | 462±2ns             |    0.92 | mesh.utils.regions_combine.CombineRegionsComputeRealData.time_compute_data(50)              |
|          | 213±2ms              | 214±2ms             |    1    | mesh.utils.regions_combine.CombineRegionsComputeRealData.time_compute_data(500)             |
|          | 658±1k               | 658±0.9k            |    1    | mesh.utils.regions_combine.CombineRegionsComputeRealData.tracemalloc_compute_data(50)       |
|          | 60.1±0M              | 60.1±0M             |    1    | mesh.utils.regions_combine.CombineRegionsComputeRealData.tracemalloc_compute_data(500)      |
|          | 17.1±0.2ms           | 16.8±0.09ms         |    0.99 | mesh.utils.regions_combine.CombineRegionsCreateCube.time_create_combined_cube(50)           |
|          | 18.8±0.4ms           | 19.0±0.6ms          |    1.01 | mesh.utils.regions_combine.CombineRegionsCreateCube.time_create_combined_cube(500)          |
|          | 808±0.9k             | 808±0.7k            |    1    | mesh.utils.regions_combine.CombineRegionsCreateCube.tracemalloc_create_combined_cube(50)    |
|          | 12.7±0M              | 12.7±0M             |    1    | mesh.utils.regions_combine.CombineRegionsCreateCube.tracemalloc_create_combined_cube(500)   |
|          | 119±2ms              | 120±2ms             |    1.01 | mesh.utils.regions_combine.CombineRegionsFileStreamedCalc.time_stream_file2file(50)         |
|          | 680±7ms              | 681±8ms             |    1    | mesh.utils.regions_combine.CombineRegionsFileStreamedCalc.time_stream_file2file(500)        |
|          | 1.2±0.02M            | 1.21±0.01M          |    1.01 | mesh.utils.regions_combine.CombineRegionsFileStreamedCalc.tracemalloc_stream_file2file(50)  |
|          | 96.2±0.01M           | 96.2±0.03M          |    1    | mesh.utils.regions_combine.CombineRegionsFileStreamedCalc.tracemalloc_stream_file2file(500) |
|          | 78.6±1ms             | 78.5±2ms            |    1    | mesh.utils.regions_combine.CombineRegionsSaveData.time_save(50)                             |
|          | 628±6ms              | 628±10ms            |    1    | mesh.utils.regions_combine.CombineRegionsSaveData.time_save(500)                            |
|          | 1.18±0.01M           | 1.18±0.01M          |    1    | mesh.utils.regions_combine.CombineRegionsSaveData.tracemalloc_save(50)                      |
|          | 96.2±0.02M           | 96.2±0.01M          |    1    | mesh.utils.regions_combine.CombineRegionsSaveData.tracemalloc_save(500)                     |
|          | 2.1752849999999997   | 2.1752849999999997  |    1    | mesh.utils.regions_combine.CombineRegionsSaveData.track_filesize_saved(50)                  |
|          | 216.01528499999998   | 216.01528499999998  |    1    | mesh.utils.regions_combine.CombineRegionsSaveData.track_filesize_saved(500)                 |
|          | 6.49±0.02ms          | 6.54±0.03ms         |    1.01 | plot.AuxSort.time_aux_sort                                                                  |
|          | 83.0±4ms             | 83.4±2ms            |    1.01 | regridding.CurvilinearRegridding.time_regrid_pic                                            |
|          | 136±3M               | 136±3M              |    1    | regridding.CurvilinearRegridding.tracemalloc_regrid_pic                                     |
|          | 101±0.9ms            | 100±0.8ms           |    0.99 | regridding.HorizontalChunkedRegridding.time_regrid_area_w                                   |
|          | 49.3±0.6ms           | 49.8±0.5ms          |    1.01 | regridding.HorizontalChunkedRegridding.time_regrid_area_w_new_grid                          |
|          | 106±0.03M            | 106±0.03M           |    1    | regridding.HorizontalChunkedRegridding.tracemalloc_regrid_area_w                            |
|          | 147±0.02M            | 147±0.02M           |    1    | regridding.HorizontalChunkedRegridding.tracemalloc_regrid_area_w_new_grid                   |
|          | 4.49±0.04ms          | 4.52±0.1ms          |    1.01 | save.NetcdfSave.time_netcdf_save_cube(50, False)                                            |
|          | 80.6±0.6ms           | 80.5±1ms            |    1    | save.NetcdfSave.time_netcdf_save_cube(50, True)                                             |
|          | 52.1±1ms             | 53.6±1ms            |    1.03 | save.NetcdfSave.time_netcdf_save_cube(600, False)                                           |
|          | 583±3ms              | 582±4ms             |    1    | save.NetcdfSave.time_netcdf_save_cube(600, True)                                            |
|          | 91.4±2ns             | 90.0±1ns            |    0.99 | save.NetcdfSave.time_netcdf_save_mesh(50, False)                                            |
|          | 62.9±0.8ms           | 62.5±0.5ms          |    0.99 | save.NetcdfSave.time_netcdf_save_mesh(50, True)                                             |
|          | 89.6±2ns             | 88.1±1ns            |    0.98 | save.NetcdfSave.time_netcdf_save_mesh(600, False)                                           |
|          | 514±3ms              | 513±3ms             |    1    | save.NetcdfSave.time_netcdf_save_mesh(600, True)                                            |
|          | 28.9±0.07k           | 28.9±0.1k           |    1    | save.NetcdfSave.tracemalloc_netcdf_save(50, False)                                          |
|          | 1.59±0.1M            | 1.63±0.2M           |    1.02 | save.NetcdfSave.tracemalloc_netcdf_save(50, True)                                           |
|          | 28.9±0.07k           | 28.9±0.09k          |    1    | save.NetcdfSave.tracemalloc_netcdf_save(600, False)                                         |
|          | 225±20M              | 225±30M             |    1    | save.NetcdfSave.tracemalloc_netcdf_save(600, True)                                          |
|          | 42.6±0.3ms           | 42.4±0.5ms          |    1    | stats.PearsonR.time_lazy                                                                    |
|          | 9.17±0.2ms           | 9.09±0.2ms          |    0.99 | stats.PearsonR.time_real                                                                    |
|          | 24.2±1M              | 24.4±0.8M           |    1.01 | stats.PearsonR.tracemalloc_lazy                                                             |
|          | 18.4±0.01M           | 18.4±0.01M          |    1    | stats.PearsonR.tracemalloc_real                                                             |
|          | 22.9±0.3ms           | 23.5±0.8ms          |    1.02 | trajectory.TrajectoryInterpolation.time_trajectory_linear                                   |
|          | 61.5±0.8ms           | 61.2±0.3ms          |    1    | trajectory.TrajectoryInterpolation.time_trajectory_nearest                                  |
|          | 23.3±0.01M           | 23.3±0.01M          |    1    | trajectory.TrajectoryInterpolation.tracemalloc_trajectory_linear                            |
|          | 12.1±0.03M           | 12.1±0.03M          |    1    | trajectory.TrajectoryInterpolation.tracemalloc_trajectory_nearest                           |

Generated by GHA run 11821338993
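
The ratio column in the table above can be reproduced from the two timing columns. The sketch below assumes the columns are "before" and "after" and that the ratio is after divided by before, so values below 1 mean the change is faster. The helper names `parse_time` and `ratio` are hypothetical and not part of asv or Iris. Because the displayed timings are already rounded, a recomputed ratio can occasionally differ from the table in the last digit.

```python
import re

# Unit suffixes as they appear in the timing cells, mapped to seconds.
_UNITS = {"ns": 1e-9, "μs": 1e-6, "ms": 1e-3, "s": 1.0}


def parse_time(cell: str) -> float:
    """Return the central value of a cell such as '55.3±0.6ms', in seconds."""
    match = re.fullmatch(r"([\d.]+)±[\d.]+(ns|μs|ms|s)", cell.strip())
    if match is None:
        raise ValueError(f"not a timing cell: {cell!r}")
    value, unit = match.groups()
    return float(value) * _UNITS[unit]


def ratio(before: str, after: str) -> float:
    """Assumed definition: after / before, rounded to two decimals."""
    return round(parse_time(after) / parse_time(before), 2)


# Values copied from the load.ugrid.BasicLoading.time_load_file(200000) row.
print(ratio("55.3±0.6ms", "49.6±0.7ms"))
```

Memory (`tracemalloc_*`) and `track_*` rows use other units and are deliberately rejected by this parser.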

@bouweandela
Member Author

@schlunma This seems to do great on the small example files you provided and the small benchmark test cases. Could you try this with your real-sized files to see if it is any good in practice?

@schlunma
Contributor

These were real-sized files (the model saves only one time slice per file).

@bouweandela bouweandela changed the title Improve speed of loading NetCDF files Improve speed of loading small NetCDF files Nov 15, 2024
@bouweandela bouweandela marked this pull request as ready for review November 15, 2024 11:48