Perf and Pre-Allocation #396

Merged: 11 commits, Nov 28, 2024

@willtebbutt commented on Nov 28, 2024

This PR

  1. makes a function `@generated`, because type inference sometimes fails for it in cases that matter, and
  2. moves the pre-allocation functionality in-house. It largely generalises code that already exists in DI.jl, but having it here lets us test it thoroughly in this repository and then have DI.jl simply call it. This gives me more confidence that users of DI.jl get the best possible performance.
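
As a rough illustration of the first point, the sketch below shows how a `@generated` function can sidestep inference trouble by building fully unrolled code from the argument's type. The function `sum_fields` is hypothetical and is not the function changed in this PR; it only demonstrates the general technique.

```julia
# Hypothetical example; `sum_fields` is not part of Mooncake.
# Inside the generator body, `x` is bound to the argument's *type*, so the
# code can inspect `T` and return an expression specialised to it.
@generated function sum_fields(x::T) where {T<:Tuple}
    n = fieldcount(T)                      # number of tuple elements, known from the type
    n == 0 && return :(0)                  # empty tuple: nothing to add
    terms = [:(getfield(x, $i)) for i in 1:n]
    # Builds getfield(x, 1) + getfield(x, 2) + ... with no loop or recursion
    # left for the compiler to infer through.
    return Expr(:call, :+, terms...)
end

total = sum_fields((1, 2.0, 3))
```

Because the returned expression contains no loop over a heterogeneous collection, the compiler sees a fixed sequence of `getfield` calls with concrete types.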

Todo:

  • a few more tests for value_and_pullback!! with a cache
  • mark these new functions as "experimental" (I think the design is fine, but I don't want to commit to it yet)
  • verify locally that I'm able to use these functions to substantially simplify the implementation of the Mooncake extension in DI.jl.
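
The pre-allocation idea in the description can be sketched in plain Julia as follows. This is a minimal illustration under stated assumptions: `GradCache`, `prepare_cache`, and `value_and_gradient!` are hypothetical names, the gradient is hand-written for `f(x) = sum(abs2, x)`, and none of this is Mooncake's actual API.

```julia
# Hypothetical cache type; not Mooncake's API.
# It owns a buffer that is allocated once and reused on every call.
struct GradCache{T}
    buffer::Vector{T}
end

# Allocate the gradient storage up front, sized to match `x`.
prepare_cache(x::Vector{T}) where {T} = GradCache(similar(x))

# Value and gradient of f(x) = sum(abs2, x). The gradient 2x is written into
# the cached buffer in place, so repeated calls allocate no new arrays.
function value_and_gradient!(cache::GradCache, x::Vector)
    cache.buffer .= 2 .* x
    return sum(abs2, x), cache.buffer
end

x = [1.0, 2.0, 3.0]
cache = prepare_cache(x)
val, grad = value_and_gradient!(cache, x)
```

The returned gradient aliases the cache's buffer, so a caller that needs to keep it across calls would have to copy it first.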

codecov bot commented Nov 28, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Files with missing lines    Coverage Δ
src/fwds_rvs_data.jl        96.84% <100.00%> (ø)
src/interface.jl            94.82% <100.00%> (+3.16%) ⬆️
src/tangents.jl             86.98% <100.00%> (ø)

... and 2 files with indirect coverage changes

github-actions bot commented Nov 28, 2024

Performance Ratio:
Ratio of the time to compute the gradient to the time to compute the function (lower is better).
Warning: results are very approximate! See here for more context.

┌────────────────────────────┬──────────┬─────────┬─────────────┬─────────┐
│                      Label │ Mooncake │  Zygote │ ReverseDiff │  Enzyme │
│                     String │   String │  String │      String │  String │
├────────────────────────────┼──────────┼─────────┼─────────────┼─────────┤
│                   sum_1000 │     70.7 │     1.0 │        5.61 │ missing │
│                  _sum_1000 │      6.7 │  1260.0 │        33.2 │ missing │
│               sum_sin_1000 │     2.24 │     1.7 │        10.6 │ missing │
│              _sum_sin_1000 │     2.58 │   279.0 │        12.9 │ missing │
│                   kron_sum │     54.9 │    3.31 │       187.0 │ missing │
│              kron_view_sum │     53.5 │    8.57 │       198.0 │ missing │
│      naive_map_sin_cos_exp │     2.51 │ missing │        7.09 │ missing │
│            map_sin_cos_exp │     2.83 │    1.54 │        6.12 │ missing │
│      broadcast_sin_cos_exp │     2.58 │    2.26 │        1.45 │ missing │
│                 simple_mlp │      7.6 │    3.27 │        12.0 │ missing │
│                     gp_lml │     11.4 │    4.49 │     missing │ missing │
│ turing_broadcast_benchmark │     3.18 │ missing │        24.9 │ missing │
│         large_single_block │     4.01 │  4660.0 │        30.9 │ missing │
└────────────────────────────┴──────────┴─────────┴─────────────┴─────────┘

@willtebbutt merged commit d2d97a2 into main on Nov 28, 2024
74 checks passed
@willtebbutt deleted the wct/perf-fixes branch on November 28, 2024 17:05