Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add dual number–based second derivatives for ForwardDiff #310

Merged
merged 16 commits into from
Jun 7, 2024

Conversation

gerlero
Copy link
Contributor

@gerlero gerlero commented Jun 6, 2024

As discussed in #305, I'm starting by copying the implementations from JuliaDiff/AbstractDifferentiation.jl#122 almost verbatim.

@codecov-commenter
Copy link

codecov-commenter commented Jun 6, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 96.08%. Comparing base (61c2741) to head (07ccb26).
Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #310      +/-   ##
==========================================
+ Coverage   96.05%   96.08%   +0.02%     
==========================================
  Files          95       95              
  Lines        4365     4398      +33     
==========================================
+ Hits         4193     4226      +33     
  Misses        172      172              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@gdalle
Copy link
Member

gdalle commented Jun 6, 2024

You can disregard the Enzyme failures, they're obviously unrelated

@gdalle
Copy link
Member

gdalle commented Jun 6, 2024

Can you try using the functions in the utils.jl file to make this scalar-/vector- agnostic?

@gerlero
Copy link
Contributor Author

gerlero commented Jun 6, 2024

Can you try using the functions in the utils.jl file to make this scalar-/vector- agnostic?

I'll give it a try

@gdalle
Copy link
Member

gdalle commented Jun 6, 2024

Cool, thanks! It should be similar to the pushforward one, albeit more intricate

@gerlero
Copy link
Contributor Author

gerlero commented Jun 6, 2024

I'm using the functions from utils.jl now, but the code still won't work with vector inputs due to my use of make_dual(T, x, one(x)) (as one(x) is not defined for vectors).

@gerlero
Copy link
Contributor Author

gerlero commented Jun 6, 2024

It seems my last comment was due to some confusion. In a previous comment you only mention the need to handle $\mathbb{R}\to\mathbb{R}^n$ functions, so vector inputs should not expected to work with the derivative functions. Is this correct?

@gerlero
Copy link
Contributor Author

gerlero commented Jun 6, 2024

After a small fixup, the new functionality should be working now. I've tested it and it seems to handle vector-valued (w/ scalar input) functions just fine.

@gdalle
Copy link
Member

gdalle commented Jun 6, 2024

Yes, derivative is for scalar to scalar or scalar to vector. Thank you for the contribution, I'll allow the workflows and review tomorrow

@gdalle
Copy link
Member

gdalle commented Jun 6, 2024

Your code is never touched because you did not overload preparation, or provide methods which accept the additional "extras" argument. I'll fix it tomorrow, but if you want to do it yourself, you can take inspiration from the other methods in the extension

@gerlero
Copy link
Contributor Author

gerlero commented Jun 7, 2024

I'll fix it tomorrow, but if you want to do it yourself, you can take inspiration from the other methods in the extension

I tried it but had to give up (at least for now), as I'm not familiar at all with how extras and pushforward interact with each other (I tried returning a NoDerivativeExtras from prepare_derivative, but then I get ERROR: type NoDerivativeExtras has no field pushforward_extras which is not obvious to me how to fix). I'm guessing you have a much more solid idea in your mind for this part.

@gerlero gerlero marked this pull request as ready for review June 7, 2024 01:05
@gdalle gdalle changed the title Add dual number–based implementations for ForwardDiff Add dual number–based second derivatives for ForwardDiff Jun 7, 2024
@gdalle
Copy link
Member

gdalle commented Jun 7, 2024

Indeed, the current version of the package gives errors like this one, which are very hard to parse:

type NoDerivativeExtras has no field pushforward_extras
  Stacktrace:
    [1] getproperty(x::DifferentiationInterface.NoDerivativeExtras, f::Symbol)
      @ Base ./Base.jl:37
    [2] value_and_derivative!(f::DifferentiationInterfaceTest.var"#num_to_arr#13"{StaticArraysCore.SMatrix{2, 3, Int64, 6}}, der::StaticArraysCore.MMatrix{2, 3, Float64, 6}, backend::AutoForwardDiff{2, Symbol}, x::Float64, extras::DifferentiationInterface.NoDerivativeExtras)
      @ DifferentiationInterface ~/work/DifferentiationInterface.jl/DifferentiationInterface.jl/DifferentiationInterface/src/first_order/derivative.jl:91

I thus coded #313 which should make errors much more pleasant, let's see how it looks after merging main into the current branch

@gdalle
Copy link
Member

gdalle commented Jun 7, 2024

And the new error is much more informative! I'll add these indications to the docs, but essentially it tells you that you forgot to implement some necessary methods:

MethodError: no method matching derivative(::DifferentiationInterfaceTest.var"#num_to_arr#13"{StaticArraysCore.SMatrix{2, 3, Int64, 6}}, ::AutoForwardDiff{2, Symbol}, ::Float64, ::DifferentiationInterface.NoDerivativeExtras)
  
  Closest candidates are:
    derivative(::F, ::Any, ::AutoForwardDiff, ::Any, ::DifferentiationInterfaceForwardDiffExt.ForwardDiffTwoArgDerivativeExtras) where F
     @ DifferentiationInterfaceForwardDiffExt ~/work/DifferentiationInterface.jl/DifferentiationInterface.jl/DifferentiationInterface/ext/DifferentiationInterfaceForwardDiffExt/twoarg.jl:95
    derivative(::F, ::Any, ::ADTypes.AbstractADType, ::Any, ::DifferentiationInterface.PushforwardDerivativeExtras) where F
     @ DifferentiationInterface ~/work/DifferentiationInterface.jl/DifferentiationInterface.jl/DifferentiationInterface/src/first_order/derivative.jl:147
    derivative(::F, ::Any, ::ADTypes.AbstractADType, ::Any) where F
     @ DifferentiationInterface ~/work/DifferentiationInterface.jl/DifferentiationInterface.jl/DifferentiationInterface/src/first_order/derivative.jl:127
    ...

Whenever you implement a new operator, you need to provide

  • the preparation step prepare_operator
  • the four variants operator, operator!, value_and_operator, value_and_operator!

Sorry to use you as a bit of a lab rat, but you're the first one to try and contribute something nontrivial to DI (for which I'm grateful!). That means it's a good test for whether our process is clear enough or not (obviously not)

@gerlero
Copy link
Contributor Author

gerlero commented Jun 7, 2024

Whenever you implement a new operator, you need to provide

  • the preparation step prepare_operator
  • the four variants operator, operator!, value_and_operator, value_and_operator!

Sorry to use you as a bit of a lab rat, but you're the first one to try and contribute something nontrivial to DI (for which I'm grateful!). That means it's a good test for whether our process is clear enough or not (obviously not)

No problem. Thanks! I've added derivative now too.

Not sure about derivative! and value_and_derivative!, though: are those supposed to return ForwardDiff.DiffResults structs or some other DI-specific format?

@gdalle
Copy link
Member

gdalle commented Jun 7, 2024

You can ask the docstrings: we return a tuple of the value and its derivative, whatever type they have, so no DiffResults.

@gdalle
Copy link
Member

gdalle commented Jun 7, 2024

Do you want me to finish the PR? I definitely can if you get tired of the DI kinks. But having one more autodiff-interested individual look at the guts of the code is rather instructive for me

@gerlero
Copy link
Contributor Author

gerlero commented Jun 7, 2024

I'll make my attempt to finish it

@gerlero
Copy link
Contributor Author

gerlero commented Jun 7, 2024

All right, I've added the missing methods for derivative! and value_and_derivative! (not sure if there's anything left to do).

@gdalle
Copy link
Member

gdalle commented Jun 7, 2024

I know what's going on!
Sorry for not clocking it earlier, but I think you're targeting the wrong operator here. You noticed that DI.derivative wasn't implemented in the ForwardDiff extension for one-argument functions. The reason is that it the derivative exactly a pushforward, and the existing pushforward code is basically optimal there.
The operator you're interested in is second_derivative, and if you look at the table here, you'll find its 4 variants are the following ones:

  • second_derivative
  • second_derivative!
  • value_derivative_and_second_derivative
  • value_derivative_and_second_derivative!

So you actually need to code these, plus a suitable prepare_second_derivative.
Again, if this is no fun for you, I can take it off your hands anytime.

@gdalle
Copy link
Member

gdalle commented Jun 7, 2024

(in addition there are test failures due to typos but that's an easy fix)

@gerlero
Copy link
Contributor Author

gerlero commented Jun 7, 2024

You can also drop value_and_derivative and its in-place counterpart. This too is taken care of by value_and_pushforward in an essentially optimal way.

Are we sure that doesn't rely on DiffResults at all?

@gdalle
Copy link
Member

gdalle commented Jun 7, 2024

Are we sure that doesn't rely on DiffResults at all?

  • derivative(f!, y, backend, x) (with its variants) relies on the ForwardDiff + DiffResults API, see here
  • derivative(f, backend, x) (with its variants) has no equivalent API to rely on, at least not a public one, which is why I don't think we gain much by going beyond what already exists for the pushforward?

@gerlero
Copy link
Contributor Author

gerlero commented Jun 7, 2024

  • derivative(f, backend, x) (with its variants) has no equivalent API to rely on, at least not a public one, which is why I don't think we gain much by going beyond what already exists for the pushforward?

Great. I'm still not sure what the fallback is for each method so I just wanted to check (a point of having value_and_derivative/value_derivative_and_second_derivative AFAIC is to avoid having to use the slower DiffResults API).

@gdalle
Copy link
Member

gdalle commented Jun 7, 2024

Whenever it exists, I prefer sticking to the backend's existing API, in particular the public parts. My rationale is that I can't afford to try and optimize 13 backends better than their own individual creators, so I make do with what said creators give me. ForwardDiff might be a bit of an exception because technically the Dual parts of the package are not public, but you get the general idea. If DiffResults is slow, it's on them, not on me.

If you want to investigate how value_and_derivative(f, backend, x) works for ForwardDiff without an explicit implementation, it falls back on pushforward via the following call chain:

Generic

https://github.com/gdalle/DifferentiationInterface.jl/blob/a99715453fe269f5a8f00d66d1a81fe3480acbd3/DifferentiationInterface/src/first_order/derivative.jl#L58-L65

https://github.com/gdalle/DifferentiationInterface.jl/blob/a99715453fe269f5a8f00d66d1a81fe3480acbd3/DifferentiationInterface/src/first_order/derivative.jl#L75-L77

https://github.com/gdalle/DifferentiationInterface.jl/blob/a99715453fe269f5a8f00d66d1a81fe3480acbd3/DifferentiationInterface/src/first_order/derivative.jl#L91-L95

Specific

https://github.com/gdalle/DifferentiationInterface.jl/blob/a99715453fe269f5a8f00d66d1a81fe3480acbd3/DifferentiationInterface/ext/DifferentiationInterfaceForwardDiffExt/onearg.jl#L30-L37

https://github.com/gdalle/DifferentiationInterface.jl/blob/a99715453fe269f5a8f00d66d1a81fe3480acbd3/DifferentiationInterface/ext/DifferentiationInterfaceForwardDiffExt/onearg.jl#L3-L19

Do you think we can get better than this?

@gerlero
Copy link
Contributor Author

gerlero commented Jun 7, 2024

Whenever it exists, I prefer sticking to the backend's existing API, in particular the public parts. My rationale is that I can't afford to try and optimize 13 backends better than their own individual creators, so I make do with what said creators give me. ForwardDiff might be a bit of an exception because technically the Dual parts of the package are not public, but you get the general idea. If DiffResults is slow, it's on them, not on me.

Yes, I know, and appreciate the exception you're making here! In fact, I have an open PR (JuliaDiff/ForwardDiff.jl#678) at ForwardDiff asking for this to be implemented there, but with no luck so far.

If you want to investigate how value_and_derivative(f, backend, x) works for ForwardDiff without an explicit implementation, it falls back on pushforward via the following call chain:

Do you think we can get better than this?

Reading through the code, I don't think so. Also, in my benchmark case this PR is now running as fast as AbstractDifferentiation.

@gdalle
Copy link
Member

gdalle commented Jun 7, 2024

Spoiler alert, I think (hope) the tests will fail because you didn't implement second_derivative or second_derivative!

@gerlero
Copy link
Contributor Author

gerlero commented Jun 7, 2024

Spoiler alert, I think (hope) the tests will fail because you didn't implement second_derivative or second_derivative!

Ouch! I'll add them.

@gdalle
Copy link
Member

gdalle commented Jun 7, 2024

Is it just me or are the checks running forever?

@gerlero
Copy link
Contributor Author

gerlero commented Jun 7, 2024

Sorry, I had an extra parameter (not sure where it came from) in second_derivative(!). Let's see now.

I'm also now running the tests locally to see what happens.

@gdalle
Copy link
Member

gdalle commented Jun 7, 2024

This file should be enough. Don't run the whole test suite you will need an hour or so

https://github.com/gdalle/DifferentiationInterface.jl/blob/main/DifferentiationInterfaceTest/test/forwarddiff.jl

@gerlero
Copy link
Contributor Author

gerlero commented Jun 7, 2024

This file should be enough. Don't run the whole test suite you will need an hour or so

https://github.com/gdalle/DifferentiationInterface.jl/blob/main/DifferentiationInterfaceTest/test/forwarddiff.jl

That's what I did. Caught another bug :/. Running again to see if it's fixed

@gdalle
Copy link
Member

gdalle commented Jun 7, 2024

When the tests fail, do you usually find it clear what you need to fix?

@gerlero
Copy link
Contributor Author

gerlero commented Jun 7, 2024

Should be fixed now. I'm getting a couple of efficiency test errors with ForwardDiff, but I'm not sure they are related.

When the tests fail, do you usually find it clear what you need to fix?

Yes, you get the stack trace right there as soon as a test fails, which is nice. IDK how it could be made better.

Copy link
Member

@gdalle gdalle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution!

@gdalle gdalle merged commit e93d16a into JuliaDiff:main Jun 7, 2024
49 checks passed
@gerlero
Copy link
Contributor Author

gerlero commented Jun 7, 2024

Thank you! Can I ask for a new release with this change?

@gdalle
Copy link
Member

gdalle commented Jun 7, 2024

Yes, i was just waiting to check that the tests pass on main.
Did you figure out why the efficiency tests were failing? Are you on a different Julia version perhaps?

@gerlero
Copy link
Contributor Author

gerlero commented Jun 7, 2024

Did you figure out why the efficiency tests were failing? Are you on a different Julia version perhaps?

Not really; I just saw that something is allocating when it shouldn't. FWIW these are the errors:

...
[ Info: Testing Single/ForwardDiff/efficiency.jl
derivative! - vecexp! : Float64 -> Vector{Float64}: Test Failed at /Users/gabriel/DifferentiationInterface.jl/DifferentiationInterface/test/Single/ForwardDiff/efficiency.jl:49dDiff (forward) - 1/1
  Expression: row[:allocs] == 0rio - 2/2
   Evaluated: 1.0 == 0
  arguments:      2
Stacktrace:       inplace
 [1] macro expansioncexp!
   @ /Applications/Julia-1.10.app/Contents/Resources/julia/share/julia/stdlib/v1.10/Test/src/Test.jl:672 [inlined]
 [2] macro expansion0,)
   @ ~/DifferentiationInterface.jl/DifferentiationInterface/test/Single/ForwardDiff/efficiency.jl:49 [inlined]
 [3] macro expansion0,)
   @ /Applications/Julia-1.10.app/Contents/Resources/julia/share/julia/stdlib/v1.10/Test/src/Test.jl:1577 [inlined]
 [4] macro expansion
   @ ~/DifferentiationInterface.jl/DifferentiationInterface/test/Single/ForwardDiff/efficiency.jl:49 [inlined]
 [5] macro expansion
   @ /Applications/Julia-1.10.app/Contents/Resources/julia/share/julia/stdlib/v1.10/Test/src/Test.jl:1577 [inlined]
 [6] top-level scope
   @ ~/DifferentiationInterface.jl/DifferentiationInterface/test/Single/ForwardDiff/efficiency.jl:18
jacobian! - vecexp! : Vector{Float64} -> Vector{Float64}: Test Failed at /Users/gabriel/DifferentiationInterface.jl/DifferentiationInterface/test/Single/ForwardDiff/efficiency.jl:49
  Expression: row[:allocs] == 0
   Evaluated: 1.0 == 0

Stacktrace:
 [1] macro expansion
   @ /Applications/Julia-1.10.app/Contents/Resources/julia/share/julia/stdlib/v1.10/Test/src/Test.jl:672 [inlined]
 [2] macro expansion
   @ ~/DifferentiationInterface.jl/DifferentiationInterface/test/Single/ForwardDiff/efficiency.jl:49 [inlined]
 [3] macro expansion
   @ /Applications/Julia-1.10.app/Contents/Resources/julia/share/julia/stdlib/v1.10/Test/src/Test.jl:1577 [inlined]
 [4] macro expansion
   @ ~/DifferentiationInterface.jl/DifferentiationInterface/test/Single/ForwardDiff/efficiency.jl:49 [inlined]
 [5] macro expansion
   @ /Applications/Julia-1.10.app/Contents/Resources/julia/share/julia/stdlib/v1.10/Test/src/Test.jl:1577 [inlined]
 [6] top-level scope
   @ ~/DifferentiationInterface.jl/DifferentiationInterface/test/Single/ForwardDiff/efficiency.jl:18
[ Info: Testing Single/ForwardDiff/test.jl
...

I'm on 1.10.4 on macOS arm64.

@gdalle
Copy link
Member

gdalle commented Jun 7, 2024

Very strange!
Anyway, the registration is triggered, should be merged in 15 minutes: JuliaRegistries/General#108479
Can I ask what packages you might wanna use DI for? Keeping this issue updated brings me great joy: #134

@gerlero
Copy link
Contributor Author

gerlero commented Jun 7, 2024

Can I ask what packages you might wanna use DI for? Keeping this issue updated brings me great joy: #134

I'll be using it in Fronts to replace AbstractDifferentiation as it is used in this file.

I should be ready to migrate as soon as the new release is live. I'll make sure to mention the relevant PR in #134.

@gdalle
Copy link
Member

gdalle commented Jun 7, 2024

Interesting! Since it will also allow you to try other backends, I wonder if symbolic backends might be close to optimal for such a one-dimensional case. What kind of functions do you often end up differentiating?

@gerlero
Copy link
Contributor Author

gerlero commented Jun 7, 2024

Interesting! Since it will also allow you to try other backends, I wonder if symbolic backends might be close to optimal for such a one-dimensional case.

Thanks. It's probably worth a try. I should first learn how symbolic differentiation works with DI, but I think it's worth testing to see if I can use it to shave a few milliseconds off common cases.

What kind of functions do you often end up differentiating?

Functions like this one (plus others also present in the PorousModels directory):

https://github.com/gerlero/Fronts.jl/blob/78bea2f1f6f2f363a461f2c3c7a2e912fd5ea072/src/PorousModels/letd.jl#L45

@gdalle
Copy link
Member

gdalle commented Jun 7, 2024

For symbolic differentiation, preparation (the extras shenanigans) is very costly but absolutely essential, since it compiles executable versions of your function and its derivatives. At the moment I haven't super optimized that part because hardly anyone uses it. But if you want to use it, I'll definitely squeeze out some more performance

@gerlero
Copy link
Contributor Author

gerlero commented Jun 7, 2024

Thanks. I'll give it a try.

Any recommendation for which symbolic backend I can start with? FastDifferentiation?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants