DifferentiationInterface supports Tapir! #107
I don't see a reason why not. I'll leave other questions to @willtebbutt
Ahh interesting. Yes, Tapir is very precise about the types it uses to represent (co)tangents. Do you have a list of types that you are interested in supporting, or some general conventions for (co)tangent types in DifferentiationInterface? That might be a good place to start if we're thinking about how to interface the two.
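To make "precise about types" concrete, here is a small sketch of querying those canonical (co)tangent types via `Tapir.tangent_type`, the function that comes up later in this thread:

```julia
import Tapir

# Each primal type maps to exactly one tangent type.
Tapir.tangent_type(Float64)          # Float64
Tapir.tangent_type(Vector{Float64})  # Vector{Float64}
```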
You should just pass the function, I think. The kind of thing I'm imagining you mean is something like `Tapir.value_and_gradient!!(rrule!!, sin, 5.0)`, for which you'll get `(-0.9589242746631385, (NoTangent(), 0.28366218546322625))`. If it fits with your interface, I would recommend just returning the relevant element of the gradient tuple onwards. Is this the kind of example that you have in mind?
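A runnable version of that example; the `Tapir.build_rrule` construction step is an assumption on my part, not something stated in the comment:

```julia
import Tapir

# Assumption: `build_rrule` is how the `rrule!!` object above is constructed.
rrule!! = Tapir.build_rrule(sin, 5.0)
y, grads = Tapir.value_and_gradient!!(rrule!!, sin, 5.0)
y         # -0.9589242746631385
grads[1]  # NoTangent(): the gradient slot for `sin` itself
grads[2]  # 0.28366218546322625 == cos(5.0), the entry DI would return
```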
We don't presently support second-order AD, and adding support is not yet on our roadmap :) I should probably make that clearer in the readme.
Agreed -- we should definitely have support for this in the interface, as support for mutating functions is one of Tapir's interesting features. Can I ask -- what is currently preventing Tapir from handling mutating functions when used from DI?
No, we try to be as agnostic as possible in that regard. But the typical case where it fails is the computation of a Jacobian, where we backpropagate basis arrays of tangent type `FillArrays.OneElement` (see the sketch below).
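For illustration, a minimal sketch of what "backpropagating basis arrays" means when assembling a Jacobian; `pullback` here is a hypothetical reverse-mode primitive, not an actual DI or Tapir function:

```julia
using FillArrays

# Build the Jacobian of f at x row by row: pull back one basis covector
# per output component. `pullback(f, x, dy)` is a hypothetical stand-in
# returning the input cotangent dx.
function jacobian_by_pullbacks(pullback, f, x)
    y = f(x)
    rows = map(eachindex(y)) do i
        dy = OneElement(1.0, i, length(y))  # i-th basis covector e_i
        pullback(f, x, dy)                  # i-th row of the Jacobian
    end
    return permutedims(reduce(hcat, rows))
end
```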
Yes, this currently works, but I'm wondering if it saves time to mark it as constant.
That's the thing though: DI does support second order, and it is entirely built from first-order operators (a sketch below). So the real question is whether Tapir is able to differentiate over its own differentials.
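As a conceptual sketch of "built from first-order operators" (the operator names are placeholders, not a real API): a Hessian-vector product is just a forward-mode directional derivative of a reverse-mode gradient.

```julia
# `derivative` and `gradient` stand for any first-order forward- and
# reverse-mode operators; composing them yields a second-order quantity.
hvp(derivative, gradient, f, x, v) = derivative(t -> gradient(f, x .+ t .* v), 0.0)
```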
Mostly the fact that I haven't tried it; I'll let you know how it goes.
Understood.
Ahhh I see. Yeah, we'll definitely have to set up custom conversions for this. For context: Tapir insists that each primal type (say, `Vector{Float64}`) has a unique tangent type, given by `Tapir.tangent_type`. My initial thoughts are that something along the following lines might make sense:

```julia
# default definition
function convert_to_tapir_tangent(::Type{primal_type}, tangent::T) where {primal_type,T}
    if Tapir.tangent_type(primal_type) == T
        # the tangent type is already what Tapir requires
        return tangent
    else
        # the tangent type isn't what Tapir needs
        throw(error("An informative error message"))
    end
end

# specific conversions
function convert_to_tapir_tangent(::Type{Vector{Float64}}, tangent::FillArrays.OneElement{Float64})
    return collect(tangent)
end
```

I'm not sure if this is entirely what you need -- I guess it is kind of vaguely similar to what you already have.
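A small usage example of the sketch above, assuming the two methods just defined are in scope:

```julia
using FillArrays

e2 = OneElement(1.0, 2, 5)                     # basis vector e_2 in R^5
convert_to_tapir_tangent(Vector{Float64}, e2)  # [0.0, 1.0, 0.0, 0.0, 0.0]
```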
Ahh I see. At the minute we don't support activity analysis, except insofar as primals which are non-differentiable always have `NoTangent` tangents. So I think my original answer stands: if a user is using a callable struct as the function argument, they will wind up differentiating w.r.t. its fields (illustrated below). This might hit performance, but it shouldn't hit correctness.
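An illustration of that point, reusing the `build_rrule`/`value_and_gradient!!` calls from earlier; the rule-construction step and the exact shape of the structural tangent are assumptions:

```julia
import Tapir

# A callable struct: with no constant/activity annotation, its field `a`
# is differentiated along with the input.
struct Scale
    a::Float64
end
(s::Scale)(x) = s.a * x

s = Scale(2.0)
rule = Tapir.build_rrule(s, 3.0)  # assumption: same construction as before
y, grads = Tapir.value_and_gradient!!(rule, s, 3.0)
# grads[1] should be a structural tangent carrying dy/d(s.a) = 3.0,
# not NoTangent(); grads[2] == 2.0 is the gradient w.r.t. x.
```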
Ahhh okay. Currently no, because
Cool -- I look forward to seeing how this winds up looking!
But what if I mark this callable struct as non-differentiable?
What's cool about DI is that we can mix backends for second order, though. The most efficient way to compute a Hessian is forward-over-reverse, so we're free to test any forward outer backend with Tapir as the inner backend, and see if the forward backend can differentiate through this (sketched below).
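A hedged sketch of what that mixing could look like through DI, assuming the `AutoTapir` backend type that ADTypes provided at the time (DI re-exports these backend types):

```julia
using DifferentiationInterface
import ForwardDiff, Tapir

# Outer forward-mode backend over inner reverse-mode backend.
backend = SecondOrder(AutoForwardDiff(), AutoTapir())
H = hessian(x -> sum(abs2, x), backend, rand(3))  # works iff ForwardDiff
                                                  # can trace through Tapir
```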
For a primal of type `Vector{Float64}`, `zero_codual(ones(5))` will produce something like `CoDual(ones(5), zeros(5))`. It's just saying "initialise the tangent bit of the codual to zero", as opposed to "this thing will always be zero". Tapir does not presently have an equivalent of the latter kind of constant annotation.
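In code, with the `primal`/`tangent` accessors assumed to exist on `CoDual`:

```julia
import Tapir

cd = Tapir.zero_codual(ones(5))  # conceptually CoDual(ones(5), zeros(5))
Tapir.primal(cd)                 # ones(5)
Tapir.tangent(cd)                # zeros(5): initialised to zero, but free
                                 # to become nonzero during the reverse pass
```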
Perhaps -- would be interesting to know.
Would you mind taking a look at JuliaDiff/DifferentiationInterface.jl#126? This is my first shot at mutation with Tapir, and I don't understand what fails.
I'd be curious to see whether that works. IIUC, Zygote's second-order derivatives also use ForwardDiff over Zygote's reverse mode (a concrete sketch below).
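Concretely, that forward-over-reverse pattern looks like the following; this mirrors the approach `Zygote.hessian` takes internally:

```julia
using ForwardDiff, Zygote
using LinearAlgebra

f(x) = sum(abs2, x) / 2  # Hessian is the identity matrix
x = rand(3)

# Forward-mode Jacobian of the reverse-mode gradient.
H = ForwardDiff.jacobian(z -> Zygote.gradient(f, z)[1], x)
H ≈ I(3)  # true
```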
@gdalle, do you have any plan to test GPU-compatibility for AD backends?
Indeed, and this is especially true for higher order, where you need to combine forward with reverse. The right place for this kind of stuff seems to be outside of first-order AD backends, for instance in DI.
True, that is exactly what `Zygote.hessian` does.
I am not at all familiar with GPU computations, and I don't think GitHub offers free GPU servers to run our CI, so it's hard to test the real thing. Like type-stability, support for weird arrays is a "bonus", in the sense that not all backends provide it. Hence I'm unsure how best to test it in the long run, because if we test it for every backend, the test suite will fail for those that don't support it.
Any suggestions are welcome!
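One CPU-only possibility (a suggestion, not something DI does in this thread) is to smoke-test against JLArrays.jl, the reference GPU-style array implementation from the GPUArrays ecosystem:

```julia
using JLArrays  # GPU-semantics arrays that run on the CPU
using DifferentiationInterface
import Zygote

# If a backend handles JLArray, it has a decent chance of handling real
# GPU arrays, and this runs in ordinary CI without a GPU.
x = JLArray(rand(Float32, 10))
g = gradient(v -> sum(abs2, v), AutoZygote(), x)
```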
Now that Tapir is registered, I have added it to the list of officially supported backends:
https://github.com/gdalle/DifferentiationInterface.jl
The associated code is in this file:
https://github.com/gdalle/DifferentiationInterface.jl/blob/main/ext/DifferentiationInterfaceTapirExt/DifferentiationInterfaceTapirExt.jl
A few questions for you:

- For `f` itself, should I use `zero_codual` in the out-of-place `value_and_pullback` as well? I think I tried it and Tapir was unhappy.

Missing features:

- mutating functions `f!(y, x)`
Do you want to advertise DifferentiationInterface.jl as a user-friendly interface to Tapir for the time being? It is not registered yet, but I think it might be soon.
cc @adrhill