nested gradient with hessian #1264
It boils down to generalizing [...]
julia> Zygote.seed(ps,Val(12))
ERROR: MethodError: no method matching size(::NamedTuple{(:bias,), Tuple{Vector{Float64}}})
Closest candidates are:
size(::Union{LinearAlgebra.QR, LinearAlgebra.QRCompactWY, LinearAlgebra.QRPivoted}) at C:\Users\Luffy\AppData\Local\Programs\Julia-1.7.2\share\julia\stdlib\v1.7\LinearAlgebra\src\qr.jl:567
size(::Union{LinearAlgebra.QR, LinearAlgebra.QRCompactWY, LinearAlgebra.QRPivoted}, ::Integer) at C:\Users\Luffy\AppData\Local\Programs\Julia-1.7.2\share\julia\stdlib\v1.7\LinearAlgebra\src\qr.jl:566
size(::Union{LinearAlgebra.Cholesky, LinearAlgebra.CholeskyPivoted}) at C:\Users\Luffy\AppData\Local\Programs\Julia-1.7.2\share\julia\stdlib\v1.7\LinearAlgebra\src\cholesky.jl:494
...
Stacktrace:
[1] seed(x::NamedTuple{(:bias,), Tuple{Vector{Float64}}}, ::Val{12}, offset::Int64) (repeats 2 times)
@ Zygote C:\Users\Luffy\.julia\packages\Zygote\IoW2g\src\lib\forward.jl:7
[2] top-level scope
@ REPL[160]:1
[3] top-level scope
@ C:\Users\Luffy\.julia\packages\CUDA\tTK8Y\src\initialization.jl:52
This comment was marked as duplicate.
Zygote's jacobian function isn't Zygote-differentiable. There's no major barrier to making it so; someone just has to do it. I think the most relevant issue for this is #953. That's pure-Zygote, unlike [...]
Mathematically, I did not differentiate the Jacobian itself. The Jacobian here should be treated as a constant. This could be a dumb question, but why can't Zygote tell from the input of [...]
Zygote does not know this, unfortunately. It works backwards from the final result, in complete ignorance of which branches of the expression tree ultimately lead to gradients you do or do not want.
More specifically, Zygote will run the AD transform on any function which doesn't have an existing [...]
Ok, I get it now. What also confuses me is the error message here. It [...]
Just because Zygote fails on a function due to it using mutation does not mean the solution is to write an adjoint for it. Alternatives may include rewriting the offending bits to not use mutation, using [...]
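As an illustration of the "treat the Jacobian as a constant" idea raised above, here is a minimal sketch. It assumes `ignore_derivatives` from ChainRulesCore is an acceptable tool for this; the thread itself does not confirm which alternative was meant. The functions `g` and `loss` are hypothetical stand-ins, not code from this issue.

```julia
using Zygote
# Assumption: ChainRulesCore is available in the active environment.
using ChainRulesCore: ignore_derivatives

# Hypothetical inner function whose Jacobian we want to hold fixed.
g(x) = x .^ 2

function loss(x)
    # Everything inside this block is hidden from the AD transform,
    # so Zygote never tries to differentiate through `Zygote.jacobian`.
    J = ignore_derivatives() do
        Zygote.jacobian(g, x)[1]
    end
    # J is a constant here; only the explicit `x` below contributes.
    return sum(J * x)
end

grad = Zygote.gradient(loss, [1.0, 2.0])[1]
```

With `J` held constant, the gradient equals `J' * ones(2)` rather than the derivative of the full expression, so this is only appropriate when dropping that dependence is mathematically intended.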
In essence, it looks like the same kind of failure as reverse on forward on reverse, given that they have similar error messages. For forward on forward on reverse, after adding a method to [...]
The reason for this is that looking inside unknown functions is literally what it does for a living. It doesn't stop at the call to [...]. Many failures thus have the same error message. Zygote over ForwardDiff is its own ball of problems besides mutation.
Ah, you're right, just a little whining.
I have a piece of code that involves multiple errors, and it took me a long time to isolate each one 🥲
Oh, I know. My only advice is to start small and add things... I've marked the jacobian posts "duplicate" since these are exactly #1268 now.
Should I change the title of this issue to " [...]
There are two hessian functions, and they are very short: Lines 74 to 89 in 7f2b169
The all-Zygote one would in principle be differentiable, if #1268 were solved.
However, 3rd order derivatives using Zygote are unlikely to ever be a good idea. I think the tests contain one like this, and it is very very slow:
Sure, it's slow. Forward over forward over reverse is more reasonable. So this goes back to supporting [...]
Reverse on forward on reverse:
Forward on forward on reverse
It would be great to support the second mode. It looks like it won't take much to achieve that: if I change ps to a vector, it works smoothly.
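The vector workaround mentioned above can be sketched as follows. This is only an illustration under stated assumptions: `ps_vec` and `loss` are hypothetical names, and the loss is a toy function written directly against a plain vector rather than the NamedTuple `ps` from the error message. `Zygote.hessian` is the forward-over-reverse variant discussed in this thread.

```julia
using Zygote

# Hypothetical parameters stored as a plain Vector instead of
# a NamedTuple like (bias = [...],), which `Zygote.seed` rejected.
ps_vec = [0.5, -1.0, 2.0]

# Hypothetical scalar loss defined on the vector form.
loss(v) = sum(v .^ 3)

# Forward-over-reverse Hessian: ForwardDiff applied to Zygote's gradient.
H = Zygote.hessian(loss, ps_vec)
```

For this toy loss the Hessian is diagonal with entries `6 .* ps_vec`, which gives a quick sanity check that the vector path runs end to end.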