Method errors with Flux.jl and Knet.jl #47
Comments
It seems that these are not subtypes of the array types that the TensorOperations methods are defined for.
I had a go at making this happen, and have an almost-working implementation which acts on Flux's `TrackedArray`s. Would you be interested in having something like this? I meant to make a PR, and then I found one bug in my gradients (hence "almost") and put it aside. However this was for v0.7, and I see now you're busy re-writing things... I had a quick look at updating it, and I think that I would probably need to delete three lines of existing code (after which your tests still pass) to re-use the same approach.
I am pretty new to the topic of computation graphs and automatic differentiation, but I am very interested in it and would like to explore it further. In particular, how does it deal with mutating (in-place) operations?

That being said, yes, it would be great if TensorOperations could be useful for the machine learning packages. However, I would think that the design is such that most of the additional code should be defined in e.g. Flux, no? Just overloading a few methods from TensorOperations for the `TrackedArray` type. If there is anything that could be done in TensorOperations.jl to facilitate this overloading, I am certainly open to suggestions.
I am new to all this too, but am beginning to understand a little how Flux's model works. The essence of it is to overload the functions that TensorOperations calls so that they also handle `TrackedArray`s. For this to work, what I need to change in tensormacro.jl is to return not the mutated destination array, but the value returned by the final function call.

Is that accurate? Perhaps there is a reason to return the destination array itself. Besides such changes, there are two parts to write: a set of reverse functions (e.g. if the forward pass was a trace, then the reverse is some contraction of a delta with the upstream gradient), which are not specific to Flux, and a set of overloadings which are. I was thinking that the latter could be loaded here via an optional dependency.
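For the trace case just mentioned, a minimal sketch of what such a forward/reverse pair could look like (the names `forward_trace`/`reverse_trace` and the calling convention are illustrative, not an existing API):

```julia
using LinearAlgebra

# Forward pass: y = tr(A), the sum of the diagonal of a square matrix.
forward_trace(A) = tr(A)

# Reverse pass: d tr(A) / d A[i,j] = δ_ij, so the pullback contracts the
# scalar upstream gradient Δ with an identity ("delta") of the same size.
reverse_trace(Δ, A) = Δ * Matrix{eltype(A)}(I, size(A)...)
```

For example, `reverse_trace(1.0, rand(3, 3))` returns the 3×3 identity, the gradient of the trace with respect to each entry.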
I am a bit confused or missing something: if the overloaded method accepts both plain arrays and tracked arrays, can it not simply mutate the output as before?
The issue is that TrackedArrays are immutable. Perhaps one could still cheat... my strategy was instead to avoid mutating them. I put today's version in this gist; maybe it is clearer to see it there? Something is still failing for one of the gradients.
I agree that removing the final assignment of the output should be fine. But in what sense are TrackedArrays immutable? Could you not define something like

```julia
rmul!(A::MyWrappedArray, alpha) = (rmul!(A.data, alpha); return A)
```

and similarly for the other in-place operations? I would think that for a wrapper type this kind of forwarding would work, though I am not sure whether it is compatible with how the gradients are tracked.
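To make that one-liner self-contained, a possible sketch (the wrapper type `MyWrappedArray` is hypothetical, standing in for something like a tracked array):

```julia
using LinearAlgebra

# Hypothetical wrapper holding a plain array plus (here omitted) bookkeeping.
struct MyWrappedArray{T,N} <: AbstractArray{T,N}
    data::Array{T,N}
end
Base.size(A::MyWrappedArray) = size(A.data)
Base.getindex(A::MyWrappedArray, inds::Int...) = A.data[inds...]

# Forward the in-place scaling to the wrapped data and return the wrapper.
LinearAlgebra.rmul!(A::MyWrappedArray, alpha) = (rmul!(A.data, alpha); return A)

# Example: scales the underlying data in place and hands back the wrapper.
A = MyWrappedArray(rand(3, 3))
rmul!(A, 2.0)
```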
OK, re immutability: in addition to the data, they carry tracking information recording how they were produced. One could perhaps make that kind of forwarding work despite this. The other slightly odd issue is that the output array is allocated before the call that fills it.
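A toy illustration of that "data plus tracking information" idea (not Flux's actual `TrackedArray`, just a sketch of the concept): each value records which operation produced it, so operations return new tracked values rather than mutating existing ones.

```julia
# Toy tracked value: the data plus a record of how it was produced.
struct Tracked{T}
    data::T
    parent_op::Symbol   # which operation created this value
    parents::Tuple      # the inputs to that operation
end

track(x) = Tracked(x, :leaf, ())

# A non-mutating scale: returns a new Tracked value and records the operation,
# rather than writing into an existing array.
scale(x::Tracked, alpha) = Tracked(alpha .* x.data, :scale, (x, alpha))

y = scale(track(rand(3)), 2.0)   # y.data holds the result; y records its history
```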
I'm trying to use TensorOperations to build a tensor regression. Basically it's a sequence of (in my case) 3rd-order tensor contractions, but I need to treat some of the tensors as trainable parameters and then use gradient descent to optimize them. I get method errors with both Knet and Flux because both of these packages use their own wrapped versions of `Array`s.
Here's an example of what I'm trying to do:
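A minimal sketch of that kind of setup (the names `predict`, `W`, `X` and the sizes are illustrative, not taken from the issue):

```julia
using TensorOperations

# Illustrative third-order weight tensor W contracted with input data X.
W = randn(4, 4, 4)    # trainable parameter (3rd-order tensor)
X = randn(4, 4, 10)   # a batch of 10 inputs

function predict(W, X)
    @tensor Y[c, n] := W[a, b, c] * X[a, b, n]   # contract over a and b
    return Y
end

Y = predict(W, X)     # 4 × 10 array of predictions
```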
But I get an error with Knet trying to do gradient calculations:
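A sketch of the kind of gradient call involved, assuming Knet's `Param`/`@diff`/`grad` interface and reusing the illustrative `predict` and `X` from above (the loss function is made up for the example):

```julia
using Knet

# Wrapping the weights as a trainable parameter gives a Param, not a plain
# Array, so the methods generated for @tensor no longer apply.
w = Param(randn(4, 4, 4))
loss(w) = sum(abs2, predict(w, X))

# Differentiating calls predict with the wrapped array, which is where the
# MethodError arises.
J = @diff loss(w)
g = grad(J, w)
```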
Similar error when I use Flux.
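The Flux-side equivalent, assuming the Tracker-era API (`Flux.param` and `Tracker.back!`) that the `TrackedArray` discussion above refers to, again as a sketch:

```julia
using Flux

# Tracker-era Flux wraps parameters in a TrackedArray, which is also not a
# plain Array, so the same kind of MethodError appears inside @tensor code.
w = Flux.param(randn(4, 4, 4))
l = sum(abs2, predict(w, X))
Flux.Tracker.back!(l)
```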