In the first attention block,
you compute hidden features by convolving the state outputs with a filter AttnW.
I believe that corresponds to the formula W h_k.
BUT you also pass the state outputs through a linear layer to get y.
Then you add y to the hidden features, apply tanh, and multiply the result by a matrix.
In the paper, I only see tanh(W h_k).
Also, your code has AttnV,
for which I can't find a corresponding description in the paper.
The paper only has gate V and gate W.
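To make the difference concrete, here is a minimal NumPy sketch of the two computations as I understand them. The function names, the shapes, and the placeholders `U` (the linear layer) and `v` (standing in for AttnV) are my own assumptions, not taken from the repository or the paper. I am also assuming the convolution with AttnW is 1x1, so it acts as a per-position matrix multiply, and that AttnV is a vector.

```python
import numpy as np

def paper_features(h, W):
    """tanh(W h_k) for every position k, as I read the paper.

    h: (K, d) state outputs, W: (a, d). Returns (K, a).
    """
    return np.tanh(h @ W.T)

def code_scores(h, state, W, U, v):
    """What the code appears to do (my reading, not confirmed).

    h: (K, d), state: (s,), W: (a, d), U: (a, s), v: (a,). Returns (K,).
    """
    hidden_features = h @ W.T   # assumed 1x1 conv with AttnW: one matmul per position
    y = state @ U.T             # linear layer output, broadcast over all K positions
    return np.tanh(hidden_features + y) @ v   # extra product with AttnV (assumed vector)
```

If `y` were zero, the second function would reduce to the paper's tanh(W h_k) followed by a product with `v`, so the two extra pieces I am asking about are `y` and AttnV.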
Could you kindly explain this?
I am really confused.
Thank you!
YuanTingHsieh changed the title from "Is the model in code match what describes in the paper?" to "Is the model in code matches what is described in the paper?" on Jun 22, 2018