number of LSTM blocks and cells #1
Comments
Hi Yang, glad to know that you found the code helpful. The distinction between cells and blocks has eroded over time: most modern LSTM architectures have one cell per block (which, in my opinion, is simpler). Regarding your question, "why not much attention was paid to this difference", I am afraid I do not have a very clear answer. I would ask: do we have any empirical evidence that LSTMs with multiple cells per block work better than the "one cell per block" architecture? I am not aware of any such evidence, and without it people will prefer the less cumbersome model. The same applies to peephole connections: some people don't use them because they don't find them very helpful. I hope the above explanation helps a bit. [1] explains the differences between various LSTM architectures.
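To make the cell/block distinction concrete, here is a minimal sketch in plain Python of one time step of an LSTM in which the input, forget and output gates are computed once per block and shared by every cell in that block. It is an illustration under stated assumptions, not the code from this repository: the names `block_lstm_step`, `init_params` and `gate_weight_count` are hypothetical, biases and peephole connections are left out, and each gate sees the concatenation of the input and the previous outputs of all cells.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def matvec(W, v):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(w * a for w, a in zip(row, v)) for row in W]


def init_params(input_size, num_blocks, cells_per_block, seed=0):
    """Random weights: gate matrices have one row per block, W_g one row per cell."""
    rng = random.Random(seed)
    num_cells = num_blocks * cells_per_block
    width = input_size + num_cells  # every unit sees [x, h_prev]

    def mat(rows):
        return [[rng.uniform(-0.5, 0.5) for _ in range(width)] for _ in range(rows)]

    return {"W_i": mat(num_blocks), "W_f": mat(num_blocks),
            "W_o": mat(num_blocks), "W_g": mat(num_cells)}


def block_lstm_step(x, h_prev, c_prev, params, cells_per_block):
    """One time step (assumption: no biases, no peepholes)."""
    z = list(x) + list(h_prev)
    i = [sigmoid(a) for a in matvec(params["W_i"], z)]    # input gate, one per block
    f = [sigmoid(a) for a in matvec(params["W_f"], z)]    # forget gate, one per block
    o = [sigmoid(a) for a in matvec(params["W_o"], z)]    # output gate, one per block
    g = [math.tanh(a) for a in matvec(params["W_g"], z)]  # candidate, one per cell
    h, c = [], []
    for k in range(len(c_prev)):
        b = k // cells_per_block  # index of the block that owns cell k
        c_k = f[b] * c_prev[k] + i[b] * g[k]
        c.append(c_k)
        h.append(o[b] * math.tanh(c_k))
    return h, c


def gate_weight_count(params):
    """Number of weights in the three gate matrices."""
    return sum(len(params[n]) * len(params[n][0]) for n in ("W_i", "W_f", "W_o"))


if __name__ == "__main__":
    x = [0.1, -0.2, 0.3, 0.0]
    for num_blocks, cells_per_block in [(6, 1), (2, 3), (1, 6)]:
        params = init_params(4, num_blocks, cells_per_block)
        h, c = block_lstm_step(x, [0.0] * 6, [0.0] * 6, params, cells_per_block)
        print(num_blocks, cells_per_block, len(h), gate_weight_count(params))
    # prints:
    # 6 1 6 180
    # 2 3 6 60
    # 1 6 6 30
```

With `cells_per_block=1` every cell has its own gates, which is the "one cell per block" layout described above. In this sketch all three layouts keep six cells of state, but the number of gate weights depends on how the cells are grouped into blocks, which may be worth keeping in mind when comparing layouts with the same total number of cells.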
Dear Yaseen, Thanks for the quick and informative reply. Thanks, On 7 May 2016 at 03:49, Usama Yaseen [email protected] wrote:
I have to look at a few papers again to make sure I don't miss anything, but I am travelling these days and don't even have access to my laptop. You will have to wait at least one week for the reply (I am sorry it cannot be earlier than that :/).
OK, I can wait for that. I can play with the simplest one in the meantime. ^_^ Best
Thanks, Yaseen and Xin Yang. This was also my problem, and now I understand it better.
Dear Yaseen, Second, there is an error when debugging train.py.
Dear Yaseen,
thanks for your clean code.
As you know, there are the concepts of 'LSTM block' and 'LSTM cell'. But in a lot of LSTM example code, including yours, no attention seems to be paid to this difference: only cells are created, and no blocks.
After reading and thinking about this problem, I came to the conclusion that an LSTM with m blocks of n cells each and an LSTM with one block of m*n cells are actually the same.
So, what do you think about this, and could you give me any hints on this issue?
Thanks,
Xin Yang