When embedding the labels, does the dimension (d_h) need to equal the number of labels L (label_num)? If not, consider d_h = 512: attention = QKᵀ: (length, 512) × (label_num, 512)ᵀ → (length, label_num); then attention·V = (length, label_num) × (label_num, 512) → (length, 512). That is the LAN output, which then has to be mapped to labels. If there are only 128 labels, mapping a 512-dimensional vector onto 128 labels surely can't work, can it?
So must the dimension in the paper, i.e. the 512, equal the number of labels L (label_num)? Otherwise it can't correspond to the output.
I think I see it now: in the final output layer, the attention is not multiplied by V again, right?
Yes, exactly: the last layer does not multiply back by V. d_h only needs to match the encoder's hidden size.
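A minimal NumPy sketch of the shapes discussed above (all names and sizes here are illustrative assumptions, not the repo's actual code). Intermediate LAN layers multiply the attention weights back into the label embeddings, so the output stays at d_h; the final layer uses the attention scores themselves as the per-token label distribution, which is why d_h is independent of label_num:

```python
import numpy as np

# Illustrative sizes from the discussion: d_h = 512, 128 labels.
length, label_num, d_h = 10, 128, 512

H = np.random.randn(length, d_h)      # encoder hidden states (queries)
E = np.random.randn(label_num, d_h)   # label embeddings (keys and values)

# Intermediate LAN layer: attend over labels, then multiply back by V,
# so the representation returns to shape (length, d_h).
scores = H @ E.T                                                # (length, label_num)
attn = np.exp(scores) / np.exp(scores).sum(-1, keepdims=True)   # softmax over labels
H_label = attn @ E                                              # (length, d_h)

# Final layer: the attention scores ARE the label logits; no multiplication
# by V, so d_h never has to equal label_num.
logits = H @ E.T              # (length, label_num)
pred = logits.argmax(-1)      # one label index per token
```

The key point is the asymmetry between layers: only the last one skips the `attn @ E` step.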
Thank you!