Hello, thanks for this fantastic implementation.
I'm wondering whether it's possible to use sentences as training units, since the window is normally applied within a sentence, right? If we use documents, the last word of a sentence will have a right window of 5 words that shouldn't have been included.
One could argue that it suffices to pass a list of tokenized sentences as input; however, the following raises an error:
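To make the concern concrete, here is a minimal sketch in plain Python. It does not use svd2vec and makes no claim about how svd2vec builds its windows internally; `context_pairs` is a hypothetical helper written only for this illustration. It compares the (center, context) pairs from a symmetric window applied per sentence with those from the same window applied to the concatenated document.

```python
def context_pairs(tokens, window):
    """Return (center, context) pairs for a symmetric window over one token list.

    Hypothetical helper for illustration only; not part of svd2vec.
    """
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


sentences = [["the", "cat", "sleeps"], ["dogs", "bark", "loudly"]]

# Sentence as unit: the window never leaves the sentence.
per_sentence = [p for s in sentences for p in context_pairs(s, window=2)]

# Document as unit: the sentences are concatenated before windowing.
document = [t for s in sentences for t in s]
per_document = context_pairs(document, window=2)

# Pairs that exist only because the window crossed the sentence boundary.
cross_boundary = sorted(set(per_document) - set(per_sentence))
print(cross_boundary)
```

With document-level windows, "sleeps" (the last word of the first sentence) is paired with "dogs" and "bark" from the next sentence, which is the extra right context described above.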
from svd2vec import svd2vec
documents = ["this is a test right left".split(), "this is the second test left right".split()]
svd = svd2vec(documents, window=2, min_count=1, size=2)
As one might expect, the error disappears if one passes larger lists:
from svd2vec import svd2vec
documents = ["this is a test right left".split() * 100, "this is the second test left right".split() * 100]
svd = svd2vec(documents, window=2, min_count=1, size=2)
Thanks again!
I then found that when the list of documents is large, each document can be short:
from svd2vec import svd2vec
documents = ["this is a test right left".split() * 2, "this is the second test left right".split() * 2] * 10
svd = svd2vec(documents, window=2, min_count=0, size=4)
This one works.
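For anyone comparing the snippets, the two uses of `*` do different things. Here is a small sketch in plain Python (no svd2vec needed) of what each one produces: repeating the result of `.split()` lengthens a single document, while repeating the outer list adds more documents. In both cases the set of distinct words stays the same. This only describes the shape of the input and is not an explanation of why svd2vec accepts or rejects it.

```python
first = "this is a test right left".split()
second = "this is the second test left right".split()

# Repeating the token list lengthens ONE document.
longer_document = first * 2
print(len(longer_document))

# Repeating the outer list yields MORE documents of unchanged length.
more_documents = [first] * 2
print(len(more_documents), len(more_documents[0]))

# The working snippet combines both kinds of repetition.
documents = [first * 2, second * 2] * 10
print(len(documents), len(documents[0]), len(documents[1]))

# The vocabulary is unchanged by either kind of repetition.
vocabulary = {token for doc in documents for token in doc}
print(len(vocabulary))
```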
So why not use the sentence as the unit? Do the authors of the paper specify documents, or is it for computational reasons?