Is there a way to get the vocabulary (vector representations of all words) of a pre-trained word embedding model? #9295
-
I face the same issue. I want to verify how representative the embeddings are by running them through an encoder-decoder model, and I need the vocabulary to assess the accuracy.
-
There is no public method that exposes the whole vocabulary with its vectors in WordEmbeddingsModel (especially in Python). We can open a feature request for this.
-
I believe this is doable for all 3 annotators (WordEmbeddingsModel, Doc2VecModel, and Word2VecModel) as of the Spark NLP 5.1.1 release: https://github.com/JohnSnowLabs/spark-nlp/releases/tag/5.1.1
-
When I make something like:
Is it possible to call a method that returns the whole vocabulary of the model?
Something like a dictionary mapping every word to its vector representation.
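
A minimal sketch of the dictionary shape asked for above. It does not call Spark NLP itself: `vocab_to_dict` and `sample_rows` are hypothetical names, and the sample rows stand in for whatever (word, vector) pairs the model exposes. The comment about `getVectors()` and its column names is an assumption based on the 5.1.1 release mentioned in this thread, so check it against your installed version.

```python
from typing import Dict, Iterable, List, Sequence, Tuple


def vocab_to_dict(
    rows: Iterable[Tuple[str, Sequence[float]]]
) -> Dict[str, List[float]]:
    """Collapse (word, vector) pairs into a word -> vector dictionary."""
    vocab: Dict[str, List[float]] = {}
    for word, vector in rows:
        # Copy each vector into a plain list of floats so the result
        # is independent of the source row objects.
        vocab[word] = [float(x) for x in vector]
    return vocab


# Stand-in data (hypothetical). With Spark NLP >= 5.1.1 the pairs might
# instead come from something like:
#   rows = ((r["word"], r["vector"]) for r in model.getVectors().collect())
# ASSUMPTION: getVectors() returns a DataFrame with "word" and "vector"
# columns; this is not verified here.
sample_rows = [("hello", [0.1, 0.2]), ("world", [0.3, 0.4])]

vocab = vocab_to_dict(sample_rows)
print(len(vocab), "words; vector for 'hello':", vocab["hello"])
```

Collecting an entire vocabulary onto the driver can be memory-heavy for large models, so filtering the rows before building the dictionary may be preferable.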