
default embedding #472

Open
nashid opened this issue May 21, 2020 · 5 comments

Comments

@nashid

nashid commented May 21, 2020

If we do not provide an embedding like word2vec, how does it represent the words?

Does it use one-hot encoding by default, or n-grams, CBOW, or skip-gram?

@luozhouyang

No. If you do not provide pretrained embeddings, it will create a trainable variable and initialize it with some initialization algorithm. When you train the model on your data, this variable will be updated too.
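
A minimal sketch of what that means, assuming a TensorFlow/Keras setup (the framework, layer choice, and sizes below are illustrative, not taken from this repository):

```python
import tensorflow as tf

vocab_size = 30000      # number of tokens in the vocabulary (assumed)
embedding_size = 256    # dimensionality of each token vector (assumed)

# Creates a (vocab_size, embedding_size) matrix of trainable weights,
# initialized randomly (Keras uses a uniform initializer by default).
embedding = tf.keras.layers.Embedding(vocab_size, embedding_size)

token_ids = tf.constant([[1, 5, 42]])  # a batch of token id sequences
vectors = embedding(token_ids)         # shape: (1, 3, embedding_size)

# The matrix is an ordinary trainable variable, so the optimizer updates it
# together with the rest of the model's parameters during training.
print(embedding.trainable_weights[0].shape)  # (30000, 256)
```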

@nashid
Author

nashid commented Jun 7, 2020

@luozhouyang I understand that if we do not provide a pre-trained embedding, it uses this framework's default embedding implementation.

However, I would like to know what algorithm is used to build the embedding.

@luozhouyang

luozhouyang commented Jun 8, 2020

The word embeddings here are actually a 2-D tensor with shape (vocab_size, embedding_size).
This tensor is updated along with the other parameters by backpropagation.
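
A rough sketch of the same idea without a layer wrapper, assuming TensorFlow (names and shapes are assumptions for illustration only): the embedding table is just a 2-D trainable tensor, and a lookup selects rows by token id; gradients then flow back into the table like any other parameter.

```python
import tensorflow as tf

vocab_size, embedding_size = 30000, 256
table = tf.Variable(
    tf.random.uniform([vocab_size, embedding_size], -0.1, 0.1),
    name="embedding_table",
)

token_ids = tf.constant([3, 17, 3])
with tf.GradientTape() as tape:
    vectors = tf.nn.embedding_lookup(table, token_ids)  # (3, embedding_size)
    loss = tf.reduce_sum(vectors ** 2)                   # dummy loss

# Backpropagation produces gradients for the looked-up rows, and the
# optimizer updates the table along with every other model parameter.
grads = tape.gradient(loss, [table])
tf.keras.optimizers.SGD(0.1).apply_gradients(zip(grads, [table]))
```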

@nashid
Author

nashid commented Jul 16, 2021

@luozhouyang I understand this. But what algorithm is it using (like word2vec, GloVe, ...)?

@luozhouyang

No special algorithm is used. Not word2vec, not GloVe, just a learnable 2-D matrix.
