
Bug Report: Error When Using Custom Embeddings with None Contextualized Model #149

Open
XavierSpycy opened this issue Jun 27, 2024 · 0 comments


  • Contextualized Topic Models version:
  • Python version:
  • Operating System:

Description

There appears to be a bug in the data_preparation.py script, specifically at this line. I propose changing the conditional statement to:

if self.contextualized_model is None and custom_embeddings is None:

instead of:

if self.contextualized_model is None:

Currently, when self.contextualized_model is None and custom_embeddings is provided (for example, when using externally computed embeddings), the code raises an error. The condition does not account for the case where custom_embeddings is used without self.contextualized_model.
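To make the difference between the two conditions concrete, here is a minimal, library-independent sketch. The function names `check_current` and `check_proposed`, the exception type, and the error message are hypothetical stand-ins for illustration only; they are not taken from the library's source.

```python
# Hypothetical stand-ins for the guard discussed above; not the library's code.

def check_current(contextualized_model, custom_embeddings):
    """Guard as quoted in the issue: it ignores custom_embeddings entirely."""
    if contextualized_model is None:
        # Placeholder message, assumed for illustration.
        raise ValueError("a contextualized model must be defined")


def check_proposed(contextualized_model, custom_embeddings):
    """Proposed guard: fail only when there is no source of embeddings at all."""
    if contextualized_model is None and custom_embeddings is None:
        raise ValueError("a contextualized model must be defined")


# Externally computed embeddings, one (toy) vector per document.
embeddings = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]

# The proposed guard accepts this combination without raising.
check_proposed(None, embeddings)

# The current guard rejects it even though embeddings were supplied.
try:
    check_current(None, embeddings)
    current_guard_raised = False
except ValueError:
    current_guard_raised = True
```

Both guards still raise when neither a model nor embeddings are given, so the proposed change only relaxes the case in which embeddings are supplied directly.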

What I Did

Here's how the error can be reproduced:

from contextualized_topic_models.utils.data_preparation import TopicModelDataPreparation

# train_docs, test_docs = ..., ...
# preprocessed_train_docs, preprocessed_test_docs = ..., ...
# embeddings_train, embeddings_test = ..., ...

# No contextualized model is passed; embeddings are supplied directly instead.
qt = TopicModelDataPreparation()
train_dataset = qt.fit(text_for_contextual=train_docs, text_for_bow=preprocessed_train_docs, custom_embeddings=embeddings_train)
test_dataset = qt.transform(text_for_contextual=test_docs, text_for_bow=preprocessed_test_docs, custom_embeddings=embeddings_test)  # This line raises an error.

The expected behavior is that when custom_embeddings is provided, the method proceeds without requiring self.contextualized_model. This change would allow externally computed embeddings to be used without triggering an unnecessary error.
