
Other Datasets Problem #9

Open · makimon123 opened this issue Nov 26, 2024 · 3 comments

@makimon123

Dear Author, I attempted to apply this method to other datasets; however, I have observed that the mu_pdist, sigma_pdist, and logits distributions are very concentrated during training, even though the distributions of the mean and std themselves seem fine.
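
For reference, this is roughly how I compute the quantities I am monitoring (a minimal sketch based on my understanding of the closed-form pairwise distance; the `scale`/`shift` values here are only placeholders for the learned ones):

```python
import torch

def pairwise_stats(mu_v, mu_t, log_var_v, log_var_t, scale=1.0, shift=0.0):
    """Pairwise quantities monitored during training (illustrative only).

    mu_* are (B, D) means, log_var_* are (B, D) log-variances;
    scale/shift stand in for the learned logit scale and bias.
    """
    # squared L2 distance between every image/text mean pair -> (B, B)
    mu_pdist = torch.cdist(mu_v, mu_t, p=2) ** 2
    # summed variances of the two Gaussian embeddings for every pair -> (B, B)
    sigma_pdist = (log_var_v.exp().sum(-1, keepdim=True)
                   + log_var_t.exp().sum(-1).unsqueeze(0))
    logits = -scale * (mu_pdist + sigma_pdist) + shift
    return mu_pdist, sigma_pdist, logits
```

All three matrices end up with a very small spread throughout training, even though the per-dimension statistics of the mean and log variance look reasonable.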

@makimon123 (Author)

The final training results are also relatively poor. I suspect this is because the logits have not been trained well. Could you please advise?

@SanghyukChun (Collaborator)

If you are trying to apply this method for from-scratch training (without any pre-trained weights), it will be difficult to optimize. I recently released a new probabilistic VLM project for from-scratch training:

Probabilistic Language-Image Pre-Training
https://arxiv.org/abs/2410.18857
https://github.com/naver-ai/prolip

There is no full training code yet, but you can easily implement the new loss function:
https://github.com/naver-ai/prolip/blob/89aed36968f055fca897dc51c25156b19412c56c/src/prolip/loss.py#L100-L115
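
Roughly speaking, it treats every image-text pair in the batch as a binary classification problem on top of the closed-form probabilistic distance. The sketch below is simplified (it omits details such as the inclusion loss and the exact parameterization), so please refer to the linked loss.py for the actual implementation:

```python
import torch
import torch.nn.functional as F

def pairwise_sigmoid_prob_loss(mu_v, mu_t, log_var_v, log_var_t,
                               logit_scale, logit_bias):
    """Simplified probabilistic pairwise contrastive loss (sketch only).

    Diagonal image-text pairs are positives, all other pairs are negatives.
    The similarity is the negative closed-form distance between Gaussians.
    """
    b = mu_v.size(0)
    mu_pdist = torch.cdist(mu_v, mu_t, p=2) ** 2                  # (B, B)
    sigma_pdist = (log_var_v.exp().sum(-1, keepdim=True)
                   + log_var_t.exp().sum(-1).unsqueeze(0))        # (B, B)
    logits = -logit_scale * (mu_pdist + sigma_pdist) + logit_bias
    labels = 2.0 * torch.eye(b, device=logits.device) - 1.0       # +1 / -1
    return -F.logsigmoid(labels * logits).sum() / b
```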

If you need to use the PCME++ loss for from-scratch training, you will need an additional deterministic loss for stable convergence, as shown in my new paper.
[screenshot from the paper]
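
For example, you could keep the probabilistic matching loss as-is and add a plain InfoNCE term on the mean embeddings. This is only a sketch; `pcmepp_loss` and `lambda_det` are placeholders, and the exact form and weight of the deterministic term are up to you:

```python
import torch
import torch.nn.functional as F

def deterministic_infonce(mu_v, mu_t, temperature=0.07):
    """Plain (deterministic) InfoNCE on the mean embeddings only.

    Illustrative stabilizer for from-scratch training.
    """
    v = F.normalize(mu_v, dim=-1)
    t = F.normalize(mu_t, dim=-1)
    logits = v @ t.t() / temperature                       # (B, B)
    labels = torch.arange(v.size(0), device=v.device)
    return 0.5 * (F.cross_entropy(logits, labels)
                  + F.cross_entropy(logits.t(), labels))

# total objective (pcmepp_loss is the probabilistic loss, however you compute it):
# loss = pcmepp_loss + lambda_det * deterministic_infonce(mu_v, mu_t)
```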

@makimon123 (Author)

Thank you very much for your help! I have tried the new method you suggested, but unfortunately I am still encountering an issue where the values in sigma_pdist remain abnormal and the distribution is very concentrated. This has not improved over the course of training.

I am wondering if this could be related to the data dimensions. In my dataset, both the mean and log variance are encoded with the shape (Batchsize, Dim), specifically (256, 512). I would appreciate your thoughts on whether the dimensionality could be contributing to this issue.
