Dear Dr. van der Maaten:
Could you help me improve my understanding of how the perplexity parameter works? I have two questions.
1. Looking at the implementation, am I right that a reasonable upper bound on perplexity is one third of the smallest expected cluster size (for simplicity, assume we know what cluster sizes to expect)?
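To make the rule of thumb concrete, here is a minimal sketch; the helper name and the 1/3 factor are the questioner's proposal, not an established recommendation:

```python
# Sketch of the proposed heuristic: cap perplexity at one third of the
# smallest cluster size we expect to see. The 1/3 factor is the one
# suggested in the question above, not a published guideline.
def perplexity_upper_bound(expected_cluster_sizes):
    return min(expected_cluster_sizes) / 3

# Ten clusters of ~4000 points each would allow a perplexity up to ~1333.
bound = perplexity_upper_bound([4000] * 10)
```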
2. On your home page, there is a question (“I get a strange ‘ball’ with uniformly distributed points”) for which your suggestion is to reduce perplexity. Do you think the same “ball” effect can be seen when perplexity is too low? If so, how would you suggest defining a lower bound for perplexity?
Regarding 2): I have a digit-image data set with 40,000 points that is supposed to contain 10 clusters of roughly equal size. When I subsample 2,000 points and run Rtsne with its defaults (its implementation is very similar to yours), the embedding looks nice. However, it is far worse on the full data set. I figured this was because the default perplexity of 30 was too low compared to the typical cluster size of 4,000, so I raised it to 30 × 20 = 600 and obtained a very nice embedding.
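The scaling used here can be written down as a small sketch; the linear scaling with sample size and the function name are assumptions taken from the example above, not a rule from the t-SNE authors:

```python
# Sketch: scale perplexity linearly with the number of points, starting
# from a base perplexity of 30 that looked good on a 2,000-point
# subsample. The linear scaling is the questioner's heuristic.
def scaled_perplexity(n_points, base_perplexity=30, base_n=2000):
    return base_perplexity * n_points / base_n

# 40,000 points -> perplexity 30 * 20 = 600, as in the example above.
p = scaled_perplexity(40_000)
```

The resulting value would then be passed as the `perplexity` argument of whichever t-SNE implementation is in use (e.g. `Rtsne(X, perplexity = 600)` in R).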
When the expected result is unknown, I suppose one could use a similar subsampling approach to work out how much to increase perplexity. Do you know of a more analytical method or a rule of thumb?
Regards,
Nik Tuzov, PhD