this is useful for two reasons:
1. it allows seeding non-exact matches, which in turn allows capturing potentially interesting matches where most (or none!) of the match is exact.
2. it helps us solve the multiple readings problem if we represent sound content as a vector, so that there is not a 1-to-1 relationship between graphs and phonemes. this would let us compare using more abstract metrics like cosine distance, instead of text-specific metrics like edit distance.
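to make the second point concrete, here is a minimal sketch of how a graph with several readings could collapse into one sound vector and then be compared by cosine distance. everything in it is an assumption for illustration: the `READINGS` table, its three made-up feature dimensions, and the choice to average readings are hypothetical, not something this project already does.

```python
import math

# hypothetical toy data: each graph maps to one or more readings,
# and each reading is a made-up 3-dimensional phonetic feature vector
READINGS = {
    "A": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],  # a graph with two readings
    "B": [[1.0, 0.0, 0.0]],                   # shares A's first reading
    "C": [[0.0, 0.0, 1.0]],                   # an unrelated sound
}

def sound_vector(graph):
    """Average all readings of a graph into a single vector (an assumed strategy)."""
    readings = READINGS[graph]
    dims = len(readings[0])
    return [sum(r[i] for r in readings) / len(readings) for i in range(dims)]

def cosine_distance(u, v):
    """1 - cosine similarity; 0 means same direction, 1 means orthogonal."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm

d_ab = cosine_distance(sound_vector("A"), sound_vector("B"))
d_ac = cosine_distance(sound_vector("A"), sound_vector("C"))
print(d_ab < d_ac)  # A is closer to B, which shares one of its readings
```

the point of the sketch is that "A" never has to be resolved to a single reading before comparison: both readings contribute to its vector, so it still lands near "B".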
we could start by looking into libraries that do locality-sensitive hashing (LSH) or closely related approximate nearest-neighbor search, like datasketch or the popular annoy. there's a great explanation of LSH here and a detailed one related to document comparison in section 3.4.1 of Mining of Massive Datasets.
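before picking a library, a rough standard-library sketch of the idea may help. it uses random-hyperplane hashing, one common LSH family for cosine similarity, and is not the API of datasketch or annoy. the function names, the single hash table, and the parameter values are all assumptions for illustration; a real setup would normally use several tables or bands to improve recall.

```python
import random
from collections import defaultdict

def make_hyperplanes(num_bits, dims, seed=0):
    """Draw num_bits random hyperplanes; seeded so runs are repeatable."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dims)] for _ in range(num_bits)]

def signature(vector, hyperplanes):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    return tuple(
        1 if sum(p * x for p, x in zip(plane, vector)) >= 0 else 0
        for plane in hyperplanes
    )

def build_index(vectors, hyperplanes):
    """Group keys by signature; keys in the same bucket are candidate seeds."""
    buckets = defaultdict(list)
    for key, vec in vectors.items():
        buckets[signature(vec, hyperplanes)].append(key)
    return buckets

def candidates(query, buckets, hyperplanes):
    """Return keys whose signature matches the query's exactly."""
    return buckets.get(signature(query, hyperplanes), [])

planes = make_hyperplanes(num_bits=8, dims=3)
index = build_index({"x": [0.5, 0.5, 0.0], "neg": [-0.5, -0.5, 0.0]}, planes)
print(candidates([1.0, 1.0, 0.0], index, planes))
```

vectors pointing in similar directions tend to share a signature, so a bucket lookup replaces an all-pairs comparison when seeding. fewer bits make buckets coarser (more non-exact candidates), more bits make them stricter.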