This repository has been archived by the owner on Mar 1, 2022. It is now read-only.

Calculate misspellings in the previous reflection document and use that metric to purposefully misspell the new document equally often #17

Open
carlthome opened this issue Oct 19, 2014 · 6 comments

@carlthome
Member

i.e. a "Humanizer"

@carlthome
Member Author

A reasonable assumption is that common keyboard misspellings apply (i.e. hitting an adjacent key). There is probably also data online about common misspellings, so a simple synonym-style lookup that maps words to their misspelt forms could be used.

@nandezer
Contributor

I'm aiming to calculate the probability of misspelled words per sentence in a text, but I can't manage to understand how the WordNet class works. Any explanation would be very much appreciated.
(I'll leave that part in a comment for now, in the branch interface-WritingStyleAnalyzer-code, class WAnalyzerS.java)
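
For reference, a per-sentence misspelling rate does not need WordNet at all. Below is a minimal sketch, assuming a plain set of correctly spelled lowercase words is available as the dictionary. The class name `MisspellingRate`, the method `perSentence`, and the naive sentence split are all illustrative assumptions, not code from the branch mentioned above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of the repository.
class MisspellingRate {

    /**
     * Returns, for each sentence, the fraction of its words that are
     * missing from the dictionary (assumed to hold lowercase words).
     */
    static List<Double> perSentence(String text, Set<String> dictionary) {
        List<Double> rates = new ArrayList<>();
        // Naive split: whitespace that follows '.', '!' or '?'.
        for (String sentence : text.split("(?<=[.!?])\\s+")) {
            int words = 0;
            int misspelled = 0;
            // Tokens are runs of letters and apostrophes.
            for (String token : sentence.toLowerCase(Locale.ROOT).split("[^\\p{L}']+")) {
                if (token.isEmpty()) {
                    continue;
                }
                words++;
                if (!dictionary.contains(token)) {
                    misspelled++;
                }
            }
            if (words > 0) {
                rates.add((double) misspelled / words);
            }
        }
        return rates;
    }
}
```

Averaging the returned values gives one number per document, which could then serve as the target rate when misspelling the new document.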

@carlthome
Member Author

The WordNet class has nothing to do with this issue.

@carlthome
Member Author

Some classification of how bad a misspelling is might be a good idea. Test!
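
One possible way to classify severity is by edit distance relative to word length. The sketch below uses Levenshtein distance; the class `MisspellingSeverity`, the `Level` enum, and the 0.2 / 0.5 thresholds are assumptions chosen for illustration and would need tuning against real data.

```java
// Hypothetical helper, not part of the repository.
class MisspellingSeverity {

    enum Level { NONE, MINOR, MODERATE, SEVERE }

    /** Levenshtein edit distance using two rolling rows. */
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Buckets a misspelling by edit distance divided by the longer word's length. */
    static Level classify(String correct, String misspelt) {
        int distance = levenshtein(correct, misspelt);
        if (distance == 0) {
            return Level.NONE;
        }
        double ratio = (double) distance / Math.max(correct.length(), misspelt.length());
        if (ratio <= 0.2) {
            return Level.MINOR;      // assumed threshold
        }
        if (ratio <= 0.5) {
            return Level.MODERATE;   // assumed threshold
        }
        return Level.SEVERE;
    }
}
```

With these thresholds, "undestand" for "understand" is one edit over ten characters and lands in `MINOR`, while a word with every character changed lands in `SEVERE`.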

@nandezer
Contributor

The text is now humanized.
(The algorithm for choosing which words to change is a bit poor, but it replaces several words with the closest match according to the Jaro–Winkler distance algorithm.)
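
For readers unfamiliar with Jaro–Winkler, here is a self-contained sketch of the similarity score and a closest-match lookup. It is not the code from the branch; the class `JaroWinklerSketch`, the method `closest`, the prefix scale of 0.1, and the prefix cap of 4 are assumptions following the common textbook definition.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical helper, not part of the repository.
class JaroWinklerSketch {

    /** Jaro similarity in [0, 1]; 1 means identical. */
    static double jaro(String s1, String s2) {
        if (s1.equals(s2)) {
            return 1.0;
        }
        int len1 = s1.length();
        int len2 = s2.length();
        if (len1 == 0 || len2 == 0) {
            return 0.0;
        }
        int window = Math.max(0, Math.max(len1, len2) / 2 - 1);
        boolean[] matched1 = new boolean[len1];
        boolean[] matched2 = new boolean[len2];
        int matches = 0;
        for (int i = 0; i < len1; i++) {
            int lo = Math.max(0, i - window);
            int hi = Math.min(len2 - 1, i + window);
            for (int j = lo; j <= hi; j++) {
                if (!matched2[j] && s1.charAt(i) == s2.charAt(j)) {
                    matched1[i] = true;
                    matched2[j] = true;
                    matches++;
                    break;
                }
            }
        }
        if (matches == 0) {
            return 0.0;
        }
        // Count matched characters that appear in a different order.
        int outOfOrder = 0;
        int k = 0;
        for (int i = 0; i < len1; i++) {
            if (!matched1[i]) {
                continue;
            }
            while (!matched2[k]) {
                k++;
            }
            if (s1.charAt(i) != s2.charAt(k)) {
                outOfOrder++;
            }
            k++;
        }
        double m = matches;
        return (m / len1 + m / len2 + (m - outOfOrder / 2.0) / m) / 3.0;
    }

    /** Jaro similarity boosted for a shared prefix of up to 4 characters. */
    static double jaroWinkler(String s1, String s2) {
        double j = jaro(s1, s2);
        int maxPrefix = Math.min(4, Math.min(s1.length(), s2.length()));
        int prefix = 0;
        while (prefix < maxPrefix && s1.charAt(prefix) == s2.charAt(prefix)) {
            prefix++;
        }
        return j + prefix * 0.1 * (1.0 - j);
    }

    /** Returns the candidate most similar to the word, skipping the word itself. */
    static Optional<String> closest(String word, List<String> candidates) {
        String best = null;
        double bestScore = -1.0;
        for (String candidate : candidates) {
            if (candidate.equals(word)) {
                continue;
            }
            double score = jaroWinkler(word, candidate);
            if (score > bestScore) {
                bestScore = score;
                best = candidate;
            }
        }
        return Optional.ofNullable(best);
    }
}
```

Note that this computes a similarity, so the "closest" word is the one with the highest score rather than the lowest distance.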

@nandezer
Contributor

Now, if a word can't be found in the long file of correct + misspelled words, it swaps a random character of the word for one from a nearby key on the keyboard.
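
A rough sketch of that lookup-then-fallback behaviour is below. Everything in it is an assumption made for illustration: the class `HumanizerSketch`, the in-memory map standing in for the misspellings file, and the simplified adjacency rule that only treats the left and right neighbours on the same QWERTY row as "close" keys.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical helper, not part of the repository.
class HumanizerSketch {

    // Simplified QWERTY layout: only same-row neighbours count as adjacent.
    static final String[] ROWS = {"qwertyuiop", "asdfghjkl", "zxcvbnm"};

    /** Returns the keys directly left and right of c, or "" if c is not a letter key. */
    static String neighbours(char c) {
        for (String row : ROWS) {
            int i = row.indexOf(c);
            if (i < 0) {
                continue;
            }
            StringBuilder sb = new StringBuilder();
            if (i > 0) {
                sb.append(row.charAt(i - 1));
            }
            if (i < row.length() - 1) {
                sb.append(row.charAt(i + 1));
            }
            return sb.toString();
        }
        return "";
    }

    /**
     * Picks a known misspelling if the word is in the lookup table;
     * otherwise swaps one random character for an adjacent key.
     */
    static String misspell(String word, Map<String, List<String>> known, Random rng) {
        List<String> options = known.get(word);
        if (options != null && !options.isEmpty()) {
            return options.get(rng.nextInt(options.size()));
        }
        char[] chars = word.toCharArray();
        // Only positions whose character has a keyboard neighbour can be swapped.
        List<Integer> positions = new ArrayList<>();
        for (int i = 0; i < chars.length; i++) {
            if (!neighbours(Character.toLowerCase(chars[i])).isEmpty()) {
                positions.add(i);
            }
        }
        if (positions.isEmpty()) {
            return word; // nothing to swap, e.g. digits only
        }
        int pos = positions.get(rng.nextInt(positions.size()));
        String near = neighbours(Character.toLowerCase(chars[pos]));
        char replacement = near.charAt(rng.nextInt(near.length()));
        // Keep the original letter case.
        chars[pos] = Character.isUpperCase(chars[pos])
                ? Character.toUpperCase(replacement)
                : replacement;
        return new String(chars);
    }
}
```

Passing in the `Random` makes the output reproducible in tests. A fuller adjacency table would also include the keys diagonally above and below.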
