UnnaturalGrams Python Extension Module #12

Open
7 tasks
eddieantonio opened this issue Nov 29, 2014 · 0 comments
MITLM is kind of a dodgy piece of work. So, (un)naturally, we'll replace it.

What this will fix

  • Remove our dependency on ZeroMQ
  • Remove our dependency on a shifty outside process
  • Allow us to use weighted n-grams

What needs to be present for a full replacement

  • Compute the cross-entropy of some tokens against the corpus
  • Predict what follows a given token prefix
  • Train the model by adding tokens to the corpus
  • Create the Python extension module wrapper for the C library
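
To make the three operations above concrete, here is a minimal pure-Python sketch of what such a model could look like. It is only an illustration: the class name `TinyNGramModel`, its method names, the `weight` parameter, and the add-one smoothing are all assumptions made for this example, not the UnnaturalGrams API or the planned C implementation.

```python
import math
from collections import Counter, defaultdict


class TinyNGramModel:
    """Hypothetical stand-in for the planned module (not the real API)."""

    def __init__(self, order=3):
        self.order = order
        # Maps a context (the previous order-1 tokens) to next-token counts.
        self.counts = defaultdict(Counter)
        self.vocab = set()

    def _pad(self, tokens):
        # Prepend start markers so the first token also has a full context.
        return ["<s>"] * (self.order - 1) + list(tokens)

    def train(self, tokens, weight=1.0):
        """Add tokens to the corpus; weight scales each n-gram's count."""
        tokens = list(tokens)
        padded = self._pad(tokens)
        start = self.order - 1
        self.vocab.update(tokens)
        for i in range(start, len(padded)):
            context = tuple(padded[i - start:i])
            self.counts[context][padded[i]] += weight

    def _prob(self, context, token):
        # Add-one smoothing: an assumption chosen for brevity.
        seen = self.counts.get(context, {})
        total = sum(seen.values())
        size = len(self.vocab) + 1  # +1 reserves mass for unseen tokens
        return (seen.get(token, 0) + 1) / (total + size)

    def cross_entropy(self, tokens):
        """Average negative log2 probability per token, in bits."""
        tokens = list(tokens)
        if not tokens:
            raise ValueError("need at least one token")
        padded = self._pad(tokens)
        start = self.order - 1
        bits = 0.0
        for i in range(start, len(padded)):
            context = tuple(padded[i - start:i])
            bits -= math.log2(self._prob(context, padded[i]))
        return bits / len(tokens)

    def predict(self, prefix, k=3):
        """Return up to k tokens most often seen after the prefix."""
        padded = self._pad(prefix)
        context = tuple(padded[len(padded) - (self.order - 1):])
        seen = self.counts.get(context, Counter())
        return [token for token, _ in seen.most_common(k)]
```

With this sketch, a token sequence that resembles the training data should score a lower cross-entropy than one that does not, and passing a larger `weight` to `train` lets some token sequences count for more than others, which is one way to read "weighted n-grams".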