A Simple Bayes (a.k.a. Naive Bayes) implementation in Elixir.
- Multinomial Naive Bayes algorithm
- No external dependencies
- Ignores stop words
- Additive smoothing
- TF-IDF
- Optional keyword weighting
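As a rough illustration of the additive smoothing item in the list above, here is a minimal sketch. It is not taken from the library's source: the module `SmoothingSketch`, the function `likelihood/4`, and the exact formula are assumptions chosen to show the general idea, which is that a token never seen in a category still gets a small non-zero probability.

```elixir
defmodule SmoothingSketch do
  # Hypothetical helper for illustration only; not part of SimpleBayes.
  #
  # Estimates P(token | category) with additive smoothing:
  #   (count + alpha) / (total + alpha * vocab_size)
  # where `token_counts` maps each token to its count within one category,
  # `vocab_size` is the number of distinct tokens across all categories,
  # and `alpha` is the smoothing constant.
  def likelihood(token_counts, token, vocab_size, alpha \\ 0.001) do
    count = Map.get(token_counts, token, 0)
    total = token_counts |> Map.values() |> Enum.sum()

    (count + alpha) / (total + alpha * vocab_size)
  end
end

# "green" was never seen for this category, yet its likelihood is not zero.
SmoothingSketch.likelihood(%{"red" => 1, "sweet" => 1}, "green", 4)
```

A small `alpha` keeps the estimate close to the raw frequencies while still preventing a single unseen token from zeroing out a whole category.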
bayes = SimpleBayes.init
|> SimpleBayes.train(:apple, "red sweet")
|> SimpleBayes.train(:apple, "green", weight: 0.5)
|> SimpleBayes.train(:apple, "round", weight: 2)
|> SimpleBayes.train(:banana, "sweet")
|> SimpleBayes.train(:banana, "green", weight: 0.5)
|> SimpleBayes.train(:banana, "yellow long", weight: 2)
|> SimpleBayes.train(:orange, "red")
|> SimpleBayes.train(:orange, "yellow sweet", weight: 0.5)
|> SimpleBayes.train(:orange, "round", weight: 2)
bayes |> SimpleBayes.classify_one("Maybe green maybe red but definitely round and sweet.")
# => :apple
bayes |> SimpleBayes.classify("Maybe green maybe red but definitely round and sweet.")
# => [
# apple: 0.18519202529366116,
# orange: 0.14447781772131096,
# banana: 0.10123406763124557
# ]
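In the output above, the category returned by `classify_one` is also the highest-scoring entry returned by `classify`. Assuming that relationship holds in general (the source does not state it), the top category can be picked from the scored list as in this sketch. `TopCategorySketch` and `pick/1` are hypothetical names, not part of SimpleBayes.

```elixir
defmodule TopCategorySketch do
  # Hypothetical helper for illustration only; not part of SimpleBayes.
  # Takes a keyword list of {category, score} pairs and returns the
  # category with the highest score.
  def pick(scores) do
    {category, _score} = Enum.max_by(scores, fn {_category, score} -> score end)
    category
  end
end

# Scores copied from the classify output shown above.
TopCategorySketch.pick(
  apple: 0.18519202529366116,
  orange: 0.14447781772131096,
  banana: 0.10123406763124557
)
# => :apple
```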
In your application's config/config.exs:
config :simple_bayes, default_weight: 1
config :simple_bayes, smoothing: 0.001
config :simple_bayes, stop_words: ~w(
a about above after again against all am an and any are aren't as at be
because been before being below between both but by can't cannot could
couldn't did didn't do does doesn't doing don't down during each few for from
further had hadn't has hasn't have haven't having he he'd he'll he's her here
here's hers herself him himself his how how's i i'd i'll i'm i've if in into
is isn't it it's its itself let's me more most mustn't my myself no nor not of
off on once only or other ought our ours ourselves out over own same shan't
she she'd she'll she's should shouldn't so some such than that that's the
their theirs them themselves then there there's these they they'd they'll
they're they've this those through to too under until up very was wasn't we
we'd we'll we're we've were weren't what what's when when's where where's
which while who who's whom why why's with won't would wouldn't you you'd
you'll you're you've your yours yourself yourselves
)
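To show what a stop word list like the one above is for, here is a minimal sketch of filtering stop words out of a text before counting tokens. The tokenization rule (lowercasing, then splitting on anything that is not a letter or apostrophe) and the names `StopWordsSketch` and `tokenize/2` are assumptions for illustration; the library's actual tokenizer may differ.

```elixir
defmodule StopWordsSketch do
  # Hypothetical helper for illustration only; not part of SimpleBayes.
  # Lowercases the text, splits it into word tokens, and drops any token
  # found in `stop_words`.
  def tokenize(text, stop_words) do
    text
    |> String.downcase()
    |> String.split(~r/[^a-z']+/, trim: true)
    |> Enum.reject(fn token -> token in stop_words end)
  end
end

# A small subset of the configured stop words, for brevity.
stop_words = ~w(a and but the)

StopWordsSketch.tokenize(
  "Maybe green maybe red but definitely round and sweet.",
  stop_words
)
# => ["maybe", "green", "maybe", "red", "definitely", "round", "sweet"]
```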
Licensed under MIT.