Simple app for classifying phishing content.
It uses the bert-finetuned-phishing model described below.

To run it:
- mix deps.get
- iex -S mix
- In the Elixir shell, either:
  - Fishy.run("Click this link to win a prize!")
  or
  - paste content into content.txt (in lib/) and run File.read!("lib/content.txt") |> Fishy.run()
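File.read!/1 raises if lib/content.txt is missing. A minimal sketch of a non-raising variant follows. ContentLoader is a hypothetical helper module, not part of this app, and it takes the classifier as a function so it can be used with Fishy.run/1 or with a stub:

```elixir
defmodule ContentLoader do
  # Hypothetical helper, not part of the app: reads `path` and passes its
  # contents to `classify`, returning a tagged tuple instead of raising.
  def classify_file(path, classify) when is_function(classify, 1) do
    case File.read(path) do
      {:ok, text} -> {:ok, classify.(text)}
      {:error, reason} -> {:error, reason}
    end
  end
end
```

In the shell this would be called as ContentLoader.classify_file("lib/content.txt", &Fishy.run/1). A missing file yields {:error, :enoent} rather than an exception.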
To time a single call:
:timer.tc(Fishy, :run, ["Click this link to win a prize!"])
{2097277,
 %{
   predictions: [
     %{label: "phishing", score: 0.9999933242797852},
     %{label: "benign", score: 6.691957423754502e-6}
   ]
 }}
The first element of the tuple, 2097277, is the elapsed time in microseconds (about 2.1 seconds).
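The tuple returned by :timer.tc/3 can be unpacked to report the elapsed time in milliseconds and pick the highest-scoring label. A minimal sketch, assuming the result shape shown above; the ResultSummary module and its function name are hypothetical:

```elixir
defmodule ResultSummary do
  # Hypothetical helper: takes the {microseconds, result} tuple returned by
  # :timer.tc and returns {elapsed_ms, top_label, top_score}.
  def summarize({micros, %{predictions: predictions}}) do
    # Integer conversion; fractional milliseconds are dropped.
    elapsed_ms = System.convert_time_unit(micros, :microsecond, :millisecond)
    top = Enum.max_by(predictions, & &1.score)
    {elapsed_ms, top.label, top.score}
  end
end

# Sample data copied from the output above.
timed =
  {2_097_277,
   %{
     predictions: [
       %{label: "phishing", score: 0.9999933242797852},
       %{label: "benign", score: 6.691957423754502e-6}
     ]
   }}

{elapsed_ms, label, _score} = ResultSummary.summarize(timed)
IO.puts("#{label} in #{elapsed_ms} ms")
```

With a live call, the sample tuple would be replaced by :timer.tc(Fishy, :run, [text]).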
The model, bert-finetuned-phishing, is a version of bert-large-uncased fine-tuned on the phishing-dataset dataset. It supports English only.
To support other languages, the multilingual BERT model bert-base-multilingual-cased would need to be fine-tuned on phishing and non-phishing examples in those languages.
Instructions for fine-tuning with Bumblebee: https://hexdocs.pm/bumblebee/fine_tuning.html