SteliosGian/spam-filtering

Spam Filtering with TensorFlow

Table of Contents
  1. About The Project
  2. Getting Started
  3. Roadmap

About The Project

This project predicts whether a message is spam or not spam using a Long Short-Term Memory (LSTM) network.
The predictions are served through an API that runs in a Docker container.

The dataset for this project is taken from Kaggle. Download it and place it in the "src/main/data/" directory.

Built With

Getting Started

This project starts a server that hosts the API.

To start, move to the docker directory by running

cd docker/

After that, run the following commands. First, build the Docker image using Docker Compose.

docker compose build

Then run the application inside the Docker container.

docker compose up

After that, the API will be available at http://0.0.0.0:8000/predict

In the form, you can type a message and the model will predict whether it is spam or not.
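The form can also be exercised from a script. The sketch below is illustrative only: it assumes the endpoint accepts a form-encoded POST with a field named "message", which this README does not confirm, so check the API code for the real field name and method. The helper names `build_predict_request` and `send` are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Address taken from this README; adjust if the container is mapped elsewhere.
BASE_URL = "http://0.0.0.0:8000"


def build_predict_request(message: str, base_url: str = BASE_URL) -> Request:
    """Build a form-encoded POST request for the /predict endpoint.

    ASSUMPTION: the form field is called "message". The actual name
    depends on how the API defines its form.
    """
    body = urlencode({"message": message}).encode("utf-8")
    return Request(
        f"{base_url}/predict",
        data=body,  # supplying data makes urllib issue a POST
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def send(request: Request, timeout: float = 10.0) -> str:
    """Send the request and return the raw response body as text."""
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With the container running, `send(build_predict_request("Win a free prize now"))` would return whatever the endpoint responds with; the response format is not documented here, so the sketch returns it unparsed.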

How To Train

The LSTM model is trained with the script at "src/main/train.py".

Additionally, if the MLflow server is up and running, the metrics and hyperparameters are tracked and saved.

To train the model, follow these steps:

cd src
python3 main/train.py --source main/data/spam.csv --pipe_path main/trained_pipe --model_path main/trained_models

The "source", "pipe_path", and "model_path" arguments are mandatory. Additional optional arguments set the hyperparameters; they are defined in the "train.py" file.
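As a rough illustration of this command-line interface, the sketch below declares the three mandatory arguments with `argparse`. It is not the actual contents of "train.py": the use of `argparse`, the function name `build_parser`, and the optional `--epochs` and `--batch_size` flags with their defaults are assumptions made for the example.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Illustrative parser mirroring the mandatory arguments named above."""
    parser = argparse.ArgumentParser(
        description="Train the spam-filtering LSTM (illustrative sketch)."
    )
    # Mandatory arguments, as listed in this README.
    parser.add_argument("--source", required=True,
                        help="Path to the CSV dataset, e.g. main/data/spam.csv")
    parser.add_argument("--pipe_path", required=True,
                        help="Directory where the fitted preprocessing pipeline is saved")
    parser.add_argument("--model_path", required=True,
                        help="Directory where the trained model is saved")
    # HYPOTHETICAL optional hyperparameters; the real names and defaults
    # are defined in train.py.
    parser.add_argument("--epochs", type=int, default=5)
    parser.add_argument("--batch_size", type=int, default=32)
    return parser
```

Parsing the command shown above, `build_parser().parse_args(["--source", "main/data/spam.csv", "--pipe_path", "main/trained_pipe", "--model_path", "main/trained_models"])`, yields a namespace with those three paths, while omitting any of them makes `argparse` exit with a usage error.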

Prerequisites

Install the required Python libraries from the "requirements.txt" file.

Notes

Roadmap

  • Containerize the application ☑
  • Set up CI/CD pipeline ☑
  • Deploy on Heroku