The aim of this project is to classify tweets into a 'happy' class and a 'sad' class. To that end, we implemented several machine learning algorithms and finally chose a Recurrent Neural Network as the best method. We achieved an accuracy of about 86%, which is close to the state of the art.
- Python 3
- TensorFlow
- Keras
- scikit-learn
- Download the datasets from CrowdAI
- Put them into a directory called data/
- Download the Twitter GloVe embeddings over here
- Put them into a directory called data/glove.twitter.27B
- Create an empty directory called output/ and put the trained model, which you can download over here, into it
- Run the file run.py with Python 3. It should produce a file prediction.csv that you can upload directly to CrowdAI
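
Before running run.py, it can help to confirm that the directories from the steps above exist. The following is a minimal sketch and not part of the repository: `check_layout` is a hypothetical helper, and it only checks the three directories named above, since the file names inside them are not listed here.

```python
from pathlib import Path

# Directories named in the setup steps above (relative to the project root).
REQUIRED_DIRS = ("data", "data/glove.twitter.27B", "output")


def check_layout(root="."):
    """Return the required directories that are missing under `root`.

    Hypothetical helper for illustration; it is not part of the repository.
    """
    root = Path(root)
    return [d for d in REQUIRED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    missing = check_layout()
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("Layout looks complete; run.py can be started.")
```

Run from the project root, the script lists any directory that still has to be created. It does not verify that the datasets, the embeddings, or the trained model are actually present inside them.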
- run.py, a simple script that loads the saved model and predicts the classes of the test dataset
- mode_selection.py, a simple script that we used to cross-validate the accuracy of different classification algorithms
- rnn.py, the implementation of our Recurrent Neural Network
- plot.py, functions that we used to make the plots for the report
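
As an illustration of the last step performed by run.py, the sketch below writes predicted classes to a CSV file. It is not the repository's code: `write_predictions` is a hypothetical name, the `Id`/`Prediction` header and the 1-based ids are assumptions about the submission format, and the labels are placeholder values rather than model output.

```python
import csv


def write_predictions(path, predictions):
    """Write one row per test tweet: a 1-based id and its predicted class.

    Hypothetical helper; the header names are an assumed submission format.
    """
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["Id", "Prediction"])
        for i, label in enumerate(predictions, start=1):
            writer.writerow([i, label])


if __name__ == "__main__":
    # Placeholder labels (1 = 'happy', -1 = 'sad'); this encoding is an assumption.
    write_predictions("prediction.csv", [1, -1, 1])
```

In the real script the list of labels would come from the loaded model's predictions on the test dataset. The header and label encoding should be checked against the format that CrowdAI expects.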