Code for the BEA 13 paper "Co-Attention Based Neural Network for Source-Dependent Essay Scoring"
Python 2 for data/preprocess_asap.py (will be upgraded to Python 3).
- When installing Python 2, I recommend that you do not add it to your PATH variable. This way, it doesn't interfere with your current Python workflow.
- Then, when you need to run the preprocessing script, call the Python 2 interpreter by its full path, for example:
- c:/Python27/python.exe preprocess_asap.py
Python 3 for the rest
- tensorflow 2.0.0 beta
- gensim
- gensim may have more dependencies, such as VS tools
- nltk
- sklearn (installed as scikit-learn)
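Before training, it can help to confirm that the Python 3 packages listed above are visible to the interpreter you plan to use. The following is a minimal sketch, not part of the repository: the helper name `missing_packages` is hypothetical, and the check only tests whether each package can be found, not whether its version matches.

```python
import importlib.util

# Import names of the packages listed above.
# Note: scikit-learn is imported as "sklearn".
REQUIRED = ["tensorflow", "gensim", "nltk", "sklearn"]


def missing_packages(names):
    """Return the top-level package names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages: " + ", ".join(missing))
    else:
        print("All listed packages can be found.")
```

Run it with the same Python 3 interpreter that you will use for attn_network.py.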
- Run python2 data/preprocess_asap.py for data splitting.
- Download GloVe pretrained embedding from: https://nlp.stanford.edu/projects/glove
- Extract glove.6B.50d.txt to the glove folder
- Run python3 attn_network.py [options] for training and evaluation
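To check that the GloVe file was extracted to the right place before training, a quick read of it can be sketched as below. This is an illustration only and not how attn_network.py loads embeddings: the function name `load_glove` is hypothetical, the path `glove/glove.6B.50d.txt` is assumed from the steps above, and the parser relies on the GloVe text format of one word followed by space-separated floats per line.

```python
import os


def load_glove(path, dim=50):
    """Read GloVe text vectors into a dict mapping word -> list of floats."""
    vectors = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            if len(parts) != dim + 1:
                continue  # skip lines that do not have a word plus `dim` values
            vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


if __name__ == "__main__":
    glove_path = os.path.join("glove", "glove.6B.50d.txt")  # assumed location
    if os.path.exists(glove_path):
        embeddings = load_glove(glove_path, dim=50)
        print("Loaded", len(embeddings), "word vectors")
    else:
        print("GloVe file not found at", glove_path)
```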
To run on Windows, follow the same steps as for Linux/macOS, with one change to the preprocessing script: remove two "\n" symbols from it.
- open data/preprocess_asap.py in your preferred text editor
- on lines 28 and 31 in the preprocessing script, you'll find: f_write.write("\r\n")
- remove the \n from both lines
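The steps above do not say why this edit is needed. One plausible explanation, stated here as an assumption, is newline translation: a file opened in text mode on Windows turns every "\n" into "\r\n", so writing "\r\n" ends up as "\r\r\n" on disk. The sketch below simulates that translation with Python 3's `io.TextIOWrapper` so it can be run on any platform; the helper name `written_bytes` is hypothetical and the code is not taken from the preprocessing script.

```python
import io


def written_bytes(text, newline):
    """Return the bytes produced when `text` is written through a text stream
    that applies the given newline translation."""
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding="ascii", newline=newline)
    stream.write(text)
    stream.flush()
    data = raw.getvalue()
    stream.detach()  # release `raw` so closing the wrapper does not affect it
    return data


if __name__ == "__main__":
    # Windows-style translation: "\n" becomes "\r\n" on write.
    print(written_bytes("essay\r\n", newline="\r\n"))
    print(written_bytes("essay\r", newline="\r\n"))
    # No translation, as on Linux/macOS.
    print(written_bytes("essay\r\n", newline="\n"))
```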
After preprocessing the data, the program starts the training process. At the end of each epoch, the logger outputs the development and test set scores. The highest scores are kept and printed after all epochs are complete. You can change the task (the default is ASAP3), the number of epochs (the default is 50), and more by looking at the arguments in lines 21-51 of attn_network.py.
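The idea of keeping the best score across epochs can be sketched as follows. This is an illustration under assumptions, not the logic in attn_network.py: it assumes the best epoch is the one with the highest development score and reports the test score from that same epoch. The function name `select_best_epoch` is hypothetical, and the numbers in the usage example are placeholders, not results from the paper.

```python
def select_best_epoch(history):
    """Pick the epoch with the highest development score.

    history: list of (dev_score, test_score) tuples, one per epoch.
    Returns (epoch_number, dev_score, test_score), or None if history is empty.
    Ties keep the earliest epoch.
    """
    best = None
    for epoch, (dev, test) in enumerate(history, start=1):
        if best is None or dev > best[1]:
            best = (epoch, dev, test)
    return best


if __name__ == "__main__":
    # Placeholder scores for three epochs, for illustration only.
    scores = [(0.60, 0.58), (0.66, 0.61), (0.64, 0.65)]
    print(select_best_epoch(scores))
```

Selecting on the development score rather than the test score keeps the reported test score from being tuned on the test data.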
Additionally, if you want to look at specific essays with their predicted and actual scores:
- Go to the checkpoints folder.
- After training, there should be a text file with one number per line. The line number corresponds to the essay number in the test data.
- In the various fold directories, open the test.tsv file and compare it with the predicted scores from the previous step.
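The manual comparison above could be scripted roughly as below. This is a sketch under assumptions, not code from the repository: the function name `pair_predictions` is hypothetical, the column layout of test.tsv is not assumed (each row is returned as a raw list of fields), and whether test.tsv has a header row is left as a parameter because it is not stated here. The only thing relied on is that line i of the predictions file belongs to essay i of the test data.

```python
import csv


def pair_predictions(pred_path, tsv_path, has_header=False):
    """Pair each predicted score with the matching row of a tab-separated test file.

    Returns a list of (predicted_score, row_fields) tuples in file order.
    """
    with open(pred_path, encoding="utf-8") as f:
        preds = [float(line) for line in f if line.strip()]
    with open(tsv_path, encoding="utf-8", newline="") as f:
        # QUOTE_NONE keeps quotation marks inside essay text as literal characters.
        reader = csv.reader(f, delimiter="\t", quoting=csv.QUOTE_NONE)
        rows = [row for row in reader if row]
    if has_header:
        rows = rows[1:]
    if len(preds) != len(rows):
        raise ValueError(
            "Got %d predictions but %d test rows" % (len(preds), len(rows))
        )
    return list(zip(preds, rows))
```

To use it, pass the path of the predictions file in the checkpoints folder and the path of the test.tsv file from the fold that was evaluated, then print or filter the returned pairs.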
- making specific essays, as well as their predicted scores and real scores, more accessible.
- likely a Python script
- updating preprocess_asap.py to be compatible with Python 3
If you use the code, please cite the following paper:
@inproceedings{zhang2018co,
title={Co-Attention Based Neural Network for Source-Dependent Essay Scoring},
author={Zhang, Haoran and Litman, Diane},
booktitle={Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications},
pages={399--409},
year={2018}
}