This is an implementation of the chunking task from the CoNLL 2000 dataset.
It is a BiLSTM-CRF implementation. The architecture is based on the paper "Bidirectional LSTM-CRF Models for Sequence Tagging".
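In a BiLSTM-CRF model, the bidirectional LSTM produces a score for every tag at every token, and the CRF layer adds a score for each tag-to-tag transition. The best tag sequence is then found with Viterbi decoding. Below is a minimal pure-Python sketch of that decoding step only. The function name and the list-of-lists inputs are assumptions made for illustration; this is not the code used in this repo.

```python
def viterbi_decode(emissions, transitions):
    """Return the highest-scoring sequence of tag indices.

    emissions[t][j]   : score of tag j at position t (e.g. from a BiLSTM)
    transitions[i][j] : score of moving from tag i to tag j
    Both are plain lists of floats (an illustrative assumption).
    """
    n_tags = len(emissions[0])
    score = list(emissions[0])  # best score of any path ending in each tag
    backpointers = []           # best previous tag for each position and tag

    for t in range(1, len(emissions)):
        new_score, pointers = [], []
        for j in range(n_tags):
            # Pick the previous tag that maximizes the path score into tag j.
            best_i = max(range(n_tags), key=lambda i: score[i] + transitions[i][j])
            pointers.append(best_i)
            new_score.append(score[best_i] + transitions[best_i][j] + emissions[t][j])
        score = new_score
        backpointers.append(pointers)

    # Walk back from the best final tag to recover the full path.
    best = max(range(n_tags), key=lambda j: score[j])
    path = [best]
    for pointers in reversed(backpointers):
        best = pointers[best]
        path.append(best)
    path.reverse()
    return path
```

With all transition scores at zero the decoder simply picks the best tag per token. A strongly negative transition score lets the CRF rule out an otherwise attractive tag pair, which is the main benefit over tagging each token independently.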
Rohit's repo on Named Entity Extraction using a movies dataset was a very good starting point for this implementation. Some of its code is used as-is here.
Tested with TensorFlow >= 1.7.0 and <= 1.15.0
Keras 2.2.4
Note: The code is written for a CPU implementation.
The script creates auxiliary files during the run, containing the predicted tags for the input sentences. Although the model predicts all the tags, I have only implemented evaluation of Precision, Recall and F1 score for Noun Phrase chunks. You can easily extend this to also evaluate Verb Phrases, PPN, etc.
I have included the Jupyter notebook and the equivalent plain Python 3 script.
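As an illustration of chunk-level evaluation for Noun Phrases, here is a minimal sketch. It assumes the tags follow the BIO scheme with `B-NP` and `I-NP` labels, and it counts a predicted chunk as correct only when its span exactly matches a gold chunk. The function names are hypothetical and this is not necessarily how the script in this repo computes its scores.

```python
def extract_np_chunks(tags):
    """Return the set of (start, end) spans of NP chunks in a BIO tag sequence."""
    chunks = set()
    start = None
    # A trailing "O" sentinel closes any chunk still open at the end.
    for i, tag in enumerate(list(tags) + ["O"]):
        # An open chunk ends unless the current tag continues it with I-NP.
        if start is not None and tag != "I-NP":
            chunks.add((start, i))
            start = None
        if tag == "B-NP":
            start = i
        elif tag == "I-NP" and start is None:
            start = i  # tolerate an I-NP that has no preceding B-NP
    return chunks


def np_chunk_scores(gold_sentences, pred_sentences):
    """Compute precision, recall and F1 over exactly matching NP chunks."""
    correct = n_gold = n_pred = 0
    for gold, pred in zip(gold_sentences, pred_sentences):
        gold_chunks = extract_np_chunks(gold)
        pred_chunks = extract_np_chunks(pred)
        correct += len(gold_chunks & pred_chunks)
        n_gold += len(gold_chunks)
        n_pred += len(pred_chunks)
    precision = correct / n_pred if n_pred else 0.0
    recall = correct / n_gold if n_gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1
```

Extending this to Verb Phrases or other chunk types would mostly mean passing the chunk label in as a parameter instead of hard-coding `NP`.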
jupyter notebook
python3 BiLstm_+_crf_for_chunking.py
Work is in progress to make this compatible with TensorFlow > 2.0 and the corresponding Keras versions.