A usable PyTorch implementation of the Transformer for learning purposes, adapted from the Harvard NLP Annotated Transformer.
- `transformer.ipynb`: Notebook version of the original Encoder-Decoder structure described in *Attention Is All You Need*.
- `transformer.py`: The original Encoder-Decoder structure described in *Attention Is All You Need* (its core attention operation is sketched below).
- `T4Tmodel.py`: Transformer for Translation model, used in Example 1.
- `BERTmodel.py`: Implementation of BERT, used in Examples 2, 3, and 4.
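For orientation, here is a minimal sketch of the scaled dot-product attention at the heart of the Encoder-Decoder structure, written directly from the formula in *Attention Is All You Need*. The function name and tensor shapes are illustrative, not taken from `transformer.py`.

```python
import math
import torch

def scaled_dot_product_attention(query, key, value, mask=None):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V.

    query, key, value: tensors of shape (batch, heads, seq_len, d_k)
    mask: optional tensor broadcastable to (batch, heads, q_len, k_len),
          with 0 marking positions to hide (e.g. padding or future tokens).
    """
    d_k = query.size(-1)
    # Similarity scores, scaled to keep softmax gradients well-behaved.
    scores = torch.matmul(query, key.transpose(-2, -1)) / math.sqrt(d_k)
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = scores.softmax(dim=-1)
    return torch.matmul(weights, value), weights
```

Multi-head attention applies this operation h times in parallel on learned projections of size d_model / h per head, then concatenates and re-projects the results.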
Check out `./examples`.
- Example 1: A Chinese-to-English translation model, trained on the WMT 2018 en-zh dataset.
- Example 2: Pretrain a mini BERT and fine-tune it on several downstream tasks.
- Example 3: Fine-tune our BERT model on SQuAD, a question-answering dataset.
- Example 4 (TODO): Fine-tune our BERT model on IMDb, a sentiment classification dataset; a classification-head sketch follows this list.
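The fine-tuning pattern shared by Examples 2 to 4 is to put a small task head on top of the pretrained encoder. Below is a hedged sketch for the classification case; `encoder`, its call signature, and `hidden_size` are placeholders, since `BERTmodel.py` may expose a different interface.

```python
import torch.nn as nn

class BertClassifier(nn.Module):
    """A BERT-style encoder with a linear head on the [CLS] token (sketch)."""

    def __init__(self, encoder, hidden_size=768, num_labels=2):
        super().__init__()
        self.encoder = encoder  # placeholder: a pretrained BERT-style encoder
        self.dropout = nn.Dropout(0.1)
        self.classifier = nn.Linear(hidden_size, num_labels)

    def forward(self, input_ids, attention_mask):
        # Assumption: the encoder returns per-token hidden states of shape
        # (batch, seq_len, hidden_size); the interface in this repo may differ.
        hidden = self.encoder(input_ids, attention_mask)
        cls = hidden[:, 0]  # representation of the leading [CLS] token
        return self.classifier(self.dropout(cls))
```

Training then minimizes `nn.CrossEntropyLoss` over the logits. The same head-swap idea covers the other tasks; SQuAD replaces the single classifier with start- and end-position heads over every token.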
- Attention Is All You Need: https://arxiv.org/abs/1706.03762
- The Annotated Transformer, Harvard NLP: http://nlp.seas.harvard.edu/annotated-transformer
- PyTorch docs: https://pytorch.org/docs
- BERT: https://arxiv.org/abs/1810.04805
- Dive into Deep Learning (d2l): https://d2l.ai/