mcQA : Multiple Choice Questions Answering


Answering multiple choice questions with Language Models.

News 📢

  • 🚧 This project is currently under development. Stay tuned! 🤩

Jun 6th, 2020

  • Refactored the data subpackage; the library now supports the RACE, Synonym, SWAG, and ARC datasets.
  • Upgraded to transformers==2.10.0.

Installation

With pip

pip install mcqa

From source

git clone https://github.com/mcqa-suite/mcqa.git
cd mcqa
pip install -e .

Getting started

Data preparation

To train an mcQA model, you need to create a CSV file with n+2 columns, where n is the number of choices per question. The first column holds the context sentence, the next n columns hold the choices for that question, and the last column holds the selected answer.

Below is an example of a 3-choice question (taken from the CoS-E dataset):

| Context sentence | Choice 1 | Choice 2 | Choice 3 | Label |
| --- | --- | --- | --- | --- |
| People do what during their time off from work? | take trips | brow shorter | become hysterical | take trips |
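
Such a file can be produced with Python's standard csv module. Below is a minimal sketch, assuming a header row and using illustrative file, column, and row values (verify whether your version of the loader expects a header):

import csv

# One hypothetical 3-choice training row in the
# (context, choice_1 ... choice_n, label) layout described above
rows = [
    ["People do what during their time off from work?",
     "take trips", "brow shorter", "become hysterical", "take trips"],
]

with open("train.csv", "w", newline="") as f:
    writer = csv.writer(f)
    # Header names are illustrative; drop this row if the loader expects none
    writer.writerow(["context", "choice_1", "choice_2", "choice_3", "label"])
    writer.writerows(rows)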

If you have a trained mcQA model and want to run inference on a dataset, the file should have the same format as the training data, but without the label column.
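
For instance, a minimal sketch of an inference file, under the same header assumption as above:

import csv

# Same layout as the training file, minus the final label column
# (file name and header names are illustrative assumptions)
with open("test.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["context", "choice_1", "choice_2", "choice_3"])
    writer.writerow(["People do what during their time off from work?",
                     "take trips", "brow shorter", "become hysterical"])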

See an example of data preparation below:

from mcqa.data import MCQAData

# Tokenization and encoding settings for the BERT model used downstream
mcqa_data = MCQAData(bert_model="bert-base-uncased",
                     lower_case=True,
                     max_seq_length=256)

# Read the training set (with labels) and the test set (without labels)
train_dataset = mcqa_data.read(data_file='swagaf/data/train.csv', is_training=True)
test_dataset = mcqa_data.read(data_file='swagaf/data/test.csv', is_training=False)

Model training

from mcqa.models import Model

# Wrap the BERT model; training runs on GPU when device="cuda"
mdl = Model(bert_model="bert-base-uncased", device="cuda")

mdl.fit(train_dataset, train_batch_size=32, num_train_epochs=20)

Prediction

# Predict an answer for each question in the dataset
preds = mdl.predict(test_dataset, eval_batch_size=32)

Evaluation

from sklearn.metrics import accuracy_score
from mcqa.data import get_labels

# Compare predictions against the labels of the dataset that was predicted on;
# the evaluation file must therefore include a label column
# (i.e. be read with is_training=True)
print(accuracy_score(get_labels(test_dataset), preds))

References

| Type | Title | Author | Year |
| --- | --- | --- | --- |
| 📰 Paper | Are We Modeling the Task or the Annotator? An Investigation of Annotator Bias in Natural Language Understanding Datasets | Mor Geva, Yoav Goldberg, Jonathan Berant | 2019 |
| 📰 Paper | Explain Yourself! Leveraging Language Models for Commonsense Reasoning | Nazneen Fatema Rajani, Bryan McCann, Caiming Xiong, Richard Socher | 2019 |
| 📰 Paper | SWAG: A Large-Scale Adversarial Dataset for Grounded Commonsense Inference | Rowan Zellers, Yonatan Bisk, Roy Schwartz, Yejin Choi | 2018 |
| 📰 Paper | Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book Question Answering | Todor Mihaylov, Peter Clark, Tushar Khot, Ashish Sabharwal | 2018 |
| 📰 Paper | CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge | Alon Talmor, Jonathan Herzig, Nicholas Lourie, Jonathan Berant | 2018 |
| 📰 Paper | RACE: Large-scale ReAding Comprehension Dataset From Examinations | Guokun Lai, Qizhe Xie, Hanxiao Liu, Yiming Yang, Eduard Hovy | 2017 |
| 💻 Framework | Scikit-learn: Machine Learning in Python | Pedregosa et al. | 2011 |
| 💻 Framework | PyTorch | Adam Paszke, Sam Gross, Soumith Chintala, Gregory Chanan | 2016 |
| 💻 Framework | Transformers: State-of-the-art Natural Language Processing for TensorFlow 2.0 and PyTorch | Hugging Face | 2018 |
| 📹 Video | Stanford CS224N: NLP with Deep Learning, Lecture 10 – Question Answering | Christopher Manning | 2019 |

LICENSE

Apache-2.0

Contributing

Read our Contributing Guidelines.

Citation

@misc{Taycir2019,
  author = {mcQA-suite},
  title = {mcQA},
  year = {2019},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/mcQA-suite/mcQA/}}
}