This repository contains the code to run the evaluation and analyses for MoreHopQA: More Than Multi-hop Reasoning.
We also provide the dataset on Hugging Face. For more details, please also see our paper.
We propose a new multi-hop dataset, MoreHopQA, which shifts from extractive to generative answers. Our dataset is created by utilizing three existing multi-hop datasets: HotpotQA, 2Wiki-MultihopQA, and MuSiQue. Instead of relying solely on factual reasoning, we enhance the existing multi-hop questions by adding another layer of questioning.
Our dataset is created through a semi-automated process, resulting in a dataset with 1118 samples that have undergone human verification.
For each sample, we share our 6 evaluation cases, including the new question, the original question, all the necessary subquestions, and a composite question from the second entity to the final answer (case 3 below).
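For a quick look at the data, the dataset can be loaded with the Hugging Face datasets library. The snippet below is a minimal sketch; the dataset identifier and split name are assumptions, so check the Hugging Face page for the exact values.

```python
# Load MoreHopQA from the Hugging Face Hub and inspect one sample.
# The dataset ID and split name below are assumptions; see the
# Hugging Face page for the exact values.
from datasets import load_dataset

dataset = load_dataset("alab-nii/morehopqa", split="train")
print(len(dataset))       # expected: 1118 human-verified samples
print(dataset[0].keys())  # the question variants described above
```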
First, create the conda environment and activate it:
conda env create -f conda_env.yml
conda activate genhop
If running on CUDA 11, install PyTorch 2 for CUDA 11:
pip3 install --upgrade --force-reinstall torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
To verify the installation, start a Python 3 interpreter and check that
import torch
torch.cuda.is_available()
returns True.
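As an additional sanity check, the following snippet reports which GPU PyTorch will use (a minimal sketch; it only prints device information):

```python
# Report whether CUDA is visible to PyTorch and, if so, the device name.
import torch

if torch.cuda.is_available():
    print("CUDA available:", torch.cuda.get_device_name(0))
else:
    print("CUDA not available")
```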
To evaluate answers via NER, it is necessary to install the spaCy model:
python3 -m spacy download en_core_web_sm
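For reference, an NER-based answer check roughly follows the pattern below. This is a minimal sketch, not the repository's actual evaluation code; the function name and example strings are illustrative.

```python
# Sketch of an NER-based comparison between a predicted answer and the
# gold answer, using the spaCy model installed above.
import spacy

nlp = spacy.load("en_core_web_sm")

def extract_entities(text):
    # Return the lowercased named-entity strings found in the text.
    return {ent.text.lower() for ent in nlp(text).ents}

predicted = "The film was released in 1994 in Los Angeles."
gold = "1994"

# Simple check: does any entity in the prediction contain the gold answer?
match = any(gold.lower() in ent for ent in extract_entities(predicted))
print(match)  # True for this example
```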
Additionally, to run models from OpenAI, set the OpenAI API key:
export OPENAI_API_KEY=*api_key*
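A quick way to confirm the key is visible to Python (this only checks the environment variable; it does not call the API):

```python
import os

# The OpenAI client reads OPENAI_API_KEY from the environment by default;
# this just verifies that it has been exported in the current shell.
assert os.environ.get("OPENAI_API_KEY"), "OPENAI_API_KEY is not set"
```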
To run on macOS, it might be necessary to install non-MKL versions of NumPy and pandas:
conda install nomkl
then
conda install numpy pandas
followed by
conda remove mkl mkl-service
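Afterwards, a quick import check confirms that the non-MKL builds load correctly:

```python
# Verify that NumPy and pandas import cleanly with the non-MKL builds.
import numpy as np
import pandas as pd

print(np.__version__, pd.__version__)
```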
To evaluate all models from the paper, run
run_evaluation.sh
To reproduce our result tables, we provide the summarize_results.ipynb notebook.
The MoreHopQA dataset is licensed under CC BY 4.0.
If you find this dataset helpful, please consider citing our paper:
@misc{schnitzler2024morehopqa,
title={MoreHopQA: More Than Multi-hop Reasoning},
author={Julian Schnitzler and Xanh Ho and Jiahao Huang and Florian Boudin and Saku Sugawara and Akiko Aizawa},
year={2024},
eprint={2406.13397},
archivePrefix={arXiv}
}