The official implementation of "Open Relation Extraction With Non-existent and Multi-span Relationships", KR 2022.

farahhuifanyang/QuORE

Open Relation Extraction With Non-existent and Multi-span Relationships

This is the official code repository for "Open Relation Extraction With Non-existent and Multi-span Relationships" by Huifan Yang, Da-Wei Li, Zekun Li, Donglin Yang and Bin Wu. The video talk and slides are available. Please cite and star this work if you find it useful.

Introduction

Task

An illustration of our task: open relation extraction with single-span, multi-span, and non-existent relationships. (We present cases in English and Chinese due to the datasets of the two languages used in this paper.)

Our work

  • Motivated by the practical demands of open relation extraction (ORE), we define two further ORE tasks: extraction with non-existent relationships and extraction with multi-span relationships.
  • By re-constructing some existing ORE datasets, we derive and release four augmented datasets with non-existent relationships and one multi-span relation dataset.
  • We propose a query-based multi-head framework QuORE to extract single/multi-span relations and detect non-existent relationships effectively.

Model Usage

The commands below need to be run from the root directory of the repository.

First, install the prerequisites:

    pip install -r requirements.txt

  • Train:
    allennlp train configs/[config_file] -s [model_directory] --include-package src
  • Predict:
    allennlp predict [model_directory]/model.tar.gz [predict_file] --predictor machine-comprehension --cuda-device 0 --output-file [predict_directory]/predictions.jsonl --use-dataset-reader --include-package src
  • Evaluate:
    allennlp evaluate [model_directory]/model.tar.gz [eval_file] --cuda-device 0 --output-file [eval_directory]/eval.json --include-package src
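
The predict command writes its output to a `.jsonl` file, which normally holds one JSON object per line. As a minimal sketch for inspecting that file, the helpers below parse it line by line and list the field names they find. The function names `read_jsonl` and `field_names` are hypothetical and not part of this repository, and the sketch assumes each line is a JSON object. It makes no assumption about which fields the predictor emits.

```python
import json
from pathlib import Path
from typing import Any, Dict, Iterator, List


def read_jsonl(path: str) -> Iterator[Dict[str, Any]]:
    """Yield one parsed JSON object per non-empty line of a JSON Lines file."""
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # tolerate blank lines between records
                yield json.loads(line)


def field_names(path: str) -> List[str]:
    """Return the sorted union of top-level keys across all records.

    Assumes every record is a JSON object (dict); this is an assumption
    about the prediction file, not something the repository documents here.
    """
    keys = set()
    for record in read_jsonl(path):
        keys.update(record.keys())
    return sorted(keys)
```

Calling `field_names` on `[predict_directory]/predictions.jsonl`, with the placeholder replaced by your own directory, is a quick way to see what the predictor produced before writing any downstream analysis.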

Datasets

Our re-constructed datasets are available in the release. The training, development, and test sets share the same data format, which is documented alongside sample data in the sample_data directory.