This repository contains the official code for the AAAI 2024 Oral paper "Structured Probabilistic Coding".
Structured Probabilistic Coding (SPC) is a supervised representation learning technique: an encoder-only probabilistic coding framework with structured regularization from the target space.
By learning compact and informative representations from input related to the target task, SPC enhances the generalization ability of pre-trained language models (PLMs) for better language understanding.
Experimental results on 12 natural language understanding tasks demonstrate that SPC effectively improves the performance of PLMs for classification and regression.
- [Mar 2024]: Added support for the multi-task version of SPC.
- [Feb 2024]: Code is available on GitHub.
- [Dec 2023]: Paper is available on arXiv.
- [Dec 2023]: Paper is accepted by AAAI 2024 (Oral).
- Clone the repository
git clone https://github.com/zerohd4869/SPC.git
cd ./SPC
- Download the data and pre-trained model parameters
Download the 12 datasets mentioned in the paper from here, and extract the files into the /SPC/data/ directory.
This repo already contains 7 of these datasets by default, so this step is optional.
Download the roberta-base model parameters from here and place them in the /SPC/ptms/roberta-base/ directory.
SPC is a backbone-free representation learning method. You can choose whichever backbone model and initial parameter checkpoint suit your task or dataset.
- Install dependencies
# env: Python 3.7.16, Tesla A100 80GB
pip install -r spc_requirements.txt
- Run examples
For classification:
# EmojiEval dataset
nohup bash script/run_train_emojieval.sh > spc_roberta_emojieval.out &
# EmotionEval dataset
nohup bash script/run_train_emotioneval.sh > spc_roberta_emotioneval.out &
# HatEval dataset
nohup bash script/run_train_hateval.sh > spc_roberta_hateval.out &
# IronyEval dataset
nohup bash script/run_train_ironyeval.sh > spc_roberta_ironyeval.out &
# OffensEval dataset
nohup bash script/run_train_offenseval.sh > spc_roberta_offenseval.out &
# SentiEval dataset
nohup bash script/run_train_sentieval.sh > spc_roberta_sentieval.out &
# StanceEval dataset
nohup bash script/run_train_stanceeval.sh > spc_roberta_stanceeval.out &
# ISEAR dataset
nohup bash script/run_train_isear.sh > spc_roberta_isear.out &
# MELD dataset
nohup bash script/run_train_meld.sh > spc_roberta_meld.out &
# GoEmotions dataset
nohup bash script/run_train_goemotions.sh > spc_roberta_goemotions.out &
For regression:
# STS-B dataset
nohup bash script/run_train_stsb.sh > spc_roberta_stsb.out &
# CLAIRE dataset
nohup bash script/run_train_claire.sh > spc_roberta_claire.out &
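Before launching any of the runs above, it can help to confirm that the checkpoint directory from the download step is populated. The following is a minimal sketch, not part of the repository: the function name `missing_checkpoint_files` is hypothetical, and the file names are an assumption based on what a typical Hugging Face roberta-base download contains. The files this repo actually reads may differ.

```python
from pathlib import Path

# Assumption: typical files of a Hugging Face roberta-base download.
# The files this repository actually loads may differ.
EXPECTED_FILES = ("config.json", "vocab.json", "merges.txt")
WEIGHT_FILES = ("pytorch_model.bin", "model.safetensors")


def missing_checkpoint_files(model_dir):
    """Return the names of expected files that are absent from model_dir."""
    root = Path(model_dir)
    missing = [name for name in EXPECTED_FILES if not (root / name).is_file()]
    # Weights may ship in either format, so finding one of them is enough.
    if not any((root / name).is_file() for name in WEIGHT_FILES):
        missing.append(" or ".join(WEIGHT_FILES))
    return missing


if __name__ == "__main__":
    # Run from the repository root.
    problems = missing_checkpoint_files("ptms/roberta-base")
    print("checkpoint looks complete" if not problems else f"missing: {problems}")
```

An empty list means every expected file was found; anything else names what to download again.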
Apply to a new task/dataset
- Data preparation and loading script. Download the new dataset (take NewDataset as an example) and place the unzipped files in the /SPC/data/ directory. Add the label information of this dataset to the dictionary file SPC/data/task2label.json. Then, refer to the template /SPC/datasets/new_dataset_script.py to write the corresponding reading script for the dataset and place the file in the /SPC/datasets/ directory. Also, add the dataset and task information to the file SPC/task.py at the corresponding location.
- Refer to the Quick Start section above to write the corresponding sh script and run it.
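The label-registration step in the first item could be scripted roughly as follows. This is a sketch under a stated assumption: it treats task2label.json as a JSON object mapping each task name to a list of label strings, which may not match the file's real schema, so inspect an existing entry first. The helper `register_labels` and the example labels are hypothetical.

```python
import json
from pathlib import Path


def register_labels(json_path, task_name, labels):
    """Add task_name -> labels to a JSON dictionary file, keeping existing entries.

    Assumption: the file holds a JSON object of the form {"task": ["label", ...]}.
    """
    path = Path(json_path)
    mapping = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    # Refuse to silently overwrite a task that already has different labels.
    if task_name in mapping and mapping[task_name] != list(labels):
        raise ValueError(f"{task_name!r} is already registered with different labels")
    mapping[task_name] = list(labels)
    path.write_text(
        json.dumps(mapping, indent=2, ensure_ascii=False) + "\n", encoding="utf-8"
    )
    return mapping
```

For example, `register_labels("data/task2label.json", "NewDataset", ["negative", "positive"])` would add one entry and leave the others untouched. Back up the file before trying this, since the schema is assumed.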
During the training process for SPC, the primary hyperparameters for adjustment along with their suggested ranges are as follows:
var_weight (beta): [0.01, 0.1, 1, 10]
clu_weight (gamma): [0.01, 0.1, 1, 10]
weight_decay: [0, 0.001]
dropout: [0, 0.2]
normalize_flag: False, True
Other hyperparameters can be adjusted based on experimental conditions and specific task requirements, such as epochs, patience, warmup_ratio, bs, max_length, etc.
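To get a sense of how large a full sweep over the primary hyperparameters would be, the candidate values can be enumerated as a grid. This is an illustrative sketch only: it assumes that weight_decay and dropout each list two candidate values (they could instead denote intervals), and it does not show how the values are passed to the training scripts, since that depends on each script's arguments.

```python
from itertools import product

# Values copied from the list above. weight_decay and dropout are treated as
# two candidate values each (an assumption; they may denote intervals).
SEARCH_SPACE = {
    "var_weight": [0.01, 0.1, 1, 10],
    "clu_weight": [0.01, 0.1, 1, 10],
    "weight_decay": [0, 0.001],
    "dropout": [0, 0.2],
    "normalize_flag": [False, True],
}


def iter_configs(space):
    """Yield one dict per combination of the candidate values."""
    names = list(space)
    for values in product(*(space[name] for name in names)):
        yield dict(zip(names, values))


configs = list(iter_configs(SEARCH_SPACE))
print(len(configs))  # 4 * 4 * 2 * 2 * 2 = 128
```

With 128 combinations per dataset, a full grid is expensive, so tuning var_weight and clu_weight first and fixing the rest is one way to keep the sweep manageable.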
Apply all tasks in a multi-task paradigm
# 6 tasks/datasets in TweetEval
nohup bash script/run_train_mtl_tweeteval.sh > spc_roberta_mtl_tweeteval.out &
If you are interested in this work and want to use the code in this repo, please star this repo and cite it as:
@inproceedings{DBLP:conf/aaai/0001WLZH24,
author = {Dou Hu and
Lingwei Wei and
Yaxin Liu and
Wei Zhou and
Songlin Hu},
title = {Structured Probabilistic Coding},
booktitle = {{AAAI}},
pages = {12491--12501},
publisher = {{AAAI} Press},
year = {2024}
}