⭐ The dataset files of MentalManip are available in this folder. You can also download them from Hugging Face.
This is the repository for the paper accepted at ACL'24: MentalManip: A Dataset For Fine-grained Analysis of Mental Manipulation in Conversations [ACL paper].
Mental manipulation, a significant form of abuse in interpersonal conversations, presents a challenge to identify due to its context-dependent and often subtle nature. The detection of manipulative language is essential for protecting potential victims, yet the field of Natural Language Processing (NLP) currently faces a scarcity of resources and research on this topic. Our study addresses this gap by introducing a new dataset, named MentalManip, which consists of 4,000 annotated movie dialogues. This dataset enables a comprehensive analysis of mental manipulation, pinpointing both the techniques utilized for manipulation and the vulnerabilities targeted in victims. Our research further explores the effectiveness of leading-edge models in recognizing manipulative dialogue and its components through a series of experiments with various configurations. The results demonstrate that these models inadequately identify and categorize manipulative content. Attempts to improve their performance by fine-tuning with existing datasets on mental health and toxicity have not overcome these limitations. We anticipate that MentalManip will stimulate further research, leading to progress in both understanding and mitigating the impact of mental manipulation in conversations.
MentalManip/
├── README.md
├── mentalmanip_dataset/ # contains the final MentalManip dataset
├── experiments/ # Code for all the experiments
│ ├── datasets/ # Datasets for the experiments
│ ├── manipulation_detection/ # Code for the manipulation detection task
│ ├── technique_vulnerability/ # Code for the technique and vulnerability classification task
├── statistic_analysis/ # Code for generating statistical figures in the paper
The dataset files are in the mentalmanip_dataset/ folder.
We recommend installing the following packages and versions before running the code:
Packages | Version |
---|---|
Pytorch | 2.1.2 |
Transformers | 4.36.2 |
Tokenizers | 0.15.0 |
Openai | 1.6.1 |
Scipy | 1.11.4 |
Seaborn | 0.12.2 |
Sentence-transformers | 2.3.0 |
tqdm | 4.65.0 |
Pandas | 2.1.4 |
scikit-learn | 1.2.2 |
peft | 0.7.1 |
trl | 0.7.7 |
If you use conda to manage your environment, you can add the following channels so that the packages above can be found.
$ conda config --add channels conda-forge
$ conda config --add channels pytorch
$ conda config --add channels nvidia
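As a quick sanity check against the version table above, the following is a minimal sketch (not part of this repository) that compares installed package versions with the recommended ones. It assumes the PyPI distribution names listed in `RECOMMENDED` (for example, `torch` for Pytorch); `check_versions` and `print_report` are hypothetical helper names.

```python
from importlib import metadata

# Recommended versions from the table above.
# Assumption: keys are PyPI distribution names (e.g. "torch" for Pytorch).
RECOMMENDED = {
    "torch": "2.1.2",
    "transformers": "4.36.2",
    "tokenizers": "0.15.0",
    "openai": "1.6.1",
    "scipy": "1.11.4",
    "seaborn": "0.12.2",
    "sentence-transformers": "2.3.0",
    "tqdm": "4.65.0",
    "pandas": "2.1.4",
    "scikit-learn": "1.2.2",
    "peft": "0.7.1",
    "trl": "0.7.7",
}


def check_versions(recommended, get_version=metadata.version):
    """Return {name: (wanted, installed_or_None, matches)} for each package."""
    report = {}
    for name, wanted in recommended.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Ignore local build suffixes such as "+cu121" when comparing.
        matches = installed is not None and installed.split("+")[0] == wanted
        report[name] = (wanted, installed, matches)
    return report


def print_report(recommended=RECOMMENDED):
    """Print one line per package, flagging missing or mismatched versions."""
    for name, (wanted, installed, ok) in check_versions(recommended).items():
        status = "ok" if ok else "CHECK"
        print(f"{name:24s} wanted={wanted:8s} installed={installed} [{status}]")
```

Calling `print_report()` in the target environment lists each package with its recommended and installed version. A mismatch is only a hint to look closer, since nearby versions may also work.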
All the code for the experiments is in the experiments/ folder.
We provide example command lines in the runfile1 and runfile2 files for running the detection and classification tasks.
For example, to run the Llama-2-13b model on the Manipulation Detection task on the MentalManip_con dataset under the zero-shot prompting setting:
$ CUDA_VISIBLE_DEVICES=0,1 python zeroshot_prompt.py --model llama-13b \
--data ../datasets/mentalmanip_con.csv \
--log_dir ./logs
To fine-tune the Llama-2-13b model on the MentalManip_con dataset (first train and save the model, then evaluate):
$ CUDA_VISIBLE_DEVICES=0,1 python finetune.py --model llama-13b \
--mode train \
--eval_data mentalmanip_con \
--train_data mentalmanip
$ CUDA_VISIBLE_DEVICES=0,1 python finetune.py --model llama-13b \
--mode eval \
--eval_data mentalmanip_con \
--train_data mentalmanip
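The two-step fine-tuning run above can also be chained from Python so that evaluation only starts if training succeeds. This is a minimal sketch, not part of the repository: it reuses the flags shown in the commands above, assumes it is launched from the directory containing `finetune.py`, and the helper names `finetune_command` and `train_then_eval` are hypothetical.

```python
import os
import subprocess
import sys


def finetune_command(mode, model="llama-13b",
                     eval_data="mentalmanip_con", train_data="mentalmanip"):
    """Build the argv list for one finetune.py call (mode is 'train' or 'eval')."""
    return [
        sys.executable, "finetune.py",
        "--model", model,
        "--mode", mode,
        "--eval_data", eval_data,
        "--train_data", train_data,
    ]


def train_then_eval(gpus="0,1", run=subprocess.run):
    """Run training, then evaluation, on the given GPUs.

    check=True makes a failed training run raise CalledProcessError,
    so the evaluation step is skipped in that case.
    """
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=gpus)
    for mode in ("train", "eval"):
        run(finetune_command(mode), env=env, check=True)
```

Calling `train_then_eval()` runs both steps with the same arguments as the shell commands above; pass a different `gpus` string to change the visible devices.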
- Please check your environment settings and make sure all required packages are installed at the recommended versions.
- Before running ChatGPT, please place your API key in the code.
- Before running Llama-2, please make sure you have requested access to the models in the official Meta Llama 2 repositories.
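If you would rather not hard-code the key, one option is to read it from an environment variable and pass it to the code. The following is a minimal sketch under that assumption: `OPENAI_API_KEY` is the conventional variable name, and `load_api_key` is a hypothetical helper, not a function from this repository.

```python
import os


def load_api_key(env=None, var_name="OPENAI_API_KEY"):
    """Return the API key from the environment, or raise if it is missing."""
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {var_name} environment variable before running ChatGPT experiments."
        )
    return key
```

With the `openai` 1.x package, the returned value can be passed to the client as `OpenAI(api_key=load_api_key())`.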
This folder contains code for reproducing the statistical analysis in the paper.
This code file contains functions to:
- Draw the distribution of techniques and vulnerabilities in the MentalManip datasets.
- Draw the distribution of sentiment scores in the MentalManip datasets.
- Draw co-occurrence heat maps of techniques and vulnerabilities.
- Draw the embedding space.
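To illustrate what a co-occurrence heat map of techniques and vulnerabilities is built from, here is a minimal sketch that counts how often each technique label appears together with each vulnerability label. It is not the repository's implementation: the input format, the helper names, and the toy rows are assumptions made for illustration only.

```python
from collections import Counter
from itertools import product


def cooccurrence_counts(rows):
    """Count (technique, vulnerability) pairs.

    rows: iterable of (techniques, vulnerabilities), each a list of labels
    attached to one dialogue. Each pair is counted once per dialogue.
    """
    counts = Counter()
    for techniques, vulnerabilities in rows:
        for pair in product(set(techniques), set(vulnerabilities)):
            counts[pair] += 1
    return counts


def to_matrix(counts, row_labels, col_labels):
    """Arrange the counts as a list of rows, ready for a heat map."""
    return [[counts[(r, c)] for c in col_labels] for r in row_labels]


# Toy, made-up annotations (not taken from the dataset).
toy_rows = [
    (["Denial", "Accusation"], ["Dependency"]),
    (["Intimidation"], ["Dependency", "Naivete"]),
    (["Denial"], ["Naivete"]),
]
techniques = ["Accusation", "Denial", "Intimidation"]
vulnerabilities = ["Dependency", "Naivete"]
matrix = to_matrix(cooccurrence_counts(toy_rows), techniques, vulnerabilities)
print(matrix)  # [[1, 0], [1, 1], [1, 1]]
```

A matrix in this shape can be passed to a plotting function such as `seaborn.heatmap`, with the two label lists used as tick labels.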
This code file contains functions to:
- Calculate the statistics of the MentalManip dataset and other datasets.
- Draw the CCDF (complementary cumulative distribution function) of the number of utterances per dialogue.
- Perform sentiment analysis.
- Draw the embedding space.
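For readers unfamiliar with the CCDF, the sketch below shows one way to compute it for a list of per-dialogue utterance counts. It is an illustration rather than the repository's code: the `ccdf` helper is hypothetical, the toy counts are made up, and it uses the P(X >= x) convention (some references use P(X > x) instead).

```python
from collections import Counter


def ccdf(values):
    """Return sorted (x, P(X >= x)) points for a list of numbers."""
    n = len(values)
    counts = Counter(values)
    remaining = n  # number of observations that are >= the current x
    points = []
    for x in sorted(counts):
        points.append((x, remaining / n))
        remaining -= counts[x]
    return points


# Toy utterance counts for four dialogues (made up for illustration).
print(ccdf([2, 3, 3, 5]))  # [(2, 1.0), (3, 0.75), (5, 0.25)]
```

Plotting these points on log-scaled axes is a common way to compare how long dialogues are across datasets.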
@inproceedings{MentalManip,
title={MentalManip: A Dataset For Fine-grained Analysis of Mental Manipulation in Conversations},
author={Yuxin Wang and
Ivory Yang and
Saeed Hassanpour and
Soroush Vosoughi},
booktitle={Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)},
pages={3747--3764},
year={2024},
url={https://aclanthology.org/2024.acl-long.206},
}