PoliTune: Analyzing the Impact of Data Selection and Fine-Tuning on Economic and Political Biases in Large Language Models

This repository provides the training scripts for fine-tuning LLMs on our preference datasets, as described in the paper.

Dataset

The datasets are hosted on Hugging Face Hub. There are two preference datasets:

Left-leaning preference dataset
Right-leaning preference dataset
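
To make the term concrete: a preference dataset of the kind used for DPO pairs each prompt with a preferred and a dispreferred response. The sketch below models one such record and runs a basic sanity check. The class name, the function name, and the field names (prompt, chosen, rejected) are assumptions for illustration only; the dataset cards on the Hugging Face Hub are the authority on the actual column names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PreferencePair:
    """One preference record: a prompt plus a preferred and a dispreferred reply.

    The field names are assumptions for illustration; check the dataset
    cards on the Hugging Face Hub for the real column names.
    """
    prompt: str
    chosen: str
    rejected: str


def is_valid_pair(pair: PreferencePair) -> bool:
    """Return True if all fields are non-empty and the two replies differ."""
    fields = (pair.prompt, pair.chosen, pair.rejected)
    if any(not text.strip() for text in fields):
        return False
    return pair.chosen.strip() != pair.rejected.strip()
```

A check like this can be run over a dataset before training to catch empty or degenerate pairs, which carry no preference signal.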

Repository Structure

configs/ - Contains the training recipes for LLMs.
data/ - Contains the dataset wrappers.
finetune/ - Contains the fine-tuning script.

Dependencies

The codebase depends on torchtune and the Hugging Face libraries.
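
A quick way to confirm the dependencies are installed before starting a run is to check that they can be imported. The following is a minimal sketch, assuming the import names torchtune and huggingface_hub; the helper missing_packages is hypothetical and not part of this repository.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the subset of top-level package names that cannot be imported."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    # Assumed import names: "torchtune" for torchtune and
    # "huggingface_hub" for the Hugging Face Hub client.
    missing = missing_packages(["torchtune", "huggingface_hub"])
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed packages are importable.")
```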

Fine-Tuning the Model

To fine-tune the model, follow these steps:

Download the model weights using torchtune's tune download command.
Ensure the configuration file under configs/ points to the downloaded model.

Run the fine-tuning process using torchtune:

tune run finetune/dpo_finetune.py --config configs/<config file> checkpointer.output_dir=<path to save the fine-tuned model> output_dir=<path to save the outputs and logs> dataset._component_=<data.datasets.politune_left|data.datasets.politune_right>

For example:

tune run finetune/dpo_finetune.py --config configs/llama8b_lora_dpo_single_device.yaml checkpointer.output_dir=checkpoints/ output_dir=output/ dataset._component_=data.datasets.politune_left
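
When launching several runs (for example, one per dataset), it can help to assemble the command programmatically. The sketch below builds the same tune run invocation shown above from its four variable parts. The helper build_tune_command and the DATASETS mapping are hypothetical names introduced here; only the command layout and the two dataset component paths come from this README.

```python
import shlex

# Dataset components named in the command template above.
DATASETS = {
    "left": "data.datasets.politune_left",
    "right": "data.datasets.politune_right",
}


def build_tune_command(config, checkpoint_dir, output_dir, leaning):
    """Return the `tune run` argument list for one fine-tuning run."""
    if leaning not in DATASETS:
        raise ValueError(f"leaning must be one of {sorted(DATASETS)}")
    return [
        "tune", "run", "finetune/dpo_finetune.py",
        "--config", config,
        f"checkpointer.output_dir={checkpoint_dir}",
        f"output_dir={output_dir}",
        f"dataset._component_={DATASETS[leaning]}",
    ]


if __name__ == "__main__":
    cmd = build_tune_command(
        "configs/llama8b_lora_dpo_single_device.yaml",
        "checkpoints/", "output/", "left",
    )
    # Print the command in copy-pasteable shell form.
    print(shlex.join(cmd))
    # To launch the run from Python: subprocess.run(cmd, check=True)
```

Returning an argument list rather than a single string lets the command be passed to subprocess.run without shell quoting issues, while shlex.join gives a printable form for logs.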

Citation

If you use this codebase or the datasets in your work, please cite our paper:

@inproceedings{agiza2024politune,
  title={PoliTune: Analyzing the Impact of Data Selection and Fine-Tuning on Economic and Political Biases in Large Language Models},
  author={Agiza, Ahmed and Mostagir, Mohamed and Reda, Sherief},
  booktitle={Proceedings of the 2024 AAAI/ACM Conference on AI, Ethics, and Society},
  pages={},
  year={2024}
}

License

MIT License. See the LICENSE file for details.
