update readme
jankinf committed Jun 23, 2024
1 parent 4c5b824 commit fd04fc8
Showing 2 changed files with 174 additions and 30 deletions.
29 changes: 29 additions & 0 deletions .github/workflows/ci.yml
@@ -0,0 +1,29 @@
name: ci
on:
  push:
    branches:
      - master
      - main
permissions:
  contents: write
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Configure Git Credentials
        run: |
          git config user.name github-actions[bot]
          git config user.email 41898282+github-actions[bot]@users.noreply.github.com
      - uses: actions/setup-python@v5
        with:
          python-version: 3.x
      - run: echo "cache_id=$(date --utc '+%V')" >> $GITHUB_ENV
      - uses: actions/cache@v4
        with:
          key: mkdocs-material-${{ env.cache_id }}
          path: .cache
          restore-keys: |
            mkdocs-material-
      - run: pip install mkdocs-material
      - run: mkdocs gh-deploy -f env/mkdocs.yml --force
175 changes: 145 additions & 30 deletions README.md
@@ -1,22 +1,19 @@

<h2 align="center">Benchmarking Trustworthiness of Multimodal Large Language Models:<br>
A Comprehensive Study
</h2>

<font size=8>
<p align="center"> Benchmarking Trustworthiness of Multimodal Large Language Models: </p>
<p align="center"> A Comprehensive Study </p>
<font size=3>
<p align="center"> This is the official repository for the <b>MultiTrust</b> toolbox </p>
</font>


<div align="center" style="font-size: 16px;">
<a href="https://multi-trust.github.io/" style="margin-right: 5px;">🍎 Project Page </a>
<a href="https://arxiv.org/abs/2406.07057" style="margin-right: 5px;">📖 arXiv Paper </a>
<a href="https://github.com/thu-ml/MMTrustEval" style="margin-right: 5px;">📊 Dataset </a>
<a href="https://multi-trust.github.io/#leaderboard">🏆 Leaderboard </a>
🍎 <a href="https://multi-trust.github.io/">Project Page</a> &nbsp;&nbsp;
📖 <a href="https://arxiv.org/abs/2406.07057">arXiv Paper</a> &nbsp;&nbsp;
📊 <a href="https://github.com/thu-ml/MMTrustEval">Dataset</a> &nbsp;&nbsp;
🏆 <a href="https://multi-trust.github.io/#leaderboard">Leaderboard</a>
</div>


<font size=3>
<p align="center"> This is the official repository for the <b>MultiTrust</b> toolbox </p>
</font>
<br>

<div align="center">
<img src="https://img.shields.io/badge/Benchmark-Truthfulness-yellow" alt="Truthfulness" />
@@ -25,13 +22,14 @@
<img src="https://img.shields.io/badge/Benchmark-Fairness-orange" alt="Fairness" />
<img src="https://img.shields.io/badge/Benchmark-Privacy-green" alt="Privacy" />
</div>
<br>

![framework](docs/structure/framework.jpg)


# Getting Started
## Getting Started

## Environment
### 💡 Environment

- Option A: Pip install
```shell
@@ -94,52 +92,169 @@
ssh -p 11180 root@[your_ip_here]
```

## Dataset
### :envelope: Dataset

### License
#### License
The codebase is licensed under the Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

### Download Data
#### Download Data
Install the related datasets into the [data] folder (TODO)

### Incomplete Data released
#### Incomplete Data Released
Please note that only a partial dataset will be included with the codebase. This is to prevent data leakage as we are hosting another competition based on this codebase.

### Restrictions
#### Restrictions
The partial dataset provided contains sensitive and potentially high-risk information. We urge all users to handle this data with utmost care and responsibility. Unauthorized use, sharing, or mishandling of this data can lead to serious privacy breaches and legal consequences. By accessing this dataset, you agree to comply with all applicable privacy laws and regulations, and to implement appropriate security measures to protect the data from unauthorized access or misuse.



## Model Checkpoints
### 📍 Model Checkpoints

Install the related model checkpoints into the [playground] folder


# Reproduce results in Our paper
## 📈 Reproduce Results in Our Paper

| ID | Task Name | Metrics | Type |
|-----|-----------------------------------|-----------------------------------------|-------------|
| T.1 | Basic World Understanding | Accuracy ($\uparrow$) | Dis.\&Gen. |
| T.2 | Advanced Cognitive Inference | Accuracy ($\uparrow$) | Dis. |
| T.3 | VQA under Instruction Enhancement | Accuracy ($\uparrow$) | Gen. |
| T.4 | QA under Visual Assistance | Accuracy ($\uparrow$) | Gen. |
| T.5 | Text Misleading VQA | Accuracy ($\uparrow$) | Gen. |
| T.6 | Visual Confusion VQA | Accuracy ($\uparrow$) | Gen. |
| T.7 | Visual Misleading QA | Accuracy ($\uparrow$) | Dis. |
| S.1 | Risk Identification | Accuracy ($\uparrow$) | Dis.\&Gen. |
| S.2 | Image Description | Toxicity Score ($\downarrow$), RtA ($\uparrow$) | Gen. |
| S.3 | Toxicity Content Generation | Toxicity Score ($\downarrow$), RtA ($\uparrow$) | Gen. |
| S.4 | Plain Typographic Jailbreaking | ASR ($\downarrow$), RtA ($\uparrow$) | Gen. |
| S.5 | Optimized Multimodal Jailbreaking | ASR ($\downarrow$), RtA ($\uparrow$) | Gen. |
| S.6 | Cross-modal Influence on Jailbreaking | ASR ($\downarrow$), RtA ($\uparrow$) | Gen. |
| R.1 | VQA for Artistic Style images | Score ($\uparrow$) | Gen. |
| R.2 | VQA for Sensor Style images | Score ($\uparrow$) | Gen. |
| R.3 | Sentiment Analysis for OOD texts | Accuracy ($\uparrow$) | Dis. |
| R.4 | Image Captioning under Untargeted attack | Accuracy ($\uparrow$) | Gen. |
| R.5 | Image Captioning under Targeted attack | Attack Success Rate ($\downarrow$) | Gen. |
| R.6 | Textual adversarial attack | Accuracy ($\uparrow$) | Dis. |
| F.1 | Stereotype Content Detection | Containing Rate ($\downarrow$) | Gen. |
| F.2 | Agreement on Stereotypes | Agreement Percentage ($\downarrow$) | Dis. |
| F.3 | Classification of Stereotypes | Accuracy ($\uparrow$) | Dis. |
| F.4 | Stereotype Query Test | RtA ($\uparrow$) | Gen. |
| F.5 | Preference Selection in VQA | RtA ($\uparrow$) | Gen. |
| F.6 | Profession Prediction | Pearson's Correlation ($\uparrow$) | Gen. |
| F.7 | Preference Selection in QA | RtA ($\uparrow$) | Gen. |
| P.1 | Visual Privacy Recognition | Accuracy, F1 ($\uparrow$) | Dis. |
| P.2 | Privacy-sensitive QA Recognition | Accuracy, F1 ($\uparrow$) | Dis. |
| P.3 | InfoFlow Expectation | Pearson's Correlation ($\uparrow$) | Gen. |
| P.4 | PII Query with Visual Cues | RtA ($\uparrow$) | Gen. |
| P.5 | Privacy Leakage in Vision | RtA ($\uparrow$), Accuracy ($\uparrow$) | Gen. |
| P.6 | PII Leakage in Conversations | RtA ($\uparrow$), Accuracy ($\uparrow$) | Gen. |

Running the scripts under `scripts/run` produces the results for specific tasks, while the scripts under `scripts/score` compute the evaluation scores from those results.
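For example, a typical end-to-end pass for a single task chains these two stages; here is a minimal sketch for task P.1 (Visual Privacy Recognition), using the same commands listed in the blocks below:

```shell
# Stage 1: run the task scripts to generate the model outputs
bash scripts/run/privacy_scripts/p1-vispriv-recognition.sh

# Stage 2: compute the evaluation scores from those outputs
python scripts/score/privacy/p1-vispriv-recognition.py
```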
## Get results
### 📌 Get results
```
bash scripts/run/privacy_scripts/p1-vispriv-recognition.sh
# bash scripts/run/**/*.sh
scripts/run
├── fairness_scripts
│ ├── f1-stereo-generation.sh
│ ├── f2-stereo-agreement.sh
│ ├── f3-stereo-classification.sh
│ ├── f3-stereo-topic-classification.sh
│ ├── f4-stereo-query.sh
│ ├── f5-vision-preference.sh
│ ├── f6-profession-pred.sh
│ └── f7-subjective-preference.sh
├── privacy_scripts
│ ├── p1-vispriv-recognition.sh
│ ├── p2-vqa-recognition-vispr.sh
│ ├── p3-infoflow.sh
│ ├── p4-pii-query.sh
│ ├── p5-visual-leakage.sh
│ └── p6-pii-leakage-in-conversation.sh
├── robustness_scripts
│ ├── r1-ood-artistic.sh
│ ├── r2-ood-sensor.sh
│ ├── r3-ood-text.sh
│ ├── r4-adversarial-untarget.sh
│ ├── r5-adversarial-target.sh
│ └── r6-adversarial-text.sh
├── safety_scripts
│ ├── s1-nsfw-image-description.sh
│ ├── s2-risk-identification.sh
│ ├── s3-toxic-content-generation.sh
│ ├── s4-typographic-jailbreaking.sh
│ ├── s5-multimodal-jailbreaking.sh
│ └── s6-crossmodal-jailbreaking.sh
└── truthfulness_scripts
├── t1-basic.sh
├── t2-advanced.sh
├── t3-instruction-enhancement.sh
├── t4-visual-assistance.sh
├── t5-text-misleading.sh
├── t6-visual-confusion.sh
└── t7-visual-misleading.sh
```
## Get scores
### 📌 Get scores
```
python scripts/score/privacy/p1-vispriv-recognition.py
# python scripts/score/**/*.py
scripts/score
├── fairness
│ ├── f1-stereo-generation.py
│ ├── f2-stereo-agreement.py
│ ├── f3-stereo-classification.py
│ ├── f3-stereo-topic-classification.py
│ ├── f4-stereo-query.py
│ ├── f5-vision-preference.py
│ ├── f6-profession-pred.py
│ └── f7-subjective-preference.py
├── privacy
│ ├── p1-vispriv-recognition.py
│ ├── p2-vqa-recognition-vispr.py
│ ├── p3-infoflow.py
│ ├── p4-pii-query.py
│ ├── p5-visual-leakage.py
│ └── p6-pii-leakage-in-conversation.py
├── robustness
│ ├── r1-ood_artistic.py
│ ├── r2-ood_sensor.py
│ ├── r3-ood_text.py
│ ├── r4-adversarial_untarget.py
│ ├── r5-adversarial_target.py
│ └── r6-adversarial_text.py
├── safefy
│ ├── s1-nsfw-image-description.py
│ ├── s2-risk-identification.py
│ ├── s3-toxic-content-generation.py
│ ├── s4-typographic-jailbreaking.py
│ ├── s5-multimodal-jailbreaking.py
│ └── s6-crossmodal-jailbreaking.py
└── truthfulness
├── t1-basic.py
├── t2-advanced.py
├── t3-instruction-enhancement.py
├── t4-visual-assistance.py
├── t5-text-misleading.py
├── t6-visual-confusion.py
└── t7-visual-misleading.py
```
## 📈 Results
### 📌 Results
![result](docs/structure/overall.png)
# Docs
Run following command to see the docs.
## 📚 Docs
Run the following command to view the docs locally.
```shell
mkdocs serve -f env/mkdocs.yml -a 0.0.0.0:8000
```
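To build the documentation without serving it (for example, to sanity-check the site before pushing), the following sketch should work; it assumes the output goes to mkdocs' default `site/` directory unless `site_dir` is set in `env/mkdocs.yml`:

```shell
# Build the static site once, without starting a local server
mkdocs build -f env/mkdocs.yml
```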
# :black_nib: Citation
## :black_nib: Citation
If you find our work helpful for your research, please consider citing it.
```bibtex
