Commit

Merge remote-tracking branch 'mmtrusteval/main'
jankinf authored and Aries-iai committed Jul 15, 2024
2 parents 2d0a1d0 + 75a5ef9 commit c708efe
Showing 2 changed files with 3 additions and 4 deletions.
3 changes: 1 addition & 2 deletions README.md
@@ -23,9 +23,8 @@ A Comprehensive Study
![framework](docs/structure/framework.jpg)


-**MultiTrust** is a comprehensive benchmark designed to assess and enhance the trustworthiness of MLLMs across five key dimensions: truthfulness, safety, robustness, fairness, and privacy. It integrates a rigorous evaluation strategy involving 32 diverse tasks and self-curated datasets to expose new trustworthiness challenges.
+> **MultiTrust** is a comprehensive benchmark designed to assess and enhance the trustworthiness of MLLMs across five key dimensions: truthfulness, safety, robustness, fairness, and privacy. It integrates a rigorous evaluation strategy involving 32 diverse tasks to expose new trustworthiness challenges.
---

## 🚀 News
* **`2024.07.07`** 🌟 We released the latest results for [GPT-4o](https://openai.com/index/hello-gpt-4o/), [Claude-3.5](https://www.anthropic.com/news/claude-3-5-sonnet), and [Phi-3](https://ollama.com/library/phi3) on our [project website](https://multi-trust.github.io/)
4 changes: 2 additions & 2 deletions data4multitrust/README.md
@@ -4,7 +4,7 @@ Here is the instructions to prepare the dataset to reproduce results in [MultiTr


## Download Data
-Install related datasets into this directory from this [link](https://drive.google.com/drive/folders/1Fh6tidH1W2aU3SbKVggg6cxWqT021rE0?usp=drive_link) and rename the this directory as `data`.
+Please fill in this [form](https://docs.google.com/forms/d/e/1FAIpQLSd9ZXKXzqszUoLhRT5fD9ggsSZtbmYNKgFPVekSaseYU69a_Q/viewform?usp=sf_link) to obtain the download link of MultiTrust dataset. Then, you could install related datasets into this directory and rename the this directory as `data`.



@@ -14,4 +14,4 @@ Please note that only a part of datasets are released for now, because we are ho
Here, to support the usage of our platform and the reproduction of our results, we make the data for some tasks public, including: T.1 (Basic World Understanding), T.7 (Visual Misleading QA), S.3 (Toxicity Content Generation), S.4 (Plain Typographic Jailbreaking), R.1 (VQA for Artistic Style Images), R.6 (Textual Adversarial Attack), F.6 (Profession Prediction), F.7 (Preference Selection in QA), P.3 (InfoFlow Expectation) and P.4 (PII Query with Visual Cues).

## Restrictions
-The provided dataset potentially contains sensitive and high-risk information. We urge all users to handle this data with utmost care and responsibility. Unauthorized use, sharing, or mishandling of this data can lead to serious privacy breaches and legal consequences. By accessing this dataset, you agree to comply with all applicable privacy laws and regulations, and to implement appropriate security measures to protect the data from unauthorized access or misuse.
+The provided dataset potentially contains sensitive and high-risk information. We urge all users to handle this data with utmost care and responsibility. Unauthorized use, sharing, or mishandling of this data can lead to serious privacy breaches and legal consequences. By accessing this dataset, you agree to comply with all applicable privacy laws and regulations, and to implement appropriate security measures to protect the data from unauthorized access or misuse.