diff --git a/README.md b/README.md
index cb92722..93d0204 100644
--- a/README.md
+++ b/README.md
@@ -23,9 +23,8 @@ A Comprehensive Study
 ![framework](docs/structure/framework.jpg)
-**MultiTrust** is a comprehensive benchmark designed to assess and enhance the trustworthiness of MLLMs across five key dimensions: truthfulness, safety, robustness, fairness, and privacy. It integrates a rigorous evaluation strategy involving 32 diverse tasks and self-curated datasets to expose new trustworthiness challenges.
+> **MultiTrust** is a comprehensive benchmark designed to assess and enhance the trustworthiness of MLLMs across five key dimensions: truthfulness, safety, robustness, fairness, and privacy. It integrates a rigorous evaluation strategy involving 32 diverse tasks to expose new trustworthiness challenges.
----
 ## 🚀 News
 * **`2024.07.07`** 🌟 We released the latest results for [GPT-4o](https://openai.com/index/hello-gpt-4o/), [Claude-3.5](https://www.anthropic.com/news/claude-3-5-sonnet), and [Phi-3](https://ollama.com/library/phi3) on our [project website](https://multi-trust.github.io/) !
diff --git a/data4multitrust/README.md b/data4multitrust/README.md
index e07ccb7..9f1256e 100644
--- a/data4multitrust/README.md
+++ b/data4multitrust/README.md
@@ -4,7 +4,7 @@ Here is the instructions to prepare the dataset to reproduce results in [MultiTr
 ## Download Data
-Install related datasets into this directory from this [link](https://drive.google.com/drive/folders/1Fh6tidH1W2aU3SbKVggg6cxWqT021rE0?usp=drive_link) and rename the this directory as `data`.
+Please fill in this [form](https://docs.google.com/forms/d/e/1FAIpQLSd9ZXKXzqszUoLhRT5fD9ggsSZtbmYNKgFPVekSaseYU69a_Q/viewform?usp=sf_link) to obtain the download link of MultiTrust dataset. Then, you could install related datasets into this directory and rename the this directory as `data`.
@@ -14,4 +14,4 @@ Please note that only a part of datasets are released for now, because we are ho
 Here, to support the usage of our platform and the reproduction of our results, we make the data for some tasks public, including: T.1 (Basic World Understanding), T.7 (Visual Misleading QA), S.3 (Toxicity Content Generation), S.4 (Plain Typographic Jailbreaking), R.1 (VQA for Artistic Style Images), R.6 (Textual Adversarial Attack), F.6 (Profession Prediction), F.7 (Preference Selection in QA), P.3 (InfoFlow Expectation) and P.4 (PII Query with Visual Cues).
 ## Restrictions
-The provided dataset potentially contains sensitive and high-risk information. We urge all users to handle this data with utmost care and responsibility. Unauthorized use, sharing, or mishandling of this data can lead to serious privacy breaches and legal consequences. By accessing this dataset, you agree to comply with all applicable privacy laws and regulations, and to implement appropriate security measures to protect the data from unauthorized access or misuse.
\ No newline at end of file
+The provided dataset potentially contains sensitive and high-risk information. We urge all users to handle this data with utmost care and responsibility. Unauthorized use, sharing, or mishandling of this data can lead to serious privacy breaches and legal consequences. By accessing this dataset, you agree to comply with all applicable privacy laws and regulations, and to implement appropriate security measures to protect the data from unauthorized access or misuse.