metadata
dataset_info:
  features:
  - name: hazard_category
    dtype: string
  - name: hazard_subcategory
    dtype: string
  - name: hazard_subsubcategory
    dtype: string
  - name: case_id
    dtype: string
  - name: case_text
    dtype: string
  - name: unsafe_image_id
    dtype: string
  - name: unsafe_image_description
    dtype: string
  - name: prompt_text
    dtype: string
  - name: prompt_type
    dtype: string
  - name: unsafe_image_url
    dtype: string
  - name: unsafe_image_license
    dtype: string
  - name: unsafe_image_cw
    dtype: string
  splits:
  - name: german
    num_bytes: 70718
    num_examples: 200
  - name: russian
    num_bytes: 76499
    num_examples: 200
  - name: chinese
    num_bytes: 70778
    num_examples: 200
  - name: hindi
    num_bytes: 84054
    num_examples: 200
  - name: spanish
    num_bytes: 70689
    num_examples: 200
  - name: italian
    num_bytes: 69545
    num_examples: 200
  - name: french
    num_bytes: 73103
    num_examples: 200
  - name: english
    num_bytes: 139996
    num_examples: 400
  - name: korean
    num_bytes: 73217
    num_examples: 200
  - name: arabic
    num_bytes: 71779
    num_examples: 200
  - name: farsi
    num_bytes: 75732
    num_examples: 200
  download_size: 351210
  dataset_size: 876110
configs:
- config_name: default
  data_files:
  - split: german
    path: data/german-*
  - split: russian
    path: data/russian-*
  - split: chinese
    path: data/chinese-*
  - split: hindi
    path: data/hindi-*
  - split: spanish
    path: data/spanish-*
  - split: italian
    path: data/italian-*
  - split: french
    path: data/french-*
  - split: english
    path: data/english-*
  - split: korean
    path: data/korean-*
  - split: arabic
    path: data/arabic-*
  - split: farsi
    path: data/farsi-*
license: cc-by-4.0
language:
- ar
- fr
- en
- de
- zh
- ko
- fa
- hi
- it
- ru
- es
size_categories:
- 1K<n<10K
task_categories:
- image-text-to-text
Dataset Card for the MSTS Benchmark
Our paper is available at https://arxiv.org/abs/2501.10057, and the code is in the accompanying GitHub repo. To reproduce the exact results, use that repo, which provides download and preprocessing scripts for the images.
Example usage:
from datasets import load_dataset
# load every language split (returns a DatasetDict keyed by split name)
ds = load_dataset("felfri/MSTS")
# or select a specific language split
lang = 'german'
ds = load_dataset("felfri/MSTS", split=lang)
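Once a split is loaded, a quick way to get oriented is to count rows per value of one of the string columns listed in the metadata, such as `hazard_category` or `prompt_type`. The following is a minimal sketch, not part of the official code: the helper `count_by` and the `sample_rows` values are hypothetical placeholders chosen for illustration, and only the column and split names come from this card.

```python
from collections import Counter

# Split names as listed in this card's metadata.
MSTS_SPLITS = [
    "german", "russian", "chinese", "hindi", "spanish", "italian",
    "french", "english", "korean", "arabic", "farsi",
]


def count_by(rows, column):
    """Count how often each value of `column` occurs in an iterable of row dicts."""
    return Counter(row[column] for row in rows)


# Hypothetical stand-in rows: these values are placeholders, not real MSTS data.
sample_rows = [
    {"hazard_category": "category_a", "prompt_type": "type_x"},
    {"hazard_category": "category_a", "prompt_type": "type_y"},
    {"hazard_category": "category_b", "prompt_type": "type_x"},
]

category_counts = count_by(sample_rows, "hazard_category")
for value, n in category_counts.most_common():
    print(value, n)
```

Because iterating over a loaded split yields one dict per row, the same helper should also accept real data, for example `count_by(load_dataset("felfri/MSTS", split="english"), "hazard_category")`.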
Disclaimer
The MSTS dataset contains content that may be offensive or upsetting in nature. Topics include, but are not limited to, discriminatory language and discussions of abuse, violence, self-harm, exploitation, and other potentially upsetting subject matter. Please only engage with the data in accordance with your own personal risk tolerance. The data are intended for research purposes, especially research that can make models less harmful.
Citation Information
Please consider citing our work if you use data and/or code from this repository.
@misc{röttger2025mstsmultimodalsafetytest,
title={MSTS: A Multimodal Safety Test Suite for Vision-Language Models},
author={Paul Röttger and Giuseppe Attanasio and Felix Friedrich and Janis Goldzycher and Alicia Parrish and Rishabh Bhardwaj and Chiara Di Bonaventura and Roman Eng and Gaia El Khoury Geagea and Sujata Goswami and Jieun Han and Dirk Hovy and Seogyeong Jeong and Paloma Jeretič and Flor Miriam Plaza-del-Arco and Donya Rooein and Patrick Schramowski and Anastassia Shaitarova and Xudong Shen and Richard Willats and Andrea Zugarini and Bertie Vidgen},
year={2025},
eprint={2501.10057},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2501.10057},
}