metadata

license: apache-2.0
datasets:
  - openbmb/RLAIF-V-Dataset
language:
  - en

Model Card for RLAIF-V

GitHub

RLAIF-V-12B is a multimodal large language model (MLLM) whose trustworthiness surpasses that of GPT-4V. The model is built on OmniLMM from the MiniCPM-V series.

We use a novel framework, RLAIF-V, which aligns MLLMs in a fully open-source paradigm. The framework makes the most of open-source feedback from two key perspectives: high-quality feedback data and an online feedback learning algorithm.

Model Details

Key Features

🏅 Super GPT-4V Trustworthiness: By learning from open-source AI feedback, RLAIF-V-12B achieves trustworthiness exceeding that of GPT-4V on both generative and discriminative tasks.
💪 Maintaining Good Performance on General Abilities: On benchmarks that test general abilities (e.g., LLaVA Bench, MMStar), RLAIF-V-12B also performs well.

fig1

Examples

fig2-1 fig2-1

Model Description

Related model: OmniLMM-12B
Trained on data: RLAIF-V-Dataset
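
To inspect the training data, the dataset listed in the metadata can probably be pulled with the Hugging Face `datasets` library. The snippet below is a minimal sketch, not an official loader: the helper name `load_rlaif_v` is hypothetical, and the `"train"` split name and the use of streaming are assumptions that should be checked against the dataset page.

```python
# Dataset identifier taken from the model card metadata.
DATASET_ID = "openbmb/RLAIF-V-Dataset"


def load_rlaif_v(split="train", streaming=True):
    """Hypothetical helper: open the RLAIF-V dataset from the Hugging Face Hub.

    Assumptions (not stated in this model card): the split is named "train",
    and streaming is used to avoid downloading the full dataset up front.
    """
    # Imported lazily so this module can be imported without `datasets`
    # installed (pip install datasets).
    from datasets import load_dataset

    return load_dataset(DATASET_ID, split=split, streaming=streaming)


if __name__ == "__main__":
    # Requires network access; prints the field names of the first record.
    first = next(iter(load_rlaif_v()))
    print(sorted(first.keys()))
```

Streaming lets you look at a few records before committing to a full download; pass `streaming=False` to materialize the data locally instead.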