metadata
license: apache-2.0
datasets:
  - openbmb/RLAIF-V-Dataset
language:
  - en

Model Card for RLAIF-V

GitHub

RLAIF-V-12B is a multimodal large language model (MLLM) whose trustworthiness surpasses that of GPT-4V. The model is built on OmniLMM from the MiniCPM-V series.

We use a novel framework, RLAIF-V, which aligns MLLMs in a fully open-source paradigm. The framework makes full use of open-source feedback from two perspectives: high-quality feedback data and an online feedback learning algorithm.

Model Details

Key Features

  • 🏅 Super GPT-4V Trustworthiness: By learning from open-source AI feedback, RLAIF-V-12B surpasses GPT-4V in trustworthiness on both generative and discriminative tasks.
  • 💪 Maintaining Good Performance on General Abilities: On benchmarks that test general abilities (e.g. LLaVA Bench, MMStar), RLAIF-V-12B also performs well.

fig1

Examples

fig2-1 fig2-1

Model Description