Update README.md
<img src="molmo_logo.png" alt="Logo for the Molmo Project" style="width: auto; height: 50px;">

# Molmo 72B

Molmo is a family of open vision-language models developed by the Allen Institute for AI. Molmo models are trained on PixMo, a dataset of 1 million highly curated image-text pairs. They achieve state-of-the-art performance among multimodal models of similar size while being fully open-source. You can find all models in the Molmo family [here](https://huggingface.co/collections/allenai/molmo-66f379e6fe3b8ef090a8ca19).

**Learn more** about the Molmo family [in our announcement blog post](https://molmo.allenai.org/blog).

Molmo 72B is based on [Qwen2-72B](https://huggingface.co/Qwen/Qwen2-72B) and uses [OpenAI CLIP](https://huggingface.co/openai/clip-vit-large-patch14-336) as its vision backbone. It achieves the highest academic benchmark score and ranks second on human evaluation, just slightly behind GPT-4o.

This checkpoint is a **preview** of the Molmo release. All artifacts used in creating Molmo (the PixMo dataset, training code, evaluations, and intermediate checkpoints) will be made available at a later date, furthering our commitment to open-source AI development and reproducibility.
```python
from transformers import AutoModelForCausalLM, AutoProcessor

# load the processor
processor = AutoProcessor.from_pretrained(
    'allenai/Molmo-72B-0924',
    trust_remote_code=True,
    torch_dtype='auto',
    device_map='auto'
)

# load the model
model = AutoModelForCausalLM.from_pretrained(
    'allenai/Molmo-72B-0924',
    trust_remote_code=True,
    torch_dtype='auto',
    device_map='auto'
)
```
| Model | Average Score on 11 Academic Benchmarks | Human Preference Elo Rating |
|-----------------------------|-----------------------------------------|-----------------------------|
| **Molmo 72B (this model)** | **81.2** | **1077** |
| Molmo 7B-D | 77.3 | 1056 |
| Molmo 7B-O | 74.6 | 1051 |
| MolmoE 1B | 68.6 | 1032 |
| GPT-4o | 78.5 | 1079 |