shanearora committed
Commit 908028f • 1 Parent(s): ad49166
Update README.md

README.md CHANGED
@@ -32,14 +32,14 @@ The core models released in this batch are the following:
 
 To load a specific model revision with HuggingFace, simply add the argument `revision`:
 ```bash
-olmo = AutoModelForCausalLM.from_pretrained("allenai/OLMo-1.7-
+olmo = AutoModelForCausalLM.from_pretrained("allenai/OLMo-1.7-1B-hf", revision="step1000-tokens2B")
 ```
 
 All revisions/branches are listed in the file `revisions.txt`.
 Or, you can access all the revisions for the models via the following code snippet:
 ```python
 from huggingface_hub import list_repo_refs
-out = list_repo_refs("allenai/OLMo-1.7-
+out = list_repo_refs("allenai/OLMo-1.7-1B-hf")
 branches = [b.name for b in out.branches]
 ```
 
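For context, the two snippets touched by this hunk compose naturally; a minimal sketch (assuming `transformers` and `huggingface_hub` are installed, and that the branch names returned are valid `revision` values):

```python
from huggingface_hub import list_repo_refs
from transformers import AutoModelForCausalLM

# Enumerate the checkpoint branches of the model repo.
out = list_repo_refs("allenai/OLMo-1.7-1B-hf")
branches = [b.name for b in out.branches]
print(branches)

# Load one intermediate checkpoint by branch name;
# "step1000-tokens2B" is the revision used in the hunk above.
olmo = AutoModelForCausalLM.from_pretrained(
    "allenai/OLMo-1.7-1B-hf", revision="step1000-tokens2B"
)
```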
@@ -62,15 +62,13 @@ branches = [b.name for b in out.branches]
 - Evaluation code: https://github.com/allenai/OLMo-Eval
 - Further fine-tuning code: https://github.com/allenai/open-instruct
 - **Paper:** [Link](https://arxiv.org/abs/2402.00838)
-- **W&B Logs:** [pretraining](https://wandb.ai/ai2-llm/OLMo-7B/groups/OLMo-1.7-7B), [annealing](https://wandb.ai/ai2-llm/OLMo-7B/groups/OLMo-1.7-7B-anneal)
+<!-- - **W&B Logs:** [pretraining](https://wandb.ai/ai2-llm/OLMo-7B/groups/OLMo-1.7-7B), [annealing](https://wandb.ai/ai2-llm/OLMo-7B/groups/OLMo-1.7-7B-anneal) -->
 
 ## Uses
 
 ### Inference
 
-Install Transformers
-
-Now, proceed as usual with HuggingFace:
+Install Transformers. Then proceed as usual with HuggingFace:
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
 olmo = AutoModelForCausalLM.from_pretrained("allenai/OLMo-1.7-1B-hf")
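The hunk above cuts off inside the inference example; a minimal end-to-end sketch consistent with it (the generation settings here are illustrative assumptions, not taken from this diff):

```python
# pip install transformers
from transformers import AutoModelForCausalLM, AutoTokenizer

olmo = AutoModelForCausalLM.from_pretrained("allenai/OLMo-1.7-1B-hf")
tokenizer = AutoTokenizer.from_pretrained("allenai/OLMo-1.7-1B-hf")

# Prompt reused from the pipeline example referenced in the next hunk.
message = ["Language modeling is "]
inputs = tokenizer(message, return_tensors="pt", return_token_type_ids=False)

# Sampling parameters are assumptions for illustration.
response = olmo.generate(
    **inputs, max_new_tokens=100, do_sample=True, top_k=50, top_p=0.95
)
print(tokenizer.batch_decode(response, skip_special_tokens=True)[0])
```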
@@ -95,12 +93,6 @@ print(olmo_pipe("Language modeling is "))
 Or, you can make this slightly faster by quantizing the model, e.g. `AutoModelForCausalLM.from_pretrained("allenai/OLMo-1.7-1B-hf", torch_dtype=torch.float16, load_in_8bit=True)` (requires `bitsandbytes`).
 The quantized model is more sensitive to typing / cuda, so it is recommended to pass the inputs as `inputs.input_ids.to('cuda')` to avoid potential issues.
 
-Note, you may see the following error if `ai2-olmo` is not installed correctly, which is caused by internal Python check naming. We'll update the code soon to make this error clearer.
-```bash
-raise ImportError(
-ImportError: This modeling file requires the following packages that were not found in your environment: hf_olmo. Run `pip install hf_olmo`
-```
-
 ### Fine-tuning
 Model fine-tuning can be done from the final checkpoint (the `main` revision of this model) or many intermediate checkpoints. Two recipes for tuning are available.
 1. Fine-tune with the OLMo repository:
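A sketch of the quantized path described in the hunk above, assuming a CUDA GPU and `bitsandbytes` are available (`load_in_8bit=True` as written in the README text):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# 8-bit quantized load, per the README (requires `bitsandbytes`).
olmo = AutoModelForCausalLM.from_pretrained(
    "allenai/OLMo-1.7-1B-hf", torch_dtype=torch.float16, load_in_8bit=True
)
tokenizer = AutoTokenizer.from_pretrained("allenai/OLMo-1.7-1B-hf")

inputs = tokenizer(
    ["Language modeling is "], return_tensors="pt", return_token_type_ids=False
)
# Pass only the token ids, moved to the GPU, as the note above recommends.
response = olmo.generate(inputs.input_ids.to("cuda"), max_new_tokens=50)
print(tokenizer.batch_decode(response, skip_special_tokens=True)[0])
```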