shanearora committed on
Commit 0f0f781
1 Parent(s): 2ae49c0

Update README.md


Change README so that it works for all transformers versions as long as `ai2-olmo` is >=v0.3.0
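Since both code paths hinge on which `ai2-olmo` is installed, a quick environment check may help; this is a sketch of mine, not part of the commit:

```python
# Sketch (not from the commit): enforce the ai2-olmo >= 0.3.0 floor
# that the updated README relies on.
from importlib.metadata import version
from packaging.version import Version

if Version(version("ai2-olmo")) < Version("0.3.0"):
    raise RuntimeError(
        "These snippets need ai2-olmo >= 0.3.0; "
        "older ai2-olmo versions require transformers <= 4.39 instead."
    )
```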

Files changed (1)
  1. README.md +5 -16
README.md CHANGED
@@ -24,7 +24,7 @@ We release all code, checkpoints, logs (coming soon), and details involved in tr
 OLMo 7B Instruct and OLMo SFT are two adapted versions of these models trained for better question answering.
 They show the performance gain that OLMo base models can achieve with existing fine-tuning techniques.
 
- *Note:* This model requires installing `ai2-olmo` with pip and using HuggingFace Transformers<=4.39. New versions of the model will be released soon with compatibility improvements.
+ *Note:* This model requires installing `ai2-olmo` with pip and using `ai2-olmo`>=0.3.0 or HuggingFace Transformers<=4.39. New versions of the model will be released soon with compatibility improvements.
 
 ## Model Details
 
@@ -82,11 +82,9 @@ pip install ai2-olmo
 ```
 Now, proceed as usual with HuggingFace:
 ```python
- import hf_olmo
-
- from transformers import AutoModelForCausalLM, AutoTokenizer
- olmo = AutoModelForCausalLM.from_pretrained("allenai/OLMo-7B-Instruct")
- tokenizer = AutoTokenizer.from_pretrained("allenai/OLMo-7B-Instruct")
+ from hf_olmo import OLMoForCausalLM, OLMoTokenizerFast
+ olmo = OLMoForCausalLM.from_pretrained("allenai/OLMo-7B-Instruct")
+ tokenizer = OLMoTokenizerFast.from_pretrained("allenai/OLMo-7B-Instruct")
 chat = [
   { "role": "user", "content": "What is language modeling?" },
 ]
@@ -99,17 +97,8 @@ response = olmo.generate(input_ids=inputs.to(olmo.device), max_new_tokens=100, d
 print(tokenizer.batch_decode(response, skip_special_tokens=True)[0])
 >> '<|user|>\nWhat is language modeling?\n<|assistant|>\nLanguage modeling is a type of natural language processing (NLP) task or machine learning task that...'
 ```
- Alternatively, with the pipeline abstraction:
- ```python
- import hf_olmo
-
- from transformers import pipeline
- olmo_pipe = pipeline("text-generation", model="allenai/OLMo-7B-Instruct")
- print(olmo_pipe("What is language modeling?"))
- >> '[{'generated_text': 'What is language modeling?\nLanguage modeling is a type of natural language processing (NLP) task...'}]'
- ```
 
- Or, you can make this slightly faster by quantizing the model, e.g. `AutoModelForCausalLM.from_pretrained("allenai/OLMo-7B-Instruct", torch_dtype=torch.float16, load_in_8bit=True)` (requires `bitsandbytes`).
+ You can make this slightly faster by quantizing the model, e.g. `OLMoForCausalLM.from_pretrained("allenai/OLMo-7B-Instruct", torch_dtype=torch.float16, load_in_8bit=True)` (requires `bitsandbytes`).
 The quantized model is more sensitive to typing / cuda, so it is recommended to pass the inputs as `inputs.input_ids.to('cuda')` to avoid potential issues.
 
 Note, you may see the following error if `ai2-olmo` is not installed correctly, which is caused by internal Python check naming. We'll update the code soon to make this error clearer.
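For reference, the hunks above skip the tokenization step between `chat = [...]` and the `olmo.generate(...)` call. A minimal end-to-end sketch of the updated snippet, assuming the standard `apply_chat_template`/`encode` flow for the elided middle and guessing the sampling argument truncated in the hunk header (`max_new_tokens=100, d...`):

```python
# Sketch only: the "assumed" lines and the sampling kwarg are not
# visible in this diff and are reconstructed from surrounding context.
from hf_olmo import OLMoForCausalLM, OLMoTokenizerFast

olmo = OLMoForCausalLM.from_pretrained("allenai/OLMo-7B-Instruct")
tokenizer = OLMoTokenizerFast.from_pretrained("allenai/OLMo-7B-Instruct")

chat = [
    {"role": "user", "content": "What is language modeling?"},
]
# Assumed: render the chat with the model's template, then tokenize.
prompt = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
inputs = tokenizer.encode(prompt, add_special_tokens=False, return_tensors="pt")
# Visible in the diff context; do_sample=True is a guess for the truncated "d...".
response = olmo.generate(input_ids=inputs.to(olmo.device), max_new_tokens=100, do_sample=True)
print(tokenizer.batch_decode(response, skip_special_tokens=True)[0])
```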
 
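The quantized-loading one-liner in the final hunk, expanded into a runnable sketch (the prompt and the CUDA device are my assumptions; `bitsandbytes` must be installed, as the README notes):

```python
# Sketch expanding the diff's quantization one-liner; assumes a CUDA
# machine with bitsandbytes installed.
import torch
from hf_olmo import OLMoForCausalLM, OLMoTokenizerFast

olmo = OLMoForCausalLM.from_pretrained(
    "allenai/OLMo-7B-Instruct",
    torch_dtype=torch.float16,
    load_in_8bit=True,  # requires bitsandbytes
)
tokenizer = OLMoTokenizerFast.from_pretrained("allenai/OLMo-7B-Instruct")

inputs = tokenizer("What is language modeling?", return_tensors="pt")
# Per the README: move input_ids to CUDA explicitly, since the quantized
# model is more sensitive to dtype/device placement.
response = olmo.generate(input_ids=inputs.input_ids.to("cuda"), max_new_tokens=100)
print(tokenizer.batch_decode(response, skip_special_tokens=True)[0])
```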