amanrangapur committed
Commit: 83f13fa
Parent(s): 7b1b2c7
Update README.md

README.md CHANGED
@@ -15,21 +15,18 @@ language:

OLMo2 7B November 2024 is an updated version of the original [OLMo 7B](https://huggingface.co/allenai/OLMo-7B) model rocking a ____ point increase in ____, among other evaluation improvements, from an improved version of the Dolma dataset and staged training.

- OLMo is a series of **O**pen **L**anguage **Mo**dels designed to enable the science of language models.
- The core models released in this batch are the following:
+ OLMo is a series of **O**pen **L**anguage **Mo**dels designed to enable the science of language models.
+ These models are trained on the Dolma dataset. We are releasing all code, checkpoints, logs (coming soon), and associated training details.
+ The core models released in this batch include the following:

| Size | Training Tokens | Layers | Hidden Size | Attention Heads | Context Length |
|------|--------|---------|-------------|-----------------|----------------|
- | [OLMo2-7B July 2024](https://huggingface.co/allenai/
- | [OLMo2- 13B July 2024](https://huggingface.co/allenai/
+ | [OLMo2-7B July 2024](https://huggingface.co/allenai/OLMo-7B-0724-hf) | 4 Trillion | 32 | 4096 | 32 | 4096 |
+ | [OLMo2-13B July 2024](https://huggingface.co/allenai/OLMo-1B-0724-hf) | 5 Trillion | 40 | 5120 | 42 | 4096 |

## Inference

+ You can use OLMo with the standard HuggingFace transformers library:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
olmo = AutoModelForCausalLM.from_pretrained("allenai/OLMo2-7B-1124")
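The hunk above breaks off partway through the inference snippet, and the next hunk resumes at its final lines. For orientation, a minimal end-to-end version of that snippet might look like the sketch below; the prompt text and generation parameters are illustrative assumptions, not values taken from the README.

```python
# Minimal sketch of the full inference flow excerpted in the diff above.
# The prompt and generation parameters are illustrative, not the README's exact values.
from transformers import AutoModelForCausalLM, AutoTokenizer

olmo = AutoModelForCausalLM.from_pretrained("allenai/OLMo2-7B-1124")
tokenizer = AutoTokenizer.from_pretrained("allenai/OLMo2-7B-1124")

# Tokenize a prompt and sample a continuation.
message = ["Language modeling is "]
inputs = tokenizer(message, return_tensors="pt", return_token_type_ids=False)
response = olmo.generate(**inputs, max_new_tokens=100, do_sample=True, top_k=50, top_p=0.95)
print(tokenizer.batch_decode(response, skip_special_tokens=True)[0])
```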
@@ -44,8 +41,16 @@ print(tokenizer.batch_decode(response, skip_special_tokens=True)[0])
>> 'Language modeling is the first step to build natural language generation...'
```

+ For faster performance, you can quantize the model using the following method:
+ ```python
+ AutoModelForCausalLM.from_pretrained("allenai/OLMo2-7B-1124",
+     torch_dtype=torch.float16,
+     load_in_8bit=True)  # Requires bitsandbytes
+ ```
+ The quantized model is more sensitive to data types and CUDA operations. To avoid potential issues, it's recommended to pass the inputs directly to CUDA using:
+ ```python
+ inputs.input_ids.to('cuda')
+ ```

We have released checkpoints for these models, for every 1000 training steps.
The naming convention is `stepXXX-tokensYYYB`.
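If, as with earlier OLMo releases, these step checkpoints are exposed as revisions on the Hugging Face Hub, a specific checkpoint could be loaded by passing its name to `from_pretrained`. A minimal sketch, where `step1000-tokens4B` is a hypothetical revision name following the convention above:

```python
# Sketch: load an intermediate checkpoint by Hub revision.
# Assumes step checkpoints are published as revisions named `stepXXX-tokensYYYB`;
# "step1000-tokens4B" is a hypothetical example of that convention.
from transformers import AutoModelForCausalLM

olmo_checkpoint = AutoModelForCausalLM.from_pretrained(
    "allenai/OLMo2-7B-1124",
    revision="step1000-tokens4B",
)
```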
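The quantization snippet added in the diff above is only a fragment: it omits the imports and does not keep the returned model. A self-contained sketch of 8-bit loading plus the recommended move of inputs to CUDA, assuming `bitsandbytes` is installed and a CUDA GPU is available (the prompt is illustrative):

```python
# Sketch: 8-bit quantized loading (requires bitsandbytes) and CUDA inputs,
# following the recommendation in the README diff; the prompt is illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

olmo = AutoModelForCausalLM.from_pretrained(
    "allenai/OLMo2-7B-1124",
    torch_dtype=torch.float16,
    load_in_8bit=True,  # requires bitsandbytes
)
tokenizer = AutoTokenizer.from_pretrained("allenai/OLMo2-7B-1124")

inputs = tokenizer(["Language modeling is "], return_tensors="pt", return_token_type_ids=False)
# Pass the inputs directly to CUDA, as the README recommends for the quantized model.
input_ids = inputs.input_ids.to("cuda")
response = olmo.generate(input_ids=input_ids, max_new_tokens=50)
print(tokenizer.batch_decode(response, skip_special_tokens=True)[0])
```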