sedrickkeh committed • Commit ab7c317 • 1 Parent(s): ba14d96
Update README.md

README.md CHANGED
@@ -13,6 +13,27 @@ DCLM-1B is a 1.4 billion parameter language model trained on the DCLM-Baseline dataset.
13 |
|
14 |
The instruction tuned version of this model is available here: https://huggingface.co/TRI-ML/DCLM-1B-IT
|
15 |
|
16 |
+
## Quickstart
First install open_lm:
```
pip install git+https://github.com/mlfoundations/open_lm.git
```

Then you can load the model using HF's Auto classes as follows:
```python
from open_lm.hf import *  # registers the open_lm architecture with HF's Auto classes
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("TRI-ML/DCLM-1B")
model = AutoModelForCausalLM.from_pretrained("TRI-ML/DCLM-1B")

# Sample 50 new tokens with nucleus sampling and a mild repetition penalty
inputs = tokenizer(["Machine learning is"], return_tensors="pt")
gen_kwargs = {"max_new_tokens": 50, "top_p": 0.8, "temperature": 0.8, "do_sample": True, "repetition_penalty": 1.1}
output = model.generate(inputs["input_ids"], **gen_kwargs)
output = tokenizer.decode(output[0].tolist(), skip_special_tokens=True)
print(output)
```
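
Because `do_sample=True` draws from the sampling distribution, each run will produce different text. As a minimal variant (not part of the original card, just standard `transformers` usage), you can switch to greedy decoding for reproducible output:
```python
# Greedy (deterministic) decoding; reuses model, tokenizer, and inputs from above
output = model.generate(inputs["input_ids"], max_new_tokens=50, do_sample=False)
print(tokenizer.decode(output[0].tolist(), skip_special_tokens=True))
```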

## Evaluation

We evaluate DCLM-1B using the [llm-foundry](https://github.com/mosaicml/llm-foundry) eval suite, and compare to recently released small models on key benchmarks.
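
As a rough sketch of how one might run this evaluation locally, assuming llm-foundry's documented `hf_eval.yaml` entrypoint and its `model_name_or_path` override (exact paths and flags may differ between llm-foundry versions):
```
# Hypothetical invocation of llm-foundry's HF eval harness; consult the
# llm-foundry README for the recipe matching your installed version.
git clone https://github.com/mosaicml/llm-foundry.git
cd llm-foundry && pip install -e . && cd scripts
composer eval/eval.py eval/yamls/hf_eval.yaml model_name_or_path=TRI-ML/DCLM-1B
```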