sedrickkeh committed • Commit ab7c317 • 1 Parent(s): ba14d96
Update README.md

README.md CHANGED
@@ -13,6 +13,27 @@ DCLM-1B is a 1.4 billion parameter language model trained on the DCLM-Baseline dataset.
13 |
|
14 |
The instruction tuned version of this model is available here: https://huggingface.co/TRI-ML/DCLM-1B-IT
|
15 |
|
16 |
+
## Quickstart
First install open_lm:
```
pip install git+https://github.com/mlfoundations/open_lm.git
```

Then you can load the model using HF's Auto classes as follows:
```python
from open_lm.hf import *  # registers the open_lm architecture with HF's Auto classes
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("TRI-ML/DCLM-1B")
model = AutoModelForCausalLM.from_pretrained("TRI-ML/DCLM-1B")

# Sample 50 new tokens with nucleus sampling and a mild repetition penalty
inputs = tokenizer(["Machine learning is"], return_tensors="pt")
gen_kwargs = {"max_new_tokens": 50, "top_p": 0.8, "temperature": 0.8, "do_sample": True, "repetition_penalty": 1.1}
output = model.generate(inputs["input_ids"], **gen_kwargs)
output = tokenizer.decode(output[0].tolist(), skip_special_tokens=True)
print(output)
```
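
Because `do_sample=True` draws from the sampling distribution, each run will produce different text. As a minimal variant (not part of the original card, just standard `transformers` usage), you can switch to greedy decoding for reproducible output:
```python
# Greedy (deterministic) decoding; reuses model, tokenizer, and inputs from above
output = model.generate(inputs["input_ids"], max_new_tokens=50, do_sample=False)
print(tokenizer.decode(output[0].tolist(), skip_special_tokens=True))
```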

## Evaluation

We evaluate DCLM-1B using the [llm-foundry](https://github.com/mosaicml/llm-foundry) eval suite, and compare to recently released small models on key benchmarks.
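
As a rough sketch of how one might run this evaluation locally, assuming llm-foundry's documented `hf_eval.yaml` entrypoint and its `model_name_or_path` override (exact paths and flags may differ between llm-foundry versions):
```
# Hypothetical invocation of llm-foundry's HF eval harness; consult the
# llm-foundry README for the recipe matching your installed version.
git clone https://github.com/mosaicml/llm-foundry.git
cd llm-foundry && pip install -e . && cd scripts
composer eval/eval.py eval/yamls/hf_eval.yaml model_name_or_path=TRI-ML/DCLM-1B
```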