---
language: en
---
|
|
|
# dummy-llama-2
|
|
|
This is a dummy version of the model based on [`meta-llama/Llama-2-7b-hf`](https://huggingface.co/meta-llama/Llama-2-7b-hf).
|
|
|
## 🧩 Dummy
|
|
|
`dummy-llama-2` is 929.07 MB instead of the original 13,476.98 MB (a compression factor of 14.51) but keeps the base model's functionality.
|
|
|
This dummy version is intended for **debugging**, so you don't have to download the entire original model. Do not use it for inference.
|
|
|
## 💻 Usage
|
|
|
```python
# pip install transformers accelerate

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "dummy-llama-2"

# Load the tokenizer and the dummy model in half precision on GPU 0
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    low_cpu_mem_usage=True,
    return_dict=True,
    torch_dtype=torch.float16,
    device_map={"": 0},
)
```
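
As a quick sanity check (for debugging only, not real inference), you can run a single forward pass to confirm the dummy checkpoint loads and produces logits of the expected shape. This is a minimal sketch that assumes the `tokenizer` and `model` objects from the snippet above have already been created.

```python
# Tokenize a short prompt and move the tensors to the model's device
inputs = tokenizer("Hello, world!", return_tensors="pt").to(model.device)

# Forward pass without gradient tracking; we only inspect the output shape
with torch.no_grad():
    outputs = model(**inputs)

# Should print (batch_size, sequence_length, vocab_size)
print(outputs.logits.shape)
```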