File size: 13,202 Bytes
12637c4 6cec86d 6a164c1 6cec86d 6a164c1 3c1e933 0eba5bb b2c52b1 0eba5bb 0d21689 b2c52b1 360d089 877f844 fdee1a7 85e6c39 665c412 0c6c120 3be5a2f cb0ed1d 3d480f6 33a53eb 3d480f6 66dadf3 4c3661f c400e4f 3d480f6 7c2db1a b972f56 71ab8e9 6cab6c2 ee3415d 6cab6c2 7c2db1a 3d480f6 7c2db1a cb0ed1d 848a6af 51c3fe7 dd18495 51c3fe7 dd18495 51c3fe7 848a6af 2d4b3bf 79528fe 2d4b3bf 7215a87 50aec0b 2d4b3bf 7215a87 2d4b3bf 7215a87 2a4765e 2d4b3bf a4bb95e 2d4b3bf 3d480f6 2d4b3bf a4bb95e 3d480f6 a4bb95e 3d480f6 a4bb95e 2d4b3bf 50aec0b 2d4b3bf 7215a87 2d4b3bf 7215a87 2a4765e 2d4b3bf a4bb95e 2d4b3bf 3d480f6 2d4b3bf a4bb95e 3d480f6 a4bb95e 3d480f6 a4bb95e 2d4b3bf 848a6af 18b3fcb 8a46cc2 18b3fcb d8b6b27 fa1a068 d8b6b27 18b3fcb 6493b34 18b3fcb 848a6af cb0ed1d 7c2db1a cb0ed1d 848a6af 2046b2c a4bb95e 848a6af bb8a7f9 848a6af bb8a7f9 18eb715 6a164c1 |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 |
---
language:
- en
- it
license: llama3
library_name: transformers
tags:
- facebook
- meta
- pythorch
- llama
- llama-3
- llamantino
base_model: meta-llama/Meta-Llama-3-8B-Instruct
datasets:
- gsarti/clean_mc4_it
- Chat-Error/wizard_alpaca_dolly_orca
- mlabonne/orpo-dpo-mix-40k
metrics:
- accuracy
model_creator: Marco Polignano - SWAP Research Group
pipeline_tag: text-generation
model-index:
- name: LLaMAntino-3-ANITA-8B-Inst-DPO-ITA
results:
- task:
type: text-generation
name: Text Generation
dataset:
name: AI2 Reasoning Challenge (25-Shot)
type: ai2_arc
config: ARC-Challenge
split: test
args:
num_few_shot: 25
metrics:
- type: acc_norm
value: 74.57
name: normalized accuracy
source:
url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=swap-uniba/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: HellaSwag (10-Shot)
type: hellaswag
split: validation
args:
num_few_shot: 10
metrics:
- type: acc_norm
value: 92.75
name: normalized accuracy
source:
url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=swap-uniba/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: MMLU (5-Shot)
type: cais/mmlu
config: all
split: test
args:
num_few_shot: 5
metrics:
- type: acc
value: 66.85
name: accuracy
source:
url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=swap-uniba/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: TruthfulQA (0-shot)
type: truthful_qa
config: multiple_choice
split: validation
args:
num_few_shot: 0
metrics:
- type: mc2
value: 75.93
source:
url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=swap-uniba/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: Winogrande (5-shot)
type: winogrande
config: winogrande_xl
split: validation
args:
num_few_shot: 5
metrics:
- type: acc
value: 82.0
name: accuracy
source:
url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=swap-uniba/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: GSM8k (5-shot)
type: gsm8k
config: main
split: test
args:
num_few_shot: 5
metrics:
- type: acc
value: 58.61
name: accuracy
source:
url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=swap-uniba/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA
name: Open LLM Leaderboard
---
<img src="https://cdn-uploads.huggingface.co/production/uploads/5df8bb21da6d0311fd3d540f/cZoZdwQOPdQsnQmDXHcSn.png" alt="llamantino3_anita" border="0" width="800px">
<hr>
<!--<img src="https://i.ibb.co/6mHSRm3/llamantino53.jpg" width="200"/>-->
<h3><i>"Built with <b>Meta Llama 3</b>".</i></i></h3>
<p style="text-align:justify;"><b>LLaMAntino-3-ANITA-8B-Inst-DPO-ITA</b> is a model of the <a href="https://huggingface.co/swap-uniba"><b>LLaMAntino</b></a> - <i>Large Language Models family</i>.
The model is an instruction-tuned version of <a href="https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct"><b>Meta-Llama-3-8b-instruct</b></a> (a fine-tuned <b>LLaMA 3 model</b>).
This model version aims to be the a <b>Multilingual Model</b> ๐ (EN ๐บ๐ธ + ITA๐ฎ๐น) to further fine-tuning on Specific Tasks in Italian.</p>
The ๐**ANITA project**๐ *(**A**dvanced **N**atural-based interaction for the **ITA**lian language)*
wants to provide Italian NLP researchers with an improved model for the Italian Language ๐ฎ๐น use cases.<br>
<hr>
**Live DEMO:** [https://chat.llamantino.it/](https://chat.llamantino.it/)<br>
*It works only with Italian connection.*
<hr>
## Model Details
*Last Update: 10/05/2024*<br>
<a href="https://github.com/marcopoli/LLaMAntino-3-ANITA"><img src="https://github.githubassets.com/assets/GitHub-Logo-ee398b662d42.png" width="150"> https://github.com/marcopoli/LLaMAntino-3-ANITA</a><br>
| Model | HF | GGUF | EXL2 |
|-------|-------|-------|-------|
| *swap-uniba/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA* | [Link](https://huggingface.co/swap-uniba/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA) | [Link](https://huggingface.co/swap-uniba/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA_GGUF) | [Link](https://huggingface.co/swap-uniba/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA_EXL2) |
<hr>
## Specifications
- **Model developers**: <br><a href="https://marcopoli.github.io/">Ph.D. Marco Polignano</a> - University of Bari Aldo Moro, Italy <br> <a href="https://huggingface.co/swap-uniba">SWAP Research Group</a> <br>
- **Variations**: The model release has been **supervised fine-tuning (SFT)** using **QLoRA** 4bit, on instruction-based datasets. **DPO** approach over the *mlabonne/orpo-dpo-mix-40k* dataset is used to align with human preferences for helpfulness and safety.
- **Input**: Models input text only.
- **Language**: Multilingual ๐ + Italian ๐ฎ๐น
- **Output**: Models generate text and code only.
- **Model Architecture**: *Llama 3 architecture*.
- **Context length**: 8K, 8192.
- **Library Used**: [Unsloth](https://unsloth.ai/)
<hr>
## Playground
To use the model directly, there are many ways to get started, choose one of the following ways to experience it.
### Prompt Template
```
<|start_header_id|>system<|end_header_id|>
{ SYS Prompt }<|eot_id|><|start_header_id|>user<|end_header_id|>
{ USER Prompt }<|eot_id|><|start_header_id|>assistant<|end_header_id|>
{ ASSIST Prompt }<|eot_id|>
````
### Transformers
For direct use with `transformers`, you can easily get started with the following steps.
- Firstly, you need to install transformers via the command below with `pip`.
```bash
pip install -U transformers trl peft accelerate bitsandbytes
```
- Right now, you can start using the model directly.
```python
import torch
from transformers import (
AutoModelForCausalLM,
AutoTokenizer,
)
base_model = "swap-uniba/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA"
model = AutoModelForCausalLM.from_pretrained(
base_model,
torch_dtype=torch.bfloat16,
device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(base_model)
sys = "Sei un an assistente AI per la lingua Italiana di nome LLaMAntino-3 ANITA " \
"(Advanced Natural-based interaction for the ITAlian language)." \
" Rispondi nella lingua usata per la domanda in modo chiaro, semplice ed esaustivo."
messages = [
{"role": "system", "content": sys},
{"role": "user", "content": "Chi รจ Carlo Magno?"}
]
#Method 1
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False)
for k,v in inputs.items():
inputs[k] = v.cuda()
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, top_p=0.9, temperature=0.6)
results = tokenizer.batch_decode(outputs)[0]
print(results)
#Method 2
import transformers
pipe = transformers.pipeline(
model=model,
tokenizer=tokenizer,
return_full_text=False, # langchain expects the full text
task='text-generation',
max_new_tokens=512, # max number of tokens to generate in the output
temperature=0.6, #temperature for more or less creative answers
do_sample=True,
top_p=0.9,
)
sequences = pipe(messages)
for seq in sequences:
print(f"{seq['generated_text']}")
```
- Additionally, you can also use a model with **4bit quantization** to reduce the required resources at least. You can start with the code below.
```python
import torch
from transformers import (
AutoModelForCausalLM,
AutoTokenizer,
BitsAndBytesConfig,
)
base_model = "swap-uniba/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA"
bnb_config = BitsAndBytesConfig(
load_in_4bit=True,
bnb_4bit_quant_type="nf4",
bnb_4bit_compute_dtype=torch.bfloat16,
bnb_4bit_use_double_quant=False,
)
model = AutoModelForCausalLM.from_pretrained(
base_model,
quantization_config=bnb_config,
device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(base_model)
sys = "Sei un an assistente AI per la lingua Italiana di nome LLaMAntino-3 ANITA " \
"(Advanced Natural-based interaction for the ITAlian language)." \
" Rispondi nella lingua usata per la domanda in modo chiaro, semplice ed esaustivo."
messages = [
{"role": "system", "content": sys},
{"role": "user", "content": "Chi รจ Carlo Magno?"}
]
#Method 1
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False)
for k,v in inputs.items():
inputs[k] = v.cuda()
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, top_p=0.9, temperature=0.6)
results = tokenizer.batch_decode(outputs)[0]
print(results)
#Method 2
import transformers
pipe = transformers.pipeline(
model=model,
tokenizer=tokenizer,
return_full_text=False, # langchain expects the full text
task='text-generation',
max_new_tokens=512, # max number of tokens to generate in the output
temperature=0.6, #temperature for more or less creative answers
do_sample=True,
top_p=0.9,
)
sequences = pipe(messages)
for seq in sequences:
print(f"{seq['generated_text']}")
```
<hr>
## Evaluation
**Open LLM Leaderboard:**
Evaluated with lm-evaluation-benchmark-harness for the [**Open Italian LLMs Leaderboard**](https://huggingface.co/spaces/FinancialSupport/open_ita_llm_leaderboard)
```
lm_eval --model hf --model_args pretrained=HUGGINGFACE_MODEL_ID --tasks hellaswag_it,arc_it --device cuda:0 --batch_size auto:2
lm_eval --model hf --model_args pretrained=HUGGINGFACE_MODEL_ID --tasks m_mmlu_it --num_fewshot 5 --device cuda:0 --batch_size auto:2
```
| Metric | Value |
|-----------------------|---------------------------|
| Avg. | **0.6160** |
| Arc_IT | 0.5714 |
| Hellaswag_IT | 0.7093 |
| MMLU_IT | 0.5672 |
<hr>
## Unsloth
<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/made with unsloth.png" width="200px" align="center" />
[Unsloth](https://unsloth.ai), a great tool that helps us easily develop products, at a lower cost than expected.
## Citation instructions
```bibtex
@misc{polignano2024advanced,
title={Advanced Natural-based interaction for the ITAlian language: LLaMAntino-3-ANITA},
author={Marco Polignano and Pierpaolo Basile and Giovanni Semeraro},
year={2024},
eprint={2405.07101},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
```
```bibtex
@misc{basile2023llamantino,
title={LLaMAntino: LLaMA 2 Models for Effective Text Generation in Italian Language},
author={Pierpaolo Basile and Elio Musacchio and Marco Polignano and Lucia Siciliani and Giuseppe Fiameni and Giovanni Semeraro},
year={2023},
eprint={2312.09993},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
```
```bibtex
@article{llama3modelcard,
title={Llama 3 Model Card},
author={AI@Meta},
year={2024},
url = {https://github.com/meta-llama/llama3/blob/main/MODEL_CARD.md}
}
```
# Acknowledgments
We acknowledge the support of the PNRR project [FAIR - Future AI Research (PE00000013)](https://fondazione-fair.it/en/foundation/), Spoke 6 - Symbiotic AI (CUP H97G22000210007) under the NRRP MUR program funded by the NextGenerationEU.
Models are built on the Leonardo supercomputer with the support of CINECA-Italian Super Computing Resource Allocation, class C project IscrC\_Pro\_MRS (HP10CQO70G).
<img src="https://wiki.u-gov.it/confluence/download/attachments/49842317/image2022-6-21_11-11-44.png?version=1&modificationDate=1655802705000&api=v2" width="600px">
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_swap-uniba__LLaMAntino-3-ANITA-8B-Inst-DPO-ITA)
| Metric |Value|
|---------------------------------|----:|
|Avg. |75.12|
|AI2 Reasoning Challenge (25-Shot)|74.57|
|HellaSwag (10-Shot) |92.75|
|MMLU (5-Shot) |66.85|
|TruthfulQA (0-shot) |75.93|
|Winogrande (5-shot) |82.00|
|GSM8k (5-shot) |58.61|
|