---
library_name: transformers
base_model:
- HuggingFaceTB/SmolLM2-1.7B-Instruct
license: apache-2.0
language:
- en
- it
tags:
- translation
---
# SmolMaestra - A tiny model tuned for text translation
```html
_____ _ __ __ _
/ ____| | | \/ | | |
| (___ _ __ ___ ___ | | \ / | __ _ ___ ___| |_ _ __ __ _
\___ \| '_ ` _ \ / _ \| | |\/| |/ _` |/ _ \/ __| __| '__/ _` |
____) | | | | | | (_) | | | | | (_| | __/\__ \ |_| | | (_| |
|_____/|_| |_| |_|\___/|_|_| |_|\__,_|\___||___/\__|_| \__,_|
```
## Model Card
This model was finetuned on roughly 300,000 translation examples, covering both English to Italian and Italian to English. The finetuning teaches the model to return the translation directly, without much additional explanation.
Finetuning took about 10 hours on an NVIDIA A10G GPU.
Because the model is small (1.7B parameters), it also runs well on CPUs.
## Usage
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "LeonardPuettmann/SmolMaestra-1.7b-Instruct-v0.1"

# device_map="auto" places the model on a GPU if one is available, otherwise on the CPU
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(model_id, add_bos_token=True, trust_remote_code=True)

# The system prompt states the task; the user message holds the text to translate
row_json = [
    {"role": "system", "content": "Your job is to return translations for sentences or words from either Italian to English or English to Italian."},
    {"role": "user", "content": "Do you sell tickets for the bus?"},
]

prompt = tokenizer.apply_chat_template(row_json, tokenize=False)
# Move the inputs to the model's device instead of hard-coding "cuda"
model_input = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    print(tokenizer.decode(model.generate(**model_input, max_new_tokens=1024)[0]))
```
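The usage snippet prints the full decoded sequence, which includes the prompt and any special tokens. Below is a minimal sketch of how the translation alone could be pulled out of that string. It assumes the tokenizer keeps the ChatML-style markers (`<|im_start|>`, `<|im_end|>`) of the SmolLM2 base model. The helper name `extract_translation` is hypothetical, and the sample string is hand-written for illustration, not real model output.

```python
# Assumption: turns are formatted ChatML-style, i.e.
# "<|im_start|>role\ncontent<|im_end|>", as in the SmolLM2 base model.
IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def extract_translation(decoded: str) -> str:
    """Return the text of the last assistant turn in a decoded chat string."""
    marker = IM_START + "assistant"
    # rpartition splits at the LAST occurrence of the marker
    _, found, tail = decoded.rpartition(marker)
    if not found:
        # No assistant turn present: fall back to the raw text
        return decoded.strip()
    # Cut at the end-of-turn token if the model emitted one
    return tail.split(IM_END, 1)[0].strip()


# Hand-written sample for illustration only (not actual model output)
sample = (
    "<|im_start|>system\nYour job is to return translations.<|im_end|>\n"
    "<|im_start|>user\nDo you sell tickets for the bus?<|im_end|>\n"
    "<|im_start|>assistant\nVendete biglietti per l'autobus?<|im_end|>"
)
print(extract_translation(sample))
```

In the usage snippet, this would be applied to the string returned by `tokenizer.decode(...)` before printing.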
## Data used
The training data consists of sentence pairs from Tatoeba, which can be downloaded here: https://tatoeba.org/downloads
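The preprocessing used for finetuning is not documented here. As a rough sketch of how sentence pairs could be turned into chat-style examples for both translation directions, the code below assumes a hypothetical two-column, tab-separated input (English, then Italian) and reuses the system prompt from the usage section. The function names `to_chat_examples` and `load_pairs` are illustrative, not part of any library.

```python
import csv
import io

SYSTEM_PROMPT = (
    "Your job is to return translations for sentences or words from "
    "either Italian to English or English to Italian."
)


def to_chat_examples(english: str, italian: str) -> list[list[dict]]:
    """Build one chat example per direction from a single sentence pair."""

    def example(source: str, target: str) -> list[dict]:
        return [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": source},
            {"role": "assistant", "content": target},
        ]

    # English -> Italian first, then Italian -> English
    return [example(english, italian), example(italian, english)]


def load_pairs(tsv_text: str) -> list[tuple[str, str]]:
    """Parse tab-separated (English, Italian) rows, skipping malformed ones."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE)
    return [(row[0], row[1]) for row in reader if len(row) == 2]


# Hypothetical input in the assumed two-column layout
tsv = "Good morning.\tBuongiorno.\nThank you.\tGrazie.\n"
examples = [ex for en, it in load_pairs(tsv) for ex in to_chat_examples(en, it)]
print(len(examples))
print(examples[1][1]["content"])
```

Each pair yields two examples, so the model sees every sentence once as a source and once as a target.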
## Credits
Base model: `HuggingFaceTB/SmolLM2-1.7B-Instruct`
Finetuned by: Leonard Püttmann https://www.linkedin.com/in/leonard-p%C3%BCttmann-4648231a9/