LeonardPuettmann
/

SmolMaestra-1.7b-Translation

	---
	library_name: transformers
	base_model:
	- HuggingFaceTB/SmolLM2-1.7B-Instruct
	license: apache-2.0
	language:
	- en
	- it
	tags:
	- translation
	---
	# SmolMaestra - A tiny Llama model tuned for text translation
	```html
	  _____                 _ __  __                 _
	 / ____|               | |  \/  |               | |
	| (___  _ __ ___   ___ | | \  / | __ _  ___  ___| |_ _ __ __ _
	 \___ \| '_ ` _ \ / _ \| | |\/| |/ _` |/ _ \/ __| __| '__/ _` |
	 ____) | | | | | | (_) | | | |  | | (_| |  __/\__ \ |_| | | (_| |
	|_____/|_| |_| |_|\___/|_|_|  |_|\__,_|\___||___/\__|_|  \__,_|
	```

	## Model Card
	This model was finetuned on roughly 300,000 translation examples, covering both English to Italian and Italian to English. The finetuning teaches the model to return the translation directly, without much additional explanation.

	Finetuning took about 10 hours on an NVIDIA A10G GPU.

	With 1.7B parameters, the model is small enough to run very well on CPUs.

	![A very Italian Llama model](llamaestro-sm-bg.png)
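
To put the CPU claim into numbers, here is a rough back-of-the-envelope sketch of the memory needed for the weights alone. The parameter count is taken from the "1.7b" in the model name, and the figures ignore activations, the KV cache, and framework overhead, so treat them as a lower bound rather than a measurement.

```python
# Rough weight-only memory estimate (assumption: 1.7e9 parameters, from the model name).
PARAMS = 1.7e9

# Bytes needed to store one parameter in common dtypes.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "int8": 1}


def weight_memory_gb(num_params, dtype):
    """Return the approximate size of the weights in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: ~{weight_memory_gb(PARAMS, dtype):.1f} GB")
```

Under these assumptions the weights need about 6.8 GB in float32 and about half of that in bfloat16, which fits into the RAM of many laptops.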

	## Usage

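	```python
	import torch
	from transformers import AutoTokenizer, AutoModelForCausalLM

	# Model id as given in this card; the hosting repository is named
	# SmolMaestra-1.7b-Translation, so adjust the id if loading fails.
	model_id = "LeonardPuettmann/SmolMaestra-1.7b-Instruct-v0.1"

	# device_map="auto" places the model on a GPU if one is available, otherwise on the CPU.
	model = AutoModelForCausalLM.from_pretrained(
	    model_id,
	    device_map="auto",
	    trust_remote_code=True,
	)

	tokenizer = AutoTokenizer.from_pretrained(model_id, add_bos_token=True, trust_remote_code=True)

	# The system prompt states the task; the user message holds the text to translate.
	row_json = [
	    {"role": "system", "content": "Your job is to return translations for sentences or words from either Italian to English or English to Italian."},
	    {"role": "user", "content": "Do you sell tickets for the bus?"},
	]

	# add_generation_prompt=True appends the assistant header so the model replies with the translation.
	prompt = tokenizer.apply_chat_template(row_json, tokenize=False, add_generation_prompt=True)
	# Move the inputs to the model's device instead of hard-coding "cuda", so this also runs on CPU.
	model_input = tokenizer(prompt, return_tensors="pt").to(model.device)

	with torch.no_grad():
	    print(tokenizer.decode(model.generate(**model_input, max_new_tokens=1024)[0]))
	```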
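
The snippet above prints the whole sequence, prompt included. For repeated use it may be handy to wrap the steps in a helper that returns only the translation. The following is a minimal sketch, not part of the original card: `build_messages`, `new_tokens_only`, and `translate` are hypothetical names, the default `max_new_tokens=128` is an arbitrary choice, and `model` and `tokenizer` are assumed to be loaded as shown above.

```python
SYSTEM_PROMPT = (
    "Your job is to return translations for sentences or words "
    "from either Italian to English or English to Italian."
)


def build_messages(text):
    """Build the chat messages for one translation request (hypothetical helper)."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": text.strip()},
    ]


def new_tokens_only(output_ids, prompt_length):
    """Drop the prompt tokens so that only the generated continuation remains."""
    return output_ids[prompt_length:]


def translate(model, tokenizer, text, max_new_tokens=128):
    """Translate one sentence with an already loaded model and tokenizer."""
    import torch  # imported here so the helpers above work without torch installed

    prompt = tokenizer.apply_chat_template(
        build_messages(text), tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    prompt_length = inputs["input_ids"].shape[1]
    generated = new_tokens_only(output[0], prompt_length)
    # skip_special_tokens removes chat markers such as end-of-turn tokens.
    return tokenizer.decode(generated, skip_special_tokens=True).strip()
```

With the model loaded, a call would look like `translate(model, tokenizer, "Do you sell tickets for the bus?")`.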

	## Data used
	The data consists of sentence pairs from Tatoeba (tatoeba.org). It can be downloaded from https://tatoeba.org/downloads
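
The card does not describe how the sentence pairs were turned into training examples. As an illustration only, the sketch below shows one plausible way to do it. It assumes a tab-separated export with four columns (sentence id, English text, translation id, Italian text), reuses the system prompt from the usage section, and emits one chat example per direction. The function name `pairs_to_examples` and the sample row (ids and Italian sentence) are hypothetical.

```python
import csv
import io

SYSTEM_PROMPT = (
    "Your job is to return translations for sentences or words "
    "from either Italian to English or English to Italian."
)


def pairs_to_examples(tsv_text):
    """Turn tab-separated sentence pairs into chat examples for both directions."""
    examples = []
    # QUOTE_NONE keeps quotation marks inside sentences as plain characters.
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != 4:
            continue  # skip malformed lines
        _, english, _, italian = row
        for source, target in ((english, italian), (italian, english)):
            examples.append([
                {"role": "system", "content": SYSTEM_PROMPT},
                {"role": "user", "content": source},
                {"role": "assistant", "content": target},
            ])
    return examples


# Hypothetical sample row; the ids are made up for illustration.
sample = "1\tDo you sell tickets for the bus?\t2\tVendete biglietti per l'autobus?\n"
examples = pairs_to_examples(sample)
print(len(examples))
```

Each sentence pair yields two examples, so a single model learns both translation directions from the same data.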

	## Credits

	Base model: `HuggingFaceTB/SmolLM2-1.7B-Instruct`
	Finetuned by: Leonard Püttmann https://www.linkedin.com/in/leonard-p%C3%BCttmann-4648231a9/