MoxoffSpA / Azzurro

	---
	license: apache-2.0
	language:
	- it
	- en
	library_name: transformers
	tags:
	- sft
	- it
	- mistral
	- chatml
	---

	# Model Information

	xxxx is an SFT and LoRA fine-tuned version of Mistral-7B-v0.2.

	It has been trained on a mixture of open-source datasets, such as SQuAD-it (https://huggingface.co/datasets/squad_it), and some in-house datasets.

	It is not a plain Q&A model: it answers a question given a context, and it is intended for RAG pipelines and other applications that supply a context alongside the question.

	# Evaluation

	We evaluated the model on the same test sets used by the Open Ita LLM Leaderboard.

	| hellaswag_it acc_norm | arc_it acc_norm | m_mmlu_it 5-shot acc | Average |
	|:---------------------:|:---------------:|:--------------------:|:-------:|
	| 0.6067 | 0.4405 | 0.5112 | 0.52 |
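
As a quick sanity check, the Average column appears to be the unweighted mean of the three benchmark scores. That reading is an assumption, since the card does not say how the average was computed. A minimal sketch:

```python
# Scores copied from the evaluation table.
scores = {
    "hellaswag_it acc_norm": 0.6067,
    "arc_it acc_norm": 0.4405,
    "m_mmlu_it 5-shot acc": 0.5112,
}

# Assumption: "Average" is the plain arithmetic mean of the three scores.
average = sum(scores.values()) / len(scores)
print(f"{average:.4f}")  # 0.5195, which rounds to the reported 0.52
```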


	## Usage

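	```python
	from transformers import AutoModelForCausalLM, AutoTokenizer

	device = "cuda"  # switch to "cpu" if no GPU is available

	# Load the fine-tuned model and its tokenizer from the Hugging Face Hub.
	model = AutoModelForCausalLM.from_pretrained("MoxoffSpA/xxxx")
	tokenizer = AutoTokenizer.from_pretrained("MoxoffSpA/xxxx")

	question = """Quanto è alta la torre di Pisa?"""
	context = """
	La Torre di Pisa è un campanile del XII secolo, famoso per la sua inclinazione. Alta circa 56 metri.
	"""
	# The question and its context go into a single user message.
	prompt = f"Rispondi alla seguente domanda con meno parole possibili basandoti sul contesto fornito. Domanda: {question}, contesto: {context}"

	messages = [
	    {"role": "user", "content": prompt},
	]

	# Apply the chat template; add_generation_prompt=True appends the assistant
	# turn marker (when the template defines one) so the model starts its answer.
	encodeds = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")

	model_inputs = encodeds.to(device)
	model.to(device)

	generated_ids = model.generate(model_inputs, max_new_tokens=1000, do_sample=True)
	# The decoded text contains the prompt followed by the generated answer.
	decoded = tokenizer.batch_decode(generated_ids)
	print(decoded[0])
	```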
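
Because the model expects a question together with its context, a RAG application has to fold the retrieved passages into the prompt. The sketch below shows one possible way to do that. `build_prompt` and its `passages` parameter are hypothetical names, not part of the model or of `transformers`, and the template simply reuses the Italian instruction from the example above.

```python
from typing import Sequence


def build_prompt(question: str, passages: Sequence[str]) -> str:
    """Hypothetical helper: merge retrieved passages into one context string."""
    # Strip surrounding whitespace, drop empty passages, join with blank lines.
    context = "\n\n".join(p.strip() for p in passages if p.strip())
    return (
        "Rispondi alla seguente domanda con meno parole possibili "
        "basandoti sul contesto fornito. "
        f"Domanda: {question}, contesto: {context}"
    )


# Example with two passages, as a retriever might return them.
prompt = build_prompt(
    "Quanto è alta la torre di Pisa?",
    ["La Torre di Pisa è un campanile del XII secolo.", "  Alta circa 56 metri.  "],
)
messages = [{"role": "user", "content": prompt}]
```

The resulting `messages` list can be passed to `tokenizer.apply_chat_template` in the same way as in the example above.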


	## The Moxoff Team
	Marco D'Ambra, Jacopo Abate, Gianpaolo Francesco Trotta