maritaca-ai/sabia-7b

@@ -23,9 +23,59 @@ Sabiá-7B is Portuguese language model developed by [Maritaca AI](https://www.ma
 **Paper:** For more details, please refer to our paper: [Sabiá: Portuguese Large Language Models](https://arxiv.org/pdf/2304.07880.pdf)
-Given that Sabiá-7B was trained solely on a language modeling objective without fine-tuning for instruction following, it is recommended for few-shot tasks rather than zero-shot tasks.
-**Results in Portuguese**
 Below we show the results on the Poeta benchmark, which consists of 14 Portuguese datasets.
@@ -37,7 +87,7 @@ For more information on the Normalized Preferred Metric (NPM), please refer to o
 |LLaMA-2-7B| 43.7|
 |Sabiá-7B| 48.5|
-**Results in English**
 Below we show the average results on 6 English datasets: PIQA, HellaSwag, WinoGrande, ARC-e, ARC-c, and OpenBookQA.
@@ -47,6 +97,7 @@ Below we show the average results on 6 English datasets: PIQA, HellaSwag, WinoGr
 |Sabiá-7B| 49.0|
 Please use the following bibtex to cite our paper:
 ```

 **Paper:** For more details, please refer to our paper: [Sabiá: Portuguese Large Language Models](https://arxiv.org/pdf/2304.07880.pdf)
+## Few-shot Example
+Because Sabiá-7B was trained solely on a language modeling objective, without fine-tuning for instruction following, we recommend using it with few-shot prompts rather than zero-shot prompts, as in the example below.
+```python
+import torch
+from transformers import LlamaTokenizer, LlamaForCausalLM
+tokenizer = LlamaTokenizer.from_pretrained("maritaca-ai/sabia-7b")
+model = LlamaForCausalLM.from_pretrained(
+    "maritaca-ai/sabia-7b",
+    device_map="auto",  # Automatically loads the model onto the GPU, if there is one. Requires pip install accelerate
+    low_cpu_mem_usage=True,
+    torch_dtype=torch.bfloat16   # If your GPU does not support bfloat16, change to torch.float16
+)
+prompt = """Classifique a resenha de filme como "positiva" ou "negativa".
+Resenha: Gostei muito do filme, é o melhor do ano!
+Classe: positiva
+Resenha: O filme deixa muito a desejar.
+Classe: negativa
+Resenha: Apesar de longo, valeu o ingresso.
+Classe:"""
+input_ids = tokenizer(prompt, return_tensors="pt")
+output = model.generate(
+    input_ids["input_ids"].to(model.device),  # Follow the device chosen by device_map instead of hard-coding "cuda"
+    max_length=1024,
+    eos_token_id=tokenizer.encode("\n"))  # Stop generation when a "\n" token is detected
+# The output contains the input tokens, so we have to skip them.
+output = output[0][len(input_ids["input_ids"][0]):]
+print(tokenizer.decode(output, skip_special_tokens=True))
+```
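The prompt above follows a fixed pattern: one instruction line, a few labeled examples, and a final unlabeled query. As a minimal sketch, it could be assembled programmatically as shown below. The helper `build_few_shot_prompt` and its parameters are hypothetical names introduced for illustration; they are not part of the model card or of `transformers`.

```python
def build_few_shot_prompt(instruction, examples, query,
                          input_label="Resenha", output_label="Classe"):
    """Assemble a few-shot prompt: instruction, labeled examples, then the open query.

    Hypothetical helper for illustration; the labels default to the ones
    used in the movie-review prompt of the model card.
    """
    lines = [instruction]
    for text, label in examples:
        lines.append(f"{input_label}: {text}")
        lines.append(f"{output_label}: {label}")
    # The query is left unlabeled so the model completes the class.
    lines.append(f"{input_label}: {query}")
    lines.append(f"{output_label}:")
    return "\n".join(lines)


prompt = build_few_shot_prompt(
    instruction='Classifique a resenha de filme como "positiva" ou "negativa".',
    examples=[
        ("Gostei muito do filme, é o melhor do ano!", "positiva"),
        ("O filme deixa muito a desejar.", "negativa"),
    ],
    query="Apesar de longo, valeu o ingresso.",
)
print(prompt)
```

The resulting string is the same prompt used in the example and can be passed to the tokenizer unchanged. Keeping the examples in a list makes it easy to add or swap shots without editing the prompt text by hand.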
+If your GPU does not have enough memory, try loading the model in int8 precision.
+Expect some degradation in output quality compared to fp16 or bf16, however.
+```python
+model = LlamaForCausalLM.from_pretrained(
+    "maritaca-ai/sabia-7b",
+    device_map="auto",
+    low_cpu_mem_usage=True,
+    load_in_8bit=True,  # Requires pip install bitsandbytes
+)
+```
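As a rough back-of-the-envelope check on why int8 helps, the memory taken by the weights scales with the number of bytes per parameter. The sketch below assumes about 7 billion parameters (an approximation taken from the model name) and counts weights only, ignoring activations, the KV cache, and any quantization overhead. The function name `weight_memory_gib` is hypothetical.

```python
def weight_memory_gib(num_params, bytes_per_param):
    """Approximate memory needed for the weights alone, in GiB."""
    return num_params * bytes_per_param / 2**30


NUM_PARAMS = 7e9  # Assumption: roughly 7 billion parameters

bf16_gib = weight_memory_gib(NUM_PARAMS, 2)  # bf16 and fp16 use 2 bytes per parameter
int8_gib = weight_memory_gib(NUM_PARAMS, 1)  # int8 uses 1 byte per parameter

print(f"bf16/fp16 weights: ~{bf16_gib:.1f} GiB")
print(f"int8 weights:      ~{int8_gib:.1f} GiB")
```

Under these assumptions, int8 roughly halves the weight footprint relative to fp16 or bf16; actual usage will be higher once generation buffers are included.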
+## Results in Portuguese
 Below we show the results on the Poeta benchmark, which consists of 14 Portuguese datasets.
 |LLaMA-2-7B| 43.7|
 |Sabiá-7B| 48.5|
+## Results in English
 Below we show the average results on 6 English datasets: PIQA, HellaSwag, WinoGrande, ARC-e, ARC-c, and OpenBookQA.
 |Sabiá-7B| 49.0|
+## Citation
 Please use the following bibtex to cite our paper:
 ```