---
language:
- pt
---

Sabiá-7B is a Portuguese language model developed by [Maritaca AI](https://www.maritaca.ai/).

**Input:** The model accepts only text input.

**Output:** The model generates text only.

**Model Architecture:** Sabiá-7B is an auto-regressive language model that uses the same architecture as LLaMA-1-7B.

**Tokenizer:** It uses the same tokenizer as LLaMA-1-7B.

**Maximum sequence length:** 2048 tokens.

**Pretraining data:** The model was pretrained on the Portuguese subset of ClueWeb22, a corpus of 7 billion tokens. Starting from the weights of LLaMA-1-7B, it was trained for an additional 10 billion tokens, approximately 1.4 epochs of the training dataset.

**Data Freshness:** The pretraining data has a cutoff of mid-2022.

**License:** The licensing is the same as LLaMA-1's, restricting the model's use to research purposes only.

**Paper:** For more details, please refer to our paper: [Sabiá: Portuguese Large Language Models](https://arxiv.org/pdf/2304.07880.pdf) 

Because Sabiá-7B was trained solely on a language-modeling objective, without fine-tuning for instruction following, it is recommended for few-shot tasks rather than zero-shot tasks.
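A minimal sketch of few-shot usage with the Hugging Face `transformers` library. The prompt format (`Pergunta:`/`Resposta:` pairs) and the repository id `maritaca-ai/sabia-7b` are assumptions for illustration; adapt them to your task and to the actual model location.

```python
# Few-shot prompting sketch for a base (non-instruction-tuned) model:
# labeled examples are concatenated, and the model completes the final answer.

def build_few_shot_prompt(examples, query):
    """Build a prompt from (question, answer) pairs followed by the
    unanswered query, so the model continues with the answer."""
    parts = [f"Pergunta: {q}\nResposta: {a}" for q, a in examples]
    parts.append(f"Pergunta: {query}\nResposta:")
    return "\n\n".join(parts)

examples = [
    ("Qual é a capital do Brasil?", "Brasília"),
    ("Qual é a capital de Portugal?", "Lisboa"),
]
prompt = build_few_shot_prompt(examples, "Qual é a capital da França?")

# Generation (downloads ~14 GB of weights; shown for completeness,
# assuming the model is hosted as "maritaca-ai/sabia-7b"):
# from transformers import AutoTokenizer, AutoModelForCausalLM
# tokenizer = AutoTokenizer.from_pretrained("maritaca-ai/sabia-7b")
# model = AutoModelForCausalLM.from_pretrained("maritaca-ai/sabia-7b")
# inputs = tokenizer(prompt, return_tensors="pt")
# output = model.generate(**inputs, max_new_tokens=10)
# print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:]))
```

Keeping the examples within the 2048-token context window is the caller's responsibility; with long exemplars, truncate the list rather than the query.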

**Results in Portuguese** 

Below we show the results on the Poeta benchmark, which consists of 14 Portuguese datasets.

For more information on the Normalized Preferred Metric (NPM), please refer to our paper.

|Model | NPM |
|--|--|
|LLaMA-1-7B| 33.0|
|LLaMA-2-7B| 43.7|
|Sabiá-7B| 48.5|

**Results in English** 

Below we show the average results on 6 English datasets: PIQA, HellaSwag, WinoGrande, ARC-e, ARC-c, and OpenBookQA.

|Model | NPM |
|--|--|
|LLaMA-1-7B| 50.1|
|Sabiá-7B| 49.0|



Please use the following bibtex to cite our paper: 
```
@InProceedings{10.1007/978-3-031-45392-2_15,
    author="Pires, Ramon
    and Abonizio, Hugo
    and Almeida, Thales Sales
    and Nogueira, Rodrigo",
    editor="Naldi, Murilo C.
    and Bianchi, Reinaldo A. C.",
    title="Sabi{\'a}: Portuguese Large Language Models",
    booktitle="Intelligent Systems",
    year="2023",
    publisher="Springer Nature Switzerland",
    address="Cham",
    pages="226--240",
    isbn="978-3-031-45392-2"
}
```