---
library_name: transformers
tags:
- bitnet
- falcon3
base_model: tiiuae/Falcon3-7B-Base
---
![image/png](https://cdn-uploads.huggingface.co/production/uploads/62441d1d9fdefb55a0b7d12c/c-tosr0FvMlKuKQTojx_6.png)
# Table of Contents
0. [TL;DR](#tldr)
1. [Model Details](#model-details)
2. [Training Details](#training-details)
3. [Usage](#usage)
4. [Evaluation](#evaluation)
5. [Citation](#citation)
# TL;DR
# Model Details
## Model Description
- **Developed by:** [https://www.tii.ae](https://www.tii.ae)
- **Model type:** Causal decoder-only
- **Architecture:** Pure-transformer - 1.58bit version
- **Language(s) (NLP):** Mainly English
- **License:** TII Falcon License 2.0
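
The "1.58bit" in the architecture name comes from ternary weights: a weight that can take one of three values carries log2(3) bits of information. The short calculation below is only an illustration of that arithmetic; the comparison with bfloat16 is an idealized, weights-only ratio, and real storage formats may use more bits per weight.

```python
import math

# A ternary weight takes one of three values {-1, 0, +1},
# so its information content is log2(3) bits.
bits_per_ternary_weight = math.log2(3)

# Idealized comparison with 16-bit bfloat16 weights (assumption:
# weights only, ignoring scales, embeddings, and packing overhead).
bf16_bits = 16
compression_ratio = bf16_bits / bits_per_ternary_weight

print(f"{bits_per_ternary_weight:.3f} bits per ternary weight")
print(f"about {compression_ratio:.1f}x fewer bits than bfloat16")
```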
# Training Details
The model was trained following the strategies described in the [1-bit LLM HF blogpost](https://huggingface.co/blog/1_58_llm_extreme_quantization) and the [1-bit LLM paper](https://github.com/microsoft/unilm/blob/master/bitnet/The-Era-of-1-bit-LLMs__Training_Tips_Code_FAQ.pdf).
For more details about the training protocol of this model, please refer to the Falcon-3 technical report, section *Compression*.
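
For readers unfamiliar with the 1-bit LLM line of work, the sketch below illustrates the absmean ternary weight quantization described in the BitNet b1.58 paper: weights are divided by their mean absolute value, rounded, and clipped to {-1, 0, +1}. It is a minimal, plain-Python illustration and not the training code used for this model; the function names `absmean_ternary_quantize` and `dequantize`, the `eps` default, and the example matrix are assumptions made for the example.

```python
def absmean_ternary_quantize(weights, eps=1e-5):
    """Quantize a 2-D list of floats to {-1, 0, +1} with one absmean scale.

    Illustrative only: function name, eps value, and list-based layout
    are assumptions, not taken from the model's training code.
    """
    abs_values = [abs(w) for row in weights for w in row]
    gamma = sum(abs_values) / len(abs_values)  # mean absolute weight
    scale = gamma + eps                        # eps avoids division by zero
    ternary = [
        [max(-1, min(1, round(w / scale))) for w in row]  # round, then clip
        for row in weights
    ]
    return ternary, scale


def dequantize(ternary, scale):
    """Map ternary values back to floats using the stored scale."""
    return [[t * scale for t in row] for row in ternary]


# Hypothetical 2x3 weight matrix, for demonstration only.
weights = [[0.9, -0.05, 0.4], [-1.2, 0.02, -0.3]]
ternary, scale = absmean_ternary_quantize(weights)
print(ternary)
print(dequantize(ternary, scale))
```

Small-magnitude weights collapse to 0 while larger ones keep only their sign, so the single scale factor is the only floating-point value stored for the matrix in this sketch.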
# Usage
To use this model, you can currently rely on either the Hugging Face transformers library or the [BitNet](https://github.com/microsoft/BitNet) library. You can also try the model in the [falcon-1.58bit playground](https://huggingface.co/spaces/tiiuae/falcon3-1.58bit-playground) (available only for the 7B instruct version).
## 🤗 transformers
```python
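import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/Falcon3-7B-Base-1.58bit"

# Load the tokenizer and the model in bfloat16, then move the model to the GPU
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
).to("cuda")

# Perform text generation (the prompt and max_new_tokens are example values)
inputs = tokenizer("Hi how are you doing today?", return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))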
```
## BitNet
```bash
git clone https://github.com/microsoft/BitNet && cd BitNet
pip install -r requirements.txt
python setup_env.py --hf-repo tiiuae/Falcon3-7B-Base-1.58bit -q i2_s
python run_inference.py -m models/Falcon3-7B-1.58bit/ggml-model-i2_s.gguf -p "Hi how are you doing today?" -cnv
```
# Evaluation
The following table reports results from our internal evaluation pipeline:
**Note: evaluation results are normalized scores from the v2 leaderboard tasks; the results reported for the original models in the blogpost are raw scores.**
<table border="1" style="width: 100%; text-align: center; border-collapse: collapse;">
<colgroup>
<col style="width: 10%;">
<col style="width: 10%;">
<col style="background-color: rgba(80, 15, 213, 0.5); width: 7%;">
</colgroup>
<thead>
<tr>
<th>Benchmark</th>
<th>Llama3-8B-1.58-100B-tokens</th>
<th>Falcon3-7B-Base-1.58bit </th>
</tr>
</thead>
<tbody>
<tr>
<td>IFEval</td>
<td>17.91</td>
<td><b>25.43</b></td>
</tr>
<tr>
<td>MUSR</td>
<td>4.87</td>
<td><b>5.75</b></td>
</tr>
<tr>
<td>GPQA</td>
<td>1.83</td>
<td><b>2.32</b></td>
</tr>
<tr>
<td>BBH</td>
<td><b>5.36</b></td>
<td>3.91</td>
</tr>
<tr>
<td>MMLU-PRO</td>
<td><b>2.78</b></td>
<td>1.36</td>
</tr>
<tr>
<td>MATH</td>
<td>0.26</td>
<td><b>0.88</b></td>
</tr>
<tr>
<td>Average</td>
<td>5.50</td>
<td><b>6.61</b></td>
</tr>
</tbody>
</table>
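
As a quick sanity check, the Average row can be recomputed as the unweighted mean of the six benchmark scores in each column. The snippet below is a small sketch that copies the values from the table; that the average is a plain arithmetic mean is an assumption, though it reproduces the reported figures.

```python
# Per-benchmark scores copied from the table above, in the order
# IFEval, MUSR, GPQA, BBH, MMLU-PRO, MATH.
scores = {
    "Llama3-8B-1.58-100B-tokens": [17.91, 4.87, 1.83, 5.36, 2.78, 0.26],
    "Falcon3-7B-Base-1.58bit": [25.43, 5.75, 2.32, 3.91, 1.36, 0.88],
}

# Assumption: the Average row is an unweighted mean over the six tasks.
averages = {name: sum(vals) / len(vals) for name, vals in scores.items()}

for name, avg in averages.items():
    print(f"{name}: {avg:.2f}")
```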
# Citation
```
@misc{Falcon3,
title = {The Falcon 3 Family of Open Models},
url = {https://huggingface.co/blog/falcon3},
author = {Falcon-LLM Team},
month = {December},
year = {2024}
}
```