---
language:
- en
tags:
- falcon3
---
# Table of Contents
0. [TL;DR](#tldr)
1. [Model Details](#model-details)
2. [Usage](#usage)
3. [Training Details](#training-details)
4. [Evaluation](#evaluation)
5. [Citation](#citation)
# TL;DR
The Falcon 3 family of Open Foundation Models is a set of pretrained and instruct LLMs ranging from 1B to 10B parameters.
This repository contains Falcon3-7B-Instruct, the best instruct LLM under 8B parameters at the time of release.
# Model Details
## Model Description
- **Developed by:** [https://www.tii.ae](https://www.tii.ae)
- **Model type:** Causal decoder-only
- **Architecture:** Transformer-based
- **Language(s) (NLP):** Mainly English
- **License:** TII Falcon-LLM License 2.0
# Usage
Find below an example of how to use the model with the `transformers` library (make sure you have the latest version of `transformers`, or one built from source):
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "tiiuae/Falcon3-7B-Instruct"

# Load the model with automatic dtype selection and device placement
model = AutoModelForCausalLM.from_pretrained(
model_name,
torch_dtype="auto",
device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
prompt = "How many hours in one day?"
messages = [
{"role": "system", "content": "You are a helpful friendly assistant Falcon3 from TII, try to follow instructions as much as possible."},
{"role": "user", "content": prompt}
]

# Render the conversation with the model's chat template
text = tokenizer.apply_chat_template(
messages,
tokenize=False,
add_generation_prompt=True
)

# Tokenize the prompt and generate a response
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
generated_ids = model.generate(
**model_inputs,
max_new_tokens=1024
)

# Strip the prompt tokens, keeping only the newly generated ones
generated_ids = [
output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
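For quick experiments, the same conversation can also be run through the high-level `pipeline` API. The following is a minimal sketch, assuming a recent `transformers` release in which text-generation pipelines accept chat-format messages directly:
```python
from transformers import pipeline

# Assumes a recent transformers release with chat-aware text-generation pipelines
pipe = pipeline(
    "text-generation",
    model="tiiuae/Falcon3-7B-Instruct",
    torch_dtype="auto",
    device_map="auto",
)

messages = [
    {"role": "user", "content": "How many hours in one day?"},
]

outputs = pipe(messages, max_new_tokens=1024)
# The pipeline returns the full conversation; the last message is the model's reply
print(outputs[0]["generated_text"][-1]["content"])
```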
# Training Details
Based on `tiiuae/Falcon3-7B-Base`, the post-training stage comprises supervised finetuning followed by human preference alignment (DPO).
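For reference, DPO (Rafailov et al., 2023) optimizes the policy directly on preference pairs, pushing up the likelihood of the preferred response and down that of the rejected one relative to a frozen reference model:

$$
\mathcal{L}_{\text{DPO}} = -\,\mathbb{E}_{(x,\, y_w,\, y_l)}\left[\log \sigma\!\left(\beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\text{ref}}(y_w \mid x)} - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\text{ref}}(y_l \mid x)}\right)\right]
$$

where x is the prompt, y_w and y_l are the preferred and rejected responses, and the β coefficient controls how far the policy may drift from the reference.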
## Supervised finetuning
### Training Data
1.2 million diverse, high-quality samples drawn from Tulu-3, Open-Hermes, Numina, and Apigen.
| Data type | Ratio |
|--------------------------------------|-------|
| Conversations | 32% |
| STEM | 32% |
| Code | 12% |
| Safety | 9.1% |
| Multilingual | 8.3% |
| Function call | 3.3% |
| NLP (summarization, generation, QA) | 3.2% |
### Training Hyperparameters
| Hyperparameter | | Value |
|----------------|--------------|--------------|
| AdamW | β1 | 0.9 |
| | β2 | 0.999 |
| | weight decay | 0.01 |
| Learning rate | type | linear decay |
| | init lr | 5e-6 |
| | final lr | 0 |
| | warmup ratio | 0.03 |
| Batch size | | 64 |
| Epochs | | 2 |
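These settings map directly onto `transformers` training configuration. Below is a minimal sketch of the equivalent `TrainingArguments`; the `output_dir` and the per-device/gradient-accumulation split of the global batch size of 64 are illustrative assumptions, not the published training setup:
```python
from transformers import TrainingArguments

# Sketch of the SFT hyperparameters from the table above; output_dir and the
# batch-size split are hypothetical, chosen only for illustration.
training_args = TrainingArguments(
    output_dir="falcon3-7b-sft",    # hypothetical output path
    num_train_epochs=2,
    learning_rate=5e-6,             # init lr, decayed linearly to 0
    lr_scheduler_type="linear",
    warmup_ratio=0.03,
    adam_beta1=0.9,
    adam_beta2=0.999,
    weight_decay=0.01,
    per_device_train_batch_size=8,  # assumed split: 8 per device x
    gradient_accumulation_steps=8,  # 8 accumulation steps = 64 global batch
)
```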
## Human preference alignment - DPO
### Training Data
TODO
### Training Hyperparameters
TODO
# Evaluation
We report our internal pipeline benchmarks in the following table:
| Category | Benchmark | Llama-3.1-8B-Instruct | Qwen2-7B-Instruct | Qwen2.5-7B-Instruct | Falcon3-7B-Instruct |
|---------------------------|-------------------------|-----------------------|-------------------|---------------------|---------------------|
| General | MMLU (5-shot) | - | - | - | - |
| | MMLU-PRO (5-shot) | - | - | - | - |
| | IFEval | - | - | - | - |
| Math | GSM8K (5-shot) | - | - | - | - |
| | MATH (4-shot) | - | - | - | - |
| Reasoning | Arc Challenge (25-shot) | - | - | - | - |
| | GPQA (0-shot) | - | - | - | - |
| | MUSR (0-shot) | - | - | - | - |
| | BBH (3-shot) | - | - | - | - |
| CommonSense Understanding | PIQA (0-shot) | - | - | - | - |
| | SciQ (0-shot) | - | - | - | - |
| | Winogrande (0-shot) | - | - | - | - |
| | OpenbookQA (0-shot) | - | - | - | - |
# Citation
If the Falcon3 series of models was helpful to your work, feel free to cite it.
```bibtex
@misc{Falcon3,
title = {Falcon 3 family of Open Foundation Models},
author = {TII Team},
month = {December},
year = {2024}
}
```