---
language:
- en
tags:
- falcon3
---

# Table of Contents

0. [TL;DR](#TL;DR)
1. [Model Details](#model-details)
2. [Usage](#usage)
3. [Training Details](#training-details)
4. [Evaluation](#evaluation)

# TL;DR

# Model Details

## Model Description

- **Developed by:** [https://www.tii.ae](https://www.tii.ae)
- **Model type:** Causal decoder-only
- **Architecture:** Transformer-based
- **Language(s) (NLP):** Mainly English
- **License:** TII Falcon-LLM License 2.0
# Usage

Below are some example scripts showing how to use the model with `transformers` (make sure you have the latest `transformers` release, or a version built from source):

## Using the PyTorch model with 🤗 transformers

### Running the model on a CPU
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("tiiuae/Falcon3-7B-Base")
model = AutoModelForCausalLM.from_pretrained("tiiuae/Falcon3-7B-Base")

input_text = "Question: How many hours in one day? Answer: "
input_ids = tokenizer(input_text, return_tensors="pt").input_ids

outputs = model.generate(input_ids)
print(tokenizer.decode(outputs[0]))
```
### Running the model on a GPU
```python
# pip install accelerate
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("tiiuae/Falcon3-7B-Base")
# device_map="auto" lets 🤗 Accelerate place the weights on the available GPU(s).
model = AutoModelForCausalLM.from_pretrained("tiiuae/Falcon3-7B-Base", device_map="auto")

input_text = "Question: How many hours in one day? Answer: "
input_ids = tokenizer(input_text, return_tensors="pt").input_ids.to("cuda")

outputs = model.generate(input_ids)
print(tokenizer.decode(outputs[0]))
```
### Running the model on a GPU using `torch.compile`
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("tiiuae/Falcon3-7B-Base")
# Load in bfloat16 on GPU 0, then compile the model for faster repeated generation.
model = AutoModelForCausalLM.from_pretrained("tiiuae/Falcon3-7B-Base", torch_dtype=torch.bfloat16).to(0)
model = torch.compile(model)

input_text = "Question: How many hours in one day? Answer: "
input_ids = tokenizer(input_text, return_tensors="pt").input_ids.to("cuda")

outputs = model.generate(input_ids)
print(tokenizer.decode(outputs[0]))
```
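### Running the model with the `pipeline` API

For quick experimentation, the high-level `pipeline` API is an alternative to calling the tokenizer and model directly. This is a minimal sketch, not part of the original card's examples; it assumes `accelerate` is installed and a CUDA device is available.

```python
# Minimal sketch using the transformers pipeline API
# (assumes `accelerate` is installed and a GPU is available).
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="tiiuae/Falcon3-7B-Base",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# max_new_tokens caps the length of the generated continuation.
result = pipe("Question: How many hours in one day? Answer: ", max_new_tokens=32)
print(result[0]["generated_text"])
```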
# Training Details

## Training Data

## Training Procedure

### Training Hyperparameters

| **Hyperparameter** | **Value**  | **Comment**                                                                      |
|--------------------|------------|----------------------------------------------------------------------------------|
| Precision          | `bfloat16` |                                                                                  |
| Optimizer          | AdamW      |                                                                                  |
| Max learning rate  |            | Following a WSD (warmup-stable-decay) learning rate schedule (see sketch below)  |
| Weight decay       |            |                                                                                  |
| Batch size         |            |                                                                                  |
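The table above references a warmup-stable-decay (WSD) schedule; the sketch below only illustrates its general shape. All step counts and learning-rate values here are hypothetical placeholders, not the configuration actually used for training.

```python
# Illustrative WSD (warmup-stable-decay) learning-rate schedule.
# Every number below is a hypothetical placeholder, not the
# actual Falcon3 training configuration.
def wsd_lr(step: int,
           peak_lr: float = 3e-4,
           warmup_steps: int = 1_000,
           stable_steps: int = 100_000,
           decay_steps: int = 10_000,
           min_lr: float = 3e-5) -> float:
    """Piecewise schedule: linear warmup -> constant plateau -> linear decay."""
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to the peak rate.
        return peak_lr * step / warmup_steps
    if step < warmup_steps + stable_steps:
        # Stable phase: hold the peak rate constant.
        return peak_lr
    # Decay phase: ramp linearly from peak_lr down to min_lr, then hold.
    progress = min(1.0, (step - warmup_steps - stable_steps) / decay_steps)
    return peak_lr + (min_lr - peak_lr) * progress
```

# Evaluation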
| Category                  | Benchmark               | Llama3.1-8B | Qwen2-7B | Qwen2.5-7B | Falcon3-7B-Base |
|---------------------------|-------------------------|-------------|----------|------------|-----------------|
| General                   | MMLU (5-shot)           | 65.2        | 70.4     | 74.2       | 67.5            |
|                           | MMLU-PRO (5-shot)       | 32.7        | 42.1     | 43.5       | 39.2            |
|                           | IFEval                  | 12.0        | 30.6     | 33.9       | 34.3            |
| Math                      | GSM8K (5-shot)          | 49.4        | 77.9     | 82.9       | 76.2            |
|                           | MATH (4-shot)           | 4.1         | 17.5     | 15.5       | 18.0            |
| Reasoning                 | ARC Challenge (25-shot) | 53.4        | 57.4     | 59.0       | 59.6            |
|                           | GPQA (0-shot)           | 31.0        | 31.9     | 33.0       | 35.5            |
|                           | MUSR (0-shot)           | 38.0        | 44.1     | 44.2       | 47.3            |
|                           | BBH (3-shot)            | 46.5        | 53.3     | 54.0       | 51.0            |
| CommonSense Understanding | PIQA (0-shot)           | 80.3        | 79.8     | 78.7       | 77.7            |
|                           | SciQ (0-shot)           | 96.3        | 95.9     | 96.6       | 95.3            |
|                           | Winogrande (0-shot)     | 74.0        | 72.1     | 72.9       | 71.0            |
|                           | OpenbookQA (0-shot)     | 33.4        | 35.2     | 33.6       | 31.4            |
# Citation