library_name: transformers
tags:
  - bitnet
  - falcon3
base_model: tiiuae/Falcon3-1B-Instruct
license: other
license_name: falcon-llm-license
license_link: https://falconllm.tii.ae/falcon-terms-and-conditions.html

Table of Contents

  1. TL;DR
  2. Model Details
  3. Training Details
  4. Usage
  5. Evaluation
  6. Citation

TL;DR

Model Details

Model Description

  • Developed by: https://www.tii.ae
  • Model type: Causal decoder-only - instruct / chat version
  • Architecture: Pure-transformer - 1.58bit version
  • Language(s) (NLP): Mainly English
  • License: TII Falcon License 2.0

Training Details

The model was trained following the strategies described in the recent 1-bit LLM Hugging Face blog post and the 1-bit LLM paper. For details on the training protocol of this model, refer to the Compression section of the Falcon-3 technical report.
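To give a feel for what "1.58bit" means, here is a minimal sketch of ternary weight rounding, assuming the 1-bit LLM paper referenced above is the BitNet b1.58 paper, which scales weights by their mean absolute value and rounds them to {-1, 0, +1}. The function names are hypothetical and the code uses plain Python lists for illustration. It is not the Falcon3 training code, which is described in the technical report.

```python
import math

def absmean_ternary_quantize(weights, eps=1e-5):
    """Round a flat list of floats to {-1, 0, +1} with one shared scale.

    Hypothetical helper for illustration only (not part of any library).
    """
    # The scale is the mean absolute value of the weights ("absmean").
    scale = sum(abs(w) for w in weights) / len(weights)
    # Divide by the scale, round to the nearest integer, clip to [-1, 1].
    ternary = [max(-1, min(1, round(w / (scale + eps)))) for w in weights]
    return ternary, scale

def dequantize(ternary, scale):
    """Approximate the original weights from ternary values and the scale."""
    return [t * scale for t in ternary]

weights = [0.9, -0.05, 0.4, -1.2, 0.0, 0.3]
ternary, scale = absmean_ternary_quantize(weights)
print(ternary)                  # [1, 0, 1, -1, 0, 1]
print(round(math.log2(3), 2))   # 1.58 bits of information per ternary weight
```

Each weight takes one of three values, so it carries log2(3) ≈ 1.58 bits, which is where the name comes from.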

Usage

You can currently run this model with either the Hugging Face transformers library or the BitNet library. You can also try the model in the falcon-1.58bit playground (available only for the 7B instruct version).

🤗 transformers

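import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/Falcon3-1B-Instruct-1.58bit"

# Load the tokenizer and the model (bfloat16 weights, moved to the GPU)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
  model_id,
  torch_dtype=torch.bfloat16,
).to("cuda")

# Perform text generation (the prompt is only an example)
prompt = "Explain what a 1.58-bit language model is."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))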

BitNet

git clone https://github.com/microsoft/BitNet && cd BitNet
pip install -r requirements.txt
python setup_env.py --hf-repo tiiuae/Falcon3-1B-Instruct-1.58bit -q i2_s
python run_inference.py -m models/Falcon3-1B-Instruct-1.58bit/ggml-model-i2_s.gguf -p "You are a helpful assistant" -cnv

Evaluation

Coming soon.

Useful links

Citation

If the Falcon3 family of models was helpful to your work, feel free to cite us.

@misc{Falcon3,
    title = {The Falcon 3 Family of Open Models},
    author = {Falcon-LLM Team},
    month = {December},
    year = {2024}
}