Llama 3.1 8B with Chat Template

This is a Meta Llama 3.1 8B model configured with a chat template for conversational fine-tuning and inference.

Model Details

  • Base Model: Meta Llama 3.1 8B
  • Parameters: 8 billion
  • Architecture: LlamaForCausalLM
  • Context Length: 8,192 tokens (as configured here; Llama 3.1 natively supports up to 131,072)
  • Vocabulary Size: 128,256 tokens

Chat Template

This model includes a Jinja2 chat template that formats conversations using Llama 3's special tokens. The template handles user and assistant turns; <|begin_of_text|> is typically prepended by the tokenizer during tokenization rather than by the template itself:

{% if messages %}{% for message in messages %}{% if message['role'] == 'user' %}<|start_header_id|>user<|end_header_id|>

{{ message['content'] }}<|eot_id|>{% elif message['role'] == 'assistant' %}<|start_header_id|>assistant<|end_header_id|>

{{ message['content'] }}<|eot_id|>{% endif %}{% endfor %}{% if add_generation_prompt %}<|start_header_id|>assistant<|end_header_id|>

{% endif %}{% endif %}
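Rendered for a single user message with add_generation_prompt=True, the template produces (note the blank line after each role header):

<|start_header_id|>user<|end_header_id|>

Hello, how are you?<|eot_id|><|start_header_id|>assistant<|end_header_id|>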

Special Tokens

  • <|begin_of_text|>: Beginning of text token (ID: 128000)
  • <|end_of_text|>: End of text token (ID: 128001)
  • <|start_header_id|>: Start of role header (ID: 128006)
  • <|end_header_id|>: End of role header (ID: 128007)
  • <|eot_id|>: End of turn token (ID: 128009)
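You can verify these IDs directly against the tokenizer, as in this small sketch (using the same placeholder path as the Usage section below):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("path/to/model")
for tok in ["<|begin_of_text|>", "<|end_of_text|>",
            "<|start_header_id|>", "<|end_header_id|>", "<|eot_id|>"]:
    # convert_tokens_to_ids maps a token string to its vocabulary ID
    print(tok, tokenizer.convert_tokens_to_ids(tok))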

Usage

Loading the Model

from transformers import AutoModelForCausalLM, AutoTokenizer

# "path/to/model" can be a local directory or a Hugging Face Hub repo ID
model = AutoModelForCausalLM.from_pretrained("path/to/model")
tokenizer = AutoTokenizer.from_pretrained("path/to/model")
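Loaded in full F32 precision, an 8-billion-parameter model needs roughly 32 GB of memory. For single-GPU inference, a common alternative (a sketch, assuming torch and accelerate are installed) is to load the weights in bfloat16:

import torch
from transformers import AutoModelForCausalLM

# bfloat16 roughly halves memory versus F32 (~16 GB); device_map="auto"
# lets accelerate place the weights on available GPUs automatically
model = AutoModelForCausalLM.from_pretrained(
    "path/to/model",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)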

Using the Chat Template

messages = [
    {"role": "user", "content": "Hello, how are you?"},
    {"role": "assistant", "content": "I'm doing well, thank you! How can I help you today?"},
    {"role": "user", "content": "Can you explain what you are?"}
]

# Apply chat template
formatted_text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

# Tokenize (the tokenizer's post-processor typically prepends <|begin_of_text|>)
inputs = tokenizer(formatted_text, return_tensors="pt").to(model.device)

# Generate a response
outputs = model.generate(**inputs, max_new_tokens=256)

# Decode only the newly generated tokens, dropping the echoed prompt
response = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[-1]:],
    skip_special_tokens=True,
)

Fine-Tuning

This model is prepared for fine-tuning on conversational datasets. The chat template ensures consistent formatting during training and inference.
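A minimal preprocessing sketch (the JSONL file name and the "messages" column are assumptions for illustration, not part of this repository): render each conversation with the chat template, then tokenize it for the trainer below.

from datasets import load_dataset

def format_example(example):
    # "messages" is an assumed list of {"role": ..., "content": ...} dicts
    text = tokenizer.apply_chat_template(example["messages"], tokenize=False)
    return tokenizer(text, truncation=True, max_length=8192)

# conversations.jsonl is a hypothetical file with one {"messages": [...]} object per line
raw = load_dataset("json", data_files="conversations.jsonl")
train_dataset = raw["train"].map(
    format_example, remove_columns=raw["train"].column_names
)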

Example Fine-Tuning Setup

from transformers import DataCollatorForLanguageModeling, Trainer, TrainingArguments

training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=3,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,  # effective batch size of 16 per device
    learning_rate=2e-5,
    warmup_steps=100,
    logging_steps=10,
    save_steps=500,
)

# For causal-LM fine-tuning, mlm=False makes the collator copy input_ids
# into labels so the model is trained on next-token prediction
data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,  # tokenized datasets, e.g. from the sketch above
    eval_dataset=eval_dataset,
    data_collator=data_collator,
)

trainer.train()

Model Configuration

  • Hidden Size: 4,096
  • Intermediate Size: 14,336
  • Number of Attention Heads: 32
  • Number of Key-Value Heads: 8 (grouped-query attention)
  • Number of Layers: 32
  • RoPE Theta: 500,000
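All of these values can be read straight from the model's config.json, for example with AutoConfig:

from transformers import AutoConfig

config = AutoConfig.from_pretrained("path/to/model")
# GQA: 32 query heads share 8 key-value heads (4 query heads per KV head)
print(config.hidden_size, config.intermediate_size)
print(config.num_attention_heads, config.num_key_value_heads)
print(config.num_hidden_layers, config.rope_theta)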

License

This model inherits Meta's Llama 3.1 Community License. Please review the license terms before use.

Citation

@article{llama3,
  title={The Llama 3 Herd of Models},
  author={Llama Team, AI @ Meta},
  journal={arXiv preprint arXiv:2407.21783},
  year={2024}
}