Llama 3.1 8B with Chat Template

This is a Meta Llama 3.1 8B model configured with a chat template for conversational fine-tuning and inference.

Model Details

  • Base Model: Meta Llama 3.1 8B
  • Parameters: 8 billion
  • Architecture: LlamaForCausalLM
  • Context Length: 8,192 tokens (as configured here; Llama 3.1 natively supports up to 131,072)
  • Vocabulary Size: 128,256 tokens

Chat Template

This model includes a Jinja2 chat template that formats conversations using Llama 3's special tokens. The template handles user and assistant turns; <|begin_of_text|> is typically prepended by the tokenizer during tokenization rather than by the template itself:

{% if messages %}{% for message in messages %}{% if message['role'] == 'user' %}<|start_header_id|>user<|end_header_id|>

{{ message['content'] }}<|eot_id|>{% elif message['role'] == 'assistant' %}<|start_header_id|>assistant<|end_header_id|>

{{ message['content'] }}<|eot_id|>{% endif %}{% endfor %}{% if add_generation_prompt %}<|start_header_id|>assistant<|end_header_id|>

{% endif %}{% endif %}
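Rendered for a single user message with add_generation_prompt=True, the template produces (note the blank line after each role header):

<|start_header_id|>user<|end_header_id|>

Hello, how are you?<|eot_id|><|start_header_id|>assistant<|end_header_id|>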

Special Tokens

  • <|begin_of_text|>: Beginning of text token (ID: 128000)
  • <|end_of_text|>: End of text token (ID: 128001)
  • <|start_header_id|>: Start of role header (ID: 128006)
  • <|end_header_id|>: End of role header (ID: 128007)
  • <|eot_id|>: End of turn token (ID: 128009)
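You can verify these IDs directly against the tokenizer, as in this small sketch (using the same placeholder path as the Usage section below):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("path/to/model")
for tok in ["<|begin_of_text|>", "<|end_of_text|>",
            "<|start_header_id|>", "<|end_header_id|>", "<|eot_id|>"]:
    # convert_tokens_to_ids maps a token string to its vocabulary ID
    print(tok, tokenizer.convert_tokens_to_ids(tok))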

Usage

Loading the Model

from transformers import AutoModelForCausalLM, AutoTokenizer

# "path/to/model" can be a local directory or a Hugging Face Hub repo ID
model = AutoModelForCausalLM.from_pretrained("path/to/model")
tokenizer = AutoTokenizer.from_pretrained("path/to/model")
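Loaded in full F32 precision, an 8-billion-parameter model needs roughly 32 GB of memory. For single-GPU inference, a common alternative (a sketch, assuming torch and accelerate are installed) is to load the weights in bfloat16:

import torch
from transformers import AutoModelForCausalLM

# bfloat16 roughly halves memory versus F32 (~16 GB); device_map="auto"
# lets accelerate place the weights on available GPUs automatically
model = AutoModelForCausalLM.from_pretrained(
    "path/to/model",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)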

Using the Chat Template

messages = [
    {"role": "user", "content": "Hello, how are you?"},
    {"role": "assistant", "content": "I'm doing well, thank you! How can I help you today?"},
    {"role": "user", "content": "Can you explain what you are?"}
]

# Apply chat template
formatted_text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

# Tokenize (the tokenizer's post-processor typically prepends <|begin_of_text|>)
inputs = tokenizer(formatted_text, return_tensors="pt").to(model.device)

# Generate a response
outputs = model.generate(**inputs, max_new_tokens=256)

# Decode only the newly generated tokens, dropping the echoed prompt
response = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[-1]:],
    skip_special_tokens=True,
)

Fine-Tuning

This model is prepared for fine-tuning on conversational datasets. The chat template ensures consistent formatting during training and inference.
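A minimal preprocessing sketch (the JSONL file name and the "messages" column are assumptions for illustration, not part of this repository): render each conversation with the chat template, then tokenize it for the trainer below.

from datasets import load_dataset

def format_example(example):
    # "messages" is an assumed list of {"role": ..., "content": ...} dicts
    text = tokenizer.apply_chat_template(example["messages"], tokenize=False)
    return tokenizer(text, truncation=True, max_length=8192)

# conversations.jsonl is a hypothetical file with one {"messages": [...]} object per line
raw = load_dataset("json", data_files="conversations.jsonl")
train_dataset = raw["train"].map(
    format_example, remove_columns=raw["train"].column_names
)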

Example Fine-Tuning Setup

from transformers import DataCollatorForLanguageModeling, Trainer, TrainingArguments

training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=3,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,  # effective batch size of 16 per device
    learning_rate=2e-5,
    warmup_steps=100,
    logging_steps=10,
    save_steps=500,
)

# For causal-LM fine-tuning, mlm=False makes the collator copy input_ids
# into labels so the model is trained on next-token prediction
data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,  # tokenized datasets, e.g. from the sketch above
    eval_dataset=eval_dataset,
    data_collator=data_collator,
)

trainer.train()

Model Configuration

  • Hidden Size: 4,096
  • Intermediate Size: 14,336
  • Number of Attention Heads: 32
  • Number of Key-Value Heads: 8 (grouped-query attention)
  • Number of Layers: 32
  • RoPE Theta: 500,000
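All of these values can be read straight from the model's config.json, for example with AutoConfig:

from transformers import AutoConfig

config = AutoConfig.from_pretrained("path/to/model")
# GQA: 32 query heads share 8 key-value heads (4 query heads per KV head)
print(config.hidden_size, config.intermediate_size)
print(config.num_attention_heads, config.num_key_value_heads)
print(config.num_hidden_layers, config.rope_theta)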

License

This model inherits Meta's Llama 3.1 Community License. Please review the license terms before use.

Citation

@article{llama3,
  title={The Llama 3 Herd of Models},
  author={Llama Team, AI @ Meta},
  journal={arXiv preprint arXiv:2407.21783},
  year={2024}
}