kaitchup/Meta-Llama-3-8B-xLAM-Adapter

Model Details

This is an adapter for meta-llama/Meta-Llama-3-8B, fine-tuned for function calling on the xLAM dataset. The adapter is undertrained; its main purpose is to test the function-calling capabilities of LLMs.

import torch, os
from peft import PeftModel
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer
)

# Use bf16 and FlashAttention 2 if the GPU supports bf16; otherwise fall back to fp16 and SDPA.
if torch.cuda.is_bf16_supported():
    os.system('pip install flash_attn')
    compute_dtype = torch.bfloat16
    attn_implementation = 'flash_attention_2'
else:
    compute_dtype = torch.float16
    attn_implementation = 'sdpa'

adapter = "kaitchup/Meta-Llama-3-8B-xLAM-Adapter"
model_name = "meta-llama/Meta-Llama-3-8B"
tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=True)

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=compute_dtype,
    device_map={"": 0},
    attn_implementation=attn_implementation,
)

model = PeftModel.from_pretrained(model, adapter)

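# Prompt format: the query wrapped in <user> tags, followed by an opening <tools> tag.
prompt = "<user>Check if the numbers 8 and 1233 are powers of two.</user>\n\n<tools>"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# Greedy decoding: temperature is not used when do_sample=False, so it is not passed.
outputs = model.generate(**inputs, do_sample=False, max_new_tokens=150)
# outputs[0] contains the prompt tokens followed by the generated tokens.
result = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(result)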
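
The example above hard-codes the prompt and prints the prompt together with the generated text. The following is a minimal sketch of two small helpers that may be convenient when testing several queries. The names `build_prompt` and `strip_prompt_tokens` are hypothetical and not part of this adapter or of any library; the prompt format is simply copied from the example above.

```python
def build_prompt(query: str) -> str:
    """Wrap a user query in the prompt format used in the example above."""
    return f"<user>{query}</user>\n\n<tools>"


def strip_prompt_tokens(output_ids, prompt_length: int):
    """Return only the newly generated token ids.

    For decoder-only models, generate() returns the prompt tokens followed
    by the new tokens, so dropping the first prompt_length ids leaves the
    completion. Works on lists and on 1-D tensors alike.
    """
    return output_ids[prompt_length:]


prompt = build_prompt("Check if the numbers 8 and 1233 are powers of two.")
print(repr(prompt))

# With the model and tokenizer loaded as in the example above (not run here):
# inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# outputs = model.generate(**inputs, do_sample=False, max_new_tokens=150)
# new_ids = strip_prompt_tokens(outputs[0], inputs["input_ids"].shape[1])
# print(tokenizer.decode(new_ids, skip_special_tokens=True))
```

Slicing by token count rather than by string prefix avoids relying on the decoded text reproducing the prompt character for character.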

Developed by: The Kaitchup
Language(s) (NLP): English
License: cc-by-4.0
