PEFT
Safetensors
English

Model Details

This is an adapter for meta-llama/Meta-Llama-3-8B fine-tuned for function calling on xLAM. This adapter is undertrained. Its main purpose is for testing function calling capabilities of LLMs.

import torch, os
from peft import PeftModel
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer
)

#use bf16 and FlashAttention if supported
if torch.cuda.is_bf16_supported():
  os.system('pip install flash_attn')
  compute_dtype = torch.bfloat16
  attn_implementation = 'flash_attention_2'
else:
  compute_dtype = torch.float16
  attn_implementation = 'sdpa'

adapter= "kaitchup/Meta-Llama-3-8B-xLAM-Adapter"
model_name = "meta-llama/Meta-Llama-3-8B"
tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=True)

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=compute_dtype,
    device_map={"": 0},
    attn_implementation=attn_implementation,
)

model = PeftModel.from_pretrained(model, adapter)

prompt = "<user>Check if the numbers 8 and 1233 are powers of two.</user>\n\n<tools>"
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, do_sample=False, temperature=0.0, max_new_tokens=150)
result = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(result)
  • Developed by: The Kaitchup
  • Language(s) (NLP): English
  • License: cc-by-4.0
Downloads last month
19
Inference API
Unable to determine this model’s pipeline type. Check the docs .

Dataset used to train kaitchup/Meta-Llama-3-8B-xLAM-Adapter