---
base_model: unsloth/Meta-Llama-3.1-8B
tags:
- text-generation-inference
- transformers
- unsloth
- llama
- trl
license: apache-2.0
language:
- en
- ur
---

# Model Card for Alif Llama 3.1 8B Instruct

**Alif Llama 3.1 8B Instruct** is an open-weight model for multilingual reasoning in English and Urdu. It is trained on human-refined multilingual synthetic data paired with reasoning, with the aim of improving cultural nuance and reasoning in both languages.

- **Developed by:** large-traversaal
- **License:** apache-2.0
- **Finetuned from model:** unsloth/Meta-Llama-3.1-8B
- **Model:** Alif Llama 3.1 8B Instruct
- **Model Size:** 8 billion parameters

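For a rough sense of what 8 billion parameters means for hardware, the sketch below estimates the memory needed to hold the weights alone at several precisions. The helper `weight_memory_gb` is hypothetical and written for illustration. The figures ignore activations, the KV cache, quantization metadata, and framework overhead, so real usage will be higher.

```python
def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate memory for the weights alone, in gigabytes (10^9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# Parameter count taken from the "Model Size" entry above (rounded figure).
NUM_PARAMS = 8e9

# Assumed storage widths; these are generic precisions, not measured values.
for label, bits in [("float32", 32), ("float16/bfloat16", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label:>17}: ~{weight_memory_gb(NUM_PARAMS, bits):.0f} GB")
```

Under these assumptions, 4-bit weights need roughly a quarter of the memory of 16-bit weights, which is the motivation for quantized loading on a single consumer GPU.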
This Llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.

### How to Use Alif Llama

Install the `transformers` library, along with `accelerate` and `bitsandbytes` (needed for `device_map="auto"` and 4-bit loading), then load Alif Llama 3.1 8B Instruct as follows:

```python
import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    pipeline,
)

model_id = "large-traversaal/Alif-Llama-3.1-8B-Instruct"

# 4-bit NF4 quantization configuration (requires the bitsandbytes package)
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
)

# Load tokenizer and model in 4-bit
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quantization_config,
    device_map="auto",
)

# Create the text-generation pipeline. The model was already placed on its
# devices by device_map="auto" above, so no device argument is passed here.
chatbot = pipeline("text-generation", model=model, tokenizer=tokenizer)


def chat(message):
    """Generate a response; the returned text includes the prompt."""
    response = chatbot(message, max_new_tokens=100, do_sample=True, temperature=0.3)
    return response[0]["generated_text"]


# Example chat (Urdu for "What is the importance of the city of Karachi?")
user_input = "شہر کراچی کی کیا اہمیت ہے؟"
bot_response = chat(user_input)

print(bot_response)
```

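By default, the `text-generation` pipeline returns the prompt followed by the continuation in `generated_text`, so `chat` echoes the question back before the answer. The sketch below shows one way to keep only the reply. `extract_reply` is a hypothetical helper, not part of the model's or the library's API, and the sample strings are made up so the snippet runs without loading the model.

```python
def extract_reply(prompt: str, generated_text: str) -> str:
    """Return only the continuation when the output starts with the prompt."""
    if generated_text.startswith(prompt):
        return generated_text[len(prompt):].strip()
    # Fall back to the full text if the prompt was not echoed verbatim.
    return generated_text.strip()


# Stand-in for response[0]["generated_text"]; the reply text is illustrative only.
prompt = "What is the capital of Pakistan?"
generated = prompt + " Islamabad is the capital of Pakistan."
print(extract_reply(prompt, generated))
```

An alternative is to pass `return_full_text=False` in the pipeline call, which asks `transformers` to drop the prompt itself.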
## Model Details

**Input**: The model accepts text input only.

**Output**: The model generates text only.

**Model Architecture**: Alif Llama 3.1 8B Instruct is an auto-regressive language model that uses an optimized transformer architecture. Post-training includes continued pretraining and supervised fine-tuning.

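To illustrate what "auto-regressive" means in practice, here is a toy sketch of the generation loop: the model scores every candidate next token given all tokens so far, one token is sampled, and the loop repeats. It also shows how a temperature such as 0.3 sharpens the sampling distribution. Everything here is an assumption for illustration: `toy_logits` is a hypothetical stand-in for a real model, the vocabulary has five tokens, and none of this is the actual Llama or `transformers` implementation.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities; lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def generate(next_token_logits, prompt, max_new_tokens, temperature, seed=0):
    """Auto-regressive loop: sample one token, append it, repeat."""
    rng = random.Random(seed)
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        logits = next_token_logits(tokens)  # scores depend on all tokens so far
        probs = softmax_with_temperature(logits, temperature)
        next_token = rng.choices(range(len(probs)), weights=probs, k=1)[0]
        tokens.append(next_token)
    return tokens


def toy_logits(tokens):
    """Hypothetical stand-in for a model: favors the token after the last one."""
    vocab_size = 5
    favored = (tokens[-1] + 1) % vocab_size
    return [3.0 if i == favored else 0.0 for i in range(vocab_size)]


# Compare how peaked the distribution is at two temperatures.
print(softmax_with_temperature([3.0, 0.0, 0.0], temperature=1.0))
print(softmax_with_temperature([3.0, 0.0, 0.0], temperature=0.3))

# Generate 8 toy tokens after a one-token prompt.
print(generate(toy_logits, prompt=[0], max_new_tokens=8, temperature=0.3))
```

Lower temperatures concentrate probability on the highest-scoring token, which tends to make output more deterministic. Higher values spread probability across more candidates.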
For more details about how the model was trained, check out [our blog post]().

### Evaluation

### Model Card Contact

For errors or additional questions about details in this model card, contact.