
	---
	base_model: unsloth/Meta-Llama-3.1-8B
	tags:
	- text-generation-inference
	- transformers
	- unsloth
	- llama
	- trl
	license: apache-2.0
	language:
	- en
	- ur
	---


	# Model Card for Alif Llama 3.1 8B Instruct

	Alif Llama 3.1 8B Instruct is an open-weight model with multilingual reasoning capabilities. It uses human-refined multilingual synthetic data paired with reasoning to improve cultural nuance and reasoning in English and Urdu.

	- Developed by: large-traversaal
	- License: apache-2.0
	- Finetuned from model: unsloth/Meta-Llama-3.1-8B
	- Model: Alif Llama 3.1 8B Instruct
	- Model Size: 8 billion parameters

	This Llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.


	## How to Use Alif Llama

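	Install the transformers library, along with bitsandbytes and accelerate (needed for the 4-bit quantization and `device_map="auto"` used below), then load Alif Llama 3.1 8B Instruct as follows: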

	```python
	import torch
	from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, pipeline

	model_id = "large-traversaal/Alif-Llama-3.1-8B-Instruct"

	# 4-bit quantization configuration (NF4 with double quantization)
	quantization_config = BitsAndBytesConfig(
	    load_in_4bit=True,
	    bnb_4bit_compute_dtype=torch.float16,
	    bnb_4bit_use_double_quant=True,
	    bnb_4bit_quant_type="nf4",
	)

	# Load tokenizer and model in 4-bit; device_map="auto" places the weights on the available devices
	tokenizer = AutoTokenizer.from_pretrained(model_id)
	model = AutoModelForCausalLM.from_pretrained(
	    model_id,
	    quantization_config=quantization_config,
	    device_map="auto",
	)

	# Create the text generation pipeline (the model is already placed on its devices above)
	chatbot = pipeline("text-generation", model=model, tokenizer=tokenizer)


	def chat(message):
	    """Generate a reply; by default the returned text contains the prompt followed by the continuation."""
	    response = chatbot(message, max_new_tokens=100, do_sample=True, temperature=0.3)
	    return response[0]["generated_text"]


	# Example chat (Urdu: "What is the importance of the city of Karachi?")
	user_input = "شہر کراچی کی کیا اہمیت ہے؟"
	bot_response = chat(user_input)

	print(bot_response)
	```
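
As a rough guide to why the example above loads the model in 4-bit, the memory needed for the weights can be estimated from the parameter count. The sketch below is back-of-the-envelope arithmetic only: `estimate_weight_memory_gib` is a hypothetical helper, the parameter count is assumed to be exactly 8 billion (the rounded size stated in this card), and the estimate ignores quantization constants, any layers left unquantized, activations, and the KV cache, so real usage will be higher.

```python
def estimate_weight_memory_gib(num_params: float, bits_per_param: float) -> float:
    """Approximate memory for the weights alone, in GiB."""
    total_bytes = num_params * bits_per_param / 8
    return total_bytes / 1024**3


NUM_PARAMS = 8e9  # assumption: the rounded "8 billion parameters" from this card

fp16_gib = estimate_weight_memory_gib(NUM_PARAMS, 16)  # 16 bits per weight
nf4_gib = estimate_weight_memory_gib(NUM_PARAMS, 4)    # 4 bits per weight

print(f"float16 weights: ~{fp16_gib:.1f} GiB")
print(f"4-bit weights:   ~{nf4_gib:.1f} GiB")
```

Under these assumptions the 4-bit weights take about a quarter of the float16 footprint, which is what makes the model practical on a single consumer GPU.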

	## Model Details

	Input: The model accepts text input only.

	Output: The model generates text only.

	Model Architecture: Alif Llama 3.1 8B Instruct is an auto-regressive language model that uses an optimized transformer architecture. Post-training includes continued pretraining and supervised finetuning.
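
To illustrate what "auto-regressive" means here, the toy sketch below generates one token at a time and feeds each new token back in as context for the next step. It is not the model's actual implementation: `toy_next_token` is a hypothetical stand-in for the transformer's forward pass (it looks only at the last token, whereas a real model attends to the whole sequence), and greedy selection is assumed for simplicity.

```python
from typing import Callable, List


def generate(
    next_token: Callable[[List[str]], str],
    prompt: List[str],
    max_new_tokens: int,
    eos: str = "<eos>",
) -> List[str]:
    """Extend `prompt` one token at a time until `eos` or the length limit."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        token = next_token(tokens)  # the predictor receives every token so far
        if token == eos:
            break
        tokens.append(token)  # the new token becomes part of the next input
    return tokens


# Hypothetical stand-in for a real model: a fixed lookup keyed on the last token.
TOY_TABLE = {"Karachi": "is", "is": "a", "a": "port", "port": "city", "city": "<eos>"}


def toy_next_token(tokens: List[str]) -> str:
    return TOY_TABLE.get(tokens[-1], "<eos>")


result = generate(toy_next_token, ["Karachi"], max_new_tokens=10)
print(" ".join(result))
```

The `max_new_tokens` argument plays the same role as in the pipeline call shown earlier: it caps how many steps the loop runs if no end-of-sequence token appears.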

	For more details about how the model was trained, check out [our blogpost]().


	### Evaluation



	### Model Card Contact

	For errors or additional questions about details in this model card, contact.