
Note: this model is not working. It is an experimental attempt to quantize it. I understand the error, but I run into it as well, and I am fairly inexperienced with this. If anyone knows how to manually set the layer sizes, please help. Thank you!

GGUF quantization made by Richard Erkhov.

Github

Linkedin

Discord

Request more models

MobiLlama-1B-Chat - GGUF

| Name | Quant method | Bits | Size | Use case |
| --- | --- | --- | --- | --- |
| MobiLlama-1B-Chat.Q2_K.gguf | Q2_K | 2 | 0.47GB | significant quality loss - not recommended for most purposes |
| MobiLlama-1B-Chat.Q3_K_S.gguf | Q3_K_S | 3 | 0.53GB | very small, high quality loss |
| MobiLlama-1B-Chat.Q3_K_M.gguf | Q3_K_M | 3 | 0.59GB | very small, high quality loss |
| MobiLlama-1B-Chat.Q3_K_L.gguf | Q3_K_L | 3 | 0.63GB | small, substantial quality loss |
| MobiLlama-1B-Chat.Q4_0.gguf | Q4_0 | 4 | 0.68GB | legacy; small, very high quality loss - prefer using Q3_K_M |
| MobiLlama-1B-Chat.Q4_K_S.gguf | Q4_K_S | 4 | 0.68GB | small, greater quality loss |
| MobiLlama-1B-Chat.Q4_K_M.gguf | Q4_K_M | 4 | 0.72GB | medium, balanced quality - recommended |
| MobiLlama-1B-Chat.Q5_0.gguf | Q5_0 | 5 | 0.82GB | legacy; medium, balanced quality - prefer using Q4_K_M |
| MobiLlama-1B-Chat.Q5_K_S.gguf | Q5_K_S | 5 | 0.82GB | large, low quality loss - recommended |
| MobiLlama-1B-Chat.Q5_K_M.gguf | Q5_K_M | 5 | 0.84GB | large, very low quality loss - recommended |
| MobiLlama-1B-Chat.Q6_K.gguf | Q6_K | 6 | 0.96GB | very large, extremely low quality loss |
| MobiLlama-1B-Chat.Q8_0.gguf | Q8_0 | 8 | 1.25GB | very large, extremely low quality loss - not recommended |
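
The "Bits" column is a nominal figure. As a rough cross-check, the effective bits per weight can be estimated from each file size. The sketch below assumes the sizes are decimal gigabytes and that the model has the 1.2B parameters listed in the original model card; `bits_per_weight` is a hypothetical helper, and the result ignores GGUF metadata and any tensors kept at higher precision.

```python
# Rough estimate of effective bits per weight from the GGUF file sizes.
# Assumptions (not confirmed by the source): sizes are decimal GB,
# and the parameter count is the 1.2B quoted in the original model card.
PARAM_COUNT = 1.2e9
BYTES_PER_GB = 1e9

# File sizes in GB, copied from the table of quantized files.
FILE_SIZES_GB = {
    "Q2_K": 0.47,
    "Q3_K_S": 0.53,
    "Q3_K_M": 0.59,
    "Q3_K_L": 0.63,
    "Q4_0": 0.68,
    "Q4_K_S": 0.68,
    "Q4_K_M": 0.72,
    "Q5_0": 0.82,
    "Q5_K_S": 0.82,
    "Q5_K_M": 0.84,
    "Q6_K": 0.96,
    "Q8_0": 1.25,
}


def bits_per_weight(size_gb, n_params=PARAM_COUNT):
    """Return file size in bits divided by the number of parameters."""
    return size_gb * BYTES_PER_GB * 8 / n_params


for name, size_gb in FILE_SIZES_GB.items():
    print(f"{name:7s} {size_gb:.2f} GB  ~{bits_per_weight(size_gb):.2f} bits/weight")
```

Under these assumptions the effective figure sits somewhat above the nominal bit width for every file, which is expected because block scales and non-quantized tensors add overhead.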

Original model description:

license: apache-2.0
datasets:
- WizardLM/WizardLM_evol_instruct_V2_196k
- icybee/share_gpt_90k_v1
language:
- en
library_name: transformers
pipeline_tag: text-generation

MobiLlama-1B-Chat

We present MobiLlama-1.2B-Chat, an instruction-following model fine-tuned from MBZUAI/MobiLlama-1B.

Model Description

  • Model type: Language model with the same architecture as LLaMA-7B
  • Language(s) (NLP): English
  • License: Apache 2.0
  • Resources for more information:

Loading MobiLlama-1B-Chat

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("MBZUAI/MobiLlama-1B-Chat", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("MBZUAI/MobiLlama-1B-Chat", trust_remote_code=True)

# Template adapted from FastChat
template = "A chat between a curious human and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the human's questions.\n### Human: Got any creative ideas for a 10 year old’s birthday?\n### Assistant: Of course! Here are some creative ideas for a 10-year-old's birthday party:\n1. Treasure Hunt: Organize a treasure hunt in your backyard or nearby park. Create clues and riddles for the kids to solve, leading them to hidden treasures and surprises.\n2. Science Party: Plan a science-themed party where kids can engage in fun and interactive experiments. You can set up different stations with activities like making slime, erupting volcanoes, or creating simple chemical reactions.\n3. Outdoor Movie Night: Set up a backyard movie night with a projector and a large screen or white sheet. Create a cozy seating area with blankets and pillows, and serve popcorn and snacks while the kids enjoy a favorite movie under the stars.\n4. DIY Crafts Party: Arrange a craft party where kids can unleash their creativity. Provide a variety of craft supplies like beads, paints, and fabrics, and let them create their own unique masterpieces to take home as party favors.\n5. Sports Olympics: Host a mini Olympics event with various sports and games. Set up different stations for activities like sack races, relay races, basketball shooting, and obstacle courses. Give out medals or certificates to the participants.\n6. Cooking Party: Have a cooking-themed party where the kids can prepare their own mini pizzas, cupcakes, or cookies. Provide toppings, frosting, and decorating supplies, and let them get hands-on in the kitchen.\n7. Superhero Training Camp: Create a superhero-themed party where the kids can engage in fun training activities. Set up an obstacle course, have them design their own superhero capes or masks, and organize superhero-themed games and challenges.\n8. Outdoor Adventure: Plan an outdoor adventure party at a local park or nature reserve. Arrange activities like hiking, nature scavenger hunts, or a picnic with games. Encourage exploration and appreciation for the outdoors.\nRemember to tailor the activities to the birthday child's interests and preferences. Have a great celebration!\n### Human: {prompt}\n### Assistant:"

prompt = "What are the psychological effects of urban living on mental health?"

input_str = template.format(prompt=prompt)
input_ids = tokenizer(input_str, return_tensors="pt").input_ids
outputs = model.generate(input_ids, max_length=1000, pad_token_id=tokenizer.eos_token_id)
print(tokenizer.batch_decode(outputs[:, input_ids.shape[1]:-1])[0].strip())
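
The template above follows a simple pattern: a system sentence, then alternating `### Human:` and `### Assistant:` turns, ending with an open `### Assistant:` line. A minimal sketch of assembling that layout programmatically is shown below. `build_prompt` and `SYSTEM` are hypothetical names introduced here for illustration, and whether the model responds as well without the built-in birthday example is not stated in the source.

```python
# Hypothetical helper that reproduces the "### Human / ### Assistant" layout
# used by the template in the loading example. Not part of the original card.
SYSTEM = (
    "A chat between a curious human and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the human's questions."
)


def build_prompt(history, prompt, system=SYSTEM):
    """Build a prompt string.

    history: list of (human_text, assistant_text) pairs from earlier turns.
    prompt:  the new human message the model should answer.
    """
    parts = [system]
    for human_text, assistant_text in history:
        parts.append(f"### Human: {human_text}")
        parts.append(f"### Assistant: {assistant_text}")
    parts.append(f"### Human: {prompt}")
    parts.append("### Assistant:")  # left open for the model to complete
    return "\n".join(parts)


input_str = build_prompt(
    history=[("Hello!", "Hi, how can I help you today?")],
    prompt="What are the psychological effects of urban living on mental health?",
)
print(input_str)
```

The resulting `input_str` can be passed to the tokenizer in place of `template.format(prompt=prompt)`. Building the string with `join` also avoids `str.format` errors when a user message contains literal braces.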

Alternatively, you may use FastChat:

python3 -m fastchat.serve.cli --model-path MBZUAI/MobiLlama-1B-Chat

Hyperparameters

| Hyperparameter | Value |
| --- | --- |
| Total Parameters | 1.2B |
| Hidden Size | 2048 |
| Intermediate Size (MLPs) | 5632 |
| Number of Attention Heads | 32 |
| Number of Hidden Layers | 22 |
| RMSNorm ɛ | 1e-5 |
| Max Seq Length | 2048 |
| Vocab Size | 32000 |

| Training Hyperparameter | Value |
| --- | --- |
| learning_rate | 2e-5 |
| num_train_epochs | 3 |
| per_device_train_batch_size | 2 |
| gradient_accumulation_steps | 16 |
| warmup_ratio | 0.04 |
| model_max_length | 2048 |
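
As a sanity check on the "Total Parameters" figure, the count can be estimated from the other architecture values. The sketch below assumes a standard LLaMA-style block (four bias-free attention projections, a three-matrix gated MLP, two RMSNorm weights per layer), untied input and output embeddings, and no weight sharing or grouped-query attention. None of these details are confirmed by the source beyond "same architecture as LLaMA-7B", and `LlamaDims` and `estimate_params` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class LlamaDims:
    # Values taken from the hyperparameter table.
    vocab_size: int = 32000
    hidden_size: int = 2048
    intermediate_size: int = 5632
    num_layers: int = 22


def estimate_params(d, tied_embeddings=False):
    """Estimate the parameter count of a LLaMA-style decoder.

    Assumes full multi-head attention, no biases, and no weight sharing
    between layers (assumptions, not confirmed by the model card).
    """
    embed = d.vocab_size * d.hidden_size
    lm_head = 0 if tied_embeddings else embed
    attn = 4 * d.hidden_size * d.hidden_size       # q, k, v, o projections
    mlp = 3 * d.hidden_size * d.intermediate_size  # gate, up, down projections
    norms = 2 * d.hidden_size                      # two RMSNorm weights per layer
    final_norm = d.hidden_size
    return embed + lm_head + d.num_layers * (attn + mlp + norms) + final_norm


total = estimate_params(LlamaDims())
print(f"{total:,} parameters (~{total / 1e9:.2f}B)")
```

Under these assumptions the estimate comes out near 1.26B, in the same range as the 1.2B listed. If the actual model shares or groups any weights, the real count will differ.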

Evaluation

| Evaluation Benchmark | MobiLlama-05B-Chat | MobiLlama-1.2B-Chat |
| --- | --- | --- |
| HellaSwag | 0.5042 | 0.6244 |
| MMLU | 0.2677 | 0.2635 |
| Arc Challenge | 0.2935 | 0.3558 |
| TruthfulQA | 0.3997 | 0.3848 |
| CrowsPairs | 0.5694 | 0.679 |
| PIQA | 0.7078 | 0.7557 |
| Race | 0.3320 | 0.3598 |
| SIQA | 0.4165 | 0.4396 |
| Winogrande | 0.5659 | 0.5966 |

Intended Uses

Given the nature of the training data, the MobiLlama-1B model is best suited for prompts using the QA format, the chat format, and the code format.

Citation