metadata

license: llama3.2
language:
  - en
  - ja
  - de
  - fr
  - it
  - pt
  - hi
  - es
  - th
library_name: transformers
pipeline_tag: text-generation
base_model: meta-llama/Llama-3.2-3B
datasets:
  - ryota39/izumi-lab-dpo-45k
  - Aratako/Magpie-Tanuki-8B-97k
  - kunishou/databricks-dolly-15k-ja
  - kunishou/oasst1-89k-ja
tags:
  - llama3.2

Preface

Small-parameter large language models (LLMs) matter because they balance performance and efficiency. As LLMs grow more sophisticated, the trade-off between model size and compute demand becomes critical. A smaller model uses less memory, runs inference faster, and consumes less energy, while still retaining much of the accuracy and contextual understanding of larger models. These properties are particularly valuable in real-world applications where processing power and storage are limited, such as mobile devices, edge deployments, and low-latency environments.
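To put the memory argument in rough numbers, the sketch below estimates the space needed for the weights alone at a few precisions. The parameter count of about 3.21 billion is an assumption based on the published size of the Llama 3.2 3B base model, and the helper name weight_memory_gib is hypothetical. Activations, the KV cache, and framework overhead are not included.

```python
GIB = 1024 ** 3  # bytes in one gibibyte

# Assumption: approximate parameter count of the Llama 3.2 3B base model.
ASSUMED_NUM_PARAMS = 3.21e9


def weight_memory_gib(num_params: float, bytes_per_param: float) -> float:
    """Estimate the memory taken by the model weights alone, in GiB."""
    return num_params * bytes_per_param / GIB


# Compare a few common storage precisions for the weights.
for name, nbytes in [("float32", 4), ("bfloat16", 2), ("int8", 1)]:
    print(f"{name}: {weight_memory_gib(ASSUMED_NUM_PARAMS, nbytes):.2f} GiB")
```

Under this assumption, the weights come to roughly 6 GiB in bfloat16, which is the precision used in the usage example on this card.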

Llama 3.2 Chibi 3B

This experimental model is the result of continual pre-training of Meta's Llama 3.2 3B on a small mixture of Japanese datasets.

Architecture

Llama 3.2 3B

Training

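The model was trained on the following mixture of datasets, as listed in the metadata above:

- ryota39/izumi-lab-dpo-45k
- Aratako/Magpie-Tanuki-8B-97k
- kunishou/databricks-dolly-15k-ja
- kunishou/oasst1-89k-ja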

Contributors

Hammaam

How to use

With transformers >= 4.43.0, you can run conversational inference using the Transformers pipeline abstraction or the Auto classes with the generate() function.

Make sure to update your transformers installation via pip install --upgrade transformers.

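import torch
from transformers import pipeline

model_id = "AELLM/Llama-3.2-Chibi-3B"

# Load the model in bfloat16; device_map="auto" places it on the available device(s).
pipe = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Japanese prompt meaning "The key to life is"; the model continues the text.
pipe("人生の鍵は")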
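The Auto classes route mentioned above can be sketched as follows. This is a minimal example, not a setup from the model authors: the sampling settings in GENERATION_KWARGS and the helper name complete are illustrative assumptions, and only the model ID and the bfloat16 / device_map="auto" loading options are taken from this card.

```python
MODEL_ID = "AELLM/Llama-3.2-Chibi-3B"

# Assumption: illustrative sampling settings, not values recommended by the authors.
GENERATION_KWARGS = {
    "max_new_tokens": 64,
    "do_sample": True,
    "temperature": 0.7,
    "top_p": 0.9,
}


def complete(prompt: str, model_id: str = MODEL_ID, **overrides) -> str:
    """Return the prompt followed by the model's continuation."""
    # Imported here so the heavy dependencies load only when generation runs.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,
        device_map="auto",
    )

    # Tokenize the prompt and move the tensors to the model's device.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    # Caller-supplied overrides take precedence over the defaults above.
    kwargs = {**GENERATION_KWARGS, **overrides}
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            pad_token_id=tokenizer.eos_token_id,  # reuse EOS as the padding token
            **kwargs,
        )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call (downloads the model weights on first use):
# print(complete("人生の鍵は", max_new_tokens=32))
```

Compared with the pipeline, this form exposes tokenization and decoding directly, which helps when you need to control generation parameters per call.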

License

Refer to the Llama 3.2 Community License.

References

@inproceedings{zheng2024llamafactory,
  title={LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models},
  author={Yaowei Zheng and Richong Zhang and Junhao Zhang and Yanhan Ye and Zheyan Luo and Zhangchi Feng and Yongqiang Ma},
  booktitle={Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)},
  address={Bangkok, Thailand},
  publisher={Association for Computational Linguistics},
  year={2024},
  url={http://arxiv.org/abs/2403.13372}
}