AELLM committed · Commit f238eba · verified · 1 Parent(s): d9ff127

Update README.md

Files changed (1): README.md (+4, -12)

README.md (as updated):

tags:
- llama3.2
---

<img src="./chibi.jpg" alt="chibi img" width="500"/>

## Preface

The importance of a small-parameter large language model (LLM) lies in its ability to balance performance and efficiency. As LLMs grow increasingly sophisticated, the trade-off between model size and computational resource demands becomes critical. A smaller model offers significant advantages, such as reduced memory usage, faster inference times, and lower energy consumption, while retaining a high level of accuracy and contextual understanding. These models are particularly valuable in real-world applications where resources such as processing power and storage are limited, for example on mobile devices, at the edge, or in low-latency environments.

## Llama 3.2 Chibi 3B

This experimental model is the result of continual pre-training of [Meta's Llama 3.2 3B](https://huggingface.co/meta-llama/Llama-3.2-3B) on a small mixture of Japanese datasets.
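
The card names the base checkpoint and the data mixture but not the training setup (the References cite LlamaFactory, which was presumably the toolkit used). Purely as an illustration of what continual pre-training of a causal LM looks like, here is a self-contained sketch using the transformers Trainer; the tiny in-memory corpus and every hyperparameter are placeholders, not this model's recipe.

```python
# Illustrative continual pre-training sketch with the transformers Trainer.
# The in-memory corpus and all hyperparameters are placeholders, not the
# actual recipe used for Llama 3.2 Chibi 3B.
import torch
from datasets import Dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base_id = "meta-llama/Llama-3.2-3B"  # base checkpoint named on this card
tokenizer = AutoTokenizer.from_pretrained(base_id)
tokenizer.pad_token = tokenizer.eos_token  # Llama tokenizers ship without a pad token
model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)

# Placeholder corpus; a real run would use the Japanese mixture listed under Training.
corpus = Dataset.from_dict({"text": ["人生の鍵はバランスです。", "小さなモデルは効率的です。"]})
tokenized = corpus.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=1024),
    batched=True,
    remove_columns=corpus.column_names,
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="llama-3.2-chibi-cpt",
        per_device_train_batch_size=1,
        num_train_epochs=1,
    ),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```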

## Architecture

[Llama 3.2 3B](https://huggingface.co/meta-llama/Llama-3.2-3B)

## Training

The model has been trained on the following mixture of datasets (see the loading sketch after the list):

- [ryota39/izumi-lab-dpo-45k](https://huggingface.co/datasets/ryota39/izumi-lab-dpo-45k)
- [Aratako/Magpie-Tanuki-8B-97k](https://huggingface.co/datasets/Aratako/Magpie-Tanuki-8B-97k)
- [kunishou/databricks-dolly-15k-ja](https://huggingface.co/datasets/kunishou/databricks-dolly-15k-ja)
- [kunishou/oasst1-89k-ja](https://huggingface.co/datasets/kunishou/oasst1-89k-ja)
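
For orientation, below is a minimal sketch of pulling these sources with the Hugging Face datasets library. The split names are assumptions, and how the sources were actually formatted, filtered, and weighted for training is not documented on this card.

```python
# Hypothetical inspection of the mixture listed above; not the actual training
# pipeline. Assumes each dataset repo exposes a "train" split.
from datasets import load_dataset

sources = [
    "ryota39/izumi-lab-dpo-45k",
    "Aratako/Magpie-Tanuki-8B-97k",
    "kunishou/databricks-dolly-15k-ja",
    "kunishou/oasst1-89k-ja",
]

for name in sources:
    ds = load_dataset(name, split="train")
    print(f"{name}: {len(ds)} examples, columns: {ds.column_names}")
```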
 
## Contributors

- [Hammaam](https://huggingface.co/AELLM)

## How to use

Starting with transformers >= 4.43.0, you can run conversational inference using the Transformers pipeline abstraction or by leveraging the Auto classes with the generate() function.

Make sure to update your transformers installation via `pip install --upgrade transformers`.

```python
import torch
from transformers import pipeline

# Model repo id assumed from this model card; adjust if the hosted name differs.
model_id = "AELLM/Llama-3.2-Chibi-3B"
pipe = pipeline("text-generation", model=model_id, torch_dtype=torch.bfloat16, device_map="auto")

pipe("人生の鍵は")
```
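
The text above also mentions the Auto classes with generate(); the sketch below shows that route, again assuming the AELLM/Llama-3.2-Chibi-3B repo id and illustrative generation settings.

```python
# Sketch of the Auto-classes route mentioned above; the repo id and the
# generation settings are assumptions, not values documented on this card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "AELLM/Llama-3.2-Chibi-3B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

inputs = tokenizer("人生の鍵は", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```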

## License

Refer to the [Llama 3.2 Community License](https://github.com/meta-llama/llama-models/blob/main/models/llama3_2/LICENSE).

## References

```bibtex
@inproceedings{zheng2024llamafactory,
  title={LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models},
 