Medragondot
commited on
Update README.md
Browse files
README.md
CHANGED
@@ -1,3 +1,80 @@
|
|
1 |
-
---
|
2 |
-
license: apache-2.0
|
3 |
-
---
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
1 |
+
---
|
2 |
+
license: apache-2.0
|
3 |
+
---
|
4 |
+
|
5 |
+
---
|
6 |
+
license: apache-2.0
|
7 |
+
---
|
8 |
+
|
9 |
+
# Claude-Inspired LLaMA Finetune
|
10 |
+
|
11 |
+
## Overview
|
12 |
+
|
13 |
+
This repository contains a fine-tuned version of LLaMA-3.2-3B, trained on a dataset of Claude 3.5 Sonnet generated examples. The model is designed to emulate Claude's reasoning process, with the goal of creating a smaller model that thinks and reasons similarly to OpenAI's O1 series.
|
14 |
+
|
15 |
+
For the best results. Please use the FP16 variant. Q8_0 still applys the technique, however it doesn't create as sound of a reasoning process.
|
16 |
+
|
17 |
+
## Key Features
|
18 |
+
|
19 |
+
- Based on LLaMA-3.2-3B architecture
|
20 |
+
- Available on [Ollama HERE!](https://ollama.com/medragondot/llama-3.2-3b-thinking)
|
21 |
+
- Fine-tuned on [Claude Thinking Dataset](https://huggingface.co/datasets/Medragondot/claude-thinking)
|
22 |
+
- Implements reasoning tags for enhanced output:
|
23 |
+
- `<thinking></thinking>`
|
24 |
+
- `<reflection></reflection>`
|
25 |
+
- `<output></output>`
|
26 |
+
|
27 |
+
## Model Variants
|
28 |
+
|
29 |
+
The model is available in GGUF format with the following quantizations on [Ollama](https://ollama.com/medragondot/llama-3.2-3b-thinking):
|
30 |
+
|
31 |
+
- F16 (Full precision)
|
32 |
+
- Q8_0 (8-bit)
|
33 |
+
|
34 |
+
## Usage
|
35 |
+
|
36 |
+
To use this model, you'll need a compatible language model framework that supports GGUF format. Either import the model into **Ollama** or use the following example to load and use the model.
|
37 |
+
|
38 |
+
Here's a basic example of how to load and use the model:
|
39 |
+
|
40 |
+
```python
|
41 |
+
from transformers import AutoModelForCausalLM, AutoTokenizer
|
42 |
+
|
43 |
+
model_path = "path/to/model.gguf"
|
44 |
+
model = AutoModelForCausalLM.from_pretrained(model_path)
|
45 |
+
tokenizer = AutoTokenizer.from_pretrained(model_path)
|
46 |
+
|
47 |
+
prompt = "Analyze the pros and cons of renewable energy sources."
|
48 |
+
input_ids = tokenizer.encode(prompt, return_tensors="pt")
|
49 |
+
|
50 |
+
output = model.generate(input_ids, max_length=500)
|
51 |
+
response = tokenizer.decode(output[0], skip_special_tokens=True)
|
52 |
+
|
53 |
+
print(response)
|
54 |
+
```
|
55 |
+
|
56 |
+
## Reasoning Tags
|
57 |
+
|
58 |
+
This model uses special tags to structure its thinking process:
|
59 |
+
|
60 |
+
- `<thinking>`: Used for initial thoughts and analysis
|
61 |
+
- `<reflection>`: Used for deeper consideration and self-critique
|
62 |
+
- `<output>`: Used for the final, refined response
|
63 |
+
|
64 |
+
## Training Data
|
65 |
+
|
66 |
+
This model was fine-tuned on the [Claude Thinking Dataset](https://huggingface.co/datasets/Medragondot/claude-thinking), which consists of examples generated by Claude 3.5 Sonnet. The dataset was curated to capture Claude's reasoning process and ability to provide structured, thoughtful responses.
|
67 |
+
|
68 |
+
## Limitations
|
69 |
+
|
70 |
+
While this model aims to emulate Claude's reasoning process, it's important to note that it is a much smaller model (3B parameters) compared to the original Claude model. As such, it may not achieve the same level of performance or capabilities. Users should be aware of potential limitations in terms of knowledge breadth, reasoning depth, and overall output quality.
|
71 |
+
|
72 |
+
## Acknowledgments
|
73 |
+
|
74 |
+
- Anthropic for creating Claude 3.5 Sonnet
|
75 |
+
- Meta AI for the original LLaMA model
|
76 |
+
- HuggingFace for hosting the dataset
|
77 |
+
|
78 |
+
## Disclaimer
|
79 |
+
|
80 |
+
This model is an experimental fine-tune and should be used responsibly. It may produce incorrect or biased information. Always verify important information from authoritative sources.
|