---
license: apache-2.0
---
# Claude-Inspired LLaMA Finetune
## Overview
This repository contains a fine-tuned version of LLaMA-3.2-3B, trained on a dataset of examples generated by Claude 3.5 Sonnet. The model is designed to emulate Claude's reasoning process, with the goal of creating a smaller model that thinks and reasons similarly to OpenAI's o1 series.
For best results, use the F16 (FP16) variant. Q8_0 still applies the technique, but its reasoning process is less sound.
## Key Features
- Based on LLaMA-3.2-3B architecture
- Available on [Ollama HERE!](https://ollama.com/medragondot/llama-3.2-3b-thinking)
- Fine-tuned on [Claude Thinking Dataset](https://huggingface.co/datasets/Medragondot/claude-thinking)
- Implements reasoning tags for enhanced output:
- ``
- ``
- ``
## Model Variants
The model is available in GGUF format with the following quantizations on [Ollama](https://ollama.com/medragondot/llama-3.2-3b-thinking):
- F16 (16-bit floating point, unquantized)
- Q8_0 (8-bit)
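To get a rough feel for the trade-off between the two variants, the sketch below estimates weight storage from bits per weight. It assumes roughly 3.21 billion parameters for LLaMA-3.2-3B and 8.5 bits per weight for Q8_0 (blocks of 32 8-bit weights plus one 16-bit scale). Neither figure comes from this repository, and real GGUF files differ somewhat because of metadata and tensors kept at other precisions.

```python
# Assumed figures, not taken from this repository:
N_PARAMS = 3.21e9  # approximate parameter count of LLaMA-3.2-3B
BITS_PER_WEIGHT = {
    "F16": 16.0,   # one 16-bit float per weight
    "Q8_0": 8.5,   # 32 int8 weights + one fp16 scale per block: 34 bytes / 32 weights
}


def estimate_size_gb(n_params, bits_per_weight):
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


for name, bpw in BITS_PER_WEIGHT.items():
    print(f"{name}: ~{estimate_size_gb(N_PARAMS, bpw):.1f} GB")
```

Under these assumptions Q8_0 needs a little more than half the storage of F16, which is the usual reason to pick it when memory is tight.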
## Usage
To use this model, you need a framework that supports the GGUF format. Either import the model into **Ollama** or load it in Python, as in the following basic example:
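```python
# Requires: pip install transformers torch gguf
from transformers import AutoModelForCausalLM, AutoTokenizer

# Transformers reads GGUF checkpoints via `gguf_file`, dequantizing on load.
model_dir = "path/to"     # directory that contains the GGUF file
gguf_file = "model.gguf"  # file name of the GGUF checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_dir, gguf_file=gguf_file)
model = AutoModelForCausalLM.from_pretrained(model_dir, gguf_file=gguf_file)

prompt = "Analyze the pros and cons of renewable energy sources."
inputs = tokenizer(prompt, return_tensors="pt")
# max_new_tokens limits only the generated text, not prompt + output.
output = model.generate(**inputs, max_new_tokens=500)
response = tokenizer.decode(output[0], skip_special_tokens=True)
print(response)
```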
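For the Ollama route, a minimal sketch of querying a locally running Ollama server from Python might look like the following. It assumes Ollama is listening on its default address `http://localhost:11434`, that the model has already been pulled, and that the default tag on the Ollama page is the variant you want. The helper names `build_request` and `generate` are illustrative and not part of any library.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # assumed: Ollama's default local endpoint
MODEL_NAME = "medragondot/llama-3.2-3b-thinking"    # assumed: default tag from the Ollama page


def build_request(prompt, model=MODEL_NAME, url=OLLAMA_URL):
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, model=MODEL_NAME, url=OLLAMA_URL, timeout=300):
    """Send the prompt to a running Ollama server and return the generated text."""
    request = build_request(prompt, model, url)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))["response"]


# Usage (requires a running Ollama server with the model pulled):
# print(generate("Analyze the pros and cons of renewable energy sources."))
```

Setting `"stream": False` makes the server return one JSON object instead of a stream of partial responses, which keeps the client code short at the cost of waiting for the full answer.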
## Reasoning Tags
This model uses special tags to structure its thinking process:
- ``: Used for initial thoughts and analysis
- ``: Used for deeper consideration and self-critique
- `