Auto Regressive Thinker (Art) v0 3B

Art v0 3B is our inaugural model in the Art series, fine-tuned from Qwen/Qwen2.5-3B-Instruct on a specialized dataset generated with Gemini 2.0 Flash Thinking. Read more about the Art series.

Model Details

  • Base Model: Qwen2.5-3B-Instruct
  • Architecture: Transformer
  • Size: 3B parameters

Usage

The model wraps its reasoning in dedicated tags before producing its final response:

<|start_reasoning|> model's reasoning process <|end_reasoning|> model's response
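
For example, a small helper (a hypothetical sketch, not part of the released code) can separate the reasoning block from the final answer in a decoded completion:

```python
def split_reasoning(text: str):
    """Split a generated completion into (reasoning, response).

    Assumes the model emits:
        <|start_reasoning|> ... <|end_reasoning|> response
    Returns (None, text) if the tags are absent.
    """
    start, end = "<|start_reasoning|>", "<|end_reasoning|>"
    if start in text and end in text:
        reasoning = text.split(start, 1)[1].split(end, 1)[0].strip()
        response = text.split(end, 1)[1].strip()
        return reasoning, response
    return None, text.strip()
```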

Recommendations

  • Run the model without quantization
  • Apply the tokenizer's chat template
  • Use a low temperature (0.1-0.3) and a repetition_penalty of 1.1 (see the usage sketch below)
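
A minimal usage sketch with the Transformers library, assuming the Hugging Face model id AGI-0/Art-v0-3B and the recommendations above (bf16, chat template, low temperature, repetition penalty); the prompt and generation length are placeholders:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "AGI-0/Art-v0-3B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # run without quantization, as recommended
    device_map="auto",
)

messages = [{"role": "user", "content": "Explain why the sky is blue."}]
# Use the tokenizer's chat template to build the prompt
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(
    input_ids,
    max_new_tokens=1024,
    do_sample=True,
    temperature=0.2,          # recommended range: 0.1-0.3
    repetition_penalty=1.1,   # recommended value
)
# Keep special tokens so the reasoning tags remain visible in the output
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=False))
```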

Training Details

This experimental model was trained on a curated dataset generated using Gemini 2.0 Flash Thinking. Detailed training methodology, dataset, and code are available exclusively to our community members.

About Us

We are a community-funded AI research lab focused on advancing open-source AGI development. Our community members support us through Patreon donations.

Community Access

Our supporters get exclusive access to:

  • Training dataset
  • Training code and methodology
  • Behind-the-scenes development insights
  • Future model previews

Join Our Community
