---
license: other
license_name: nvidia-open-model-license
license_link: >-
  https://developer.download.nvidia.com/licenses/nvidia-open-model-license-agreement-june-2024.pdf
---

# Mistral-NeMo-Minitron-8B-ARChitects-Full-bnb-4bit

## Model Overview

Mistral-NeMo-Minitron-8B-ARChitects-Full-bnb-4bit is a retrained variant of [Nvidia Mistral-NeMo-Minitron-8B-Base](https://huggingface.co/nvidia/Mistral-NeMo-Minitron-8B-Base), finetuned specifically to solve [ARC-AGI](https://arcprize.org/) tasks. To save GPU memory, the vocabulary (and therefore the embedding matrix) has been reduced to only 77 tokens.

The model achieved a score of 53.5 on the ARC-AGI private evaluation set during the [Kaggle ARC Prize 2024 Competition](https://www.kaggle.com/competitions/arc-prize-2024/leaderboard). Note that the ARC-AGI public evaluation set was used as training data for this model. Please refer to our [paper](https://github.com/da-fr/arc-prize-2024/blob/main/the_architects.pdf) for more details.

For more models tuned for ARC-AGI, check out our [model collection](https://huggingface.co/collections/da-fr/arc-agi-models-674f0d88c8b2fa1edecffadb).

## Finetuning Datasets

This model was finetuned on the following datasets:

* the [ReArc data set](https://github.com/michaelhodel/re-arc) by Michael Hodel
* the official [ARC Prize](https://arcprize.org/) evaluation set
* the [ConceptARC data set](https://github.com/victorvikram/ConceptARC)

## License

This model is released under the [NVIDIA Open Model License Agreement](https://developer.download.nvidia.com/licenses/nvidia-open-model-license-agreement-june-2024.pdf).

## Usage

This model can be used with the `transformers` or `unsloth` packages; a minimal loading sketch is shown at the end of this card. For more information on preprocessing the ARC Prize tasks to generate prompts for the model, please refer to our [paper](https://github.com/da-fr/arc-prize-2024/blob/main/the_architects.pdf) and our [GitHub repository](https://github.com/da-fr/arc-prize-2024).

## References

* [The LLM ARChitect: Solving ARC-AGI is a Matter of Perspective](https://github.com/da-fr/arc-prize-2024/blob/main/the_architects.pdf)
* [Minitron: Compact Language Models via Pruning and Knowledge Distillation](https://arxiv.org/abs/2407.14679)
* [LLM Pruning and Distillation in Practice: The Minitron Approach](https://arxiv.org/abs/2408.11796)
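
## Usage Example

The snippet below is a minimal sketch of loading the model with `transformers` and running generation, not the authors' exact inference pipeline. The repository id, generation parameters, and the placeholder prompt are assumptions; the actual prompt must be produced by the ARC task preprocessing described in the paper and the GitHub repository.

```python
# Minimal sketch. Assumes the `transformers`, `bitsandbytes`, and `accelerate`
# packages are installed; the repository id below is inferred from this card's
# title and may need adjusting.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "da-fr/Mistral-NeMo-Minitron-8B-ARChitects-Full-bnb-4bit"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",  # the stored bnb-4bit quantization config is picked up automatically
)

# The prompt must be an ARC task serialized with the preprocessing from the paper,
# since the model only knows the reduced 77-token vocabulary.
prompt = "..."  # placeholder: a preprocessed ARC task
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```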