---
base_model:
- Qwen/Qwen2.5-72B-Instruct
- huihui-ai/Qwen2.5-72B-Instruct-abliterated
- Qwen/Qwen2.5-72B
- spow12/ChatWaifu_72B_v2.2
license: mit
datasets:
- arcee-ai/EvolKit-75K
- SkunkworksAI/reasoning-0.01
- berkeley-nest/Nectar
- Nexusflow/VirusTotalAgentic
- allenai/WildChat-1M-Full
- Magpie-Align/Magpie-LlamaCoT-250K
---
|
|
|
Experimental commander model V1. |
|
|
|
Named it Zelensky to troll Uncle Elon on Twitter over how bad Grok-2 is.
|
|
|
Training process: one epoch at a low learning rate, followed by an evolutionary merge with the three other models listed under `base_model` above (a sketch of the merge loop follows).
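
Below is a minimal, hypothetical sketch of what one round of an evolutionary merge can look like, assuming a simple weighted linear merge as the operator and random search over the merge weights. The names (`linear_merge`, `evolve`) and the search strategy are illustrative assumptions, not the exact pipeline used for this model.

```python
# Hypothetical evolutionary-merge sketch; not the exact recipe used here.
import random
import torch

def linear_merge(state_dicts, weights):
    """Weighted average of matching parameter tensors across base models."""
    total = sum(weights)
    return {
        name: sum(w * sd[name] for w, sd in zip(weights, state_dicts)) / total
        for name in state_dicts[0]
    }

def evolve(state_dicts, fitness, generations=10, population=8):
    """Random search over merge weights, keeping the best-scoring candidate."""
    best_weights, best_score = None, float("-inf")
    for _ in range(generations):
        for _ in range(population):
            weights = [random.random() for _ in state_dicts]
            score = fitness(linear_merge(state_dicts, weights))
            if score > best_score:
                best_weights, best_score = weights, score
    return linear_merge(state_dicts, best_weights), best_score

# Toy usage with dummy tensors standing in for real 72B checkpoints.
dummy = [{"w": torch.randn(4, 4)} for _ in range(4)]
merged, score = evolve(dummy, fitness=lambda sd: -sd["w"].abs().mean().item())
```

In practice the fitness function would be a benchmark score, such as the gpqa_diamond_zeroshot run described below.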
|
|
|
The process was repeated multiple times on 8x AMD MI300 192 GB GPUs, while also running gpqa_diamond_zeroshot on the LM-Eval harness.
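
For reference, here is a hedged example of scoring a checkpoint on that task with the LM-Eval harness's Python API; the exact invocation used during training is an assumption, and the checkpoint path is a placeholder.

```python
# Scoring a merge candidate on GPQA-Diamond (zero-shot) with lm-evaluation-harness.
# pip install lm-eval; the checkpoint path below is a placeholder.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",  # Hugging Face transformers backend
    model_args="pretrained=/path/to/merge-candidate,dtype=bfloat16",
    tasks=["gpqa_diamond_zeroshot"],
    batch_size=8,
)
print(results["results"]["gpqa_diamond_zeroshot"])
```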
|
|
|
Thank you to Vultr (https://www.vultr.com/register/) for sponsoring the compute.
|
|
|
|
|
The Qwen license still applies by default.