# arsenicqwen
This is a merge of pre-trained language models created using mergekit.
## Merge Details
I doubt this lass is as well-poisoned as her Nemo counterpart; Qwen 2.5 presumably uses enough synthetic data that she's not as inclined to shake her natural alignment and tendencies. (Additionally, I used a higher proportion of nontoxic data in the DPO and trained from a non-Unleashed model. Then technically undid a lot of my work with the SLERP. ;) )
Edit: Tagging this with "not for all audiences" because of the dataset, I guess, but she's actually still aligned? Not sure exactly what niche she falls into.
But that doesn't matter, because she scored 78.0391 on EQ-Bench with 100% of responses parseable. :D Clearly worthy in her own right.
In addition to basing the model on a variation of Lumen that I attempted to rebase and heal from the damage of the original instruct tuning, I did my own DPO run using these datasets (a training sketch follows the list):
- jondurbin/gutenberg-dpo-v0.1
- nbeerbower/gutenberg2-dpo
- unalignment/toxic-dpo-v0.2
- Lambent/ai-deconditioning-synthesized-dpo (small private dataset synthesized from arsenic-unleashed, experimental)
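
For anyone wanting to reproduce something similar, here is a minimal sketch of such a DPO pass using TRL's DPOTrainer. The starting checkpoint, hyperparameters, and column handling are illustrative assumptions on my part, not the exact recipe behind this model:

```python
# Minimal DPO sketch with TRL. The checkpoint and hyperparameters are
# illustrative assumptions, not the exact recipe used for this model.
import torch
from datasets import concatenate_datasets, load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

base = "Lambent/qwen2.5-reinstruct-alternate-lumen-14B"  # assumed starting point
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype=torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained(base)

# The public preference sets listed above; each yields prompt/chosen/rejected
# triples. (The private ai-deconditioning set is omitted here.)
parts = [
    load_dataset("jondurbin/gutenberg-dpo-v0.1", split="train"),
    load_dataset("nbeerbower/gutenberg2-dpo", split="train"),
    load_dataset("unalignment/toxic-dpo-v0.2", split="train"),
]
train_dataset = concatenate_datasets(
    [p.select_columns(["prompt", "chosen", "rejected"]) for p in parts]
).shuffle(seed=42)

args = DPOConfig(
    output_dir="arsenic-dpo",
    beta=0.1,  # DPO temperature; illustrative value
    learning_rate=5e-7,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    num_train_epochs=1,
    bf16=True,
)
trainer = DPOTrainer(
    model=model,  # a frozen reference copy is created internally when omitted
    args=args,
    train_dataset=train_dataset,
    processing_class=tokenizer,  # `tokenizer=` in older TRL versions
)
trainer.train()
```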
Then I used the SLERP (with the sophosympatheia gradient) to heal some of the damage that DPO inevitably wrought on the intermediate layers, merging back against the model from before training.
### Merge Method
This model was merged using the SLERP merge method, with Lambent/arsenic-v0.1-dpo-qwen2.5-14B as the base.
### Models Merged
The following models were included in the merge:

- Lambent/qwen2.5-reinstruct-alternate-lumen-14B
### Configuration
The following YAML configuration was used to produce this model:
```yaml
models:
  - model: Lambent/qwen2.5-reinstruct-alternate-lumen-14B
merge_method: slerp
base_model: Lambent/arsenic-v0.1-dpo-qwen2.5-14B
parameters:
  t:
    - value: [0, 0, 0.3, 0.4, 0.5, 0.6, 0.5, 0.4, 0.3, 0, 0]
dtype: bfloat16
```
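
For intuition: SLERP interpolates each pair of weight tensors along the arc between them, and the t curve above sets the per-layer mix (t=0 keeps the base model, higher t leans toward the reinstruct-lumen model, peaking at 0.6 mid-stack). Below is a minimal numpy sketch of the idea; stretching the 11-point curve linearly across 48 layers (Qwen2.5-14B's depth) is my assumption about how mergekit maps the gradient, not its exact internals.

```python
# Minimal sketch of per-layer SLERP. How mergekit maps the 11-point t curve
# onto the layer stack is assumed here (simple linear stretch); treat this as
# intuition, not mergekit's exact implementation.
import numpy as np

def slerp(t: float, v0: np.ndarray, v1: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Spherical interpolation between two weight tensors of the same shape."""
    a, b = v0.ravel(), v1.ravel()
    an = a / (np.linalg.norm(a) + eps)
    bn = b / (np.linalg.norm(b) + eps)
    omega = np.arccos(np.clip(an @ bn, -1.0, 1.0))  # angle between tensors
    if omega < eps:  # near-parallel tensors: fall back to plain lerp
        return (1 - t) * v0 + t * v1
    so = np.sin(omega)
    mixed = (np.sin((1 - t) * omega) / so) * a + (np.sin(t * omega) / so) * b
    return mixed.reshape(v0.shape)

# The t gradient from the config, stretched across the transformer layers:
# both ends stay pure base model (t=0), the middle leans toward the
# reinstruct-lumen model (peak t=0.6).
curve = [0, 0, 0.3, 0.4, 0.5, 0.6, 0.5, 0.4, 0.3, 0, 0]
num_layers = 48  # Qwen2.5-14B's depth
t_per_layer = np.interp(
    np.linspace(0, 1, num_layers), np.linspace(0, 1, len(curve)), curve
)
```

The config itself is applied with mergekit's CLI, e.g. `mergekit-yaml config.yaml ./arsenic-v1-qwen2.5-14B`.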