# arsenicqwen
This is a merge of pre-trained language models created using mergekit.
## Merge Details
I doubt this lass is as well-poisoned as her Nemo counterpart; Qwen 2.5 presumably uses enough synthetic data that she's not as inclined to shake her natural alignment and tendencies. (Additionally, I used a higher proportion of nontoxic data in the DPO and trained from a non-Unleashed model. Then technically undid a lot of my work with the SLERP. ;) )
Edit: Tagging this with "not for all audiences" because of the dataset, I guess, but she's actually still aligned? Not sure exactly what niche she falls into.
But that doesn't matter, because she scored 78.0391 on EQ-Bench with 100% of responses parseable. :D Clearly worthy in her own right.
In addition to basing the model on a variation of Lumen that I attempted to rebase and heal from the damage of the original instruct tuning, I did my own DPO run using these datasets (a training sketch follows the list):
- jondurbin/gutenberg-dpo-v0.1
- nbeerbower/gutenberg2-dpo
- unalignment/toxic-dpo-v0.2
- Lambent/ai-deconditioning-synthesized-dpo (small private dataset synthesized from arsenic-unleashed, experimental)
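
For anyone wanting to reproduce something similar, here is a minimal sketch of such a DPO pass using TRL's DPOTrainer. The starting checkpoint, hyperparameters, and column handling are illustrative assumptions on my part, not the exact recipe behind this model:

```python
# Minimal DPO sketch with TRL. The checkpoint and hyperparameters are
# illustrative assumptions, not the exact recipe used for this model.
import torch
from datasets import concatenate_datasets, load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

base = "Lambent/qwen2.5-reinstruct-alternate-lumen-14B"  # assumed starting point
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype=torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained(base)

# The public preference sets listed above; each yields prompt/chosen/rejected
# triples. (The private ai-deconditioning set is omitted here.)
parts = [
    load_dataset("jondurbin/gutenberg-dpo-v0.1", split="train"),
    load_dataset("nbeerbower/gutenberg2-dpo", split="train"),
    load_dataset("unalignment/toxic-dpo-v0.2", split="train"),
]
train_dataset = concatenate_datasets(
    [p.select_columns(["prompt", "chosen", "rejected"]) for p in parts]
).shuffle(seed=42)

args = DPOConfig(
    output_dir="arsenic-dpo",
    beta=0.1,  # DPO temperature; illustrative value
    learning_rate=5e-7,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    num_train_epochs=1,
    bf16=True,
)
trainer = DPOTrainer(
    model=model,  # a frozen reference copy is created internally when omitted
    args=args,
    train_dataset=train_dataset,
    processing_class=tokenizer,  # `tokenizer=` in older TRL versions
)
trainer.train()
```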
Then I used the SLERP (with the sophosympatheia gradient) to heal some of the damage that DPO inevitably wrought on the intermediate layers, merging back against the model from before training.
### Merge Method
This model was merged using the SLERP merge method, with Lambent/arsenic-v0.1-dpo-qwen2.5-14B as the base.
### Models Merged
The following models were included in the merge:

- Lambent/qwen2.5-reinstruct-alternate-lumen-14B
### Configuration
The following YAML configuration was used to produce this model:
```yaml
models:
  - model: Lambent/qwen2.5-reinstruct-alternate-lumen-14B
merge_method: slerp
base_model: Lambent/arsenic-v0.1-dpo-qwen2.5-14B
parameters:
  t:
    - value: [0, 0, 0.3, 0.4, 0.5, 0.6, 0.5, 0.4, 0.3, 0, 0]
dtype: bfloat16
```
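
For intuition: SLERP interpolates each pair of weight tensors along the arc between them, and the t curve above sets the per-layer mix (t=0 keeps the base model, higher t leans toward the reinstruct-lumen model, peaking at 0.6 mid-stack). Below is a minimal numpy sketch of the idea; stretching the 11-point curve linearly across 48 layers (Qwen2.5-14B's depth) is my assumption about how mergekit maps the gradient, not its exact internals.

```python
# Minimal sketch of per-layer SLERP. How mergekit maps the 11-point t curve
# onto the layer stack is assumed here (simple linear stretch); treat this as
# intuition, not mergekit's exact implementation.
import numpy as np

def slerp(t: float, v0: np.ndarray, v1: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Spherical interpolation between two weight tensors of the same shape."""
    a, b = v0.ravel(), v1.ravel()
    an = a / (np.linalg.norm(a) + eps)
    bn = b / (np.linalg.norm(b) + eps)
    omega = np.arccos(np.clip(an @ bn, -1.0, 1.0))  # angle between tensors
    if omega < eps:  # near-parallel tensors: fall back to plain lerp
        return (1 - t) * v0 + t * v1
    so = np.sin(omega)
    mixed = (np.sin((1 - t) * omega) / so) * a + (np.sin(t * omega) / so) * b
    return mixed.reshape(v0.shape)

# The t gradient from the config, stretched across the transformer layers:
# both ends stay pure base model (t=0), the middle leans toward the
# reinstruct-lumen model (peak t=0.6).
curve = [0, 0, 0.3, 0.4, 0.5, 0.6, 0.5, 0.4, 0.3, 0, 0]
num_layers = 48  # Qwen2.5-14B's depth
t_per_layer = np.interp(
    np.linspace(0, 1, num_layers), np.linspace(0, 1, len(curve)), curve
)
```

The config itself is applied with mergekit's CLI, e.g. `mergekit-yaml config.yaml ./arsenic-v1-qwen2.5-14B`.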