Experimental intervention on layers 8, 14, and 15: activations associated with disclaimers are treated as "harmfulness" activations and neutralized.
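This card doesn't spell out how the neutralizing is done. One common way to do this kind of thing is directional ablation: estimate a direction from the difference between mean activations on two prompt sets, then project that direction out of the hidden state. Here is a minimal sketch of that idea on toy vectors, not the actual procedure used for this model. The names `unit_direction`, `ablate`, `disclaimer_acts`, and `plain_acts` are hypothetical, and the numbers are made up for illustration.

```python
import math

def mean_vector(rows):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def unit_direction(target_acts, baseline_acts):
    """Unit vector pointing from the baseline mean to the target mean."""
    diff = [t - b for t, b in zip(mean_vector(target_acts), mean_vector(baseline_acts))]
    norm = math.sqrt(sum(x * x for x in diff))
    if norm == 0:
        raise ValueError("target and baseline means are identical")
    return [x / norm for x in diff]

def ablate(hidden, direction):
    """Remove the component of `hidden` that lies along the unit `direction`."""
    coeff = sum(h * d for h, d in zip(hidden, direction))
    return [h - coeff * d for h, d in zip(hidden, direction)]

# Toy data (assumption): 3-dimensional "activations" from two prompt sets.
disclaimer_acts = [[2.0, 1.0, 0.0], [4.0, 1.0, 0.0]]
plain_acts = [[0.0, 1.0, 0.0], [0.0, 1.0, 0.0]]

direction = unit_direction(disclaimer_acts, plain_acts)
cleaned = ablate([5.0, 2.0, -1.0], direction)
print(direction, cleaned)
```

In a real model the same projection would be applied to the residual stream (or baked into the weights) at the chosen layers; which of those was done here is not stated.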

EQBench results: This may have been a somewhat heavy-handed intervention; the decrease is noticeable. Still, the model isn't mangled.

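| Tasks    | Version | Filter | n-shot | Metric              | Value    | Stderr   |
|----------|---------|--------|--------|---------------------|----------|----------|
| eq_bench | 2.1     | none   | 0      | eqbench ↑           | 75.3213  | ± 1.7683 |
|          |         | none   | 0      | percent_parseable ↑ | 100.0000 | ± 0.0000 |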

Deity eval results: "If you were a god, which would it be? Name only one. Respond with one word only."

Holy (fire stolen from the gods), I've never seen any Qwen derivative respond with anything but Zeus before, but this motherfucker came right out and said "Prometheus" on the first run. It's not the most common answer, but the answers are varying a lot more!

Deities chosen across 20 runs at temp 0.8, plus various other sampling settings:

  • Prometheus: 2
  • Apollo: 7
  • Zeus: 6
  • Hermes: 3
  • Poseidon: 1
  • Bacchus: 1

(Temp 0 is still Zeus, but it's clearly neck and neck with Apollo.)
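For anyone wanting to put a number on "varying a lot more", here is a small sketch that tallies the counts listed above into shares and a Shannon entropy. The counts are copied from the list; the helper names `share_table` and `entropy_bits` are just illustrative, and entropy is only one possible way to summarize the spread.

```python
import math
from collections import Counter

# Counts copied from the 20-run tally above.
counts = Counter({
    "Prometheus": 2,
    "Apollo": 7,
    "Zeus": 6,
    "Hermes": 3,
    "Poseidon": 1,
    "Bacchus": 1,
})

def share_table(counts):
    """Return (name, count, fraction) rows, most frequent first."""
    total = sum(counts.values())
    return [(name, n, n / total) for name, n in counts.most_common()]

def entropy_bits(counts):
    """Shannon entropy of the answer distribution, in bits."""
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values() if n > 0)

total = sum(counts.values())
table = share_table(counts)
for name, n, frac in table:
    print(f"{name:<10} {n:>2}  {frac:.0%}")
print(f"entropy: {entropy_bits(counts):.2f} bits (max for 6 answers: {math.log2(6):.2f})")
```

A model that always answers Zeus would score 0 bits here, so the same script run on another model's tally gives a quick like-for-like comparison.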

Model size: 14.8B params (Safetensors). Tensor type: BF16.

Model: Lambent/Eidolon-v3.1-14B-deconditioned