eq90parsedanube

This is a merge of pre-trained language models created using mergekit.

This is the first one that has shown a promising capability improvement over the base model, h2o-danube2-1.8b-base.

The training methodology is a bit of a mess, since I was trying out different things. I'm listing the datasets used at any point, but I don't think replicating the recipe is doable or sensible.

The original upscale is at Lambent/danube2-upscale-1 and duplicates layers 16-21. Various training methods were attempted to repair it. This linear merge combines the 4 resulting models whose output was at least 90% parseable by the EQ-Bench benchmark.
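To picture the layer duplication, here is a minimal sketch rather than the actual upscale recipe. It assumes the range 16-21 is inclusive, that the copied block sits directly after the original one, and that the base model has 24 decoder layers; none of that is confirmed here. `duplicate_layers` is a hypothetical helper, not a mergekit function.

```python
def duplicate_layers(layers, start, end):
    """Return a new list in which layers[start..end] (inclusive) appear twice in a row."""
    if not 0 <= start <= end < len(layers):
        raise ValueError("layer range out of bounds")
    block = layers[start:end + 1]
    # Keep everything up to and including `end`, repeat the block, then append the rest.
    return layers[:end + 1] + block + layers[end + 1:]


# Assumption: 24 layers in the base model, represented here by their indices only.
base_layers = list(range(24))
upscaled = duplicate_layers(base_layers, 16, 21)

print(len(upscaled))  # 30 under the assumed 24-layer depth
print(upscaled)
```

The copied layers start out as exact duplicates of the originals, which is presumably why some repair training was needed afterwards.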

| Model | AGIEval | GPT4All | TruthfulQA | Bigbench | Average |
|---|---|---|---|---|---|
| danube2-upscale-1.7 | 27.97 | 62.16 | 42.2 | 32.2 | 41.13 |

| Model | EQ-Bench | Average |
|---|---|---|
| danube2-upscale-1.7 | 15.52 | 15.52 |

EQ-Bench

| Task | Version | Metric | Value | Stderr |
|---|---|---|---|---|
| eq_bench | 2.1 | eqbench,none | 15.52 | |
| | | eqbench_stderr,none | 2.77 | |
| | | percent_parseable,none | 100 | |
| | | percent_parseable_stderr,none | 0 | |
| | | alias | eq_bench | |

Average: 15.52%


Merge Details

Merge Method

This model was merged using the linear merge method.

Models Merged

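The following models were included in the merge:

- Lambent/danube2-upscale-1.531qlora
- Lambent/danube2-upscale-1.53lisa
- Lambent/danube2-upscale-1.51galore
- Lambent/danube2-upscale-1.51qlora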

Configuration

The following YAML configuration was used to produce this model:

models:
  - model: Lambent/danube2-upscale-1.531qlora
    parameters:
      weight: 1.0
  - model: Lambent/danube2-upscale-1.53lisa
    parameters:
      weight: 1.0
  - model: Lambent/danube2-upscale-1.51galore
    parameters:
      weight: 1.0
  - model: Lambent/danube2-upscale-1.51qlora
    parameters:
      weight: 1.0
merge_method: linear
dtype: float16
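With four equal weights, a linear merge comes down to a plain average of the corresponding parameters. The sketch below shows that arithmetic on toy lists of floats. It is not mergekit's code: `linear_merge` is a hypothetical helper, and it assumes the weights are normalized to sum to 1, which I believe is mergekit's default for this method.

```python
def linear_merge(tensors, weights, normalize=True):
    """Weighted element-wise sum of equally sized flat tensors (plain lists of floats)."""
    if len(tensors) != len(weights):
        raise ValueError("need exactly one weight per tensor")
    if normalize:
        total = sum(weights)
        if total == 0:
            raise ValueError("weights sum to zero, cannot normalize")
        weights = [w / total for w in weights]
    size = len(tensors[0])
    merged = [0.0] * size
    for tensor, weight in zip(tensors, weights):
        if len(tensor) != size:
            raise ValueError("all tensors must have the same size")
        for i, value in enumerate(tensor):
            merged[i] += weight * value
    return merged


# Toy stand-ins for the same parameter taken from four models (made-up values).
toy_params = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
merged = linear_merge(toy_params, [1.0, 1.0, 1.0, 1.0])
print(merged)  # [4.0, 5.0], the element-wise mean
```

In the real merge the same operation would apply to every weight tensor of the four models, with the result stored as float16 per the config.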

Model size: 2.25B params (Safetensors, tensor type FP16)