Steelskull/Etheria-55b-v0.1

image/png

Merge Details

An attempt to make a functional goliath style merge to create a [Etheria] 55b-200k with two yi-34b-200k models.

due to the merge it 'theoretically' should have a context of 200k but I recommend starting at 32k and moveing up, as it is unknown (at this time) what the merge has done to the context length.

This is a merge of both VerA and VerB of Etheria-55b (There numbers were surprisingly good), I then created a sacrificial 55B out of the most performant yi-34b-200k Model and performed a Dare_ties merge and equalize the model into its current state.

recommended settings and Prompt Format:

Ive tested it up to 32k context using exl2 using these settings:

    "temp": 0.7,
    "temperature_last": true,
    "top_p": 1,
    "top_k": 0,
    "top_a": 0,
    "tfs": 1,
    "epsilon_cutoff": 0,
    "eta_cutoff": 0,
    "typical_p": 1,
    "min_p": 0.1,
    "rep_pen": 1.1,
    "rep_pen_range": 8192,
    "no_repeat_ngram_size": 0,
    "penalty_alpha": 0,
    "num_beams": 1,
    "length_penalty": 1,
    "min_length": 0,
    "encoder_rep_pen": 1,
    "freq_pen": 0,
    "presence_pen": 0,
    "do_sample": true,
    "early_stopping": false,
    "add_bos_token": false,
    "truncation_length": 2048,
    "ban_eos_token": true,
    "skip_special_tokens": true,
    "streaming": true,
    "mirostat_mode": 0,
    "mirostat_tau": 5,
    "mirostat_eta": 0.1,

Prompt format that work well

ChatML & Alpaca

Merge Method

This model was merged using the DARE TIES merge method using Merged-Etheria-55b as a base.

Configuration

The following YAML configuration was used to produce this model:


base_model: Merged-Etheria-55b
models:
  - model: Sacr-Etheria-55b
    parameters:
      weight: [0.22, 0.113, 0.113, 0.113, 0.113, 0.113]
      density: 0.61
  - model: Merged-Etheria-55b
    parameters:
      weight: [0.22, 0.113, 0.113, 0.113, 0.113, 0.113]
      density: 0.61
merge_method: dare_ties
tokenizer_source: union
parameters:
  int8_mask: true
dtype: bfloat16

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

Metric Value
Avg. 64.69
AI2 Reasoning Challenge (25-Shot) 65.10
HellaSwag (10-Shot) 81.93
MMLU (5-Shot) 73.66
TruthfulQA (0-shot) 56.16
Winogrande (5-shot) 76.09
GSM8k (5-shot) 35.18
Downloads last month
30
Safetensors
Model size
55.6B params
Tensor type
BF16
·
Inference Examples
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.

Model tree for SteelStorage/Etheria-55b-v0.1

Quantizations
5 models

Evaluation results