Z-2-A.TEST-TEMP-MODEL

This is a merge of pre-trained language models created using mergekit.

Merge Details

Merge Method

This model was merged using the SLERP merge method.
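
For readers unfamiliar with SLERP (spherical linear interpolation), the following is a minimal sketch of the idea applied to two flattened weight vectors. It is not mergekit's code: the function name `slerp`, the `eps` guard, and the fallback to plain linear interpolation for nearly parallel vectors are assumptions made for illustration.

```python
import math

def slerp(t, v0, v1, eps=1e-8):
    """Interpolate between two equal-length weight vectors along the arc joining them.

    t = 0 returns v0, t = 1 returns v1. Illustrative only; names are hypothetical.
    """
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    # Cosine of the angle between the two vectors, clamped for numerical safety.
    dot = sum(x * y for x, y in zip(v0, v1)) / max(norm0 * norm1, eps)
    dot = max(-1.0, min(1.0, dot))
    if abs(dot) > 1.0 - 1e-6:
        # Nearly parallel vectors: the arc degenerates, so use linear interpolation.
        return [(1 - t) * x + t * y for x, y in zip(v0, v1)]
    theta = math.acos(dot)
    sin_theta = math.sin(theta)
    w0 = math.sin((1 - t) * theta) / sin_theta
    w1 = math.sin(t * theta) / sin_theta
    return [w0 * x + w1 * y for x, y in zip(v0, v1)]
```

For two orthogonal unit vectors, `slerp(0.5, [1.0, 0.0], [0.0, 1.0])` lands halfway along the arc, with both components equal to about 0.7071, whereas plain averaging would give 0.5 for each.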

Models Merged

The following models were included in the merge:

  • D:\VICIOUS_MESH-12B-OMEGA
  • D:\jetreessence

Configuration

The following YAML configuration was used to produce this model:

models:
  - model: "D:\\VICIOUS_MESH-12B-OMEGA"
  - model: "D:\\jetreessence"
merge_method: slerp
base_model: "D:\\VICIOUS_MESH-12B-OMEGA"
dtype: bfloat16
parameters:
  t: [0, 0.5, 1, 0.5, 0]

regularization:
  - method: gradient_penalty
    scale: 0.05
  - method: weight_clipping
    clip_range: [-0.15, 0.15]
  - method: random_noise
    scale: 0.01
  - method: attention_dropout
    scale: 0.02

postprocessing:
  - operation: entropy_regularization
    scale: 0.05
  - operation: non_linear_scaling
    parameters:
      function: relu
  - operation: sharpening
    intensity: 0.6
  - operation: gaussian_smoothing
    sigma: 0.3
  - operation: normalize
  - operation: dynamic_scaling
    scale_range: [0.98, 1.02]
  - operation: smoothing
    parameters:
      adaptive: true
      range: [0.98, 1.02]
      kernel_size: 3
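
In the configuration above, `t: [0, 0.5, 1, 0.5, 0]` is a list rather than a single number. mergekit treats such a list as a gradient that is interpolated across the layer stack, with t = 0 yielding the base model and t = 1 the other model. The sketch below shows one plausible way that expansion could work, assuming evenly spaced anchors and linear interpolation by layer index. The helper `layer_t` and the layer count of 9 are hypothetical and chosen only to keep the numbers easy to read; mergekit's exact mapping may differ.

```python
def layer_t(gradient, num_layers):
    """Expand a list of anchor values into one interpolation factor per layer.

    Assumes anchors are evenly spaced over the layer stack (an illustrative choice).
    """
    if len(gradient) == 1 or num_layers == 1:
        return [float(gradient[0])] * num_layers
    last = len(gradient) - 1
    values = []
    for layer in range(num_layers):
        # Position of this layer on the anchor axis, from 0 to last.
        pos = layer / (num_layers - 1) * last
        lo = min(int(pos), last - 1)
        frac = pos - lo
        values.append((1 - frac) * gradient[lo] + frac * gradient[lo + 1])
    return values

# Hypothetical 9-layer stack; the real model has more layers.
print(layer_t([0, 0.5, 1, 0.5, 0], 9))
# [0.0, 0.25, 0.5, 0.75, 1.0, 0.75, 0.5, 0.25, 0.0]
```

Under this reading, the first and last layers come entirely from the base model (D:\VICIOUS_MESH-12B-OMEGA), the middle layers come entirely from D:\jetreessence, and the layers in between are blended.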

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

Metric                 Value
Avg.                   26.94
IFEval (0-Shot)        66.02
BBH (3-Shot)           31.36
MATH Lvl 5 (4-Shot)    11.10
GPQA (0-shot)           8.61
MuSR (0-shot)          14.70
MMLU-PRO (5-shot)      29.83
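
The reported average is consistent with an unweighted mean of the six benchmark scores listed above. A quick check, using only the values from the table:

```python
# Scores copied from the results table above.
scores = {
    "IFEval (0-Shot)": 66.02,
    "BBH (3-Shot)": 31.36,
    "MATH Lvl 5 (4-Shot)": 11.10,
    "GPQA (0-shot)": 8.61,
    "MuSR (0-shot)": 14.70,
    "MMLU-PRO (5-shot)": 29.83,
}

# Unweighted mean across the six benchmarks.
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # 26.94
```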

Model size: 12.2B params (Safetensors)
Tensor type: BF16