merge

This is a merge of pre-trained language models created using mergekit.

Merge Details

Merge Method

This model was merged using the DARE TIES merge method, with CultriX/SeQwence-14Bv1 as the base model.
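
As a rough illustration of what a DARE TIES merge does, the sketch below applies the two ideas to toy parameter lists: DARE randomly drops entries of each model's delta from the base and rescales the survivors, and TIES elects a sign per parameter and keeps only the deltas that agree with it. This is a simplified pure-Python toy, not mergekit's implementation; the function names `dare_prune` and `dare_ties_merge` are hypothetical, and the per-parameter normalization shown is an assumption about how `normalize: true` behaves.

```python
import random


def dare_prune(delta, density, rng):
    """DARE step: keep each entry with probability `density`,
    rescale survivors by 1/density, zero out the rest."""
    return [d / density if rng.random() < density else 0.0 for d in delta]


def dare_ties_merge(base, models, weights, densities, normalize=True, seed=0):
    """Toy DARE TIES merge over flat lists of floats (illustrative only)."""
    rng = random.Random(seed)

    # Build weighted, pruned task vectors (fine-tuned minus base).
    weighted = []
    for params, w, dens in zip(models, weights, densities):
        delta = [p - b for p, b in zip(params, base)]
        pruned = dare_prune(delta, dens, rng)
        weighted.append([w * d for d in pruned])

    merged = []
    for i, b in enumerate(base):
        column = [wd[i] for wd in weighted]
        # TIES sign election: the sign of the summed weighted deltas wins.
        sign = 1.0 if sum(column) >= 0 else -1.0
        agree = [(c, w) for c, w in zip(column, weights) if c * sign > 0]
        delta = sum(c for c, _ in agree)
        if normalize:
            # Assumed behavior: divide by the total weight of agreeing models.
            denom = sum(w for _, w in agree)
            if denom > 0:
                delta /= denom
        merged.append(b + delta)
    return merged


# density=1.0 disables random dropping, so this toy run is deterministic.
result = dare_ties_merge(
    base=[0.0, 0.0],
    models=[[1.0, 1.0], [1.0, -3.0]],
    weights=[0.5, 0.5],
    densities=[1.0, 1.0],
)
print(result)  # [1.0, -3.0]
```

In the second parameter the two models disagree in sign; the larger negative delta wins the election, so the positive delta is discarded rather than averaged in.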

Models Merged

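The following models were included in the merge:

- VAGOsolutions/SauerkrautLM-v2-14b-DPO
- allknowingroger/QwenSlerp6-14B
- CultriX/SeQwence-14B-EvolMerge
- CultriX/Qwen2.5-14B-Wernicke
- qingy2019/Qwen2.5-Math-14B-Instruct
- sometimesanotion/Qwen2.5-14B-Vimarckoso
- CultriX/Qwen2.5-14B-SLERPv7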

Configuration

The following YAML configuration was used to produce this model:

models:
  - model: VAGOsolutions/SauerkrautLM-v2-14b-DPO
    parameters:
      weight: 0.20  # Strong IFEval and factual reasoning baseline
      density: 0.6
  - model: allknowingroger/QwenSlerp6-14B
    parameters:
      weight: 0.20  # Balanced reasoning across multiple benchmarks
      density: 0.6
  - model: CultriX/SeQwence-14B-EvolMerge
    parameters:
      weight: 0.15  # Generalist model for BBH and MUSR
      density: 0.5
  - model: CultriX/Qwen2.5-14B-Wernicke
    parameters:
      weight: 0.15  # QA leader for GPQA and MUSR
      density: 0.6  # Increase density to preserve more QA-specific parameters
  - model: qingy2019/Qwen2.5-Math-14B-Instruct
    parameters:
      weight: 0.15  # Specialist for MATH and advanced reasoning
      density: 0.6
  - model: sometimesanotion/Qwen2.5-14B-Vimarckoso
    parameters:
      weight: 0.10  # MUSR leader for nuanced multi-step reasoning
      density: 0.5
  - model: CultriX/Qwen2.5-14B-SLERPv7
    parameters:
      weight: 0.05  # Contextual reasoning support for BBH and tiny benchmarks
      density: 0.5
base_model: CultriX/SeQwence-14Bv1
merge_method: dare_ties
parameters:
  normalize: true
  int8_mask: true
dtype: bfloat16
adaptive_merge_parameters:
  task_weights:
    IFEval: 1.3        # Enhanced instruction-following and factual tasks
    BBH: 1.3           # Strengthened complex reasoning capabilities
    MATH_Lvl_5: 1.4    # Prioritize advanced mathematical tasks
    GPQA: 1.4          # Boost graduate-level knowledge capabilities
    MuSR: 1.3          # Strengthen multi-step reasoning on complex tasks
    MMLU_PRO: 1.2      # Ensure broad domain understanding
  smoothing_factor: 0.15  # Sharper blending for reasoning and factual tasks
gradient_clipping: 0.9   # Tighter control for precise parameter scaling
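
To make the per-model settings easier to read at a glance, the following sketch copies the `weight` and `density` values from the configuration above and derives two quantities from them: the total of the weights and the rescale factor implied by each density. It assumes the usual DARE convention that surviving delta entries are rescaled by 1/density; the variable names are hypothetical and nothing here calls mergekit.

```python
import math

# (weight, density) pairs copied from the YAML configuration above.
settings = {
    "VAGOsolutions/SauerkrautLM-v2-14b-DPO": (0.20, 0.6),
    "allknowingroger/QwenSlerp6-14B": (0.20, 0.6),
    "CultriX/SeQwence-14B-EvolMerge": (0.15, 0.5),
    "CultriX/Qwen2.5-14B-Wernicke": (0.15, 0.6),
    "qingy2019/Qwen2.5-Math-14B-Instruct": (0.15, 0.6),
    "sometimesanotion/Qwen2.5-14B-Vimarckoso": (0.10, 0.5),
    "CultriX/Qwen2.5-14B-SLERPv7": (0.05, 0.5),
}

# The seven weights are chosen to sum to 1 (up to floating-point error).
total_weight = sum(w for w, _ in settings.values())
assert math.isclose(total_weight, 1.0)

# Assumed DARE convention: kept entries are scaled by 1/density.
rescale = {name: 1.0 / d for name, (_, d) in settings.items()}

for name, (w, d) in settings.items():
    print(f"{name}: weight={w:.2f}, keeps ~{d:.0%} of delta entries, "
          f"rescale x{rescale[name]:.2f}")
```

A density of 0.5 therefore doubles the surviving delta entries, while 0.6 scales them by roughly 1.67, which keeps the expected magnitude of each delta unchanged after pruning.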
