jeffmeloy committed on
Commit
f48d6dd
·
verified ·
1 Parent(s): 872f586

Create README.md

Files changed (1)
  1. README.md +27 -0
README.md ADDED
@@ -0,0 +1,27 @@
+ ---
+ license: apache-2.0
+ base_model:
+ - Qwen/Qwen2.5-7B
+ pipeline_tag: text-generation
+ language:
+ - en
+ library_name: transformers
+ tags:
+ - text-generation-inference
+ ---
+
+ ## Model Description
+
+ Optimized Layer Merging (OLM)
+ is a transformer optimization framework that implements automated layer recombination.
+
+ OLM creates a Frankenstein's monster out of language models by cherry-picking the best-performing layers across different models to build a superior hybrid.
+ The core mechanism:
+
+ - Takes multiple language models as input
+ - Uses a base model as the foundation
+ - Iteratively replaces individual layers, evaluating performance on specified datasets
+ - Keeps the best-performing layer at each position based on metrics like perplexity, exact match, and a custom "quality" score
+ - Builds a fusion model layer by layer while maintaining or improving performance
+
+ https://github.com/jeffmeloy/olm
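
The core mechanism listed in the README can be illustrated with a minimal greedy sketch. This is not the code from the OLM repository. The function name `greedy_layer_merge`, the representation of a model as a plain sequence of layers, and the single `score` callable are all assumptions made for illustration. The sketch also assumes every model has the same number of layers and that a lower score is better, so `score` stands in for a metric such as perplexity measured on the evaluation datasets.

```python
from typing import Callable, List, Sequence, TypeVar

Layer = TypeVar("Layer")


def greedy_layer_merge(
    base_layers: Sequence[Layer],
    candidate_models: Sequence[Sequence[Layer]],
    score: Callable[[List[Layer]], float],
) -> List[Layer]:
    """Hypothetical sketch: pick the best layer per position, lower score wins."""
    for model in candidate_models:
        if len(model) != len(base_layers):
            raise ValueError("all models must have the same number of layers")

    merged = list(base_layers)   # start from the base model
    best = score(merged)         # baseline score of the unmodified base

    for pos in range(len(merged)):
        for model in candidate_models:
            trial = merged.copy()
            trial[pos] = model[pos]      # swap in one candidate layer
            trial_score = score(trial)   # evaluate the hybrid
            if trial_score < best:       # keep only strict improvements
                merged, best = trial, trial_score
    return merged


if __name__ == "__main__":
    # Toy stand-in: "layers" are numbers, and the score is the total
    # distance from a made-up target configuration.
    target = [1.0, 2.0, 3.0]

    def toy_score(layers: List[float]) -> float:
        return sum(abs(a - b) for a, b in zip(layers, target))

    base = [5.0, 1.0, 4.0]
    others = [[1.0, 9.0, 3.5], [2.0, 2.0, 3.0]]
    print(greedy_layer_merge(base, others, toy_score))  # prints [1.0, 2.0, 3.0]
```

Because a swap is accepted only when it strictly lowers the score, the merged model never scores worse than the base, which matches the "maintaining or improving performance" bullet. With real models, `score` would need to load the hybrid and run the evaluation datasets, which is far more expensive than this toy function.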