---
datasets:
- Marcus2112/minipile_density-proportioned
language:
- en
base_model:
- EleutherAI/pythia-160m-deduped
---

| Benchmark | Measure | | 160M MiniPile | 160M Density |
| ---------------- | ---------- | --- | --------------------------- | -------------------------------- |
| ARC-Challenge | acc | ↑ | **0.2125 ± 0.0120** | 0.1920 ± 0.0115 |
| MMLU | acc | ↑ | **0.2699 ± 0.0037** | 0.2295 ± 0.0035 |
| HellaSwag | acc | ↑ | 0.2560 ± 0.0044 | **0.2604 ± 0.0044** |
| WinoGrande | acc | ↑ | 0.4720 ± 0.0140 | **0.5201 ± 0.0140** |
| Lambada (OpenAI) | acc | ↑ | 0.0000 ± 0.0000 | 0.0000 ± 0.0000 |
| Lambada (OpenAI) | perplexity | ↓ | 3033175.2693 ± 288926.5827 | **2099002.0912 ± 170652.6222** |
| Lambada (Std) | acc | ↑ | 0.0000 ± 0.0000 | 0.0000 ± 0.0000 |
| Lambada (Std) | perplexity | ↓ | 27067951.3460 ± 2710040.191 | **13347273.6076 ± 1997894.6360** |
| BLiMP | acc | ↑ | 0.5194 ± 0.0018 | **0.5501 ± 0.0017** |