Deathsquad10 committed
Commit f5f15b0 · 1 Parent(s): 3a47346
Update README.md
README.md CHANGED
@@ -34,7 +34,22 @@ Llamafactory EVAL
 Humanities: 25.62
 Other: 27.26
 
+!CUDA_VISIBLE_DEVICES=0 python src/evaluate.py \
+    --model_name_or_path Deathsquad10/TinyLlama-Remix \
+    --template vanilla \
+    --task cmmlu \
+    --split test \
+    --lang en \
+    --n_shot 5 \
+    --use_unsloth \
+    --batch_size 2
+
 
+Average: 24.98
+STEM: 25.52
+Social Sciences: 24.70
+Humanities: 24.59
+Other: 25.19
 https://github.com/jzhang38/TinyLlama
 
 The TinyLlama project aims to **pretrain** a **1.1B Llama model on 3 trillion tokens**. With some proper optimization, we can achieve this within a span of "just" 90 days using 16 A100-40G GPUs 🚀🚀. The training started on 2023-09-01.
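The added hunk documents a CMMLU run through LLaMA-Factory's `src/evaluate.py` against the `Deathsquad10/TinyLlama-Remix` checkpoint. As a rough companion check, here is a minimal sketch of loading the same checkpoint with Hugging Face Transformers for a quick generation; only the model name comes from the diff above, while the library calls, prompt, and generation settings are assumptions rather than part of this commit.

```python
# Minimal sanity-check sketch, assuming standard Hugging Face Transformers usage.
# Only the checkpoint name is taken from the diff above; everything else is illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Deathsquad10/TinyLlama-Remix"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Greedy decoding of a short prompt, just to confirm the checkpoint loads and generates.
prompt = "Question: What is the capital of France?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=16)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```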