Update README.md
README.md CHANGED
@@ -42,7 +42,16 @@ The Telugu LLaMA models have been enhanced and tailored specifically with an ext
Benchmarking was done using [LLM-Autoeval](https://github.com/mlabonne/llm-autoeval) on an RTX 3090 on [runpod](https://www.runpod.io/).

| Benchmark               | Llama 2 Chat | Tamil Llama v0.2 Instruct | Telugu Llama Instruct | Malayalam Llama Instruct |
|-------------------------|--------------|---------------------------|-----------------------|--------------------------|
| ARC Challenge (25-shot) | 52.9         | **53.75**                 | 52.47                 | 52.82                    |
| TruthfulQA (0-shot)     | 45.57        | 47.23                     | **48.47**             | 47.46                    |
| Hellaswag (10-shot)     | **78.55**    | 76.11                     | 76.13                 | 76.91                    |
| Winogrande (5-shot)     | 71.74        | **73.95**                 | 71.74                 | 73.16                    |
| AGI Eval (0-shot)       | 29.3         | **30.95**                 | 28.44                 | 29.6                     |
| BigBench (0-shot)       | 32.6         | 33.08                     | 32.99                 | **33.26**                |
| Average                 | 51.78        | **52.51**                 | 51.71                 | 52.2                     |
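The Average row is consistent with a plain unweighted mean of the six benchmark scores, rounded to two decimals. A minimal sketch to check this (scores are copied from the table above; the equal-weight mean is an assumption inferred from the numbers, not something stated in the source):

```python
# Check the Average row against the unweighted mean of the six
# benchmark scores listed in the table above.
scores = {
    "Llama 2 Chat":              [52.9, 45.57, 78.55, 71.74, 29.3, 32.6],
    "Tamil Llama v0.2 Instruct": [53.75, 47.23, 76.11, 73.95, 30.95, 33.08],
    "Telugu Llama Instruct":     [52.47, 48.47, 76.13, 71.74, 28.44, 32.99],
    "Malayalam Llama Instruct":  [52.82, 47.46, 76.91, 73.16, 29.6, 33.26],
}

for model, vals in scores.items():
    # Unweighted mean, rounded to two decimal places as in the table.
    print(f"{model}: {sum(vals) / len(vals):.2f}")
# Prints 51.78, 52.51, 51.71, 52.20, matching the Average row.
```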
## Related Models