Files changed (1)
README.md CHANGED (+121 -5)
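The hunks below rework the card's YAML front matter (the metadata block between the opening and closing `---` markers of a Hugging Face model card). As a quick sanity check when editing cards like this, the front matter can be pulled out with a minimal stdlib sketch; the helper name and sample text here are illustrative, not part of this repo:

```python
import re

def front_matter(card_text: str) -> str:
    """Return the YAML front-matter block of a model card.

    Hugging Face model cards keep metadata between a leading and a
    trailing '---' line; everything after that is the card body.
    """
    match = re.match(r"\A---\n(.*?)\n---\n", card_text, flags=re.DOTALL)
    if match is None:
        raise ValueError("no YAML front matter found")
    return match.group(1)

# Illustrative card mirroring the post-change key ordering of this diff.
sample = """---
language:
- en
license: cc-by-4.0
---

# xDAN-L1-Chat-RL-v1
"""

meta = front_matter(sample)
print("license: cc-by-4.0" in meta)  # prints True: the moved key is inside the block
```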
@@ -1,15 +1,118 @@
 ---
-license: cc-by-4.0
-datasets:
-- Open-Orca/OpenOrca
-- Intel/orca_dpo_pairs
 language:
 - en
+license: cc-by-4.0
 tags:
 - xDAN-AI
 - OpenOrca
 - DPO
 - Self-Think
+datasets:
+- Open-Orca/OpenOrca
+- Intel/orca_dpo_pairs
+model-index:
+- name: xDAN-L1-Chat-RL-v1
+  results:
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: AI2 Reasoning Challenge (25-Shot)
+      type: ai2_arc
+      config: ARC-Challenge
+      split: test
+      args:
+        num_few_shot: 25
+    metrics:
+    - type: acc_norm
+      value: 66.3
+      name: normalized accuracy
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=xDAN-AI/xDAN-L1-Chat-RL-v1
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: HellaSwag (10-Shot)
+      type: hellaswag
+      split: validation
+      args:
+        num_few_shot: 10
+    metrics:
+    - type: acc_norm
+      value: 85.81
+      name: normalized accuracy
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=xDAN-AI/xDAN-L1-Chat-RL-v1
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: MMLU (5-Shot)
+      type: cais/mmlu
+      config: all
+      split: test
+      args:
+        num_few_shot: 5
+    metrics:
+    - type: acc
+      value: 63.21
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=xDAN-AI/xDAN-L1-Chat-RL-v1
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: TruthfulQA (0-shot)
+      type: truthful_qa
+      config: multiple_choice
+      split: validation
+      args:
+        num_few_shot: 0
+    metrics:
+    - type: mc2
+      value: 56.7
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=xDAN-AI/xDAN-L1-Chat-RL-v1
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: Winogrande (5-shot)
+      type: winogrande
+      config: winogrande_xl
+      split: validation
+      args:
+        num_few_shot: 5
+    metrics:
+    - type: acc
+      value: 78.85
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=xDAN-AI/xDAN-L1-Chat-RL-v1
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: GSM8k (5-shot)
+      type: gsm8k
+      config: main
+      split: test
+      args:
+        num_few_shot: 5
+    metrics:
+    - type: acc
+      value: 59.44
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=xDAN-AI/xDAN-L1-Chat-RL-v1
+      name: Open LLM Leaderboard
 ---
 
 <div style="display: flex; justify-content: center; align-items: center">
@@ -130,4 +233,17 @@ and you carefully consider each step before providing answers.
 We employ rigorous data compliance validation algorithms throughout the training of our language model to ensure the highest level of compliance. However, due to the intricate nature of data and the wide range of potential usage scenarios for the model, we cannot guarantee that it will consistently produce accurate and sensible outputs. Users should be aware of the possibility of the model generating problematic results. Our organization disclaims any responsibility for risks or issues arising from misuse, improper guidance, unlawful usage, misinformation, or subsequent concerns regarding data security.
 
 ## About xDAN-AI
-xDAN-AI represents the forefront of Silicon-Based Life Factory technology. For comprehensive information and deeper insights into our cutting-edge technology and offerings, please visit our website: https://www.xdan.ai.
+xDAN-AI represents the forefront of Silicon-Based Life Factory technology. For comprehensive information and deeper insights into our cutting-edge technology and offerings, please visit our website: https://www.xdan.ai.
+# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_xDAN-AI__xDAN-L1-Chat-RL-v1)
+
+| Metric                          |Value|
+|---------------------------------|----:|
+|Avg.                             |68.38|
+|AI2 Reasoning Challenge (25-Shot)|66.30|
+|HellaSwag (10-Shot)              |85.81|
+|MMLU (5-Shot)                    |63.21|
+|TruthfulQA (0-shot)              |56.70|
+|Winogrande (5-shot)              |78.85|
+|GSM8k (5-shot)                   |59.44|
+
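The `Avg.` row in the added leaderboard table is the arithmetic mean of the six benchmark scores; a short sketch confirms the published figures are internally consistent:

```python
# Benchmark scores from the table added in this change.
scores = {
    "ARC (25-shot)": 66.30,
    "HellaSwag (10-shot)": 85.81,
    "MMLU (5-shot)": 63.21,
    "TruthfulQA (0-shot)": 56.70,
    "Winogrande (5-shot)": 78.85,
    "GSM8k (5-shot)": 59.44,
}

average = sum(scores.values()) / len(scores)
# The table reports Avg. = 68.38; allow for rounding in the published figure.
assert abs(average - 68.38) < 0.01
```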