SondosMB committed · verified
Commit bc21111 · 1 Parent(s): 86d655e

Update constants.py

Files changed (1): constants.py +2 -2
constants.py CHANGED
@@ -10,7 +10,7 @@ INTRODUCTION_TEXT= """
 # OS Benchmark (Evaluating LLMs with OS and MCQ)
 🔗 [Website](https://github.com/VILA-Lab/MBZUAI-LLM-Leaderboard) | 💻 [GitHub](https://github.com/VILA-Lab/MBZUAI-LLM-Leaderboard) | 📖 [Paper](#) | 🐦 [Tweet 1](#) | 🐦 [Tweet 2](#)

- > ### MBZUAI-LLM-Leaderboard, a new framework for evaluating large language models (LLMs) by transitioning from multiple-choice questions (MCQs) to open-style questions.
+ > ### Open-LLM-Leaderboard,for evaluating large language models (LLMs) by transitioning from multiple-choice questions (MCQs) to open-style questions.
 This approach addresses the inherent biases and limitations of MCQs, such as selection bias and the effect of random guessing. By utilizing open-style questions,
 the framework aims to provide a more accurate assessment of LLMs' abilities across various benchmarks and ensure that the evaluation reflects true capabilities,
 particularly in terms of language understanding and reasoning.
@@ -18,7 +18,7 @@ particularly in terms of language understanding and reasoning.
 """

 CITATION_TEXT = """@artical{..,
- title={MBZUAI-LLM-Leaderboard: From Multi-choice to Open-style Questions for LLMs Evaluation, Benchmark, and Arena},
+ title={Open-LLM-Leaderboard: From Multi-choice to Open-style Questions for LLMs Evaluation, Benchmark, and Arena},
 author={},
 year={2024},
 archivePrefix={arXiv}
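
Both edited strings are plain module-level constants, so the rename takes effect everywhere `constants.py` is imported, with no other code changes. Below is a minimal sketch of how a Gradio Space typically consumes such constants; the file name `app.py`, the accordion label, and the component choices are illustrative assumptions, not code from this repository.

```python
# app.py — hypothetical consumer of constants.py (sketch, not from this repo).
import gradio as gr

from constants import INTRODUCTION_TEXT, CITATION_TEXT

with gr.Blocks() as demo:
    # The intro constant is markdown, so headings and links render directly.
    gr.Markdown(INTRODUCTION_TEXT)
    # Leaderboard tables, plots, etc. would go here.
    with gr.Accordion("Citation", open=False):
        # The BibTeX constant is shown verbatim for easy copying.
        gr.Textbox(value=CITATION_TEXT, label="BibTeX", lines=6, show_copy_button=True)

if __name__ == "__main__":
    demo.launch()
```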