# OSQ-Leaderboard / constants.py
from pathlib import Path
banner_url = "https://huggingface.co/spaces/WildEval/WildBench-Leaderboard/resolve/main/%E2%80%8Eleaderboard_logo_v2.png"  # banner image hosted in the WildEval/WildBench-Leaderboard Space repo
BANNER = f'<div style="display: flex; justify-content: space-around;"><img src="{banner_url}" alt="Banner" style="width: 40vw; min-width: 300px; max-width: 600px;"> </div>'
INTRODUCTION_TEXT = """
# OSQ Benchmark (Evaluating LLMs with Open-Style Questions and MCQs)
πŸ”— [Website](https://github.com/VILA-Lab/MBZUAI-LLM-Leaderboard) | πŸ’» [GitHub](https://github.com/VILA-Lab/MBZUAI-LLM-Leaderboard) | πŸ“– [Paper](#) | 🐦 [Tweet 1](#) | 🐦 [Tweet 2](#)
> ### MBZUAI-LLM-Leaderboard is a new framework for evaluating large language models (LLMs) by transitioning from multiple-choice questions (MCQs) to open-style questions.
This approach addresses the inherent biases and limitations of MCQs, such as selection bias and the effects of random guessing. By using open-style questions,
the framework aims to provide a more accurate assessment of LLMs' abilities across benchmarks and to ensure that evaluation reflects models' true capabilities,
particularly in language understanding and reasoning.
"""
CITATION_TEXT = """@article{..,
title={MBZUAI-LLM-Leaderboard: From Multi-choice to Open-style Questions for LLMs Evaluation, Benchmark, and Arena},
author={},
year={2024},
archivePrefix={arXiv}
}
"""