Spaces: CIIRC-NLP/czechbench_leaderboard

Adamiros committed on Sep 9, 2024

Commit 0098e4c (verified) · 1 parent: dc5cd2c

Update src/display/about.py

src/display/about.py +17 -2

@@ -33,10 +33,25 @@ TITLE = """<h1 align="center" id="space-title">🇨🇿 CzechBench Leaderboard</
 # What does your leaderboard evaluate?
 INTRODUCTION_TEXT = """
-Czech-Bench is a collection of LLM benchmarks available for the Czech language. It currently consists of 15 Czech benchmarks, including new machine translations of the popular ARC, GSM8K, MMLU, and TruthfulQA datasets.
-Czech-Bench is developed by <a href="https://huggingface.co/CIIRC-NLP">CIIRC-NLP</a>.
+The goal of the CzechBench project is to provide a comprehensive and practical benchmark for evaluating Czech language models.
+Our [evaluation suite](https://github.com/jirkoada/czechbench_eval_harness/tree/main/lm_eval/tasks/czechbench#readme)
+currently consists of 15 individual tasks, leveraging pre-existing Czech datasets together with new machine translations of popular LLM benchmarks,
+including ARC, GSM8K, MMLU, and TruthfulQA.
+Key Features and Benefits:
+- **Tailored for the Czech Language:** The benchmark includes both original Czech datasets and adapted versions of international datasets, ensuring relevant evaluation of model performance in the Czech context.
+- **Wide Range of Tasks:** It contains 15 different tasks that cover various aspects of language understanding and text generation, enabling a comprehensive assessment of a model's capabilities.
+- **Universal Model Support:** The universal text-to-text evaluation approach adopted in CzechBench allows for direct comparison of models with varying levels of internal access, including commercial APIs.
+- **Ease of Use:** The benchmark is designed to be easily integrated into your development process, saving time and resources during model testing and improvement.
+- **Up-to-date and Relevant:** We regularly update our datasets to reflect the latest findings and trends in language model development.
+By using CzechBench, you will gain deep insights into the strengths and weaknesses of your models, allowing you to better focus on key areas for optimization.
+This will not only improve the performance of your models but also enhance their real-world deployment in various Czech contexts.
+Below, you can find the up-to-date leaderboard of models evaluated on CzechBench.
+For more information on the included benchmarks and instructions on evaluating your own models, please visit the "About" section below.
 """
+# Czech-Bench is developed by <a href="https://huggingface.co/CIIRC-NLP">CIIRC-NLP</a>.
 # Which evaluations are you running? how can people reproduce what you have?
 LLM_BENCHMARKS_TEXT = f"""
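
The "universal text-to-text evaluation approach" mentioned in the new introduction can be illustrated with a minimal sketch. This is not CzechBench's actual harness: `TextModel`, `normalize`, `exact_match_accuracy`, the two toy models, and the example questions are all hypothetical, and exact-match scoring is an assumption chosen for brevity. The point is only that a scorer which sees nothing but prompt and output strings can compare a local model and a commercial API on equal terms.

```python
from typing import Callable, Iterable, Tuple

# Hypothetical abstraction: any model, local or behind an API, is treated as
# a plain function from prompt text to output text.
TextModel = Callable[[str], str]


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return " ".join(text.lower().split())


def exact_match_accuracy(model: TextModel, examples: Iterable[Tuple[str, str]]) -> float:
    """Score a model using only its text output (no logits or internal access)."""
    pairs = list(examples)
    if not pairs:
        raise ValueError("at least one example is required")
    hits = sum(
        1 for prompt, reference in pairs
        if normalize(model(prompt)) == normalize(reference)
    )
    return hits / len(pairs)


# Illustrative data only; not taken from any CzechBench task.
EXAMPLES = [
    ("Kolik je 2 + 2?", "4"),
    ("Jaké je hlavní město České republiky?", "Praha"),
]


def local_model(prompt: str) -> str:
    """Toy stand-in for a locally hosted model."""
    answers = {EXAMPLES[0][0]: " 4 ", EXAMPLES[1][0]: "praha"}
    return answers.get(prompt, "")


def api_model(prompt: str) -> str:
    """Toy stand-in for a model reachable only through a text API."""
    return "4"


if __name__ == "__main__":
    print("local:", exact_match_accuracy(local_model, EXAMPLES))
    print("api:", exact_match_accuracy(api_model, EXAMPLES))
```

Because the scorer never inspects token probabilities, the same function works for both stand-ins; a real task would substitute its own prompts and metric.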