Update constants.py
constants.py  +5 -0
@@ -82,6 +82,11 @@ TABLE_INTRODUCTION = """In the table below, we summarize each task performance o
 We use accuracy (%) as the primary evaluation metric for each task.
 SEED-Bench-1 calculates the overall accuracy by dividing the total number of correct QA answers by the total number of QA questions.
 SEED-Bench-2 represents the overall accuracy using the average accuracy of each dimension.
+For the PPL evaluation method, we compute the loss for each answer candidate and select the candidate with the lowest loss. For details, please refer to [InternLM_Xcomposer_VL_interface](https://github.com/AILab-CVC/SEED-Bench/blob/387a067b6ba99ae5e8231f39ae2d2e453765765c/SEED-Bench-2/model/InternLM_Xcomposer_VL_interface.py#L74).
+For the PPL A/B/C/D evaluation method, please refer to [EVAL_SEED.md](https://github.com/QwenLM/Qwen-VL/blob/master/eval_mm/seed_bench/EVAL_SEED.md) for more information.
+For the Generate evaluation method, please refer to [Evaluation.md](https://github.com/haotian-liu/LLaVA/blob/main/docs/Evaluation.md#seed-bench) for details.
+NG indicates that the evaluation method is Not Given.
+If you have any questions, please feel free to contact us.
 """
 
 LEADERBORAD_INFO = """
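As context for the two aggregation rules kept in the unchanged lines above (SEED-Bench-1 divides total correct answers by total questions; SEED-Bench-2 averages the per-dimension accuracies), here is a minimal Python sketch of the difference. The dimension names and counts are made-up placeholders, not benchmark results.

```python
# Illustrative sketch (not part of this commit): two ways to aggregate an overall score.
per_dimension = {
    # dimension: (number of correct answers, number of questions) -- placeholder values
    "Scene Understanding": (90, 100),
    "Instance Identity":   (40, 50),
    "Action Recognition":  (10, 25),
}

# SEED-Bench-1 style: total correct divided by total questions (micro average).
total_correct = sum(c for c, _ in per_dimension.values())
total_questions = sum(n for _, n in per_dimension.values())
seed_bench_1_overall = 100.0 * total_correct / total_questions

# SEED-Bench-2 style: mean of each dimension's own accuracy (macro average).
seed_bench_2_overall = 100.0 * sum(c / n for c, n in per_dimension.values()) / len(per_dimension)

print(f"SEED-Bench-1 style overall: {seed_bench_1_overall:.1f}%")  # 80.0%
print(f"SEED-Bench-2 style overall: {seed_bench_2_overall:.1f}%")  # 70.0%
```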
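The PPL evaluation method described in the added text ranks answer candidates by language-modeling loss and picks the lowest. Below is a rough sketch of that idea, assuming a Hugging Face causal LM; the model name, prompt, and `candidate_loss` helper are illustrative placeholders, not the SEED-Bench implementation (see the linked InternLM_Xcomposer_VL_interface for the actual code).

```python
# Illustrative sketch (not part of this commit): score each candidate by its
# average cross-entropy loss and choose the candidate with the lowest loss.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")            # placeholder model
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def candidate_loss(question: str, candidate: str) -> float:
    """Average loss over the candidate tokens appended to the question."""
    prompt_ids = tokenizer(question, return_tensors="pt").input_ids
    full_ids = tokenizer(question + " " + candidate, return_tensors="pt").input_ids
    labels = full_ids.clone()
    labels[:, : prompt_ids.shape[1]] = -100                  # mask out the prompt tokens
    with torch.no_grad():
        out = model(full_ids, labels=labels)
    return out.loss.item()

question = "What color is the sky on a clear day? Answer:"
candidates = ["blue", "green", "red", "yellow"]
losses = [candidate_loss(question, c) for c in candidates]
prediction = candidates[losses.index(min(losses))]           # lowest loss wins
print(prediction)
```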