TIGER-Lab/VideoScore

hexuan21 committed on Jun 22, 2024

Commit 9612d1b (verified) · 1 parent: 6a021c3

Update README.md

Files changed (1)

README.md +1 -2

@@ -25,8 +25,7 @@ a large video evaluation dataset with multi-aspect human scores.
 - MantisScore also beat the best baselines on other three benchmarks EvalCrafter, GenAI-Bench and VBench, showing high alignment with human evaluations.
-## Performance
-### Evaluation Results
+## Evaluation Results
 We test our video evaluation model MantisScore on VideoEval-test, EvalCrafter, GenAI-Bench and VBench.
 For the first two benchmarks, we take Spearman corrleation between model's output and human ratings
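
Note on the last context line: the README says the first two benchmarks are scored by the Spearman correlation between the model's output and human ratings. As a rough illustration only, here is a minimal pure-Python sketch of that statistic, computed as the Pearson correlation of tie-averaged ranks. The function names and the sample scores are hypothetical and do not come from the repository, whose actual evaluation code may differ (for example, it may rely on a library routine such as `scipy.stats.spearmanr`).

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with the one at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman rho: Pearson correlation of the two rank vectors."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical example data (not taken from any benchmark).
model_scores = [2.1, 3.4, 1.0, 3.9, 2.8]
human_ratings = [2, 3, 1, 4, 3]
rho = spearman(model_scores, human_ratings)
print(f"Spearman rho: {rho:.3f}")
```

Because only ranks enter the computation, the statistic does not depend on the scale of the model's scores, which suits comparing continuous model outputs against discrete human ratings.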