anamikac2708
committed on
Update README.md
README.md
CHANGED
@@ -99,7 +99,7 @@ QLORA paper link - https://arxiv.org/abs/2305.14314
 ## Evaluation

 <!-- This section describes the evaluation protocols and provides the results. -->
-We evaluated the model on test set (sample 1k) https://huggingface.co/datasets/FinLang/investopedia-instruction-tuning-dataset. Evaluation was done using Proprietary LLMs as
+We evaluated the model on test set (sample 1k) https://huggingface.co/datasets/FinLang/investopedia-instruction-tuning-dataset. Evaluation was done using Proprietary LLMs as jury on four criteria: Correctness, Faithfulness, Clarity, and Completeness, on a scale of 1-5 (1 being worst and 5 being best), inspired by the paper Replacing Judges with Juries https://arxiv.org/abs/2404.18796. The model got an average score of 4.67.
 Average inference speed of the model is 10.96 secs. Human Evaluation is in progress to see the percentage of alignment between human and LLM.


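The added line describes a jury of LLMs rating each answer on four criteria on a 1-5 scale and reports a single average. The README does not say how the ratings were pooled, so the following is only a minimal sketch of one plausible aggregation: average the four criteria per judge, then average across judges, then across samples. The function names and the example ratings are hypothetical and are not taken from the repository.

```python
from statistics import mean

# The four criteria named in the README change.
CRITERIA = ("correctness", "faithfulness", "clarity", "completeness")


def jury_score(ratings):
    """Pool one sample's ratings into a single score.

    `ratings` is a list with one dict per judge, mapping each criterion
    to an integer from 1 (worst) to 5 (best). Equal weighting of judges
    and criteria is an assumption, not something the README states.
    """
    per_judge = []
    for judge_ratings in ratings:
        for criterion in CRITERIA:
            score = judge_ratings[criterion]
            if not 1 <= score <= 5:
                raise ValueError(f"{criterion} score {score} is outside 1-5")
        # Average the four criteria for this judge.
        per_judge.append(mean(judge_ratings[c] for c in CRITERIA))
    # Average across the judges on the jury.
    return mean(per_judge)


def dataset_score(all_ratings):
    """Average the pooled jury score over every evaluated sample."""
    return mean(jury_score(sample) for sample in all_ratings)


# Hypothetical ratings from two judges for a single sample.
example_sample = [
    {"correctness": 5, "faithfulness": 5, "clarity": 4, "completeness": 4},
    {"correctness": 5, "faithfulness": 5, "clarity": 5, "completeness": 5},
]
```

Calling `dataset_score` on a list of such samples gives one number comparable in form to the reported average. A different pooling rule, such as a median or a per-criterion breakdown, would need a different reduction in place of `mean`.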