anamikac2708/Llama3-8b-finetuned-investopedia-Merged-FP16


anamikac2708 committed on Jun 16, 2024

Commit f50a657 · verified · 1 Parent(s): bb41449

Update README.md

Files changed (1)

README.md +1 -1


@@ -99,7 +99,7 @@ QLORA paper link - https://arxiv.org/abs/2305.14314
 ## Evaluation
 <!-- This section describes the evaluation protocols and provides the results. -->
-We evaluated the model on test set (sample 1k) https://huggingface.co/datasets/FinLang/investopedia-instruction-tuning-dataset. Evaluation was done using Proprietary LLMs as judge on four criteria Correctness, Faithfullness, Clarity, Completeness on scale of 1-5 (1 being worst & 5 being best) inspired  by the paper Replacing Judges with Juries https://arxiv.org/abs/2404.18796. Model got an average score of 4.67.
+We evaluated the model on test set (sample 1k) https://huggingface.co/datasets/FinLang/investopedia-instruction-tuning-dataset. Evaluation was done using Proprietary LLMs as jury on four criteria Correctness, Faithfullness, Clarity, Completeness on scale of 1-5 (1 being worst & 5 being best) inspired  by the paper Replacing Judges with Juries https://arxiv.org/abs/2404.18796. Model got an average score of 4.67.
 Average inference speed of the model is 10.96 secs. Human Evaluation is in progress to see the percentage of alignment between human and LLM.
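
The README does not say how the jury's ratings were combined into the single reported average. As a rough illustration only, the sketch below assumes an unweighted mean: each juror's four criterion ratings are averaged, and the per-juror means are then averaged. The function name `jury_score`, the juror labels, and the example ratings are all hypothetical and are not results from this model.

```python
from statistics import mean

# The four criteria named in the README, each rated on a 1-5 scale.
CRITERIA = ("correctness", "faithfulness", "clarity", "completeness")


def jury_score(ratings):
    """Return the unweighted mean rating for one response.

    `ratings` maps a juror label to a dict of criterion -> rating (1-5).
    Assumption: jurors and criteria are weighted equally; the README
    does not state the actual aggregation rule.
    """
    per_juror = []
    for juror, scores in ratings.items():
        for criterion in CRITERIA:
            value = scores[criterion]
            if not 1 <= value <= 5:
                raise ValueError(
                    f"{juror}: {criterion} rating {value} is outside 1-5"
                )
        per_juror.append(mean(scores[c] for c in CRITERIA))
    return mean(per_juror)


# Hypothetical ratings from two hypothetical jurors for a single response.
example = {
    "juror_a": {"correctness": 5, "faithfulness": 5, "clarity": 4, "completeness": 4},
    "juror_b": {"correctness": 5, "faithfulness": 4, "clarity": 5, "completeness": 4},
}

print(jury_score(example))
```

Averaging per juror first keeps every juror's weight equal even if a different pooling rule (for example, majority vote or a median) is later swapped in for the final step.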