kgreenewald
committed on
Update README.md
README.md
CHANGED
@@ -161,11 +161,12 @@ The following datasets were used for calibration and/or finetuning.
 ## Evaluation
 
 The model was evaluated on the [MMLU](https://huggingface.co/datasets/cais/mmlu) datasets (not used in training). Shown are the [Expected Calibration Error (ECE)](https://towardsdatascience.com/expected-calibration-error-ece-a-step-by-step-visual-explanation-with-python-code-c3e9aa12937d) for each task, for the base model (Granite-3.0-8b-instruct) and Granite-Uncertainty-3.0-8b.
-The average ECE across tasks is 0.
+The average ECE across tasks for our method is 0.064 (out of 1) and is consistently low across tasks (maximum task ECE 0.10), compared to the base model average ECE of 0.20 and maximum task ECE of 0.60. Note that our ECE of 0.064 is smaller than the gap between the quantized certainty outputs (10% quantization steps). Additionally, the zero-shot performance on the MMLU tasks does not degrade, averaging at 89%.
 <!-- This section describes the evaluation protocols and provides the results. -->
 
 
-
+
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/6602ffd971410cf02bf42c06/2MwP7DRZlNBtWSKWFvXOI.png)
 
 
 ## Model Card Authors
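For reference, a minimal sketch of how the reported ECE numbers could be computed, assuming the standard binned definition with 10 equal-width bins (matching the 10% quantization steps of the certainty outputs). The arrays in the usage example are hypothetical, not the reported MMLU results.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE: bin-weighted average gap between stated confidence and observed accuracy."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    # Assign each prediction to a confidence bin (the last bin includes 1.0).
    bin_ids = np.digitize(confidences, bins[1:-1], right=False)
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            ece += mask.mean() * gap  # weight the gap by the fraction of samples in the bin
    return ece

# Toy usage with made-up certainty scores and correctness flags:
conf = np.array([0.9, 0.8, 0.6, 0.9, 0.7])
hit = np.array([1, 1, 0, 1, 1])
print(round(expected_calibration_error(conf, hit), 3))
```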