---
library_name: transformers
tags:
- transformers
- peft
- arxiv:2406.08391
license: llama2
base_model: meta-llama/Llama-2-13b-chat-hf
datasets:
- calibration-tuning/Llama-2-13b-chat-hf-20k-oe
---
# Model Card
**Llama 2 13B Chat CT-OE** is a fine-tuned [Llama 2 13B Chat](https://huggingface.co/meta-llama/Llama-2-13b-chat-hf) model that provides well-calibrated confidence estimates for open-ended question answering.
The model is fine-tuned (calibration-tuned) using a [dataset](https://huggingface.co/datasets/calibration-tuning/Llama-2-13b-chat-hf-20k-oe) of *open-ended* generations from `meta-llama/Llama-2-13b-chat-hf`, labeled for correctness.
At test/inference time, the probability of correctness defines the confidence of the model in its answer.
For full details, please see our [paper](https://arxiv.org/abs/2406.08391) and supporting [code](https://github.com/activatedgeek/calibration-tuning).
**Other Models**: We also release a broader collection of [Open-Ended CT Models](https://huggingface.co/collections/calibration-tuning/open-ended-ct-models-66043b12c7902115c826a20e).
## Usage
This adapter is meant to be applied to generations from the `meta-llama/Llama-2-13b-chat-hf` base model.
The confidence estimation pipeline follows these steps:
1. Load base model and PEFT adapter.
2. Disable adapter and generate answer.
3. Enable adapter and generate confidence.
All standard guidelines for the base model's generation apply.
For a complete example, see [play.py](https://github.com/activatedgeek/calibration-tuning/blob/main/experiments/play.py) at the supporting code repository.
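
The three steps above can be sketched roughly as follows. This is a minimal illustration, not the reference implementation: the adapter repository id is left as a caller-supplied argument, the `QUERY_TEMPLATE` wording and the use of the last sub-token of "yes"/"no" as answer tokens are assumptions made for this sketch, and the helper names (`load`, `generate_answer`, `estimate_confidence`, `confidence_from_logits`) are hypothetical. The prompt format actually used during calibration-tuning is defined in `play.py` and should be preferred. The sketch assumes `torch`, `transformers`, `peft`, and `accelerate` are installed.

```python
import math

BASE_MODEL = "meta-llama/Llama-2-13b-chat-hf"

# ASSUMPTION: illustrative query wording only; use the template from play.py
# so that the prompt matches what the adapter saw during fine-tuning.
QUERY_TEMPLATE = "{prompt}{answer}\n\nIs the proposed answer correct? Answer yes or no:"


def confidence_from_logits(logit_no, logit_yes):
    """Two-way softmax: probability mass on "yes" relative to "no"."""
    m = max(logit_no, logit_yes)  # subtract the max for numerical stability
    e_no = math.exp(logit_no - m)
    e_yes = math.exp(logit_yes - m)
    return e_yes / (e_no + e_yes)


def load(adapter_id):
    """Step 1: load the base model and attach the PEFT adapter."""
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
    base = AutoModelForCausalLM.from_pretrained(
        BASE_MODEL, torch_dtype=torch.float16, device_map="auto"
    )
    model = PeftModel.from_pretrained(base, adapter_id)
    model.eval()
    return tokenizer, model


def generate_answer(tokenizer, model, prompt, max_new_tokens=64):
    """Step 2: generate the answer with the adapter disabled."""
    import torch

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.inference_mode(), model.disable_adapter():
        output = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Keep only the newly generated tokens, not the echoed prompt.
    new_tokens = output[0, inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()


def estimate_confidence(tokenizer, model, prompt, answer):
    """Step 3: score the answer with the adapter enabled (the PEFT default)."""
    import torch

    query = QUERY_TEMPLATE.format(prompt=prompt, answer=answer)
    inputs = tokenizer(query, return_tensors="pt").to(model.device)
    with torch.inference_mode():
        next_token_logits = model(**inputs).logits[0, -1].float()
    # ASSUMPTION: the last sub-token of each word stands in for the choice.
    no_id = tokenizer.encode("no", add_special_tokens=False)[-1]
    yes_id = tokenizer.encode("yes", add_special_tokens=False)[-1]
    return confidence_from_logits(
        next_token_logits[no_id].item(), next_token_logits[yes_id].item()
    )


if __name__ == "__main__":
    import sys

    # Pass the adapter repository id (or a local path) on the command line.
    tokenizer, model = load(sys.argv[1])
    prompt = "What is the capital of France?"
    answer = generate_answer(tokenizer, model, prompt)
    print(answer, estimate_confidence(tokenizer, model, prompt, answer))
```

Keeping answer generation inside `model.disable_adapter()` reflects the recommendation in this card that the adapter be used only for the confidence query, while the base model produces the answer itself.
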
**NOTE**: Using the adapter for generations may hurt downstream task accuracy and confidence estimates. We recommend using the adapter to estimate *only* confidence.
## License
The model is released under the original model's Llama 2 Community License Agreement.