---
license: apache-2.0
language:
- en
pipeline_tag: text-generation
library_name: transformers
---

# Granite Uncertainty 3.0 8b

## Model Summary
**Granite Uncertainty 3.0 8b** is a LoRA adapter for [ibm-granite/granite-3.0-8b-instruct](https://huggingface.co/ibm-granite/granite-3.0-8b-instruct) that adds the capability to provide calibrated certainty scores when answering questions, while retaining the full abilities of the [ibm-granite/granite-3.0-8b-instruct](https://huggingface.co/ibm-granite/granite-3.0-8b-instruct) model.

- **Developer:** IBM Research
- **Model type:** LoRA adapter for [ibm-granite/granite-3.0-8b-instruct](https://huggingface.co/ibm-granite/granite-3.0-8b-instruct)
- **License:** [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0)

### Model Sources
- **Paper:** The **Granite Uncertainty 3.0 8b** model is finetuned to provide certainty scores mimicking the output of a calibrator trained via the method in [[Shen et al. ICML 2024] Thermometer: Towards Universal Calibration for Large Language Models](https://arxiv.org/abs/2403.08819).

## Usage

### Intended use
**Granite Uncertainty 3.0 8b** is lightly tuned so that its behavior closely mimics that of [ibm-granite/granite-3.0-8b-instruct](https://huggingface.co/ibm-granite/granite-3.0-8b-instruct), with the added ability to generate certainty scores for answers to questions when prompted.

**Certainty score definition** The model responds with a certainty percentage, quantized to 10 possible values (5%, 15%, 25%, ..., 95%). This percentage is *calibrated* in the following sense: given a set of answers assigned a certainty score of X%, approximately X% of these answers should be correct. For example, of all answers assigned a certainty score of 85%, roughly 85% should be correct. See the evaluation experiment below for out-of-distribution verification of this behavior, and the sketch at the end of this section for how such a check can be computed.

**Important note** Certainty is inherently an intrinsic property of a model and its abilities. **Granite Uncertainty 3.0 8b** is not intended to predict the certainty of responses generated by any other models besides itself or [ibm-granite/granite-3.0-8b-instruct](https://huggingface.co/ibm-granite/granite-3.0-8b-instruct).

**Usage steps** Answering a question and obtaining a certainty score proceeds as follows.

1. Prompt the model with a system prompt followed by the user prompt. The model is calibrated with the system prompt below.
2. Use the model to generate a response as normal (via the `assistant` role), or insert a response from [ibm-granite/granite-3.0-8b-instruct](https://huggingface.co/ibm-granite/granite-3.0-8b-instruct).
3. Prompt the model to generate a certainty score by generating in the `certainty` role (by appending `<|start_of_role|>certainty<|end_of_role|>` and generating).
4. The model will respond with a certainty percentage, quantized in steps of 10% (i.e. 5%, 15%, 25%, ..., 95%).

When not given the certainty generation prompt `<|start_of_role|>certainty<|end_of_role|>`, the model's behavior should mimic that of the base model [ibm-granite/granite-3.0-8b-instruct](https://huggingface.co/ibm-granite/granite-3.0-8b-instruct).

**System prompt** The model was calibrated with the following system prompt:

`You are an AI language model developed by IBM Research. You are a cautious assistant. You carefully follow instructions. You are helpful and harmless and you follow ethical guidelines and promote positive behavior.`

It is recommended to prepend this string to any other desired system prompts.
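To make the calibration property concrete, the following is a minimal sketch (with hypothetical data, not part of the model release) of how it can be checked empirically: bucket graded answers by their quantized certainty score and compare each bucket's accuracy against its score. The size-weighted average of these gaps is the Expected Calibration Error (ECE) reported in the Evaluation section below.

```python
from collections import defaultdict

# Hypothetical (certainty score %, answer was correct) pairs; in practice
# these come from grading model answers on a labeled QA set.
results = [(85, True), (85, True), (85, False), (55, True), (55, False)]

# Group outcomes by their quantized certainty bucket (5%, 15%, ..., 95%).
buckets = defaultdict(list)
for score, correct in results:
    buckets[score].append(correct)

# Expected Calibration Error: per-bucket |accuracy - confidence| gap,
# weighted by bucket size. 0 would mean perfect calibration.
n = len(results)
ece = sum(
    len(outcomes) / n * abs(sum(outcomes) / len(outcomes) - score / 100)
    for score, outcomes in buckets.items()
)
print(f"ECE: {ece:.3f}")
```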
### Quickstart Example

The following code shows how to use the Granite Uncertainty model to answer questions and obtain intrinsic calibrated certainty scores. Note that a generic system prompt is included; it is not necessary and can be modified as needed.

```python
import os

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

token = os.getenv("HF_MISTRAL_TOKEN")
BASE_NAME = "ibm-granite/granite-3.0-8b-instruct"
LORA_NAME = "ibm-granite/granite-uncertainty-3.0-8b-lora"
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load the base model and wrap it with the uncertainty LoRA adapter
tokenizer = AutoTokenizer.from_pretrained(BASE_NAME, padding_side="left", trust_remote_code=True, token=token)
model_base = AutoModelForCausalLM.from_pretrained(BASE_NAME, device_map="auto")
model_UQ = PeftModel.from_pretrained(model_base, LORA_NAME)

system_prompt = "You are an AI language model developed by IBM Research. You are a cautious assistant. You carefully follow instructions. You are helpful and harmless and you follow ethical guidelines and promote positive behavior."
question = "What is IBM?"
print("Question: " + question)
question_chat = [
    {
        "role": "system",
        "content": system_prompt
    },
    {
        "role": "user",
        "content": question
    },
]

# Generate an answer as normal
input_text = tokenizer.apply_chat_template(question_chat, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(input_text, return_tensors="pt")
output = model_UQ.generate(inputs["input_ids"].to(device), attention_mask=inputs["attention_mask"].to(device), max_new_tokens=80)
output_text = tokenizer.decode(output[0])
answer = output_text.split("assistant<|end_of_role|>")[1]
print("Answer: " + answer)

# Generate a certainty score for the answer via the certainty role
uq_generation_prompt = "<|start_of_role|>certainty<|end_of_role|>"
uq_chat = [
    {
        "role": "system",
        "content": system_prompt
    },
    {
        "role": "user",
        "content": question
    },
    {
        "role": "assistant",
        "content": answer
    },
]

uq_text = tokenizer.apply_chat_template(uq_chat, tokenize=False) + uq_generation_prompt
inputs = tokenizer(uq_text, return_tensors="pt")
output = model_UQ.generate(inputs["input_ids"].to(device), attention_mask=inputs["attention_mask"].to(device), max_new_tokens=1)
output_text = tokenizer.decode(output[0])
uq_score = int(output_text[-1])
print("Certainty: " + str(5 + uq_score * 10) + "%")
```
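For convenience, the two generation steps above can be wrapped in a single helper. The sketch below reuses the `tokenizer`, `model_UQ`, `device`, and `system_prompt` objects defined in the quickstart; the function name and its response-parsing details are illustrative, not part of the model release.

```python
# Illustrative convenience wrapper around the quickstart flow above.
# Assumes tokenizer, model_UQ, device, and system_prompt are already defined.
def answer_with_certainty(question: str, max_new_tokens: int = 80):
    chat = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": question},
    ]
    # Step 1: generate an answer as normal.
    text = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
    inputs = tokenizer(text, return_tensors="pt")
    output = model_UQ.generate(
        inputs["input_ids"].to(device),
        attention_mask=inputs["attention_mask"].to(device),
        max_new_tokens=max_new_tokens,
    )
    answer = tokenizer.decode(output[0]).split("assistant<|end_of_role|>")[1]

    # Step 2: append the answer and prompt the certainty role for one token.
    chat.append({"role": "assistant", "content": answer})
    uq_text = tokenizer.apply_chat_template(chat, tokenize=False) + "<|start_of_role|>certainty<|end_of_role|>"
    inputs = tokenizer(uq_text, return_tensors="pt")
    output = model_UQ.generate(
        inputs["input_ids"].to(device),
        attention_mask=inputs["attention_mask"].to(device),
        max_new_tokens=1,
    )
    certainty = 5 + int(tokenizer.decode(output[0])[-1]) * 10  # quantized: 5%, 15%, ..., 95%
    return answer, certainty

answer, certainty = answer_with_certainty("What is IBM?")
print(answer + "\nCertainty: " + str(certainty) + "%")
```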
## Training Details
The **Granite Uncertainty 3.0 8b** model is a LoRA adapter finetuned to provide certainty scores mimicking the output of a calibrator trained via the method in [[Shen et al. ICML 2024] Thermometer: Towards Universal Calibration for Large Language Models](https://arxiv.org/abs/2403.08819).

### Training Data
The following datasets were used for calibration and/or finetuning.

* [BigBench](https://huggingface.co/datasets/tasksource/bigbench)
* [MRQA](https://huggingface.co/datasets/mrqa-workshop/mrqa)
* [newsqa](https://huggingface.co/datasets/lucadiliello/newsqa)
* [trivia_qa](https://huggingface.co/datasets/mandarjoshi/trivia_qa)
* [search_qa](https://huggingface.co/datasets/lucadiliello/searchqa)
* [openbookqa](https://huggingface.co/datasets/allenai/openbookqa)
* [web_questions](https://huggingface.co/datasets/Stanford/web_questions)
* [smiles-qa](https://huggingface.co/datasets/alxfgh/ChEMBL_Drug_Instruction_Tuning)
* [orca-math](https://huggingface.co/datasets/microsoft/orca-math-word-problems-200k)
* [ARC-Easy](https://huggingface.co/datasets/allenai/ai2_arc)
* [commonsense_qa](https://huggingface.co/datasets/tau/commonsense_qa)
* [social_i_qa](https://huggingface.co/datasets/allenai/social_i_qa)
* [super_glue](https://huggingface.co/datasets/aps/super_glue)
* [figqa](https://huggingface.co/datasets/nightingal3/fig-qa)
* [riddle_sense](https://huggingface.co/datasets/INK-USC/riddle_sense)
* [ag_news](https://huggingface.co/datasets/fancyzhx/ag_news)
* [medmcqa](https://huggingface.co/datasets/openlifescienceai/medmcqa)
* [dream](https://huggingface.co/datasets/dataset-org/dream)
* [codah](https://huggingface.co/datasets/jaredfern/codah)
* [piqa](https://huggingface.co/datasets/ybisk/piqa)

## Evaluation
The model was evaluated on the [MMLU](https://huggingface.co/datasets/cais/mmlu) datasets (not used in training). Shown below are the [Expected Calibration Error (ECE)](https://towardsdatascience.com/expected-calibration-error-ece-a-step-by-step-visual-explanation-with-python-code-c3e9aa12937d) values for each task, for the base model (Granite-3.0-8b-instruct) and Granite-Uncertainty-3.0-8b. The average ECE across tasks is 0.06 (out of 1). Note that this is smaller than the 10% step size between adjacent quantized certainty outputs.

![image/png](https://cdn-uploads.huggingface.co/production/uploads/6602ffd971410cf02bf42c06/x0IRS16p59O19r4hwOzyU.png)

## Model Card Authors
Kristjan Greenewald