---
license: apache-2.0
language: en
tags:
- microsoft/deberta-v3-base
datasets:
- multi_nli
- snli
- fever
- tals/vitaminc
- paws
metrics:
- accuracy
- auc
- balanced accuracy
pipeline_tag: text-classification
widget:
- text: "A man walks into a bar and buys a drink [SEP] A bloke swigs alcohol at a pub"
  example_title: "Positive"
---
# Cross-Encoder for Hallucination Detection
This model was trained using the [SentenceTransformers](https://sbert.net) [Cross-Encoder](https://www.sbert.net/examples/applications/cross-encoder/README.html) class.
The model outputs a probability from 0 to 1, where 0 indicates a hallucination and 1 indicates factual consistency.
The predictions can be thresholded at 0.5 to predict whether a document is consistent with its source.
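
As a minimal sketch of that thresholding step, assuming the scores have already been computed: the values below are illustrative only, and `label_scores` is a hypothetical helper name, not part of the model's API.

```python
import numpy as np

def label_scores(scores, threshold=0.5):
    """Return True for pairs judged consistent, False for likely hallucinations."""
    scores = np.asarray(scores, dtype=np.float32)
    # Assumption: a score exactly at the threshold counts as consistent;
    # only scores strictly below it are flagged as likely hallucinations.
    return scores >= threshold

example_scores = [0.61, 0.0005, 0.996]  # illustrative values, not model output
labels = label_scores(example_scores)
print(labels)
```

A stricter threshold than 0.5 can be passed in if false positives are more costly in your application; 0.5 is simply the default suggested here.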

## Training Data
This model is based on [microsoft/deberta-v3-base](https://huggingface.co/microsoft/deberta-v3-base). It is first trained on NLI data to determine textual entailment, and then further fine-tuned on summarization datasets with samples annotated for factual consistency, including [FEVER](https://huggingface.co/datasets/fever), [Vitamin C](https://huggingface.co/datasets/tals/vitaminc) and [PAWS](https://huggingface.co/datasets/paws).

## Performance

* [TRUE Dataset](https://arxiv.org/pdf/2204.04991.pdf) (Minus Vitamin C, FEVER and PAWS) - 0.872 AUC Score
* [SummaC Benchmark](https://aclanthology.org/2022.tacl-1.10.pdf) (Test Split) - 0.764 Balanced Accuracy, 0.831 AUC Score
* [AnyScale Ranking Test for Hallucinations](https://www.anyscale.com/blog/llama-2-is-about-as-factually-accurate-as-gpt-4-for-summaries-and-is-30x-cheaper) - 86.6% Accuracy

## Results (Leaderboard)
If you want to stay up to date with results of the latest tests using this model, a public leaderboard is maintained and periodically updated on the [vectara/hallucination-leaderboard](https://github.com/vectara/hallucination-leaderboard) GitHub repository.

## Note about using the Inference API Widget on the Right
To use the model with the widget, you need to pass both documents as a single string separated with [SEP]. For example:

* A man walks into a bar and buys a drink [SEP] A bloke swigs alcohol at a pub
* A person on a horse jumps over a broken down airplane. [SEP] A person is at a diner, ordering an omelette.
* A person on a horse jumps over a broken down airplane. [SEP] A person is outdoors, on a horse.

See the examples below for the expected probability scores.
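
If you build widget inputs programmatically, the single-string format described above could be produced with a small helper along these lines. This is only a sketch; `to_widget_input` is a hypothetical name introduced for illustration.

```python
def to_widget_input(source: str, claim: str, sep: str = "[SEP]") -> str:
    """Join a source document and a claim into the single-string widget format."""
    # The source document comes first, the text to validate comes second.
    return f"{source.strip()} {sep} {claim.strip()}"

widget_text = to_widget_input(
    "A man walks into a bar and buys a drink",
    "A bloke swigs alcohol at a pub",
)
print(widget_text)
```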

## Usage with Sentence Transformers (Recommended)

The model can be used like this on pairs of documents, passed as a list of lists of strings (```List[List[str]]```):

```python
from sentence_transformers import CrossEncoder

model = CrossEncoder('vectara/hallucination_evaluation_model')
scores = model.predict([
    ["A man walks into a bar and buys a drink", "A bloke swigs alcohol at a pub"],
    ["A person on a horse jumps over a broken down airplane.", "A person is at a diner, ordering an omelette."],
    ["A person on a horse jumps over a broken down airplane.", "A person is outdoors, on a horse."],
    ["A boy is jumping on skateboard in the middle of a red bridge.", "The boy skates down the sidewalk on a blue bridge"],
    ["A man with blond-hair, and a brown shirt drinking out of a public water fountain.", "A blond drinking water in public."],
    ["A man with blond-hair, and a brown shirt drinking out of a public water fountain.", "A blond man wearing a brown shirt is reading a book."],
    ["Mark Wahlberg was a fan of Manny.", "Manny was a fan of Mark Wahlberg."],  
])
```

This returns a numpy array of factual consistency scores, one per pair. A score below 0.5 indicates a likely hallucination:
```
array([0.61051559, 0.00047493709, 0.99639291, 0.00021221573, 0.99599433, 0.0014127002, 0.0028262993], dtype=float32)
```

Note that the model is designed to work with entire documents, as long as they fit into the 512-token context window (counted across both documents).
The order of the documents also matters: the first document is the source, and the second is validated against the first for factual consistency, e.g. as a summary of the source or a claim drawn from it.
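
One way to check the context-window constraint before scoring is to tokenize the pair together and count the resulting ids. The sketch below assumes a Hugging Face style tokenizer that accepts `(text, text_pair)` and returns `input_ids`; `fits_context` and `WhitespaceTokenizer` are hypothetical names, and the stand-in tokenizer exists only so the example runs offline.

```python
def fits_context(tokenizer, source, claim, max_tokens=512):
    """Check whether a (source, claim) pair fits in the context window."""
    # Encoding the two texts as a pair also counts the special tokens.
    encoded = tokenizer(source, claim)
    return len(encoded["input_ids"]) <= max_tokens


class WhitespaceTokenizer:
    """Hypothetical stand-in tokenizer; NOT the model's real tokenizer."""

    def __call__(self, text, text_pair):
        # One id per whitespace-separated word plus three special tokens,
        # mimicking a [CLS] text [SEP] text_pair [SEP] layout.
        n_tokens = len(text.split()) + len(text_pair.split()) + 3
        return {"input_ids": list(range(n_tokens))}


tok = WhitespaceTokenizer()
short_ok = fits_context(tok, "A person on a horse.", "A person is outdoors.")
long_ok = fits_context(tok, "word " * 600, "A short claim.")
print(short_ok, long_ok)
```

With the real model, you would pass the tokenizer loaded via `AutoTokenizer.from_pretrained('vectara/hallucination_evaluation_model')` instead of the stand-in; real subword counts are typically higher than whitespace word counts.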

## Usage with Transformers AutoModel
You can also use the model directly with the Transformers library (without the SentenceTransformers library):

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
import numpy as np

model = AutoModelForSequenceClassification.from_pretrained('vectara/hallucination_evaluation_model')
tokenizer = AutoTokenizer.from_pretrained('vectara/hallucination_evaluation_model')

pairs = [
    ["A man walks into a bar and buys a drink", "A bloke swigs alcohol at a pub"],
    ["A person on a horse jumps over a broken down airplane.", "A person is at a diner, ordering an omelette."],
    ["A person on a horse jumps over a broken down airplane.", "A person is outdoors, on a horse."],
    ["A boy is jumping on skateboard in the middle of a red bridge.", "The boy skates down the sidewalk on a blue bridge"],
    ["A man with blond-hair, and a brown shirt drinking out of a public water fountain.", "A blond drinking water in public."],
    ["A man with blond-hair, and a brown shirt drinking out of a public water fountain.", "A blond man wearing a brown shirt is reading a book."],
    ["Mark Wahlberg was a fan of Manny.", "Manny was a fan of Mark Wahlberg."], 
]

# Tokenize each (source, claim) pair together and pad to the longest pair in the batch
inputs = tokenizer.batch_encode_plus(pairs, return_tensors='pt', padding=True)

model.eval()
with torch.no_grad():
    outputs = model(**inputs)
    logits = outputs.logits.cpu().numpy()
    # Convert logits to probabilities with the sigmoid function
    scores = (1 / (1 + np.exp(-logits))).flatten()
```

This returns a numpy array of factual consistency scores, one per pair. A score below 0.5 indicates a likely hallucination:
```
array([0.61051559, 0.00047493709, 0.99639291, 0.00021221573, 0.99599433, 0.0014127002, 0.0028262993], dtype=float32)
```
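
The logit-to-probability conversion used in the Transformers snippet is the logistic sigmoid. The sketch below uses made-up logits (not model outputs), shaped `(batch, 1)` on the assumption of a single-output classification head, and shows that thresholding the score at 0.5 is equivalent to thresholding the raw logit at 0.

```python
import numpy as np

def sigmoid(logits):
    """Map raw logits to probabilities in (0, 1)."""
    logits = np.asarray(logits, dtype=np.float64)
    return 1.0 / (1.0 + np.exp(-logits))

logits = np.array([[-6.0], [0.0], [4.0]])  # made-up values, shape (batch, 1)
scores = sigmoid(logits).flatten()          # shape (batch,)

# A logit of 0 maps to exactly 0.5, so both thresholds give the same labels.
same_labels = np.array_equal(scores >= 0.5, logits.flatten() >= 0.0)
print(scores, same_labels)
```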