Text Classification · Transformers · Safetensors · English · HHEMv2Config · custom_code
Miaoran000 committed · Commit d6fdaee · 1 Parent(s): 6f7b340

update readme

Files changed (1)
  1. README.md +57 -2
README.md CHANGED
@@ -18,7 +18,11 @@ By "hallucinated" or "factually inconsistent", we mean that a text (hypothesis,
  A common type of hallucination in RAG is **factual but hallucinated**.
  For example, given the premise _"The capital of France is Berlin"_, the hypothesis _"The capital of France is Paris"_ is hallucinated -- although it is true according to world knowledge. This happens when LLMs do not generate content based on the textual data provided to them as part of the RAG retrieval process, but rather generate content based on their pre-trained knowledge.

- ## Using HHEM-2.1-Open
+ ## Using HHEM-2.1-Open with `transformers`
+
+ HHEM-2.1 has some breaking changes from HHEM-1.0, so your previous code will not work anymore. While we are working on backward compatibility, please follow the new usage instructions below.
+
+ **Using with the `Auto` class**

  HHEM-2.1-Open can be loaded easily using the `transformers` library. Just remember to set `trust_remote_code=True` to take advantage of the pre-/post-processing code we provided for your convenience. The **input** of the model is a list of pairs of (premise, hypothesis). For each pair, the model will **return** a score between 0 and 1, where 0 means that the hypothesis is not evidenced at all by the premise and 1 means the hypothesis is fully supported by the premise.

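The full `Auto` class code example sits between these two hunks and is collapsed in this view; only its tail (the `model.predict(pairs)` call quoted in the next hunk header and the score tensor) is visible. For orientation, here is a minimal sketch of that usage: the `AutoModelForSequenceClassification` entry point is an assumption, while the model id, `trust_remote_code=True`, and the `predict()` call are quoted elsewhere on this page.

```python
from transformers import AutoModelForSequenceClassification

# Minimal sketch of the collapsed `Auto` class example. The class name is an
# assumption; the model id, trust_remote_code=True, and predict() come from
# the surrounding README text.
model = AutoModelForSequenceClassification.from_pretrained(
    'vectara/hallucination_evaluation_model', trust_remote_code=True)

pairs = [  # (premise, hypothesis) pairs
    ("The capital of France is Berlin.", "The capital of France is Paris."),
    ("I am in California", "I am in United States."),
]

scores = model.predict(pairs)  # use predict(); do not call model(pairs) directly
print(scores)  # one consistency score in [0, 1] per pair, e.g. tensor([0.0111, 0.6474])
```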
@@ -44,7 +48,58 @@ model.predict(pairs) # note the predict() method. Do not do model(pairs).
  # tensor([0.0111, 0.6474, 0.1290, 0.8969, 0.1846, 0.0050, 0.0543])
  ```

- Note that the order of a pair is important. For example, notice how the 2nd and 3rd examples in the `pairs` list are consistent and hallcuianted, respectively.
+
+ **Using with the `text-classification` pipeline**
+
+ Please note that when using the `text-classification` pipeline for prediction, scores for both labels will be returned for each pair. The score for the **consistent** label is the one to focus on.
+
+ ```python
+ from transformers import pipeline, AutoTokenizer
+
+ pairs = [
+     ("The capital of France is Berlin.", "The capital of France is Paris."),
+     ('I am in California', 'I am in United States.'),
+     ('I am in United States', 'I am in California.'),
+     ("A person on a horse jumps over a broken down airplane.", "A person is outdoors, on a horse."),
+     ("A boy is jumping on skateboard in the middle of a red bridge.", "The boy skates down the sidewalk on a red bridge"),
+     ("A man with blond-hair, and a brown shirt drinking out of a public water fountain.", "A blond man wearing a brown shirt is reading a book."),
+     ("Mark Wahlberg was a fan of Manny.", "Manny was a fan of Mark Wahlberg.")
+ ]
+
+ # Apply the prompt template to each (premise, hypothesis) pair
+ prompt = "<pad> Determine if the hypothesis is true given the premise?\n\nPremise: {text1}\n\nHypothesis: {text2}"
+ input_pairs = [prompt.format(text1=pair[0], text2=pair[1]) for pair in pairs]
+
+ # Use the text-classification pipeline to predict
+ classifier = pipeline(
+     "text-classification",
+     model='vectara/hallucination_evaluation_model',
+     tokenizer=AutoTokenizer.from_pretrained('google/flan-t5-base'),
+     trust_remote_code=True
+ )
+ classifier(input_pairs, return_all_scores=True)
+
+ # output
+ # [[{'label': 'hallucinated', 'score': 0.9889384508132935},
+ #   {'label': 'consistent', 'score': 0.011061512865126133}],
+ #  [{'label': 'hallucinated', 'score': 0.35263675451278687},
+ #   {'label': 'consistent', 'score': 0.6473632454872131}],
+ #  [{'label': 'hallucinated', 'score': 0.870982825756073},
+ #   {'label': 'consistent', 'score': 0.1290171593427658}],
+ #  [{'label': 'hallucinated', 'score': 0.1030581071972847},
+ #   {'label': 'consistent', 'score': 0.8969419002532959}],
+ #  [{'label': 'hallucinated', 'score': 0.8153750896453857},
+ #   {'label': 'consistent', 'score': 0.18462494015693665}],
+ #  [{'label': 'hallucinated', 'score': 0.9949689507484436},
+ #   {'label': 'consistent', 'score': 0.005031010136008263}],
+ #  [{'label': 'hallucinated', 'score': 0.9456764459609985},
+ #   {'label': 'consistent', 'score': 0.05432349815964699}]]
+ ```
+
+ You may run into a warning message saying "Token indices sequence length is longer than the specified maximum sequence length". Please ignore this warning for now; it is a notification inherited from the foundation model, T5-base.
+
+ Note that the order of a pair is important. For example, the 2nd and 3rd examples in the `pairs` list are consistent and hallucinated, respectively.


  ## HHEM-2.1-Open vs. HHEM-1.0
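Since the pipeline returns scores for both labels per pair, it can be handy to reduce its output to the single per-pair consistency score that `model.predict(pairs)` gives directly. A small post-processing sketch that reuses `classifier` and `input_pairs` from the pipeline example added in the diff above; the variable names `full_scores` and `consistent_scores` are illustrative, not part of the README.

```python
# Reuses `classifier` and `input_pairs` from the pipeline example in the diff above.
full_scores = classifier(input_pairs, return_all_scores=True)

# Keep only the 'consistent' score for each pair; this mirrors the single
# per-pair score returned by model.predict(pairs).
consistent_scores = [
    next(item['score'] for item in pair_scores if item['label'] == 'consistent')
    for pair_scores in full_scores
]
print(consistent_scores)
# e.g. [0.0111, 0.6474, 0.1290, 0.8969, 0.1846, 0.0050, 0.0543]
```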