al1231 committed on
Commit
6e3595d
·
1 Parent(s): fc86eec
README.md CHANGED
@@ -1,3 +1,87 @@
- ---
- license: mit
- ---
+ ---
+ library_name: peft
+ base_model: meta-llama/Llama-2-7b-chat-hf
+ language:
+ - en
+ license: mit
+ pipeline_tag: text-generation
+ ---
+
+ # Refiner-7B
+
+ Restructure Retrieved Content Efficiently to Advance Question-Answering Capabilities.
+
+ ## TL;DR
+ _Refiner_ is an end-to-end extract-and-restructure paradigm that extracts query-relevant content together with its necessary context and sections interconnected information, keeping distinct pieces of information separate and aligned with the original context. _Refiner_ achieves an 80.5% token reduction and a 1.6-7.0% improvement margin in multi-hop tasks, rivaling the next-best solution, [LongLLMLingua](https://arxiv.org/abs/2310.06839).
+
+ **How _Refiner_ Works**
+ _Refiner_ integrates seamlessly into the post-retrieval stage of RAG systems, using a single decoder-only LLM to:
+
+ * **Adaptively extract query-relevant contents**: Extracts the necessary context verbatim and groups interconnected contents into sections.
+ * **Preserve information distinction**: Highlights contextual relationships so that the output faithfully represents the original context.
+
+ **Benefits**
+ * **Improved answer accuracy**: Gains in downstream LLM performance (a 1.6-7.0% margin on multi-hop tasks).
+ * **Efficient compression**: An 80.5% token reduction.
+
+
+ ### Model Details
+ This repository contains the PEFT (LoRA) adapter fine-tuned on Llama-2-7B-Chat.
+ See the [Model Repository](https://github.com/allen-li1231/refiner-rag) and the [Paper](http://arxiv.org/abs/2406.11357).
+
+ ## Usage
+
+ ```python
40
+ !pip install -qU transformers accelerate
41
+
42
+ from transformers import AutoTokenizer, AutoModelForCausalLM
43
+ from peft.peft_model import PeftModel
44
+
45
+
46
+ base_model = "meta-llama/Llama-2-7b-chat-hf"
47
+ adapter = "al1231/Refiner-7B"
48
+ TEMPLATE = "[INST]<<SYS>>[MONITOR]{context}<</SYS>>{question}[/INST] "
49
+
50
+ tokenizer = AutoTokenizer.from_pretrained(base_model)
51
+ base_model = AutoModelForCausalLM.from_pretrained(base_model, device_map="auto")
52
+ model = PeftModel.from_pretrained(base_model, adapter, is_trainable=False)
53
+ model.eval()
54
+
55
+ question = "What is John Finlay's occupation?"
56
+ context = "## John Finlay (footballer)\nJohn Finlay (16 February 1919 – 5 March 1985) was an English professional footballer who played as an inside forward for Sunderland. John Finlay made his debut on the 11th of September 1946 as a substitute for Sunderland AFC on their 4th match of the season in Division 1 against Charlton Athletic. 27,425 attended the match witnessing John's Debut. The Match which started at 6:00pm at Charlton Athletic's stadium “The Valley” was refereed by W.H.E Evans\n---\n## Andrew Finlay\nAndrew Finlay (born 10 February 1901; date of death unknown) was a Scottish footballer who played as a forward for Port Vale, Airdrieonians, Manchester City, Crewe Alexandra, Third Lanark, Dundee United and Hibernian in the 1920s....Source:\n---\n## John Finlay (poet)\nJohn Finlay (1782–1810) was a Scottish poet. Finlay was born in Glasgow in December 1782. He was educated in one of the academies at Glasgow, and at the age of fourteen entered the university, where he had as a classmate John Wilson (alias 'Christopher North'), who states that he was distinguished \"above most of his contemporaries\". The prospect of obtaining a situation in one of the public offices led him to visit London in 1807, and while there he contributed to the magazines some articles on antiquarian subjects. Not finding suitable employment he returned to Glasgow in 1808. He began to collect materials for a continuation of Warton's History of Poetry, but in 1810 he left Glasgow to visit Professor Wilson at Ellerlay, Westmoreland; on the way he fell ill at Moffat, and died there on 8 December.\n---\n## John Finlay (Canadian politician)\nJohn Finlay (April 22, 1837 &ndash; November 13, 1910) was a Canadian politician. Born in Dummer Township, Peterborough County, Upper Canada, Finlay was educated in the Public Schools of Dummer. A manufacturer, Finlay was Councillor and Reeve of the Village of Norwood and County Councillor. 
He was elected to the House of Commons of Canada for the electoral district of Peterborough East in the general elections of 1904. A Liberal, he did not run in the 1908 elections.\n---\n## John Finlay (fur trader)\nJohn Finlay (1774 – December 19, 1833) was a fur trader and explorer with the North West Company. He is best remembered for establishing the first fur trading post in what is now British Columbia, Canada and for his exploration of the Finlay River, one of the two major rivers forming the Peace River. Finlay was born in Montreal, the son of James Finlay, who himself was a significant player in the western Canadian fur trade. Finlay was apprenticed as a clerk in the North West Company in 1789 at the age of 15. He accompanied Alexander Mackenzie on his historic trip across the Rocky Mountains to the Pacific Ocean in 1792-93 becoming, with him, the first European to traverse North America. He was placed...Finlay in 1824, noting that \"he had studied Finlay’s chart.\" Nonetheless, it would appear from the information Black had that Finlay had only made it as far as the Ingenika River, about 130 km north of the Finlay River's confluence with the Peace. Indeed, Black's journal makes clear that the northern branch, far from being less complicated, was all but impassable in many parts, perhaps explaining Finlay's reluctance to travel more than about one-quarter of the river's actual length. Finlay remained in the North West Company's Athabasca Department, becoming a partner of the company in 1799. He retired from the fur trade in 1804 and returned to Montreal. Little is known of his life there, except that he obtained an appointment as deputy commissary-general."
+
+ prompt = TEMPLATE.format(question=question, context=context)
+
+ inputs = tokenizer(prompt, return_tensors="pt")
+
+ # Greedy decoding (do_sample=False); the sampling parameters are neutralised.
+ preds = model.generate(
+     **inputs.to(model.device),
+     top_p=1,
+     temperature=None,
+     do_sample=False,
+     max_new_tokens=2048,
+     num_return_sequences=1,
+     output_scores=True,
+     return_dict_in_generate=True,
+     use_cache=True)
+ # Keep only the newly generated tokens by slicing off the prompt.
+ pred_token_ids = preds.sequences[:, inputs.input_ids.shape[1]:]
+ pred_text = tokenizer.batch_decode(pred_token_ids, skip_special_tokens=True)
+ print(pred_text)
+ ```
+
77
+ ## Citation
78
+ ```cite
79
+ @misc{li2024textitrefiner,
80
+ title={$\textit{Refiner}$: Restructure Retrieval Content Efficiently to Advance Question-Answering Capabilities},
81
+ author={Zhonghao Li and Xuming Hu and Aiwei Liu and Kening Zheng and Sirui Huang and Hui Xiong},
82
+ year={2024},
83
+ eprint={2406.11357},
84
+ archivePrefix={arXiv},
85
+ primaryClass={id='cs.CL' full_name='Computation and Language' is_active=True alt_name='cmp-lg' in_archive='cs' is_general=False description='Covers natural language processing. Roughly includes material in ACM Subject Class I.2.7. Note that work on artificial languages (programming languages, logics, formal systems) that does not explicitly address natural-language issues broadly construed (natural-language processing, computational linguistics, speech, text retrieval, etc.) is not appropriate for this area.'}
86
+ }
87
+ ```
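
The usage snippet passes one pre-joined `context` string. The sketch below shows one way such a string could be assembled from a list of retrieved documents. The helper name `build_refiner_prompt` and the `title`/`text` keys are hypothetical, and the `## <title>` heading plus `---` separator layout is inferred from the example context in the README rather than from documented behaviour.

```python
from typing import Dict, List

# Prompt template copied from the README usage snippet.
TEMPLATE = "[INST]<<SYS>>[MONITOR]{context}<</SYS>>{question}[/INST] "


def build_refiner_prompt(question: str, documents: List[Dict[str, str]]) -> str:
    """Join retrieved documents into one context string and fill the template.

    Assumption: each document is rendered as "## <title>\n<text>" and documents
    are separated by "\n---\n", mirroring the README example context.
    """
    sections = [f"## {doc['title']}\n{doc['text'].strip()}" for doc in documents]
    context = "\n---\n".join(sections)
    return TEMPLATE.format(question=question, context=context)


# Hypothetical retrieval results, shortened from the README example.
docs = [
    {"title": "John Finlay (poet)", "text": "John Finlay (1782-1810) was a Scottish poet."},
    {"title": "John Finlay (fur trader)", "text": "John Finlay was a fur trader and explorer."},
]
prompt = build_refiner_prompt("What is John Finlay's occupation?", docs)
print(prompt)
```

The resulting `prompt` can be passed to `tokenizer(...)` in place of the hand-written one in the usage snippet.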
adapter_config.json ADDED
@@ -0,0 +1,29 @@
+ {
+   "alpha_pattern": {},
+   "auto_mapping": null,
+   "base_model_name_or_path": "meta-llama/Llama-2-7b-chat-hf",
+   "bias": "none",
+   "fan_in_fan_out": false,
+   "inference_mode": true,
+   "init_lora_weights": true,
+   "layer_replication": null,
+   "layers_pattern": null,
+   "layers_to_transform": null,
+   "loftq_config": {},
+   "lora_alpha": 16.0,
+   "lora_dropout": 0.1,
+   "megatron_config": null,
+   "megatron_core": "megatron.core",
+   "modules_to_save": null,
+   "peft_type": "LORA",
+   "r": 64,
+   "rank_pattern": {},
+   "revision": null,
+   "target_modules": [
+     "v_proj",
+     "q_proj"
+   ],
+   "task_type": "CAUSAL_LM",
+   "use_dora": false,
+   "use_rslora": false
+ }
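
As a rough reading aid for the config above, the sketch below works out the LoRA scaling factor (`lora_alpha / r`, which applies because `use_rslora` is false) and the adapter's parameter count. The model dimensions (hidden size 4096, 32 decoder layers, square `q_proj`/`v_proj` matrices) are assumptions about Llama-2-7B and are not stated in this commit.

```python
# Values read from adapter_config.json.
LORA_R = 64
LORA_ALPHA = 16.0
TARGET_MODULES = ["v_proj", "q_proj"]

# Assumed Llama-2-7B dimensions (not part of this commit).
HIDDEN_SIZE = 4096
NUM_LAYERS = 32


def lora_scaling(alpha: float, r: int) -> float:
    """Standard LoRA scaling applied to the low-rank update B @ A."""
    return alpha / r


def lora_param_count(r: int, in_features: int, out_features: int) -> int:
    """Parameters in one LoRA pair: A is (r x in), B is (out x r)."""
    return r * (in_features + out_features)


scaling = lora_scaling(LORA_ALPHA, LORA_R)
per_module = lora_param_count(LORA_R, HIDDEN_SIZE, HIDDEN_SIZE)
total = per_module * len(TARGET_MODULES) * NUM_LAYERS
print(f"scaling={scaling}, per_module={per_module}, total={total}")
```

Under these assumptions the adapter holds about 33.5M parameters, a small fraction of the 7B base model.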
special_tokens_map.json ADDED
@@ -0,0 +1,24 @@
+ {
+   "bos_token": {
+     "content": "<s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "eos_token": {
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": "</s>",
+   "unk_token": {
+     "content": "<unk>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,42 @@
+ {
+   "add_bos_token": true,
+   "add_eos_token": false,
+   "added_tokens_decoder": {
+     "0": {
+       "content": "<unk>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "1": {
+       "content": "<s>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "2": {
+       "content": "</s>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "bos_token": "<s>",
+ "chat_template": "{% if messages[0]['role'] == 'system' %}{% set loop_messages = messages[1:] %}{% set system_message = messages[0]['content'] %}{% else %}{% set loop_messages = messages %}{% set system_message = false %}{% endif %}{% for message in loop_messages %}{% if (message['role'] == 'user') != (loop.index0 % 2 == 0) %}{{ raise_exception('Conversation roles must alternate user/assistant/user/assistant/...') }}{% endif %}{% if loop.index0 == 0 and system_message != false %}{% set content = '<<SYS>>\\n' + system_message + '\\n<</SYS>>\\n\\n' + message['content'] %}{% else %}{% set content = message['content'] %}{% endif %}{% if message['role'] == 'user' %}{{ bos_token + '[INST] ' + content.strip() + ' [/INST]' }}{% elif message['role'] == 'assistant' %}{{ ' ' + content.strip() + ' ' + eos_token }}{% endif %}{% endfor %}",
+   "clean_up_tokenization_spaces": false,
+   "eos_token": "</s>",
+   "legacy": false,
+   "model_max_length": 1000000000000000019884624838656,
+   "pad_token": "</s>",
+   "padding_side": "left",
+   "sp_model_kwargs": {},
+   "tokenizer_class": "LlamaTokenizer",
+   "unk_token": "<unk>",
+   "use_default_system_prompt": false
+ }
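
The tokenizer config sets `padding_side` to `"left"` and reuses `</s>` (id 2 in `added_tokens_decoder`) as the pad token. The pure-Python sketch below illustrates what left padding does to a batch of token-id sequences. The `left_pad` helper and the example ids are hypothetical and stand in for what the tokenizer would do internally when called with `padding=True`.

```python
from typing import List, Tuple

PAD_TOKEN_ID = 2  # id of "</s>" according to added_tokens_decoder


def left_pad(
    batch: List[List[int]], pad_id: int = PAD_TOKEN_ID
) -> Tuple[List[List[int]], List[List[int]]]:
    """Pad every sequence on the left to the longest length in the batch.

    Returns (input_ids, attention_mask); the mask is 0 on padding positions.
    """
    max_len = max(len(seq) for seq in batch)
    input_ids, attention_mask = [], []
    for seq in batch:
        n_pad = max_len - len(seq)
        input_ids.append([pad_id] * n_pad + seq)
        attention_mask.append([0] * n_pad + [1] * len(seq))
    return input_ids, attention_mask


# Arbitrary example ids; 1 is the BOS token "<s>".
ids, mask = left_pad([[1, 306, 626], [1, 15043]])
print(ids)
print(mask)
```

Left padding keeps the last real token of every prompt at the final position, which is where a decoder-only model continues generating from in batched inference.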