Update README.md
README.md CHANGED
@@ -3,19 +3,17 @@ license: apache-2.0
inference: false
---

-#

<!-- Provide a quick summary of what the model is/does. -->

-DRAGON models are fine-tuned with high-quality custom instruct datasets, designed for production use in RAG scenarios.

### Benchmark Tests

Evaluated against the benchmark test: [RAG-Instruct-Benchmark-Tester](https://www.huggingface.co/datasets/llmware/rag_instruct_benchmark_tester)

--**Accuracy Score**: **100.0** correct out of 100
--Not Found Classification: 95.0%
@@ -32,7 +30,7 @@ For test run results (and good indicator of target use cases), please see the fi

<!-- Provide a longer summary of what this model is. -->

- **Developed by:** llmware
-- **Model type:**
- **Language(s) (NLP):** English
- **License:** Apache 2.0
- **Finetuned from model:** Microsoft Phi-3
@@ -63,55 +61,28 @@ without the need for a lot of complex instruction verbiage - provide a text pass

<!-- This section is meant to convey both technical and sociotechnical limitations. -->

Any model can provide inaccurate or incomplete information, and should be used in conjunction with appropriate safeguards and fact-checking mechanisms.


## How to Get Started with the Model

-from transformers import AutoTokenizer, AutoModelForCausalLM
-tokenizer = AutoTokenizer.from_pretrained("bling-phi-2-v0", trust_remote_code=True)
-model = AutoModelForCausalLM.from_pretrained("bling-phi-2-v0", trust_remote_code=True)

-Please refer to the generation_test.py files in the Files repository, which include 200 samples and a script to test the model. The **generation_test_llmware_script.py** includes built-in llmware capabilities for fact-checking, as well as easy integration with document parsing and actual retrieval, to swap out the test set for a RAG workflow consisting of business documents.

-The DRAGON model was fine-tuned with a simple "\<human>" and "\<bot>" wrapper, so to get the best results, wrap inference entries as:

-full_prompt = "<human>: " + my_prompt + "\n" + "<bot>:"

-The BLING model was fine-tuned with closed-context samples, which generally assume that the prompt consists of two sub-parts:

-1. Text Passage Context, and
-2. Specific question or instruction based on the text passage

-To get the best results, package "my_prompt" as follows:

-my_prompt = {{text_passage}} + "\n" + {{question/instruction}}

-If you are using a HuggingFace generation script:

-# prepare prompt packaging used in fine-tuning process
-new_prompt = "<human>: " + entries["context"] + "\n" + entries["query"] + "\n" + "<bot>:"

-inputs = tokenizer(new_prompt, return_tensors="pt")
-start_of_output = len(inputs.input_ids[0])

-outputs = model.generate(
-    inputs.input_ids,
-    eos_token_id=tokenizer.eos_token_id,
-    pad_token_id=tokenizer.eos_token_id,
-    do_sample=True,
-    temperature=0.3,
-    max_new_tokens=100,
-    )

-output_only = tokenizer.decode(outputs[0][start_of_output:], skip_special_tokens=True)


## Model Card Contact

inference: false
---

+# bling-phi-3-gguf

<!-- Provide a quick summary of what the model is/does. -->

+bling-phi-3-gguf is part of the BLING ("Best Little Instruct No-GPU") model series, RAG-instruct trained for fact-based question-answering use cases on top of a Microsoft Phi-3 base model.

### Benchmark Tests

Evaluated against the benchmark test: [RAG-Instruct-Benchmark-Tester](https://www.huggingface.co/datasets/llmware/rag_instruct_benchmark_tester)
+1 Test Run (with temperature = 0.0 and sample = False), scored as: 1 point for a correct answer, 0.5 points for a partially correct or blank / "not found" answer, 0.0 points for an incorrect answer, and -1 point for a hallucination.

--**Accuracy Score**: **100.0** correct out of 100
--Not Found Classification: 95.0%
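In other words, the reported score reduces to a weighted count over the 100 test questions. A minimal sketch of that arithmetic (the function and label names below are illustrative, not part of the benchmark harness):

```python
# illustrative scoring of the rubric described above - label names are assumptions
POINTS = {"correct": 1.0, "partial_or_not_found": 0.5, "incorrect": 0.0, "hallucination": -1.0}

def score_run(labels):
    # labels: one rubric label per benchmark question (100 in total)
    return sum(POINTS[label] for label in labels)

# e.g., 99 correct answers and 1 partial / not-found answer -> 99.5 out of 100
print(score_run(["correct"] * 99 + ["partial_or_not_found"]))
```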

<!-- Provide a longer summary of what this model is. -->

- **Developed by:** llmware
+- **Model type:** bling-rag-instruct
- **Language(s) (NLP):** English
- **License:** Apache 2.0
- **Finetuned from model:** Microsoft Phi-3

<!-- This section is meant to convey both technical and sociotechnical limitations. -->

+BLING models are designed to operate with grounded sources, e.g., the inclusion of a context passage in the prompt, and will not yield consistent or positive results with open-context prompting, in which you are looking for the model to draw upon its potential background knowledge of the world - in fact, it is likely that BLING will respond with a simple "Not Found." to an open-context query.

Any model can provide inaccurate or incomplete information, and should be used in conjunction with appropriate safeguards and fact-checking mechanisms.


## How to Get Started with the Model

+To pull the model via API:

+from huggingface_hub import snapshot_download
+snapshot_download("llmware/bling-phi-3-gguf", local_dir="/path/on/your/machine/", local_dir_use_symlinks=False)
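Note: the call above copies the full repository contents, including the GGUF model file, into local_dir; with local_dir_use_symlinks=False the files are stored as real copies rather than as symlinks into the local Hugging Face cache.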

+Load in your favorite GGUF inference engine, or try with llmware as follows:

+from llmware.models import ModelCatalog

+# to load the model and make a basic inference
+model = ModelCatalog().load_model("llmware/bling-phi-3-gguf", temperature=0.0, sample=False)
+response = model.inference(query, add_context=text_sample)
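In the snippet above, query and text_sample stand for your question and the grounding passage. A filled-in sketch, with an invented passage and question for illustration:

```python
# illustrative values - substitute your own document text and question
text_sample = ("Services were delivered between January and March, and the total "
               "amount of the invoice is $22,500, due within 30 days of receipt.")
query = "What is the total amount of the invoice?"

response = model.inference(query, add_context=text_sample)
print(response)
```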

+Details on the prompt wrapper and other configurations are in the config.json file in the files repository.
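If you run the GGUF file in another engine, the prompt wrapper has to be applied by hand. A minimal sketch using llama-cpp-python as one such engine, assuming the \<human>: / \<bot>: wrapper used across llmware's BLING models (confirm against the config.json noted above) and a placeholder model filename (check the actual .gguf filename in the files repository):

```python
from llama_cpp import Llama

# placeholder path - point at the .gguf file fetched by snapshot_download above
llm = Llama(model_path="/path/on/your/machine/bling-phi-3.gguf", n_ctx=2048, verbose=False)

text_passage = "..."   # grounding context passage
question = "..."       # question about the passage

# package the prompt in the <human> / <bot> wrapper described in this card
prompt = "<human>: " + text_passage + "\n" + question + "\n" + "<bot>:"

output = llm(prompt, max_tokens=150, temperature=0.0)
print(output["choices"][0]["text"])
```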
## Model Card Contact