doberst committed on
Commit 1cdfefd · verified · 1 Parent(s): ce892e3

Update README.md

Files changed (1): README.md +8 -12
README.md CHANGED
@@ -1,28 +1,26 @@
  ---
- license: apache-2.0
+ license: apache-2.0
+ inference: false
  ---
 
- # Model Card for Model ID
+ BLING-QWEN-NANO-TOOL
 
- <!-- Provide a quick summary of what the model is/does. -->
 
- **bling-answer-tool** is a quantized version of BLING Tiny-Llama 1B, with 4_K_M GGUF quantization, providing a very fast, very small inference implementation for use on CPUs.
-
- [**bling-tiny-llama**](https://huggingface.co/llmware/bling-tiny-llama-v0) is a fact-based question-answering model, optimized for complex business documents.
+ **bling-qwen-nano-tool** is a RAG-finetuned version of Qwen2-0.5B for fact-based, context-grounded question answering, packaged with 4_K_M GGUF quantization, providing a very fast, very small inference implementation for use on CPUs.
 
  To pull the model via API:
 
  from huggingface_hub import snapshot_download
- snapshot_download("llmware/bling-answer-tool", local_dir="/path/on/your/machine/", local_dir_use_symlinks=False)
+ snapshot_download("llmware/bling-qwen-nano-tool", local_dir="/path/on/your/machine/", local_dir_use_symlinks=False)
 
 
  Load in your favorite GGUF inference engine, or try with llmware as follows:
 
  from llmware.models import ModelCatalog
- model = ModelCatalog().load_model("bling-answer-tool")
+ model = ModelCatalog().load_model("bling-qwen-nano-tool")
  response = model.inference(query, add_context=text_sample)
 
- Note: please review [**config.json**](https://huggingface.co/llmware/bling-answer-tool/blob/main/config.json) in the repository for prompt wrapping information, details on the model, and the full test set.
+ Note: please review [**config.json**](https://huggingface.co/llmware/bling-qwen-nano-tool/blob/main/config.json) in the repository for prompt wrapping information, details on the model, and the full test set.
 
 
  ### Model Description
@@ -32,9 +30,7 @@
  - **Developed by:** llmware
  - **Model type:** GGUF
  - **Language(s) (NLP):** English
- - **License:** Apache 2.0
- - **Quantized from model:** [llmware/bling-tiny-llama](https://huggingface.co/llmware/bling-tiny-llama-v0/)
-
+ - **License:** Apache 2.0
 
  ## Model Card Contact
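The two snippets in the updated README (download via `snapshot_download`, then load and run through llmware's `ModelCatalog`) can be combined into one end-to-end sketch. This is a hedged illustration, not part of the commit: the helper name `pull_and_answer`, the `local_dir` path, and the sample query/context are all made up for the example, and it assumes the `llmware` and `huggingface_hub` packages are installed.

```python
def pull_and_answer(query: str, text_sample: str):
    """Download llmware/bling-qwen-nano-tool and answer `query` against `text_sample`."""
    # Imports are kept local so the sketch can be read without the
    # packages installed; both calls mirror the README snippets.
    from huggingface_hub import snapshot_download
    from llmware.models import ModelCatalog

    # Pull the GGUF artifacts to a local directory (path is illustrative).
    snapshot_download("llmware/bling-qwen-nano-tool",
                      local_dir="models/bling-qwen-nano-tool",
                      local_dir_use_symlinks=False)

    # Load through llmware's catalog and run a fact-based QA inference
    # against the supplied context passage.
    model = ModelCatalog().load_model("bling-qwen-nano-tool")
    return model.inference(query, add_context=text_sample)

if __name__ == "__main__":
    answer = pull_and_answer("What is the invoice total?",
                             "The invoice total was $1,250, due on March 15.")
    print(answer)
```

Because the model is RAG-finetuned, the question is passed as `query` and the source passage as `add_context`, matching the calling convention shown in the README.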