flyingfishinwater
committed on
Update README.md
README.md CHANGED
@@ -22,10 +22,11 @@ Llama 3 is the latest and most advanced LLM trained over 15T tokens, which impro

**Prompt Format:**
```
-<|
+<|start_header_id|>user<|end_header_id|>

{{prompt}}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

+assistant

```
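For reference, a minimal sketch of how a template like the one above is filled in at inference time: the `{{prompt}}` placeholder is replaced verbatim and the special tokens are left for the runtime to parse. The helper name and exact newline placement (inferred from the blank lines in the template) are illustrative assumptions, not part of this commit:

```python
# Fill the Llama 3 prompt template shown above. {{prompt}} is replaced
# verbatim; the <|...|> special tokens are parsed by the runtime when
# "Parse Special Tokens" is enabled for the model entry.
LLAMA3_TEMPLATE = (
    "<|start_header_id|>user<|end_header_id|>\n\n"
    "{{prompt}}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
)

def build_prompt(user_message: str) -> str:
    """Substitute the user's message into the template."""
    return LLAMA3_TEMPLATE.replace("{{prompt}}", user_message)

print(build_prompt("What is the capital of France?"))
```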
@@ -97,7 +98,7 @@ The TinyLlama project aims to pretrain a 1.1B Llama model on 3 trillion tokens.

**Prompt Format:**
```
-<|
+<|user|>{{prompt}}</s><|assistant|>
```

**Template Name:** TinyLlama
@@ -146,12 +147,12 @@ The Mistral-7B-v0.2 Large Language Model (LLM) is a pretrained generative text m

---

-# OpenChat 3.5
+# OpenChat 3.5 (0106)
OpenChat is an innovative library of open-source language models, fine-tuned with C-RLFT - a strategy inspired by offline reinforcement learning. Our models learn from mixed-quality data without preference labels, delivering exceptional performance on par with ChatGPT, even with a 7B model. Despite our simple approach, we are committed to developing a high-performance, commercially viable, open-source large language model, and we continue to make significant strides toward this vision.

**Model Intention:** It's a 7B model and performs really well for Q&A, but it requires a high-end device to run.

-**Model URL:** [https://huggingface.co/flyingfishinwater/goodmodels/resolve/main/openchat-3.5-
+**Model URL:** [https://huggingface.co/flyingfishinwater/goodmodels/resolve/main/openchat-3.5-0106.Q3_K_M.gguf?download=true](https://huggingface.co/flyingfishinwater/goodmodels/resolve/main/openchat-3.5-0106.Q3_K_M.gguf?download=true)

**Model Info URL:** [https://huggingface.co/openchat/openchat_3.5](https://huggingface.co/openchat/openchat_3.5)
@@ -161,13 +162,13 @@ OpenChat is an innovative library of open-source language models, fine-tuned wit

**Developer:** [https://openchat.team/](https://openchat.team/)

-**File Size:**
+**File Size:** 3520 MB

-**Context Length:**
+**Context Length:** 8192 tokens

**Prompt Format:**
```
-
+GPT4 Correct User: {{prompt}}<|end_of_turn|>GPT4 Correct Assistant:
```

**Template Name:** Mistral
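The new file size and context length map directly onto loader settings. A minimal usage sketch, assuming the llama-cpp-python bindings and a local copy of the GGUF listed above (the file name and stop-token handling are assumptions):

```python
# Minimal sketch: load the OpenChat 3.5 (0106) GGUF listed above and apply
# its "GPT4 Correct User" template. Assumes llama-cpp-python is installed
# and the model file has been downloaded next to this script.
from llama_cpp import Llama

llm = Llama(
    model_path="openchat-3.5-0106.Q3_K_M.gguf",  # the ~3520 MB file above
    n_ctx=8192,                                  # the stated context length
)

template = "GPT4 Correct User: {{prompt}}<|end_of_turn|>GPT4 Correct Assistant:"
prompt = template.replace("{{prompt}}", "Summarize C-RLFT in two sentences.")

out = llm(prompt, max_tokens=256, stop=["<|end_of_turn|>"])
print(out["choices"][0]["text"])
```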
@@ -186,7 +187,7 @@ Phi-2 is a Transformer with 2.7 billion parameters. It was trained using the sam

**Model Intention:** It's a 2.7B model and is intended for QA, chat, and code purposes

-**Model URL:** [https://huggingface.co/ggml-org/models/resolve/main/phi-2
+**Model URL:** [https://huggingface.co/ggml-org/models/resolve/main/phi-2.Q5_K_M.gguf?download=true](https://huggingface.co/ggml-org/models/resolve/main/phi-2.Q5_K_M.gguf?download=true)

**Model Info URL:** [https://huggingface.co/microsoft/phi-2](https://huggingface.co/microsoft/phi-2)
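The `?download=true` links resolve to plain files, so any HTTP client works. A minimal sketch using `huggingface_hub`, with the repo id and filename read off the Phi-2 URL above:

```python
# Minimal sketch: fetch a GGUF from one of the "Model URL" links above.
# hf_hub_download caches the file locally and returns its filesystem path.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="ggml-org/models",     # from the Phi-2 URL above
    filename="phi-2.Q5_K_M.gguf",
)
print(path)  # pass this path to your GGUF runtime as model_path
```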
@@ -196,9 +197,9 @@ Phi-2 is a Transformer with 2.7 billion parameters. It was trained using the sam

**Developer:** [https://huggingface.co/microsoft/phi-2](https://huggingface.co/microsoft/phi-2)

-**File Size:**
+**File Size:** 2070 MB

-**Context Length:**
+**Context Length:** 2048 tokens

**Prompt Format:**
```
@@ -222,7 +223,7 @@ The Yi series models are the next generation of open-source large language model

**Model Intention:** It's a 6B model and can understand English and Chinese. It's good for QA and Chat

-**Model URL:** [https://huggingface.co/flyingfishinwater/goodmodels/resolve/main/yi-
+**Model URL:** [https://huggingface.co/flyingfishinwater/goodmodels/resolve/main/yi-chat-6b.Q4_K_M.gguf?download=true](https://huggingface.co/flyingfishinwater/goodmodels/resolve/main/yi-chat-6b.Q4_K_M.gguf?download=true)

**Model Info URL:** [https://huggingface.co/01-ai/Yi-6B-Chat](https://huggingface.co/01-ai/Yi-6B-Chat)
@@ -232,9 +233,9 @@ The Yi series models are the next generation of open-source large language model

**Developer:** [https://01.ai/](https://01.ai/)

-**File Size:**
+**File Size:** 3670 MB

-**Context Length:**
+**Context Length:** 4096 tokens

**Prompt Format:**
```
@@ -297,7 +298,7 @@ Gemma is a family of lightweight, state-of-the-art open models built from the sa
# StarCoder2 3B
StarCoder2-3B model is a 3B parameter model trained on 17 programming languages from The Stack v2, with opt-out requests excluded. The model uses Grouped Query Attention, a context window of 16,384 tokens with a sliding window attention of 4,096 tokens, and was trained using the Fill-in-the-Middle objective on 3+ trillion tokens.

-**Model Intention:** The model is good at 17 programming languages.
+**Model Intention:** The model is good at 17 programming languages. Just start writing your code and the model will complete it.

**Model URL:** [https://huggingface.co/flyingfishinwater/goodmodels/resolve/main/starcoder2-3b-instruct-gguf_Q8_0.gguf?download=true](https://huggingface.co/flyingfishinwater/goodmodels/resolve/main/starcoder2-3b-instruct-gguf_Q8_0.gguf?download=true)
@@ -311,12 +312,11 @@ StarCoder2-3B model is a 3B parameter model trained on 17 programming languages

**File Size:** 3220 MB

-**Context Length:**
+**Context Length:** 16384 tokens

**Prompt Format:**
```
-
-{{prompt}}### Response
+{{prompt}}

```
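Since the template is now just `{{prompt}}` with no wrapper, generation is plain continuation of whatever code the model is given. A minimal sketch, again assuming llama-cpp-python and a local copy of the file above:

```python
# Minimal sketch: with the bare {{prompt}} template, StarCoder2 simply
# continues the code it is given.
from llama_cpp import Llama

llm = Llama(
    model_path="starcoder2-3b-instruct-gguf_Q8_0.gguf",  # file listed above
    n_ctx=16384,                                         # stated context length
)

snippet = "def fibonacci(n: int) -> int:\n    "
out = llm(snippet, max_tokens=128, stop=["\n\n"])
print(snippet + out["choices"][0]["text"])
```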
@@ -368,6 +368,45 @@ Chinese Tiny LLM 2B is the first Chinese-centric large language model, mainly
**Parse Special Tokens:** Yes


+---
+
+# Qwen1.5 4B Chat
+Qwen (通义千问) is the large language model and large multimodal model series of the Qwen Team, Alibaba Group. It supports both Chinese and English.
+
+**Model Intention:** It's one of the best LLMs that support both Chinese and English.
+
+**Model URL:** [https://huggingface.co/flyingfishinwater/goodmodels/resolve/main/qwen1_5-4b-chat-q4_k_m.gguf?download=true](https://huggingface.co/flyingfishinwater/goodmodels/resolve/main/qwen1_5-4b-chat-q4_k_m.gguf?download=true)
+
+**Model Info URL:** [https://huggingface.co/Qwen/Qwen1.5-4B-Chat-GGUF](https://huggingface.co/Qwen/Qwen1.5-4B-Chat-GGUF)
+
+**Model License:** [License Info](https://huggingface.co/Qwen/Qwen1.5-4B-Chat/raw/main/LICENSE)
+
+**Model Description:** Qwen (通义千问) is the large language model and large multimodal model series of the Qwen Team, Alibaba Group. It supports both Chinese and English.
+
+**Developer:** [https://qwenlm.github.io/](https://qwenlm.github.io/)
+
+**File Size:** 2460 MB
+
+**Context Length:** 32768 tokens
+
+**Prompt Format:**
+```
+<|im_start|>user
+{{prompt}}
+<|im_end|>
+<|im_start|>assistant
+
+```
+
+**Template Name:** chatml
+
+**Add BOS Token:** Yes
+
+**Add EOS Token:** No
+
+**Parse Special Tokens:** Yes
+
+
---

# Dolphin 2.8 Mistral v0.2 7B
@@ -387,11 +426,11 @@ This model is based on Mistral-7b-v0.2 with a 16k context length. It's an uncensor

**File Size:** 2728 MB

-**Context Length:**
+**Context Length:** 32768 tokens

**Prompt Format:**
```
-
+<s><|im_start|>user
{{prompt}}
<|im_end|>
<|im_start|>assistant
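Both the Qwen1.5 and Dolphin entries now use the ChatML turn structure shown above (Dolphin additionally prepends the `<s>` BOS token). A minimal sketch of assembling it; the helper is illustrative, with roles and token placement taken from the templates shown:

```python
# Minimal sketch: build a ChatML prompt as used by the Qwen1.5 and Dolphin
# entries above. The runtime parses <|im_start|>/<|im_end|> when
# "Parse Special Tokens" is enabled.
def chatml_prompt(user_message: str) -> str:
    return (
        "<|im_start|>user\n"
        f"{user_message}\n"
        "<|im_end|>\n"
        "<|im_start|>assistant\n"
    )

print(chatml_prompt("Introduce yourself in both English and Chinese."))
```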
@@ -444,4 +483,4 @@ ASSISTANT:
**Parse Special Tokens:** Yes


----
+---