second-state/Qwen2.5-Coder-0.5B-Instruct-GGUF


apepkuss79 committed on Nov 11, 2024

Commit 8a94122 (verified) · 1 Parent(s): 9860732

Update README.md

Files changed (1)

README.md +3 -3


@@ -49,7 +49,7 @@ language:
     <|im_start|>assistant
     ```
-- Context size: `128000`
+- Context size: `32000`
 - Run as LlamaEdge service
@@ -58,7 +58,7 @@ language:
     llama-api-server.wasm \
     --model-name Qwen2.5-Coder-0.5B-Instruct \
     --prompt-template chatml \
-    --ctx-size 128000
+    --ctx-size 32000
   ```
 - Run as LlamaEdge command app
@@ -67,7 +67,7 @@ language:
   wasmedge --dir .:. --nn-preload default:GGML:AUTO:Qwen2.5-Coder-0.5B-Instruct-Q5_K_M.gguf \
     llama-chat.wasm \
     --prompt-template chatml \
-    --ctx-size 128000
+    --ctx-size 32000
   ```
 ## Quantized GGUF Models
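
The `--ctx-size` value touched by this commit appears in both run commands, so a script that wraps them may want to hold it in one place. The following is a minimal sketch, assuming a POSIX shell. The `CTX_SIZE` and `MODEL_FILE` variable names are illustrative and not part of the README, and the command is only printed, not executed:

```shell
# Illustrative variables (assumed names, not part of the README).
CTX_SIZE=32000   # context size set by this commit (previously 128000)
MODEL_FILE="Qwen2.5-Coder-0.5B-Instruct-Q5_K_M.gguf"

# Build the command-app invocation shown in the README as positional parameters.
set -- wasmedge --dir .:. \
  --nn-preload "default:GGML:AUTO:${MODEL_FILE}" \
  llama-chat.wasm \
  --prompt-template chatml \
  --ctx-size "${CTX_SIZE}"

# Print the command for inspection instead of running it.
printf '%s ' "$@"
echo
```

To launch the app rather than print the command, the final two lines could be replaced with `"$@"`, provided `wasmedge`, `llama-chat.wasm`, and the model file are available in the current directory.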