Update readme

README.md +22 -53

@@ -1,69 +1,38 @@
 ---
-license: mit
-license_link: https://huggingface.co/microsoft/phi-4/resolve/main/LICENSE
-language:
-- en
 pipeline_tag: text-generation
 tags:
-- phi
 - nlp
-- math
 - code
 - chat
 - conversational
-- llama-cpp
-- gguf-my-repo
-inference:
-  parameters:
-    temperature: 0
-widget:
-- messages:
-  - role: user
-    content: How should I explain the Internet?
 library_name: transformers
 base_model: microsoft/phi-4
 ---
-# roleplaiapp/phi-4-Q5_K_M-GGUF
-This model was converted to GGUF format from [`microsoft/phi-4`](https://huggingface.co/microsoft/phi-4) using llama.cpp via the ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
-Refer to the [original model card](https://huggingface.co/microsoft/phi-4) for more details on the model.
-## Use with llama.cpp
-Install llama.cpp through brew (works on Mac and Linux)
-```bash
-brew install llama.cpp
-```
-Invoke the llama.cpp server or the CLI.
-### CLI:
-```bash
-llama-cli --hf-repo roleplaiapp/phi-4-Q5_K_M-GGUF --hf-file phi-4-q5_k_m.gguf -p "The meaning to life and the universe is"
-```
-### Server:
-```bash
-llama-server --hf-repo roleplaiapp/phi-4-Q5_K_M-GGUF --hf-file phi-4-q5_k_m.gguf -c 2048
-```
-Note: You can also use this checkpoint directly through the [usage steps](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#usage) listed in the Llama.cpp repo as well.
-Step 1: Clone llama.cpp from GitHub.
-```
-git clone https://github.com/ggerganov/llama.cpp
-```
-Step 2: Move into the llama.cpp folder and build it with `LLAMA_CURL=1` flag along with other hardware-specific flags (for ex: LLAMA_CUDA=1 for Nvidia GPUs on Linux).
-```
-cd llama.cpp && LLAMA_CURL=1 make
-```
-Step 3: Run inference through the main binary.
-```
-./llama-cli --hf-repo roleplaiapp/phi-4-Q5_K_M-GGUF --hf-file phi-4-q5_k_m.gguf -p "The meaning to life and the universe is"
-```
-or
-```
-./llama-server --hf-repo roleplaiapp/phi-4-Q5_K_M-GGUF --hf-file phi-4-q5_k_m.gguf -c 2048
-```

 ---
 pipeline_tag: text-generation
 tags:
 - nlp
 - code
+- llama-cpp
+- exllama
+- gguf
+- phi-4
+- phi
+- microsoft
+- gguf
+- code
+- math
+- chat
 - chat
 - conversational
+- roleplay
+- text-generation
+- safetensors
 library_name: transformers
 base_model: microsoft/phi-4
 ---
+# phi-4-Q5_K_M-GGUF
+**Original Model:** `microsoft/phi-4`
+**Quantization Method:** `GGUF`
+## Overview
+This is a GGUF Q5_K_M quantized version of [phi-4](https://huggingface.co/microsoft/phi-4).
+## Quantization By
+I often have idle A100 GPUs while building/testing and training the RP app, so I put them to use quantizing models.
+I hope the community finds these quantizations useful.
+Andrew Webby @ [RolePlai](https://roleplai.app/)