roleplaiapp committed · verified
Commit 6265458 · Parent(s): ffb7637

Upload README.md with huggingface_hub

Files changed (1): README.md added (+49 −0)
---
language:
- en
inference: false
fine-tuning: false
tags:
- llama-cpp
- Llama-3.1-Nemotron-70B-Instruct-HF
- gguf
- Q5_0
- 70b
- 5-bit
- nemotron
- nvidia
- code
- math
- chat
- roleplay
- text-generation
- safetensors
- nlp
datasets:
- nvidia/HelpSteer2
base_model: meta-llama/Llama-3.1-70B-Instruct
pipeline_tag: text-generation
library_name: transformers
---

# roleplaiapp/Llama-3.1-Nemotron-70B-Instruct-HF-Q5_0-GGUF

**Repo:** `roleplaiapp/Llama-3.1-Nemotron-70B-Instruct-HF-Q5_0-GGUF`
**Original Model:** `Llama-3.1-Nemotron-70B-Instruct-HF`
**Organization:** `nvidia`
**Quantized File:** `llama-3.1-nemotron-70b-instruct-hf-q5_0.gguf`
**Quantization:** `GGUF`
**Quantization Method:** `Q5_0`
**Use Imatrix:** `False`
**Split Model:** `False`

## Overview
This is a GGUF Q5_0 quantized version of [Llama-3.1-Nemotron-70B-Instruct-HF](https://huggingface.co/nvidia/Llama-3.1-Nemotron-70B-Instruct-HF).
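A minimal sketch of fetching and running the quantized file with llama.cpp (assumes a recent llama.cpp build providing `llama-cli`; adjust `-ngl` to how many layers fit on your GPU and `-c` to the context size you need):

```shell
# Download the quantized file from this repo (requires the huggingface_hub CLI)
huggingface-cli download roleplaiapp/Llama-3.1-Nemotron-70B-Instruct-HF-Q5_0-GGUF \
  llama-3.1-nemotron-70b-instruct-hf-q5_0.gguf --local-dir .

# Start an interactive chat session (-cnv) with the model
./llama-cli -m llama-3.1-nemotron-70b-instruct-hf-q5_0.gguf \
  -ngl 80 -c 4096 -cnv
```

Note that the Q5_0 file for a 70B model is large (roughly 45–50 GB), so plan disk space and VRAM/RAM accordingly.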

## Quantization By
I often have idle A100 GPUs while building, testing, and training the RolePlai app, so I put them to use quantizing models.
I hope the community finds these quantizations useful.

Andrew Webby @ [RolePlai](https://roleplai.app/)