shuyuej committed · Commit 7d7c60e · verified · 1 Parent(s): 6a0c8e3

Update README.md

Files changed (1): README.md (+45 −1)
README.md CHANGED
---
license: apache-2.0
---

# The Quantized Command R Model

Original Base Model: `CohereForAI/c4ai-command-r-v01`.<br>
Link: [https://huggingface.co/CohereForAI/c4ai-command-r-v01](https://huggingface.co/CohereForAI/c4ai-command-r-v01)

## Special Notice

(1) Please note that the model was quantized using `AutoModelForCausalLM.from_pretrained` from the `transformers` package.

(2) For the model quantized with the `auto-gptq` package, please check this link: [https://huggingface.co/shuyuej/Command-R-GPTQ](https://huggingface.co/shuyuej/Command-R-GPTQ).

(3) This is a smaller variant, produced by setting `group_size=1024`; a minimal sketch of the quantization call is shown below.
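
The following is a minimal, illustrative sketch of how such a quantization can be launched through `transformers`. It is not the exact script used here (that is linked under Source Code below); in particular, the calibration dataset and the output directory are assumptions for illustration.

```python
# Minimal sketch (not the exact script) of GPTQ quantization via
# `transformers`; requires the `optimum` and `auto-gptq` packages.
from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig

model_id = "CohereForAI/c4ai-command-r-v01"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# bits, group_size, desc_act, and sym mirror the configuration below;
# the "c4" calibration dataset is an assumption for illustration.
gptq_config = GPTQConfig(
    bits=4,
    group_size=1024,
    desc_act=False,
    sym=True,
    dataset="c4",
    tokenizer=tokenizer,
)

# Passing a GPTQConfig makes `from_pretrained` quantize while loading.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    quantization_config=gptq_config,
)

# Save the quantized weights (output path is illustrative).
model.save_pretrained("command-r-gptq-4bit")
tokenizer.save_pretrained("command-r-gptq-4bit")
```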

## Quantization Configurations
```
"quantization_config": {
    "batch_size": 1,
    "bits": 4,
    "block_name_to_quantize": null,
    "cache_block_outputs": true,
    "damp_percent": 0.1,
    "dataset": null,
    "desc_act": false,
    "exllama_config": {
        "version": 1
    },
    "group_size": 1024,
    "max_input_length": null,
    "model_seqlen": null,
    "module_name_preceding_first_block": null,
    "modules_in_block_to_quantize": null,
    "pad_token_id": null,
    "quant_method": "gptq",
    "sym": true,
    "tokenizer": null,
    "true_sequential": true,
    "use_cuda_fp16": false,
    "use_exllama": true
},
```
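
Because this `quantization_config` is stored in the checkpoint's `config.json`, `transformers` applies it automatically at load time. Below is a minimal loading sketch; the repository id is a placeholder, not a confirmed name.

```python
# Minimal sketch of loading and prompting the quantized checkpoint.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "<this-model-repo>"  # placeholder: substitute this repository's id
tokenizer = AutoTokenizer.from_pretrained(repo_id)

# No GPTQConfig is needed here: the stored quantization_config shown
# above is picked up automatically (the ExLlama kernel needs a GPU).
model = AutoModelForCausalLM.from_pretrained(repo_id, device_map="auto")

inputs = tokenizer("Hello, how are you?", return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```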

## Source Code
The quantization source code is available at [https://github.com/vkola-lab/medpodgpt/tree/main/quantization](https://github.com/vkola-lab/medpodgpt/tree/main/quantization).