OpenSourceRonin committed on
Commit
2c33130
·
verified ·
1 Parent(s): 46e0941

Update README.md

Files changed (1)
  1. README.md +3 -2
README.md CHANGED
@@ -25,10 +25,11 @@ VPTQ can compress 70B, even the 405B model, to 1-2 bits without retraining and m
 [**Github**](https://github.com/microsoft/vptq) https://github.com/microsoft/vptq
 
 Prompt example: Llama 3.1 70B on RTX4090 (24 GB@2bit)
-![image/gif](https://cdn-uploads.huggingface.co/production/uploads/66a73179315d9b5c32e06967/lTfvSARTs9YfCkpEe3Sxc.gif)
+![image/gif](https://cdn-uploads.huggingface.co/production/uploads/66a73179315d9b5c32e06967/QZeqC_EhZVwozEV_WcFtV.gif)
 
 Chat example: Llama 3.1 70B on RTX4090 (24 GB@2bit)
-![image/gif](https://cdn-uploads.huggingface.co/production/uploads/66a73179315d9b5c32e06967/QZeqC_EhZVwozEV_WcFtV.gif)
+![image/gif](https://cdn-uploads.huggingface.co/production/uploads/66a73179315d9b5c32e06967/lTfvSARTs9YfCkpEe3Sxc.gif)
+
 
 ## Details and [**Tech Report**](https://github.com/microsoft/VPTQ/blob/main/VPTQ_tech_report.pdf)
 