OpenSourceRonin committed: Update README.md

README.md CHANGED
```diff
@@ -25,10 +25,11 @@ VPTQ can compress 70B, even the 405B model, to 1-2 bits without retraining and m
 [**Github**](https://github.com/microsoft/vptq) https://github.com/microsoft/vptq
 
 Prompt example: Llama 3.1 70B on RTX4090 (24 GB@2bit)
-![image/gif](https://cdn-uploads.huggingface.co/production/uploads/66a73179315d9b5c32e06967/
+![image/gif](https://cdn-uploads.huggingface.co/production/uploads/66a73179315d9b5c32e06967/QZeqC_EhZVwozEV_WcFtV.gif)
 
 Chat example: Llama 3.1 70B on RTX4090 (24 GB@2bit)
-![image/gif](https://cdn-uploads.huggingface.co/production/uploads/66a73179315d9b5c32e06967/
+![image/gif](https://cdn-uploads.huggingface.co/production/uploads/66a73179315d9b5c32e06967/lTfvSARTs9YfCkpEe3Sxc.gif)
+
 
 ## Details and [**Tech Report**](https://github.com/microsoft/VPTQ/blob/main/VPTQ_tech_report.pdf)
 
```
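The commit completes the two GIF embeds showing the 2-bit Llama 3.1 70B model answering a prompt and chatting on a single 24 GB RTX4090. For reference, a minimal sketch of how such a 2-bit checkpoint might be loaded and queried is shown below; the `vptq.AutoModelForCausalLM` entry point and the community model id used here are assumptions and are not part of this diff, so consult the linked GitHub repository's README for the exact API and published checkpoints.

```python
# Minimal sketch (not from this commit): load a VPTQ-quantized Llama 3.1 70B
# and generate a reply. Assumes the `vptq` package exposes an
# AutoModelForCausalLM wrapper and that a 2-bit community checkpoint is
# published under the illustrative model id below.
import transformers
import vptq  # see https://github.com/microsoft/vptq for installation

model_id = "VPTQ-community/Meta-Llama-3.1-70B-Instruct-v8-k65536-0-woft"  # illustrative id

tokenizer = transformers.AutoTokenizer.from_pretrained(model_id)
model = vptq.AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Explain vector quantization in one paragraph."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```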