Quantizations.
#2
by kuliev-vitaly - opened
The model is too large to load even on 2x A100. Quantization would help with the hardware requirements. Could you please provide AWQ (4-bit) or FP8 quantized versions?
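For reference, here is a minimal sketch of how a 4-bit AWQ export could be produced with the AutoAWQ library, assuming the model is a standard `transformers` causal LM. The `model_path` and `quant_path` names are placeholders, and the quantization config shown uses common defaults (group size 128, 4-bit weights), not settings confirmed for this specific model:

```python
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = "org/model-name"   # placeholder: original checkpoint
quant_path = "model-awq-4bit"   # placeholder: output directory

# Typical AWQ settings: 4-bit weights, group size 128, GEMM kernels
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

# Load the full-precision model and tokenizer
model = AutoAWQForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

# Run AWQ calibration and quantize the weights
model.quantize(tokenizer, quant_config=quant_config)

# Save the quantized checkpoint and tokenizer
model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)
```

Note that the quantization step itself still needs enough memory to load the original weights, so it is easiest for the model authors (or someone with sufficient hardware) to publish the quantized checkpoint.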