Models: 99

Active filters: 2407.11062

ChenMnZ/Llama-2-13b-EfficientQAT-w4g128-BitBLAS

Text Generation • Updated Jul 22, 2024 • 2

ChenMnZ/Llama-2-13b-EfficientQAT-w4g128-GPTQ

Text Generation • Updated Jul 22, 2024 • 4

ChenMnZ/Llama-2-70b-EfficientQAT-w2g128-BitBLAS

Text Generation • Updated Jul 22, 2024 • 2

ChenMnZ/Llama-2-70b-EfficientQAT-w2g128-GPTQ

Text Generation • Updated Jul 22, 2024 • 3

ChenMnZ/Llama-2-70b-EfficientQAT-w2g64-GPTQ

Text Generation • Updated Jul 22, 2024 • 3

ChenMnZ/Llama-2-70b-EfficientQAT-w4g128-BitBLAS

Text Generation • Updated Jul 22, 2024 • 2

ChenMnZ/Llama-2-70b-EfficientQAT-w4g128-GPTQ

Text Generation • Updated Jul 22, 2024 • 3

ChenMnZ/Llama-2-7b-EfficientQAT-w2g128-GPTQ

Text Generation • Updated Jul 22, 2024 • 6

ChenMnZ/Llama-2-7b-EfficientQAT-w2g64-GPTQ

Text Generation • Updated Jul 22, 2024 • 15

ChenMnZ/Llama-2-7b-EfficientQAT-w4g128-GPTQ

Text Generation • Updated Jul 22, 2024 • 3

ChenMnZ/Llama-3-70b-EfficientQAT-w2g128-GPTQ

Text Generation • Updated Jul 22, 2024 • 2

ChenMnZ/Llama-3-70b-EfficientQAT-w2g64-GPTQ

Text Generation • Updated Jul 22, 2024 • 3

ChenMnZ/Llama-3-70b-EfficientQAT-w4g128-GPTQ

Text Generation • Updated Jul 22, 2024 • 2

ChenMnZ/Llama-3-70b-instruct-EfficientQAT-w2g128-GPTQ

Text Generation • Updated Jul 22, 2024 • 2

ChenMnZ/Llama-3-70b-instruct-EfficientQAT-w2g64-GPTQ

Text Generation • Updated Jul 22, 2024 • 3

ChenMnZ/Llama-2-7b-EfficientQAT-w2g128-BitBLAS

Text Generation • Updated Jul 22, 2024 • 5

ChenMnZ/Llama-3-70b-instruct-EfficientQAT-w4g128-GPTQ

Text Generation • Updated Jul 22, 2024 • 2

ChenMnZ/Llama-2-7b-EfficientQAT-w2g64-BitBLAS

Text Generation • Updated Jul 22, 2024 • 5

ChenMnZ/Llama-2-7b-EfficientQAT-w4g128-BitBLAS

Text Generation • Updated Jul 22, 2024 • 3

ChenMnZ/Llama-3-70b-EfficientQAT-w2g128-BitBLAS

Text Generation • Updated Jul 22, 2024 • 5

ChenMnZ/Llama-3-8b-EfficientQAT-w2g128-GPTQ

Text Generation • Updated Jul 22, 2024 • 4

ChenMnZ/Llama-3-8b-EfficientQAT-w2g64-GPTQ

Text Generation • Updated Jul 22, 2024 • 4

ChenMnZ/Llama-3-8b-EfficientQAT-w4g128-GPTQ

Text Generation • Updated Jul 22, 2024 • 5 • 1

ChenMnZ/Llama-3-8b-instruct-EfficientQAT-w2g128-GPTQ

Text Generation • Updated Jul 22, 2024 • 9 • 1

ChenMnZ/Llama-3-8b-instruct-EfficientQAT-w2g64-GPTQ

Text Generation • Updated Jul 22, 2024 • 2

ChenMnZ/Llama-3-70b-EfficientQAT-w2g64-BitBLAS

Text Generation • Updated Jul 22, 2024 • 2

ChenMnZ/Llama-3-8b-instruct-EfficientQAT-w4g128-GPTQ

Text Generation • Updated Jul 22, 2024 • 6

ChenMnZ/Llama-3-70b-EfficientQAT-w4g128-BitBLAS

Text Generation • Updated Jul 22, 2024 • 4

ChenMnZ/Llama-3-70b-instruct-EfficientQAT-w2g128-BitBLAS

Text Generation • Updated Jul 22, 2024 • 4

ChenMnZ/Llama-3-70b-instruct-EfficientQAT-w2g64-BitBLAS

Text Generation • Updated Jul 22, 2024 • 2