Models: 99

Active filters: arxiv: 2407.11062

ChenMnZ/Llama-3-70b-instruct-EfficientQAT-w4g128-BitBLAS

Text Generation • Updated Jul 22, 2024 • 4

ChenMnZ/Llama-3-8b-EfficientQAT-w2g128-BitBLAS

Text Generation • Updated Jul 22, 2024 • 4

ChenMnZ/Llama-3-8b-EfficientQAT-w2g64-BitBLAS

Text Generation • Updated Jul 22, 2024 • 6

ChenMnZ/Llama-3-8b-EfficientQAT-w4g128-BitBLAS

Text Generation • Updated Jul 22, 2024 • 4

ChenMnZ/Llama-3-8b-instruct-EfficientQAT-w2g128-BitBLAS

Text Generation • Updated Jul 22, 2024 • 6

ChenMnZ/Llama-3-8b-instruct-EfficientQAT-w2g64-BitBLAS

Text Generation • Updated Jul 22, 2024 • 6

ChenMnZ/Llama-3-8b-instruct-EfficientQAT-w4g128-BitBLAS

Text Generation • Updated Jul 22, 2024 • 6

ChenMnZ/Mistral-Large-Instruct-2407-EfficientQAT-w2g64-GPTQ

Updated Aug 6, 2024 • 5 • 25

Alignment-Lab-AI/w2g64-mistral-large-intruct-gptq

Updated Aug 12, 2024 • 3