"Give Me BF16 or Give Me Death"? Accuracy-Performance Trade-Offs in LLM Quantization • Paper • 2411.02355 • Published Nov 4, 2024 • 46
neuralmagic/SparseLlama-2-7b-cnn-daily-mail-pruned_50.2of4 Text Generation • Updated May 21, 2024 • 24
neuralmagic/Llama-2-7b-cnn-daily-mail-pruned_70-quantized-deepsparse Text Generation • Updated May 17, 2024 • 21
neuralmagic/Llama-2-7b-cnn-daily-mail-pruned_50-quantized-deepsparse Text Generation • Updated May 17, 2024 • 21
Sparse Foundational Llama 2 Models • Collection • Sparse pre-trained and fine-tuned Llama models made by Neural Magic + Cerebras • 27 items • Updated Sep 26, 2024 • 9
neuralmagic/Llama-2-7b-dolphin-open_platypus-pruned_70-quantized-deepsparse Text Generation • Updated May 16, 2024 • 14 • 1
neuralmagic/Llama-2-7b-dolphin-open_platypus-pruned_50-quantized-deepsparse Text Generation • Updated May 16, 2024 • 10