---
license: llama3
---

## Quantized Meta-Llama-3B-Instruct model

Inference was tested on a Raspberry Pi 4 with [llama.cpp](https://github.com/ggerganov/llama.cpp), reaching roughly 0.5 tokens per second.
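
For reference, a minimal sketch of loading the quantized GGUF with the [llama-cpp-python](https://github.com/abetlen/llama-cpp-python) bindings, as an alternative to the llama.cpp CLI. The model filename and generation parameters below are placeholders and assumptions; adjust them to match the actual file shipped in this repository.

```python
from llama_cpp import Llama

# Load the quantized GGUF model; the filename is a placeholder, use the
# actual .gguf file from this repository.
llm = Llama(
    model_path="meta-llama-3b-instruct.Q4_K_M.gguf",
    n_ctx=2048,    # keep the context window small on a Raspberry Pi to save RAM
    n_threads=4,   # the Pi 4 has 4 cores
)

# Run a single chat completion; the bindings apply the model's chat template.
output = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Explain quantization in one sentence."}],
    max_tokens=64,
)

print(output["choices"][0]["message"]["content"])
```

At roughly 0.5 tokens per second on the Pi, generation is slow, so keep `max_tokens` modest for interactive use.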