Please make V3-lite
#12
by
rombodawg
- opened
Us 3090 users need love too 🙏🙏
Please, can you do what Meta does and use your larger model to distill a smaller model for users who can't afford two nodes of H100s?
Also, please release an AWQ INT4 version of the lite model.
32B parameters can be loaded in 24 GB of VRAM in INT4.
72B parameters can be loaded in 2x24 GB of VRAM in INT4.
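As a rough back-of-envelope check on those numbers (assuming ~0.5 bytes per parameter for INT4 weights, and ignoring KV cache and activation overhead, which add a few extra GB in practice):

```python
def int4_weight_gib(params_billions: float, bytes_per_param: float = 0.5) -> float:
    """Approximate weight memory in GiB for an INT4-quantized model.

    INT4 packs two parameters per byte (~0.5 bytes/param); real loaders
    add overhead for scales/zero-points, KV cache, and activations.
    """
    return params_billions * 1e9 * bytes_per_param / 1024**3

print(f"32B @ INT4: ~{int4_weight_gib(32):.1f} GiB")  # ~14.9 GiB -> fits one 24 GB 3090
print(f"72B @ INT4: ~{int4_weight_gib(72):.1f} GiB")  # ~33.5 GiB -> fits two 3090s
```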
A reminder that two H100 nodes won’t work; H200 is required instead.