Getting incoherent output on vLLM with 2080Ti and 3090Ti

#1
by NeoChen1024 - opened

When I run it with the latest stable vLLM, it is blazingly fast, but the vision output is garbled and sometimes even mixed with emojis. Meanwhile, the FP8 version works fine on the 3090Ti with the Marlin kernel.
