
Request for quantized version

#2
by sudhir2016 - opened

A quantized version of the model that can be used for inference in a free-tier Google Colab notebook would be nice.

MaLA-LM org

Yes please. Will it work with `load_in_4bit=True`?
