The output size when deployed in GCP is 1536 instead of 1024

#23
by bennegeek - opened

The length of the output is 1536 instead of 1024. I used the one-click deploy. This doesn't match what I get when I load the model for training and for inference. Could you make the model that loads in TEI also use 1024 dims?


Hello!

The default behaviour uses 1024 because it loads the Dense module from 2_Dense_1024: https://huggingface.co/dunzhang/stella_en_1.5B_v5/blob/main/modules.json#L15-L19
Whereas GCP will likely read the "usual" 2_Dense folder and use that one instead. That folder has the 1536 that you're experiencing.

I see you already created a clone of this model to try and fix it, but I think your fix might be wrong (i.e. you're not using any Dense anymore). I would fix it like this:

  1. Clone the model
  2. Rename 2_Dense to 2_Dense_1536
  3. Rename 2_Dense_1024 to 2_Dense
  4. Update modules.json to use 2_Dense instead of 2_Dense_1024.

Then both Sentence Transformers and GCP should produce 1024-dimensional output with the Dense module (which is important to get the correct performance!)
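As a rough illustration, the rename steps above could be scripted against a local clone along these lines. This is only a sketch: the function name `swap_default_dense` and its parameters are hypothetical, and it assumes modules.json is a JSON list of entries that each carry a "path" key. The backup folder name follows step 2 as written, so adjust it if the existing 2_Dense folder turns out to hold a different size.

```python
import json
from pathlib import Path


def swap_default_dense(repo_dir, wanted="2_Dense_1024",
                       default="2_Dense", backup="2_Dense_1536"):
    """Make `wanted` the default Dense folder in a local clone (hypothetical helper)."""
    repo = Path(repo_dir)

    # Refuse to overwrite an existing folder with the backup name.
    if (repo / backup).exists():
        raise FileExistsError(f"{backup} already exists in {repo}")

    # Step 2: move the current default folder out of the way.
    (repo / default).rename(repo / backup)
    # Step 3: promote the wanted folder to the default name.
    (repo / wanted).rename(repo / default)

    # Step 4: point modules.json at the renamed folder.
    modules_path = repo / "modules.json"
    modules = json.loads(modules_path.read_text())
    for module in modules:
        if module.get("path") == wanted:
            module["path"] = default
    modules_path.write_text(json.dumps(modules, indent=2))
    return modules
```

Calling `swap_default_dense("path/to/your/clone")` would apply all three changes in one go; the path here is a placeholder for wherever the clone lives.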

  • Tom Aarsen

Hi Tom, thanks for the response

I looked inside 2_Dense and saw this

 "out_features": 8192,

Does this mean the output when this layer is used is 8192 dimensions?

To me, it seems the one-click GCP deployment doesn't use any of the 2_Dense_* layers.
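For anyone who wants to check the declared output size of every Dense folder at once, something like the following might help. It is a minimal sketch that assumes each 2_Dense* folder in a local clone stores its settings in a file named config.json with an "out_features" key; the function name `dense_output_sizes` is made up for this example.

```python
import json
from pathlib import Path


def dense_output_sizes(repo_dir):
    """Map each 2_Dense* folder name to the out_features value in its config (hypothetical helper)."""
    sizes = {}
    for folder in sorted(Path(repo_dir).glob("2_Dense*")):
        config_file = folder / "config.json"  # assumed file name
        if config_file.is_file():
            config = json.loads(config_file.read_text())
            sizes[folder.name] = config.get("out_features")
    return sizes
```

This only reports what the configs declare. It does not tell you which folder, if any, a given deployment actually loads.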

Yes. I would advise against using it, because the MTEB score of 1024d is only 0.001 lower than 8192d.

Having said that, I think my original assumption here:

Whereas GCP will likely read the "usual" 2_Dense folder and use that one instead. That folder has the 1536 that you're experiencing.

was wrong. I think GCP perhaps just doesn't use any Dense layer? This will result in worse performance I'm afraid.
@philschmid do you have some experience with this? Or @olivierdehaene due to TEI?

  • Tom Aarsen

I took a look at the TEI code and it seems TEI only reads the 1_Pooling layer. But I would definitely appreciate the view of someone who has expertise on that.

@philschmid @olivierdehaene any updates on the answer for Stella on TEI?

Hi @philschmid @olivierdehaene, is there any workaround or update on this for Stella on TEI?

I encountered the same problem when deploying through text-embeddings-inference: I changed the configuration, but the output is still 1536.

You can use the fork I made. It outputs 1536 dims both in TEI and when loaded into SentenceTransformer. https://huggingface.co/bennegeek/stella_en_1.5B_v5/

I believe the performance will be worse if you have 1536 dims, i.e. if you're not using the Dense module.

  • Tom Aarsen
