# TinyBERT_L-4_H-312_v2 English Sentence Encoder
This model is distilled from the bert-base-nli-stsb-mean-tokens pre-trained model from Sentence-Transformers. The sentence embedding is obtained by mean (average) pooling of the last layer's hidden states.
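The mean pooling step can be sketched as follows. This is a minimal NumPy illustration (not the model's actual code): it averages the last layer's token vectors, using the attention mask so that padding tokens do not contribute.

```python
import numpy as np

def mean_pool(last_hidden_states, attention_mask):
    """Average the last layer's hidden states over real (non-padding) tokens.

    last_hidden_states: (seq_len, hidden_dim) array of token vectors
    attention_mask:     (seq_len,) array, 1 for real tokens, 0 for padding
    """
    mask = attention_mask[:, None].astype(float)      # (seq_len, 1)
    summed = (last_hidden_states * mask).sum(axis=0)  # sum over real tokens only
    count = mask.sum()                                # number of real tokens
    return summed / count                             # (hidden_dim,) sentence embedding

# Toy example: 3 tokens (the last one is padding), hidden size 2
hidden = np.array([[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]])
mask = np.array([1, 1, 0])
print(mean_pool(hidden, mask))  # padding token is ignored -> [2. 3.]
```

In practice the same pooling is applied to the `(seq_len, 312)` hidden states produced by the 4-layer TinyBERT encoder.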
Update 2021-03-25: added the attention-matrix imitation objective from the TinyBERT paper, and changed the distillation target from distilbert-base-nli-stsb-mean-tokens to bert-base-nli-stsb-mean-tokens (the two have almost the same STSb performance).
## Model Comparison
We compute the cosine similarity of each sentence pair's embeddings and report the Spearman correlation on the STS benchmark (higher is better):
| Model | Dev | Test |
|---|---|---|
| bert-base-nli-stsb-mean-tokens | .8704 | .8505 |
| distilbert-base-nli-stsb-mean-tokens | .8667 | .8516 |
| TinyBERT_L-4_H-312_v2-distill-AllNLI | .8587 | .8283 |
| TinyBERT_L-4_H (20210325) | .8551 | .8341 |
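The evaluation procedure above can be sketched in plain NumPy. This is an illustrative toy, not the official STSb evaluation script: the embeddings and gold scores are made up, and `spearman` implements the no-ties case (Pearson correlation of the ranks).

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def spearman(x, y):
    """Spearman correlation (no-ties case): Pearson correlation of the ranks."""
    rx = np.argsort(np.argsort(x)).astype(float)  # rank of each value, 0..n-1
    ry = np.argsort(np.argsort(y)).astype(float)
    rx -= rx.mean()
    ry -= ry.mean()
    return float((rx * ry).sum() / np.sqrt((rx ** 2).sum() * (ry ** 2).sum()))

# Toy sentence-pair embeddings (hypothetical) and human similarity scores
pairs = [
    (np.array([1.0, 0.0]), np.array([1.0, 0.1])),  # near-duplicate pair
    (np.array([1.0, 0.0]), np.array([0.0, 1.0])),  # unrelated pair
    (np.array([1.0, 1.0]), np.array([1.0, 0.5])),  # somewhat related pair
]
gold = [5.0, 0.0, 3.0]  # hypothetical human ratings

model_scores = [cosine(u, v) for u, v in pairs]
print(spearman(model_scores, gold))  # ranks agree perfectly -> 1.0
```

The reported numbers come from running this kind of comparison over the full STSb dev and test splits.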