bi-cse is an XLM-R model fine-tuned with contrastive learning on Chinese and English STS and NLI corpora.
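The card does not include the training code; the sketch below only illustrates the kind of SimCSE-style in-batch contrastive (InfoNCE) objective that this sort of fine-tuning usually optimizes. The temperature value, the pair construction, and the function name `info_nce_loss` are assumptions for illustration, not details reported for this model.

```python
import torch
import torch.nn.functional as F

def info_nce_loss(anchor_emb, positive_emb, temperature=0.05):
    """SimCSE-style in-batch contrastive loss (illustrative sketch).

    anchor_emb, positive_emb: (batch, dim) embeddings of paired sentences,
    e.g. NLI entailment pairs or STS paraphrases. Row i of each tensor is a
    positive pair; every other row in the batch acts as a negative.
    temperature=0.05 is a common choice, not a value reported for bi-cse.
    """
    anchor = F.normalize(anchor_emb, p=2, dim=1)
    positive = F.normalize(positive_emb, p=2, dim=1)
    # Cosine similarity between every anchor and every positive in the batch.
    sim = anchor @ positive.T / temperature          # (batch, batch)
    labels = torch.arange(sim.size(0), device=sim.device)
    # Cross-entropy pushes the diagonal (true pairs) above the in-batch negatives.
    return F.cross_entropy(sim, labels)
```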
Usage with HuggingFace Transformers:
from transformers import AutoTokenizer, AutoModel
import torch
# Sentences we want sentence embeddings for; the Chinese samples below mean "Sample data 1" and "Sample data 2"
sentences = ["样例数据-1", "样例数据-2"]
# Load model from HuggingFace Hub
tokenizer = AutoTokenizer.from_pretrained('zhou-xl/bi-cse')
model = AutoModel.from_pretrained('zhou-xl/bi-cse')
model.eval()
# Tokenize sentences
encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')
# Compute token embeddings
with torch.no_grad():
    model_output = model(**encoded_input)
# Perform pooling. In this case, CLS pooling: take the last hidden state of the first ([CLS]) token.
sentence_embeddings = model_output[0][:, 0]
# normalize embeddings
sentence_embeddings = torch.nn.functional.normalize(sentence_embeddings, p=2, dim=1)
print("Sentence embeddings:", sentence_embeddings)
Evaluation results

All scores are self-reported MTEB results; a sketch of how such numbers can be reproduced follows the table.

| Task (MTEB) | Split | Metric | Score |
|---|---|---|---|
| AFQMC | validation | cos_sim_pearson | 42.010 |
| AFQMC | validation | cos_sim_spearman | 43.449 |
| AFQMC | validation | euclidean_pearson | 41.933 |
| AFQMC | validation | euclidean_spearman | 43.457 |
| AFQMC | validation | manhattan_pearson | 41.930 |
| AFQMC | validation | manhattan_spearman | 43.445 |
| ATEC | test | cos_sim_pearson | 47.484 |
| ATEC | test | cos_sim_spearman | 48.010 |
| BIOSSES | test | cos_sim_pearson | 70.066 |
| BIOSSES | test | cos_sim_spearman | 70.564 |
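The metrics above correspond to MTEB/C-MTEB tasks. A minimal sketch of reproducing them with the mteb evaluation harness is shown below, assuming the checkpoint loads through sentence-transformers with the intended CLS pooling; if it does not, wrap the Transformers code above in a custom encoder class. The task names mirror the table.

```python
# A minimal sketch, assuming the checkpoint works directly with
# sentence-transformers; verify the pooling matches the CLS pooling shown earlier.
from mteb import MTEB
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("zhou-xl/bi-cse")
evaluation = MTEB(tasks=["AFQMC", "ATEC", "BIOSSES"])  # task names as in the table above
evaluation.run(model, output_folder="results/bi-cse")
```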