---
license: mit
datasets:
- miracl/miracl
language:
- fr
metrics:
- precision
base_model:
- antoinelouis/crossencoder-camembert-base-mmarcoFR
library_name: sentence-transformers
---

# MIRACL Cross-Encoder (fr)

This model is a fine-tuned version of [antoinelouis/crossencoder-camembert-base-mmarcoFR](https://huggingface.co/antoinelouis/crossencoder-camembert-base-mmarcoFR), trained on the French (`fr`) subset of the MIRACL dataset. The training data was augmented with hard negatives mined using BM25.

## Training
- The model was trained on the French (`fr`) subset of MIRACL.
- Hard negatives were mined with BM25.
- For each query, all positive passages were used together with up to 30 negative passages (a combination of the dataset's original negatives and BM25 hard negatives).
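
The pair construction described above can be sketched roughly as follows. This is a minimal illustration, not the actual training script: the function name `build_training_pairs`, its parameters, and the choice to take original negatives before BM25 candidates are all assumptions, and the BM25 ranking is taken as an input from whatever retriever is used.

```python
from typing import List, Sequence, Tuple


def build_training_pairs(
    query: str,
    positives: Sequence[str],
    original_negatives: Sequence[str],
    bm25_ranked: Sequence[str],
    max_negatives: int = 30,
) -> List[Tuple[str, str, int]]:
    """Return (query, passage, label) triples for a single query.

    Hypothetical helper: label 1 marks a positive, label 0 a negative.
    """
    positive_set = set(positives)
    seen = set()
    negatives: List[str] = []

    # Assumed ordering: original negatives first, then BM25 candidates by rank.
    for passage in list(original_negatives) + list(bm25_ranked):
        if len(negatives) >= max_negatives:
            break
        # BM25 often retrieves the positives themselves; never use them as negatives.
        if passage in positive_set or passage in seen:
            continue
        seen.add(passage)
        negatives.append(passage)

    pairs = [(query, passage, 1) for passage in positives]
    pairs += [(query, passage, 0) for passage in negatives]
    return pairs
```

Filtering positives out of the BM25 candidates matters because a lexical retriever tends to rank relevant passages highly, and labeling them as negatives would add noise to the training signal.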

## Usage
```python
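from sentence_transformers.cross_encoder import CrossEncoder

# Load the fine-tuned cross-encoder from the Hugging Face Hub.
model = CrossEncoder("azat-serikbayev/crossencoder-camembert-base-mmarcoFR-miracl-fr")

# predict() takes a list of [query, passage] pairs and returns one relevance score per pair.
scores = model.predict([["query", "document_text"]])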
```
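
A cross-encoder like this one is typically used to rerank candidate passages for a query. The sketch below shows one way to do that; the helper name `rerank` and its `predict` parameter are illustrative assumptions, with `predict` standing for any callable that maps a list of `[query, passage]` pairs to one score per pair.

```python
from typing import Callable, List, Sequence, Tuple


def rerank(
    query: str,
    passages: Sequence[str],
    predict: Callable[[List[List[str]]], Sequence[float]],
) -> List[Tuple[str, float]]:
    """Score every (query, passage) pair and sort passages by descending score."""
    if not passages:
        return []
    # Build one [query, passage] pair per candidate, the input format used above.
    pairs = [[query, passage] for passage in passages]
    scores = predict(pairs)
    # Convert scores to plain floats so the result does not depend on the array type.
    scored = [(passage, float(score)) for passage, score in zip(passages, scores)]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

With the model loaded as in the snippet above, `rerank(query, passages, model.predict)` would return the passages ordered from most to least relevant.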