---
license: mit
datasets:
- miracl/miracl
language:
- fr
metrics:
- precision
base_model:
- antoinelouis/crossencoder-camembert-base-mmarcoFR
library_name: sentence-transformers
---

# MIRACL Cross-Encoder (fr)

This model is a fine-tuned version of [antoinelouis/crossencoder-camembert-base-mmarcoFR](https://huggingface.co/antoinelouis/crossencoder-camembert-base-mmarcoFR) on the French (`fr`) subset of the MIRACL dataset. The training data was augmented with hard negatives mined using BM25.

## Training

- The model was trained on the French (`fr`) subset of the MIRACL dataset.
- Hard negatives were mined with BM25.
- For each query, all positive passages and up to 30 negative passages were used (a combination of the dataset's original negatives and BM25 hard negatives).
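
The negative selection described above can be sketched roughly as follows. This is a minimal illustration, not the actual training script: the pure-Python BM25 scorer, the regex tokenizer, the `build_negatives` helper, the choice to place original negatives before BM25 negatives, and the toy corpus are all assumptions made for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Naive lowercase word tokenizer (assumption); a real pipeline would
    # likely use a French-aware analyzer.
    return re.findall(r"\w+", text.lower())


def bm25_scores(query, passages, k1=1.5, b=0.75):
    """Score every passage against the query with Okapi BM25."""
    docs = [tokenize(p) for p in passages]
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    # Number of passages that contain each term.
    df = Counter(term for d in docs for term in set(d))
    query_terms = tokenize(query)
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def build_negatives(query, corpus, positive_ids, original_negative_ids,
                    max_negatives=30):
    """Hypothetical helper: original negatives first, then the highest-scoring
    BM25 passages that are not positives, capped at max_negatives."""
    scores = bm25_scores(query, corpus)
    ranked = sorted(range(len(corpus)), key=lambda i: scores[i], reverse=True)
    negatives = list(dict.fromkeys(original_negative_ids))  # dedupe, keep order
    for i in ranked:
        if len(negatives) >= max_negatives:
            break
        # Skip positives, duplicates, and passages with no lexical overlap.
        if i in positive_ids or i in negatives or scores[i] <= 0:
            continue
        negatives.append(i)
    return negatives[:max_negatives]


# Toy corpus, for illustration only.
corpus = [
    "Paris est la capitale de la France.",   # 0: positive
    "Lyon est une grande ville de France.",  # 1
    "La capitale de l'Italie est Rome.",     # 2
    "Le chat dort sur le canapé.",           # 3: original negative
]
negatives = build_negatives("capitale de la France", corpus,
                            positive_ids={0}, original_negative_ids=[3])
print(negatives)  # [3, 2, 1]
```

Passages that share many terms with the query but are not labeled relevant (such as passage 2 here) make harder negatives than randomly sampled ones, which is the usual motivation for mining them with a lexical ranker like BM25.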

## Usage

```python
from sentence_transformers.cross_encoder import CrossEncoder

# Load the fine-tuned cross-encoder from the Hugging Face Hub.
model = CrossEncoder("azat-serikbayev/crossencoder-camembert-base-mmarcoFR-miracl-fr")

# Each input is a [query, passage] pair; predict returns one relevance score per pair.
scores = model.predict([["query", "document_text"]])
```
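
A cross-encoder like this one is typically used to rerank candidate passages for a query. The sketch below shows one way to do that, assuming the scorer returns a single relevance score per pair. The `rerank` helper and the `overlap_scorer` stand-in are hypothetical names introduced for the example; the stand-in lets the snippet run without downloading the model.

```python
def rerank(predict, query, passages, top_k=None):
    """Sort passages by descending score from a pair-scoring callable."""
    pairs = [[query, passage] for passage in passages]
    scores = [float(s) for s in predict(pairs)]
    ranked = sorted(zip(passages, scores), key=lambda item: item[1], reverse=True)
    return ranked[:top_k]  # top_k=None keeps every passage


def overlap_scorer(pairs):
    # Stand-in for model.predict (assumption): counts shared lowercase words.
    return [len(set(q.lower().split()) & set(p.lower().split())) for q, p in pairs]


candidates = [
    "Le chat dort sur le canapé.",
    "Paris est la capitale de la France.",
]
results = rerank(overlap_scorer, "capitale de la France", candidates)
for passage, score in results:
    print(f"{score:.1f}  {passage}")
```

With the real model loaded, the same helper would be called as `rerank(model.predict, query, candidates)`. Recent versions of sentence-transformers also provide a `CrossEncoder.rank` method that serves a similar purpose; check the documentation of the installed version for its exact parameters.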