SentenceTransformer based on FacebookAI/roberta-base

This is a sentence-transformers model finetuned from FacebookAI/roberta-base on the csv dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: FacebookAI/roberta-base
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset:
    • csv
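
The similarity function listed above is cosine similarity. As a minimal sketch of what that score measures (the `cosine_similarity` helper and the toy 3-dimensional vectors are illustrative assumptions, standing in for the model's 768-dimensional embeddings):

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy vectors, for illustration only.
print(cosine_similarity([1.0, 0.0, 0.0], [1.0, 0.0, 0.0]))  # 1.0 (same direction)
print(cosine_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]))  # 0.0 (orthogonal)
```

In practice `model.similarity(...)` computes this for every pair of embeddings at once.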

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: RobertaModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
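
The Pooling module above has only pooling_mode_mean_tokens enabled, so a sentence embedding is the average of the token embeddings produced by the Transformer module. A rough pure-Python sketch of that idea, assuming padding positions are marked with 0 in an attention mask (`mean_pool` is a hypothetical helper, not the library's implementation):

```python
def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1, skipping padding."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, x in enumerate(vec):
                summed[i] += x
    # Guard against an all-padding input to avoid dividing by zero.
    return [s / max(count, 1) for s in summed]

# Three toy 2-dimensional token vectors; the last one is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]
print(mean_pool(tokens, mask))  # [2.0, 3.0]
```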

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
    "The account of an expedition against Fort Christina deserves to be\nquoted in full, for it is an example of what war might be, full of\nexcitement, and exercise, and heroism, without danger to life. We take\nup the narrative at the moment when the Dutch host...",
    '"He stood by me all these years," he thought, "he taught me all I know,\nthough I fear I am still very young and an ignoramus. But he\'s tried\nhard I know to impart all his own special knowledge to me, and he\'s\ngiven me chances that many a young officer would give his ears for.\nRight!...',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [2, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [2, 2]

Evaluation

Metrics

Binary Classification

Metric                      litemb-dev   litemb-test
cosine_accuracy             0.833        0.8371
cosine_accuracy_threshold   0.7998       0.9184
cosine_f1                   0.8324       0.842
cosine_f1_threshold         0.7917       0.9133
cosine_precision            0.8093       0.8006
cosine_recall               0.857        0.888
cosine_ap                   0.9127       0.9163
cosine_mcc                  0.6561       0.6708
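
The threshold rows give the cosine similarity cut-off at which a pair is labeled positive. A minimal sketch of applying such a threshold and scoring the result, assuming a pair counts as positive when its similarity is at or above the threshold (the helper names, scores, and labels below are made up for illustration):

```python
def classify_pairs(similarities, threshold):
    """Label a pair positive when its cosine similarity reaches the threshold."""
    return [s >= threshold for s in similarities]

def precision_recall_f1(predicted, actual):
    """Compute precision, recall, and F1 from boolean predictions and labels."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum(not p and a for p, a in zip(predicted, actual))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

# Hypothetical similarity scores and ground-truth labels.
scores = [0.95, 0.85, 0.70, 0.60]
labels = [True, False, True, False]
preds = classify_pairs(scores, threshold=0.7917)  # litemb-dev cosine_f1_threshold
print(preds)                               # [True, True, False, False]
print(precision_recall_f1(preds, labels))  # (0.5, 0.5, 0.5)

# Sanity check on the reported litemb-dev row: F1 is the harmonic mean of
# precision and recall, so this should land close to the reported 0.8324
# (small differences come from rounding in the table).
p, r = 0.8093, 0.857
print(2 * p * r / (p + r))
```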

Training Details

Training Dataset

csv

  • Dataset: csv
  • Size: 4,415,131 training samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
               anchor           positive         negative
    type       string           string           string
    min        447 tokens       450 tokens       455 tokens
    mean       510.65 tokens    510.71 tokens    510.83 tokens
    max        512 tokens       512 tokens       512 tokens
  • Samples:
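    Three triplets follow. Within each triplet the passages appear in the order anchor, positive, negative, and each passage is truncated at "...".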
    "That was curious," remarked Trent.
    "I thought so, sir. But I recollected what I had heard about 'not a word
    to a soul,' and I concluded that this about a moonlight drive was
    intended to mislead."
    "What time was this?"
    "It would be about ten, sir, I should say. After speaking to me, Mr.
    Manderson waited until Mr. Marlowe had come down and brought round the
    car. He then went into the drawing-room, where Mrs. Manderson was."
    "Did that strike you as curious?"
    Martin looked down his nose. "If you ask me the question, sir," he said
    with reserve, "I had not known him enter that room since we came here
    this year. He preferred to sit in the library in the evenings. That
    evening he only remained with Mrs. Manderson for a few minutes. Then he
    and Mr. Marlowe started immediately."
    "You saw them start?"
    "Yes, sir. They took the direction of Bishopsbridge."
    "And you saw Mr. Manderson again later?"
    "After an hour or thereabouts, sir, in the library. That would have been
    about a quarter past eleven, ...
    Sir James turned instantly to Mr. Figgis, whose pencil was poised over
    the paper. “Sigsbee Manderson has been murdered,” he began quickly and
    clearly, pacing the floor with his hands behind him. Mr. Figgis
    scratched down a line of shorthand with as much emotion as if he had
    been told that the day was fine—the pose of his craft. “He and his wife
    and two secretaries have been for the past fortnight at the house
    called White Gables, at Marlstone, near Bishopsbridge. He bought it
    four years ago. He and Mrs. Manderson have since spent a part of each
    summer there. Last night he went to bed about half-past eleven, just as
    usual. No one knows when he got up and left the house. He was not
    missed until this morning. About ten o’clock his body was found by a
    gardener. It was lying by a shed in the grounds. He was shot in the
    head, through the left eye. Death must have been instantaneous. The
    body was not robbed, but there were marks on the wrists which pointed
    to a struggle having taken place. Dr...
    Holmes shook his head like a man who is far from being satisfied.
    “These are very deep waters,” said he; “pray go on with your narrative.”
    “Two years have passed since then, and my life has been until lately
    lonelier than ever. A month ago, however, a dear friend, whom I have
    known for many years, has done me the honor to ask my hand in marriage.
    His name is Armitage—Percy Armitage—the second son of Mr. Armitage,
    of Crane Water, near Reading. My step-father has offered no opposition
    to the match, and we are to be married in the course of the spring. Two
    days ago some repairs were started in the west wing of the building,
    and my bedroom wall has been pierced, so that I have had to move into
    the chamber in which my sister died, and to sleep in the very bed in
    which she slept. Imagine, then, my thrill of terror when last night,
    as I lay awake, thinking over her terrible fate, I suddenly heard in
    the silence of the night the low whistle which had been the herald of
    her own death. I sprang ...
    'The condition of those blacks is assuredly better than that of the
    agricultural laborers in many parts of Europe. Their morality is far
    superior to that of the free negroes of the North; the planters
    encourage marriage, and thus endeavor to develop among them a sense
    of the family relation, with a view of attaching them to the
    domestic hearth, consequently to the family of the master. It will
    be then observed that in such a state of things the interests of the
    planter, in default of any other motive, promotes the advancement
    and well-being of the slave. Certainly, we believe it possible still
    to ameliorate their condition. It is with that view, even, that the
    South has labored for so long a time to prepare them for a higher
    civilization.
    'In no part, perhaps, of the continent, regard being had to the
    population, do there exist men more eminent and gifted, with nobler
    or more generous sentiments, than in the Southern States. No co...
    If we had clear and strong faith, our joy at the thought of a glorified
    spirit, however necessary its presence to us here, would transcend all
    our sorrows; the streaming beams of sunshine would irradiate our
    weeping; we should think more of his happiness than of our discomfort.
    Instead of departed spirits falling asleep, it is we who have a spirit
    of slumber. O that we might walk by faith with glorified spirits before
    the throne, instead of remanding them,--as it seems we sometimes would
    do, if we could,--to the ignorance and infirmity of our condition.
    Our feelings towards the departed are the same as towards other
    prohibited things. Many are continually seeking for pleasures which God
    has taken away, or is purposely withholding from them. Let any one look
    at the history of his feelings, and see if his state of mind be not one
    of perpetual expectation of some form of happiness yet to arrive; an
    ideal of bliss, some prefigured condition, in which contentment and
    peace are to abide; whi...
    “And we? Now that we've fought and lied and sweated and stolen, and
    hated as only the disappointed strugglers in a bitter, dead little
    Western town know how to do, what have we got to show for it? Harvey
    Merrick wouldn't have given one sunset over your marshes for all you've
    got put together, and you know it. It's not for me to say why, in the
    inscrutable wisdom of God, a genius should ever have been called from
    this place of hatred and bitter waters; but I want this Boston man to
    know that the drivel he's been hearing here tonight is the only
    tribute any truly great man could ever have from such a lot of sick,
    side-tracked, burnt-dog, land-poor sharks as the here-present financiers
    of Sand City--upon which town may God have mercy!”
    The lawyer thrust out his hand to Steavens as he passed him, caught up
    his overcoat in the hall, and had left the house before the Grand Army
    man had had time to lift his ducked head and crane his long neck about
    at his fellows.

    Next day Jim Laird was drun...
    When Cowper became an author he paid the highest respect to Mrs. Unwin
    as an instinctive critic, and called her his Lord Chamberlain, whose
    approbation was his sufficient licence for publication.
    Life in the Unwin family is thus described by the new inmate;--"As to
    amusements, I mean what the world calls such, we have none. The place
    indeed swarms with them; and cards and dancing are the professed
    business of almost all the gentle inhabitants of Huntingdon. We refuse
    to take part in them, or to be accessories to this way of murdering our
    time, and by so doing have acquired the name of Methodists. Having
    told you how we do not spend our time, I will next say how we do. We
    breakfast commonly between eight and nine; till eleven, we read either
    the scripture, or the sermons of some faithful preacher of those holy
    mysteries; at eleven we attend divine service, which is performed here
    twice every day, and from twelve to three we separate, and amuse
    ourselves as we please. During that in...
    Peel’s Government having been overthrown on the question of the Corn
    Laws by a combination which the Duke of Wellington characterized with
    military frankness, of Tory Protectionists, Whigs, Radicals, and Irish
    Nationalists, the whole under Semitic influence, its chief, for the
    short remainder of his life, held himself aloof from the party fray,
    encouraging no new combination, and content with watching over the safety
    of his great fiscal reform; though, as Greville says, had the Premiership
    been put to the vote, Peel would have been elected by an overwhelming
    majority. His personal following, Peelites as they were called, Graham,
    Gladstone, Lincoln, Cardwell, Sidney Herbert, and the rest, remained
    suspended between the two great parties. When Disraeli had thrown over
    protection, as he meant from the beginning to do, the only barrier
    of principle between the Peelites and the Conservatives was removed.
    Overtures were made by the Conservative leader, Lord Derby, to Gladstone,
    whose immense...
    "If you take my advice," said Stanley who was fighting his way towards
    some remote goal or other, "you'll take a little flyer on Dr. Rice.
    That's what I'm going to do. There's a fellow on the other side of the
    ring has him a point higher than anyone else."
    Dick, without having made up his mind as to his own betting or not
    betting, helped his companion in his struggle to get through the crowd.
    Desperate energy was necessary. There was never any time for apologies;
    elbows were pushed into sides, toes were trodden on, scarfs twisted and
    sleeve-links broken; no matter, there was money to be won and there was
    no time either to consider passing annoyances or the possibility of
    loss.
    "Ah," said Stanley, finally, as they found themselves in front of a
    black-board that had a figure "7" chalked to the left of the name Dr.
    Rice and a "3" to the right. "Here we are! Now then, what are you going
    to do?" He whipped out a twenty dollar bill and crumpled it carefully
    into the palm of his hand.
    Dick th...
  • Loss: TripletLoss with these parameters:
    {
        "distance_metric": "TripletDistanceMetric.EUCLIDEAN",
        "triplet_margin": 0.5
    }
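
With these parameters, the per-triplet loss is max(d(anchor, positive) - d(anchor, negative) + 0.5, 0), where d is the Euclidean distance between embeddings. A minimal pure-Python sketch for a single triplet (the helper names and toy 2-dimensional vectors are illustrative assumptions; the library applies this to batches of embeddings and averages the result):

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=0.5):
    """Hinge on how much closer the negative is allowed to be than the positive."""
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(gap + margin, 0.0)

anchor = [0.0, 0.0]
positive = [3.0, 4.0]        # distance 5.0 from the anchor
easy_negative = [6.0, 8.0]   # distance 10.0: well beyond the margin
hard_negative = [0.0, 5.25]  # distance 5.25: inside the margin

print(triplet_loss(anchor, positive, easy_negative))  # 0.0
print(triplet_loss(anchor, positive, hard_negative))  # 0.25
```

Only triplets whose negative is not at least the margin farther away than the positive contribute a non-zero loss.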
    

Evaluation Dataset

csv

  • Dataset: csv
  • Size: 944,948 evaluation samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
               anchor           positive         negative
    type       string           string           string
    min        420 tokens       432 tokens       424 tokens
    mean       510.66 tokens    510.77 tokens    510.38 tokens
    max        512 tokens       512 tokens       512 tokens
  • Loss: TripletLoss with these parameters:
    {
        "distance_metric": "TripletDistanceMetric.EUCLIDEAN",
        "triplet_margin": 0.5
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 35
  • per_device_eval_batch_size: 35
  • warmup_ratio: 0.1
  • fp16: True
  • load_best_model_at_end: True
  • batch_sampler: no_duplicates

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 35
  • per_device_eval_batch_size: 35
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 3
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional
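
The hyperparameters above combine learning_rate 5e-05, lr_scheduler_type linear, and warmup_ratio 0.1. As a rough sketch of the learning rate this implies at a given step, assuming linear warmup from zero followed by linear decay to zero (`linear_schedule_lr` is a hypothetical helper, and the 1000-step total is illustrative, not the actual length of this training run):

```python
import math

def linear_schedule_lr(step, total_steps, base_lr=5e-05, warmup_ratio=0.1):
    """Learning rate at `step` under linear warmup, then linear decay."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Ramp up from 0 to base_lr over the warmup steps.
        return base_lr * step / max(1, warmup_steps)
    # Decay from base_lr to 0 over the remaining steps.
    remaining = (total_steps - step) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, remaining)

for step in (0, 50, 100, 550, 1000):
    print(step, linear_schedule_lr(step, total_steps=1000))
```

The rate peaks at the end of warmup (step 100 of 1000 here) and reaches zero at the final step.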

Framework Versions

  • Python: 3.12.0
  • Sentence Transformers: 3.4.0.dev0
  • Transformers: 4.46.3
  • PyTorch: 2.5.1+cu121
  • Accelerate: 1.1.1
  • Datasets: 3.1.0
  • Tokenizers: 0.20.3

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

TripletLoss

@misc{hermans2017defense,
    title={In Defense of the Triplet Loss for Person Re-Identification},
    author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
    year={2017},
    eprint={1703.07737},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}

Model size: 125M params (Safetensors, tensor type F32)

Model tree for asaakyan/gutenberg_authorship: finetuned from FacebookAI/roberta-base