SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2

This is a sentence-transformers model finetuned from sentence-transformers/all-MiniLM-L6-v2. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/all-MiniLM-L6-v2
  • Maximum Sequence Length: 256 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
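The Pooling and Normalize modules listed above can be illustrated without any dependencies. The following is a minimal sketch, not the library's implementation: mean_pool and l2_normalize are hypothetical helper names, and the toy token vectors have 3 dimensions instead of 384. It averages token embeddings while skipping padding positions (pooling_mode_mean_tokens: True), then scales the result to unit length.

```python
import math


def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose attention_mask entry is 1 (padding is skipped)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    if count == 0:
        raise ValueError("attention_mask selects no tokens")
    return [value / count for value in total]


def l2_normalize(vector):
    """Scale a vector to unit Euclidean length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(value * value for value in vector))
    if norm == 0.0:
        return list(vector)
    return [value / norm for value in vector]


# Toy input (assumption): two real tokens followed by one padding token.
tokens = [[1.0, 2.0, 2.0], [3.0, 2.0, 2.0], [9.0, 9.0, 9.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)  # [2.0, 2.0, 2.0]
sentence_embedding = l2_normalize(pooled)
print(pooled)
print(sentence_embedding)
```

Because the final vector has unit length, the cosine similarity between two such embeddings equals their dot product.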

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("danicafisher/dfisher-base-sentence-transformer")
# Run inference
sentences = [
    'How can organizations address risks associated with the use of third-party data for GAI model inputs?',
    '48 \n• Data protection \n• Data retention  \n• Consistency in use of defining key terms \n• Decommissioning \n• Discouraging anonymous use \n• Education  \n• Impact assessments  \n• Incident response \n• Monitoring \n• Opt-outs  \n• Risk-based controls \n• Risk mapping and measurement \n• Science-backed TEVV practices \n• Secure software development practices \n• Stakeholder engagement \n• Synthetic content detection and \nlabeling tools and techniques \n• Whistleblower protections \n• Workforce diversity and \ninterdisciplinary teams\nEstablishing acceptable use policies and guidance for the use of GAI in formal human-AI teaming settings \nas well as different levels of human-AI configurations can help to decrease risks arising from misuse, \nabuse, inappropriate repurpose, and misalignment between systems and users. These practices are just \none example of adapting existing governance protocols for GAI contexts.  \nA.1.3. Third-Party Considerations \nOrganizations may seek to acquire, embed, incorporate, or use open-source or proprietary third-party \nGAI models, systems, or generated data for various applications across an enterprise. Use of these GAI \ntools and inputs has implications for all functions of the organization – including but not limited to \nacquisition, human resources, legal, compliance, and IT services – regardless of whether they are carried \nout by employees or third parties. Many of the actions cited above are relevant and options for \naddressing third-party considerations. \nThird party GAI integrations may give rise to increased intellectual property, data privacy, or information \nsecurity risks, pointing to the need for clear guidelines for transparency and risk management regarding \nthe collection and use of third-party data for model inputs. Organizations may consider varying risk \ncontrols for foundation models, fine-tuned models, and embedded tools, enhanced processes for \ninteracting with external GAI technologies or service providers. Organizations can apply standard or \nexisting risk controls and processes to proprietary or open-source GAI technologies, data, and third-party \nservice providers, including acquisition and procurement due diligence, requests for software bills of \nmaterials (SBOMs), application of service level agreements (SLAs), and statement on standards for \nattestation engagement (SSAE) reports to help with third-party transparency and risk management for \nGAI systems. \nA.1.4. Pre-Deployment Testing \nOverview \nThe diverse ways and contexts in which GAI systems may be developed, used, and repurposed \ncomplicates risk mapping and pre-deployment measurement efforts. Robust test, evaluation, validation, \nand verification (TEVV) processes can be iteratively applied – and documented – in early stages of the AI \nlifecycle and informed by representative AI Actors (see Figure 3 of the AI RMF). Until new and rigorous',
    '8 \nTrustworthy AI Characteristics: Accountable and Transparent, Privacy Enhanced, Safe, Secure and \nResilient \n2.5. Environmental Impacts \nTraining, maintaining, and operating (running inference on) GAI systems are resource-intensive activities, \nwith potentially large energy and environmental footprints. Energy and carbon emissions vary based on \nwhat is being done with the GAI model (i.e., pre-training, fine-tuning, inference), the modality of the \ncontent, hardware used, and type of task or application. \nCurrent estimates suggest that training a single transformer LLM can emit as much carbon as 300 round-\ntrip flights between San Francisco and New York. In a study comparing energy consumption and carbon \nemissions for LLM inference, generative tasks (e.g., text summarization) were found to be more energy- \nand carbon-intensive than discriminative or non-generative tasks (e.g., text classification).  \nMethods for creating smaller versions of trained models, such as model distillation or compression, \ncould reduce environmental impacts at inference time, but training and tuning such models may still \ncontribute to their environmental impacts. Currently there is no agreed upon method to estimate \nenvironmental impacts from GAI.  \nTrustworthy AI Characteristics: Accountable and Transparent, Safe \n2.6. Harmful Bias and Homogenization \nBias exists in many forms and can become ingrained in automated systems. AI systems, including GAI \nsystems, can increase the speed and scale at which harmful biases manifest and are acted upon, \npotentially perpetuating and amplifying harms to individuals, groups, communities, organizations, and \nsociety. For example, when prompted to generate images of CEOs, doctors, lawyers, and judges, current \ntext-to-image models underrepresent women and/or racial minorities, and people with disabilities. \nImage generator models have also produced biased or stereotyped output for various demographic \ngroups and have difficulty producing non-stereotyped content even when the prompt specifically \nrequests image features that are inconsistent with the stereotypes. Harmful bias in GAI models, which \nmay stem from their training data, can also cause representational harms or perpetuate or exacerbate \nbias based on race, gender, disability, or other protected classes.  \nHarmful bias in GAI systems can also lead to harms via disparities between how a model performs for \ndifferent subgroups or languages (e.g., an LLM may perform less well for non-English languages or \ncertain dialects). Such disparities can contribute to discriminatory decision-making or amplification of \nexisting societal biases. In addition, GAI systems may be inappropriately trusted to perform similarly \nacross all subgroups, which could leave the groups facing underperformance with worse outcomes than \nif no GAI system were used. Disparate or reduced performance for lower-resource languages also \npresents challenges to model adoption, inclusion, and accessibility, and may make preservation of \nendangered languages more difficult if GAI systems become embedded in everyday processes that would \notherwise have been opportunities to use these languages.  \nBias is mutually reinforcing with the problem of undesired homogenization, in which GAI systems \nproduce skewed distributions of outputs that are overly uniform (for example, repetitive aesthetic styles',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
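To illustrate how similarity scores can drive semantic search, the sketch below ranks passages against a query by cosine similarity. The helper names cosine_similarity and rank_passages and the 3-dimensional toy vectors are assumptions made for illustration; with the real model, the vectors would come from model.encode.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_passages(query_vec, passage_vecs):
    """Return (index, score) pairs sorted from most to least similar."""
    scored = [(i, cosine_similarity(query_vec, p)) for i, p in enumerate(passage_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors (assumption) standing in for real embeddings.
query = [1.0, 0.0, 0.0]
passages = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [2.0, 0.0, 0.0],  # same direction as the query
    [1.0, 1.0, 0.0],  # 45 degrees from the query
]

ranking = rank_passages(query, passages)
print([index for index, _ in ranking])  # [1, 2, 0]
```

In the usage example above, the first sentence is a question and the other two are passages, so the first row of the similarity matrix plays the role of the scores ranked here.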

Training Details

Training Dataset

Unnamed Dataset

  • Size: 128 training samples
  • Columns: sentence_0 and sentence_1
  • Approximate statistics based on the first 128 samples:
    • sentence_0: type string; min: 17 tokens, mean: 23.14 tokens, max: 38 tokens
    • sentence_1: type string; min: 56 tokens, mean: 247.42 tokens, max: 256 tokens
  • Samples:
    sentence_0 sentence_1
    sentence_0: What measures are suggested to assess the environmental impact of AI model training and management activities?
    sentence_1: 37
    MS-2.11-005
    Assess the proportion of synthetic to non-synthetic training data and verify
    training data is not overly homogenous or GAI-produced to mitigate concerns of
    model collapse.
    Harmful Bias and Homogenization
    AI Actor Tasks: AI Deployment, AI Impact Assessment, Affected Individuals and Communities, Domain Experts, End-Users,
    Operation and Monitoring, TEVV

    MEASURE 2.12: Environmental impact and sustainability of AI model training and management activities – as identified in the MAP
    function – are assessed and documented.
    Action ID
    Suggested Action
    GAI Risks
    MS-2.12-001 Assess safety to physical environments when deploying GAI systems.
    Dangerous, Violent, or Hateful
    Content
    MS-2.12-002 Document anticipated environmental impacts of model development,
    maintenance, and deployment in product design decisions.
    Environmental
    MS-2.12-003
    Measure or estimate environmental impacts (e.g., energy and water
    consumption) for training, fine tuning, and deploying models: Verify tradeoffs
    between resources used at inference time versus additional resources required
    at training time.
    Environmental
    MS-2.12-004 Verify effectiveness of carbon capture or offset programs for GAI training and
    applications, and address green-washing concerns.
    Environmental
    AI Actor Tasks: AI Deployment, AI Impact Assessment, Domain Experts, Operation and Monitoring, TEVV
    sentence_0: What are some limitations of current pre-deployment testing approaches for GAI applications?
    sentence_1: 49
    early lifecycle TEVV approaches are developed and matured for GAI, organizations may use
    recommended “pre-deployment testing” practices to measure performance, capabilities, limits, risks,
    and impacts. This section describes risk measurement and estimation as part of pre-deployment TEVV,
    and examines the state of play for pre-deployment testing methodologies.
    Limitations of Current Pre-deployment Test Approaches
    Currently available pre-deployment TEVV processes used for GAI applications may be inadequate, non-
    systematically applied, or fail to reflect or mismatched to deployment contexts. For example, the
    anecdotal testing of GAI system capabilities through video games or standardized tests designed for
    humans (e.g., intelligence tests, professional licensing exams) does not guarantee GAI system validity or
    reliability in those domains. Similarly, jailbreaking or prompt engineering tests may not systematically
    assess validity or reliability risks.
    Measurement gaps can arise from mismatches between laboratory and real-world settings. Current
    testing approaches often remain focused on laboratory conditions or restricted to benchmark test
    datasets and in silico techniques that may not extrapolate well to—or directly assess GAI impacts in real-
    world conditions. For example, current measurement gaps for GAI make it difficult to precisely estimate
    its potential ecosystem-level or longitudinal risks and related political, social, and economic impacts.
    Gaps between benchmarks and real-world use of GAI systems may likely be exacerbated due to prompt
    sensitivity and broad heterogeneity of contexts of use.
    A.1.5. Structured Public Feedback
    Structured public feedback can be used to evaluate whether GAI systems are performing as intended
    and to calibrate and verify traditional measurement methods. Examples of structured feedback include,
    but are not limited to:

    Participatory Engagement Methods: Methods used to solicit feedback from civil society groups,
    affected communities, and users, including focus groups, small user studies, and surveys.

    Field Testing: Methods used to determine how people interact with, consume, use, and make
    sense of AI-generated information, and subsequent actions and effects, including UX, usability,
    and other structured, randomized experiments.

    AI Red-teaming: A structured testing exercise used to probe an AI system to find flaws and
    vulnerabilities such as inaccurate, harmful, or discriminatory outputs, often in a controlled
    environment and in collaboration with system developers.
    Information gathered from structured public feedback can inform design, implementation, deployment
    approval, maintenance, or decommissioning decisions. Results and insights gleaned from these exercises
    can serve multiple purposes, including improving data quality and preprocessing, bolstering governance
    decision making, and enhancing system documentation and debugging practices. When implementing
    feedback activities, organizations should follow human subjects research requirements and best
    practices such as informed consent and subject compensation.
    sentence_0: How can organizations adjust their governance regimes to effectively manage the unique risks associated with generative AI?
    sentence_1: 47
    Appendix A. Primary GAI Considerations
    The following primary considerations were derived as overarching themes from the GAI PWG
    consultation process. These considerations (Governance, Pre-Deployment Testing, Content Provenance,
    and Incident Disclosure) are relevant for voluntary use by any organization designing, developing, and
    using GAI and also inform the Actions to Manage GAI risks. Information included about the primary
    considerations is not exhaustive, but highlights the most relevant topics derived from the GAI PWG.
    Acknowledgments: These considerations could not have been surfaced without the helpful analysis and
    contributions from the community and NIST staff GAI PWG leads: George Awad, Luca Belli, Harold Booth,
    Mat Heyman, Yooyoung Lee, Mark Pryzbocki, Reva Schwartz, Martin Stanley, and Kyra Yee.
    A.1. Governance
    A.1.1. Overview
    Like any other technology system, governance principles and techniques can be used to manage risks
    related to generative AI models, capabilities, and applications. Organizations may choose to apply their
    existing risk tiering to GAI systems, or they may opt to revise or update AI system risk levels to address
    these unique GAI risks. This section describes how organizational governance regimes may be re-
    evaluated and adjusted for GAI contexts. It also addresses third-party considerations for governing across
    the AI value chain.
    A.1.2. Organizational Governance
    GAI opportunities, risks and long-term performance characteristics are typically less well-understood
    than non-generative AI tools and may be perceived and acted upon by humans in ways that vary greatly.
    Accordingly, GAI may call for different levels of oversight from AI Actors or different human-AI
    configurations in order to manage their risks effectively. Organizations’ use of GAI systems may also
    warrant additional human review, tracking and documentation, and greater management oversight.
    AI technology can produce varied outputs in multiple modalities and present many classes of user
    interfaces. This leads to a broader set of AI Actors interacting with GAI systems for widely differing
    applications and contexts of use. These can include data labeling and preparation, development of GAI
    models, content moderation, code generation and review, text generation and editing, image and video
    generation, summarization, search, and chat. These activities can take place within organizational
    settings or in the public domain.
    Organizations can restrict AI applications that cause harm, exceed stated risk tolerances, or that conflict
    with their tolerances or values. Governance tools and protocols that are applied to other types of AI
    systems can be applied to GAI systems. These plans and actions include:
    • Accessibility and reasonable
    accommodations
    • AI actor credentials and qualifications
    • Alignment to organizational values
    • Auditing and assessment
    • Change-management controls
    • Commercial use
    • Data provenance
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
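The loss parameters above can be read as follows: each sentence_0 is scored against every sentence_1 in the batch using cosine similarity multiplied by the scale (20.0), and cross-entropy pushes the matching pair above the in-batch negatives. The following is a dependency-free sketch of that idea, not the library's code; the function names and the 2-dimensional toy embeddings are assumptions.

```python
import math


def cos_sim(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def multiple_negatives_ranking_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the correct match for anchors[i]."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, candidate) for candidate in positives]
        # Numerically stable log-sum-exp over all candidates in the batch.
        top = max(logits)
        log_sum_exp = top + math.log(sum(math.exp(value - top) for value in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


# Toy batch (assumption): two anchors with perfectly aligned or swapped positives.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.0], [0.0, 1.0]]
swapped = [[0.0, 1.0], [1.0, 0.0]]

low = multiple_negatives_ranking_loss(anchors, aligned)
high = multiple_negatives_ranking_loss(anchors, swapped)
print(low, high)
```

With aligned pairs the loss is close to zero; with swapped pairs it is close to the scale value, which shows how the scale sharpens the penalty for ranking a negative above the true match.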
    

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 20
  • per_device_eval_batch_size: 20
  • num_train_epochs: 10
  • multi_dataset_batch_sampler: round_robin
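As a worked example of what these settings imply, the sketch below estimates the number of optimizer steps and the learning rate at a few points of the linear schedule. It assumes a single device, gradient_accumulation_steps of 1, no warmup, and dataloader_drop_last set to False, all of which match the full hyperparameter list; the function names are hypothetical.

```python
import math


def training_steps(num_samples, batch_size, num_epochs):
    """Steps per epoch and total steps when the last partial batch is kept."""
    steps_per_epoch = math.ceil(num_samples / batch_size)
    return steps_per_epoch, steps_per_epoch * num_epochs


def linear_lr(step, total_steps, base_lr=5e-05, warmup_steps=0):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))
    return base_lr * remaining


# 128 training samples, batch size 20, 10 epochs (values from this model card).
steps_per_epoch, total_steps = training_steps(128, 20, 10)
print(steps_per_epoch, total_steps)  # 7 70

for step in (0, 35, 70):
    print(step, linear_lr(step, total_steps))
```

Under these assumptions each epoch has six full batches of 20 and one final batch of 8, so the whole run amounts to 70 optimizer steps.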

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 20
  • per_device_eval_batch_size: 20
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 10
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • eval_use_gather_object: False
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.1.1
  • Transformers: 4.44.2
  • PyTorch: 2.4.1+cu121
  • Accelerate: 0.34.2
  • Datasets: 3.0.0
  • Tokenizers: 0.19.1
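To compare a local environment against the versions listed above, something like the following sketch could be used. It relies only on the standard library's importlib.metadata; the distribution names in EXPECTED are assumed to be the usual PyPI names, and the helper names are hypothetical. A mismatch is not necessarily a problem, and the PyTorch build suffix (+cu121) depends on the installed CUDA variant.

```python
from importlib import metadata

# Versions copied from this model card; keys are assumed PyPI distribution names.
EXPECTED = {
    "sentence-transformers": "3.1.1",
    "transformers": "4.44.2",
    "torch": "2.4.1+cu121",
    "accelerate": "0.34.2",
    "datasets": "3.0.0",
    "tokenizers": "0.19.1",
}


def installed_versions(names):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def find_mismatches(expected, installed):
    """Return {name: (expected, installed)} for every version that differs."""
    return {
        name: (want, installed.get(name))
        for name, want in expected.items()
        if installed.get(name) != want
    }


mismatches = find_mismatches(EXPECTED, installed_versions(EXPECTED))
for name, (want, have) in sorted(mismatches.items()):
    print(f"{name}: card lists {want}, found {have}")
```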

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}

Model size: 22.7M params (Safetensors, tensor type F32)