Parallel Sentences Datasets Collection These datasets all have "english" and "non_english" columns. They can be used to make embedding models multilingual. • 14 items • Updated Oct 9, 2024 • 14
Article Train 400x faster Static Embedding Models with Sentence Transformers 6 days ago • 113
Deliberation in Latent Space via Differentiable Cache Augmentation Paper • 2412.17747 • Published 29 days ago • 29
Byte Latent Transformer: Patches Scale Better Than Tokens Paper • 2412.09871 • Published Dec 13, 2024 • 89
emrecan/bert-base-turkish-cased-mean-nli-stsb-tr Sentence Similarity • Updated Jan 24, 2022 • 129k • 35