Models
Datasets
Spaces
Posts
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2401.13660

ByT5: Towards a token-free future with pre-trained byte-to-byte models

Paper • 2105.13626 • Published May 28, 2021 • 3
Beyond Language Models: Byte Models are Digital World Simulators

Paper • 2402.19155 • Published Feb 29, 2024 • 49
MEGABYTE: Predicting Million-byte Sequences with Multiscale Transformers

Paper • 2305.07185 • Published May 12, 2023 • 9
Byte-Level Recursive Convolutional Auto-Encoder for Text

Paper • 1802.01817 • Published Feb 6, 2018

Collection of State Space Model and Mamba

Mamba: Linear-Time Sequence Modeling with Selective State Spaces

Paper • 2312.00752 • Published Dec 1, 2023 • 138
Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model

Paper • 2401.09417 • Published Jan 17, 2024 • 59
Vivim: a Video Vision Mamba for Medical Video Object Segmentation

Paper • 2401.14168 • Published Jan 25, 2024 • 2
HiPPO: Recurrent Memory with Optimal Polynomial Projections

Paper • 2008.07669 • Published Aug 17, 2020 • 1

StableSSM: Alleviating the Curse of Memory in State-space Models through Stable Reparameterization

Paper • 2311.14495 • Published Nov 24, 2023 • 1
Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model

Paper • 2401.09417 • Published Jan 17, 2024 • 59
SegMamba: Long-range Sequential Modeling Mamba For 3D Medical Image Segmentation

Paper • 2401.13560 • Published Jan 24, 2024 • 1
Graph-Mamba: Towards Long-Range Graph Sequence Modeling with Selective State Spaces

Paper • 2402.00789 • Published Feb 1, 2024 • 2

Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model

Paper • 2401.09417 • Published Jan 17, 2024 • 59
VMamba: Visual State Space Model

Paper • 2401.10166 • Published Jan 18, 2024 • 38
SegMamba: Long-range Sequential Modeling Mamba For 3D Medical Image Segmentation

Paper • 2401.13560 • Published Jan 24, 2024 • 1
Graph-Mamba: Towards Long-Range Graph Sequence Modeling with Selective State Spaces

Paper • 2402.00789 • Published Feb 1, 2024 • 2

Trellis Networks for Sequence Modeling

Paper • 1810.06682 • Published Oct 15, 2018 • 1
Pruning Very Deep Neural Network Channels for Efficient Inference

Paper • 2211.08339 • Published Nov 14, 2022 • 1
LAPP: Layer Adaptive Progressive Pruning for Compressing CNNs from Scratch

Paper • 2309.14157 • Published Sep 25, 2023 • 1
Mamba: Linear-Time Sequence Modeling with Selective State Spaces

Paper • 2312.00752 • Published Dec 1, 2023 • 138

XC-Cache: Cross-Attending to Cached Context for Efficient LLM Inference

Paper • 2404.15420 • Published Apr 23, 2024 • 7
OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework

Paper • 2404.14619 • Published Apr 22, 2024 • 126
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

Paper • 2404.14219 • Published Apr 22, 2024 • 253
How Good Are Low-bit Quantized LLaMA3 Models? An Empirical Study

Paper • 2404.14047 • Published Apr 22, 2024 • 44

MambaByte: Token-free Selective State Space Model

Paper • 2401.13660 • Published Jan 24, 2024 • 52

MambaByte: Token-free Selective State Space Model

Paper • 2401.13660 • Published Jan 24, 2024 • 52

Mambas and LLM-AltArch

Graph Mamba: Towards Learning on Graphs with State Space Models

Paper • 2402.08678 • Published Feb 13, 2024 • 13
Can Mamba Learn How to Learn? A Comparative Study on In-Context Learning Tasks

Paper • 2402.04248 • Published Feb 6, 2024 • 30
MambaByte: Token-free Selective State Space Model

Paper • 2401.13660 • Published Jan 24, 2024 • 52
Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model

Paper • 2401.09417 • Published Jan 17, 2024 • 59

MambaByte: Token-free Selective State Space Model

Paper • 2401.13660 • Published Jan 24, 2024 • 52
Mamba: Linear-Time Sequence Modeling with Selective State Spaces

Paper • 2312.00752 • Published Dec 1, 2023 • 138
MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts

Paper • 2401.04081 • Published Jan 8, 2024 • 70
hustvl/Vim-tiny

Updated Jan 18, 2024 • 21

Previous
1
2
Next

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs