Scaling Transformers for Low-Bitrate High-Quality Speech Coding • arXiv:2411.19842 • Published Nov 29, 2024
FAM Diffusion: Frequency and Attention Modulation for High-Resolution Image Generation with Stable Diffusion • arXiv:2411.18552 • Published Nov 27, 2024
AV-GS: Learning Material and Geometry Aware Priors for Novel View Acoustic Synthesis • arXiv:2406.08920 • Published Jun 13, 2024
FrozenSeg: Harmonizing Frozen Foundation Models for Open-Vocabulary Segmentation • arXiv:2409.03525 • Published Sep 5, 2024
Efficient Audio Captioning with Encoder-Level Knowledge Distillation • arXiv:2407.14329 • Published Jul 19, 2024
SemantiCodec: An Ultra Low Bitrate Semantic Audio Codec for General Sound • arXiv:2405.00233 • Published Apr 30, 2024
OmniCount: Multi-label Object Counting with Semantic-Geometric Priors • arXiv:2403.05435 • Published Mar 8, 2024
Actor-agnostic Multi-label Action Recognition with Multi-modal Query • arXiv:2307.10763 • Published Jul 20, 2023