Anthonny Olime

Aviv-anthonnyolime

AI & ML interests

None yet

Recent Activity

updated a collection 3 days ago

Dataset

liked a dataset 3 days ago

DAMO-NLP-SG/multimodal_textbook

updated a collection 4 days ago

Audio model

Aviv-anthonnyolime's activity

upvoted a paper 4 days ago

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

Paper • 2306.07691 • Published Jun 13, 2023 • 5

upvoted a collection 4 days ago

Scaling Test-Time Compute with Open Models

Collection

Models and datasets used in our blog post: https://huggingface.co/spaces/HuggingFaceH4/blogpost-scaling-test-time-compute • 10 items • Updated about 2 hours ago • 19

upvoted 6 papers 13 days ago

Vision Transformers Need Registers

Paper • 2309.16588 • Published Sep 28, 2023 • 78

Deliberation in Latent Space via Differentiable Cache Augmentation

Paper • 2412.17747 • Published 14 days ago • 28

Are Transformers with One Layer Self-Attention Using Low-Rank Weight Matrices Universal Approximators?

Paper • 2307.14023 • Published Jul 26, 2023 • 1

upvoted 12 papers 14 days ago

Can you Remove the Downstream Model for Speaker Recognition with Self-Supervised Speech Features?

Paper • 2402.00340 • Published Feb 1, 2024 • 1

Optimizing Byte-level Representation for End-to-end ASR

Paper • 2406.09676 • Published Jun 14, 2024 • 1

Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs

Paper • 2404.05719 • Published Apr 8, 2024 • 82

Contrastive Localized Language-Image Pre-Training

Paper • 2410.02746 • Published Oct 3, 2024 • 33

Depth Pro: Sharp Monocular Metric Depth in Less Than a Second

Paper • 2410.02073 • Published Oct 2, 2024 • 41

Computational Bottlenecks of Training Small-scale Large Language Models

Paper • 2410.19456 • Published Oct 25, 2024 • 1

Towards Time Series Reasoning with LLMs

Paper • 2409.11376 • Published Sep 17, 2024 • 1

Kaleido Diffusion: Improving Conditional Diffusion Models with Autoregressive Latent Modeling

Paper • 2405.21048 • Published May 31, 2024 • 13

Dataset Decomposition: Faster LLM Training with Variable Sequence Length Curriculum

Paper • 2405.13226 • Published May 21, 2024 • 1

4M-21: An Any-to-Any Vision Model for Tens of Tasks and Modalities

Paper • 2406.09406 • Published Jun 13, 2024 • 14

Multimodal Autoregressive Pre-training of Large Vision Encoders

Paper • 2411.14402 • Published Nov 21, 2024 • 43

Speech is More Than Words: Do Speech-to-Text Translation Systems Leverage Prosody?

Paper • 2410.24019 • Published Oct 31, 2024 • 1