Ayaan Sharif
Ayaan-Sharif
3 followers
·
18 following
https://shariif.tech
Ayaan_Shariif
Ayaan-Shariif
AI & ML interests
NLP, LLM, TEXT, Languages
Recent Activity
replied
to
sanchit-gandhi
's
post
4 days ago
Why does returning timestamps help Whisper reduce hallucinations? 🧐

Empirically, most practitioners have found that setting `return_timestamps=True` helps reduce hallucinations, particularly when doing long-form evaluation with Transformers’ “chunked” algorithm. But why does this work?

My interpretation is that forcing the model to predict timestamps is contradictory to hallucinations. Suppose you have the transcription:

```markdown
The cat sat on the on the on the mat.
```

Here we have a repeated hallucination for “on the”. If we ask the model to predict timestamps, then the “on the” has to contribute to the overall segment-level timing, e.g.:

```markdown
<|0.00|> The cat sat on the on the on the mat.<|5.02|>
```

However, it’s impossible to fit 3 copies of “on the” within the time allocation given to the segment, so the probability for this hallucinatory sequence becomes lower, and the model actually predicts the correct transcription with highest probability:

```markdown
<|0.00|> The cat sat on the mat.<|5.02|>
```

In this sense, the end timestamp is the opposite of the initial timestamp constraint they describe in Section 4.5 of the paper https://huggingface.co/papers/2212.04356 → it helps the model remove extra words at the end of the sequence (rather than the initial timestamp, which helps when the model ignores words at the start), but the overall principle is the same (using timestamps to improve the probability of more realistic sequences).

Leaving it open to you: why do you think timestamps reduce Whisper hallucinations?
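To make the timing argument in the post concrete, here is a minimal sketch that parses the two timestamped segments from the example and compares how many words each one packs into the same time window. It is only a toy illustration of the intuition: Whisper does not compute a words-per-second score, and the helper name `words_per_second` and the regular expression are assumptions introduced here, not part of Transformers or Whisper.

```python
import re

# Matches Whisper-style timestamp tokens such as <|0.00|> or <|5.02|>.
TIMESTAMP = re.compile(r"<\|(\d+(?:\.\d+)?)\|>")


def words_per_second(segment: str) -> float:
    """Hypothetical helper: speaking rate implied by a timestamped segment."""
    times = [float(t) for t in TIMESTAMP.findall(segment)]
    if len(times) < 2:
        raise ValueError("expected a start and an end timestamp")
    duration = times[-1] - times[0]
    if duration <= 0:
        raise ValueError("end timestamp must come after the start timestamp")
    # Drop the timestamp tokens, then count whitespace-separated words.
    text = TIMESTAMP.sub(" ", segment)
    return len(text.split()) / duration


hallucinated = "<|0.00|> The cat sat on the on the on the mat.<|5.02|>"
correct = "<|0.00|> The cat sat on the mat.<|5.02|>"

print(f"hallucinated: {words_per_second(hallucinated):.2f} words/s")
print(f"correct:      {words_per_second(correct):.2f} words/s")
```

The repeated "on the" raises the implied speaking rate for the same 5.02-second window, which is the mismatch the post suggests the end timestamp penalizes.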
liked
a Space
5 days ago
R1ckShi/FunClip
liked
a Space
5 days ago
sanchit-gandhi/whisper-jax
Organizations
None yet
Ayaan-Sharif
's activity
liked
3 Spaces
5 days ago
Running
24
🚀
ClipVideo
Running
2.41k
⚡️
Whisper JAX
Running
116
🚀
Ebook2audiobook V2.0 Beta
Added improvements, 1107+ languages supported
liked
a model
10 days ago
huggyllama/llama-7b
Text Generation
•
Updated
Jul 2, 2024
•
319k
•
308
liked
a model
12 days ago
deepseek-ai/DeepSeek-V3-Base
Updated
7 days ago
•
8.17k
•
1.16k
liked
a Space
13 days ago
Running
450
🌍
QVQ 72B Preview
liked
a Space
17 days ago
Running
93
💻
Llmlingua 2
liked
2 models
17 days ago
THUDM/cogvlm2-llama3-caption
Video-Text-to-Text
•
Updated
Sep 26, 2024
•
11.5k
•
76
Neurazum/Xbai-Epilepsy-1.0
Video-Text-to-Text
•
Updated
Nov 11, 2024
•
2
liked
a Space
29 days ago
Running
on
Zero
126
🎥📸💬
VideoLLaMA2
Media understanding
liked
a Space
about 1 month ago
Running
833
🔍
QwQ-32B-Preview
QwQ-32B-Preview
liked
a model
about 1 month ago
Qwen/QwQ-32B-Preview
Text Generation
•
Updated
Nov 29, 2024
•
94.7k
•
1.5k
liked
a model
about 2 months ago
cognitivecomputations/dolphin-2.9.2-qwen2-72b
Text Generation
•
Updated
Oct 8, 2024
•
9.87k
•
136
liked
a dataset
4 months ago
HuggingFaceFV/finevideo
Viewer
•
Updated
21 days ago
•
39.5k
•
5.17k
•
285
liked
2 datasets
5 months ago
fka/awesome-chatgpt-prompts
Viewer
•
Updated
about 9 hours ago
•
203
•
5.68k
•
6.72k
princeton-nlp/SWE-bench_Verified
Viewer
•
Updated
Dec 2, 2024
•
500
•
42.3k
•
124
liked
a Space
5 months ago
Running
on
Zero
450
🔥
Florence2 + SAM2
liked
a model
5 months ago
LanguageBind/Open-Sora-Plan-v1.2.0
Updated
Sep 7, 2024
•
2
•
47