Apollo: An Exploration of Video Understanding in Large Multimodal Models Paper • 2412.10360 • Published 24 days ago • 136
Video-STaR: Self-Training Enables Video Instruction Tuning with Any Supervision Paper • 2407.06189 • Published Jul 8, 2024 • 26
Can large language models provide useful feedback on research papers? A large-scale empirical analysis Paper • 2310.01783 • Published Oct 3, 2023 • 1
Connect, Collapse, Corrupt: Learning Cross-Modal Tasks with Uni-Modal Data Paper • 2401.08567 • Published Jan 16, 2024
μ-Bench: A Vision-Language Benchmark for Microscopy Understanding Paper • 2407.01791 • Published Jul 1, 2024 • 5
Pre-trained Language Models Do Not Help Auto-regressive Text-to-Image Generation Paper • 2311.16201 • Published Nov 27, 2023
VideoAgent: Long-form Video Understanding with Large Language Model as Agent Paper • 2403.10517 • Published Mar 15, 2024 • 32
yuhuizhang/finetuned_gpt2_pretrainedTrue_mrpc_new_epochs20 Text Generation • Updated Mar 14, 2024 • 15
yuhuizhang/finetuned_gpt2_pretrainedTrue_cola_epochs3 Text Generation • Updated Mar 14, 2024 • 14
yuhuizhang/finetuned_gpt2-medium_pretrainedTrue_epochs3 Text Generation • Updated Mar 13, 2024 • 17
yuhuizhang/finetuned_distilgpt2_pretrainedTrue_mrpc_new_epochs20 Text Generation • Updated Feb 28, 2024 • 10
yuhuizhang/finetuned_distilgpt2_pretrainedTrue_mrpc_new_epochs5 Text Generation • Updated Feb 28, 2024 • 13