Yongxin Guo

Yongxin-Guo

https://gyxxyg.github.io/yongxinguo/

gyxxyg

Yongxin-Guo's activity

upvoted a paper 4 days ago

OpenAI o1 System Card

Paper • 2412.16720 • Published 16 days ago • 29

upvoted 2 papers 6 days ago

Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey

Paper • 2412.18619 • Published 21 days ago • 49

HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs

Paper • 2412.18925 • Published 12 days ago • 86

New activity in Yongxin-Guo/TRACE 12 days ago

Missing ${SPLIT}.caption_coco_format.json in dense_video_caption/ActivityNet_Captions

#1 opened 13 days ago

updated a dataset 12 days ago

Yongxin-Guo/TRACE

Preview • Updated 12 days ago • 107 • 3

upvoted a paper 13 days ago

Parallelized Autoregressive Visual Generation

Paper • 2412.15119 • Published 18 days ago • 49

upvoted 6 papers 16 days ago

Are Your LLMs Capable of Stable Reasoning?

Paper • 2412.13147 • Published 20 days ago • 91

Autoregressive Video Generation without Vector Quantization

Paper • 2412.14169 • Published 19 days ago • 14

Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference

Paper • 2412.13663 • Published 19 days ago • 118

How to Synthesize Text Data without Model Collapse?

Paper • 2412.14689 • Published 18 days ago • 48

Progressive Multimodal Reasoning via Active Retrieval

Paper • 2412.14835 • Published 18 days ago • 71

Qwen2.5 Technical Report

Paper • 2412.15115 • Published 18 days ago • 337

upvoted a paper 19 days ago

Byte Latent Transformer: Patches Scale Better Than Tokens

Paper • 2412.09871 • Published 24 days ago • 83

upvoted 7 papers 20 days ago

GenEx: Generating an Explorable World

Paper • 2412.09624 • Published 25 days ago • 87

DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding

Paper • 2412.10302 • Published 24 days ago • 11

Large Concept Models: Language Modeling in a Sentence Representation Space

Paper • 2412.08821 • Published 26 days ago • 11

Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling

Paper • 2412.05271 • Published about 1 month ago • 123

SynCamMaster: Synchronizing Multi-Camera Video Generation from Diverse Viewpoints

Paper • 2412.07760 • Published 27 days ago • 50

InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions

Paper • 2412.09596 • Published 25 days ago • 92

Phi-4 Technical Report

Paper • 2412.08905 • Published 25 days ago • 97