Reasoning or Reciting? Exploring the Capabilities and Limitations of Language Models Through Counterfactual Tasks • Paper • 2307.02477 • Published Jul 5, 2023
Open-vocabulary Queryable Scene Representations for Real World Planning • Paper • 2209.09874 • Published Sep 20, 2022
Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion • Paper • 2407.01392 • Published Jul 1, 2024
SpatialVLM: Endowing Vision-Language Models with Spatial Reasoning Capabilities • Paper • 2401.12168 • Published Jan 22, 2024